Dominik Göddeke's former homepage -- GPGPU Tutorials

Conference Tutorials

These tutorials have been offered at various conferences and cover general GPGPU programming techniques as well as introductions to CUDA, OpenCL and old-school GPGPU programming through graphics APIs. Full slide decks and, in some cases, sample code are available. Feel free to use these tutorials in your own work, but please keep the acknowledgements.

OpenCL and CUDA Sample Code

The sample code below demonstrates how a simple vector addition can be implemented in OpenCL and CUDA. The OpenCL version includes contributions by Dirk Ribbrock. These tutorial codes are also featured on http://gpgpu.org and have been made available as part of GPGPU.org's SourceForge project.

More advanced sample code is provided as part of the PPAM 2013 tutorial.

Old-School GPGPU Coding Tutorials

Back in 2005-2006, I assembled a set of beginners' tutorials on GPGPU linear algebra programming using graphics APIs. What makes these tutorials different from the official GPGPU Hello World (official as in: back in the day) is that they don't even open up a window for display; it's all about offscreen, co-processor-style computing. I keep them here for historical and admittedly sentimental reasons. All tutorials are also featured on http://gpgpu.org and have been made available as part of GPGPU.org's SourceForge project.

Outdated, Unsupported Old-School Tutorials

My first tutorial ever, the GPGPU Ping Pong Tutorial, is still available. In contrast to the Basic Math Tutorial, it is based on an outdated OpenGL technique called pBuffers and contains fewer details. [PDF] [sources using Cg as shader language] [sources using GLSL as shader language]

The GPGPU Performance Tuning Tutorial outlines several steps to increase the performance of a Jacobi iteration (typically used as a smoother in multigrid) for banded FEM matrices. The application and especially the tricks presented are, however, completely independent of the FEM background. The implementation is based on pBuffers and is therefore fundamentally outdated. [PDF] [sources for the first data layout] [sources for the second data layout]
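
For readers who have not met the method before, here is a minimal CPU-only C++ sketch of a Jacobi iteration on a banded matrix. It is not the tutorial's code: the tutorial runs on the GPU through pBuffers and shaders, whereas this sketch only illustrates the arithmetic. It assumes the simplest banded case, a tridiagonal matrix stored as three arrays, and the names TridiagonalMatrix, jacobi_sweep, jacobi and residual_max_norm are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container: a tridiagonal matrix stored band by band.
// lower[0] and upper[n-1] lie outside the matrix and are never read.
struct TridiagonalMatrix {
    std::vector<double> lower;  // entry i couples row i to x[i-1]
    std::vector<double> diag;   // main diagonal, must be nonzero
    std::vector<double> upper;  // entry i couples row i to x[i+1]
};

// One Jacobi sweep: x_new[i] = (b[i] - sum_{j != i} a_ij * x[j]) / a_ii.
// Each row reads only the old iterate x, so all rows are independent,
// which is what makes the method attractive for data-parallel hardware.
void jacobi_sweep(const TridiagonalMatrix& A, const std::vector<double>& b,
                  const std::vector<double>& x, std::vector<double>& x_new) {
    const std::size_t n = A.diag.size();
    assert(A.lower.size() == n && A.upper.size() == n);
    assert(b.size() == n && x.size() == n && x_new.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        double off_diagonal = 0.0;
        if (i > 0)     off_diagonal += A.lower[i] * x[i - 1];
        if (i + 1 < n) off_diagonal += A.upper[i] * x[i + 1];
        x_new[i] = (b[i] - off_diagonal) / A.diag[i];
    }
}

// Run a fixed number of sweeps, alternating between two buffers:
// the output of one sweep becomes the input of the next.
std::vector<double> jacobi(const TridiagonalMatrix& A,
                           const std::vector<double>& b,
                           std::vector<double> x, int sweeps) {
    std::vector<double> x_new(x.size());
    for (int k = 0; k < sweeps; ++k) {
        jacobi_sweep(A, b, x, x_new);
        std::swap(x, x_new);
    }
    return x;
}

// Maximum norm of the residual b - A*x, useful as a convergence check.
double residual_max_norm(const TridiagonalMatrix& A,
                         const std::vector<double>& b,
                         const std::vector<double>& x) {
    const std::size_t n = A.diag.size();
    assert(b.size() == n && x.size() == n);
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        double ax = A.diag[i] * x[i];
        if (i > 0)     ax += A.lower[i] * x[i - 1];
        if (i + 1 < n) ax += A.upper[i] * x[i + 1];
        worst = std::fmax(worst, std::fabs(b[i] - ax));
    }
    return worst;
}
```

Calling jacobi(A, b, x0, sweeps) with a diagonally dominant A and an initial guess x0 returns the iterate after the given number of sweeps, and residual_max_norm shows how far that iterate still is from solving the system. Storing the matrix by bands rather than by rows keeps every array access contiguous, which is one plausible reason a band-wise layout suits this kind of hardware; the tutorial's own two data layouts may differ from this sketch.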