ARCS 2008 GPGPU and CUDA Tutorials

Welcome to the supporting web page of the ARCS 2008 GPGPU and CUDA tutorials.

The graphics processing unit (GPU) on today's commodity video cards has evolved into an extremely powerful and flexible processor. GPUs provide superior memory bandwidth and computational horsepower, creating interest in them beyond the field of computer graphics. GPGPU stands for "General-Purpose Computation on GPUs". Researchers have found that exploiting the GPU can accelerate some non-graphics problems by more than an order of magnitude compared to the CPU. Several high-level languages have emerged for graphics hardware, making this computational power accessible.

However, significant barriers still exist for the developer who wishes to use the inexpensive power of GPUs. These chips are designed for and driven by video game development; the programming model is unusual, resources are tightly constrained, and the underlying architectures are largely secret. This course provides detailed coverage of general-purpose computation on graphics hardware. We emphasize core computational building blocks, such as linear algebra, and review the tools, perils, and tricks of the trade in GPU programming. Example applications and case studies are discussed and evaluated, with a special focus on large-scale GPU cluster computing. Common misconceptions and concerns about GPGPU (e.g. precision vs. accuracy) are addressed.

NVIDIA's CUDA is a new system for general-purpose computing on GPUs. CUDA is based on a new programming API that is entirely separate from the graphics driver. It uses the standard C language with extensions, and exposes new hardware features that are not available from OpenGL or Direct3D. The most important of these new features are shared memory, which can greatly improve the performance of bandwidth-limited applications, and an arbitrary load/store memory model, which enables many new algorithms that were previously difficult or impossible on the GPU.
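
To make the two features named above concrete, here is a minimal sketch rather than material taken from the tutorials. It assumes the CUDA runtime API and a C++ host compiler (nvcc). The names block_sum, scatter, gpu_sum, gpu_scatter and kThreads are hypothetical and chosen only for illustration, and error checking of the CUDA calls is omitted for brevity. The first kernel stages data in shared memory so that each input value is read from off-chip memory only once; the second writes to addresses computed at run time, which is what the arbitrary load/store model permits.

```cuda
#include <cuda_runtime.h>
#include <vector>

const int kThreads = 256;  // threads per block (assumption: a power of two)

// Shared memory: each block sums kThreads consecutive inputs in an on-chip tile.
__global__ void block_sum(const float* in, float* partial, int n) {
    __shared__ float tile[kThreads];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    tile[tid] = (i < n) ? in[i] : 0.0f;  // one global-memory read per thread
    __syncthreads();
    // Tree reduction carried out entirely in shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();
    }
    if (tid == 0) partial[blockIdx.x] = tile[0];
}

// Arbitrary store (scatter): the destination address is data-dependent.
__global__ void scatter(const float* in, const int* dst, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[dst[i]] = in[i];
}

// Host wrapper: sums a vector, adding the per-block partial sums on the CPU.
float gpu_sum(const std::vector<float>& h_in) {
    int n = static_cast<int>(h_in.size());
    if (n == 0) return 0.0f;
    int blocks = (n + kThreads - 1) / kThreads;
    float *d_in = 0, *d_partial = 0;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    block_sum<<<blocks, kThreads>>>(d_in, d_partial, n);
    std::vector<float> h_partial(blocks);
    cudaMemcpy(h_partial.data(), d_partial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);
    float total = 0.0f;
    for (int b = 0; b < blocks; ++b) total += h_partial[b];
    return total;
}

// Host wrapper: out[dst[i]] = in[i]. Assumes dst has the same length as in
// and is a permutation of 0..n-1, so that no two threads write the same slot.
std::vector<float> gpu_scatter(const std::vector<float>& h_in,
                               const std::vector<int>& h_dst) {
    int n = static_cast<int>(h_in.size());
    std::vector<float> h_out(n, 0.0f);
    if (n == 0) return h_out;
    int blocks = (n + kThreads - 1) / kThreads;
    float *d_in = 0, *d_out = 0;
    int* d_dst = 0;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMalloc(&d_dst, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_dst, h_dst.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    scatter<<<blocks, kThreads>>>(d_in, d_dst, d_out, n);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    cudaFree(d_dst);
    return h_out;
}
```

As a usage example, calling gpu_sum on a vector of 1000 ones should give 1000, and gpu_scatter with input {1, 2, 3} and destinations {2, 0, 1} should give {2, 3, 1}. A scatter like this is awkward to express in a fragment shader, where each invocation can only write to its own fixed output location.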

Date and Time

Monday, February 25, 2008, Faculty of Computer Science, TU Dresden

Tutorial organisers

Dominik Göddeke, Applied Mathematics, Dortmund University of Technology, Germany
Robert Strzodka, Max Planck Institut Informatik, Saarbrücken, Germany
Simon Green, NVIDIA, London, UK