Supercomputers on a chip
GPUs as Mathematical Coprocessors in Finite-Element Simulations
Workshop, Dortmund University, Dep. of Mathematics, April 11th 2005
Organization: Dominik Göddeke
Abstracts
-
Robert Strzodka (caesar, Bonn)
Introduction to Data-Stream-Based Processing on Graphics Processors
The performance gap between computational and memory logic has become the main obstacle to fast processing of large data sets. The von Neumann computing paradigm reinforces this problem by focusing on instruction processing rather than data processing. Graphics Processing Units (GPUs) have traditionally been optimized for high data throughput. In contrast to instruction-stream-based microprocessors, they follow a data-stream-based computing paradigm, which maximizes memory efficiency and exploits parallelism in situations where the same operation is applied to many data items. Current GPUs execute up to 128 floating point operations in one clock cycle, which is equivalent to 64 GFLOPS at 500 MHz. The presentation gives an introduction to the GPU programming model and shows how to utilize this parallel processing power.
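The peak figure quoted in this abstract follows from multiplying the operations per clock cycle by the clock rate. A minimal sketch of that arithmetic is given here; the function name `peak_gflops` and its parameters are illustrative assumptions, not part of the talk.

```python
def peak_gflops(flops_per_cycle: int, clock_mhz: float) -> float:
    """Theoretical peak rate in GFLOPS.

    Hypothetical helper for illustration only: it multiplies the number of
    floating point operations per clock cycle by the number of cycles per
    second, then converts operations per second to GFLOPS.
    """
    cycles_per_second = clock_mhz * 1e6
    return flops_per_cycle * cycles_per_second / 1e9


# Figures from the abstract: 128 operations per cycle at 500 MHz.
print(peak_gflops(128, 500))  # 64.0
```

This is a theoretical upper bound only; it says nothing about the rates that can be sustained once memory access is taken into account.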
-
Robert Strzodka (caesar, Bonn)
Scientific Computing on Graphics Processors - Examples in PDE based Image Processing
The hardware-accelerated solution of various PDE and minimization problems is presented. The algorithms are applied to image processing, but the same solvers can also be used for numerical simulations. Based on these applications, different techniques for efficiently utilizing the parallel processing power of graphics processors are discussed.
-
Christian Becker (University of Dortmund)
Hardware-oriented numerics and the FEAST framework
Current trends in software development for partial differential equations, and in particular for finite element (FEM) approaches, clearly point towards object-oriented techniques and adaptive methods in every sense.
Here, the employed data and solver structures, and especially the matrix structures, often conflict with modern hardware platforms. As a result, the observed computational efficiency is far below the expected peak rates of almost 4 GFLOP/s available today, and this "real life" gap will widen even further. Special techniques are therefore necessary to come closer to peak performance. Some of these techniques and their realization within the FEM package FEAST are discussed.
-
Dominik Göddeke (University of Dortmund)
The GPU as a FEM-coprocessor: Algorithmic design goals
When the GPU is used in a straightforward manner, outperforming the CPU is a task of moderate complexity once one gets the knack of it. Compared to the (measured) peak performance, however, the achieved GFLOP/s rates are disappointing. The experimental results presented clearly show what future research should focus on to achieve a reasonable fraction of the available performance reserves.
Slides