The goal of this tutorial is to explain the background and all necessary steps that are required to implement a simple linear algebra operator on the GPU: saxpy() as known from the BLAS library. For two vectors x and y of length N and a scalar value alpha, we want to compute a scaled vector-vector addition: y = y + alpha * x. The saxpy() operation requires almost no background in linear algebra, and serves well to illustrate all entry-level GPGPU concepts. The techniques and implementation details introduced in this tutorial can easily be extended to more complex calculations on GPUs.