We present an efficient method for simulating three-dimensional laminar fluid flows with free surfaces, including their interaction with moving rigid bodies, based on the two-dimensional shallow water equations and the Lattice-Boltzmann method. Our implementation targets multiple fundamentally different architectures, such as commodity multicore CPUs with SSE, GPUs, the Cell BE, and clusters. We show that our code scales well on an MPI-based cluster, that modern GPUs can achieve an eightfold speedup over multithreaded CPU code, and finally that it is possible to solve fluid-structure interaction scenarios with high resolution at interactive rates.