Parallel multigrid methods are among the most prominent tools for solving huge systems of (non-)linear equations arising from the discretisation of PDEs, for instance in Computational Fluid Dynamics (CFD). However, the quality of (parallel) multigrid methods with regard to numerical and computational complexity stands and falls with the smoothing algorithms ("smoothers") used. Since the inherently highly recursive character of many global smoothers (SOR, ILU) often impedes a direct parallelisation, the application of block smoothers is an alternative. However, because blocking weakens this recursive character and thereby the numerical efficiency, the resulting parallel efficiency may decrease in comparison to the sequential performance. In this paper, we show the consequences of such a strategy for the resulting total efficiency on the Hitachi SR8000-F1 when it is incorporated into the parallel CFD solver parpp3d++ for 3D incompressible flow. Moreover, we analyse the losses of parallel efficiency due to communication costs and reduced numerical efficiency on several modern parallel computer platforms.