The mathematical theory of positivity-preserving schemes is reviewed. Linear high-order methods are known to violate the positivity constraint in the vicinity of steep gradients. This can be rectified by adding artificial diffusion in order to dampen nonphysical oscillations. The `optimal` diffusion coefficients can be determined using the concept of an M-matrix. They are designed so as to eliminate all negative off-diagonal entries of the discrete transport operator stemming from the Galerkin discretization. The resulting method contains the least amount of artificial diffusion required to preserve positivity. Nevertheless, like any other monotone linear scheme, it is at most first-order accurate. The excessive numerical diffusion can be removed by using a nonlinear mixture of high- and low-order methods. This idea can be traced back to the flux-corrected-transport paradigm originating from finite differences and adapted to finite elements by L/``ohner [1]. Unfortunately, the classical FEM-FCT procedure is restricted to explicit time-stepping. The alternative formulation [2],[3] to be presented in this talk is applicable to both explicit and implicit time discretizations. For the latter ones the Galerkin method constitutes a usable high-order scheme without any extra stabilization by streamline diffusion. The proposed approach to the construction of the low-order scheme makes it possible to decompose the antidiffusive terms into internodal fluxes which can be limited in an essentially one-dimensional manner. In this talk the FCT framework developed for scalar problems is extended to nonlinear systems of conservation laws such as the Euler equations of gas dynamics. The unconditional stability / positivity of implicit FEM-FCT schemes comes at the expense of solving large algebraic systems in each time step. Depending on the size of the problem this can be accomplished efficiently by Krylov subspace or multigrid methods. The edge-by-edge algorithm devised for matrix assembly incorporates some elements of the 1D theory of hyperbolic systems but works independently of the underlying mesh and of the number of spatial dimensions. The scalar dissipation for the low-order method is proportional to the spectral radius of the Roe matrix. A number of examples for standard 2D-benchmark problems (shock tube problem, compression corner, Prandtl-Meyer expansion) demonstrate the potential of the new methodology.