Home » Software » FEATFLOW2 » Concepts » Block assembly

The block assembly

The block matrix assembly in FEAT2 is an alternative assembly strategy for block matrices. Instead of setting up each submatrix in a block matrix separately, this technique directly works on the complete block matrix, thus saving a lot of computational time for the precomputation of all the basis functions. However, for applying this technique, one has to know a bit more about the assembly of matrices and vectors.

In contrast to the "scalar" assembly, the "block" assembly does not have separate structures for a linear form, bilinear form or trilinear form. Instead, appropriate callback routines directly calculate local matrix/vector entries which are imposed by the block assembly routines into the global matrices/vectors. This involves some additional loops in the callback routines, and the callback routines directly have to deal with finite element basis functions.

	Contents
1	Involved modules and functionality
2	Basics
3	Providing nonlinearities / FE functions / temporary memory
4	Standard operators
5	Block matrices: Advanced operator assembly
6	Block vectors: Advanced vector assembly
7	Accessing FE functions and temporary arrays

Involved modules and functionality

The following modules are involved in the block assembly:

Module	Description
blockmatassemblybase.f90	Definition of structures
blockmatassembly.f90	The actual assembly routines
blockmatassemblystdop.f90	Implementation of standard operators with block assembly routines
feevaluation2.f90	Evaluation of finite element functions for nonlinearities

Currently, the block assembly routines in blockmatassembly.f90 realise the following functionalities:

Assembly of block matrices: Subroutine bma_buildMatrix

This subroutine realises the assembly of block matrices. A special callback routine has to be provided that calculates local matrix entries.
Assembly of block vectors: Subroutine bma_buildVector

This subroutine realises the assembly of block vectors. A special callback routine has to be provided that calculates local vector entries.
Assembly of integrals: Subroutines bma_buildIntegral and bma_buildIntegrals

These subroutines can be used to calculate one or multiple integrals or other distributed operators. They are usually used to compute $L_2$ or $H^1$ norms but can also be used to compute the $sup$ norm of a vector or vector field.

The assembly routines support scalar-values as well as vector-values finite element functions and can also deal with interleaved matrices and vectors. However, each functionality needs a special implementation in the corresponding callback routines, see below.

Basics

When a block assembly routine is invoked, the following steps are done:

The routine loops over sets of cells. On every cell, there exist a set of cubature points.
On every cell and in each cubature point on every cell, the assembly routine calculates the values of the basis functions of all finite element spaces that are involved in the assembly. This includes the basis functions that come from the submatrices in the block matrix as well as from the basis functions of all involved nonlinearities.
If there are finite element nonlinearities specified (e.g., a convection given by a finite element function), their values in all the cubature points are calculated.
Next, a callback routine is invoked. This callback routine has to compute the local matrices/vectors/integrals in all the cubature points on all the elements given.
Finally, the assembly routines collect the calculated local matrix/vector/integral contributions and form a global matrix/vector/integral from them.

Simple block matrix assembly

In the simplest form, calling the matrix assembly reads as follows:

type(t_matrixBlock) :: rmatrix
...
call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,fcalcLocalMatrices)

The routine fcalcLocalMatrices is a user-defined callback routine that calculates the local matrix contributions.

Simple block vector assembly

In the simplest form, calling the vector assembly reads as follows:

type(t_vectorBlock) :: rrhs
...
call bma_buildVector (rrhs,BMA_CALC_STANDARD,fcalcLocalVectors)

The routine fcalcLocalVectors is a user-defined callback routine that calculates the local vector contributions.

Simple integral assembly: General integral

In the simplest form, calling the integral assembly reads as follows:

type(t_scalarCubatureInfo), target :: rmyCubature
real(DP) :: dintvalue
...
call bma_buildIntegral (dintvalue,BMA_CALC_STANDARD,fcalcLocalIntegral,&
    rtriangulation,rcubatureInfo=rmyCubature)

The routine fcalcLocalIntegral is a callback routine that calculates the local integral contribution on a set of cells. rtriangulation specifies the underlying mesh and rmyCubature the cubature formula to be used. The result of the integration is saved to dintvalue.

The alternative routine bma_buildIntegrals cab be used to calculate multiple integrals at once. For example,

type(t_scalarCubatureInfo), target :: rmyCubature
real(DP), dimension(3) :: Dintvalues
...
call bma_buildIntegrals (Dintvalue,BMA_CALC_STANDARD,fcalcLocalIntegral,&
    rtriangulation,rcubatureInfo=rmyCubature)

calculates three domain integrals at once into Dintvalues(:). The corresponding local contributions are calculated by fcalcLocalIntegral. rmyCubature defines the cubature rule to be used on the mesh rtriangulation.

Remark: This form of the integration does not necessitate any finite element function and is rather seldom used. The original idea of the integral assembly is to calculate norms which involves finite element functions. For such an application, the routine is used in a slightly different manner.

Providing nonlinearities / FE functions / temporary memory

All assembly routines can be called with an optional parameter revalVectors. In combination with the module feevaluation2.f90, this parameter offers the possibility to specify finite element functions that live on the same mesh as on which the assembly is applied. All finite element functions given in this structure are automatically evaluated in all cubature points, and the calculated values are provided to the callback routines for further processing.

Example: A discrete stationary Oseen equation has the form $$-Delta u_h + v_h nabla u_h + nabla p_h = f_h$$ with $u_h$, $v_h$ and $f_h$ vector fields. The operator $n(v_h,cdot):=v_h nabla cdot$ needs a finite element function $v_h$ for the assembly, which specifies the direction of a flow. Such a function can be provided to the callback routines via revalVectors.

Let a finite element function $v_h$ be given as a block vector structure rvectorVh. This vector can be provided to the assembly routines as follows. We demonstrate the approach for the matrix assembly, however, for vector or integral assembly, the approach is exactly the same.

Providing a scalar function: Let $v_h=(v_1)$ be a block vector containing only one subvector. Such a subvector is provided to the assembly routines as follows:

type(t_fev2Vectors) :: rmyVectors
type(t_vectorBlock) :: rvectorVh

type(t_matrixBlock) :: rmatrix
...
call fev2_addVectorToEvalList(rmyVectors,rvectorVh%RvectorBlock(1),0)

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)

call fev2_releaseVectorList (rmyVectors)

There is no special initialisation routine for the structure rmyVectors. One simply adds all scalar components of the vector $v_h$ to rmyVectors and provides this structure to the bma_buildXXX routine. The assembly routine will evaluate the provided functions automatically.

Derivatives: The last parameter nmaxDerivative in the routine fev2_addVectorToEvalList is a derivative quantifier. It advises the evaluation routines what to evaluate. nmaxDerivative=0 (as specified above) calculates the function values in the cubature points, =1 the 1st derivatives, =2 the 2nd derivatives and so on. This can be used for more complicated operators which also involve derivatives of nonlinearities.

The following example assumes that $v_h=(v_1,v_2,v_3)$ has multiple components. It adds all components, one by one, and advises the assembly routine to calculate also their derivatives in the cubature points. So in the callback function the values of $v_h$ as well as the values of $nabla v_h$ are available.

type(t_fev2Vectors) :: rmyVectors
type(t_vectorBlock) :: rvectorVh

type(t_matrixBlock) :: rmatrix
...
call fev2_addVectorToEvalList(rmyVectors,rvectorVh%RvectorBlock(1),1)
call fev2_addVectorToEvalList(rmyVectors,rvectorVh%RvectorBlock(2),1)
call fev2_addVectorToEvalList(rmyVectors,rvectorVh%RvectorBlock(3),1)

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)

call fev2_releaseVectorList (rmyVectors)

Providing vector fields: Let $v_h=(v_1,v_2)$ be a block vector describing a vector field (for example, a velocity vector field of a flow in 2D). The following example provides the vector field to the evaluation routine via the function fev2_addVectorFieldToEvalList. This subroutine expects all components of the vector field in the parameters and allows 1D, 2D and 3D vector fields:

type(t_fev2Vectors) :: rmyVectors
type(t_vectorBlock) :: rvectorVh

type(t_matrixBlock) :: rmatrix
...
call fev2_addVectorFieldToEvalList(rmyVectors,0,&
    rvectorVh%RvectorBlock(1),rvectorVh%RvectorBlock(2))

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)

call fev2_releaseVectorList (rmyVectors)

Again, the parameter nmaxDerivative (which is set to nmaxDerivative=0 in this example) specifies in how far derivatives from the vector field should be calculated. A call in the form

...
call fev2_addVectorFieldToEvalList(rmyVectors,1,&
    rvectorVh%RvectorBlock(1),rvectorVh%RvectorBlock(2))
...

would calculate $v_h$ as well as $nabla v_h$ and provide it to the callback routine.

Providing temporary memory: Additionally to vectors and vector fields, the user can also specify "dummy vectors" and "dummy vector fields" in the t_fev2Vectors structure. With this technique, it is possible to provide temporary memory in all cubature points on all elements to callback routines. Such temporary memory is needed in complicated operators for intermediate calculations which cannot be done in one simple step.

Example: A non-Newtonian fluid following the Power law model is given by the equation $$-nu(x)Delta u + u nabla u + nabla p = f$$ with a nonlinear viscosity $$nu(x) = nu_0 z^{e/2 - 1},$$ $e,,nu_0inmathbb{R}$ a constant and $z$ be given by $$z=||D(u)||^2+varepsilon,qquad D(u)=frac{1}{2} (nabla u + nabla u^T),$$ with $varepsiloninmathbb{R}$ a constant. Computation routines for this model at first evaluate $nabla u$ and $nabla u^T$, then $||D(u)||$ and at the end $nu(x)$. For the intermediate steps, some temporary memory is necessary which can be provided by "dummy vectors".

The following routines provide temporary memory:

fev2_addDummyVectorToEvalList

provides one or multiple temporary arrays with memory in every cubature point on every element. The following example adds one temporary array to the structure rmyVectors.

type(t_fev2Vectors) :: rmyVectors
type(t_matrixBlock) :: rmatrix
...
call fev2_addDummyVectorToEvalList(rmyVectors)

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)
...

fev2_addDummyVecFieldToEvalList

provides temporary memory for one or multiple vector fields with memory in every cubature point on every element. The following example adds one temporary vector field with 2 vector components to the structure rmyVectors.

type(t_fev2Vectors) :: rmyVectors
type(t_matrixBlock) :: rmatrix
...
call fev2_addDummyVecFieldToEvalList(rmyVectors,2)

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)
...

Using the parameter nsubarrays allows to reserve also some subarrays associated to this temporary array. This is used, e.g., to allocate memory for a vector field as well as its derivatives. The following example allocates temporary memory for a temporary vector field $v_h=(v_1,v_2)$ and its derivative $nabla v_h$ in all cubature points on all elements. We need nsubarrays=3, corresponding to the three terms $v_h$, $partial_x v_h$ and $partial_y v_h$.

type(t_fev2Vectors) :: rmyVectors
type(t_matrixBlock) :: rmatrix
...
call fev2_addDummyVecFieldToEvalList(rmyVectors,2,3)

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)
...

fev2_addCellDummyToEvalList

provides one or multiple temporary arrays with memory for one information on every element (cell based information). The following example adds one temporary cell based array to the structure rmyVectors.

type(t_fev2Vectors) :: rmyVectors
type(t_matrixBlock) :: rmatrix
...
call fev2_addCellDummyToEvalList(rmyVectors)

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)
...

fev2_addDummyVecFieldToEvalList

provides temporary memory for one or multiple vector fields with memory for one information on every element (cell based information). The following example adds one temporary vector field with 2 cell based vector components to the structure rmyVectors.

type(t_fev2Vectors) :: rmyVectors
type(t_matrixBlock) :: rmatrix
...
call fev2_addCellVecFieldDummy(rmyVectors,2)

call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
    fcalcLocalMatrices,revalVectors=rmyVectors)
...

Standard operators

The module blockmatassemblystdop.f90 provides a lot of standard implementations for the most important standard operators - matrices, vectors and integrals. These implemetations are given as callback routines which can either be provided to the bma_buildXXXX routines directly, or which can be called inside of a callback function. The following naming conventions are used here:

bma_fcalc_XXXX: Routines with this name can be used as callback function in a call to bma_buildXXXX. Parameters are passed via a t_collection structure and the vector evaluation structure revalVectors. There is no need for writing a user-defined callback function. It is a wrapper for the corresponding bma_docalc_XXXX subroutine.
bma_docalc_XXXX: This is the immediate version of the bma_fcalc_XXXX routine from above. Routines of type bma_docalc_XXXX can be called inside of a callback function of bma_buildXXXX to calculate a specific operator. This allows to easily compute multiple operators in one step.

Using `bma_fcalc_XXXX` for simple calculations

With the routines bma_fcalc_XXXX from blockmatassemblystdop.f90 one can easily assemble all types of matrices with very few commands. Parameters for these routines are passed via the "QuickAccess" arrays of the collection structure in a specified manner. In the following, we give some examples how these routines are used.

Assembly of a Laplace matrix

The following code calculates a Laplace matrix at position (ix,iy) of a block matrix. We specify "1.0" as multiplier in front of the Laplace matrix:

type(t_matrixBlock) :: rmatrix
type(t_collection) :: rcoll
...
  rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
  rcoll%IquickAccess(1) = ix
  rcoll%IquickAccess(2) = iy

  call bma_buildMatrix (&
      rmatrix,BMA_CALC_STANDARD,bma_fcalc_laplace, rcoll)
...

Assembly of a tensor Laplace matrix

The following code calculates a tensor Laplace matrix at position (1,1) of a block matrix - i.e., a $dtimes d$ block matrix with the Laplace operator on the diagonal. The dimension $d$ of the tensor is taken from the underlying triangulation. We specify "1.0" as multiplier in front of the tensor Laplace matrix and use a tensor size of 2. This is typically used for the Stokes equations in 2D where the Stokes operator is exactly a $2times 2$ block matrix with Laplace on the diagonal:

type(t_matrixBlock) :: rmatrix
type(t_collection) :: rcoll
...
  rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
  rcoll%IquickAccess(1) = 1
  rcoll%IquickAccess(2) = 1
  rcoll%IquickAccess(3) = 2

  call bma_buildMatrix (&
      rmatrix,BMA_CALC_STANDARD,bma_fcalc_laplaceTensor, rcoll)
...

Assembly of a Mass matrix

The following code calculates a mass matrix at position (ix,iy) of a block matrix. We specify "1.0" as multiplier in front of the Laplace matrix:

type(t_matrixBlock) :: rmatrix
type(t_collection) :: rcoll
...
  rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
  rcoll%IquickAccess(1) = ix
  rcoll%IquickAccess(2) = iy

  call bma_buildMatrix (&
      rmatrix,BMA_CALC_STANDARD,bma_fcalc_mass, rcoll)
...

A tensor mass matrix can be assembled via bma_fcalc_massTensor, similar to the tensor Laplace matrix above.

Assembly of a "scalar" convection operator

The following code assembles the convection operator $n(u_h,v_h,w_h)=( (u_h nabla) v_h, w_h )$ at position (1,1) of a block matrix with convection (2.0,3.0). We apply the operator only to the scalar matrix at position (1,1) which realises the convection in a convection-diffusion-reaction equation of type $$-Delta u + u nabla u + u = f.$$

type(t_matrixBlock) :: rmatrix
type(t_collection) :: rcoll
...
  rcoll%IquickAccess(1) = 1  ! x-position
  rcoll%IquickAccess(2) = 1  ! y-position
  rcoll%IquickAccess(3) = 0  ! scalar destination
  rcoll%IquickAccess(4) = 0  ! constant direction

  rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
  rcoll%DquickAccess(2) = 2.0_DP  ! x-direction
  rcoll%DquickAccess(3) = 3.0_DP  ! y-direction

  call bma_buildMatrix (&
      rmatrix,BMA_CALC_STANDARD,bma_fcalc_convec_ugradvw, rcoll)
...

Assembly of a convection operator

The following code assembles the convection operator $n(u_h,v_h,w_h)=( (u_h nabla) v_h, w_h )$ at position (1,1) of a block matrix. The nonlinearity $u_h$ is given as a finite element function, realised by the block vector rvelocity. We apply the nonlinearity to the full velocity tensor; the dimension is derived from the underlying triangulation, which is assumed to be 2D here.

type(t_matrixBlock) :: rmatrix
type(t_vectorBlock) :: rvelocity

type(t_fev2Vectors) :: rmyVectors
type(t_collection) :: rcoll
...
  ! Parameters
  rcoll%IquickAccess(1) = 1  ! x-position
  rcoll%IquickAccess(2) = 1  ! y-position
  rcoll%IquickAccess(3) = 1  ! tensor destination
  rcoll%IquickAccess(4) = 1  ! nonconstant direction

  rcoll%DquickAccess(1) = 1.0_DP  ! multiplier

  ! Provide the vector field
  call fev2_addVectorFieldToEvalList(rmyVectors,0,&
      rvelocity%RvectorBlock(1),rvelocity%RvectorBlock(2))

  call bma_buildMatrix (&
      rmatrix,BMA_CALC_STANDARD,bma_fcalc_convec_ugradvw, rcoll,&
      revalVectors = rmyVectors)

  call fev2_releaseVectorList (rmyVectors)
...

Assembly of a constant RHS vector

The following code assembles a RHS vector to the constant RHS function $f=1$. The RHS is assumed to have only one component:

type(t_matrixBlock) :: rrhs
type(t_collection) :: rcoll
...
  ! Parameters
  rcoll%IquickAccess(1) = 1       ! Number of components
  rcoll%DquickAccess(1) = 1.0_DP  ! The value of the RHS function

  call bma_buildVector (&
      rmatrix,BMA_CALC_STANDARD,bma_fcalc_rhsConst, rcoll)
...

Assembly of a $||u_h||_{L_2}$

The following code assembles the $L_2$ norm of a finite element function $u_h=(u_1)$ given by rx.

type(t_vectorBlock) :: rx
type(t_fev2Vectors) :: rmyVectors
real(DP) :: dintegral
...
  ! Provide the vector
  call fev2_addVectorToEvalList(rmyVectors,rx%RvectorBlock(1),0)

  call bma_buildIntegral (&
      dintvalue,BMA_CALC_STANDARD,bma_fcalc_L2norm,revalVectors=rmyVectors)

  call fev2_releaseVectorList (rmyVectors)
...

Remark: If a finite element function is specified via revalVectors, a triangulation, boundary definition and cubature rule do not have to be specified. Although it may be advisable top specify a cubature rule, triangulation and boundary description is taken from the first vector in the vector list. This simplifies the call to bma_buildIntegral.

Assembly of a $||u_h||_{H^1}$

The following code assembles the $H^1$ norm of a finite element function $u_h=(u_1,u_2)$ given by rx. The callback routine needs the 1st derivatives of the function $u_h$, and thus, nmaxDerivative=1 is specified in fev2_addVectorToEvalList:

type(t_vectorBlock) :: rx
type(t_fev2Vectors) :: rmyVectors
real(DP) :: dintegral
...
  ! Provide the vector and its derivative
  call fev2_addVectorToEvalList(rmyVectors,rx%RvectorBlock(1),1)
  call fev2_addVectorToEvalList(rmyVectors,rx%RvectorBlock(2),1)

  call bma_buildIntegral (&
      dintvalue,BMA_CALC_STANDARD,bma_fcalc_H1norm,revalVectors=rmyVectors)

  call fev2_releaseVectorList (rmyVectors)
...

Assembly of $||u_h||_{infty}$

The following call uses an extended syntax of bma_buildIntegral to calculate the $sup$ norm of a finite element function. Normally, bma_buildIntegral calculates integrals by summing up the contributions of all subdomains. Modifying the optional parameter coperation, bma_buildIntegral can be told to compute the maximum of the values of the subdomains. In combination with bma_fcalc_MAXnorm, this allows to compute the maximum norm:

type(t_vectorBlock) :: rx
type(t_fev2Vectors) :: rmyVectors
real(DP) :: dmaxnorm
...
  ! Provide the vector
  call fev2_addVectorToEvalList(rmyVectors,rx%RvectorBlock(1),0)

  call bma_buildIntegral (&
      dintvalue,BMA_CALC_STANDARD,bma_fcalc_MAXnorm,&
      revalVectors=rmyVectors,coperation=BMA_INT_MAX)

  call fev2_releaseVectorList (rmyVectors)
...

Using `bma_docalc_XXXX` in callback routines

For more complex matrices and vectors, the user has to provide a user-defined callback routine that does the assembly. The actual calculation routines bma_docalc_XXXX are designed to be called in such user-defined callback routines for the assembly of standard operators. Using these routines allows to assemble a lot of standard operators with high efficiency.

The following example demonstrates how to set up the operator $text{DR}:=-Delta + I$ at position (1,1) of a block matrix. This operator realises the left-hand side of the diffusion-reaction equation $-Delta u + u = f$:

subroutine fcalc_DR(RmatrixData,rassemblyData,rmatrixAssembly,&
    npointsPerElement,nelements,revalVectors,rcollection)
...

  call bma_docalc_laplace(RmatrixData,rassemblyData,rmatrixAssembly,&
      npointsPerElement,nelements,1.0_DP,1,1)

  call bma_docalc_mass(RmatrixData,rassemblyData,rmatrixAssembly,&
      npointsPerElement,nelements,1.0_DP,1,1)

end subroutine

  ...
  type(t_matrixBlock) :: rmatrix
  ...
  call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,fcalc_DR)
  ...

The callback routine just calls the bma_docalc_XXXX routines for all operators to be assembled. For more complicated operators which probably involve a nonlinearity, parameters must be passed to the user-defined callback routine via the collection structure and the vector evaluation structure revalVectors. The callback routine can then use this information and call the predefined calculation routines.

The following example demonstrates this approach. The callback routine assembles the operator for the diffusion-convection-equation $$-nuDelta u_h + v_h nabla u_h = f_h.$$ The convection is given as a finite element function $v_h$ which is available inside of the callback function via revalVectors%p_RvectorData(1). The constant $nu$ is passed via the DquickAccess array of the collection.

subroutine fcalc_DC(RmatrixData,rassemblyData,rmatrixAssembly,&
    npointsPerElement,nelements,revalVectors,rcollection)
...
  real(DP) :: dnu

  ! Get the parameter
  dnu = rcollection%DquickAccess(1)

  ! Calculate nu*Laplace
  call bma_docalc_laplace(RmatrixData,rassemblyData,rmatrixAssembly,&
      npointsPerElement,nelements,dnu,1,1)

  ! Calculate the convection
  call bma_docalc_convec_ugradvw(RmatrixData,rassemblyData,rmatrixAssembly,&
      npointsPerElement,nelements,1.0_DP,1,1,&
      rvectorField=revalVectors%p_RvectorData(1))

end subroutine

  type(t_matrixBlock) :: rmatrix
  type(t_vectorBlock) :: rvectorVh

  type(t_fev2Vectors) :: rmyVectors
  type(t_collection) :: rcoll
  ...
    ! Parameters
    rcoll%DquickAccess(1) = nu  ! multiplier

    ! Provide the nonlinearity
    call fev2_addVectorToEvalList(rmyVectors,rvectorVh,0)

    call bma_buildMatrix (&
        rmatrix,BMA_CALC_STANDARD,fcalc_DC, rcoll,&
        revalVectors = rmyVectors)

    call fev2_releaseVectorList (rmyVectors)
  ...

Block matrices: Advanced operator assembly

For complex operators, it is necessary to use an appropriately designed callback routine for the matrix assembly. While standard operators can be assembled with the techniques above, the implementation of complex operators involves loops over all elements, cubature points as well as trial and test basis functions - and probably, the dimension of the FE space. However, there is no such a "general" case as it is for the "scalar" assembly. For optimal efficiency and a maximum of generality, one has to implement an operator four times, once for each of the following cases:

Scalar valued trial/test FE space
Vector valued trial/test FE space
Scalar valued, interleaved trial/test FE space
Vector valued, interleaved trial/test FE space

The four implementations are rather similar, however, differ in some small parts of the loops. One should note firthermore that some operators cannot be implemented in all cases. The user has to decide on which implementation is to choose for the desired case.

Remark: Also the default implementations in blockmatassemblystdop.f90 do not cover all the above four cases for all types of finite element spaces. Most implementations cover only case 1 while only the most important standard operators (like Laplace and Mass) are designed for all cases. For the implementation of a new operator, the user is advised to do copy&paste the code for case 1 and later generalise this if necessary.

Input/output of the callback routine

The callback routine used for the calculation of matrices has the following interface:

subroutine fcalcLocalMatrices(RmatrixData,rassemblyData,rmatrixAssembly,&
    npointsPerElement,nelements,revalVectors,rcollection)

The variables have the following meaning:

Variable	Description
RmatrixData	An $mtimes n$ array of `t_bmaMatrixData` structures, one for each submatrix in the global matrix
rmatrixData	Data necessary for the assembly. Contains determinants and cubature weights for the cubature,...
rmatrixAssembly	Structure with all data about the assembly
npointsPerElement	Number of points per element
nelements	Number of elements
revalVectors	Provided FEM routines, encapsules nonlinearities and temporary memory
rcollection	Collection structure for user-defined parameters

The callback routine has to use the values of the finite element trial and test spaces, cubature points/weights and the values of all nonlinearities/coefficients to compute local matrices for a set of nelements elements. The main assembly routine will later on impose these local matrices into the global matrix. Depending on which of the above four cases the callback routine is used four, different variables of the above parameters can be used and have to be written to. Here a small overview about input/output variables in the four cases.

Case 1: Scalar valued trial/test FE space

The following parameters hold:

Variable	Input/Output	Description
RmatrixData(:,:)%p_Dentry	Output	The local matrix entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RmatrixData(:,:)%p_DbasTrial		For all matrices: values of the corresponding trial basis functions in all cubature points
RmatrixData(:,:)%p_DbasTest		For all matrices: values of the corresponding test basis functions in all cubature points
RmatrixData(:,:)%ndofTrial		Number of degrees of freedom in the trial space
RmatrixData(:,:)%ndofTest		Number of degrees of freedom in the test space

RmatrixData(:,:)%ndimfeTrial	Input	=1, indicates scalar-valued trial FE space
RmatrixData(:,:)%ndimfeTest		=1, indicates scalar-valued test FE space
RmatrixData(:,:)%bisInterleaved		=.false. here, the matrix is not interleaved

p_Dentry (1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTrial (1:ndofTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative.

Case 2: Vector valued trial/test FE space

The following parameters hold:

Variable	Input/Output	Description
RmatrixData(:,:)%p_Dentry	Output	The local matrix entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RmatrixData(:,:)%p_DbasTrial		For all matrices: values of the corresponding trial basis functions in all cubature points
RmatrixData(:,:)%p_DbasTest		For all matrices: values of the corresponding test basis functions in all cubature points
RmatrixData(:,:)%ndofTrial		Number of degrees of freedom in the trial space
RmatrixData(:,:)%ndofTest		Number of degrees of freedom in the test space
RmatrixData(:,:)%ndimfeTrial		>1, Dimension of the trial FE space
RmatrixData(:,:)%ndimfeTest		>1, Dimension of the test FE space

RmatrixData(:,:)%bisInterleaved	Input	=.false. here, the matrix is not interleaved

p_Dentry (1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTrial (1:ndofTrial*ndimfeTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative. More precisely
- p_DbasTrial (1:ndofTrial + 0*ndofTrial , :,:,:) = 1st dimension, trial space
- p_DbasTrial (1:ndofTrial + 1*ndofTrial , :,:,:) = 2nd dimension, trial space
- p_DbasTrial (1:ndofTrial + 2*ndofTrial , :,:,:) = 3rd dimension, trial space, ...
- p_DbasTest (1:ndofTest + 0*ndofTest , :,:,:) = 1st dimension, test space
- p_DbasTest (1:ndofTest + 1*ndofTest , :,:,:) = 2nd dimension, test space
- p_DbasTest (1:ndofTest + 2*ndofTest , :,:,:) = 3rd dimension, test space, ...

Case 3: Scalar valued, interleaved trial/test FE space

The following parameters hold:

Variable	Input/Output	Description
RmatrixData(:,:)%p_DentryIntl	Output	The local matrix entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RmatrixData(:,:)%p_DbasTrial		For all matrices: values of the corresponding trial basis functions in all cubature points
RmatrixData(:,:)%p_DbasTest		For all matrices: values of the corresponding test basis functions in all cubature points
RmatrixData(:,:)%ndofTrial		Number of degrees of freedom in the trial space
RmatrixData(:,:)%ndofTest		Number of degrees of freedom in the test space
RmatrixData(:,:)%nvar		Number of variables per matrix entry

RmatrixData(:,:)%ndimfeTrial	Input	=1, indicates scalar-valued trial FE space
RmatrixData(:,:)%ndimfeTest		=1, indicates scalar-valued test FE space
RmatrixData(:,:)%bisInterleaved		=.true. here, the matrix is interleaved

p_DentryIntl (1:nvar, 1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTrial (1:ndofTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative.

Case 4: Vector valued, interleaved trial/test FE space

The following parameters hold:

Variable	Input/Output	Description
RmatrixData(:,:)%p_DentryIntl	Output	The local matrix entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RmatrixData(:,:)%p_DbasTrial		For all matrices: values of the corresponding trial basis functions in all cubature points
RmatrixData(:,:)%p_DbasTest		For all matrices: values of the corresponding test basis functions in all cubature points
RmatrixData(:,:)%ndofTrial		Number of degrees of freedom in the trial space
RmatrixData(:,:)%ndofTest		Number of degrees of freedom in the test space
RmatrixData(:,:)%ndimfeTrial		>1, Dimension of the trial FE space
RmatrixData(:,:)%ndimfeTest		>1, Dimension of the test FE space
RmatrixData(:,:)%nvar		Number of variables per matrix entry

RmatrixData(:,:)%bisInterleaved	Input	=.true. here, the matrix is interleaved

p_DentryIntl (1:nvar, 1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTrial (1:ndofTrial*ndimfeTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative. More precisely
- p_DbasTrial (1:ndofTrial + 0*ndofTrial , :,:,:) = 1st dimension, trial space
- p_DbasTrial (1:ndofTrial + 1*ndofTrial , :,:,:) = 2nd dimension, trial space
- p_DbasTrial (1:ndofTrial + 2*ndofTrial , :,:,:) = 3rd dimension, trial space, ...
- p_DbasTest (1:ndofTest + 0*ndofTest , :,:,:) = 1st dimension, test space
- p_DbasTest (1:ndofTest + 1*ndofTest , :,:,:) = 2nd dimension, test space
- p_DbasTest (1:ndofTest + 2*ndofTest , :,:,:) = 3rd dimension, test space, ...

Basic structure of the callback routine

The actual structure of the callback routine is now demonstrated on an example:

Example: The following example demonstrates how to set up a Laplace operator in 2D at position (1,1) of a block matrix, here for scalar-valued FEM spaces like $Q_1$, $Q_2$, etc:

    subroutine fcalc_laplace(RmatrixData,rassemblyData,rmatrixAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)

     ...

      real(DP) :: dbasIx, dbasJx, dbasIy, dbasJy
      integer :: iel, icubp, idofe, jdofe, ndofTrial, ndofTest
      real(DP), dimension(:,:,:), pointer :: p_DlocalMat
      real(DP), dimension(:,:,:,:), pointer :: p_DbasTrial,p_DbasTest
      real(DP), dimension(:,:), pointer :: p_DcubWeight

      ! Get cubature weights data
      p_DcubWeight => rassemblyData%p_DcubWeight

      ! Get local data
      ndofTrial = RmatrixData(1,1)%ndofTrial
      ndofTest  = RmatrixData(1,1)%ndofTrial
      p_DbasTrial => RmatrixData(1,1)%p_DbasTrial
      p_DbasTest  => RmatrixData(1,1)%p_DbasTest
      p_DlocalMat => RmatrixData(1,1)%p_Dentry

      ! Loop over the elements in the current set.
      do iel = 1,nelements

        ! Loop over all cubature points on the current element
        do icubp = 1,npointsPerElement

          ! Loop over the test functions
          do idofe=1,ndofTest

            ! Get the values of the test basis functions
            dbasIx = p_DbasTest(idofe,DER_DERIV2D_X,icubp,iel)
            dbasIy = p_DbasTest(idofe,DER_DERIV2D_Y,icubp,iel)

            ! Loop over the trial basis functions
            do jdofe=1,ndofTrial

              ! Get the values of the trial basis functions
              dbasJx = p_DbasTrial(jdofe,DER_DERIV2D_X,icubp,iel)
              dbasJy = p_DbasTrial(jdofe,DER_DERIV2D_Y,icubp,iel)

              ! Calculate the local matrix entries
              p_DlocalMat(jdofe,idofe,iel) = p_DlocalMat(jdofe,idofe,iel) + &
                  p_DcubWeight(icubp,iel) * ( dbasJx*dbasIx + dbasJy*dbasIy )

            end do ! jdofe

          end do ! idofe

        end do ! icubp

      end do ! iel

    end subroutine

In the above example it can be seen that setting up a Laplace matrix needs a callback routine with altogether four nested do loops:

An outer DO loop over elements
An inner DO loop over cubature points
For every cubature point, an inner loop over the DOFs in the test space
For every DOF in the test space, an inner loop over the DOFs in the trial space

In the innerst loop, the actual matrix entries are computed using cubature. The variables dbasIx, dbasIy, dbasJx and dbasJy fetch the values of the basis functions in the cubature points. The sum in the innerst loop

...
  p_DlocalMat(...) = p_DlocalMat(...) + &
      p_DcubWeight(...) * ( dbasJx*dbasIx + dbasJy*dbasIy )
...

realises the summation $$sum_{x_k}sum_{i,j} omega(x_k) (partial_{x_1}psi_j(x_k) partial_{x_1}varphi_i(x_k) + partial_{x_2}psi_j(x_k) partial_{x_2}varphi_i(x_k) )$$ with $x_k$ the cubature points, $psi_j$ the local test functions and $varphi_i$ the local trial functions on the cell $T$. This is the discrete counterpart to the integral $$(nabla psi_j, nabla varphi_i)T = int_T partial{x_1}psi_j partial_{x_1}varphi_i + partial_{x_2}psi_j partial_{x_2}varphi_i$$ which realises the local Laplace operator on a cell $T$ in 2D. Summing up the contributions of all elements gives the global Laplace operator.

Remark: For the actual assembly, the order of the summation is changed. The above loops do not calculate $(nabla psi_j, nabla varphi_i)_T$ but only its contribution in one cubature point for one combination of trial/test functions, summing up later. In this order, the assembly is faster.

Block vectors: Advanced vector assembly

THe assembly of complex right-hand side vectors is rather similar to the assembly of complex vectors. One needs an appropriately designed callback routine which applies loops over all elements, cubature points as well as trial and test basis functions. For optimal efficiency and a maximum of generality, also here one has to implement an operator four times, once for each of the following cases:

Scalar valued trial/test FE space
Vector valued trial/test FE space
Scalar valued, interleaved trial/test FE space
Vector valued, interleaved trial/test FE space

Input/output of the callback routine

The callback routine used for the calculation of matrices has the following interface:

subroutine fcalcLocalVectors(rvectorData,rassemblyData,rvectorAssembly,&
    npointsPerElement,nelements,revalVectors,rcollection)

The variables have the following meaning:

Variable	Description
RvectorData	An array of `t_bmaVectorData` structures, one for each subvector in the global vector
rassemblyData	Data necessary for the assembly. Contains determinants and cubature weights for the cubature,...
rvectorAssembly	Structure with all data about the assembly
npointsPerElement	Number of points per element
nelements	Number of elements
revalVectors	Provided FEM routines, encapsules nonlinearities and temporary memory
rcollection	Collection structure for user-defined parameters

The callback routine has to use the values of the finite element test spaces, cubature points/weights and the values of all nonlinearities/coefficients to compute local vectors for a set of nelements elements. The main assembly routine will later on impose these local vectors into the global vector. Depending on which of the above four cases the callback routine is used four, different variables of the above parameters can be used and have to be written to. Here a small overview about input/output variables in the four cases.

Case 1: Scalar valued test FE space

The following parameters hold:

Variable	Input/Output	Description
RvectorData(:)%p_Dentry	Output	The local vector entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RvectorData(:)%p_DbasTest		For all vectors: values of the corresponding test basis functions in all cubature points
RvectorData(:)%ndofTest		Number of degrees of freedom in the test space

RvectorData(:)%ndimfeTest	Input	=1, indicates scalar-valued test FE space
RvectorData(:)%bisInterleaved		=.false. here, the vector is not interleaved

p_Dentry (1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative.

Case 2: Vector valued test FE space

The following parameters hold:

Variable	Input/Output	Description
RvectorData(:)%p_Dentry	Output	The local vector entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RvectorData(:)%p_DbasTest		For all vectors: values of the corresponding test basis functions in all cubature points
RvectorData(:)%ndofTest		Number of degrees of freedom in the test space
RvectorData(:)%ndimfeTest		>1, Dimension of the test FE space

RvectorData(:)%bisInterleaved	Input	=.false. here, the vector is not interleaved

p_Dentry (1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative. More precisely
- p_DbasTest (1:ndofTest + 0*ndofTest , :,:,:) = 1st dimension, test space
- p_DbasTest (1:ndofTest + 1*ndofTest , :,:,:) = 2nd dimension, test space
- p_DbasTest (1:ndofTest + 2*ndofTest , :,:,:) = 3rd dimension, test space, ...

Case 3: Scalar valued, interleaved test FE space

The following parameters hold:

Variable	Input/Output	Description
RvectorData(::)%p_DentryIntl	Output	The local vector entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RvectorData(:)%p_DbasTest		For all vectors: values of the corresponding test basis functions in all cubature points
RvectorData(:)%ndofTest		Number of degrees of freedom in the test space
RvectorData(:)%nvar		Number of variables per vector entry

RvectorData(:)%ndimfeTest	Input	=1, indicates scalar-valued test FE space
RvectorData(:)%bisInterleaved		=.true. here, the vector is interleaved

p_DentryIntl (1:nvar, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative.

Case 4: Vector valued, interleaved test FE space

The following parameters hold:

Variable	Input/Output	Description
RvectorData(:)%p_DentryIntl	Output	The local vector entries, to be computed

rassemblyData%p_DcubWeight	Input	Cubature weights in all cubature points
RvectorData(:)%p_DbasTest		For all vectors: values of the corresponding test basis functions in all cubature points
RvectorData(:)%ndofTest		Number of degrees of freedom in the test space
RvectorData(:)%ndimfeTest		>1, Dimension of the test FE space
RvectorData(:)%nvar		Number of variables per vector entry

RvectorData(:)%bisInterleaved	Input	=.true. here, the vector is interleaved

p_DentryIntl (1:nvar, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.
p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)
p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative. More precisely
- p_DbasTest (1:ndofTest + 0*ndofTest , :,:,:) = 1st dimension, test space
- p_DbasTest (1:ndofTest + 1*ndofTest , :,:,:) = 2nd dimension, test space
- p_DbasTest (1:ndofTest + 2*ndofTest , :,:,:) = 3rd dimension, test space, ...

Basic structure of the callback routine

The actual structure of the callback routine is now demonstrated on an example:

Example: The following example demonstrates how to set up the right-hand side $f=x_1 x_2$ in 2D for the first subvector of a block vector, here for scalar-valued FEM spaces like $Q_1$, $Q_2$, etc:

    subroutine fcalc_rhsxy(rvectorData,rassemblyData,rvectorAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)

     ...

      ! Local variables
      integer :: iel, icubp, idofe, dbasI, dval
      real(DP), dimension(:,:), pointer :: p_DlocalVector
      real(DP), dimension(:,:,:,:), pointer :: p_DbasTest
      real(DP), dimension(:,:), pointer :: p_DcubWeight
      type(t_bmaVectorData), pointer :: p_rvectorData
      real(DP), dimension(:,:,:), pointer :: p_Dpoints

      ! Get cubature weights data
      p_DcubWeight => rassemblyData%p_DcubWeight

      ! Get the coordinates of the cubature points
      p_Dpoints => rassemblyData%revalElementSet%p_DpointsReal

      ! Get the data arrays of the subvector
      p_rvectorData => RvectorData(icomp)
      p_DlocalVector => p_rvectorData%p_Dentry
      p_DbasTest => p_rvectorData%p_DbasTest

      ! Loop over the elements in the current set.
      do iel = 1,nelements

        ! Loop over the cubature points
        do icubp = 1,npointsPerElement

          ! Get the coordinates of the cubature point.
          dx = p_Dpoints(1,icubp,iel)
          dy = p_Dpoints(2,icubp,iel)

          ! Loop over the test space
          do idofe=1,p_rvectorData%ndofTest

            ! Get the value of the basis function
            dbasI = p_DbasTest(idofe,DER_FUNC,icubp,iel)

            ! Value of the function f=x_1 x_2
            dval = dx * dy

            ! Multiply the values of the basis functions
            ! (1st derivatives) by the cubature weight and sum up
            ! into the local vectors.
            p_DlocalVector(idofe,iel) = p_DlocalVector(idofe,iel) + &
                p_DcubWeight(icubp,iel) * dval * dbasI 

          end do ! jdofe

        end do ! icubp

      end do ! iel

    end subroutine

In the above example it can be seen that setting up a right-hand side needs a callback routine with altogether three nested do loops:

An outer DO loop over elements
An inner DO loop over cubature points
For every cubature point, an inner loop over the DOFs in the test space

In the innerst loop, the actual vector entries are computed using cubature. The variables dbasI fetches the values of the basis functions in the cubature points. The sum in the innerst loop

...
  p_DlocalVector(...) = p_DlocalVector(...) + &
      p_DcubWeight(...) * dval * dbasI
...

realises the summation $$sum_{x_k}sum_{i} omega(x_k) f(x_k) psi_i(x_k) $$ with $x_k$ the cubature points and $psi_i$ the test functions on the cell $T$. This is the discrete counterpart to the integral $$(f, psi_i)_T = int_T f psi_i $$ which realises the local right-hand side on a cell $T$ in 2D. Summing up the contributions of all elements gives the global right-hand side.

Remark: For the actual assembly, the order of the summation is changed. The above loops do not calculate $(f, psi_i)_T$ but only its contribution in one cubature point for one test functions, summing up later. In this order, the assembly is faster.

Accessing FE functions and temporary arrays

As introduced above, it is possible to pass finite element functions an temporary arrays to the callback routine in order to do complex calculations there. To demonstrate the exact passing strategy, we start with an example based on a matrix assembly.

Example: Assume that there are two finite element functions to be passed to a callback routine as well as a set of temporary arrays.

The first finite element function, $v_h=(v_1)$, is a scalar finite element function, denoted by rvectorVh for which we need $v_h$ and $nabla v_h$.

The second one, $w_h=(w_1,w_2,w_3)$, is a vector field for which we also need $w_h$ and $nabla w_h$.

Furthermore, we need a temporary arrays where we store intermediate information for processing $v_h$, $nabla v_h$, $w_h$ and $nabla w_h$.

Finally, we need a temporary arrays that store intermediate information for processing $v_h$, $nabla v_h$, $w_h$ and $nabla w_h$.

The main program sets up a vector evaluation structure as follows. Herein, the dummy vector fields use "4" temporary vectors for each dimension, one for the function values and three for the x/y/z derivatives of every component - resulting in 12 temporary arrays in total.

    ...
    type(t_vectorBlock) :: rvectorVh
    type(t_vectorBlock) :: rvectorWh

    type(t_fev2Vectors) :: rmyVectors
    ...

    ! V_h, nabla V_h
    call fev2_addVectorToEvalList(&
        rmyVectors,rvectorVh%RvectorBlock(1),1)

    ! W_h, nabla W_h
    call fev2_addVectorFieldToEvalList(rmyVectors,1,&
        rvectorWh%RvectorBlock(1),&
        rvectorWh%RvectorBlock(2),&
        rvectorWh%RvectorBlock(3))

    ! Temporary memory, corresponding to V_h, nabla V_h.
    call fev2_addDummyVectorToEvalList (rmyVectors,4)

    ! Temporary memory, corresponding to W_h, nabla W_h.
    call fev2_addDummyVecFieldToEvalList (rmyVectors,3,4)

    ! Cell based temp memory, corresponding to V_h, nabla V_h.
    call fev2_addDummyVectorToEvalList (rmyVectors,4)

    ! Cell based temp memory, corresponding to W_h, nabla W_h.
    call fev2_addDummyVecFieldToEvalList (rmyVectors,3,4)

    ! Assembly
    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcMatrix,revalVectors=rmyVectors)

    ! Cleanup
    call fev2_releaseVectorList (rmyVectors)
    ...

The callback routine fcalcMatrix which is called in bma_buildMatrix receives a parameter revalVectors=rmyVectors which contains the vectors and temporary arrays in exactly the order they were added to rmyVectors. The data is accessible via revalVectors%p_RvectorData which is an array of all data added to rmyVectors. revalVectors%p_RvectorData contains data arrays for all cubature points and all elements that are currently in process of the assembly.

The content of the structure revalVectors%p_RvectorData(i) depends on the type of the data which is represented. The callback routine has to "know" which data is added to revalVectors and use it in the appropriate order. In the following tables, there will be an overview which data is accessible in which situation:

Structural data

The following table depicts general information that describe the type of data represented by revalVectors%p_RvectorData(i):

Variable	Description
bisInterleaved	Specifies if the data is interleaved
bisVectorField	Specifies if the data is a vector field
bisCellBased	Specifies if the data is cell based
ndimVectorField	Dimension of the vector field.
nvar	Number of entries per vector if the vector is interleaved
ndimfe	Dimension of the underlying FEM space (for vector valued FEM spaces. =1 for standard FEM spaces)
nmaxDerivativeIdx	Number of allocated subarrays. This corresponds to the maximum derivative quantifier `DER_xxxx` that can be used

Scalar data in every cubature point

If the structure represents a scalar FE function in every cubature point:

Variable	Description
p_Ddata	Array with values in every cubature point on every element

Vector data in every cubature point

If the structure represents a fector field FE function in every cubature point:

Variable	Description
p_DdataVec	Array with values in every dimension, cubature point and on every element

Interleaved scalar data in every cubature point

If the structure represents an interleaved scalar FE function in every cubature point:

Variable	Description
p_DdataIntl	Array with values in every dimension, cubature point and on every element

Scalar data on every element

If the structure represents one data entry per element:

Variable	Description
p_DcellData	Array with one value per element

Vector data on every element

If the structure represents one vector field data entry per element:

Variable	Description
p_DcellDataVec	Array with one value in every dimension on every element

Interleaved scalar data on every element

If the structure represents an one interleaved data entry per element:

Variable	Description
p_DcellDataIntl	Array with one value per element

For vector data added by fev2_addVector_XXXX, the data is initialised with the values in the cubature points on all elements. "Dummy" arrays are treated the same way as data arrays but are not initialised; they represent uninitialised temporary memory the callback routine can use arbitrarily.

Example: A callback routine fcalcMatrix corresponding to the above code may look like this:

    subroutine fcalcMatrix (RmatrixData,rassemblyData,rmatrixAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)
    ...

      real(DP), dimension(:,:,:), pointer   :: p_DdataVh
      real(DP), dimension(:,:,:,:), pointer :: p_DdataWh

      real(DP), dimension(:,:,:), pointer   :: p_DdataTempVh
      real(DP), dimension(:,:,:,:), pointer :: p_DdataTempWh

      real(DP), dimension(:,:), pointer     :: p_DdataCellVh
      real(DP), dimension(:,:,:), pointer   :: p_DdataCellWh

      ! Get the data arrays
      p_DdataVh     => revalVectors%p_RvectorData(1)%p_Ddata
      p_DdataWh     => revalVectors%p_RvectorData(2)%p_DdataVec

      p_DdataTempVh => revalVectors%p_RvectorData(1)%p_Ddata   
      p_DdataTempWh => revalVectors%p_RvectorData(2)%p_DdataVec

      p_DdataCellVh => revalVectors%p_RvectorData(1)%p_DcellData   
      p_DdataCellWh => revalVectors%p_RvectorData(2)%p_DcellDataVec

      ...
      do iel = 1,nelements
        do icubp = 1,npointsPerElement

          ! Get the value of V_h in the cubature point
          dvh = p_DdataVh(icubp,iel,DER_FUNC)

          ! Save for later use
          p_DdataTempVh(icubp,iel,DER_FUNC) = dvh**2

          ! Sum up to norm over the cell
          p_DdataCellVh(iel,DER_FUNC) = p_DdataCellVh(iel,DER_FUNC) + dvh**2

          do idofe=1,ndofTest

            dbasI = p_DbasText(idofe,DER_FUNC,icubp,iel)

            do jdofe=1,ndofTrial

              dbasJ = p_DbasTrial(jdofe,DER_FUNC,icubp,iel)

              ! Calculate the local matrix entries
              p_DlocalMat(jdofe,idofe,iel) = p_DlocalMat(jdofe,idofe,iel) + &
                  p_DcubWeight(icubp,iel) * ( dvh*dbasJ*dbasI )

            end do ! jdofe
          end do
        end do
      end do
      ...

    end subroutine

This code calculates the bilinear form $$m(v_h)(varphi_h,psi_h) = ( psi_h, v_h varphi_h )$$ which corresponds to the operator $v_h I$ with $I$ the identity operator (resulting in a mass matrix with nonconstant coefficient).

Using nonconstant data: The data arrays p_DdataXXXX in this subroutine are extracted from the revalVectors%p_RvectorData(:) structures in exactly the order they were added to rmyVectors above. The shape of the data arrays correspond to the type of data that was added and is designed for fast data access inside of the loops. In the above example, p_DdataVh contains the values of $v_1$ in all cubature points and all elements. As soon as iel and icubp identify one cubature point on one element, the value dvh of $v_1$ can be obtained via p_DdataVh and be used in all the loops over the test and trial functions that calculate the local matrices.

Using temporary arrays: The temp arrays can be arbitrarily used in the callback routine. In the above example, the value of $v_h^2$ is saved to p_DdataTempVh which has the same layout as p_DdataVh. The data can then be accessed in a later step of the assembly.

Cell based data is used in a similar way. Such temporary data provides one data entry per cell. The above example uses, e.g., p_DdataCellVh to compute the norm of $v_h$ on a cell $T$, $||v_h||_{L_2(T)}$, which may be used later, for example in a stabilisation technique.

Featflow

The block assembly

Involved modules and functionality

Basics

Simple block matrix assembly

Simple block vector assembly

Simple integral assembly: General integral

Providing nonlinearities / FE functions / temporary memory

Standard operators

Using bma_fcalc_XXXX for simple calculations

Using bma_docalc_XXXX in callback routines

Block matrices: Advanced operator assembly

Input/output of the callback routine

Basic structure of the callback routine

Block vectors: Advanced vector assembly

Input/output of the callback routine

Basic structure of the callback routine

Accessing FE functions and temporary arrays

Using `bma_fcalc_XXXX` for simple calculations

Using `bma_docalc_XXXX` in callback routines