
UMFPACK Version 2.2.1:  Unsymmetric-pattern Multifrontal Package
--------------------------------------------------------------

Authors:  Timothy A. Davis and Iain S. Duff.  Copyright (C) 1997.
Date:     January, 1998.

Changes since Version 2.2:

	Minor bug fix to UMF2F0 (added code to initialize XOUT and IOUT).
	The bug occurs only when BTF is turned on (default) and the matrix
	is permutable to upper triangular form.


***********************************************************************
* NOTICE:  "The UMFPACK Package may be used SOLELY for educational,   *
* research, and benchmarking purposes by non-profit organizations and *
* the U.S. government.  Commercial and other organizations may make   *
* use of UMFPACK SOLELY for benchmarking purposes only.  UMFPACK may  *
* be modified by or on behalf of the User for such use but at no time *
* shall UMFPACK or any such modified version of UMFPACK become the    *
* property of the User.  UMFPACK is provided without warranty of any  *
* kind, either expressed or implied.  Neither the Authors nor their   *
* employers shall be liable for any direct or consequential loss or   *
* damage whatsoever arising out of the use or misuse of UMFPACK by    *
* the User.  UMFPACK must not be sold.  You may make copies of        *
* UMFPACK, but this NOTICE and the Copyright notice must appear in    *
* all copies.  Any other use of UMFPACK requires written permission.  *
* Your use of UMFPACK is an implicit agreement to these conditions."  *
*                                                                     *
* The MA38 Package in Release 12 of the Harwell Subroutine Library    *
* (HSL) has equivalent functionality (and identical calling interface)*
* as UMFPACK (the HSL has single and double precision versions only,  *
* however).  It is available for commercial use.   Technical reports, *
* information on HSL, and matrices are available via the World Wide   *
* Web at http://www.cse.clrc.ac.uk/Activity/HSL, or by                *
* anonymous ftp at seamus.cc.rl.ac.uk/pub.  Also contact Dr. Scott    *
* Roberts, Harwell Subroutine Library, B 552, AEA Technology,         *
* Harwell, Didcot, Oxon OX11 0RA, England.                            *
* telephone (44) 1235 434988, fax (44) 1235 434136                    *
* email Scott.Roberts@aeat.co.uk, who will provide details of price   *
* and conditions of use.                                              *
***********************************************************************


Summary
-------

UMFPACK Version 2.2 is a package for solving systems of linear equations,
Ax=b, where A is sparse and can be unsymmetric.  It is written in ANSI Fortran
77.  There are options for choosing a good pivot order, factorizing a
subsequent matrix with the same pivot order and nonzero pattern as a
previously factorized matrix, and solving systems of linear equations with
the factors (with A, L, or U; or with their transposes in the single/double
precision versions).  Iterative refinement, with sparse backward error
estimates, can be performed.  Single and double precision, complex, and
complex double precision (complex*16) routines are available.  (Note that
complex*16 is not ANSI Fortran 77, but is a common extension to it.)

Note that transposed and complex-conjugate transposed systems (A', L', or U')
are not currently handled by the complex and complex*16 versions.

There are four primary routines that can be called by the user (where "*"
is D for double precision, S for single precision, C for complex, and Z
for complex*16):

        UM*21I: sets the default control parameters for UM*2FA, UM*2RF,
                and UM*2SO.

        UM*2FA: factors A into PAQ=LU, finding the pivot order (P and Q)
                based on both numerical and fill-reducing criteria.  This
                routine performs both symbolic and numerical factorization
                in a single step.

        UM*2RF: factors A into PAQ=LU, using information (same P and Q and the
                same symbolic factorization) from a prior call to UM*2FA.
                Normally significantly faster than UM*2FA, since UM*2RF only
                performs numerical factorization.

        UM*2SO: solves a system of linear equations using the factors,
                optionally performing iterative refinement.


Harwell Subroutine Compatibility
--------------------------------

The following routines have the same arguments, and perform similar functions:

	UMS21I		MA38I
	UMS2FA		MA38A
	UMS2RF		MA38B
	UMS2SO		MA38C

	UMD21I		MA38ID
	UMD2FA		MA38AD
	UMD2RF		MA38BD
	UMD2SO		MA38CD



The Method
----------

The multifrontal method factorizes a large sparse matrix using a sequence of
small dense frontal matrices.  The square frontal matrices are factorized
efficiently using dense matrix kernels.  Classical multifrontal methods assume
a symmetric nonzero pattern.  The unsymmetric-pattern multifrontal method
(UMFPACK) relaxes this assumption by using rectangular frontal matrices.
High performance is achieved by using dense matrix kernels to factorize these
rectangular frontal matrices, and also through an approximate degree update
algorithm that is much faster (asymptotically and in practice) than computing
the exact degrees.  Since a general sparse code must select pivots based on
both numerical and symbolic (fill-reducing) criteria, the analysis phase
(pivot selection and symbolic factorization) and the numerical factorization
are combined.  The rectangular frontal matrices are constructed dynamically,
since the structure is not known prior to factorization.

Version 2.2 of UMFPACK combines features of both unifrontal and multifrontal
methods.  In the multifrontal method, in contrast with a (uni-)frontal method,
several frontal matrices are used.  Each is used for one or more pivot steps,
and the resulting Schur complement is summed with other Schur complements to
generate another frontal matrix.  Although this means that arbitrary
sparsity patterns can be handled efficiently, extra work is required to add
the Schur complements together, and this work can be costly because indirect
addressing is required.  The frontal method avoids this extra work by
factorizing the matrix with a single frontal matrix.  Rows and columns are
added to the frontal matrix, and pivot rows and columns are removed.  Data
movement is simpler, but higher fill-in can result if the matrix cannot be
permuted into a variable-band form with small profile.  UMFPACK Version 2.2
is based on a combined unifrontal/multifrontal algorithm that enables a
general fill-in reduction ordering to be applied while avoiding the data
movement of previous multifrontal approaches.


The Different Versions of UMFPACK (Versions 1.0, 1.1, 2.0, 2.1, and 2.2)
------------------------------------------------------------------------

Version 1.0 and Version 1.1 are nearly identical.  Version 1.1 includes a few
minor bug fixes to Version 1.0.

Version 2.0 is up to four times as fast as Version 1.1, and uses as little as
half the memory.  For some matrices, Version 2.0 has about the same performance
as Version 1.1.  The improvement obtained between the two methods depends on the
matrix, and how much can be gained from exploiting unifrontal-style data
movement.

Version 2.1 is essentially the same as Version 2.0, with identical performance.
Modifications are listed below.  Version 2.1 is compatible (more or less) with
the MA38 package in the Harwell Subroutine Library.  MA38 has been more heavily
tested, is a more reliable code, comes with an extensive test program, and is a
fully supported code.

Version 2.2 removes some "dead copying" (copying a small amount of uninitialized
memory to uninitialized memory, in order to fuse two loops together for higher
speed on a vector computer).

Changes since Version 2.1
-------------------------

VERSION 2.2 IS FULLY UPWARD-COMPATIBLE WITH VERSIONS 2.0 AND 2.1.

1)  Removal of "dead copying", which causes memory profilers to panic on
	Version 2.1 (falsely, in this case).

2)  The counting of floating-point operations has been revised to use all-real
	arithmetic, instead of computing terms in integer arithmetic and
	accumulating them in a real value (RINFO).

Changes since Version 2.0
-------------------------

VERSION 2.1 IS FULLY UPWARD-COMPATIBLE WITH VERSION 2.0.

1) A new initialization routine (UM*21I) has been added for compatibility with
   MA38I/ID.  The Version 2.0 UM*2IN routine is still available, for backward
   compatibility with Version 2.0 of UMFPACK.  The UM*2IN routine is not
   compatible with the MA38I/ID initialization routine.

2) A single non-ANSI construct (an ENDDO in UM*2FB) was removed.

3) COMPLEX and COMPLEX*16 versions added.  Some superficial changes were made
   so that the four code types (single, double, complex, and complex*16) would
   be as identical as possible in Version 2.1.  The COMPLEX and COMPLEX*16
   codes are not present in the current version of the Harwell Subroutine
   Library.

4) Additional printing level added (see Icntl (3) argument in UM*21I and UM*2P1
   for details).


How to install UMFPACK Version 2.2
----------------------------------

The following files are included in the NETLIB distribution of UMFPACK Version
2.2 (the file you are reading is the README file):


	Makefile and test programs:
	---------------------------

	Makefile	how to compile UMFPACK on a typical UNIX computer
	dmain.f		demo program, double precision (also listed below)
	smain.f		demo program, single precision
	cmain.f		demo program, complex
	zmain.f		demo program, complex*16
	in		input file for dmain and smain programs (also below)
	cin		input file for zmain and cmain
	dmain.out0	example output file from dmain (also listed below)
	smain.out0	example output file from smain
	cmain.out0	example output file from cmain
	zmain.out0	example output file from zmain


	Double precision version: (subroutine and file name are the same)
	-------------------------

	umd21i.f	USER CALLABLE: set default control parameters
	umd2in.f	USER CALLABLE: for compatibility with Version 2.0 only.

	umd2fa.f	USER CALLABLE:  factorize a matrix in triplet form
	umd2f0.f	permute to BTF and factorize matrix in column-form
	umd2fb.f	find permutation to BTF
	umd2f1.f	factorize one diagonal block, in column-form
	umd2f2.f	factorize one diagonal block, in expanded column-form
	umd2fg.f	garbage collection for UMD2F2

	umd2rf.f	USER CALLABLE:  (re)factorize a matrix, numerical only
	umd2r0.f	permute to BTF and (re)factorize matrix in column-form
	umd2ra.f	convert a column-form matrix to arrowhead form
	umd2r2.f	(re)factorize one diagonal block in arrowhead form
	umd2rg.f	garbage collection for UMD2R2

	umd2so.f	USER CALLABLE:  solve a linear system
	umd2s2.f	solve a linear system, with iterative refinement
	umd2sl.f	solve Lx=b, for one diagonal block
	umd2su.f	solve Ux=b, for one diagonal block
	umd2lt.f	solve L'x=b, for one diagonal block
	umd2ut.f	solve U'x=b, for one diagonal block

			utility routines for the entire package:
	umd2of.f	permute to BTF according to final permutation
	umd2co.f	convert triplet-form matrix into column-oriented form
	umd2er.f	error handling
	umd2p1.f	print input/output parameters
	umd2p2.f	print error and warning messages

	Single precision version: (subroutine and file name are the same)
	-------------------------

	... same as umd* routines above, except replace umd with ums.

	Complex version: (subroutine and file name are the same)
	-------------------------

	... same as umd* routines above, except replace umd with umc.
	In addition, umc2lt.f and umc2ut.f do not exist.


	Complex*16 version: (subroutine and file name are the same)
	-------------------------

	... same as umc* routines above, except replace umc with umz.


To install UMFPACK Version 2.2, you also need the BLAS (Basic Linear Algebra
Subprograms) and two routines from the Harwell Subroutine Library (HSL).  All
of these routines are available in NETLIB.  However, the two HSL routines are
optional (refer to the "INSTALLATION NOTES" below).  We HIGHLY recommend that
you use vendor-optimized versions of the BLAS for your computer.  This can
increase the performance of UMFPACK several times over the non-optimized
Fortran BLAS.  To obtain the HSL routines and the non-optimized Fortran BLAS,
send email to netlib@ornl.gov with the message:

        send blas.shar from blas
        send mc13e.f mc21b.f from harwell

Note that there are licensing restrictions on the HSL routines in NETLIB,
as there are on UMFPACK.  A BLAS that is much faster than the NETLIB
BLAS on RISC workstations can be obtained via anonymous ftp to
ftp://ftp.enseeiht.fr/pub/numerique/BLAS/RISC/blas_risc.tar.Z.  However,
since the trans-Atlantic link can be slow, we've placed a copy in
ftp://ftp.cis.ufl.edu/pub/umfpack/blas_risc.tar.gz, with the author's
permission (for the most recent version, get the copy at ENSEEIHT, not at
the Univ. of Florida).  The RISC BLAS is up to four times faster than the
NETLIB BLAS on a Sparc 20.

For UNIX users, a Makefile is included.  Just type "make" to compile all
versions (single / double / complex / complex*16), and to run four small demo
programs.  You first need to modify the Makefile to reflect the location of
the BLAS and the two HSL routines, MC13E and MC21B.


How to use UMFPACK Version 2.2
------------------------------

The *main.f program (* = S, D, C, or Z) gives a short example of how to use
UMFPACK Version 2.2.  For a description of the arguments to UM*21I, UM*2FA,
UM*2RF, and UM*2SO, refer to the comments in the corresponding files.

The dmain.f program is repeated here:

--------------------------------------------------------------------------------
C UMFPACK demo program.
C
C Factor and solve a 5-by-5 system, Ax=b, using default parameters,
C except with complete printing of all arguments on input and output,
C where

C     [ 2  3  0  0  0 ]      [  8 ]                  [ 1 ]
C     [ 3  0  4  0  6 ]      [ 45 ]                  [ 2 ]
C A = [ 0 -1 -3  2  0 ], b = [ -3 ]. Solution is x = [ 3 ].
C     [ 0  0  1  0  0 ]      [  3 ]                  [ 4 ]
C     [ 0  4  2  0  1 ]      [ 19 ]                  [ 5 ]
C
C Then solve A'x=b, with solution:
C       x = [  1.8158  1.4561 1.5000 -24.8509 10.2632 ]'
C using the factors of A.   Modify one entry (A (5,2) = 1.0D-14) and
C refactorize.  Solve Ax=b both without and with iterative refinement,
C with true solution (rounded to 4 digits past the decimal point):
C       x = [-15.0000 12.6667 3.0000   9.3333 13.0000 ]'

        PROGRAM DMAIN
        INTEGER NMAX, NEMAX, LVALUE, LINDEX
        PARAMETER (NMAX=20, NEMAX=100, LVALUE=300, LINDEX=300)
        INTEGER KEEP (20), INDEX (LINDEX), INFO (40),
     $     I, ICNTL (20), N, NE, AI (2*NEMAX)
        DOUBLE PRECISION 
     $     B (NMAX), X (NMAX), W (4*NMAX), VALUE (LVALUE), AX (NEMAX)
        DOUBLE PRECISION
     $     CNTL (10), RINFO (20)

C Read input matrix and right-hand side.  Keep a copy of the triplet
C form in AI and AX.

        READ (5, *) N, NE
        READ (5, *) (AI (I), AI (NE+I), I = 1,NE)
        READ (5, 1) (AX (I), I = 1,NE)
1       FORMAT (F5.1)
        READ (5, 1) (B (I), I = 1,N)
        DO 10 I = 1, NE
           INDEX (I) = AI (I)
           INDEX (NE+I) = AI (NE+I)
           VALUE (I) = AX (I)
10      CONTINUE

C Initialize controls, and change default printing control.  Note that
C this change from the default should only be used for test cases.  It
C can generate a lot of output for large matrices. 

        CALL UMD21I (KEEP, CNTL, ICNTL)
        ICNTL (3) = 4

C Factorize A, and print the factors.  Input matrix is not preserved.

        CALL UMD2FA (N, NE, 0, .FALSE., LVALUE, LINDEX, VALUE, INDEX,
     $               KEEP, CNTL, ICNTL, INFO, RINFO)
        IF (INFO (1) .LT. 0) STOP

C Reset default printing control (UMD21I could be called instead)
        ICNTL (3) = 2

C Solve Ax = b and print solution.

        CALL UMD2SO (N, 0, .FALSE., LVALUE, LINDEX, VALUE, INDEX,
     $               KEEP, B, X, W, CNTL, ICNTL, INFO, RINFO)
        WRITE (6, *) 'Solution to Ax=b:'
        WRITE (6, 2) (X (I), I = 1, N)
2       FORMAT (E20.12)
        IF (INFO (1) .LT. 0) STOP

C Solve A'x = b and print solution.

        CALL UMD2SO (N, 0, .TRUE., LVALUE, LINDEX,  VALUE, INDEX,
     $               KEEP, B, X, W, CNTL, ICNTL, INFO, RINFO)
        WRITE (6, *) 'Solution to A''x=b:'
        WRITE (6, 2) (X (I), I = 1, N)
        IF (INFO (1) .LT. 0) STOP

C Modify one entry of A, and refactorize using UMD2RF.

        DO 20 I = 1, NE
           INDEX (I) = AI (I)
           INDEX (NE+I) = AI (NE+I)
           VALUE (I) = AX (I)
20      CONTINUE
C       A (5,2) happens to be (PAQ)_22, the second pivot entry:
        VALUE (10) = 1.0D-14

        CALL UMD2RF (N, NE, 1, .FALSE., LVALUE, LINDEX, VALUE, INDEX,
     $               KEEP, CNTL, ICNTL, INFO, RINFO)
        IF (INFO (1) .LT. 0) STOP

C Solve Ax = b without iterative refinement, and print solution.
C This will be very inaccurate due to the tiny second pivot entry.

        CALL UMD2SO (N, 0, .FALSE., LVALUE, LINDEX,  VALUE, INDEX,
     $               KEEP, B, X, W, CNTL, ICNTL, INFO, RINFO)
        WRITE (6, *) 'Solution to modified Ax=b, no iter. refinement:'
        WRITE (6, 2) (X (I), I = 1, N)
        IF (INFO (1) .LT. 0) STOP

C Solve Ax = b with iterative refinement, and print solution.
C This is much more accurate.

        ICNTL (8) = 10
        CALL UMD2SO (N, 0, .FALSE., LVALUE, LINDEX,  VALUE, INDEX,
     $               KEEP, B, X, W, CNTL, ICNTL, INFO, RINFO)
        WRITE (6, *) 'Solution to modified Ax=b, with iter. refinement:'
        WRITE (6, 2) (X (I), I = 1, N)
        IF (INFO (1) .LT. 0) STOP
        STOP
        END
--------------------------------------------------------------------------------


The input to dmain.f is the "in" file:

--------------------------------------------------------------------------------
5 12
1 1    1 2    2 1    2 3    2 5    3 2     3 3 
3 4    4 3    5 2    5 3    5 5
  2.0
  3.0
  3.0
  4.0
  6.0
 -1.0
 -3.0 
  2.0
  1.0
  4.0
  2.0
  1.0
  8.0
 45.0
 -3.0
  3.0
 19.0
--------------------------------------------------------------------------------


The output on a Sun workstation (Sparc 10) with IEEE arithmetic is listed
below.  Note that UMFPACK has a (non-default) option for printing the arguments
of the user-callable routines on entry and exit:

--------------------------------------------------------------------------------
 ===========================================================UMD2FA input:
 Scalar arguments:
    N:                    5 : order of matrix A
    NE:                  12 : entries in matrix A
    JOB:                  0 : matrix A not preserved
    TRANSA:         .false. : factorize A
    LVALUE:             300 : size of VALUE array
    LINDEX:             300 : size of INDEX array
 Control parameters, normally initialized by UMD21I:
    ICNTL (1):            6 : I/O unit for error and warning messages
    ICNTL (2):            6 : I/O unit for diagnostics
    ICNTL (3):            4 : printing control
    ICNTL (4):            1 : use block triangular form (BTF)
    ICNTL (5):            4 : columns examined during pivot search
    ICNTL (6):            0 : do not preserve symmetry
    ICNTL (7):           16 : block size for dense matrix multiply
    CNTL (1):    0.1000D+00 : relative pivot tolerance
    CNTL (2):    0.2000D+01 : frontal matrix growth factor
    KEEP (6):    2147483647 : largest positive integer
    KEEP (7):            64 : dense row/col control, d1
    KEEP (8):             1 : dense row/col control, d2
 Input matrix A (entry: row, column, value):
            1:            1            1  0.2000D+01
            2:            1            2  0.3000D+01
            3:            2            1  0.3000D+01
            4:            2            3  0.4000D+01
            5:            2            5  0.6000D+01
            6:            3            2 -0.1000D+01
            7:            3            3 -0.3000D+01
            8:            3            4  0.2000D+01
            9:            4            3  0.1000D+01
           10:            5            2  0.4000D+01
           11:            5            3  0.2000D+01
           12:            5            5  0.1000D+01
 ===========================================================end of UMD2FA input 
 ===========================================================UMD2FA output:
 Output information:
    INFO (1):             0 : no error or warning occurred
    INFO (2):             0 : duplicate entries in A
    INFO (3):             0 : invalid entries in A (indices not in 1..N)
    INFO (4):             0 : invalid entries in A (not in prior pattern)
    INFO (5):            12 : entries in A after summing duplicates
                              and removing invalid entries
    INFO (6):             8 : entries in diagonal blocks of A
    INFO (7):             4 : entries in off-diagonal blocks of A
    INFO (8):             2 : 1-by-1 diagonal blocks in A
    INFO (9):             3 : diagonal blocks in A (>1 only if BTF used)
    INFO (10):            3 : entries below diagonal in L
    INFO (11):            3 : entries above diagonal in U
    INFO (12):           15 : entries in L + U + offdiagonal blocks of A
    INFO (13):            1 : frontal matrices
    INFO (14):            0 : integer garbage collections
    INFO (15):            0 : real garbage collections
    INFO (16):            0 : diagonal pivots chosen
    INFO (17):            5 : numerically valid pivots found in A
    INFO (18):          123 : memory used in INDEX
    INFO (19):          128 : minimum memory needed in INDEX
    INFO (20):           33 : memory used in VALUE
    INFO (21):           27 : minimum memory needed in VALUE
    INFO (22):           90 : memory needed in INDEX for next call to UMD2RF
    INFO (23):           30 : memory needed in VALUE for next call to UMD2RF
    RINFO (1):   0.8000D+01 : total BLAS flop count
    RINFO (2):   0.6000D+01 : assembly flop count
    RINFO (3):   0.1500D+02 : pivot search flop count
    RINFO (4):   0.2000D+01 : Level-1 BLAS flop count
    RINFO (5):   0.6000D+01 : Level-2 BLAS flop count
    RINFO (6):   0.0000D+00 : Level-3 BLAS flop count
 -------------------------------------------------------------------------------
 Entries not in diagonal blocks (stored by row):
 one entry per line (column index, value):
    row:            1
            2: -0.1000D+01
            5: -0.3000D+01
    row:            2
            5:  0.2000D+01
    row:            4
            5:  0.4000D+01
 -------------------------------------------------------------------------------
 LU factors:
 Block:            1 (singleton) at index :            1
       value:  0.2000D+01
 ...............................................................................
 Block:            2 first index:            2 last index:            4
       L, col:            1
            1:  0.1000D+01
            2:  0.7500D+00
            3:  0.0000D+00
       L, col:            2
            2:  0.1000D+01
            3: -0.8000D+01
       L, col:            3
            3:  0.1000D+01
       U, row:            1
            1:  0.4000D+01
            2:  0.1000D+01
            3:  0.0000D+00
       U, row:            2
            2: -0.7500D+00
            3:  0.2000D+01
       U, row:            3
            3:  0.1900D+02
 ...............................................................................
 Block:            3 (singleton) at index :            5
       value:  0.1000D+01
 -------------------------------------------------------------------------------
 Column permutations
            4
            2
            5
            1
            3
 -------------------------------------------------------------------------------
 Row permutations
            3
            5
            1
            2
            4
 ===========================================================end of UMD2FA output
Solution to Ax=b:
  0.100000000000E+01
  0.200000000000E+01
  0.300000000000E+01
  0.400000000000E+01
  0.500000000000E+01
Solution to A'x=b:
  0.181578947368E+01
  0.145614035088E+01
  0.150000000000E+01
 -0.248508771930E+02
  0.102631578947E+02
Solution to modified Ax=b, no iter. refinement:
 -0.150000000000E+02
  0.126121335597E+02
  0.300000000000E+01
  0.930606677987E+01
  0.130000000000E+02
Solution to modified Ax=b, with iter. refinement:
 -0.150000000000E+02
  0.126666666667E+02
  0.300000000000E+01
  0.933333333333E+01
  0.130000000000E+02
--------------------------------------------------------------------------------



Information on each user-callable routine
-----------------------------------------

We describe here the double precision version only.  The other versions
are analogous (see the source code for UM*21I, UM*2FA, UM*2RF, and UM*2SO).



UMD21I:  Initialization routine
-------------------------------

Please see the installation notes below if you are using an IBM RS/6000
(set Icntl (4) to 0 if you encounter an optimization bug in an old compiler).

    UMD21I calling sequence
    -----------------------

        SUBROUTINE UMD21I (KEEP, CNTL, ICNTL)
        INTEGER ICNTL (20), KEEP (20)
        DOUBLE PRECISION
     $          CNTL (10)

    UMD21I description
    ------------------

          Initializes the user-controllable parameters to their default
          values, and also sets the non-user-controllable parameters.  This
          routine is normally called once prior to any call to UMD2FA.

          This routine sets the default control parameters.  We recommend
          changing these defaults under certain circumstances:

          (1) If you know that your matrix has nearly symmetric nonzero
                pattern, then we recommend setting Icntl (6) to 1 so that
                diagonal pivoting is preferred.  This can have a significant
                impact on the performance for matrices that are essentially
                symmetric in pattern.

          (2) If you know that your matrix is not reducible to block
                triangular form, then we recommend setting Icntl (4) to 0
                so that UMFPACK does not try to permute the matrix to block
                triangular form (it will not do any useful work and will
                leave the matrix in its irreducible form).  The work saved
                is typically small, however.

          The other control parameters typically have less effect on overall
          performance.
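
          As a minimal sketch of the two recommendations above (assuming
          UMFPACK is linked in; the program name SETCTL is made up for
          illustration), the defaults are set first and then overridden:

--------------------------------------------------------------------------------
C Sketch only:  override two of the defaults set by UMD21I.
        PROGRAM SETCTL
        INTEGER KEEP (20), ICNTL (20)
        DOUBLE PRECISION CNTL (10)

C Set all defaults.  Call this before changing any entry, since
C UMD21I overwrites ICNTL and CNTL on output.
        CALL UMD21I (KEEP, CNTL, ICNTL)

C (1) Nearly symmetric nonzero pattern:  prefer diagonal pivots.
        ICNTL (6) = 1

C (2) Matrix known not to be reducible:  skip the BTF permutation.
        ICNTL (4) = 0

C KEEP, CNTL, and ICNTL would now be passed to UMD2FA, as in dmain.f.
        WRITE (6, *) 'ICNTL (4), ICNTL (6): ', ICNTL (4), ICNTL (6)
        STOP
        END
--------------------------------------------------------------------------------

          Apply an override only when the stated property of the matrix
          is actually known; otherwise the defaults are the safer choice.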

    UMD21I arguments
    ----------------

                       --------------------------------------------------------
          Icntl:       An integer array of size 20.  Need not be set by
                       caller on input.  On output, it contains default
                       integer control parameters.

          Icntl (1):   Fortran output unit for error messages.
                       Default: 6

          Icntl (2):   Fortran output unit for diagnostic messages.
                       Default: 6

          Icntl (3):   printing-level.
                       0 or less: no output
                       1: error messages only
                       2: error messages and terse diagnostics
                       3: as 2, and print first few entries of all input and
                               output arguments.  Invalid and duplicate entries
                               are printed.
                       4: as 2, and print all entries of all input and
                               output arguments.  Invalid and duplicate entries
                               are printed.  The entire input matrix and its
                               factors are printed.
                       5 or more: as 4, and print out information on the data
                               structures used to represent the LU factors,
                               the assembly DAG, etc.
                       Default: 2

          Icntl (4):   whether or not to attempt a permutation to block
                       triangular form.  If nonzero, then attempt the
                       permutation.  If you know the matrix is not reducible
                       to block triangular form, then setting Icntl (4) to
                       zero can save a small amount of computing time.
                       (NOTE: The use of block triangular form requires the
                       MC13E and MC21B routines from the Harwell Subroutine
                       Library).
                       Default: 1 (attempt the permutation)

          Icntl (5):   the number of columns to examine during the global
                       pivot search.  A value less than one is treated as one.
                       Default: 4

          Icntl (6):   if not equal to zero, then pivots from the diagonal
                       of A (or the diagonal of the block-triangular form) are
                       preferred.  If the nonzero pattern of the matrix is
                       basically symmetric, we recommend that you change this
                       default value to 1 so that pivots on the diagonal
                       are preferred.
                       Default: 0 (do not prefer the diagonal)

          Icntl (7):   block size for the BLAS, controlling the tradeoff
                       between the Level-2 and Level-3 BLAS.  Values less than
                       one are treated as one.
                       Default: 16, which is suitable for the CRAY YMP.

          Icntl (8):   number of steps of iterative refinement to perform.
                       Values less than zero are treated as zero.  The matrix
                       must be preserved for iterative refinement to be done
                       (job=1 in UMD2FA or UMD2RF).
                       Default: 0 (no iterative refinement)

          Icntl (9 ... 20):  set to zero.  Reserved for future releases.

                       --------------------------------------------------------
          Cntl:        A double precision array of size 10.
                       Need not be set by caller on input.  On output, contains
                       default double precision control parameters.

          Cntl (1):    pivoting tradeoff between sparsity-preservation
                       and numerical stability.  An entry A(k,k) is numerically
                       acceptable if:
                          abs (A(k,k)) >= Cntl (1) * max (abs (A(*,k)))
                       Values less than zero are treated as zero (no numerical
                       constraints).  Values greater than one are treated as
                       one (partial pivoting with row interchanges).
                       Default: 0.1

          Cntl (2):    amalgamation parameter.  If the first pivot in a
                       frontal matrix has row degree r and column degree c,
                       then a working array of size
                          (Cntl (2) * c) - by - (Cntl (2) * r)
                       is allocated for the frontal matrix.  Subsequent pivots
                       within the same frontal matrix must fit within this
                       working array, or they are not selected for this frontal
                       matrix.  Values less than one are treated as one (no
                       fill-in due to amalgamation).  Some fill-in due to
                       amalgamation is necessary for efficient use of the BLAS
                       and to reduce the assembly operations required.
                       Default: 2.0

          Cntl (3):    Normally not modified by the user.
                       Defines the smallest positive number,
                       epsilon = Cntl (3), such that fl (1.0 + epsilon)
                       is greater than 1.0 (fl (x) is the floating-point
                       representation of x).  If the floating-point mantissa
                       is binary, then Cntl (3) is 2 ** (-b+1), where b
                       is the number of bits in the mantissa (including the
                       implied bit, if applicable).

                       Typical defaults:
                       For IEEE double precision, Cntl (3) = 2 ** (-53+1)
                       For IEEE single precision, Cntl (3) = 2 ** (-24+1)
                       For CRAY double precision, Cntl (3) = 2 ** (-96+1)
                       For CRAY single precision, Cntl (3) = 2 ** (-48+1)

                       A value of Cntl (3) less than or equal to zero
                       or greater than 2 ** (-15) is treated as 2 ** (-15),
                       which assumes that any floating point representation
                       has at least a 16-bit mantissa.  Cntl (3) is only
                       used in UMD2S2 to compute the sparse backward error
                       estimates, Rinfo (7) and Rinfo (8), when
                       Icntl (8) > 0 (the default is Icntl (8) = 0,
                       so by default, Cntl (3) is not used).
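
                       The formula and the fallback rule above can be checked
                       with a short Python sketch.  The function names are
                       hypothetical; only the arithmetic is taken from the
                       text:

```python
import sys


def machine_epsilon(mantissa_bits):
    """Return 2 ** (-b+1) for a binary mantissa of b bits."""
    return 2.0 ** (-mantissa_bits + 1)


def effective_cntl3(eps):
    """Apply the documented fallback: out-of-range values become 2 ** (-15)."""
    limit = 2.0 ** -15
    if eps <= 0.0 or eps > limit:
        return limit
    return eps


# IEEE double precision has b = 53, which matches Python's own float epsilon.
print(machine_epsilon(53) == sys.float_info.epsilon)
print(effective_cntl3(0.0) == 2.0 ** -15)
```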

          Cntl (4 ... 10):  set to zero.  Reserved for future releases.

                       --------------------------------------------------------
          Keep:        An integer array of size 20.
                       Need not be set by the caller.  On output, contains
                       integer control parameters that are (normally) non-user
                       controllable (but can of course be modified by the
                       "expert" user or library installer).

          Keep (1 ... 5):  unmodified (see UMD2FA or UMD2RF for a description).

          Keep (6):    Largest representable positive integer.  Set to
                       2^31 - 1 = 2147483647 for 32-bit machines with 2's
                       complement arithmetic (the usual case).
                       Default: 2147483647

          Keep (7) and Keep (8): A column is treated as "dense" if
                       it has more than
                       max (0, Keep(7), Keep(8)*int(sqrt(float(n))))
                       original entries.  "Dense" columns are treated
                       differently than "sparse" rows and columns.  Dense
                       columns are transformed into a priori contribution
                       blocks of dimension cdeg-by-1, where cdeg is the number
                       of original entries in the column.  Modifying these two
                       parameters can change the pivot order.
                       Default:  Keep (7) = 64
                       Default:  Keep (8) = 1
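
                       As a rough illustration of the threshold, here is a
                       Python sketch.  The names dense_threshold and is_dense
                       are hypothetical, and int (math.sqrt (n)) only
                       approximates the Fortran int(sqrt(float(n))), which is
                       evaluated in single precision:

```python
import math


def dense_threshold(n, keep7=64, keep8=1):
    """Return max (0, Keep(7), Keep(8)*int(sqrt(float(n))))."""
    return max(0, keep7, keep8 * int(math.sqrt(float(n))))


def is_dense(entries_in_column, n, keep7=64, keep8=1):
    """A column is 'dense' if it has MORE than the threshold of entries."""
    return entries_in_column > dense_threshold(n, keep7, keep8)


print(dense_threshold(100))      # Keep (7) dominates for small n
print(dense_threshold(10000))    # the sqrt (n) term dominates for large n
print(is_dense(64, 100), is_dense(65, 100))
```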

          Keep (9 ... 20):  set to zero.  Reserved for future releases.


UMD2FA:  Primary analysis+factorization routine
-----------------------------------------------

    UMD2FA calling sequence:
    ------------------------

        SUBROUTINE UMD2FA (N, NE, JOB, TRANSA, LVALUE, LINDEX, VALUE,
     $          INDEX, KEEP, CNTL, ICNTL, INFO, RINFO)
        INTEGER N, NE, JOB, LVALUE, LINDEX, INDEX (LINDEX), KEEP (20),
     $          ICNTL (20), INFO (40)
        DOUBLE PRECISION
     $          VALUE (LVALUE), CNTL (10), RINFO (20)
        LOGICAL TRANSA

    UMD2FA description:
    -------------------

          Given a sparse matrix A, find a sparsity-preserving and numerically-
          acceptable pivot order and compute the LU factors, PAQ = LU.  The
          matrix is optionally preordered into a block upper triangular form
          (BTF).  Pivoting is performed within each diagonal block to maintain
          sparsity and numerical stability.  The method used to factorize the
          matrix is an unsymmetric-pattern variant of the multifrontal method.
          Most of the floating-point work is done in the Level-3 BLAS (dense
          matrix multiply).  In addition, approximate degrees are used in the
          Markowitz-style pivot search to reduce the symbolic overhead.  For
          best performance, be sure to use an optimized BLAS library.

          This routine is normally preceded by a call to UMD21I, to
          initialize the default control parameters.  UMD21I need only be
          called once.  A call to UMD2FA can be followed by any number of
          calls to UMD2SO, which solves a linear system using the LU factors
          computed by this routine.  A call to UMD2FA can also be followed by
          any number of calls to UMD2RF, which factorizes another matrix with
          the same nonzero pattern as the matrix factorized by UMD2FA (but with
          different numerical values).

    UMD2FA arguments:
    -----------------

                   ------------------------------------------------------------
          n:       An integer variable.
                   Must be set by caller on input (not modified).
                   Order of the matrix.  Restriction:  n >= 1.

                   ------------------------------------------------------------
          ne:      An integer variable.
                   Must be set by caller on input (not modified).
                   Number of entries in input matrix.  Restriction:  ne >= 1.

                   ------------------------------------------------------------
          job:     An integer variable.
                   Must be set by caller on input (not modified).
                   If job=1, then a column-oriented form of the input matrix
                   is preserved, otherwise, the input matrix is overwritten
                   with its LU factors.  If iterative refinement is to be done
                   in UMD2SO (Icntl (8) > 0), then job must be set to 1.

                   ------------------------------------------------------------
          transa:  A logical variable.
                   Must be set by caller on input (not modified).
                   If false then A is factorized: PAQ = LU.  Otherwise, A
                   transpose is factorized:  PA'Q = LU.
                   NOTE:  this argument MUST be .false. for the complex and
                   complex*16 versions!

                   ------------------------------------------------------------
          lvalue:  An integer variable.
                   Must be set by caller on input (not modified).
                   Size of the Value array.  Restriction:  lvalue >= 2*ne
                   is required to convert the input form of the matrix into
                   the internal representation.  lvalue >= ne + axcopy is
                   required to start the factorization, where axcopy = ne if
                   job = 1, or axcopy = 0 otherwise.  During factorization,
                   additional memory is required to hold the frontal matrices.
                   The internal representation of the matrix is overwritten
                   with the LU factors, of size (Keep (2) - Keep (1) + 1
                   + axcopy), on output.

                   ------------------------------------------------------------
          lindex:  An integer variable.
                   Must be set by caller on input (not modified).
                   Size of the Index array.  Restriction: lindex >= 3*ne+2*n+1,
                   is required to convert the input form of the matrix into
                   its internal representation.  lindex >= wlen + alen + acopy
                   is required to start the factorization, where
                   wlen <= 11*n + 3*dn + 8 is the size of the workspaces,
                   dn <= n is the number of columns with more than d
                   entries (d = max (64, sqrt (n)) is the default),
                   alen <= 2*ne + 11*n + 11*dn + dne is the size of the
                   internal representation of the matrix, dne <= ne is the
                   number of entries in such columns with more than d entries,
                   and acopy = ne+n+1 if job = 1, or acopy = 0 otherwise.
                   During factorization, the internal representation of size
                   alen is overwritten with the LU factors, of size
                   luilen = (Keep (5) - Keep (3) + 1 - acopy) on output.
                   Additional memory is also required to hold the unsymmetric
                   quotient graph, but this also overwrites the input matrix.
                   Usually about 7*n additional space is adequate for this
                   purpose.  Just prior to the end of factorization,
                   lindex >= wlen + luilen + acopy is required.
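
                   The size restrictions on lvalue and lindex can be
                   tabulated with a small Python sketch.  The function names
                   are hypothetical.  The bounds on wlen and alen are the
                   upper bounds quoted above, and adding 7*n follows the rule
                   of thumb for the quotient graph, so the result is an
                   estimate rather than an exact requirement:

```python
def conversion_minimums(n, ne):
    """Sizes needed to convert the triplet input to the internal form."""
    return {"lvalue": 2 * ne, "lindex": 3 * ne + 2 * n + 1}


def lindex_start_bound(n, ne, job, dn=0, dne=0):
    """Upper bound on wlen + alen + acopy at the start of factorization.

    dn is the number of dense columns and dne the number of entries in
    them; both default to zero (no dense columns), which is an assumption.
    """
    wlen = 11 * n + 3 * dn + 8
    alen = 2 * ne + 11 * n + 11 * dn + dne
    acopy = ne + n + 1 if job == 1 else 0
    return wlen + alen + acopy


def suggested_lindex(n, ne, job, dn=0, dne=0):
    """Starting guess: the bound above plus about 7*n for the quotient graph."""
    start = lindex_start_bound(n, ne, job, dn, dne) + 7 * n
    return max(conversion_minimums(n, ne)["lindex"], start)


print(conversion_minimums(5, 12))
print(lindex_start_bound(5, 12, job=1))
print(suggested_lindex(5, 12, job=1))
```

                   If the guess turns out to be too small, Info (1) reports
                   the error and Info (19) and Info (21) give better values
                   for a subsequent call.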

                   ------------------------------------------------------------
          Value:   A double precision array of size lvalue.
                   Must be set by caller on input.  Modified on output.  On
                   input, Value (1..ne) holds the original matrix in triplet
                   form.  On output, Value holds the LU factors, and
                   (optionally) a column-oriented form of the original matrix
                   - otherwise the input matrix is overwritten with the LU
                   factors.

                   ------------------------------------------------------------
          Index:   An integer array of size lindex.
                   Must be set by caller on input.  Modified on output.  On
                   input, Index (1..2*ne) holds the original matrix in triplet
                   form.  On output, Index holds the LU factors, and
                   (optionally) a column-oriented form of the original matrix
                   - otherwise the input matrix is overwritten with the LU
                   factors.

                   On input the kth triplet (for k = 1...ne) is stored as:
                               A (row,col) = Value (k)
                               row         = Index (k)
                               col         = Index (k+ne)
                   If there is more than one entry for a particular position,
                   the values are accumulated, and the number of such duplicate
                   entries is returned in Info (2), and a warning flag is
                   set.  However, applications such as finite element methods
                   naturally generate duplicate entries which are then
                   assembled (added) together.  If this is the case, then
                   ignore the warning message.
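
                   The triplet layout can be sketched in Python as follows.
                   The helper names are hypothetical, Python lists are
                   0-based (so list position k holds the Fortran element
                   k+1), and counting each extra copy of a position as one
                   duplicate is an assumption about how Info (2) is tallied:

```python
from collections import Counter


def pack_triplets(entries, lvalue, lindex):
    """Store (row, col, value) triplets in the Value/Index input layout.

    entries uses 1-based row and column indices, as UMD2FA expects.
    """
    ne = len(entries)
    if lvalue < ne or lindex < 2 * ne:
        raise ValueError("arrays too small to hold the triplets")
    value = [0.0] * lvalue
    index = [0] * lindex
    for k, (row, col, a) in enumerate(entries):
        value[k] = a           # Value (k+1)    = A (row,col)
        index[k] = row         # Index (k+1)    = row
        index[k + ne] = col    # Index (k+1+ne) = col
    return value, index


def count_duplicates(entries):
    """Count entries that repeat an already-seen (row, col) position."""
    seen = Counter((row, col) for row, col, _ in entries)
    return sum(count - 1 for count in seen.values())


entries = [(1, 1, 2.0), (2, 1, 3.0), (1, 2, 4.0), (1, 1, 0.5)]
value, index = pack_triplets(entries, lvalue=8, lindex=21)
print(value[:4])
print(index[:8])
print(count_duplicates(entries))
```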

                   On output, the LU factors and the column-oriented form
                   of A (if preserved) are stored in:
                       Value (Keep (1)...Keep (2))
                       Index (Keep (3)...Keep (5))
                   where Keep (2) = lvalue, and Keep (5) = lindex.

                   ------------------------------------------------------------
          Keep:    An integer array of size 20.

                   Keep (1 ... 5):  Need not be set by caller on input.
                       Modified on output.
                       Keep (1): LU factors start here in Value
                       Keep (2) = lvalue: LU factors end here in Value
                       Keep (3): LU factors start here in Index
                       Keep (4): LU factors needed for UMD2RF start here
                                     in Index
                       Keep (5) = lindex: LU factors end here in Index

                   Keep (6 ... 8):  Must be set by caller on input (not
                       modified).
                       Integer control arguments not normally modified by the
                       user.  See UMD21I, which sets the defaults, for details.
                       Keep (6) is the largest representable positive
                       integer.  Keep (7) and Keep (8) determine the
                       size of d, where columns with more than d original
                       entries are treated as a priori frontal matrices.

                   Keep (9 ... 20): Unused.  Reserved for future releases.

                   ------------------------------------------------------------
          Cntl:    A double precision array of size 10.
                   Must be set by caller on input (not modified).
                   Double precision control arguments, see UMD21I for a
                   description, which sets the defaults.  UMD2FA uses
                   Cntl (1) and Cntl (2).

                   ------------------------------------------------------------
          Icntl:   An integer array of size 20.
                   Must be set by caller on input (not modified).
                   Integer control arguments, see UMD21I for a description,
                   which sets the defaults.  UMD2FA uses Icntl (1..7).

                   ------------------------------------------------------------
          Info:    An integer array of size 40.
                   Need not be set by caller on input.  Modified on output.
                   It contains information about the execution of UMD2FA.

                   Info (1): zero if no error occurred, negative if
                       an error occurred and the factorization was not
                       completed, positive if a warning occurred (the
                       factorization was completed).

                       These errors cause the factorization to terminate:

                       Error   Description
                       -1      n < 1
                       -2      ne < 1
                       -3      lindex too small
                       -4      lvalue too small
                       -5      both lindex and lvalue are too small

                       With these warnings the factorization was able to
                       complete:

                       Warning Description
                       1       invalid entries
                       2       duplicate entries
                       3       invalid and duplicate entries
                       4       singular matrix
                       5       invalid entries, singular matrix
                       6       duplicate entries, singular matrix
                       7       invalid and duplicate entries, singular matrix

                       Subsequent calls to UMD2RF and UMD2SO can only be made
                       if Info (1) is zero or positive.  If Info (1)
                       is negative, then some or all of the remaining
                       Info and Rinfo arrays may not be valid.
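
                       Reading the warning table above, the positive codes
                       behave like a sum of three flags (1 = invalid entries,
                       2 = duplicate entries, 4 = singular matrix).  That
                       reading is an observation about the table, not a
                       documented guarantee.  A hypothetical Python helper
                       that decodes Info (1) on that assumption:

```python
UMD2FA_ERRORS = {
    -1: "n < 1",
    -2: "ne < 1",
    -3: "lindex too small",
    -4: "lvalue too small",
    -5: "both lindex and lvalue are too small",
}


def describe_info1(info1):
    """Turn an Info (1) value returned by UMD2FA into a readable message."""
    if info1 == 0:
        return "no error"
    if info1 < 0:
        return "error: " + UMD2FA_ERRORS.get(info1, "unknown error code")
    parts = []
    if info1 & 1:
        parts.append("invalid entries")
    if info1 & 2:
        parts.append("duplicate entries")
    if info1 & 4:
        parts.append("singular matrix")
    return "warning: " + ", ".join(parts)


def factors_usable(info1):
    """UMD2RF and UMD2SO may be called only if Info (1) is zero or positive."""
    return info1 >= 0


for code in (0, -3, 2, 6):
    print(code, describe_info1(code), factors_usable(code))
```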

                   Info (2): duplicate entries in A.  A warning is set
                       if Info (2) > 0.  However, the duplicate entries
                       are summed and the factorization continues.  Duplicate
                       entries are sometimes intentional - for finite element
                       codes, for example.

                   Info (3): invalid entries in A, indices not in 1..n.
                       These entries are ignored and a warning is set
                       in Info (1).

                   Info (4): zero.  Used by UMD2RF only.

                   Info (5): entries in A after adding duplicates and
                       removing invalid entries.

                   Info (6): entries in diagonal blocks of A.

                   Info (7): entries in off-diagonal blocks of A.  Zero
                       if Info (9) = 1.

                   Info (8): 1-by-1 diagonal blocks.

                   Info (9): blocks in block-triangular form.

                   Info (10): entries below diagonal in L.

                   Info (11): entries above diagonal in U.

                   Info (12): entries in L+U+offdiagonal part.

                   Info (13): frontal matrices.

                   Info (14): garbage collections performed on Index, when
                       memory is exhausted.  Garbage collections are performed
                       to remove external fragmentation.  If Info (14) is
                       excessively high, performance can be degraded.  Try
                       increasing lindex if that occurs.  Note that external
                       fragmentation in *both* Index and Value is removed when
                       either is exhausted.

                   Info (15): garbage collections performed on Value.

                   Info (16): diagonal pivots chosen.

                   Info (17): numerically acceptable pivots found in A.
                       If less than n, then A is singular (or nearly so).
                       The factorization still proceeds, and UMD2SO can still
                       be called.  The zero-rank active submatrix of order
                       n - Info (17) is replaced with the identity matrix
                       (assuming BTF is not in use).  If BTF is in use, then
                       one or more of the diagonal blocks are singular.

                   Info (18): memory used in Index.

                   Info (19): minimum memory needed in Index
                       (or minimum recommended).  If lindex is set to
                       Info (19) on a subsequent call, then a moderate
                       number of garbage collections (Info (14)) will
                       occur.

                   Info (20): memory used in Value.

                   Info (21): minimum memory needed in Value
                       (or minimum recommended).  If lvalue is set to
                       Info (21) on a subsequent call, then a moderate
                       number of garbage collections (Info (15)) will
                       occur.

                   Info (22): memory needed in Index for the next call to
                       UMD2RF.

                   Info (23): memory needed in Value for the next call to
                       UMD2RF.

                   Info (24): zero.  Used by UMD2SO only.

                   Info (25 ... 40): reserved for future releases

                   ------------------------------------------------------------
          Rinfo:   A double precision array of size 20.
                   Need not be set by caller on input.  Modified on output.
                   It contains information about the execution of UMD2FA.

                   Rinfo (1): total flop count in the BLAS

                   Rinfo (2): total assembly flop count

                   Rinfo (3): total flops during pivot search

                   Rinfo (4): Level-1 BLAS flops

                   Rinfo (5): Level-2 BLAS flops

                   Rinfo (6): Level-3 BLAS flops

                   Rinfo (7): zero.  Used by UMD2SO only.

                   Rinfo (8): zero.  Used by UMD2SO only.

                   Rinfo (9 ... 20): reserved for future releases

    Calling UMD2RF or UMD2SO after calling UMD2FA
    ---------------------------------------------

          When calling UMD2SO to solve a linear system using the factors
          computed by UMD2FA or UMD2RF, the following must be preserved:

               n
               Value (Keep (1)...Keep (2))
               Index (Keep (3)...Keep (5))
               Keep (1 ... 20)

          When calling UMD2RF to factorize a subsequent matrix with a pattern
          similar to that factorized by UMD2FA, the following must be
          preserved:

               n
               Index (Keep (4)...Keep (5))
               Keep (4 ... 20)

          Note that the user may move the LU factors to a different position
          in Value and/or Index, as long as Keep (1 ... 5) are modified
          correspondingly.
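
          One possible reading of "modified correspondingly" is that every
          pointer into an array is shifted by the same offset as the data it
          refers to.  The Python sketch below compacts the factors to the
          start of new, smaller arrays under that assumption.  The helper
          name is hypothetical, and Python lists are 0-based, so keep[0]
          mirrors Keep (1):

```python
def compact_factors(value, index, keep):
    """Copy the LU factors to the front of new arrays and adjust Keep (1..5)."""
    k1, k2, k3, k4, k5 = keep[:5]
    new_value = value[k1 - 1:k2]    # Value (Keep (1)...Keep (2))
    new_index = index[k3 - 1:k5]    # Index (Keep (3)...Keep (5))
    new_keep = list(keep)
    new_keep[0] = 1                 # factors now start at Value (1)
    new_keep[1] = k2 - k1 + 1       # ... and end at the new array length
    new_keep[2] = 1                 # factors now start at Index (1)
    new_keep[3] = k4 - k3 + 1       # same shift applied to Keep (4)
    new_keep[4] = k5 - k3 + 1       # ... and to Keep (5)
    return new_value, new_index, new_keep


value = list(range(1, 11))          # stand-in data, 10 entries
index = list(range(101, 121))       # stand-in data, 20 entries
keep = [7, 10, 11, 15, 20] + [0] * 15
new_value, new_index, new_keep = compact_factors(value, index, keep)
print(new_value)
print(new_keep[:5])
```

          Since Keep (2) = lvalue and Keep (5) = lindex on output, the
          lengths of the new arrays would be passed as lvalue and lindex in
          later calls.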


UMD2RF: Numerical factorization routine
---------------------------------------

    UMD2RF calling sequence
    -----------------------

        SUBROUTINE UMD2RF (N, NE, JOB, TRANSA, LVALUE, LINDEX, VALUE,
     $          INDEX, KEEP, CNTL, ICNTL, INFO, RINFO)
        INTEGER N, NE, JOB, LVALUE, LINDEX, INDEX (LINDEX), KEEP (20),
     $          ICNTL (20), INFO (40)
        DOUBLE PRECISION
     $          VALUE (LVALUE), CNTL (10), RINFO (20)
        LOGICAL TRANSA

    UMD2RF description
    ------------------

          Given a sparse matrix A, and a sparsity-preserving and numerically-
          acceptable pivot order and symbolic factorization, compute the LU
          factors, PAQ = LU.  Uses the sparsity pattern and permutations from
          a prior factorization by UMD2FA or UMD2RF.  The matrix A should have
          the same nonzero pattern as the matrix factorized by UMD2FA or
          UMD2RF.  The matrix can have different numerical values.  No
          variations are made in the pivot order computed by UMD2FA.  If a
          zero pivot is encountered, an error flag is set and the
          factorization terminates.

          This routine can actually handle any matrix A such that (PAQ)_ij can
          be nonzero only if (LU)_ij is nonzero, where L and U are the LU
          factors of the matrix factorized by UMD2FA.  If BTF (block triangular
          form) is used, entries above the diagonal blocks of (PAQ)_ij can have
          an arbitrary sparsity pattern.  Entries for which (LU)_ij is not
          present, or those below the diagonal blocks are invalid and ignored
          (a warning flag is set and the factorization proceeds without the
          invalid entries).  A listing of the invalid entries can be printed.

          This routine must be preceded by a call to UMD2FA or UMD2RF.
          A call to UMD2RF can be followed by any number of calls to UMD2SO,
          which solves a linear system using the LU factors computed by this
          routine or by UMD2FA.  A call to UMD2RF can also be followed by any
          number of calls to UMD2RF.

    UMD2RF arguments
    ----------------

                   ------------------------------------------------------------
          n:       An integer variable.
                   Must be set by caller on input (not modified).
                   Order of the matrix.  Must be identical to the value of n
                   in the last call to UMD2FA.

                   ------------------------------------------------------------
          ne:      An integer variable.
                   Must be set by caller on input (not modified).
                   Number of entries in input matrix.  Normally not modified
                   since the last call to UMD2FA.
                   Restriction:  1 <= ne < (Keep (4)) / 2

                   ------------------------------------------------------------
          job:     An integer variable.
                   Must be set by caller on input (not modified).
                   If job=1, then a column-oriented form of the input matrix
                   is preserved, otherwise, the input matrix is overwritten
                   with its LU factors.  If iterative refinement is to be done
                   (Icntl (8) > 0), then job must be set to 1.  Can be the
                   same as, or different from, the last call to UMD2FA.

                   ------------------------------------------------------------
          transa:  A logical variable.
                   Must be set by caller on input (not modified).
                   If false then A is factorized: PAQ = LU.  Otherwise, A
                   transpose is factorized:  PA'Q = LU.  Normally the same as
                   the last call to UMD2FA.
                   NOTE:  this argument MUST be .false. for the complex and
                   complex*16 versions!

                   ------------------------------------------------------------
          lvalue:  An integer variable.
                   Must be set by caller on input (not modified).
                   Size of the Value array.  Restriction:  lvalue >= 2*ne,
                   although a larger value will typically be required to
                   complete the factorization.  The exact value required is
                   computed by the last call to UMD2FA or UMD2RF (Info (23)).
                   This value assumes that the ne, job, and transa parameters
                   are the same as the last call.  Some garbage collection may
                   occur if lvalue is set to Info (23), but usually not
                   much.  We recommend lvalue >= 1.2 * Info (23).  The
                   lvalue parameter is usually the same as in the last call to
                   UMD2FA, however.

                   ------------------------------------------------------------
          lindex:  An integer variable.
                   Must be set by caller on input (not modified).
                   Size of the Index array.  Restriction:
                   lindex >= 3*ne+2*n+1 + (Keep (5) - Keep (4) + 1),
                   although a larger value will typically be required to
                   complete the factorization.  The exact value required is
                   computed by the last call to UMD2FA or UMD2RF (Info (22)).
                   This value assumes that the ne, job, and transa parameters
                   are the same as the last call.  No garbage collection ever
                   occurs in the Index array, since UMD2RF does not create
                   external fragmentation in Index.  The lindex parameter is
                   usually the same as in the last call to UMD2FA, however.
                   Note that lindex >= Keep (5) is also required, since
                   the pattern of the prior LU factors resides in
                   Index (Keep (4) ... Keep (5)).
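
                   The restrictions on ne, lvalue, and lindex for UMD2RF can
                   be combined in a short Python sketch.  The function name
                   is hypothetical, and the inputs are the Keep and Info
                   values left by the previous call to UMD2FA or UMD2RF.
                   Integer arithmetic is used for the recommended factor of
                   1.2 to avoid rounding surprises:

```python
def umd2rf_sizes(n, ne, keep4, keep5, info22, info23):
    """Return (lvalue, lindex) satisfying the documented UMD2RF restrictions."""
    if not (ne >= 1 and 2 * ne < keep4):
        raise ValueError("ne must satisfy 1 <= ne < Keep (4) / 2")
    # Hard lower bounds from the restrictions.
    min_lvalue = 2 * ne
    min_lindex = max(3 * ne + 2 * n + 1 + (keep5 - keep4 + 1), keep5)
    # Recommended lvalue >= 1.2 * Info (23), rounded up to an integer.
    recommended_lvalue = (12 * info23 + 9) // 10
    lvalue = max(min_lvalue, recommended_lvalue)
    lindex = max(min_lindex, info22)
    return lvalue, lindex


print(umd2rf_sizes(n=5, ne=12, keep4=150, keep5=200, info22=120, info23=60))
```

                   In practice the arrays from the previous call are usually
                   reused unchanged, in which case these bounds serve only as
                   a check.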

                   ------------------------------------------------------------
          Value:   A double precision array of size lvalue.
                   Must be set by caller on input (normally from the last call
                   to UMD2FA or UMD2RF).  Modified on output.  On input,
                   Value (1..ne) holds the original matrix in triplet form.
                   On output, Value holds the LU factors, and (optionally) a
                   column-oriented form of the original matrix - otherwise
                   the input matrix is overwritten with the LU factors.

                   ------------------------------------------------------------
          Index:   An integer array of size lindex.
                   Must be set by caller on input (normally from the last call
                   to UMD2FA or UMD2RF).  Modified on output.  On input,
                   Index (1..2*ne) holds the original matrix in triplet form,
                   and Index (Keep (4) ... Keep (5)) holds the pattern
                   of the prior LU factors.  On output, Index holds the LU
                   factors, and (optionally) a column-oriented form of the
                   original matrix - otherwise the input matrix is overwritten
                   with the LU factors.

                   On input the kth triplet (for k = 1...ne) is stored as:
                               A (row,col) = Value (k)
                               row         = Index (k)
                               col         = Index (k+ne)
                   If there is more than one entry for a particular position,
                   the values are accumulated, and the number of such duplicate
                   entries is returned in Info (2), and a warning flag is
                   set.  However, applications such as finite element methods
                   naturally generate duplicate entries which are then
                   assembled (added) together.  If this is the case, then
                   ignore the warning message.

                   On input, the pattern of the prior LU factors is stored in:
                       Index (Keep (4) ... Keep (5))

                   On output, the LU factors and the column-oriented form
                   of A (if preserved) are stored in:
                       Value (Keep (1)...Keep (2))
                       Index (Keep (3)...Keep (5))
                   where Keep (2) = lvalue, and Keep (5) = lindex.

                   ------------------------------------------------------------
          Keep:    An integer array of size 20.

                   Keep (1 ... 3):  Need not be set by caller on input.
                       Modified on output.
                       Keep (1): new LU factors start here in Value
                       Keep (2) = lvalue: new LU factors end here in Value
                       Keep (3): new LU factors start here in Index

                   Keep (4 ... 5): Must be set by caller on input (normally
                       from the last call to UMD2FA or UMD2RF). Modified on
                       output.
                       Keep (4):  On input, the prior LU factors start here
                       in Index, not including the prior (optionally) preserved
                       input matrix, nor the off-diagonal pattern (if BTF was
                       used in the last call to UMD2FA).  On output, the new
                       LU factors needed for UMD2RF start here in Index.
                       Keep (5):  On input, the prior LU factors end here in
                       Index.  On output, Keep (5) is set to lindex, which
                       is where the new LU factors end in Index.

                   Keep (6 ... 8):  Unused.  These are used by UMD2FA only.
                       Future releases may make use of them, however.

                   Keep (9 ... 20): Unused.  Reserved for future releases.

                   ------------------------------------------------------------
          Cntl:    A double precision array of size 10.
                   Must be set by caller on input (not modified).
                   Double precision control arguments, see UMD21I for a
                   description, which sets the default values.  The current
                   version of UMD2RF does not actually use Cntl.  It is
                   included to make the argument list of UMD2RF the same as
                   UMD2FA.  UMD2RF may use Cntl in future releases.

                   ------------------------------------------------------------
          Icntl:   An integer array of size 20.
                   Must be set by caller on input (not modified).
                   Integer control arguments, see UMD21I for a description,
                   which sets the default values.  UMD2RF uses Icntl (1),
                   Icntl (2), Icntl (3), and Icntl (7).

                   ------------------------------------------------------------
          Info:    An integer array of size 40.
                   Need not be set by caller on input.  Modified on output.
                   It contains information about the execution of UMD2RF.

                   Info (1): zero if no error occurred, negative if
                       an error occurred and the factorization was not
                       completed, positive if a warning occurred (the
                       factorization was completed).

                       These errors cause the factorization to terminate:

                       Error   Description
                       -1      n < 1
                       -2      ne < 1 or ne > maximum value
                       -3      lindex too small
                       -4      lvalue too small
                       -5      both lindex and lvalue are too small
                       -6      prior pivot ordering no longer acceptable
                       -7      LU factors are uncomputed, or are corrupted

                       With these warnings the factorization was able to
                       complete:

                       Warning Description
                       1       invalid entries
                       2       duplicate entries
                       3       invalid and duplicate entries
                       4       singular matrix
                       5       invalid entries, singular matrix
                       6       duplicate entries, singular matrix
                       7       invalid and duplicate entries, singular matrix

                       Subsequent calls to UMD2RF and UMD2SO can only be made
                       if Info (1) is zero or positive.  If Info (1)
                       is negative, then some or all of the remaining
                       Info and Rinfo arrays may not be valid.

                   Info (2): duplicate entries in A.  A warning is set
                       if Info (2) > 0.  However, the duplicate entries
                       are summed and the factorization continues.  Duplicate
                       entries are sometimes intentional - for finite element
                       codes, for example.

                   Info (3): invalid entries in A, indices not in 1..n.
                       These entries are ignored and a warning is set in
                       Info (1).

                   Info (4): invalid entries in A, not in prior LU
                       factors.  These entries are ignored and a warning is
                       set in Info (1).

                   Info (5): entries in A after adding duplicates and
                       removing invalid entries.

                   Info (6): entries in diagonal blocks of A.

                   Info (7): entries in off-diagonal blocks of A.  Zero
                       if Info (9) = 1.

                   Info (8): 1-by-1 diagonal blocks.

                   Info (9): blocks in block-triangular form.

                   Info (10): entries below diagonal in L.

                   Info (11): entries above diagonal in U.

                   Info (12): entries in L+U+offdiagonal part.

                   Info (13): frontal matrices.

                   Info (14): zero.  Used by UMD2FA only.

                   Info (15): garbage collections performed on Value.

                   Info (16): diagonal pivots chosen.

                   Info (17): numerically acceptable pivots found in A.
                       If less than n, then A is singular (or nearly so).
                       The factorization still proceeds, and UMD2SO can still
                       be called.  The zero-rank active submatrix of order
                       n - Info (17) is replaced with the identity matrix
                       (assuming BTF is not in use).  If BTF is in use, then
                       one or more of the diagonal blocks are singular.
                       UMD2RF can be called if the value of Info (17)
                       returned by UMD2FA was less than n, but the order
                       (n - Info (17)) active submatrix is still replaced
                       with the identity matrix.  Entries residing in this
                       submatrix are ignored, their number is included in
                       Info (4), and a warning is set in Info (1).

                   Info (18): memory used in Index.

                   Info (19): memory needed in Index (same as Info (18)).

                   Info (20): memory used in Value.

                   Info (21): minimum memory needed in Value
                       (or minimum recommended).  If lvalue is set to
                       Info (21) on a subsequent call, then a moderate
                       number of garbage collections (Info (15)) will
                       occur.

                   Info (22): memory needed in Index for the next call to
                       UMD2RF.

                   Info (23): memory needed in Value for the next call to
                       UMD2RF.

                   Info (24): zero.  Used by UMD2SO only.

                   Info (25 ... 40): reserved for future releases

                   ------------------------------------------------------------
          Rinfo:   A double precision array of size 20.
                   Need not be set by caller on input.  Modified on output.
                   It contains information about the execution of UMD2RF.

                   Rinfo (1): total flop count in the BLAS

                   Rinfo (2): total assembly flop count

                   Rinfo (3): zero.  Used by UMD2FA only.

                   Rinfo (4): Level-1 BLAS flops

                   Rinfo (5): Level-2 BLAS flops

                   Rinfo (6): Level-3 BLAS flops

                   Rinfo (7): zero.  Used by UMD2SO only.

                   Rinfo (8): zero.  Used by UMD2SO only.

                   Rinfo (9 ... 20): reserved for future releases
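
          The Info (1) values tabulated above for UMD2RF appear to combine
          additively: 1 for invalid entries, 2 for duplicate entries, and 4
          for a singular matrix.  Under that assumption, a caller might
          classify Info (1) as in the following minimal sketch.  DECINF is a
          hypothetical helper name and is not part of UMFPACK.

```fortran
C     Hypothetical helper, not part of UMFPACK:  classify Info (1).
C     Assumes the warning values 1..7 are the sum of
C     1 (invalid entries), 2 (duplicate entries), 4 (singular).
      SUBROUTINE DECINF (INFO1, ISERR, INVAL, DUPL, SING)
      INTEGER INFO1
      LOGICAL ISERR, INVAL, DUPL, SING
C     negative:  the factorization was not completed
      ISERR = INFO1 .LT. 0
      INVAL = .FALSE.
      DUPL = .FALSE.
      SING = .FALSE.
      IF (INFO1 .GT. 0) THEN
C        positive:  a warning, the factorization was completed
         INVAL = MOD (INFO1, 2) .EQ. 1
         DUPL = MOD (INFO1 / 2, 2) .EQ. 1
         SING = MOD (INFO1 / 4, 2) .EQ. 1
      ENDIF
      RETURN
      END
```

          A typical use would be CALL DECINF (INFO (1), ISERR, INVAL, DUPL,
          SING) immediately after UMD2RF returns.  The three warning flags
          are all .false. whenever ISERR is .true. or Info (1) is zero.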



UMD2SO:  Routine for solving a linear system
--------------------------------------------


    UMD2SO calling sequence
    -----------------------

        SUBROUTINE UMD2SO (N, JOB, TRANSC, LVALUE, LINDEX, VALUE,
     $          INDEX, KEEP, B, X, W, CNTL, ICNTL, INFO, RINFO)
        INTEGER N, JOB, LVALUE, LINDEX, INDEX (LINDEX), KEEP (20),
     $          ICNTL (20), INFO (40)
        DOUBLE PRECISION
     $          VALUE (LVALUE), B (N), X (N), W (*), CNTL (10),
     $          RINFO (20)
        LOGICAL TRANSC


    UMD2SO description
    ------------------

          Given LU factors computed by UMD2FA or UMD2RF, and the
          right-hand-side, B, solve a linear system for the solution X.

          This routine handles all permutations, so that B and X are in terms
          of the original order of the matrix, A, and not in terms of the
          permuted matrix.

          If iterative refinement is done, then the residual is returned in W,
          and the sparse backward error estimates are returned in
          Rinfo (7) and Rinfo (8).  The computed solution X is the
          exact solution of the equation (A + dA)x = (b + db), where
            abs (dA (i,j)) <= max (Rinfo (7), Rinfo (8)) * abs (A (i,j))
          and
            abs (db (i)) <= max (Rinfo (7) * abs (b (i)),
                                 Rinfo (8) * maxnorm (A) * maxnorm (x computed))
          Note that dA has the same sparsity pattern as A.
          The method used to compute the sparse backward error estimate is
          described in M. Arioli, J. W. Demmel, and I. S. Duff, "Solving
          sparse linear systems with sparse backward error," SIAM J. Matrix
          Analysis and Applications, vol 10, 1989, pp. 165-190.
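
          The right-hand sides of the two bounds above can be evaluated
          directly from Rinfo (7) and Rinfo (8) once UMD2SO has performed
          iterative refinement.  The sketch below does so.  DABND and DBBND
          are hypothetical helper names (not UMFPACK routines), and ANORM
          and XNORM are assumed to hold maxnorm (A) and maxnorm (x computed),
          which the caller must compute separately.

```fortran
C     Hypothetical helpers, not part of UMFPACK.
C     Bound on the size of dA (i,j), given the entry AIJ = A (i,j).
      DOUBLE PRECISION FUNCTION DABND (RINFO, AIJ)
      DOUBLE PRECISION RINFO (20), AIJ
      DABND = MAX (RINFO (7), RINFO (8)) * ABS (AIJ)
      RETURN
      END

C     Bound on the size of db (i), given BI = b (i), and with
C     ANORM = maxnorm (A) and XNORM = maxnorm (x computed)
C     supplied by the caller.
      DOUBLE PRECISION FUNCTION DBBND (RINFO, BI, ANORM, XNORM)
      DOUBLE PRECISION RINFO (20), BI, ANORM, XNORM
      DBBND = MAX (RINFO (7) * ABS (BI), RINFO (8) * ANORM * XNORM)
      RETURN
      END
```

          Both functions are meaningful only when iterative refinement was
          actually performed, since Rinfo (7) and Rinfo (8) are otherwise
          not set to backward error estimates.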


    UMD2SO arguments
    ----------------

                   ------------------------------------------------------------
          n:       An integer variable.
                   Must be set by caller on input (not modified).
                   Must be the same as passed to UMD2FA or UMD2RF.

                   ------------------------------------------------------------
          job:     An integer variable.
                   Must be set by caller on input (not modified).
                   What system to solve (see the transc argument below).
                   Iterative refinement is performed only if job = 0,
                   Icntl (8) > 0, and the original matrix was preserved
                   (job = 1 in UMD2FA or UMD2RF).

                   ------------------------------------------------------------
          transc:  A logical variable.
                   Must be set by caller on input (not modified).
                   Whether to solve with the L and U factors (transc = false)
                   or with L' and U' (transc = true).  The system solved also
                   depends on transa, as passed to UMD2FA or UMD2RF.
                   NOTE:  this argument MUST be .false. for the complex and
                   complex*16 versions!

                   If transa = false, then PAQ = LU was performed,
                   and the following systems are solved:

                                       transc = false          transc = true
                                       ----------------        ----------------
                          job = 0      solve Ax = b            solve A'x = b
                          job = 1      solve P'Lx = b          solve L'Px = b
                          job = 2      solve UQ'x = b          solve QU'x = b

                   If transa = true, then A was transposed prior to LU
                   factorization, P(A')Q = LU was performed, and the
                   following systems are solved:

                                       transc = false          transc = true
                                       ----------------        ----------------
                          job = 0      solve A'x = b           solve Ax = b
                          job = 1      solve P'Lx = b          solve L'Px = b
                          job = 2      solve UQ'x = b          solve QU'x = b

                   Other values of job are treated as zero.  Iterative
                   refinement can be done only when solving Ax=b or A'x=b.

                   The comments below use Matlab notation, where
                   x = L \ b means x = (L^(-1)) * b, premultiplication by
                   the inverse of L.

                   ------------------------------------------------------------
          lvalue:  An integer variable.
                   Must be set by caller on input (not modified).
                   The size of Value.

                   ------------------------------------------------------------
          lindex:  An integer variable.
                   Must be set by caller on input (not modified).
                   The size of Index.

                   ------------------------------------------------------------
          Value:   A double precision array of size lvalue.
                   Must be set by caller on input (normally from last call to
                   UMD2FA or UMD2RF) (not modified).
                   The LU factors, in Value (Keep (1) ... Keep (2)).
                   The entries in Value (1 ... Keep (1) - 1) and in
                   Value (Keep (2) + 1 ... lvalue) are not accessed.

                   ------------------------------------------------------------
          Index:   An integer array of size lindex.
                   Must be set by caller on input (normally from last call to
                   UMD2FA or UMD2RF) (not modified).
                   The LU factors, in Index (Keep (3) ... Keep (5)).
                   The entries in Index (1 ... Keep (3) - 1) and in
                   Index (Keep (5) + 1 ... lindex) are not accessed.

                   ------------------------------------------------------------
          Keep:    An integer array of size 20.

                   Keep (1..5): Must be set by caller on input (normally from
                       last call to UMD2FA or UMD2RF) (not modified).
                       Layout of the LU factors in Value and Index

                   ------------------------------------------------------------
          B:       A double precision array of size n.
                   Must be set by caller on input (not modified).
                   The right hand side, b, of the system to solve.

                   ------------------------------------------------------------
          W:       A double precision array of size 2*n or 4*n.
                   Need not be set by caller on input.  Modified on output.
                   Workspace.  W must be of size 2*n if Icntl (8) = 0, which
                   is the default value.  If iterative refinement is
                   performed, then W must be of size 4*n, and the
                   residual b-Ax (or b-A'x) is returned in W (1..n).

                   ------------------------------------------------------------
          X:       A double precision array of size n.
                   Need not be set by caller on input.  Modified on output.
                   The solution, x, of the system that was solved.  Valid only
                   if Info (1) is greater than or equal to 0.

                   ------------------------------------------------------------
          Cntl:    A double precision array of size 10.
                   Must be set by caller on input (not modified).
                   Real control parameters, see UMD21I for a description,
                   which sets the defaults.

                   ------------------------------------------------------------
          Icntl:   An integer array of size 20.
                   Must be set by caller on input (not modified).
                   Integer control parameters, see UMD21I for a description,
                   which sets the defaults.  In particular, Icntl (8) is
                   the maximum number of steps of iterative refinement to be
                   performed.

                   ------------------------------------------------------------
          Info:    An integer array of size 40.
                   Need not be set by caller on input.  Modified on output.
                   It contains information about the execution of UMD2SO.

                   Info (1) is the error flag.  If Info (1) is -7, then
                   the LU factors are uncomputed, or have been corrupted since
                   the last call to UMD2FA or UMD2RF.  No system is solved,
                   and X (1..n) is not valid on output.  If Info (1) is 8,
                   then iterative refinement was requested but cannot be done.
                   To perform iterative refinement, the original matrix must be
                   preserved (job = 1 in UMD2FA or UMD2RF) and Ax=b or A'x=b
                   must be solved (job = 0 in UMD2SO).  Info (24) is the
                   number of steps of iterative refinement actually taken.

                   ------------------------------------------------------------
          Rinfo:   A double precision array of size 20.
                   Need not be set by caller on input.  Modified on output.
                   It contains information about the execution of UMD2SO.

                   If iterative refinement was performed then
                   Rinfo (7) is the sparse backward error estimate, omega1, and
                   Rinfo (8) is the sparse backward error estimate, omega2.
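
          Putting the argument descriptions above together, a call that
          solves Ax = b with iterative refinement might look like the
          following minimal sketch.  LWSOLV and SOLVAX are hypothetical names
          and are not part of UMFPACK.  The sketch assumes that UMD2FA or
          UMD2RF was called with job = 1 and transa = .false., that the
          caller has set Icntl (8) > 0, and that W has at least
          LWSOLV (N, ICNTL) entries.

```fortran
C     Hypothetical helpers, not part of UMFPACK.
C     Size of W needed by UMD2SO:  2*n without iterative
C     refinement, 4*n when Icntl (8) > 0 requests it.
      INTEGER FUNCTION LWSOLV (N, ICNTL)
      INTEGER N, ICNTL (20)
      LWSOLV = 2*N
      IF (ICNTL (8) .GT. 0) LWSOLV = 4*N
      RETURN
      END

C     Solve Ax = b.  On return, OK is .true. if X is valid, and
C     BERR is max (Rinfo (7), Rinfo (8)) if iterative refinement
C     was performed, or -1 otherwise.
      SUBROUTINE SOLVAX (N, LVALUE, LINDEX, VALUE, INDEX, KEEP,
     $          B, X, W, CNTL, ICNTL, INFO, RINFO, BERR, OK)
      INTEGER N, LVALUE, LINDEX, INDEX (LINDEX), KEEP (20),
     $          ICNTL (20), INFO (40)
      DOUBLE PRECISION
     $          VALUE (LVALUE), B (N), X (N), W (*), CNTL (10),
     $          RINFO (20), BERR
      LOGICAL OK
C     job = 0 and transc = .false. select Ax = b when transa
C     was .false. in the factorization
      CALL UMD2SO (N, 0, .FALSE., LVALUE, LINDEX, VALUE,
     $          INDEX, KEEP, B, X, W, CNTL, ICNTL, INFO, RINFO)
C     X is valid only if Info (1) is zero or positive
      OK = INFO (1) .GE. 0
      BERR = -1.0D0
C     Info (1) = 8:  refinement was requested but not done
      IF (OK .AND. ICNTL (8) .GT. 0 .AND. INFO (1) .NE. 8) THEN
         BERR = MAX (RINFO (7), RINFO (8))
      ENDIF
      RETURN
      END
```

          After SOLVAX returns with OK set, Info (24) holds the number of
          refinement steps taken and W (1..n) holds the residual b-Ax,
          provided iterative refinement was performed.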




Acknowledgments
----------------

This work is supported by the National Science Foundation (DMS-9223088 and
DMS-9504974), and the State of Florida; and by CRAY Research Inc. through the
allocation of supercomputing resources.


Installation notes
------------------

The two HSL routines (MC13E and MC21B) contain additional licensing
restrictions.  If you want to run UMFPACK without them, see the "INSTALLATION
NOTE:" comment in UM*2FB.  Permutation to BTF will then not be available.

To permanently disable any diagnostic and/or error printing, see
the "INSTALLATION NOTE:" comments in UM*2P1 and UM*2P2.

To change the default control parameters, see the
"INSTALLATION NOTE:" comments in UM*21I.

IBM RS/6000:  be sure to use the latest (xlf) compiler release. UMFPACK Versions
2.0 and 2.1 cause an old version of the Fortran compiler to generate incorrect
code when the optimization option is used, resulting in a core dump when UMFPACK
is executed.  The error occurred only when (1) the matrix was reducible to block
triangular form, (2) the permutation to block triangular form was requested
(Icntl (4) = 1, the default), (3) the Fortran optimization flag was turned on,
and (4) an old version of the compiler was in use.  No problem arose when any
one of those four conditions did not hold.  If you encounter this problem, use
the latest xlf compiler release.  If that is not available, we recommend that
you set Icntl (4) to 0 (not the default), after calling UM*21I.
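
For the double precision version, that last recommendation might look like the
following minimal sketch.  NOBTF is a hypothetical wrapper name and is not part
of UMFPACK.  The UMD21I argument list (KEEP, CNTL, ICNTL) is assumed here and
should be checked against the UMD21I description before use.

```fortran
C     Hypothetical wrapper, not part of UMFPACK:  set the default
C     control parameters, then turn off permutation to block
C     triangular form.  The UMD21I argument list is an assumption.
      SUBROUTINE NOBTF (KEEP, CNTL, ICNTL)
      INTEGER KEEP (20), ICNTL (20)
      DOUBLE PRECISION CNTL (10)
      CALL UMD21I (KEEP, CNTL, ICNTL)
C     Icntl (4) = 1 is the default (BTF on), 0 turns it off
      ICNTL (4) = 0
      RETURN
      END
```

Calling NOBTF in place of UMD21I, before the first call to UMD2FA, leaves all
other control parameters at their defaults.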


For more information
--------------------

For more information, see T. A. Davis and I. S. Duff, "An
unsymmetric-pattern multifrontal method for sparse LU factorization",
SIAM J. Matrix Analysis and Applications (to appear), also
technical report TR-94-038, CISE Dept., Univ. of Florida,
P.O. Box 116120, Gainesville, FL 32611-6120, USA.  The method used
here is a modification of that method, described in T. A. Davis,
"A combined unifrontal/multifrontal method for unsymmetric sparse
matrices," TR-94-005, and in T. A. Davis and I. S. Duff, (same title),
TR-95-020.  (Technical reports are available via WWW at
http://www.cis.ufl.edu/).  The (unsymmetric) approximate degree update
algorithm used here has been incorporated into a symmetric approximate
minimum degree ordering algorithm, described in P. Amestoy, T. A. Davis,
and I. S. Duff, "An approximate minimum degree ordering algorithm",
SIAM Journal on Matrix Analysis and Applications (to appear, also TR-94-039).
The approximate minimum degree ordering algorithm is implemented as MC47
in the Harwell Subroutine Library (MC47 is not used in UMFPACK).
Also take a look at our World Wide Web home pages:
        Tim Davis:  http://www.cis.ufl.edu/~davis
                    (also email: davis@cis.ufl.edu).
        Iain Duff:  http://www.cse.clrc.ac.uk/Person/I.S.Duff

