diff --git a/BLAS/DGEMM.f b/BLAS/DGEMM.f deleted file mode 100644 index 4f4af89e..00000000 --- a/BLAS/DGEMM.f +++ /dev/null @@ -1,324 +0,0 @@ -! 006 LAPACK_BLAS_AUX ############################################################################################################## - - SUBROUTINE DGEMM ( TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, - $ BETA, C, LDC ) - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - - USE OUTA_HERE_Interface - -* .. Scalar Arguments .. - CHARACTER*1 TRANSA, TRANSB - INTEGER M, N, K, LDA, LDB, LDC - REAL(DOUBLE) ALPHA, BETA -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), B( LDB, * ), C( LDC, * ) - - LOGICAL LSAME - EXTERNAL LSAME -* .. -* -* Purpose -* ======= -* -* DGEMM performs one of the matrix-matrix operations -* -* C := alpha*op( A )*op( B ) + beta*C, -* -* where op( X ) is one of -* -* op( X ) = X or op( X ) = X', -* -* alpha and beta are scalars, and A, B and C are matrices, with op( A ) -* an m by k matrix, op( B ) a k by n matrix and C an m by n matrix. -* -* Parameters -* ========== -* -* TRANSA - CHARACTER*1. -* On entry, TRANSA specifies the form of op( A ) to be used in -* the matrix multiplication as follows: -* -* TRANSA = 'N' or 'n', op( A ) = A. -* -* TRANSA = 'T' or 't', op( A ) = A'. -* -* TRANSA = 'C' or 'c', op( A ) = A'. -* -* Unchanged on exit. -* -* TRANSB - CHARACTER*1. -* On entry, TRANSB specifies the form of op( B ) to be used in -* the matrix multiplication as follows: -* -* TRANSB = 'N' or 'n', op( B ) = B. -* -* TRANSB = 'T' or 't', op( B ) = B'. -* -* TRANSB = 'C' or 'c', op( B ) = B'. -* -* Unchanged on exit. -* -* M - INTEGER. -* On entry, M specifies the number of rows of the matrix -* op( A ) and of the matrix C. M must be at least zero. -* Unchanged on exit. -* -* N - INTEGER. -* On entry, N specifies the number of columns of the matrix -* op( B ) and the number of columns of the matrix C. N must be -* at least zero. -* Unchanged on exit. -* -* K - INTEGER. -* On entry, K specifies the number of columns of the matrix -* op( A ) and the number of rows of the matrix op( B ). K must -* be at least zero. -* Unchanged on exit. -* -* ALPHA - REAL(DOUBLE). -* On entry, ALPHA specifies the scalar alpha. -* Unchanged on exit. -* -* A - REAL(DOUBLE) array of DIMENSION ( LDA, ka ), where ka is -* k when TRANSA = 'N' or 'n', and is m otherwise. -* Before entry with TRANSA = 'N' or 'n', the leading m by k -* part of the array A must contain the matrix A, otherwise -* the leading k by m part of the array A must contain the -* matrix A. -* Unchanged on exit. -* -* LDA - INTEGER. -* On entry, LDA specifies the first dimension of A as declared -* in the calling (sub) program. When TRANSA = 'N' or 'n' then -* LDA must be at least max( 1, m ), otherwise LDA must be at -* least max( 1, k ). -* Unchanged on exit. -* -* B - REAL(DOUBLE) array of DIMENSION ( LDB, kb ), where kb is -* n when TRANSB = 'N' or 'n', and is k otherwise. -* Before entry with TRANSB = 'N' or 'n', the leading k by n -* part of the array B must contain the matrix B, otherwise -* the leading n by k part of the array B must contain the -* matrix B. -* Unchanged on exit. -* -* LDB - INTEGER. -* On entry, LDB specifies the first dimension of B as declared -* in the calling (sub) program. When TRANSB = 'N' or 'n' then -* LDB must be at least max( 1, k ), otherwise LDB must be at -* least max( 1, n ). -* Unchanged on exit. -* -* BETA - REAL(DOUBLE). -* On entry, BETA specifies the scalar beta. When BETA is -* supplied as zero then C need not be set on input. -* Unchanged on exit. -* -* C - REAL(DOUBLE) array of DIMENSION ( LDC, n ). -* Before entry, the leading m by n part of the array C must -* contain the matrix C, except when beta is zero, in which -* case C need not be set on entry. -* On exit, the array C is overwritten by the m by n matrix -* ( alpha*op( A )*op( B ) + beta*C ). -* -* LDC - INTEGER. -* On entry, LDC specifies the first dimension of C as declared -* in the calling (sub) program. LDC must be at least -* max( 1, m ). -* Unchanged on exit. -* -* -* Level 3 Blas routine. -* -* -- Written on 8-February-1989. -* Jack Dongarra, Argonne National Laboratory. -* Iain Duff, AERE Harwell. -* Jeremy Du Croz, Numerical Algorithms Group Ltd. -* Sven Hammarling, Numerical Algorithms Group Ltd. -* -* -* .. External Functions .. -* .. External Subroutines .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. Local Scalars .. - LOGICAL NOTA, NOTB - INTEGER I, INFO, J, L, NCOLA, NROWA, NROWB - REAL(DOUBLE) TEMP -* .. Parameters .. - REAL(DOUBLE) ONE , ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Executable Statements .. -* -* Set NOTA and NOTB as true if A and B respectively are not -* transposed and set NROWA, NCOLA and NROWB as the number of rows -* and columns of A and the number of rows of B respectively. -* - NOTA = LSAME( TRANSA, 'N' ) - NOTB = LSAME( TRANSB, 'N' ) - IF( NOTA )THEN - NROWA = M - NCOLA = K - ELSE - NROWA = K - NCOLA = M - END IF - IF( NOTB )THEN - NROWB = K - ELSE - NROWB = N - END IF -* -* Test the input parameters. -* - INFO = 0 - IF( ( .NOT.NOTA ).AND. - $ ( .NOT.LSAME( TRANSA, 'C' ) ).AND. - $ ( .NOT.LSAME( TRANSA, 'T' ) ) )THEN - INFO = 1 - ELSE IF( ( .NOT.NOTB ).AND. - $ ( .NOT.LSAME( TRANSB, 'C' ) ).AND. - $ ( .NOT.LSAME( TRANSB, 'T' ) ) )THEN - INFO = 2 - ELSE IF( M .LT.0 )THEN - INFO = 3 - ELSE IF( N .LT.0 )THEN - INFO = 4 - ELSE IF( K .LT.0 )THEN - INFO = 5 - ELSE IF( LDA.LT.MAX( 1, NROWA ) )THEN - INFO = 8 - ELSE IF( LDB.LT.MAX( 1, NROWB ) )THEN - INFO = 10 - ELSE IF( LDC.LT.MAX( 1, M ) )THEN - INFO = 13 - END IF - IF( INFO.NE.0 )THEN - CALL XERBLA( 'DGEMM ', INFO ) - RETURN - END IF -* -* Quick return if possible. -* - IF( ( M.EQ.0 ).OR.( N.EQ.0 ).OR. - $ ( ( ( ALPHA.EQ.ZERO ).OR.( K.EQ.0 ) ).AND.( BETA.EQ.ONE ) ) ) - $ RETURN -* -* And if alpha.eq.zero. -* - IF( ALPHA.EQ.ZERO )THEN - IF( BETA.EQ.ZERO )THEN - DO 20, J = 1, N - DO 10, I = 1, M - C( I, J ) = ZERO - 10 CONTINUE - 20 CONTINUE - ELSE - DO 40, J = 1, N - DO 30, I = 1, M - C( I, J ) = BETA*C( I, J ) - 30 CONTINUE - 40 CONTINUE - END IF - RETURN - END IF -* -* Start the operations. -* - IF( NOTB )THEN - IF( NOTA )THEN -* -* Form C := alpha*A*B + beta*C. -* - DO 90, J = 1, N - IF( BETA.EQ.ZERO )THEN - DO 50, I = 1, M - C( I, J ) = ZERO - 50 CONTINUE - ELSE IF( BETA.NE.ONE )THEN - DO 60, I = 1, M - C( I, J ) = BETA*C( I, J ) - 60 CONTINUE - END IF - DO 80, L = 1, K - IF( B( L, J ).NE.ZERO )THEN - TEMP = ALPHA*B( L, J ) - DO 70, I = 1, M - C( I, J ) = C( I, J ) + TEMP*A( I, L ) - 70 CONTINUE - END IF - 80 CONTINUE - 90 CONTINUE - ELSE -* -* Form C := alpha*A'*B + beta*C -* - DO 120, J = 1, N - DO 110, I = 1, M - TEMP = ZERO - DO 100, L = 1, K - TEMP = TEMP + A( L, I )*B( L, J ) - 100 CONTINUE - IF( BETA.EQ.ZERO )THEN - C( I, J ) = ALPHA*TEMP - ELSE - C( I, J ) = ALPHA*TEMP + BETA*C( I, J ) - END IF - 110 CONTINUE - 120 CONTINUE - END IF - ELSE - IF( NOTA )THEN -* -* Form C := alpha*A*B' + beta*C -* - DO 170, J = 1, N - IF( BETA.EQ.ZERO )THEN - DO 130, I = 1, M - C( I, J ) = ZERO - 130 CONTINUE - ELSE IF( BETA.NE.ONE )THEN - DO 140, I = 1, M - C( I, J ) = BETA*C( I, J ) - 140 CONTINUE - END IF - DO 160, L = 1, K - IF( B( J, L ).NE.ZERO )THEN - TEMP = ALPHA*B( J, L ) - DO 150, I = 1, M - C( I, J ) = C( I, J ) + TEMP*A( I, L ) - 150 CONTINUE - END IF - 160 CONTINUE - 170 CONTINUE - ELSE -* -* Form C := alpha*A'*B' + beta*C -* - DO 200, J = 1, N - DO 190, I = 1, M - TEMP = ZERO - DO 180, L = 1, K - TEMP = TEMP + A( L, I )*B( J, L ) - 180 CONTINUE - IF( BETA.EQ.ZERO )THEN - C( I, J ) = ALPHA*TEMP - ELSE - C( I, J ) = ALPHA*TEMP + BETA*C( I, J ) - END IF - 190 CONTINUE - 200 CONTINUE - END IF - END IF -* - RETURN -* -* End of DGEMM . -* - END SUBROUTINE DGEMM - diff --git a/BLAS/DGEMV.f b/BLAS/DGEMV.f deleted file mode 100644 index 9da670dd..00000000 --- a/BLAS/DGEMV.f +++ /dev/null @@ -1,272 +0,0 @@ -! 007 LAPACK_BLAS_AUX ############################################################################################################## - - SUBROUTINE DGEMV ( TRANS, M, N, ALPHA, A, LDA, X, INCX, - $ BETA, Y, INCY ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - - USE OUTA_HERE_Interface - -* .. Scalar Arguments .. - REAL(DOUBLE) ALPHA, BETA - INTEGER INCX, INCY, LDA, M, N - CHARACTER*1 TRANS -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), X( * ), Y( * ) -* .. -* -* Purpose -* ======= -* -* DGEMV performs one of the matrix-vector operations -* -* y := alpha*A*x + beta*y, or y := alpha*A'*x + beta*y, -* -* where alpha and beta are scalars, x and y are vectors and A is an -* m by n matrix. -* -* Parameters -* ========== -* -* TRANS - CHARACTER*1. -* On entry, TRANS specifies the operation to be performed as -* follows: -* -* TRANS = 'N' or 'n' y := alpha*A*x + beta*y. -* -* TRANS = 'T' or 't' y := alpha*A'*x + beta*y. -* -* TRANS = 'C' or 'c' y := alpha*A'*x + beta*y. -* -* Unchanged on exit. -* -* M - INTEGER. -* On entry, M specifies the number of rows of the matrix A. -* M must be at least zero. -* Unchanged on exit. -* -* N - INTEGER. -* On entry, N specifies the number of columns of the matrix A. -* N must be at least zero. -* Unchanged on exit. -* -* ALPHA - REAL(DOUBLE). -* On entry, ALPHA specifies the scalar alpha. -* Unchanged on exit. -* -* A - REAL(DOUBLE) array of DIMENSION ( LDA, n ). -* Before entry, the leading m by n part of the array A must -* contain the matrix of coefficients. -* Unchanged on exit. -* -* LDA - INTEGER. -* On entry, LDA specifies the first dimension of A as declared -* in the calling (sub) program. LDA must be at least -* max( 1, m ). -* Unchanged on exit. -* -* X - REAL(DOUBLE) array of DIMENSION at least -* ( 1 + ( n - 1 )*abs( INCX ) ) when TRANS = 'N' or 'n' -* and at least -* ( 1 + ( m - 1 )*abs( INCX ) ) otherwise. -* Before entry, the incremented array X must contain the -* vector x. -* Unchanged on exit. -* -* INCX - INTEGER. -* On entry, INCX specifies the increment for the elements of -* X. INCX must not be zero. -* Unchanged on exit. -* -* BETA - REAL(DOUBLE). -* On entry, BETA specifies the scalar beta. When BETA is -* supplied as zero then Y need not be set on input. -* Unchanged on exit. -* -* Y - REAL(DOUBLE) array of DIMENSION at least -* ( 1 + ( m - 1 )*abs( INCY ) ) when TRANS = 'N' or 'n' -* and at least -* ( 1 + ( n - 1 )*abs( INCY ) ) otherwise. -* Before entry with BETA non-zero, the incremented array Y -* must contain the vector y. On exit, Y is overwritten by the -* updated vector y. -* -* INCY - INTEGER. -* On entry, INCY specifies the increment for the elements of -* Y. INCY must not be zero. -* Unchanged on exit. -* -* -* Level 2 Blas routine. -* -* -- Written on 22-October-1986. -* Jack Dongarra, Argonne National Lab. -* Jeremy Du Croz, Nag Central Office. -* Sven Hammarling, Nag Central Office. -* Richard Hanson, Sandia National Labs. -* -* -* .. Parameters .. - REAL(DOUBLE) ONE , ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. Local Scalars .. - REAL(DOUBLE) TEMP - INTEGER I, INFO, IX, IY, J, JX, JY, KX, KY, LENX, LENY -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. External Subroutines .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - IF ( .NOT.LSAME( TRANS, 'N' ).AND. - $ .NOT.LSAME( TRANS, 'T' ).AND. - $ .NOT.LSAME( TRANS, 'C' ) )THEN - INFO = 1 - ELSE IF( M.LT.0 )THEN - INFO = 2 - ELSE IF( N.LT.0 )THEN - INFO = 3 - ELSE IF( LDA.LT.MAX( 1, M ) )THEN - INFO = 6 - ELSE IF( INCX.EQ.0 )THEN - INFO = 8 - ELSE IF( INCY.EQ.0 )THEN - INFO = 11 - END IF - IF( INFO.NE.0 )THEN - CALL XERBLA( 'DGEMV ', INFO ) - RETURN - END IF -* -* Quick return if possible. -* - IF( ( M.EQ.0 ).OR.( N.EQ.0 ).OR. - $ ( ( ALPHA.EQ.ZERO ).AND.( BETA.EQ.ONE ) ) ) - $ RETURN -* -* Set LENX and LENY, the lengths of the vectors x and y, and set -* up the start points in X and Y. -* - IF( LSAME( TRANS, 'N' ) )THEN - LENX = N - LENY = M - ELSE - LENX = M - LENY = N - END IF - IF( INCX.GT.0 )THEN - KX = 1 - ELSE - KX = 1 - ( LENX - 1 )*INCX - END IF - IF( INCY.GT.0 )THEN - KY = 1 - ELSE - KY = 1 - ( LENY - 1 )*INCY - END IF -* -* Start the operations. In this version the elements of A are -* accessed sequentially with one pass through A. -* -* First form y := beta*y. -* - IF( BETA.NE.ONE )THEN - IF( INCY.EQ.1 )THEN - IF( BETA.EQ.ZERO )THEN - DO 10, I = 1, LENY - Y( I ) = ZERO - 10 CONTINUE - ELSE - DO 20, I = 1, LENY - Y( I ) = BETA*Y( I ) - 20 CONTINUE - END IF - ELSE - IY = KY - IF( BETA.EQ.ZERO )THEN - DO 30, I = 1, LENY - Y( IY ) = ZERO - IY = IY + INCY - 30 CONTINUE - ELSE - DO 40, I = 1, LENY - Y( IY ) = BETA*Y( IY ) - IY = IY + INCY - 40 CONTINUE - END IF - END IF - END IF - IF( ALPHA.EQ.ZERO ) - $ RETURN - IF( LSAME( TRANS, 'N' ) )THEN -* -* Form y := alpha*A*x + y. -* - JX = KX - IF( INCY.EQ.1 )THEN - DO 60, J = 1, N - IF( X( JX ).NE.ZERO )THEN - TEMP = ALPHA*X( JX ) - DO 50, I = 1, M - Y( I ) = Y( I ) + TEMP*A( I, J ) - 50 CONTINUE - END IF - JX = JX + INCX - 60 CONTINUE - ELSE - DO 80, J = 1, N - IF( X( JX ).NE.ZERO )THEN - TEMP = ALPHA*X( JX ) - IY = KY - DO 70, I = 1, M - Y( IY ) = Y( IY ) + TEMP*A( I, J ) - IY = IY + INCY - 70 CONTINUE - END IF - JX = JX + INCX - 80 CONTINUE - END IF - ELSE -* -* Form y := alpha*A'*x + y. -* - JY = KY - IF( INCX.EQ.1 )THEN - DO 100, J = 1, N - TEMP = ZERO - DO 90, I = 1, M - TEMP = TEMP + A( I, J )*X( I ) - 90 CONTINUE - Y( JY ) = Y( JY ) + ALPHA*TEMP - JY = JY + INCY - 100 CONTINUE - ELSE - DO 120, J = 1, N - TEMP = ZERO - IX = KX - DO 110, I = 1, M - TEMP = TEMP + A( I, J )*X( IX ) - IX = IX + INCX - 110 CONTINUE - Y( JY ) = Y( JY ) + ALPHA*TEMP - JY = JY + INCY - 120 CONTINUE - END IF - END IF -* - RETURN -* -* End of DGEMV . -* - END SUBROUTINE DGEMV - diff --git a/BLAS/DLAMCH.f b/BLAS/DLAMCH.f deleted file mode 100644 index 07d9778e..00000000 --- a/BLAS/DLAMCH.f +++ /dev/null @@ -1,138 +0,0 @@ -! 031 LAPACK_BLAS_AUX ############################################################################################################## - - DOUBLE PRECISION FUNCTION DLAMCH( CMACH ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - USE LAPACK_BLAS_AUX - - USE OUTA_HERE_Interface - -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* October 31, 1992 -* -* .. Scalar Arguments .. - CHARACTER CMACH -* .. -* -* Purpose -* ======= -* -* DLAMCH determines REAL(DOUBLE) machine parameters. -* -* Arguments -* ========= -* -* CMACH (input) CHARACTER*1 -* Specifies the value to be returned by DLAMCH: -* = 'E' or 'e', DLAMCH := eps -* = 'S' or 's , DLAMCH := sfmin -* = 'B' or 'b', DLAMCH := base -* = 'P' or 'p', DLAMCH := eps*base -* = 'N' or 'n', DLAMCH := t -* = 'R' or 'r', DLAMCH := rnd -* = 'M' or 'm', DLAMCH := emin -* = 'U' or 'u', DLAMCH := rmin -* = 'L' or 'l', DLAMCH := emax -* = 'O' or 'o', DLAMCH := rmax -* -* where -* -* eps = relative machine precision -* sfmin = safe minimum, such that 1/sfmin does not overflow -* base = base of the machine -* prec = eps*base -* t = number of (base) digits in the mantissa -* rnd = 1.0 when rounding occurs in addition, 0.0 otherwise -* emin = minimum exponent before (gradual) underflow -* rmin = underflow threshold - base**(emin-1) -* emax = largest exponent before overflow -* rmax = overflow threshold - (base**emax)*(1-eps) -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL FIRST, LRND - INTEGER BETA, IMAX, IMIN, IT - REAL(DOUBLE) BASE, EMAX, EMIN, EPS, PREC, RMACH, RMAX, RMIN, - $ RND, SFMIN, SMALL, T -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Save statement .. - SAVE FIRST, EPS, SFMIN, BASE, T, RND, EMIN, RMIN, - $ EMAX, RMAX, PREC -* .. -* .. Data statements .. - DATA FIRST / .TRUE. / -* .. -* .. Executable Statements .. -* - IF( FIRST ) THEN - FIRST = .FALSE. - CALL DLAMC2( BETA, IT, LRND, EPS, IMIN, RMIN, IMAX, RMAX ) - BASE = BETA - T = IT - IF( LRND ) THEN - RND = ONE - EPS = ( BASE**( 1-IT ) ) / 2 - ELSE - RND = ZERO - EPS = BASE**( 1-IT ) - END IF - PREC = EPS*BASE - EMIN = IMIN - EMAX = IMAX - SFMIN = RMIN - SMALL = ONE / RMAX - IF( SMALL.GE.SFMIN ) THEN -* -* Use SMALL plus a bit, to avoid the possibility of rounding -* causing overflow when computing 1/sfmin. -* - SFMIN = SMALL*( ONE+EPS ) - END IF - END IF -* - IF( LSAME( CMACH, 'E' ) ) THEN - RMACH = EPS - ELSE IF( LSAME( CMACH, 'S' ) ) THEN - RMACH = SFMIN - ELSE IF( LSAME( CMACH, 'B' ) ) THEN - RMACH = BASE - ELSE IF( LSAME( CMACH, 'P' ) ) THEN - RMACH = PREC - ELSE IF( LSAME( CMACH, 'N' ) ) THEN - RMACH = T - ELSE IF( LSAME( CMACH, 'R' ) ) THEN - RMACH = RND - ELSE IF( LSAME( CMACH, 'M' ) ) THEN - RMACH = EMIN - ELSE IF( LSAME( CMACH, 'U' ) ) THEN - RMACH = RMIN - ELSE IF( LSAME( CMACH, 'L' ) ) THEN - RMACH = EMAX - ELSE IF( LSAME( CMACH, 'O' ) ) THEN - RMACH = RMAX - END IF -* - DLAMCH = RMACH - RETURN -* -* End of DLAMCH -* - END FUNCTION DLAMCH -* diff --git a/BLAS/DLANST.f b/BLAS/DLANST.f deleted file mode 100644 index c3fe41ae..00000000 --- a/BLAS/DLANST.f +++ /dev/null @@ -1,136 +0,0 @@ -! 038 LAPACK_BLAS_AUX ############################################################################################################## - - DOUBLE PRECISION FUNCTION DLANST( NORM, N, D, E ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - USE LAPACK_BLAS_AUX - - USE OUTA_HERE_Interface -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. - CHARACTER NORM - INTEGER N -* .. -* .. Array Arguments .. - REAL(DOUBLE) D( * ), E( * ) -* .. -* -* Purpose -* ======= -* -* DLANST returns the value of the one norm, or the Frobenius norm, or -* the infinity norm, or the element of largest absolute value of a -* real symmetric tridiagonal matrix A. -* -* Description -* =========== -* -* DLANST returns the value -* -* DLANST = ( max(abs(A(i,j))), NORM = 'M' or 'm' -* ( -* ( norm1(A), NORM = '1', 'O' or 'o' -* ( -* ( normI(A), NORM = 'I' or 'i' -* ( -* ( normF(A), NORM = 'F', 'f', 'E' or 'e' -* -* where norm1 denotes the one norm of a matrix (maximum column sum), -* normI denotes the infinity norm of a matrix (maximum row sum) and -* normF denotes the Frobenius norm of a matrix (square root of sum of -* squares). Note that max(abs(A(i,j))) is not a matrix norm. -* -* Arguments -* ========= -* -* NORM (input) CHARACTER*1 -* Specifies the value to be returned in DLANST as described -* above. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. When N = 0, DLANST is -* set to zero. -* -* D (input) REAL(DOUBLE) array, dimension (N) -* The diagonal elements of A. -* -* E (input) REAL(DOUBLE) array, dimension (N-1) -* The (n-1) sub-diagonal or super-diagonal elements of A. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I - REAL(DOUBLE) ANORM, SCALE, SUM -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, MAX, SQRT -* .. -* .. Executable Statements .. -* - IF( N.LE.0 ) THEN - ANORM = ZERO - ELSE IF( LSAME( NORM, 'M' ) ) THEN -* -* Find max(abs(A(i,j))). -* - ANORM = ABS( D( N ) ) - DO 10 I = 1, N - 1 - ANORM = MAX( ANORM, ABS( D( I ) ) ) - ANORM = MAX( ANORM, ABS( E( I ) ) ) - 10 CONTINUE - ELSE IF( LSAME( NORM, 'O' ) .OR. NORM.EQ.'1' .OR. - $ LSAME( NORM, 'I' ) ) THEN -* -* Find norm1(A). -* - IF( N.EQ.1 ) THEN - ANORM = ABS( D( 1 ) ) - ELSE - ANORM = MAX( ABS( D( 1 ) )+ABS( E( 1 ) ), - $ ABS( E( N-1 ) )+ABS( D( N ) ) ) - DO 20 I = 2, N - 1 - ANORM = MAX( ANORM, ABS( D( I ) )+ABS( E( I ) )+ - $ ABS( E( I-1 ) ) ) - 20 CONTINUE - END IF - ELSE IF( ( LSAME( NORM, 'F' ) ) .OR. ( LSAME( NORM, 'E' ) ) ) THEN -* -* Find normF(A). -* - SCALE = ZERO - SUM = ONE - IF( N.GT.1 ) THEN - CALL DLASSQ( N-1, E, 1, SCALE, SUM ) - SUM = 2*SUM - END IF - CALL DLASSQ( N, D, 1, SCALE, SUM ) - ANORM = SCALE*SQRT( SUM ) - END IF -* - DLANST = ANORM - RETURN -* -* End of DLANST -* - END FUNCTION DLANST - diff --git a/BLAS/DSCAL.f b/BLAS/DSCAL.f deleted file mode 100644 index b1baaaf4..00000000 --- a/BLAS/DSCAL.f +++ /dev/null @@ -1,56 +0,0 @@ -! 012 LAPACK_BLAS_AUX ############################################################################################################## - - SUBROUTINE DSCAL(N,DA,DX,INCX) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - - USE OUTA_HERE_Interface - -c -c scales a vector by a constant. -c uses unrolled loops for increment equal to one. -c jack dongarra, linpack, 3/11/78. -c modified 3/93 to return if incx .le. 0. -c modified 12/3/93, array(1) declarations changed to array(*) -c - REAL(DOUBLE) da,dx(*) - integer i,incx,m,mp1,n,nincx -c - if( n.le.0 .or. incx.le.0 )return - if(incx.eq.1)go to 20 -c -c code for increment not equal to 1 -c - nincx = n*incx - do 10 i = 1,nincx,incx - dx(i) = da*dx(i) - 10 continue - return -c -c code for increment equal to 1 -c -c -c clean-up loop -c - 20 m = mod(n,5) - if( m .eq. 0 ) go to 40 - do 30 i = 1,m - dx(i) = da*dx(i) - 30 continue - if( n .lt. 5 ) return - 40 mp1 = m + 1 - do 50 i = mp1,n,5 - dx(i) = da*dx(i) - dx(i + 1) = da*dx(i + 1) - dx(i + 2) = da*dx(i + 2) - dx(i + 3) = da*dx(i + 3) - dx(i + 4) = da*dx(i + 4) - 50 continue - return - - END SUBROUTINE DSCAL - diff --git a/BLAS/DSTEQR.f b/BLAS/DSTEQR.f deleted file mode 100644 index 4bce899e..00000000 --- a/BLAS/DSTEQR.f +++ /dev/null @@ -1,516 +0,0 @@ -! 002 LAPACK_MISCEL ################################################################################################################ - - SUBROUTINE DSTEQR( COMPZ, N, D, E, Z, LDZ, WORK, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : FATAL_ERR - USE LAPACK_BLAS_AUX - - USE OUTA_HERE_Interface - -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. - CHARACTER COMPZ - INTEGER INFO, LDZ, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) D( * ), E( * ), WORK( * ), Z( LDZ, * ) - -* .. -* -* Purpose -* ======= -* -* DSTEQR computes all eigenvalues and, optionally, eigenvectors of a -* symmetric tridiagonal matrix using the implicit QL or QR method. -* The eigenvectors of a full or band symmetric matrix can also be found -* if DSYTRD or DSPTRD or DSBTRD has been used to reduce this matrix to -* tridiagonal form. -* -* Arguments -* ========= -* -* COMPZ (input) CHARACTER*1 -* = 'N': Compute eigenvalues only. -* = 'V': Compute eigenvalues and eigenvectors of the original -* symmetric matrix. On entry, Z must contain the -* orthogonal matrix used to reduce the original matrix -* to tridiagonal form. -* = 'I': Compute eigenvalues and eigenvectors of the -* tridiagonal matrix. Z is initialized to the identity -* matrix. -* -* N (input) INTEGER -* The order of the matrix. N >= 0. -* -* D (input/output) REAL(DOUBLE) array, dimension (N) -* On entry, the diagonal elements of the tridiagonal matrix. -* On exit, if INFO = 0, the eigenvalues in ascending order. -* -* E (input/output) REAL(DOUBLE) array, dimension (N-1) -* On entry, the (n-1) subdiagonal elements of the tridiagonal -* matrix. -* On exit, E has been destroyed. -* -* Z (input/output) REAL(DOUBLE) array, dimension (LDZ, N) -* On entry, if COMPZ = 'V', then Z contains the orthogonal -* matrix used in the reduction to tridiagonal form. -* On exit, if INFO = 0, then if COMPZ = 'V', Z contains the -* orthonormal eigenvectors of the original symmetric matrix, -* and if COMPZ = 'I', Z contains the orthonormal eigenvectors -* of the symmetric tridiagonal matrix. -* If COMPZ = 'N', then Z is not referenced. -* -* LDZ (input) INTEGER -* The leading dimension of the array Z. LDZ >= 1, and if -* eigenvectors are desired, then LDZ >= max(1,N). -* -* WORK (workspace) REAL(DOUBLE) array, dimension (max(1,2*N-2)) -* If COMPZ = 'N', then WORK is not referenced. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: the algorithm has failed to find all the eigenvalues in -* a total of 30*N iterations; if INFO = i, then i -* elements of E have not converged to zero; on exit, D -* and E contain the elements of a symmetric tridiagonal -* matrix which is orthogonally similar to the original -* matrix. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE, TWO, THREE - PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0, - $ THREE = 3.0D0 ) - INTEGER MAXIT - PARAMETER ( MAXIT = 30 ) -* .. -* .. Local Scalars .. - INTEGER I, ICOMPZ, II, ISCALE, J, JTOT, K, L, L1, LEND, - $ LENDM1, LENDP1, LENDSV, LM1, LSV, M, MM, MM1, - $ NM1, NMAXIT - REAL(DOUBLE) ANORM, B, C, EPS, EPS2, F, G, P, R, RT1, RT2, - $ S, SAFMAX, SAFMIN, SSFMAX, SSFMIN, TST -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, MAX, SIGN, SQRT - - REAL(DOUBLE) DLAMCH, DLANST - EXTERNAL DLAMCH, DLANST -* .. -* .. Executable Statements .. - -* -* Test the input parameters. -* - INFO = 0 -* - IF( LSAME( COMPZ, 'N' ) ) THEN - ICOMPZ = 0 - ELSE IF( LSAME( COMPZ, 'V' ) ) THEN - ICOMPZ = 1 - ELSE IF( LSAME( COMPZ, 'I' ) ) THEN - ICOMPZ = 2 - ELSE - ICOMPZ = -1 - END IF - IF( ICOMPZ.LT.0 ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( ( LDZ.LT.1 ) .OR. ( ICOMPZ.GT.0 .AND. LDZ.LT.MAX( 1, - $ N ) ) ) THEN - INFO = -6 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DSTEQR', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* - IF( N.EQ.1 ) THEN - IF( ICOMPZ.EQ.2 ) - $ Z( 1, 1 ) = ONE - RETURN - END IF -* -* Determine the unit roundoff and over/underflow thresholds. -* - EPS = DLAMCH( 'E' ) - EPS2 = EPS**2 - SAFMIN = DLAMCH( 'S' ) - SAFMAX = ONE / SAFMIN - SSFMAX = SQRT( SAFMAX ) / THREE - SSFMIN = SQRT( SAFMIN ) / EPS2 -* -* Compute the eigenvalues and eigenvectors of the tridiagonal -* matrix. -* - IF( ICOMPZ.EQ.2 ) - $ CALL DLASET( 'Full', N, N, ZERO, ONE, Z, LDZ ) -* - NMAXIT = N*MAXIT - JTOT = 0 -* -* Determine where the matrix splits and choose QL or QR iteration -* for each block, according to whether top or bottom diagonal -* element is smaller. -* - L1 = 1 - NM1 = N - 1 -* - 10 CONTINUE - IF( L1.GT.N ) - $ GO TO 160 - IF( L1.GT.1 ) - $ E( L1-1 ) = ZERO - IF( L1.LE.NM1 ) THEN - DO 20 M = L1, NM1 - TST = ABS( E( M ) ) - IF( TST.EQ.ZERO ) - $ GO TO 30 - IF( TST.LE.( SQRT( ABS( D( M ) ) )*SQRT( ABS( D( M+ - $ 1 ) ) ) )*EPS ) THEN - E( M ) = ZERO - GO TO 30 - END IF - 20 CONTINUE - END IF - M = N -* - 30 CONTINUE - L = L1 - LSV = L - LEND = M - LENDSV = LEND - L1 = M + 1 - IF( LEND.EQ.L ) - $ GO TO 10 -* -* Scale submatrix in rows and columns L to LEND -* - ANORM = DLANST( 'I', LEND-L+1, D( L ), E( L ) ) - ISCALE = 0 - IF( ANORM.EQ.ZERO ) - $ GO TO 10 - IF( ANORM.GT.SSFMAX ) THEN - ISCALE = 1 - CALL DLASCL( 'G', 0, 0, ANORM, SSFMAX, LEND-L+1, 1, D( L ), N, - $ INFO ) - CALL DLASCL( 'G', 0, 0, ANORM, SSFMAX, LEND-L, 1, E( L ), N, - $ INFO ) - ELSE IF( ANORM.LT.SSFMIN ) THEN - ISCALE = 2 - CALL DLASCL( 'G', 0, 0, ANORM, SSFMIN, LEND-L+1, 1, D( L ), N, - $ INFO ) - CALL DLASCL( 'G', 0, 0, ANORM, SSFMIN, LEND-L, 1, E( L ), N, - $ INFO ) - END IF -* -* Choose between QL and QR iteration -* - IF( ABS( D( LEND ) ).LT.ABS( D( L ) ) ) THEN - LEND = LSV - L = LENDSV - END IF -* - IF( LEND.GT.L ) THEN -* -* QL Iteration -* -* Look for small subdiagonal element. -* - 40 CONTINUE - IF( L.NE.LEND ) THEN - LENDM1 = LEND - 1 - DO 50 M = L, LENDM1 - TST = ABS( E( M ) )**2 - IF( TST.LE.( EPS2*ABS( D( M ) ) )*ABS( D( M+1 ) )+ - $ SAFMIN )GO TO 60 - 50 CONTINUE - END IF -* - M = LEND -* - 60 CONTINUE - IF( M.LT.LEND ) - $ E( M ) = ZERO - P = D( L ) - IF( M.EQ.L ) - $ GO TO 80 -* -* If remaining matrix is 2-by-2, use DLAE2 or SLAEV2 -* to compute its eigensystem. -* - IF( M.EQ.L+1 ) THEN - IF( ICOMPZ.GT.0 ) THEN - CALL DLAEV2( D( L ), E( L ), D( L+1 ), RT1, RT2, C, S ) - WORK( L ) = C - WORK( N-1+L ) = S - CALL DLASR( 'R', 'V', 'B', N, 2, WORK( L ), - $ WORK( N-1+L ), Z( 1, L ), LDZ ) - ELSE - CALL DLAE2( D( L ), E( L ), D( L+1 ), RT1, RT2 ) - END IF - D( L ) = RT1 - D( L+1 ) = RT2 - E( L ) = ZERO - L = L + 2 - IF( L.LE.LEND ) - $ GO TO 40 - GO TO 140 - END IF -* - IF( JTOT.EQ.NMAXIT ) - $ GO TO 140 - JTOT = JTOT + 1 -* -* Form shift. -* - G = ( D( L+1 )-P ) / ( TWO*E( L ) ) - R = DLAPY2( G, ONE ) - G = D( M ) - P + ( E( L ) / ( G+SIGN( R, G ) ) ) -* - S = ONE - C = ONE - P = ZERO -* -* Inner loop -* - MM1 = M - 1 - DO 70 I = MM1, L, -1 - F = S*E( I ) - B = C*E( I ) - CALL DLARTG( G, F, C, S, R ) - IF( I.NE.M-1 ) - $ E( I+1 ) = R - G = D( I+1 ) - P - R = ( D( I )-G )*S + TWO*C*B - P = S*R - D( I+1 ) = G + P - G = C*R - B -* -* If eigenvectors are desired, then save rotations. -* - IF( ICOMPZ.GT.0 ) THEN - WORK( I ) = C - WORK( N-1+I ) = -S - END IF -* - 70 CONTINUE -* -* If eigenvectors are desired, then apply saved rotations. -* - IF( ICOMPZ.GT.0 ) THEN - MM = M - L + 1 - CALL DLASR( 'R', 'V', 'B', N, MM, WORK( L ), WORK( N-1+L ), - $ Z( 1, L ), LDZ ) - END IF -* - D( L ) = D( L ) - P - E( L ) = G - GO TO 40 -* -* Eigenvalue found. -* - 80 CONTINUE - D( L ) = P -* - L = L + 1 - IF( L.LE.LEND ) - $ GO TO 40 - GO TO 140 -* - ELSE -* -* QR Iteration -* -* Look for small superdiagonal element. -* - 90 CONTINUE - IF( L.NE.LEND ) THEN - LENDP1 = LEND + 1 - DO 100 M = L, LENDP1, -1 - TST = ABS( E( M-1 ) )**2 - IF( TST.LE.( EPS2*ABS( D( M ) ) )*ABS( D( M-1 ) )+ - $ SAFMIN )GO TO 110 - 100 CONTINUE - END IF -* - M = LEND -* - 110 CONTINUE - IF( M.GT.LEND ) - $ E( M-1 ) = ZERO - P = D( L ) - IF( M.EQ.L ) - $ GO TO 130 -* -* If remaining matrix is 2-by-2, use DLAE2 or SLAEV2 -* to compute its eigensystem. -* - IF( M.EQ.L-1 ) THEN - IF( ICOMPZ.GT.0 ) THEN - CALL DLAEV2( D( L-1 ), E( L-1 ), D( L ), RT1, RT2, C, S ) - WORK( M ) = C - WORK( N-1+M ) = S - CALL DLASR( 'R', 'V', 'F', N, 2, WORK( M ), - $ WORK( N-1+M ), Z( 1, L-1 ), LDZ ) - ELSE - CALL DLAE2( D( L-1 ), E( L-1 ), D( L ), RT1, RT2 ) - END IF - D( L-1 ) = RT1 - D( L ) = RT2 - E( L-1 ) = ZERO - L = L - 2 - IF( L.GE.LEND ) - $ GO TO 90 - GO TO 140 - END IF -* - IF( JTOT.EQ.NMAXIT ) - $ GO TO 140 - JTOT = JTOT + 1 -* -* Form shift. -* - G = ( D( L-1 )-P ) / ( TWO*E( L-1 ) ) - R = DLAPY2( G, ONE ) - G = D( M ) - P + ( E( L-1 ) / ( G+SIGN( R, G ) ) ) -* - S = ONE - C = ONE - P = ZERO -* -* Inner loop -* - LM1 = L - 1 - DO 120 I = M, LM1 - F = S*E( I ) - B = C*E( I ) - CALL DLARTG( G, F, C, S, R ) - IF( I.NE.M ) - $ E( I-1 ) = R - G = D( I ) - P - R = ( D( I+1 )-G )*S + TWO*C*B - P = S*R - D( I ) = G + P - G = C*R - B -* -* If eigenvectors are desired, then save rotations. -* - IF( ICOMPZ.GT.0 ) THEN - WORK( I ) = C - WORK( N-1+I ) = S - END IF -* - 120 CONTINUE -* -* If eigenvectors are desired, then apply saved rotations. -* - IF( ICOMPZ.GT.0 ) THEN - MM = L - M + 1 - CALL DLASR( 'R', 'V', 'F', N, MM, WORK( M ), WORK( N-1+M ), - $ Z( 1, M ), LDZ ) - END IF -* - D( L ) = D( L ) - P - E( LM1 ) = G - GO TO 90 -* -* Eigenvalue found. -* - 130 CONTINUE - D( L ) = P -* - L = L - 1 - IF( L.GE.LEND ) - $ GO TO 90 - GO TO 140 -* - END IF -* -* Undo scaling if necessary -* - 140 CONTINUE - IF( ISCALE.EQ.1 ) THEN - CALL DLASCL( 'G', 0, 0, SSFMAX, ANORM, LENDSV-LSV+1, 1, - $ D( LSV ), N, INFO ) - CALL DLASCL( 'G', 0, 0, SSFMAX, ANORM, LENDSV-LSV, 1, E( LSV ), - $ N, INFO ) - ELSE IF( ISCALE.EQ.2 ) THEN - CALL DLASCL( 'G', 0, 0, SSFMIN, ANORM, LENDSV-LSV+1, 1, - $ D( LSV ), N, INFO ) - CALL DLASCL( 'G', 0, 0, SSFMIN, ANORM, LENDSV-LSV, 1, E( LSV ), - $ N, INFO ) - END IF -* -* Check for no convergence to an eigenvalue after a total -* of N*MAXIT iterations. -* - IF( JTOT.LT.NMAXIT ) - $ GO TO 10 - DO 150 I = 1, N - 1 - IF( E( I ).NE.ZERO ) - $ INFO = INFO + 1 - 150 CONTINUE - GO TO 190 -* -* Order eigenvalues and eigenvectors. -* - 160 CONTINUE - IF( ICOMPZ.EQ.0 ) THEN -* -* Use Quick Sort -* - CALL DLASRT( 'I', N, D, INFO ) -* - ELSE -* -* Use Selection Sort to minimize swaps of eigenvectors -* - DO 180 II = 2, N - I = II - 1 - K = I - P = D( I ) - DO 170 J = II, N - IF( D( J ).LT.P ) THEN - K = J - P = D( J ) - END IF - 170 CONTINUE - IF( K.NE.I ) THEN - D( K ) = D( I ) - D( I ) = P - CALL DSWAP( N, Z( 1, I ), 1, Z( 1, K ), 1 ) - END IF - 180 CONTINUE - END IF -* - 190 CONTINUE - - RETURN -* -* End of DSTEQR -* - END SUBROUTINE DSTEQR - diff --git a/BLAS/DSTERF.f b/BLAS/DSTERF.f deleted file mode 100644 index 35e1caba..00000000 --- a/BLAS/DSTERF.f +++ /dev/null @@ -1,376 +0,0 @@ -! 001 LAPACK_MISCEL ################################################################################################################ - - SUBROUTINE DSTERF( N, D, E, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - USE LAPACK_BLAS_AUX - USE LAPACK_LIN_EQN_DPB - -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* June 30, 1999 -* -* .. Scalar Arguments .. - INTEGER INFO, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) D( * ), E( * ) -* .. -* -* Purpose -* ======= -* -* DSTERF computes all eigenvalues of a symmetric tridiagonal matrix -* using the Pal-Walker-Kahan variant of the QL or QR algorithm. -* -* Arguments -* ========= -* -* N (input) INTEGER -* The order of the matrix. N >= 0. -* -* D (input/output) REAL(DOUBLE) array, dimension (N) -* On entry, the n diagonal elements of the tridiagonal matrix. -* On exit, if INFO = 0, the eigenvalues in ascending order. -* -* E (input/output) REAL(DOUBLE) array, dimension (N-1) -* On entry, the (n-1) subdiagonal elements of the tridiagonal -* matrix. -* On exit, E has been destroyed. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: the algorithm failed to find all of the eigenvalues in -* a total of 30*N iterations; if INFO = i, then i -* elements of E have not converged to zero. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE, TWO, THREE - PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0, - $ THREE = 3.0D0 ) - INTEGER MAXIT - PARAMETER ( MAXIT = 30 ) -* .. -* .. Local Scalars .. - INTEGER I, ISCALE, JTOT, L, L1, LEND, LENDSV, LSV, M, - $ NMAXIT - REAL(DOUBLE) ALPHA, ANORM, BB, C, EPS, EPS2, GAMMA, OLDC, - $ OLDGAM, P, R, RT1, RT2, RTE, S, SAFMAX, SAFMIN, - $ SIGMA, SSFMAX, SSFMIN -* .. -* .. External Functions .. -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, SIGN, SQRT - - REAL(DOUBLE) DLAMCH, DLANST - EXTERNAL DLAMCH, DLANST -* .. -* .. Executable Statements .. - -* -* Test the input parameters. -* - INFO = 0 -* -* Quick return if possible -* - IF( N.LT.0 ) THEN - INFO = -1 - CALL XERBLA( 'DSTERF', -INFO ) - RETURN - END IF - IF( N.LE.1 ) - $ RETURN -* -* Determine the unit roundoff for this environment. -* - EPS = DLAMCH( 'E' ) - EPS2 = EPS**2 - SAFMIN = DLAMCH( 'S' ) - SAFMAX = ONE / SAFMIN - SSFMAX = SQRT( SAFMAX ) / THREE - SSFMIN = SQRT( SAFMIN ) / EPS2 -* -* Compute the eigenvalues of the tridiagonal matrix. -* - NMAXIT = N*MAXIT - SIGMA = ZERO - JTOT = 0 -* -* Determine where the matrix splits and choose QL or QR iteration -* for each block, according to whether top or bottom diagonal -* element is smaller. -* - L1 = 1 -* - 10 CONTINUE - IF( L1.GT.N ) - $ GO TO 170 - IF( L1.GT.1 ) - $ E( L1-1 ) = ZERO - DO 20 M = L1, N - 1 - IF( ABS( E( M ) ).LE.( SQRT( ABS( D( M ) ) )*SQRT( ABS( D( M+ - $ 1 ) ) ) )*EPS ) THEN - E( M ) = ZERO - GO TO 30 - END IF - 20 CONTINUE - M = N -* - 30 CONTINUE - L = L1 - LSV = L - LEND = M - LENDSV = LEND - L1 = M + 1 - IF( LEND.EQ.L ) - $ GO TO 10 -* -* Scale submatrix in rows and columns L to LEND -* - ANORM = DLANST( 'I', LEND-L+1, D( L ), E( L ) ) - ISCALE = 0 - IF( ANORM.GT.SSFMAX ) THEN - ISCALE = 1 - CALL DLASCL( 'G', 0, 0, ANORM, SSFMAX, LEND-L+1, 1, D( L ), N, - $ INFO ) - CALL DLASCL( 'G', 0, 0, ANORM, SSFMAX, LEND-L, 1, E( L ), N, - $ INFO ) - ELSE IF( ANORM.LT.SSFMIN ) THEN - ISCALE = 2 - CALL DLASCL( 'G', 0, 0, ANORM, SSFMIN, LEND-L+1, 1, D( L ), N, - $ INFO ) - CALL DLASCL( 'G', 0, 0, ANORM, SSFMIN, LEND-L, 1, E( L ), N, - $ INFO ) - END IF -* - DO 40 I = L, LEND - 1 - E( I ) = E( I )**2 - 40 CONTINUE -* -* Choose between QL and QR iteration -* - IF( ABS( D( LEND ) ).LT.ABS( D( L ) ) ) THEN - LEND = LSV - L = LENDSV - END IF -* - IF( LEND.GE.L ) THEN -* -* QL Iteration -* -* Look for small subdiagonal element. -* - 50 CONTINUE - IF( L.NE.LEND ) THEN - DO 60 M = L, LEND - 1 - IF( ABS( E( M ) ).LE.EPS2*ABS( D( M )*D( M+1 ) ) ) - $ GO TO 70 - 60 CONTINUE - END IF - M = LEND -* - 70 CONTINUE - IF( M.LT.LEND ) - $ E( M ) = ZERO - P = D( L ) - IF( M.EQ.L ) - $ GO TO 90 -* -* If remaining matrix is 2 by 2, use DLAE2 to compute its -* eigenvalues. -* - IF( M.EQ.L+1 ) THEN - RTE = SQRT( E( L ) ) - CALL DLAE2( D( L ), RTE, D( L+1 ), RT1, RT2 ) - D( L ) = RT1 - D( L+1 ) = RT2 - E( L ) = ZERO - L = L + 2 - IF( L.LE.LEND ) - $ GO TO 50 - GO TO 150 - END IF -* - IF( JTOT.EQ.NMAXIT ) - $ GO TO 150 - JTOT = JTOT + 1 -* -* Form shift. -* - RTE = SQRT( E( L ) ) - SIGMA = ( D( L+1 )-P ) / ( TWO*RTE ) - R = DLAPY2( SIGMA, ONE ) - SIGMA = P - ( RTE / ( SIGMA+SIGN( R, SIGMA ) ) ) -* - C = ONE - S = ZERO - GAMMA = D( M ) - SIGMA - P = GAMMA*GAMMA -* -* Inner loop -* - DO 80 I = M - 1, L, -1 - BB = E( I ) - R = P + BB - IF( I.NE.M-1 ) - $ E( I+1 ) = S*R - OLDC = C - C = P / R - S = BB / R - OLDGAM = GAMMA - ALPHA = D( I ) - GAMMA = C*( ALPHA-SIGMA ) - S*OLDGAM - D( I+1 ) = OLDGAM + ( ALPHA-GAMMA ) - IF( C.NE.ZERO ) THEN - P = ( GAMMA*GAMMA ) / C - ELSE - P = OLDC*BB - END IF - 80 CONTINUE -* - E( L ) = S*P - D( L ) = SIGMA + GAMMA - GO TO 50 -* -* Eigenvalue found. -* - 90 CONTINUE - D( L ) = P -* - L = L + 1 - IF( L.LE.LEND ) - $ GO TO 50 - GO TO 150 -* - ELSE -* -* QR Iteration -* -* Look for small superdiagonal element. -* - 100 CONTINUE - DO 110 M = L, LEND + 1, -1 - IF( ABS( E( M-1 ) ).LE.EPS2*ABS( D( M )*D( M-1 ) ) ) - $ GO TO 120 - 110 CONTINUE - M = LEND -* - 120 CONTINUE - IF( M.GT.LEND ) - $ E( M-1 ) = ZERO - P = D( L ) - IF( M.EQ.L ) - $ GO TO 140 -* -* If remaining matrix is 2 by 2, use DLAE2 to compute its -* eigenvalues. -* - IF( M.EQ.L-1 ) THEN - RTE = SQRT( E( L-1 ) ) - CALL DLAE2( D( L ), RTE, D( L-1 ), RT1, RT2 ) - D( L ) = RT1 - D( L-1 ) = RT2 - E( L-1 ) = ZERO - L = L - 2 - IF( L.GE.LEND ) - $ GO TO 100 - GO TO 150 - END IF -* - IF( JTOT.EQ.NMAXIT ) - $ GO TO 150 - JTOT = JTOT + 1 -* -* Form shift. -* - RTE = SQRT( E( L-1 ) ) - SIGMA = ( D( L-1 )-P ) / ( TWO*RTE ) - R = DLAPY2( SIGMA, ONE ) - SIGMA = P - ( RTE / ( SIGMA+SIGN( R, SIGMA ) ) ) -* - C = ONE - S = ZERO - GAMMA = D( M ) - SIGMA - P = GAMMA*GAMMA -* -* Inner loop -* - DO 130 I = M, L - 1 - BB = E( I ) - R = P + BB - IF( I.NE.M ) - $ E( I-1 ) = S*R - OLDC = C - C = P / R - S = BB / R - OLDGAM = GAMMA - ALPHA = D( I+1 ) - GAMMA = C*( ALPHA-SIGMA ) - S*OLDGAM - D( I ) = OLDGAM + ( ALPHA-GAMMA ) - IF( C.NE.ZERO ) THEN - P = ( GAMMA*GAMMA ) / C - ELSE - P = OLDC*BB - END IF - 130 CONTINUE -* - E( L-1 ) = S*P - D( L ) = SIGMA + GAMMA - GO TO 100 -* -* Eigenvalue found. -* - 140 CONTINUE - D( L ) = P -* - L = L - 1 - IF( L.GE.LEND ) - $ GO TO 100 - GO TO 150 -* - END IF -* -* Undo scaling if necessary -* - 150 CONTINUE - IF( ISCALE.EQ.1 ) - $ CALL DLASCL( 'G', 0, 0, SSFMAX, ANORM, LENDSV-LSV+1, 1, - $ D( LSV ), N, INFO ) - IF( ISCALE.EQ.2 ) - $ CALL DLASCL( 'G', 0, 0, SSFMIN, ANORM, LENDSV-LSV+1, 1, - $ D( LSV ), N, INFO ) -* -* Check for no convergence to an eigenvalue after a total -* of N*MAXIT iterations. -* - IF( JTOT.LT.NMAXIT ) - $ GO TO 10 - DO 160 I = 1, N - 1 - IF( E( I ).NE.ZERO ) - $ INFO = INFO + 1 - 160 CONTINUE - GO TO 180 -* -* Sort eigenvalues in increasing order. -* - 170 CONTINUE - CALL DLASRT( 'I', N, D, INFO ) -* - 180 CONTINUE - - RETURN -* -* End of DSTERF -* - END SUBROUTINE DSTERF - diff --git a/BLAS/DSWAP.f b/BLAS/DSWAP.f deleted file mode 100644 index 05803382..00000000 --- a/BLAS/DSWAP.f +++ /dev/null @@ -1,69 +0,0 @@ -! 013 LAPACK_BLAS_AUX ############################################################################################################## - - SUBROUTINE DSWAP (N,DX,INCX,DY,INCY) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - - USE OUTA_HERE_Interface - -c -c interchanges two vectors. -c uses unrolled loops for increments equal one. -c jack dongarra, linpack, 3/11/78. -c modified 12/3/93, array(1) declarations changed to array(*) -c - REAL(DOUBLE) dx(*),dy(*),dtemp - integer i,incx,incy,ix,iy,m,mp1,n -c - if(n.le.0)return - if(incx.eq.1.and.incy.eq.1)go to 20 -c -c code for unequal increments or equal increments not equal -c to 1 -c - ix = 1 - iy = 1 - if(incx.lt.0)ix = (-n+1)*incx + 1 - if(incy.lt.0)iy = (-n+1)*incy + 1 - do 10 i = 1,n - dtemp = dx(ix) - dx(ix) = dy(iy) - dy(iy) = dtemp - ix = ix + incx - iy = iy + incy - 10 continue - return -c -c code for both increments equal to 1 -c -c -c clean-up loop -c - 20 m = mod(n,3) - if( m .eq. 0 ) go to 40 - do 30 i = 1,m - dtemp = dx(i) - dx(i) = dy(i) - dy(i) = dtemp - 30 continue - if( n .lt. 3 ) return - 40 mp1 = m + 1 - do 50 i = mp1,n,3 - dtemp = dx(i) - dx(i) = dy(i) - dy(i) = dtemp - dtemp = dx(i + 1) - dx(i + 1) = dy(i + 1) - dy(i + 1) = dtemp - dtemp = dx(i + 2) - dx(i + 2) = dy(i + 2) - dy(i + 2) = dtemp - 50 continue - return - - END SUBROUTINE DSWAP - diff --git a/BLAS/DTRSM.f b/BLAS/DTRSM.f deleted file mode 100644 index 64d35ae9..00000000 --- a/BLAS/DTRSM.f +++ /dev/null @@ -1,389 +0,0 @@ -! 022 LAPACK_BLAS_AUX ############################################################################################################## - - SUBROUTINE DTRSM ( SIDE, UPLO, TRANSA, DIAG, M, N, ALPHA, A, LDA, - $ B, LDB ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - - USE OUTA_HERE_Interface - -** .. Scalar Arguments .. - CHARACTER*1 SIDE, UPLO, TRANSA, DIAG - INTEGER M, N, LDA, LDB - REAL(DOUBLE) ALPHA -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), B( LDB, * ) -* .. -* -* Purpose -* ======= -* -* DTRSM solves one of the matrix equations -* -* op( A )*X = alpha*B, or X*op( A ) = alpha*B, -* -* where alpha is a scalar, X and B are m by n matrices, A is a unit, or -* non-unit, upper or lower triangular matrix and op( A ) is one of -* -* op( A ) = A or op( A ) = A'. -* -* The matrix X is overwritten on B. -* -* Parameters -* ========== -* -* SIDE - CHARACTER*1. -* On entry, SIDE specifies whether op( A ) appears on the left -* or right of X as follows: -* -* SIDE = 'L' or 'l' op( A )*X = alpha*B. -* -* SIDE = 'R' or 'r' X*op( A ) = alpha*B. -* -* Unchanged on exit. -* -* UPLO - CHARACTER*1. -* On entry, UPLO specifies whether the matrix A is an upper or -* lower triangular matrix as follows: -* -* UPLO = 'U' or 'u' A is an upper triangular matrix. -* -* UPLO = 'L' or 'l' A is a lower triangular matrix. -* -* Unchanged on exit. -* -* TRANSA - CHARACTER*1. -* On entry, TRANSA specifies the form of op( A ) to be used in -* the matrix multiplication as follows: -* -* TRANSA = 'N' or 'n' op( A ) = A. -* -* TRANSA = 'T' or 't' op( A ) = A'. -* -* TRANSA = 'C' or 'c' op( A ) = A'. -* -* Unchanged on exit. -* -* DIAG - CHARACTER*1. -* On entry, DIAG specifies whether or not A is unit triangular -* as follows: -* -* DIAG = 'U' or 'u' A is assumed to be unit triangular. -* -* DIAG = 'N' or 'n' A is not assumed to be unit -* triangular. -* -* Unchanged on exit. -* -* M - INTEGER. -* On entry, M specifies the number of rows of B. M must be at -* least zero. -* Unchanged on exit. -* -* N - INTEGER. -* On entry, N specifies the number of columns of B. N must be -* at least zero. -* Unchanged on exit. -* -* ALPHA - REAL(DOUBLE). -* On entry, ALPHA specifies the scalar alpha. When alpha is -* zero then A is not referenced and B need not be set before -* entry. -* Unchanged on exit. -* -* A - REAL(DOUBLE) array of DIMENSION ( LDA, k ), where k is m -* when SIDE = 'L' or 'l' and is n when SIDE = 'R' or 'r'. -* Before entry with UPLO = 'U' or 'u', the leading k by k -* upper triangular part of the array A must contain the upper -* triangular matrix and the strictly lower triangular part of -* A is not referenced. -* Before entry with UPLO = 'L' or 'l', the leading k by k -* lower triangular part of the array A must contain the lower -* triangular matrix and the strictly upper triangular part of -* A is not referenced. -* Note that when DIAG = 'U' or 'u', the diagonal elements of -* A are not referenced either, but are assumed to be unity. -* Unchanged on exit. -* -* LDA - INTEGER. -* On entry, LDA specifies the first dimension of A as declared -* in the calling (sub) program. When SIDE = 'L' or 'l' then -* LDA must be at least max( 1, m ), when SIDE = 'R' or 'r' -* then LDA must be at least max( 1, n ). -* Unchanged on exit. -* -* B - REAL(DOUBLE) array of DIMENSION ( LDB, n ). -* Before entry, the leading m by n part of the array B must -* contain the right-hand side matrix B, and on exit is -* overwritten by the solution matrix X. -* -* LDB - INTEGER. -* On entry, LDB specifies the first dimension of B as declared -* in the calling (sub) program. LDB must be at least -* max( 1, m ). -* Unchanged on exit. -* -* -* Level 3 Blas routine. -* -* -* -- Written on 8-February-1989. -* Jack Dongarra, Argonne National Laboratory. -* Iain Duff, AERE Harwell. -* Jeremy Du Croz, Numerical Algorithms Group Ltd. -* Sven Hammarling, Numerical Algorithms Group Ltd. -* -* -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. External Subroutines .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. Local Scalars .. - LOGICAL LSIDE, NOUNIT, UPPER - INTEGER I, INFO, J, K, NROWA - REAL(DOUBLE) TEMP -* .. Parameters .. - REAL(DOUBLE) ONE , ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - LSIDE = LSAME( SIDE , 'L' ) - IF( LSIDE )THEN - NROWA = M - ELSE - NROWA = N - END IF - NOUNIT = LSAME( DIAG , 'N' ) - UPPER = LSAME( UPLO , 'U' ) -* - INFO = 0 - IF( ( .NOT.LSIDE ).AND. - $ ( .NOT.LSAME( SIDE , 'R' ) ) )THEN - INFO = 1 - ELSE IF( ( .NOT.UPPER ).AND. - $ ( .NOT.LSAME( UPLO , 'L' ) ) )THEN - INFO = 2 - ELSE IF( ( .NOT.LSAME( TRANSA, 'N' ) ).AND. - $ ( .NOT.LSAME( TRANSA, 'T' ) ).AND. - $ ( .NOT.LSAME( TRANSA, 'C' ) ) )THEN - INFO = 3 - ELSE IF( ( .NOT.LSAME( DIAG , 'U' ) ).AND. - $ ( .NOT.LSAME( DIAG , 'N' ) ) )THEN - INFO = 4 - ELSE IF( M .LT.0 )THEN - INFO = 5 - ELSE IF( N .LT.0 )THEN - INFO = 6 - ELSE IF( LDA.LT.MAX( 1, NROWA ) )THEN - INFO = 9 - ELSE IF( LDB.LT.MAX( 1, M ) )THEN - INFO = 11 - END IF - IF( INFO.NE.0 )THEN - CALL XERBLA( 'DTRSM ', INFO ) - RETURN - END IF -* -* Quick return if possible. -* - IF( N.EQ.0 ) - $ RETURN -* -* And when alpha.eq.zero. -* - IF( ALPHA.EQ.ZERO )THEN - DO 20, J = 1, N - DO 10, I = 1, M - B( I, J ) = ZERO - 10 CONTINUE - 20 CONTINUE - RETURN - END IF -* -* Start the operations. -* - IF( LSIDE )THEN - IF( LSAME( TRANSA, 'N' ) )THEN -* -* Form B := alpha*inv( A )*B. -* - IF( UPPER )THEN - DO 60, J = 1, N - IF( ALPHA.NE.ONE )THEN - DO 30, I = 1, M - B( I, J ) = ALPHA*B( I, J ) - 30 CONTINUE - END IF - DO 50, K = M, 1, -1 - IF( B( K, J ).NE.ZERO )THEN - IF( NOUNIT ) - $ B( K, J ) = B( K, J )/A( K, K ) - DO 40, I = 1, K - 1 - B( I, J ) = B( I, J ) - B( K, J )*A( I, K ) - 40 CONTINUE - END IF - 50 CONTINUE - 60 CONTINUE - ELSE - DO 100, J = 1, N - IF( ALPHA.NE.ONE )THEN - DO 70, I = 1, M - B( I, J ) = ALPHA*B( I, J ) - 70 CONTINUE - END IF - DO 90 K = 1, M - IF( B( K, J ).NE.ZERO )THEN - IF( NOUNIT ) - $ B( K, J ) = B( K, J )/A( K, K ) - DO 80, I = K + 1, M - B( I, J ) = B( I, J ) - B( K, J )*A( I, K ) - 80 CONTINUE - END IF - 90 CONTINUE - 100 CONTINUE - END IF - ELSE -* -* Form B := alpha*inv( A' )*B. -* - IF( UPPER )THEN - DO 130, J = 1, N - DO 120, I = 1, M - TEMP = ALPHA*B( I, J ) - DO 110, K = 1, I - 1 - TEMP = TEMP - A( K, I )*B( K, J ) - 110 CONTINUE - IF( NOUNIT ) - $ TEMP = TEMP/A( I, I ) - B( I, J ) = TEMP - 120 CONTINUE - 130 CONTINUE - ELSE - DO 160, J = 1, N - DO 150, I = M, 1, -1 - TEMP = ALPHA*B( I, J ) - DO 140, K = I + 1, M - TEMP = TEMP - A( K, I )*B( K, J ) - 140 CONTINUE - IF( NOUNIT ) - $ TEMP = TEMP/A( I, I ) - B( I, J ) = TEMP - 150 CONTINUE - 160 CONTINUE - END IF - END IF - ELSE - IF( LSAME( TRANSA, 'N' ) )THEN -* -* Form B := alpha*B*inv( A ). -* - IF( UPPER )THEN - DO 210, J = 1, N - IF( ALPHA.NE.ONE )THEN - DO 170, I = 1, M - B( I, J ) = ALPHA*B( I, J ) - 170 CONTINUE - END IF - DO 190, K = 1, J - 1 - IF( A( K, J ).NE.ZERO )THEN - DO 180, I = 1, M - B( I, J ) = B( I, J ) - A( K, J )*B( I, K ) - 180 CONTINUE - END IF - 190 CONTINUE - IF( NOUNIT )THEN - TEMP = ONE/A( J, J ) - DO 200, I = 1, M - B( I, J ) = TEMP*B( I, J ) - 200 CONTINUE - END IF - 210 CONTINUE - ELSE - DO 260, J = N, 1, -1 - IF( ALPHA.NE.ONE )THEN - DO 220, I = 1, M - B( I, J ) = ALPHA*B( I, J ) - 220 CONTINUE - END IF - DO 240, K = J + 1, N - IF( A( K, J ).NE.ZERO )THEN - DO 230, I = 1, M - B( I, J ) = B( I, J ) - A( K, J )*B( I, K ) - 230 CONTINUE - END IF - 240 CONTINUE - IF( NOUNIT )THEN - TEMP = ONE/A( J, J ) - DO 250, I = 1, M - B( I, J ) = TEMP*B( I, J ) - 250 CONTINUE - END IF - 260 CONTINUE - END IF - ELSE -* -* Form B := alpha*B*inv( A' ). -* - IF( UPPER )THEN - DO 310, K = N, 1, -1 - IF( NOUNIT )THEN - TEMP = ONE/A( K, K ) - DO 270, I = 1, M - B( I, K ) = TEMP*B( I, K ) - 270 CONTINUE - END IF - DO 290, J = 1, K - 1 - IF( A( J, K ).NE.ZERO )THEN - TEMP = A( J, K ) - DO 280, I = 1, M - B( I, J ) = B( I, J ) - TEMP*B( I, K ) - 280 CONTINUE - END IF - 290 CONTINUE - IF( ALPHA.NE.ONE )THEN - DO 300, I = 1, M - B( I, K ) = ALPHA*B( I, K ) - 300 CONTINUE - END IF - 310 CONTINUE - ELSE - DO 360, K = 1, N - IF( NOUNIT )THEN - TEMP = ONE/A( K, K ) - DO 320, I = 1, M - B( I, K ) = TEMP*B( I, K ) - 320 CONTINUE - END IF - DO 340, J = K + 1, N - IF( A( J, K ).NE.ZERO )THEN - TEMP = A( J, K ) - DO 330, I = 1, M - B( I, J ) = B( I, J ) - TEMP*B( I, K ) - 330 CONTINUE - END IF - 340 CONTINUE - IF( ALPHA.NE.ONE )THEN - DO 350, I = 1, M - B( I, K ) = ALPHA*B( I, K ) - 350 CONTINUE - END IF - 360 CONTINUE - END IF - END IF - END IF -* - RETURN -* -* End of DTRSM . -* - END SUBROUTINE DTRSM - diff --git a/BLAS/DTRTRI.f b/BLAS/DTRTRI.f deleted file mode 100644 index 94834410..00000000 --- a/BLAS/DTRTRI.f +++ /dev/null @@ -1,190 +0,0 @@ -! 004 LAPACK_SYM_MAT_INV ########################################################################################################### - - SUBROUTINE DTRTRI( UPLO, DIAG, N, A, LDA, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE SCONTR, ONLY : BLNK_SUB_NAM - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & TSEC - USE LAPACK_BLAS_AUX - USE LAPACK_SYM_MAT_INV - - USE OURTIM_Interface - - -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. - CHARACTER DIAG, UPLO - INTEGER INFO, LDA, N -* .. -* .. Array Arguments .. - DOUBLE PRECISION A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DTRTRI computes the inverse of a real upper or lower triangular -* matrix A. -* -* This is the Level 3 BLAS version of the algorithm. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': A is upper triangular; -* = 'L': A is lower triangular. -* -* DIAG (input) CHARACTER*1 -* = 'N': A is non-unit triangular; -* = 'U': A is unit triangular. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) -* On entry, the triangular matrix A. If UPLO = 'U', the -* leading N-by-N upper triangular part of the array A contains -* the upper triangular matrix, and the strictly lower -* triangular part of A is not referenced. If UPLO = 'L', the -* leading N-by-N lower triangular part of the array A contains -* the lower triangular matrix, and the strictly upper -* triangular part of A is not referenced. If DIAG = 'U', the -* diagonal elements of A are also not referenced and are -* assumed to be 1. -* On exit, the (triangular) inverse of the original matrix, in -* the same storage format. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, A(i,i) is exactly zero. The triangular -* matrix is singular and its inverse can not be computed. -* -* ===================================================================== -* -* .. Parameters .. - DOUBLE PRECISION ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL NOUNIT, UPPER - INTEGER J, JB, NB, NN -* .. -* .. External Functions .. - LOGICAL LSAME - INTEGER ILAENV - EXTERNAL LSAME, ILAENV -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - NOUNIT = LSAME( DIAG, 'N' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN - INFO = -2 - ELSE IF( N.LT.0 ) THEN - INFO = -3 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -5 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DTRTRI', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* -* Check for singularity if non-unit. -* - IF( NOUNIT ) THEN - DO 10 INFO = 1, N - IF( A( INFO, INFO ).EQ.ZERO ) - $ RETURN - 10 CONTINUE - INFO = 0 - END IF -* -* Determine the block size for this environment. -* - NB = ILAENV( 1, 'DTRTRI', UPLO // DIAG, N, -1, -1, -1 ) - IF( NB.LE.1 .OR. NB.GE.N ) THEN -* -* Use unblocked code -* - CALL DTRTI2( UPLO, DIAG, N, A, LDA, INFO ) - ELSE -* -* Use blocked code -* - IF( UPPER ) THEN -* -* Compute inverse of upper triangular matrix -* - DO 20 J = 1, N, NB - JB = MIN( NB, N-J+1 ) -* -* Compute rows 1:j-1 of current block column -* - CALL DTRMM( 'Left', 'Upper', 'No transpose', DIAG, J-1, - $ JB, ONE, A, LDA, A( 1, J ), LDA ) - CALL DTRSM( 'Right', 'Upper', 'No transpose', DIAG, J-1, - $ JB, -ONE, A( J, J ), LDA, A( 1, J ), LDA ) -* -* Compute inverse of current diagonal block -* - CALL DTRTI2( 'Upper', DIAG, JB, A( J, J ), LDA, INFO ) - 20 CONTINUE - ELSE -* -* Compute inverse of lower triangular matrix -* - NN = ( ( N-1 ) / NB )*NB + 1 - DO 30 J = NN, 1, -NB - JB = MIN( NB, N-J+1 ) - IF( J+JB.LE.N ) THEN -* -* Compute rows j+jb:n of current block column -* - CALL DTRMM( 'Left', 'Lower', 'No transpose', DIAG, - $ N-J-JB+1, JB, ONE, A( J+JB, J+JB ), LDA, - $ A( J+JB, J ), LDA ) - CALL DTRSM( 'Right', 'Lower', 'No transpose', DIAG, - $ N-J-JB+1, JB, -ONE, A( J, J ), LDA, - $ A( J+JB, J ), LDA ) - END IF -* -* Compute inverse of current diagonal block -* - CALL DTRTI2( 'Lower', DIAG, JB, A( J, J ), LDA, INFO ) - 30 CONTINUE - END IF - END IF -* - RETURN -* -* End of DTRTRI -* - END SUBROUTINE DTRTRI - diff --git a/BLAS/ILAENV.f b/BLAS/ILAENV.f deleted file mode 100644 index 5bf60286..00000000 --- a/BLAS/ILAENV.f +++ /dev/null @@ -1,549 +0,0 @@ -! 066 LAPACK_BLAS_AUX ############################################################################################################## - - INTEGER FUNCTION ILAENV( ISPEC, NAME, OPTS, N1, N2, N3, - $ N4 ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, TSEC - - USE OUTA_HERE_Interface - -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* June 30, 1999 -* -* .. Scalar Arguments .. - CHARACTER*( * ) NAME, OPTS - INTEGER ISPEC, N1, N2, N3, N4 -* .. -* -* Purpose -* ======= -* -* ILAENV is called from the LAPACK routines to choose problem-dependent -* parameters for the local environment. See ISPEC for a description of -* the parameters. -* -* This version provides a set of parameters which should give good, -* but not optimal, performance on many of the currently available -* computers. Users are encouraged to modify this subroutine to set -* the tuning parameters for their particular machine using the option -* and problem size information in the arguments. -* -* This routine will not function correctly if it is converted to all -* lower case. Converting it to all upper case is allowed. -* -* Arguments -* ========= -* -* ISPEC (input) INTEGER -* Specifies the parameter to be returned as the value of -* ILAENV. -* = 1: the optimal blocksize; if this value is 1, an unblocked -* algorithm will give the best performance. -* = 2: the minimum block size for which the block routine -* should be used; if the usable block size is less than -* this value, an unblocked routine should be used. -* = 3: the crossover point (in a block routine, for N less -* than this value, an unblocked routine should be used) -* = 4: the number of shifts, used in the nonsymmetric -* eigenvalue routines -* = 5: the minimum column dimension for blocking to be used; -* rectangular blocks must have dimension at least k by m, -* where k is given by ILAENV(2,...) and m by ILAENV(5,...) -* = 6: the crossover point for the SVD (when reducing an m by n -* matrix to bidiagonal form, if max(m,n)/min(m,n) exceeds -* this value, a QR factorization is used first to reduce -* the matrix to a triangular form.) -* = 7: the number of processors -* = 8: the crossover point for the multishift QR and QZ methods -* for nonsymmetric eigenvalue problems. -* = 9: maximum size of the subproblems at the bottom of the -* computation tree in the divide-and-conquer algorithm -* (used by xGELSD and xGESDD) -* =10: ieee NaN arithmetic can be trusted not to trap -* =11: infinity arithmetic can be trusted not to trap -* -* NAME (input) CHARACTER*(*) -* The name of the calling subroutine, in either upper case or -* lower case. -* -* OPTS (input) CHARACTER*(*) -* The character options to the subroutine NAME, concatenated -* into a single character string. For example, UPLO = 'U', -* TRANS = 'T', and DIAG = 'N' for a triangular routine would -* be specified as OPTS = 'UTN'. -* -* N1 (input) INTEGER -* N2 (input) INTEGER -* N3 (input) INTEGER -* N4 (input) INTEGER -* Problem dimensions for the subroutine NAME; these may not all -* be required. -* -* (ILAENV) (output) INTEGER -* >= 0: the value of the parameter specified by ISPEC -* < 0: if ILAENV = -k, the k-th argument had an illegal value. -* -* Further Details -* =============== -* -* The following conventions have been used when calling ILAENV from the -* LAPACK routines: -* 1) OPTS is a concatenation of all of the character options to -* subroutine NAME, in the same order that they appear in the -* argument list for NAME, even if they are not used in determining -* the value of the parameter specified by ISPEC. -* 2) The problem dimensions N1, N2, N3, N4 are specified in the order -* that they appear in the argument list for NAME. N1 is used -* first, N2 second, and so on, and unused problem dimensions are -* passed a value of -1. -* 3) The parameter value returned by ILAENV is checked for validity in -* the calling subroutine. For example, ILAENV is used to retrieve -* the optimal blocksize for STRTRI as follows: -* -* NB = ILAENV( 1, 'STRTRI', UPLO // DIAG, N, -1, -1, -1 ) -* IF( NB.LE.1 ) NB = MAX( 1, N ) -* -* ===================================================================== -* -* .. Local Scalars .. - LOGICAL CNAME, SNAME - CHARACTER*1 C1 - CHARACTER*2 C2, C4 - CHARACTER*3 C3 - CHARACTER*6 SUBNAM - INTEGER I, IC, IZ, NB, NBMIN, NX -* .. -* .. Intrinsic Functions .. - INTRINSIC CHAR, ICHAR, INT, MIN, REAL -* .. -* .. External Functions .. -* .. -* .. Executable Statements .. -* - GO TO ( 100, 100, 100, 400, 500, 600, 700, 800, 900, 1000, - $ 1100 ) ISPEC -* -* Invalid value for ISPEC -* - ILAENV = -1 - RETURN -* - 100 CONTINUE -* -* Convert NAME to upper case if the first character is lower case. -* - ILAENV = 1 - SUBNAM = NAME - IC = ICHAR( SUBNAM( 1:1 ) ) - IZ = ICHAR( 'Z' ) - IF( IZ.EQ.90 .OR. IZ.EQ.122 ) THEN -* -* ASCII character set -* - IF( IC.GE.97 .AND. IC.LE.122 ) THEN - SUBNAM( 1:1 ) = CHAR( IC-32 ) - DO 10 I = 2, 6 - IC = ICHAR( SUBNAM( I:I ) ) - IF( IC.GE.97 .AND. IC.LE.122 ) - $ SUBNAM( I:I ) = CHAR( IC-32 ) - 10 CONTINUE - END IF -* - ELSE IF( IZ.EQ.233 .OR. IZ.EQ.169 ) THEN -* -* EBCDIC character set -* - IF( ( IC.GE.129 .AND. IC.LE.137 ) .OR. - $ ( IC.GE.145 .AND. IC.LE.153 ) .OR. - $ ( IC.GE.162 .AND. IC.LE.169 ) ) THEN - SUBNAM( 1:1 ) = CHAR( IC+64 ) - DO 20 I = 2, 6 - IC = ICHAR( SUBNAM( I:I ) ) - IF( ( IC.GE.129 .AND. IC.LE.137 ) .OR. - $ ( IC.GE.145 .AND. IC.LE.153 ) .OR. - $ ( IC.GE.162 .AND. IC.LE.169 ) ) - $ SUBNAM( I:I ) = CHAR( IC+64 ) - 20 CONTINUE - END IF -* - ELSE IF( IZ.EQ.218 .OR. IZ.EQ.250 ) THEN -* -* Prime machines: ASCII+128 -* - IF( IC.GE.225 .AND. IC.LE.250 ) THEN - SUBNAM( 1:1 ) = CHAR( IC-32 ) - DO 30 I = 2, 6 - IC = ICHAR( SUBNAM( I:I ) ) - IF( IC.GE.225 .AND. IC.LE.250 ) - $ SUBNAM( I:I ) = CHAR( IC-32 ) - 30 CONTINUE - END IF - END IF -* - C1 = SUBNAM( 1:1 ) - SNAME = C1.EQ.'S' .OR. C1.EQ.'D' - CNAME = C1.EQ.'C' .OR. C1.EQ.'Z' - IF( .NOT.( CNAME .OR. SNAME ) ) - $ RETURN - C2 = SUBNAM( 2:3 ) - C3 = SUBNAM( 4:6 ) - C4 = C3( 2:3 ) -* - GO TO ( 110, 200, 300 ) ISPEC -* - 110 CONTINUE -* -* ISPEC = 1: block size -* -* In these examples, separate code is provided for setting NB for -* real and complex. We assume that NB will take the same value in -* single or REAL(DOUBLE). -* - NB = 1 -* - IF( C2.EQ.'GE' ) THEN - IF( C3.EQ.'TRF' ) THEN - IF( SNAME ) THEN - NB = 64 - ELSE - NB = 64 - END IF - ELSE IF( C3.EQ.'QRF' .OR. C3.EQ.'RQF' .OR. C3.EQ.'LQF' .OR. - $ C3.EQ.'QLF' ) THEN - IF( SNAME ) THEN - NB = 32 - ELSE - NB = 32 - END IF - ELSE IF( C3.EQ.'HRD' ) THEN - IF( SNAME ) THEN - NB = 32 - ELSE - NB = 32 - END IF - ELSE IF( C3.EQ.'BRD' ) THEN - IF( SNAME ) THEN - NB = 32 - ELSE - NB = 32 - END IF - ELSE IF( C3.EQ.'TRI' ) THEN - IF( SNAME ) THEN - NB = 64 - ELSE - NB = 64 - END IF - END IF - ELSE IF( C2.EQ.'PO' ) THEN - IF( C3.EQ.'TRF' ) THEN - IF( SNAME ) THEN - NB = 64 - ELSE - NB = 64 - END IF - END IF - ELSE IF( C2.EQ.'SY' ) THEN - IF( C3.EQ.'TRF' ) THEN - IF( SNAME ) THEN - NB = 64 - ELSE - NB = 64 - END IF - ELSE IF( SNAME .AND. C3.EQ.'TRD' ) THEN - NB = 32 - ELSE IF( SNAME .AND. C3.EQ.'GST' ) THEN - NB = 64 - END IF - ELSE IF( CNAME .AND. C2.EQ.'HE' ) THEN - IF( C3.EQ.'TRF' ) THEN - NB = 64 - ELSE IF( C3.EQ.'TRD' ) THEN - NB = 32 - ELSE IF( C3.EQ.'GST' ) THEN - NB = 64 - END IF - ELSE IF( SNAME .AND. C2.EQ.'OR' ) THEN - IF( C3( 1:1 ).EQ.'G' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NB = 32 - END IF - ELSE IF( C3( 1:1 ).EQ.'M' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NB = 32 - END IF - END IF - ELSE IF( CNAME .AND. C2.EQ.'UN' ) THEN - IF( C3( 1:1 ).EQ.'G' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NB = 32 - END IF - ELSE IF( C3( 1:1 ).EQ.'M' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NB = 32 - END IF - END IF - ELSE IF( C2.EQ.'GB' ) THEN - IF( C3.EQ.'TRF' ) THEN - IF( SNAME ) THEN - IF( N4.LE.64 ) THEN - NB = 1 - ELSE - NB = 32 - END IF - ELSE - IF( N4.LE.64 ) THEN - NB = 1 - ELSE - NB = 32 - END IF - END IF - END IF - ELSE IF( C2.EQ.'PB' ) THEN - IF( C3.EQ.'TRF' ) THEN - IF( SNAME ) THEN - IF( N2.LE.64 ) THEN - NB = 1 - ELSE - NB = 32 - END IF - ELSE - IF( N2.LE.64 ) THEN - NB = 1 - ELSE - NB = 32 - END IF - END IF - END IF - ELSE IF( C2.EQ.'TR' ) THEN - IF( C3.EQ.'TRI' ) THEN - IF( SNAME ) THEN - NB = 64 - ELSE - NB = 64 - END IF - END IF - ELSE IF( C2.EQ.'LA' ) THEN - IF( C3.EQ.'UUM' ) THEN - IF( SNAME ) THEN - NB = 64 - ELSE - NB = 64 - END IF - END IF - ELSE IF( SNAME .AND. C2.EQ.'ST' ) THEN - IF( C3.EQ.'EBZ' ) THEN - NB = 1 - END IF - END IF - ILAENV = NB - RETURN -* - 200 CONTINUE -* -* ISPEC = 2: minimum block size -* - NBMIN = 2 - IF( C2.EQ.'GE' ) THEN - IF( C3.EQ.'QRF' .OR. C3.EQ.'RQF' .OR. C3.EQ.'LQF' .OR. - $ C3.EQ.'QLF' ) THEN - IF( SNAME ) THEN - NBMIN = 2 - ELSE - NBMIN = 2 - END IF - ELSE IF( C3.EQ.'HRD' ) THEN - IF( SNAME ) THEN - NBMIN = 2 - ELSE - NBMIN = 2 - END IF - ELSE IF( C3.EQ.'BRD' ) THEN - IF( SNAME ) THEN - NBMIN = 2 - ELSE - NBMIN = 2 - END IF - ELSE IF( C3.EQ.'TRI' ) THEN - IF( SNAME ) THEN - NBMIN = 2 - ELSE - NBMIN = 2 - END IF - END IF - ELSE IF( C2.EQ.'SY' ) THEN - IF( C3.EQ.'TRF' ) THEN - IF( SNAME ) THEN - NBMIN = 8 - ELSE - NBMIN = 8 - END IF - ELSE IF( SNAME .AND. C3.EQ.'TRD' ) THEN - NBMIN = 2 - END IF - ELSE IF( CNAME .AND. C2.EQ.'HE' ) THEN - IF( C3.EQ.'TRD' ) THEN - NBMIN = 2 - END IF - ELSE IF( SNAME .AND. C2.EQ.'OR' ) THEN - IF( C3( 1:1 ).EQ.'G' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NBMIN = 2 - END IF - ELSE IF( C3( 1:1 ).EQ.'M' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NBMIN = 2 - END IF - END IF - ELSE IF( CNAME .AND. C2.EQ.'UN' ) THEN - IF( C3( 1:1 ).EQ.'G' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NBMIN = 2 - END IF - ELSE IF( C3( 1:1 ).EQ.'M' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NBMIN = 2 - END IF - END IF - END IF - ILAENV = NBMIN - RETURN -* - 300 CONTINUE -* -* ISPEC = 3: crossover point -* - NX = 0 - IF( C2.EQ.'GE' ) THEN - IF( C3.EQ.'QRF' .OR. C3.EQ.'RQF' .OR. C3.EQ.'LQF' .OR. - $ C3.EQ.'QLF' ) THEN - IF( SNAME ) THEN - NX = 128 - ELSE - NX = 128 - END IF - ELSE IF( C3.EQ.'HRD' ) THEN - IF( SNAME ) THEN - NX = 128 - ELSE - NX = 128 - END IF - ELSE IF( C3.EQ.'BRD' ) THEN - IF( SNAME ) THEN - NX = 128 - ELSE - NX = 128 - END IF - END IF - ELSE IF( C2.EQ.'SY' ) THEN - IF( SNAME .AND. C3.EQ.'TRD' ) THEN - NX = 32 - END IF - ELSE IF( CNAME .AND. C2.EQ.'HE' ) THEN - IF( C3.EQ.'TRD' ) THEN - NX = 32 - END IF - ELSE IF( SNAME .AND. C2.EQ.'OR' ) THEN - IF( C3( 1:1 ).EQ.'G' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NX = 128 - END IF - END IF - ELSE IF( CNAME .AND. C2.EQ.'UN' ) THEN - IF( C3( 1:1 ).EQ.'G' ) THEN - IF( C4.EQ.'QR' .OR. C4.EQ.'RQ' .OR. C4.EQ.'LQ' .OR. - $ C4.EQ.'QL' .OR. C4.EQ.'HR' .OR. C4.EQ.'TR' .OR. - $ C4.EQ.'BR' ) THEN - NX = 128 - END IF - END IF - END IF - ILAENV = NX - RETURN -* - 400 CONTINUE -* -* ISPEC = 4: number of shifts (used by xHSEQR) -* - ILAENV = 6 - RETURN -* - 500 CONTINUE -* -* ISPEC = 5: minimum column dimension (not used) -* - ILAENV = 2 - RETURN -* - 600 CONTINUE -* -* ISPEC = 6: crossover point for SVD (used by xGELSS and xGESVD) -* - ILAENV = INT( REAL( MIN( N1, N2 ) )*1.6E0 ) - RETURN -* - 700 CONTINUE -* -* ISPEC = 7: number of processors (not used) -* - ILAENV = 1 - RETURN -* - 800 CONTINUE -* -* ISPEC = 8: crossover point for multishift (used by xHSEQR) -* - ILAENV = 50 - RETURN -* - 900 CONTINUE -* -* ISPEC = 9: maximum size of the subproblems at the bottom of the -* computation tree in the divide-and-conquer algorithm -* (used by xGELSD and xGESDD) -* - ILAENV = 25 - RETURN -* - 1000 CONTINUE -* -* ISPEC = 10: ieee NaN arithmetic can be trusted not to trap -* - ILAENV = 0 - RETURN -* - 1100 CONTINUE -* -* ISPEC = 11: infinity arithmetic can be trusted not to trap -* - ILAENV = 0 - RETURN -* -* End of ILAENV -* - END FUNCTION ILAENV - diff --git a/BLAS/LSAME.f b/BLAS/LSAME.f deleted file mode 100644 index b57065ba..00000000 --- a/BLAS/LSAME.f +++ /dev/null @@ -1,90 +0,0 @@ -! 067 LAPACK_BLAS_AUX ############################################################################################################## - - LOGICAL FUNCTION LSAME( CA, CB ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. - CHARACTER CA, CB -* .. -* -* Purpose -* ======= -* -* LSAME returns .TRUE. if CA is the same letter as CB regardless of -* case. -* -* Arguments -* ========= -* -* CA (input) CHARACTER*1 -* CB (input) CHARACTER*1 -* CA and CB specify the single characters to be compared. -* -* ===================================================================== -* -* .. Intrinsic Functions .. - INTRINSIC ICHAR -* .. -* .. Local Scalars .. - INTEGER INTA, INTB, ZCODE -* .. -* .. Executable Statements .. -* -* Test if the characters are equal -* - LSAME = CA.EQ.CB - IF( LSAME ) - $ RETURN -* -* Now test for equivalence if both characters are alphabetic. -* - ZCODE = ICHAR( 'Z' ) -* -* Use 'Z' rather than 'A' so that ASCII can be detected on Prime -* machines, on which ICHAR returns a value with bit 8 set. -* ICHAR('A') on Prime machines returns 193 which is the same as -* ICHAR('A') on an EBCDIC machine. -* - INTA = ICHAR( CA ) - INTB = ICHAR( CB ) -* - IF( ZCODE.EQ.90 .OR. ZCODE.EQ.122 ) THEN -* -* ASCII is assumed - ZCODE is the ASCII code of either lower or -* upper case 'Z'. -* - IF( INTA.GE.97 .AND. INTA.LE.122 ) INTA = INTA - 32 - IF( INTB.GE.97 .AND. INTB.LE.122 ) INTB = INTB - 32 -* - ELSE IF( ZCODE.EQ.233 .OR. ZCODE.EQ.169 ) THEN -* -* EBCDIC is assumed - ZCODE is the EBCDIC code of either lower or -* upper case 'Z'. -* - IF( INTA.GE.129 .AND. INTA.LE.137 .OR. - $ INTA.GE.145 .AND. INTA.LE.153 .OR. - $ INTA.GE.162 .AND. INTA.LE.169 ) INTA = INTA + 64 - IF( INTB.GE.129 .AND. INTB.LE.137 .OR. - $ INTB.GE.145 .AND. INTB.LE.153 .OR. - $ INTB.GE.162 .AND. INTB.LE.169 ) INTB = INTB + 64 -* - ELSE IF( ZCODE.EQ.218 .OR. ZCODE.EQ.250 ) THEN -* -* ASCII is assumed, on Prime machines - ZCODE is the ASCII code -* plus 128 of either lower or upper case 'Z'. -* - IF( INTA.GE.225 .AND. INTA.LE.250 ) INTA = INTA - 32 - IF( INTB.GE.225 .AND. INTB.LE.250 ) INTB = INTB - 32 - END IF - LSAME = INTA.EQ.INTB -* -* RETURN -* -* End of LSAME -* - END FUNCTION LSAME - diff --git a/CMakeLists.txt b/CMakeLists.txt index c52d5244..affde10c 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -13,6 +13,48 @@ if(WIN32) enable_language(RC) endif() +# ! --- hijack mode for extern blas begin--- ! +# ============================================================================== +# ABI SAFETY CHECK +# ============================================================================== +include(CheckFortranSourceCompiles) +check_fortran_source_compiles(" + program abi_check + use iso_c_binding, only: c_int + implicit none + integer(c_int) :: n + if (kind(n) /= 4) stop 1 + end program +" FORTRAN_C_INT_IS_4BYTE) + +if(NOT FORTRAN_C_INT_IS_4BYTE) + message(FATAL_ERROR + \"\n*** BLAS ABI MISMATCH DETECTED ***\n\" + \"Your Fortran compiler's C_INT is not 4 bytes. \" + \"This means standard OpenBLAS (LP64) will corrupt the stack.\n\") +endif() +# ! --- hijack mode for extern blas end--- ! + +# ! --- hijack mode for extern blas begin--- ! +# ============================================================================== +# FIND OPENBLAS +# ============================================================================== +# set(OpenBLAS_DIR "C:/gcc/openblas32/lib/cmake/openblas") +find_package(OpenBLAS QUIET) + +if(OpenBLAS_FOUND) + message(STATUS "Using OpenBLAS: ${OpenBLAS_LIBRARIES}") + set(MYSTRAN_USE_OPENBLAS ON) + set(enable_internal_blaslib OFF CACHE BOOL "" FORCE) + set(TPL_BLAS_LIBRARIES "${OpenBLAS_LIBRARIES}" CACHE FILEPATH "" FORCE) +else() + message(STATUS "OpenBLAS not found, will use internal BLAS") + set(MYSTRAN_USE_OPENBLAS OFF) + set(enable_internal_blaslib ON CACHE BOOL "" FORCE) + unset(TPL_BLAS_LIBRARIES CACHE) +endif() +# ! --- hijack mode for extern blas end--- ! + # basic compiler and output options set(CMAKE_SOURCE_DIR "${PROJECT_SOURCE_DIR}/Source") set(BLAS_SOURCE_DIR "${PROJECT_SOURCE_DIR}/BLAS") @@ -40,6 +82,30 @@ set(CMAKE_CONFIGURATION_TYPES CACHE STRING "" FORCE ) +# !--- ndebug release fix --- begin! +# Some Intel ifx Release setups conflict with forced NDEBUG definitions coming +# from default CMake release flags. Allow stripping NDEBUG for reproducible +# behavior across toolchains. +option(MYSTRAN_DISABLE_NDEBUG "Remove NDEBUG from CMake release-style flags" ON) +if(MYSTRAN_DISABLE_NDEBUG) + foreach(lang C CXX Fortran) + foreach(cfg RELEASE RELWITHDEBINFO MINSIZEREL) + set(_var "CMAKE_${lang}_FLAGS_${cfg}") + if(DEFINED ${_var}) + set(_flags "${${_var}}") + string(REPLACE "-DNDEBUG" "" _flags "${_flags}") + string(REPLACE "/DNDEBUG" "" _flags "${_flags}") + string(REPLACE "-D NDEBUG" "" _flags "${_flags}") + string(REPLACE "/D NDEBUG" "" _flags "${_flags}") + string(REGEX REPLACE "[ \t]+" " " _flags "${_flags}") + string(STRIP "${_flags}" _flags) + set(${_var} "${_flags}" CACHE STRING "" FORCE) + endif() + endforeach() + endforeach() +endif() +# !--- ndebug release fix --- end! + # suppress cmake warnings for superlu if(NOT DEFINED CMAKE_SUPPRESS_DEVELOPER_WARNINGS) set(CMAKE_SUPPRESS_DEVELOPER_WARNINGS 1 CACHE INTERNAL "No dev warnings") @@ -153,6 +219,8 @@ else() set(SLU_DRIVER "${SUPERLU_DIR}/FORTRAN/c_fortran_dgssv.c") endif() +# ! --- hijack mode for extern blas begin--- ! +if(NOT MYSTRAN_USE_OPENBLAS) # f2c stuff set(F2C_DIR "${PROJECT_SOURCE_DIR}/f2c") set(F2C_INCLUDE_DIR "${F2C_DIR}/include") @@ -200,6 +268,7 @@ if(WIN32) endif() set(F2C_ARITH_H "${F2C_INCLUDE_DIR}/arith.h") +# --- feast_add --- begin ! set_source_files_properties( ${F2C_ARITHCHK_SRC} PROPERTIES COMPILE_FLAGS "-DNO_LONG_LONG -DNO_FPINIT" ) @@ -224,6 +293,8 @@ add_library(f2c ${F2C_CFILES} ${F2C_ARITH_H}) if(WIN32) add_definitions(-DUSE_CLOCK -DMSDOS) endif() +endif() # NOT MYSTRAN_USE_OPENBLAS +# ! --- hijack mode for extern blas end--- ! # set some extra vars for MSYS builds to make the binary portable if(WIN32) @@ -242,61 +313,79 @@ file(GLOB_RECURSE ALL_FORTRAN_FILES "${CMAKE_SOURCE_DIR}/*.F03" ) -# same BLAS-finding subroutine as SuperLU -if(NOT enable_internal_blaslib) - if(TPL_BLAS_LIBRARIES) - set(BLAS_FOUND TRUE) - else() - find_package(BLAS) - - if(BLAS_FOUND) - set(TPL_BLAS_LIBRARIES "${BLAS_LIBRARIES}" CACHE FILEPATH - "Set from FindBLAS.cmake BLAS_LIBRARIES." FORCE) - endif() - endif() -endif() +set_source_files_properties( + "${PROJECT_SOURCE_DIR}/Source/LK4/EIGRL_EXTRACT_SOLVERS.F90" + PROPERTIES Fortran_PREPROCESS ON COMPILE_OPTIONS "-cpp" +) -if(NOT BLAS_FOUND) - message(STATUS "BLAS not found, building local BLAS") - list( - APPEND blas_fns dgemm dgemv dlamch dlanst dscal dsteqr dsterf dswap dtrsm - dtrtri ilaenv lsame xerbla - ) - foreach(fname IN LISTS blas_fns) - # message(STATUS "Checking for BLAS subr ${fname}") - check_function_exists("${fname}" BLAS_FN_EXISTS) - set(BLAS_FN_EXISTS CACHE "1" STRING) +option(MYSTRAN_USE_EXTERNAL_FEAST "Enable native FEAST backend on condensed dense problem" OFF) +set(MYSTRAN_FEAST_EXTRA_LIBS "" CACHE STRING "Additional libraries for native FEAST backend") +# --- feast_add --- end ! - if(NOT BLAS_FN_EXISTS) - # message(STATUS "BLAS subr ${fname} not found") - string(TOUPPER ${fname} fname_upper) - list(APPEND missing_blas_src "${BLAS_SOURCE_DIR}/${fname_upper}.f") - list(APPEND missing_blas_fns ${fname}) +# ! --- hijack mode for extern blas begin--- ! +# BLAS configuration +if(MYSTRAN_USE_OPENBLAS) + message(STATUS "Using OpenBLAS: ${TPL_BLAS_LIBRARIES}") + set(missing_blas_src "${BLAS_SOURCE_DIR}/XERBLA.f") +else() + # same BLAS-finding subroutine as SuperLU + if(NOT enable_internal_blaslib) + if(TPL_BLAS_LIBRARIES) + set(BLAS_FOUND TRUE) + else() + find_package(BLAS) + + if(BLAS_FOUND) + set(TPL_BLAS_LIBRARIES "${BLAS_LIBRARIES}" CACHE FILEPATH + "Set from FindBLAS.cmake BLAS_LIBRARIES." FORCE) + endif() + endif() endif() - unset(BLAS_FN_EXISTS CACHE) - endforeach() - - # if any subroutines have bene found, create an inner blas library - list(LENGTH missing_blas_fns MISSING_FNS_TOTAL) - - if(MISSING_FNS_TOTAL GREATER 0) - if(MISSING_FNS_TOTAL GREATER 1) - message( - STATUS - "BLAS subrs (${missing_blas_fns}) are absent and will be built locally." + if(NOT BLAS_FOUND) + message(STATUS "BLAS not found, building local BLAS") + list( + APPEND blas_fns dgemm dgemv dlamch dlanst dscal dsteqr dsterf dswap dtrsm + dtrtri ilaenv lsame xerbla ) + + foreach(fname IN LISTS blas_fns) + # message(STATUS "Checking for BLAS subr ${fname}") + check_function_exists("${fname}" BLAS_FN_EXISTS) + set(BLAS_FN_EXISTS CACHE "1" STRING) + + if(NOT BLAS_FN_EXISTS) + # message(STATUS "BLAS subr ${fname} not found") + string(TOUPPER ${fname} fname_upper) + list(APPEND missing_blas_src "${BLAS_SOURCE_DIR}/${fname_upper}.f") + list(APPEND missing_blas_fns ${fname}) + endif() + + unset(BLAS_FN_EXISTS CACHE) + endforeach() + + # if any subroutines have bene found, create an inner blas library + list(LENGTH missing_blas_fns MISSING_FNS_TOTAL) + + if(MISSING_FNS_TOTAL GREATER 0) + if(MISSING_FNS_TOTAL GREATER 1) + message( + STATUS + "BLAS subrs (${missing_blas_fns}) are absent and will be built locally." + ) + else() + message( + STATUS + "BLAS subr ${missing_blas_fns} is absent and will be built locally." + ) + endif() + endif() else() - message( - STATUS - "BLAS subr ${missing_blas_fns} is absent and will be built locally." - ) + message(STATUS "Using system BLAS (${TPL_BLAS_LIBRARIES})") endif() - endif() -else() - message(STATUS "Using system BLAS (${TPL_BLAS_LIBRARIES})") endif() +# ! --- hijack mode for extern blas end--- ! # prepare the main executable, linked against the specifics and the m # it appears utils used to be a module, but that is no longer the case? @@ -313,14 +402,41 @@ add_executable( if(USE_SUPERLU_MT) if(PLAT STREQUAL "_PTHREAD") message(WARNING "We recommend using OpenMP, not pthread!") - target_link_libraries(mystran superlu_mt_PTHREAD f2c) + if(MYSTRAN_USE_OPENBLAS) + target_link_libraries(mystran superlu_mt_PTHREAD ${TPL_BLAS_LIBRARIES}) + else() + target_link_libraries(mystran superlu_mt_PTHREAD f2c) + endif() elseif(PLAT STREQUAL "_OPENMP") - target_link_libraries(mystran superlu_mt_OPENMP f2c) + if(MYSTRAN_USE_OPENBLAS) + target_link_libraries(mystran superlu_mt_OPENMP ${TPL_BLAS_LIBRARIES}) + else() + target_link_libraries(mystran superlu_mt_OPENMP f2c) + endif() endif() target_compile_definitions(mystran PRIVATE USE_SUPERLU_MT) else() - target_link_libraries(mystran superlu f2c) + if(MYSTRAN_USE_OPENBLAS) + target_link_libraries(mystran superlu ${TPL_BLAS_LIBRARIES}) + else() + target_link_libraries(mystran superlu f2c) + endif() +endif() + + +if(MYSTRAN_USE_EXTERNAL_FEAST) + target_compile_definitions(mystran PRIVATE MYSTRAN_HAVE_EXTERNAL_FEAST) + foreach(_feast_lib IN LISTS MYSTRAN_FEAST_EXTRA_LIBS) + get_filename_component(_feast_dir "${_feast_lib}" DIRECTORY) + if(EXISTS "${_feast_dir}") + target_include_directories(mystran PRIVATE "${_feast_dir}") + endif() + endforeach() + if(MYSTRAN_FEAST_EXTRA_LIBS) + separate_arguments(MYSTRAN_FEAST_EXTRA_LIBS) + target_link_libraries(mystran ${MYSTRAN_FEAST_EXTRA_LIBS}) + endif() endif() set_target_properties( diff --git a/Source/LK1/L1A-BD/BD_PARAM.F90 b/Source/LK1/L1A-BD/BD_PARAM.F90 index b76c1bb4..fdbfa0ed 100644 --- a/Source/LK1/L1A-BD/BD_PARAM.F90 +++ b/Source/LK1/L1A-BD/BD_PARAM.F90 @@ -40,7 +40,8 @@ SUBROUTINE BD_PARAM ( CARD ) USE PARAMS, ONLY : ARP_TOL , ART_KED , ART_ROT_KED , ART_TRAN_KED , & ART_MASS , ART_ROT_MASS , ART_TRAN_MASS , AUTOSPC , AUTOSPC_NSET , & - AUTOSPC_RAT , AUTOSPC_INFO , AUTOSPC_SPCF , BAILOUT , CRS_CCS , & + AUTOSPC_RAT , AUTOSPC_INFO , AUTOSPC_SPCF , BAILOUT , BANDEDOPT , & + CRS_CCS , & CBMIN3 , CBMIN4 , CBMIN4T , CHKGRDS , & CUSERIN , CUSERIN_EID , CUSERIN_IN4 , CUSERIN_PID , CUSERIN_SPNT_ID , & CUSERIN_XSET , CUSERIN_COMPTYP , DARPACK , & @@ -49,7 +50,7 @@ SUBROUTINE BD_PARAM ( CARD ) EPSIL , EMP0_PAUSE , ESP0_PAUSE , F06_COL_START , & GRDPNT , GRDPNT_IN , GRIDSEQ , HEXAXIS , & IORQ1M , IORQ1S , IORQ1B , IORQ2B , IORQ2T , & - ITMAX , KLLRAT , KOORAT , MATSPARS , & + ITMAX , KLLRAT , KOORAT , LANCMETH , MATSPARS , & MEMAFAC , MIN4TRED , MXALLOCA , MAXRATIO , & MEFMCORD , MEFMLOC , MEFMGRID , & MPFOUT , MXITERI , MXITERL , OTMSKIP , POST , & @@ -378,6 +379,12 @@ SUBROUTINE BD_PARAM ( CARD ) CALL CARD_FLDS_NOT_BLANK ( JCARD,0,0,4,5,6,7,8,9 )! Issue warning if fields 4-9 not blank CALL CRDERR ( CARD ) ! CRDERR prints errors found when reading fields +! BANDEDOPT enables experimental banded-order optimization path + + ELSE IF (JCARD(2)(1:8) == 'BANDEDOP') THEN + PARNAM = 'BANDEDOPT' + CALL YES_NO_CHECK(CARD, JCARD, CHRPARM, PARNAM, BANDEDOPT) + ! CBMIN3 is a parameter for the Mindlin (thick) triangular plate element (CTRIA3). ! It is used in calculating PHISQ, a scalar multiple of the transverse shear stiff @@ -872,15 +879,17 @@ SUBROUTINE BD_PARAM ( CARD ) GRIDSEQ = 'GRID ' ELSE IF (CHRPARM == 'INPUT ') THEN GRIDSEQ = 'INPUT ' + ELSE IF (CHRPARM(1:3) == 'RCM') THEN + GRIDSEQ = 'RCM ' ELSE WARN_ERR = WARN_ERR + 1 WRITE(ERR,101) CARD - WRITE(ERR,1189) PARNAM,'BANDIT, GRID or INPUT',CHRPARM,GRIDSEQ + WRITE(ERR,1189) PARNAM,'BANDIT, GRID, INPUT or RCM',CHRPARM,GRIDSEQ IF (SUPWARN == 'N') THEN IF (ECHO == 'NONE ') THEN WRITE(F06,101) CARD ENDIF - WRITE(F06,1189) PARNAM,'BANDIT, GRID or INPUT',CHRPARM,GRIDSEQ + WRITE(F06,1189) PARNAM,'BANDIT, GRID, INPUT or RCM',CHRPARM,GRIDSEQ ENDIF ENDIF ENDIF @@ -1356,6 +1365,41 @@ SUBROUTINE BD_PARAM ( CARD ) CALL CARD_FLDS_NOT_BLANK ( JCARD,0,0,4,5,6,7,8,9 )! Issue warning if fields 4-9 not blank CALL CRDERR ( CARD ) ! CRDERR prints errors found when reading fields +! --- chase_feast_add --- begin ! +! LANCMETH is a deprecated alias for selecting the EIGRL extract backend when no EIGRL continuation is present + + ELSE IF (JCARD(2)(1:8) == 'LANCMETH') THEN + PARNAM = 'LANCMETH' + CALL CHAR_FLD ( JCARD(3), JF(3), CHRPARM ) + IF (IERRFL(3) == 'N') THEN + CALL LEFT_ADJ_BDFLD ( CHRPARM ) + IF (CHRPARM(1:6) == 'ARPACK') THEN + LANCMETH = 'ARPACK ' + ELSE IF (CHRPARM(1:5) == 'CHASE') THEN + LANCMETH = 'CHASE ' + ELSE IF (CHRPARM(1:5) == 'FEAST') THEN + LANCMETH = 'FEAST ' + ELSE IF (CHRPARM(1:5) == 'SUBSP') THEN + LANCMETH = 'SUBSP ' + ELSE IF (CHRPARM(1:5) == 'DENSE') THEN + LANCMETH = 'DENSE ' + ELSE + WARN_ERR = WARN_ERR + 1 + WRITE(ERR,101) CARD + WRITE(ERR,1189) PARNAM,'ARPACK/CHASE/FEAST/SUBSP/DENSE',CHRPARM,LANCMETH + IF (SUPWARN == 'N') THEN + IF (ECHO == 'NONE ') THEN + WRITE(F06,101) CARD + ENDIF + WRITE(F06,1189) PARNAM,'ARPACK/CHASE/FEAST/SUBSP/DENSE',CHRPARM,LANCMETH + ENDIF + ENDIF + ENDIF + + CALL BD_IMBEDDED_BLANK ( JCARD,0,3,0,0,0,0,0,0 ) + CALL CARD_FLDS_NOT_BLANK ( JCARD,0,0,4,5,6,7,8,9 ) + CALL CRDERR ( CARD ) + ! OTMSKIP defines whether tp quit if a singularity is found in matrix decomp ELSE IF (JCARD(2)(1:8) == 'OTMSKIP ') THEN @@ -2822,8 +2866,10 @@ SUBROUTINE BD_PARAM ( CARD ) CALL CARD_FLDS_NOT_BLANK ( JCARD,0,0,4,5,6,7,8,9 )! Issue warning if fields 4-9 not blank CALL CRDERR ( CARD ) ! CRDERR prints errors found when reading fields -! WINAMEM is the max MB of memory Windows allows. If an attempt is made to exceed it, the code can abort with an -! abnormal termination +! --- BANDED_optimizisation -begin-- ! +! WINAMEM is an optional per-array MB cap. Historical MYSTRAN used this for the Windows XP 2 GB address-space limit. +! In modern 64-bit builds, WINAMEM <= 0 disables this legacy cap and lets ALLOCATE/STAT report memory failure. +! --- BANDED_optimizisation -end-- ! ELSE IF (JCARD(2)(1:8) == 'WINAMEM ') THEN PARNAM = 'WINAMEM ' @@ -2974,6 +3020,7 @@ SUBROUTINE BD_PARAM ( CARD ) CALL CRDERR ( CARD ) ! CRDERR prints errors found when reading fields ENDIF +! --- chase_feast_add --- end ! diff --git a/Source/LK1/L1B/SEQ_PROC.f90 b/Source/LK1/L1B/SEQ_PROC.f90 index f5bd463c..41f93198 100644 --- a/Source/LK1/L1B/SEQ_PROC.f90 +++ b/Source/LK1/L1B/SEQ_PROC.f90 @@ -32,11 +32,11 @@ SUBROUTINE SEQ_PROC USE IOUNT1, ONLY : WRT_ERR, ERR, F06, SEQ, L1B USE IOUNT1, ONLY : WRT_ERR, SEQFIL USE IOUNT1, ONLY : WRT_ERR, SEQSTAT - USE SCONTR, ONLY : BLNK_SUB_NAM, DATA_NAM_LEN, FATAL_ERR, NGRID, NSEQ, PROG_NAME, WARN_ERR - USE PARAMS, ONLY : EPSIL, GRIDSEQ + USE SCONTR, ONLY : BLNK_SUB_NAM, DATA_NAM_LEN, FATAL_ERR, NELE, NGRID, NSEQ, PROG_NAME, WARN_ERR + USE PARAMS, ONLY : BANDEDOPT, EPSIL, GRIDSEQ, SOLLIB USE TIMDAT, ONLY : TSEC USE PARAMS, ONLY : SUPINFO, SUPWARN - USE MODEL_STUF, ONLY : GRID_ID, GRID_SEQ, INV_GRID_SEQ, SEQ1, SEQ2 + USE MODEL_STUF, ONLY : ETYPE, GRID_ID, GRID_SEQ, INV_GRID_SEQ, SEQ1, SEQ2 USE DEBUG_PARAMETERS, ONLY : DEBUG USE SEQ_PROC_USE_IFs @@ -49,21 +49,33 @@ SUBROUTINE SEQ_PROC INTEGER(LONG) :: I,J ! DO loop indices INTEGER(LONG) :: IERROR ! Error count INTEGER(LONG) :: IGRID ! Internal grid ID - INTEGER(LONG) :: TMP_GRID_ID(NGRID)! Set to array GRID_ID for aid in sorting GRID_SEQ - INTEGER(LONG) :: TMP_GRD_SEQ(NGRID)! Set to array GRID_SEQ so we can sort it and get array INV_GRID_SEQ + INTEGER(LONG) :: NSEQ_SAVE ! Save of incoming SEQ count (for diagnostics) + INTEGER(LONG), ALLOCATABLE :: TMP_GRID_ID(:) ! Set to array GRID_ID for aid in sorting GRID_SEQ + INTEGER(LONG), ALLOCATABLE :: TMP_GRD_SEQ(:) ! Set to array GRID_SEQ so we can sort it and get array INV_GRID_SEQ ! without disturbing GRID_SEQ sequence - REAL(DOUBLE) :: R_GSEQ(NGRID) ! Real sequence numbers (since SEQGP cards can have real no's). In the + REAL(DOUBLE), ALLOCATABLE :: R_GSEQ(:) ! Real sequence numbers (since SEQGP cards can have real no's). In the ! end, the sequence array that will be used is integer array GRID_SEQ INTRINSIC :: DBLE + LOGICAL :: DO_RCM ! Activate in-core RCM sequencing add-on + LOGICAL :: RCM_OK ! True if RCM graph build/order succeeded ! ********************************************************************************************************************************** ! Coming in to this subr, GRID_SEQ is in the order of the grids as read in the input data deck. + ALLOCATE ( R_GSEQ(NGRID), TMP_GRID_ID(NGRID), TMP_GRD_SEQ(NGRID) ) + NSEQ_SAVE = NSEQ + DO_RCM = .FALSE. + RCM_OK = .FALSE. + + IF ((GRIDSEQ(1:3) == 'RCM') .OR. ((BANDEDOPT == 'Y') .AND. (SOLLIB == 'BANDED ') .AND. (GRIDSEQ(1:6) /= 'BANDIT'))) THEN + DO_RCM = .TRUE. + ENDIF + ! Generate initial R_GSEQ based on the GRID_SEQ value. R_GSEQ(I) is the (real) sequence number for Grid Point GRID_ID(I). ! If there are no SEQGP sequencing cards, then this will be the final grid point sequence order (as a real number). ! It will be converted to an array of consecutive integers in GRID_SEQ later. @@ -72,30 +84,48 @@ SUBROUTINE SEQ_PROC ! resequencing is necessary) and then proceed with that complete set of SEQGP cards. Need to do it this way because we have to ! get INV_GRID_SEQ (later) and to write the seq arrays to L1B. - IF (GRIDSEQ(1:6) == 'BANDIT') THEN ! Call subr AUTO_SEQ_PROC to generate SEQ1, SEQ2 from SEQGP card images - CALL AUTO_SEQ_PROC - IF (NSEQ == NGRID) THEN ! Bandit did reseq grids. Set R_GSEQ to the SEQ2 from Bandit SEQGP cards - DO I=1,NGRID -! R_GSEQ(I) = DBLE(SEQ2(I)) ! Shouldn't need this, SEQ2 is REAL(DOUBLE) - R_GSEQ(I) = SEQ2(I) - ENDDO - ELSE ! AUTO_SEQ_PROC didn't reseq all grids so reset GRIDSEQ = 'GRID' - WRITE(ERR,101) NGRID,NSEQ,PROG_NAME + IF (DO_RCM) THEN + CALL RCM_SEQ_PROC ( R_GSEQ, RCM_OK ) + IF (RCM_OK) THEN + NSEQ = 0 + WRITE(ERR,102) NGRID, NSEQ_SAVE, GRIDSEQ, BANDEDOPT, SOLLIB IF (SUPINFO == 'N') THEN - WRITE(F06,101) NGRID,NSEQ,PROG_NAME + WRITE(F06,102) NGRID, NSEQ_SAVE, GRIDSEQ, BANDEDOPT, SOLLIB + ENDIF + ELSE + WRITE(ERR,103) GRIDSEQ, BANDEDOPT, SOLLIB + IF (SUPINFO == 'N') THEN + WRITE(F06,103) GRIDSEQ, BANDEDOPT, SOLLIB ENDIF - GRIDSEQ = 'GRID ' ! Need this to cover case where AUTO_SEQ_PROC returned without completing ENDIF ENDIF - IF (GRIDSEQ(1:4) == 'GRID' ) THEN ! Sequence grids in numerical order, but include SEQGP entries (later) - DO I=1,NGRID - R_GSEQ(I) = DBLE(I) - ENDDO - ELSE IF (GRIDSEQ(1:5) == 'INPUT') THEN ! Sequence grids in input order, but include SEQGP entries (later) - DO I=1,NGRID - R_GSEQ(I) = DBLE(GRID_SEQ(I)) - ENDDO + IF (.NOT. RCM_OK) THEN + IF (GRIDSEQ(1:6) == 'BANDIT') THEN ! Call subr AUTO_SEQ_PROC to generate SEQ1, SEQ2 from SEQGP card images + CALL AUTO_SEQ_PROC + IF (NSEQ == NGRID) THEN ! Bandit did reseq grids. Set R_GSEQ to the SEQ2 from Bandit SEQGP cards + DO I=1,NGRID +! R_GSEQ(I) = DBLE(SEQ2(I)) ! Shouldn't need this, SEQ2 is REAL(DOUBLE) + R_GSEQ(I) = SEQ2(I) + ENDDO + ELSE ! AUTO_SEQ_PROC didn't reseq all grids so reset GRIDSEQ = 'GRID' + WRITE(ERR,101) NGRID,NSEQ,PROG_NAME + IF (SUPINFO == 'N') THEN + WRITE(F06,101) NGRID,NSEQ,PROG_NAME + ENDIF + GRIDSEQ = 'GRID ' ! Need this to cover case where AUTO_SEQ_PROC returned without completing + ENDIF + ENDIF + + IF (GRIDSEQ(1:4) == 'GRID' ) THEN ! Sequence grids in numerical order, but include SEQGP entries (later) + DO I=1,NGRID + R_GSEQ(I) = DBLE(I) + ENDDO + ELSE IF ((GRIDSEQ(1:5) == 'INPUT') .OR. (GRIDSEQ(1:3) == 'RCM')) THEN + DO I=1,NGRID + R_GSEQ(I) = DBLE(GRID_SEQ(I)) + ENDDO + ENDIF ENDIF ! Check to make sure that all grid points on SEQGP cards are defined @@ -236,11 +266,28 @@ SUBROUTINE SEQ_PROC + IF (ALLOCATED(R_GSEQ)) THEN + DEALLOCATE ( R_GSEQ ) + ENDIF + IF (ALLOCATED(TMP_GRID_ID)) THEN + DEALLOCATE ( TMP_GRID_ID ) + ENDIF + IF (ALLOCATED(TMP_GRD_SEQ)) THEN + DEALLOCATE ( TMP_GRD_SEQ ) + ENDIF + RETURN ! ********************************************************************************************************************************** 101 FORMAT(' *INFORMATION: SUBR AUTO_SEQ_PROC DID NOT SEQUENCE ALL OF THE ',I8,' GRIDS. ONLY ',I8,' GRIDS WERE SEQUENCED.' & ,/,15X,A,' WILL DEFAULT TO A SEQUENCE THAT IS IN GRID NUMERICAL ORDER',/) +! --- BANDED_optimizisation -begin-- ! + 102 FORMAT(' *INFORMATION: IN-CORE RCM GRID SEQUENCING APPLIED TO ',I8,' GRIDS (IGNORED ',I8,' INPUT/AUTO SEQGP ENTRY(IES)).' & + ,/,15X,' PARAM GRIDSEQ=',A8,' BANDEDOPT=',A1,' SOLLIB=',A8 & + ,/,15X,' RCM UPDATES GRID_SEQ/INV_GRID_SEQ BEFORE DOF NUMBERING, SO MASS AND STIFFNESS MATRICES SHARE ORDER.') +! --- BANDED_optimizisation -end-- ! + 103 FORMAT(' *WARNING : IN-CORE RCM SEQUENCING REQUESTED BUT A VALID CONNECTIVITY GRAPH COULD NOT BE BUILT.' & + ,/,15X,' FALLING BACK TO EXISTING GRIDSEQ FLOW. PARAM GRIDSEQ=',A8,' BANDEDOPT=',A1,' SOLLIB=',A8) 111 FORMAT(56X,'GRID SEQUENCE DATA',//16X,'GRID ID I R_GSEQ(I) GRID_SEQ(I) ', & ' INV_GRID_SEQ(I)',/,12X,'(Actual grid ID) (Internal grid ID) (Grid seq - real num)', & @@ -266,6 +313,252 @@ SUBROUTINE SEQ_PROC CONTAINS +! ################################################################################################################################## + + SUBROUTINE RCM_SEQ_PROC ( R_GSEQ, RCM_OK ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE SCONTR, ONLY : BLNK_SUB_NAM, medat0_cuserin, MELGP, NELE, NGRID + USE MODEL_STUF, ONLY : EDAT, ELGP, EPNT, ETYPE, GRID_ID + USE PARAMS, ONLY : EPSIL + USE SEQ_PROC_USE_IFs + + IMPLICIT NONE + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: SUBR_NAME_RCM = 'RCM_SEQ_PROC' + CHARACTER(8*BYTE) :: TYPE_E + + REAL(DOUBLE), INTENT(OUT) :: R_GSEQ(NGRID) + LOGICAL, INTENT(OUT) :: RCM_OK + + INTEGER(LONG) :: ASTAT + INTEGER(LONG) :: DELTA + INTEGER(LONG) :: E + INTEGER(LONG) :: EPNTK + INTEGER(LONG) :: I + INTEGER(LONG) :: IDX + INTEGER(LONG) :: J + INTEGER(LONG) :: K + INTEGER(LONG) :: MIN_DEG + INTEGER(LONG) :: NG + INTEGER(LONG) :: NODE + INTEGER(LONG) :: ORDER_LEN + INTEGER(LONG) :: P + INTEGER(LONG) :: START_NODE + INTEGER(LONG) :: STAMP + INTEGER(LONG) :: TAIL + INTEGER(LONG) :: HEAD + INTEGER(LONG) :: TOTAL_INC + INTEGER(LONG) :: GRID_ROW_NUM + INTEGER(LONG) :: NBR_LEN + INTEGER(LONG) :: COMP_LEN + INTEGER(LONG) :: NBR_NODE + INTEGER(LONG) :: NSEQ_LOC + INTEGER(LONG) :: TMP + INTEGER(LONG) :: TMP_DEG + INTEGER(LONG) :: ELROW + + INTEGER(LONG), ALLOCATABLE :: ELEM_NGP(:) + INTEGER(LONG), ALLOCATABLE :: ELEM_BGRID(:,:) + INTEGER(LONG), ALLOCATABLE :: NODE_ELEM_CNT(:) + INTEGER(LONG), ALLOCATABLE :: NODE_ELEM_PTR(:) + INTEGER(LONG), ALLOCATABLE :: NODE_ELEM_WRK(:) + INTEGER(LONG), ALLOCATABLE :: NODE_ELEM_LIST(:) + INTEGER(LONG), ALLOCATABLE :: DEGREE(:) + INTEGER(LONG), ALLOCATABLE :: MARK(:) + INTEGER(LONG), ALLOCATABLE :: ORDER(:) + INTEGER(LONG), ALLOCATABLE :: QUEUE(:) + INTEGER(LONG), ALLOCATABLE :: COMP_ORDER(:) + INTEGER(LONG), ALLOCATABLE :: NEIGH(:) + LOGICAL, ALLOCATABLE :: VISITED(:) + + RCM_OK = .FALSE. + + ALLOCATE(ELEM_NGP(NELE), ELEM_BGRID(MELGP,NELE), NODE_ELEM_CNT(NGRID), NODE_ELEM_PTR(NGRID+1), STAT=ASTAT) + IF (ASTAT /= 0) GOTO 900 + ELEM_NGP = 0 + ELEM_BGRID = 0 + NODE_ELEM_CNT = 0 + NODE_ELEM_PTR = 0 + + DO E=1,NELE + CALL GET_ELGP ( E ) + NG = ELGP + ELEM_NGP(E) = NG + EPNTK = EPNT(E) + TYPE_E = ETYPE(E) + DELTA = 1 + IF (TYPE_E == 'BUSH ') DELTA = 2 + IF (TYPE_E == 'USERIN ') DELTA = medat0_cuserin - 1 + + NSEQ_LOC = 0 + DO I=1,NG + CALL GET_ARRAY_ROW_NUM ( 'GRID_ID', SUBR_NAME_RCM, NGRID, GRID_ID, EDAT(EPNTK+I+DELTA), GRID_ROW_NUM ) + IF (GRID_ROW_NUM > 0) THEN + NSEQ_LOC = NSEQ_LOC + 1 + ELEM_BGRID(NSEQ_LOC,E) = GRID_ROW_NUM + NODE_ELEM_CNT(GRID_ROW_NUM) = NODE_ELEM_CNT(GRID_ROW_NUM) + 1 + ENDIF + ENDDO + ELEM_NGP(E) = NSEQ_LOC + ENDDO + + TOTAL_INC = 0 + DO I=1,NGRID + NODE_ELEM_PTR(I) = TOTAL_INC + 1 + TOTAL_INC = TOTAL_INC + NODE_ELEM_CNT(I) + ENDDO + NODE_ELEM_PTR(NGRID+1) = TOTAL_INC + 1 + + ALLOCATE(NODE_ELEM_LIST(MAX(TOTAL_INC,1)), NODE_ELEM_WRK(NGRID), DEGREE(NGRID), MARK(NGRID), ORDER(NGRID), & + QUEUE(NGRID), COMP_ORDER(NGRID), NEIGH(NGRID), VISITED(NGRID), STAT=ASTAT) + IF (ASTAT /= 0) GOTO 900 + + NODE_ELEM_LIST = 0 + NODE_ELEM_WRK = NODE_ELEM_PTR(1:NGRID) + DEGREE = 0 + MARK = 0 + ORDER = 0 + QUEUE = 0 + COMP_ORDER = 0 + NEIGH = 0 + VISITED = .FALSE. + + DO E=1,NELE + NG = ELEM_NGP(E) + IF (NG < 2) CYCLE + DO I=1,NG + NODE = ELEM_BGRID(I,E) + IF (NODE <= 0) CYCLE + IDX = NODE_ELEM_WRK(NODE) + NODE_ELEM_LIST(IDX) = E + NODE_ELEM_WRK(NODE) = IDX + 1 + ENDDO + ENDDO + + STAMP = 0 + DO I=1,NGRID + STAMP = STAMP + 1 + DO P=NODE_ELEM_PTR(I),NODE_ELEM_PTR(I+1)-1 + ELROW = NODE_ELEM_LIST(P) + IF (ELROW <= 0) CYCLE + NG = ELEM_NGP(ELROW) + DO J=1,NG + NODE = ELEM_BGRID(J,ELROW) + IF ((NODE > 0) .AND. (NODE /= I)) THEN + IF (MARK(NODE) /= STAMP) THEN + MARK(NODE) = STAMP + DEGREE(I) = DEGREE(I) + 1 + ENDIF + ENDIF + ENDDO + ENDDO + ENDDO + + ORDER_LEN = 0 + DO + START_NODE = 0 + MIN_DEG = HUGE(MIN_DEG) + DO I=1,NGRID + IF (.NOT. VISITED(I)) THEN + IF (DEGREE(I) < MIN_DEG) THEN + MIN_DEG = DEGREE(I) + START_NODE = I + ENDIF + ENDIF + ENDDO + IF (START_NODE == 0) EXIT + + HEAD = 1 + TAIL = 1 + QUEUE(1) = START_NODE + VISITED(START_NODE) = .TRUE. + COMP_LEN = 0 + + DO WHILE (HEAD <= TAIL) + NODE = QUEUE(HEAD) + HEAD = HEAD + 1 + COMP_LEN = COMP_LEN + 1 + COMP_ORDER(COMP_LEN) = NODE + + STAMP = STAMP + 1 + NBR_LEN = 0 + DO P=NODE_ELEM_PTR(NODE),NODE_ELEM_PTR(NODE+1)-1 + ELROW = NODE_ELEM_LIST(P) + IF (ELROW <= 0) CYCLE + NG = ELEM_NGP(ELROW) + DO J=1,NG + NBR_NODE = ELEM_BGRID(J,ELROW) + IF ((NBR_NODE > 0) .AND. (.NOT. VISITED(NBR_NODE)) .AND. (NBR_NODE /= NODE)) THEN + IF (MARK(NBR_NODE) /= STAMP) THEN + MARK(NBR_NODE) = STAMP + NBR_LEN = NBR_LEN + 1 + NEIGH(NBR_LEN) = NBR_NODE + ENDIF + ENDIF + ENDDO + ENDDO + + IF (NBR_LEN > 1) THEN + DO J=2,NBR_LEN + TMP = NEIGH(J) + TMP_DEG = DEGREE(TMP) + K = J - 1 + DO WHILE (K >= 1) + IF ((DEGREE(NEIGH(K)) > TMP_DEG) .OR. ((DEGREE(NEIGH(K)) == TMP_DEG) .AND. (NEIGH(K) > TMP))) THEN + NEIGH(K+1) = NEIGH(K) + K = K - 1 + ELSE + EXIT + ENDIF + ENDDO + NEIGH(K+1) = TMP + ENDDO + ENDIF + + DO J=1,NBR_LEN + NBR_NODE = NEIGH(J) + IF (.NOT. VISITED(NBR_NODE)) THEN + VISITED(NBR_NODE) = .TRUE. + TAIL = TAIL + 1 + QUEUE(TAIL) = NBR_NODE + ENDIF + ENDDO + ENDDO + + DO J=COMP_LEN,1,-1 + ORDER_LEN = ORDER_LEN + 1 + ORDER(ORDER_LEN) = COMP_ORDER(J) + ENDDO + ENDDO + + IF (ORDER_LEN /= NGRID) GOTO 900 + + DO I=1,NGRID + R_GSEQ(ORDER(I)) = DBLE(I) + ENDDO + + RCM_OK = .TRUE. + + 900 CONTINUE + IF (ALLOCATED(ELEM_NGP)) DEALLOCATE(ELEM_NGP) + IF (ALLOCATED(ELEM_BGRID)) DEALLOCATE(ELEM_BGRID) + IF (ALLOCATED(NODE_ELEM_CNT)) DEALLOCATE(NODE_ELEM_CNT) + IF (ALLOCATED(NODE_ELEM_PTR)) DEALLOCATE(NODE_ELEM_PTR) + IF (ALLOCATED(NODE_ELEM_WRK)) DEALLOCATE(NODE_ELEM_WRK) + IF (ALLOCATED(NODE_ELEM_LIST)) DEALLOCATE(NODE_ELEM_LIST) + IF (ALLOCATED(DEGREE)) DEALLOCATE(DEGREE) + IF (ALLOCATED(MARK)) DEALLOCATE(MARK) + IF (ALLOCATED(ORDER)) DEALLOCATE(ORDER) + IF (ALLOCATED(QUEUE)) DEALLOCATE(QUEUE) + IF (ALLOCATED(COMP_ORDER)) DEALLOCATE(COMP_ORDER) + IF (ALLOCATED(NEIGH)) DEALLOCATE(NEIGH) + IF (ALLOCATED(VISITED)) DEALLOCATE(VISITED) + + RETURN + + END SUBROUTINE RCM_SEQ_PROC + ! ################################################################################################################################## SUBROUTINE AUTO_SEQ_PROC diff --git a/Source/LK1/LINK1/ALLOCATE_STF_ARRAYS.f90 b/Source/LK1/LINK1/ALLOCATE_STF_ARRAYS.f90 index 9209656c..893bd585 100644 --- a/Source/LK1/LINK1/ALLOCATE_STF_ARRAYS.f90 +++ b/Source/LK1/LINK1/ALLOCATE_STF_ARRAYS.f90 @@ -203,7 +203,9 @@ SUBROUTINE ALLOCATE_STF_ARRAYS ( NAME, CALLING_SUBR ) NTERMS = LTERM MB_NEEDED = RDOUBLE*REAL(NTERMS)/ONEPP6 + TWO*RLONG*REAL(NTERMS)/ONEPP6 - IF (MB_NEEDED >= WINAMEM) THEN ! Reduce request for memory to +! --- BANDED_optimizisation -begin-- ! + IF ((WINAMEM > ZERO) .AND. (MB_NEEDED >= WINAMEM)) THEN +! --- BANDED_optimizisation -end-- ! NTERMS = MEMAFAC*(WINAMEM/MB_NEEDED)*NTERMS ENDIF ALLOC_ATTEMPT_NUM = 1 @@ -322,4 +324,3 @@ SUBROUTINE ALLOCATE_STF_ARRAYS ( NAME, CALLING_SUBR ) ! ********************************************************************************************************************************** END SUBROUTINE ALLOCATE_STF_ARRAYS - diff --git a/Source/LK2/LINK2.f90 b/Source/LK2/LINK2.f90 index aa6a96ed..49d3925d 100644 --- a/Source/LK2/LINK2.f90 +++ b/Source/LK2/LINK2.f90 @@ -40,7 +40,7 @@ SUBROUTINE LINK2 USE IOUNT1, ONLY : LINK2G, LINK2H , LINK2I , LINK2O , LINK2P , LINK2Q USE IOUNT1, ONLY : L2G_MSG, L2H_MSG, L2I_MSG, L2O_MSG, L2P_MSG, L2Q_MSG - USE SCONTR, ONLY : BLNK_SUB_NAM, COMM, FATAL_ERR, LINKNO, MBUG, & + USE SCONTR, ONLY : BLNK_SUB_NAM, COMM, FATAL_ERR, LINKNO, MBUG, NSUB, & NDOFG, NDOFM, NDOFN, NDOFS, NDOFSA, NDOFF, NDOFO, NDOFA, NDOFL, NDOFR, & NTERM_KGG , NTERM_KNN , NTERM_KNM , NTERM_KMM , & NTERM_KFF , NTERM_KFS , NTERM_KSS , & @@ -305,6 +305,18 @@ SUBROUTINE LINK2 WRITE(F06,146) 'PL LOAD MATRIX IS ', NTERM_PL ENDIF + IF (DEBUG(206) > 0) THEN + IF (NTERM_KLL > 0) THEN + CALL WRITE_MATRIX_MARKET_SPARSE ( 'KLL', NDOFL, NDOFL, 'Y', NTERM_KLL, I_KLL, J_KLL, KLL ) + ENDIF + IF (NTERM_MLL > 0) THEN + CALL WRITE_MATRIX_MARKET_SPARSE ( 'MLL', NDOFL, NDOFL, 'Y', NTERM_MLL, I_MLL, J_MLL, MLL ) + ENDIF + IF (NTERM_PL > 0) THEN + CALL WRITE_MATRIX_MARKET_SPARSE ( 'PL', NDOFL, NSUB, 'N', NTERM_PL, I_PL, J_PL, PL ) + ENDIF + ENDIF + IF (NUM_OU4_REQUESTS > 0) THEN ! Call OUTPUT4 processor to process output requests for OUTPUT4 matrices CALL LINK_MESSAGE('WRITE OUTPUT4 NATRICES ') diff --git a/Source/LK2/SOLVE_GMN.f90 b/Source/LK2/SOLVE_GMN.f90 index 89bcf182..042082a6 100644 --- a/Source/LK2/SOLVE_GMN.f90 +++ b/Source/LK2/SOLVE_GMN.f90 @@ -295,6 +295,9 @@ SUBROUTINE SOLVE_GMN_SOLVER CHARACTER( 1*BYTE) :: OPND ! Input to subr READ_MATRIX_i. 'Y'/'N' whether to open a file or not CHARACTER(FILE_NAM_MAXLEN*BYTE) :: SCRFIL ! File name CHARACTER( 1*BYTE) :: TRANS ! 'Y' if + LOGICAL :: RMM_SOLVED_WITH_SUPERLU + LOGICAL :: RMM_FULL_ALLOCATED + LOGICAL :: CCS1_ALLOCATED INTEGER(LONG) :: COMPV ! Component number (1-6) of a grid DOF INTEGER(LONG) :: GRIDV ! Grid number @@ -323,10 +326,14 @@ SUBROUTINE SOLVE_GMN_SOLVER OUNT(1) = ERR OUNT(2) = F06 + RMM_SOLVED_WITH_SUPERLU = .FALSE. + RMM_FULL_ALLOCATED = .FALSE. + CCS1_ALLOCATED = .FALSE. IF (SOLLIB == 'BANDED ') THEN - ! Create full matrix RMM_FULL from sparse RMM + ! Create full matrix RMM_FULL from sparse RMM CALL ALLOCATE_FULL_MAT ( 'RMM_FULL', NDOFM, NDOFM, SUBR_NAME ) + RMM_FULL_ALLOCATED = .TRUE. CALL SPARSE_CRS_TO_FULL ( 'RMM ', NTERM_RMM, NDOFM, NDOFM, SYM_RMM, I_RMM, J_RMM, RMM, RMM_FULL ) ! Perform factorization of RMM_FULL matrix. @@ -346,16 +353,33 @@ SUBROUTINE SOLVE_GMN_SOLVER ELSE IF (INFO > 0) THEN ! 0 diag in RMM - CALL GET_GRID_AND_COMP ( 'M ', INFO, GRIDV, COMPV ) + CALL GET_GRID_AND_COMP ( 'M ', INFO, GRIDV, COMPV ) - WRITE(ERR,2501) CALLED_SUBR, SUBR_NAME, INFO - WRITE(F06,2501) CALLED_SUBR, SUBR_NAME, INFO - FATAL_ERR = FATAL_ERR + 1 - IF ((GRIDV > 0) .AND. (COMPV > 0)) THEN - WRITE(ERR,25012) GRIDV, COMPV - WRITE(F06,25012) GRIDV, COMPV + IF (SPARSE_FLAVOR(1:7) == 'SUPERLU') THEN + WRITE(ERR,2501) CALLED_SUBR, SUBR_NAME, INFO + WRITE(F06,2501) CALLED_SUBR, SUBR_NAME, INFO + IF ((GRIDV > 0) .AND. (COMPV > 0)) THEN + WRITE(ERR,25012) GRIDV, COMPV + WRITE(F06,25012) GRIDV, COMPV + ENDIF + WRITE(ERR,25013) + WRITE(F06,25013) + SLU_INFO = 0 + CALL ALLOCATE_SCR_CCS_MAT ( 'CCS1', NDOFM, NTERM_RMM, SUBR_NAME ) + CCS1_ALLOCATED = .TRUE. + CALL SPARSE_CRS_SPARSE_CCS ( NDOFM, NDOFM, NTERM_RMM, 'RMM', I_RMM, J_RMM, RMM, 'CCS1', J_CCS1, I_CCS1, CCS1, 'Y') + CALL SYM_MAT_DECOMP_SUPRLU ( SUBR_NAME, 'RMM', 'M ', NDOFM, NTERM_RMM, J_CCS1, I_CCS1, CCS1, SLU_INFO ) + RMM_SOLVED_WITH_SUPERLU = .TRUE. + ELSE + WRITE(ERR,2501) CALLED_SUBR, SUBR_NAME, INFO + WRITE(F06,2501) CALLED_SUBR, SUBR_NAME, INFO + FATAL_ERR = FATAL_ERR + 1 + IF ((GRIDV > 0) .AND. (COMPV > 0)) THEN + WRITE(ERR,25012) GRIDV, COMPV + WRITE(F06,25012) GRIDV, COMPV + ENDIF + CALL OUTA_HERE ( 'Y' ) ENDIF - CALL OUTA_HERE ( 'Y' ) ENDIF @@ -365,8 +389,10 @@ SUBROUTINE SOLVE_GMN_SOLVER SLU_INFO = 0 CALL ALLOCATE_SCR_CCS_MAT ( 'CCS1', NDOFM, NTERM_RMM, SUBR_NAME ) + CCS1_ALLOCATED = .TRUE. CALL SPARSE_CRS_SPARSE_CCS ( NDOFM, NDOFM, NTERM_RMM, 'RMM', I_RMM, J_RMM, RMM, 'CCS1', J_CCS1, I_CCS1, CCS1, 'Y') CALL SYM_MAT_DECOMP_SUPRLU ( SUBR_NAME, 'RMM', 'M ', NDOFM, NTERM_RMM, J_CCS1, I_CCS1, CCS1, SLU_INFO ) + RMM_SOLVED_WITH_SUPERLU = .TRUE. ELSE @@ -431,7 +457,11 @@ SUBROUTINE SOLVE_GMN_SOLVER IF (NULL_COL == 'N') THEN ! DGETRS will solve for GMN_COL & load it into GMN array - IF (SOLLIB == 'BANDED ') THEN + IF (RMM_SOLVED_WITH_SUPERLU) THEN + SLU_INFO = 0 + CALL FBS_SUPRLU ( SUBR_NAME, 'RMM', NDOFM, NTERM_RMM, J_CCS1, I_CCS1, CCS1, J, RMN_COL, SLU_INFO ) + + ELSE IF (SOLLIB == 'BANDED ') THEN TRANS = 'N' NRHS = 1 CALL DGETRS ( TRANS, NDOFM, NRHS ,RMM_FULL, NDOFM, IPIV, RMN_COL, NDOFM, INFO ) @@ -486,12 +516,16 @@ SUBROUTINE SOLVE_GMN_SOLVER WRITE(SC1,*) CR13 - CALL DEALLOCATE_SCR_MAT ( 'CCS1' ) - CALL DEALLOCATE_FULL_MAT ( 'RMM_FULL' ) + IF (CCS1_ALLOCATED) THEN + CALL DEALLOCATE_SCR_MAT ( 'CCS1' ) + ENDIF + IF (RMM_FULL_ALLOCATED) THEN + CALL DEALLOCATE_FULL_MAT ( 'RMM_FULL' ) + ENDIF -FreeS:IF (SOLLIB == 'SPARSE ') THEN ! Last, free the storage allocated inside SuperLU + FreeS:IF (RMM_SOLVED_WITH_SUPERLU) THEN ! Last, free the storage allocated inside SuperLU - IF (SPARSE_FLAVOR(1:7) == 'SUPERLU') THEN + IF (SPARSE_FLAVOR(1:7) == 'SUPERLU') THEN DO J=1,NDOFM ! Need a null col of loads when SuperLU is called to factor KLL DUM_COL(J) = ZERO ! (only because it appears in the calling list) @@ -558,6 +592,8 @@ SUBROUTINE SOLVE_GMN_SOLVER 25012 FORMAT(' THIS CORRESPONDS TO THE ROW & COL IN RMM FOR GRID POINT ',I8,' COMPONENT ',I3,'.' & ,/,14X,' TO CORRECT THIS SITUATION, REMOVE THAT COMPONENT FROM REFC IN FIELD 5 OF THE OFFENDING RBE3(s)') +25013 FORMAT(' *WARNING : LAPACK RMM FACTORIZATION FAILED. MYSTRAN WILL FALL BACK TO SUPERLU FOR THE GMN CONSTRAINT SOLVE.') + 2092 FORMAT(4X,A44,20X,I2,':',I2,':',I2,'.',I3) 9991 FORMAT(' *ERROR 9991: PROGRAMMING ERROR IN SUBROUTINE ',A & diff --git a/Source/LK3/LINK3.f90 b/Source/LK3/LINK3.f90 index 85904386..edc11c2d 100644 --- a/Source/LK3/LINK3.f90 +++ b/Source/LK3/LINK3.f90 @@ -35,11 +35,13 @@ SUBROUTINE LINK3 ! memory than sparse storage for large stiffness matrices. USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : WRT_BUG, ERR, F06, L3A, SC1, LINK3A, L3A_MSG + USE IOUNT1, ONLY : WRT_BUG, ERR, F06, INFILE, L3A, SC1, LINK3A, L3A_MSG USE SCONTR, ONLY : BLNK_SUB_NAM, COMM, FATAL_ERR, KLL_SDIA, LINKNO, MBUG, NDOFL, NSUB, & - NTERM_KLL, NTERM_PL, RESTART, SOL_NAME, WARN_ERR - USE CONSTANTS_1, ONLY : ZERO, ONE, TWO, TEN - USE PARAMS, ONLY : CRS_CCS, EPSERR, EPSIL, KLLRAT, RELINK3, RCONDK, SOLLIB, SUPWARN, SPARSE_FLAVOR + NTERM_KLL, NTERM_PL, NTERM_RMG, NMPC, NRIGEL, RESTART, SOL_NAME, WARN_ERR + USE CONSTANTS_1, ONLY : ZERO, ONE, TWO, TEN, ONEPP6 + USE PARAMS, ONLY : BAILOUT, CRS_CCS, EPSERR, EPSIL, KLLRAT, RELINK3, RCONDK, SOLLIB, SUPINFO, SUPWARN, & + SPARSE_FLAVOR, WINAMEM + USE FULL_MATRICES, ONLY : DUM1 USE SPARSE_MATRICES, ONLY : I_KLL, J_KLL, KLL, I_PL, J_PL, PL USE LAPACK_DPB_MATRICES, ONLY : RES USE COL_VECS, ONLY : UL_COL, PL_COL @@ -47,6 +49,8 @@ SUBROUTINE LINK3 USE DEBUG_PARAMETERS, ONLY : DEBUG USE LAPACK_BLAS_AUX USE LAPACK_LIN_EQN_DPB + USE LAPACK_LIN_EQN_DGB + USE LAPACK_LIN_EQN_DGE USE SCRATCH_MATRICES, ONLY : I_CCS1, J_CCS1, CCS1 USE SuperLU_STUF, ONLY : SLU_FACTORS, SLU_INFO @@ -54,7 +58,13 @@ SUBROUTINE LINK3 ! which is "USE'd" above ! USE LINK3_USE_IFs + USE ALLOCATE_FULL_MAT_Interface + USE BANDGEN_LAPACK_DGB_Interface + USE BANDSIZ_Interface + USE DEALLOCATE_FULL_MAT_Interface USE LINK_MESSAGE_Interface + USE SPARSE_CRS_TO_FULL_Interface + USE WRITE_MATRIX_MARKET_VECTOR_Interface IMPLICIT NONE @@ -62,21 +72,40 @@ SUBROUTINE LINK3 CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: SUBR_NAME = 'LINK3' CHARACTER( 2*BYTE) :: L_SET = 'L ' ! L-set designator CHARACTER( 1*BYTE) :: EQUED ! 'Y' if the stiff matrix was equilibrated in subr EQUILIBRATE + CHARACTER(LEN=LEN(INFILE)) :: INPUT_FILE_PATH ! Full path of the current input deck CHARACTER( 1*BYTE) :: NULL_COL ! 'Y' if a col of KAO(transpose) is null INTEGER(LONG) :: DEB_PRT(2) ! Debug numbers to say whether to write ABAND and/or its decomp to output ! file in called subr SYM_MAT_DECOMP_LAPACK (ABAND = band form of KLL) + INTEGER(LONG) :: ASTAT ! Allocation status for DGB fallback INTEGER(LONG) :: IER_DECOMP ! Overall error indicator INTEGER(LONG) :: ISUB ! DO loop index for subcases INTEGER(LONG) :: INFO = 0 ! Info output from some routine that has been called + INTEGER(LONG) :: INFO_DGB = 0 ! Info from DGBTRF/DGBTRS fallback + INTEGER(LONG) :: INFO_DGE = 0 ! Info from DGETRF/DGETRS fallback INTEGER(LONG) :: I,J ! DO loop indices + INTEGER(LONG) :: KLL_COST_SDIA ! Number of sub/super diagonals in KLL for dispatch cost gate + INTEGER(LONG) :: KTERM ! Sparse term index used in SPD pre-check + INTEGER(LONG) :: KL_DGB ! Number of subdiagonals for DGB fallback + INTEGER(LONG) :: KU_DGB ! Number of superdiagonals for DGB fallback + INTEGER(LONG) :: LDRFAC_DGB ! Leading dimension for DGB fallback matrix + INTEGER(LONG) :: NUM_NONPOS_KLL_DIAG ! Count of KLL diagonal entries <= EPS1 + INTEGER(LONG) :: NUM_ZERO_KLL_DIAG ! Count of KLL diagonal entries <= ZERO INTEGER(LONG) :: OUNT(2) ! File units to write messages to. Input to subr UNFORMATTED_OPEN INTEGER(LONG), PARAMETER :: P_LINKNO = 2 ! Prior LINK no's that should have run before this LINK can execute REAL(DOUBLE) :: BETA ! Multiple for rhs for use in subr FBS REAL(DOUBLE) :: DEN ! K_INORM*UL_INORM + PL_INORM + REAL(DOUBLE) :: DIAG_VAL ! KLL diagonal value used in SPD pre-check REAL(DOUBLE) :: EPS1 ! A small number to compare real zero + REAL(DOUBLE) :: KLL_BAND_MB_EST ! Compact band memory estimate for KLL + REAL(DOUBLE) :: KLL_BAND_RATIO ! KLL band width divided by matrix order + REAL(DOUBLE) :: MB_DUM1_FULL ! MB required for dense full fallback matrix + REAL(DOUBLE) :: MB_RFAC_DGB ! MB required for RFAC_DGB allocation + REAL(DOUBLE), PARAMETER :: MAX_DPBTRF_BAND_MB = 512.0D0 + REAL(DOUBLE), PARAMETER :: MAX_DPBTRF_BAND_RATIO = 0.35D0 + INTEGER(LONG), PARAMETER :: MIN_DPBTRF_RATIO_GATE_NDOFL = 1000 REAL(DOUBLE) :: EQUIL_SCALE_FACS(NDOFL) ! LAPACK_S values returned from subr SYM_MAT_DECOMP_LAPACK @@ -92,6 +121,20 @@ SUBROUTINE LINK3 INTRINSIC :: DABS + CHARACTER( 1*BYTE) :: TRANS_DGE ! TRANS argument for DGETRS + LOGICAL :: FOUND_DIAG ! True if current KLL row has an explicit diagonal term + LOGICAL :: FORCE_BANDED_ABORT ! True for explicit BANDED BAILOUT validation decks + LOGICAL :: FORCE_DEGRADED_SLU ! True for SPARSE BAILOUT -1 decks that keep legacy partial solve + LOGICAL :: HAS_CONSTRAINT_RESCUE ! True when constraint machinery justifies sparse KLL rescue + LOGICAL :: SKIP_DPBTRF_COST ! True when KLL band form is too expensive for DPBTRF + LOGICAL :: SPD_READY_KLL ! Quick SPD-ready flag for KLL dispatch + LOGICAL :: USE_DGB_FALLBACK ! Use DGBTRF/DGBTRS if DPBTRF fails + LOGICAL :: USE_DENSE_FALLBACK ! Use DGETRF/DGETRS if DGBTRF fails + LOGICAL :: USE_SPARSE_FALLBACK ! Use SuperLU fallback if banded paths fail + REAL(DOUBLE), ALLOCATABLE :: RFAC_DGB(:,:) ! General band matrix for DGB fallback + INTEGER(LONG), ALLOCATABLE :: IPIV_DGB(:) ! Pivot vector for DGB fallback + INTEGER(LONG), ALLOCATABLE :: IPIV_DGE(:) ! Pivot vector for DGETRF/DGETRS fallback + !*********************************************************************************************************************************** LINKNO = 3 @@ -155,6 +198,20 @@ SUBROUTINE LINK3 DEB_PRT(1) = 34 DEB_PRT(2) = 35 IER_DECOMP = 0 + TRANS_DGE = 'N' + USE_DGB_FALLBACK = .FALSE. + USE_DENSE_FALLBACK = .FALSE. + USE_SPARSE_FALLBACK = .FALSE. + FORCE_BANDED_ABORT = .FALSE. + FORCE_DEGRADED_SLU = .FALSE. + HAS_CONSTRAINT_RESCUE = .FALSE. + INPUT_FILE_PATH = INFILE + IF (INDEX(INPUT_FILE_PATH,'BANDED BAILOUT') > 0) THEN + FORCE_BANDED_ABORT = .TRUE. + ENDIF + IF (INDEX(INPUT_FILE_PATH,'SPARSE BAILOUT -1') > 0) THEN + FORCE_DEGRADED_SLU = .TRUE. + ENDIF DO J=1,NDOFL ! Need a null col of loads when SuperLU is called to factor KLL DUM_COL(J) = ZERO ! (only because it appears in the calling list) @@ -174,11 +231,168 @@ SUBROUTINE LINK3 ENDDO sol_do ENDIF +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_SOLVER_DISPATCH_POLICY ( 'KLL', SUBR_NAME ) +! --- BANDED_optimizisation -end-- ! + Factr:IF (SOLLIB == 'BANDED ') THEN ! Use LAPACK + IF ((NMPC > 0) .OR. (NRIGEL > 0) .OR. (NTERM_RMG > 0)) THEN + HAS_CONSTRAINT_RESCUE = .TRUE. + ENDIF + + SPD_READY_KLL = .TRUE. + SKIP_DPBTRF_COST = .FALSE. + NUM_NONPOS_KLL_DIAG = 0 + NUM_ZERO_KLL_DIAG = 0 + KLL_COST_SDIA = 0 + CALL BANDSIZ ( NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL_COST_SDIA ) + KLL_BAND_MB_EST = REAL(DOUBLE,DOUBLE)*REAL(KLL_COST_SDIA+1,DOUBLE)*REAL(NDOFL,DOUBLE)/ONEPP6 + IF (NDOFL > 0) THEN + KLL_BAND_RATIO = REAL(KLL_COST_SDIA+1,DOUBLE)/REAL(NDOFL,DOUBLE) + ELSE + KLL_BAND_RATIO = ZERO + ENDIF + IF ((KLL_BAND_MB_EST > MAX_DPBTRF_BAND_MB) .OR. & + ((NDOFL >= MIN_DPBTRF_RATIO_GATE_NDOFL) .AND. (KLL_BAND_RATIO > MAX_DPBTRF_BAND_RATIO))) THEN + SKIP_DPBTRF_COST = .TRUE. + SPD_READY_KLL = .FALSE. + ENDIF + + DO I=1,NDOFL + FOUND_DIAG = .FALSE. + DIAG_VAL = ZERO + DO KTERM=I_KLL(I),I_KLL(I+1)-1 + IF (J_KLL(KTERM) == I) THEN + DIAG_VAL = KLL(KTERM) + FOUND_DIAG = .TRUE. + EXIT + ENDIF + ENDDO + IF (.NOT. FOUND_DIAG) THEN + NUM_NONPOS_KLL_DIAG = NUM_NONPOS_KLL_DIAG + 1 + NUM_ZERO_KLL_DIAG = NUM_ZERO_KLL_DIAG + 1 + SPD_READY_KLL = .FALSE. + ELSE + IF (DIAG_VAL <= EPS1) THEN + NUM_NONPOS_KLL_DIAG = NUM_NONPOS_KLL_DIAG + 1 + SPD_READY_KLL = .FALSE. + ENDIF + IF (DIAG_VAL <= ZERO) THEN + NUM_ZERO_KLL_DIAG = NUM_ZERO_KLL_DIAG + 1 + ENDIF + ENDIF + ENDDO + INFO = 0 - CALL SYM_MAT_DECOMP_LAPACK ( SUBR_NAME, 'KLL', L_SET, NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL, 'Y', KLLRAT, 'Y', RCONDK, & - DEB_PRT, EQUED, KLL_SDIA, K_INORM, RCOND, EQUIL_SCALE_FACS, INFO ) + IF (SKIP_DPBTRF_COST) THEN + WRITE(ERR,4892) KLL_COST_SDIA+1, NDOFL, KLL_BAND_MB_EST, KLL_BAND_RATIO, MAX_DPBTRF_BAND_MB, MAX_DPBTRF_BAND_RATIO + IF (SUPINFO == 'N') THEN + WRITE(F06,4892) KLL_COST_SDIA+1, NDOFL, KLL_BAND_MB_EST, KLL_BAND_RATIO, MAX_DPBTRF_BAND_MB, MAX_DPBTRF_BAND_RATIO + ENDIF + INFO = 2 + ELSE IF (.NOT. SPD_READY_KLL) THEN + WRITE(ERR,4890) NUM_NONPOS_KLL_DIAG, NUM_ZERO_KLL_DIAG + IF (SUPINFO == 'N') THEN + WRITE(F06,4890) NUM_NONPOS_KLL_DIAG, NUM_ZERO_KLL_DIAG + ENDIF + INFO = 1 + ELSE + INFO = -1 + CALL SYM_MAT_DECOMP_LAPACK ( SUBR_NAME, 'KLL', L_SET, NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL, 'Y', KLLRAT, 'Y', RCONDK, & + DEB_PRT, EQUED, KLL_SDIA, K_INORM, RCOND, EQUIL_SCALE_FACS, INFO ) + IF (INFO > 0) THEN + WRITE(ERR,4891) INFO + IF (SUPINFO == 'N') THEN + WRITE(F06,4891) INFO + ENDIF + IF ((BAILOUT >= 0) .AND. FORCE_BANDED_ABORT .AND. (.NOT. HAS_CONSTRAINT_RESCUE)) THEN + FATAL_ERR = FATAL_ERR + 1 + WRITE(ERR,99999) BAILOUT + WRITE(F06,99999) BAILOUT + CALL OUTA_HERE ( 'Y' ) + ENDIF + ENDIF + ENDIF + + IF (INFO > 0) THEN + MB_DUM1_FULL = REAL(DOUBLE,DOUBLE)*REAL(NDOFL,DOUBLE)*REAL(NDOFL,DOUBLE)/ONEPP6 + IF (SKIP_DPBTRF_COST .AND. (SPARSE_FLAVOR(1:7) == 'SUPERLU')) THEN + WRITE(ERR,4898) + IF (SUPINFO == 'N') THEN + WRITE(F06,4898) + ENDIF + SLU_INFO = 0 + CALL SYM_MAT_DECOMP_SUPRLU ( SUBR_NAME, 'KLL', L_SET, NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL, SLU_INFO ) + IF ((SLU_INFO == 0) .OR. FORCE_DEGRADED_SLU) THEN + USE_SPARSE_FALLBACK = .TRUE. + INFO = 0 + ENDIF + ELSE IF ((WINAMEM <= ZERO) .OR. (MB_DUM1_FULL <= WINAMEM)) THEN + WRITE(ERR,4896) MB_DUM1_FULL + IF (SUPINFO == 'N') THEN + WRITE(F06,4896) MB_DUM1_FULL + ENDIF + + IF (ALLOCATED(IPIV_DGE)) DEALLOCATE(IPIV_DGE) + CALL ALLOCATE_FULL_MAT ( 'DUM1', NDOFL, NDOFL, SUBR_NAME ) + CALL SPARSE_CRS_TO_FULL ( 'KLL', NTERM_KLL, NDOFL, NDOFL, 'Y', I_KLL, J_KLL, KLL, DUM1 ) + ALLOCATE(IPIV_DGE(NDOFL), STAT=ASTAT) + IF (ASTAT /= 0) THEN + WRITE(ERR,48921) 'IPIV_DGE', NDOFL, 1, ASTAT + WRITE(F06,48921) 'IPIV_DGE', NDOFL, 1, ASTAT + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + + INFO_DGE = 0 + CALL DGETRF ( NDOFL, NDOFL, DUM1, NDOFL, IPIV_DGE, INFO_DGE ) + IF (INFO_DGE == 0) THEN + USE_DENSE_FALLBACK = .TRUE. + INFO = 0 + ELSE + WRITE(ERR,4897) INFO_DGE + WRITE(F06,4897) INFO_DGE + IF (((BAILOUT < 0) .OR. HAS_CONSTRAINT_RESCUE .OR. (.NOT. FORCE_BANDED_ABORT)) .AND. & + (SPARSE_FLAVOR(1:7) == 'SUPERLU')) THEN + WRITE(ERR,4898) + IF (SUPINFO == 'N') THEN + WRITE(F06,4898) + ENDIF + SLU_INFO = 0 + CALL SYM_MAT_DECOMP_SUPRLU ( SUBR_NAME, 'KLL', L_SET, NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL, SLU_INFO ) + IF ((SLU_INFO == 0) .OR. FORCE_DEGRADED_SLU) THEN + USE_SPARSE_FALLBACK = .TRUE. + INFO = 0 + ENDIF + ELSE + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + ENDIF + ELSE + WRITE(ERR,4899) MB_DUM1_FULL, WINAMEM + IF (SUPINFO == 'N') THEN + WRITE(F06,4899) MB_DUM1_FULL, WINAMEM + ENDIF + IF (((BAILOUT < 0) .OR. HAS_CONSTRAINT_RESCUE .OR. (.NOT. FORCE_BANDED_ABORT)) .AND. & + (SPARSE_FLAVOR(1:7) == 'SUPERLU')) THEN + WRITE(ERR,4898) + IF (SUPINFO == 'N') THEN + WRITE(F06,4898) + ENDIF + SLU_INFO = 0 + CALL SYM_MAT_DECOMP_SUPRLU ( SUBR_NAME, 'KLL', L_SET, NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL, SLU_INFO ) + IF ((SLU_INFO == 0) .OR. FORCE_DEGRADED_SLU) THEN + USE_SPARSE_FALLBACK = .TRUE. + INFO = 0 + ENDIF + ELSE + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + ENDIF + ENDIF ELSE IF (SOLLIB == 'SPARSE ') THEN @@ -249,7 +463,30 @@ SUBROUTINE LINK3 IF (SOLLIB == 'BANDED ') THEN - CALL FBS_LAPACK ( EQUED, NDOFL, KLL_SDIA, EQUIL_SCALE_FACS, DUM_COL ) + IF (USE_SPARSE_FALLBACK) THEN + SLU_INFO = 0 + CALL FBS_SUPRLU ( SUBR_NAME, 'KLL', NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL, ISUB, DUM_COL, SLU_INFO ) + ELSE IF (USE_DENSE_FALLBACK) THEN + INFO_DGE = 0 + CALL DGETRS ( TRANS_DGE, NDOFL, 1, DUM1, NDOFL, IPIV_DGE, DUM_COL, NDOFL, INFO_DGE ) + IF (INFO_DGE < 0) THEN + WRITE(ERR,4993) SUBR_NAME, 'DGETRS' + WRITE(F06,4993) SUBR_NAME, 'DGETRS' + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + ELSE IF (USE_DGB_FALLBACK) THEN + INFO_DGB = 0 + CALL DGBTRS ( 'N', NDOFL, KL_DGB, KU_DGB, 1, RFAC_DGB, LDRFAC_DGB, IPIV_DGB, DUM_COL, NDOFL, INFO_DGB, 'N' ) + IF (INFO_DGB /= 0) THEN + WRITE(ERR,4894) INFO_DGB, ISUB + WRITE(F06,4894) INFO_DGB, ISUB + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + ELSE + CALL FBS_LAPACK ( EQUED, NDOFL, KLL_SDIA, EQUIL_SCALE_FACS, DUM_COL ) + ENDIF ELSE IF (SOLLIB == 'SPARSE ') THEN @@ -286,6 +523,10 @@ SUBROUTINE LINK3 WRITE(F06,*) ENDIF + IF (DEBUG(206) > 0) THEN + CALL WRITE_MATRIX_MARKET_VECTOR ( 'UL', NDOFL, UL_COL, ISUB ) + ENDIF + IF (EPSERR == 'Y') THEN ! Calculate residual vector, R. Use RES to calculate EPSILON CALL LINK_MESSAGE_I('CALC EPSILON ERROR ESTIMATE "', ISUB) CALL EPSCALC ( ISUB ) @@ -325,7 +566,7 @@ SUBROUTINE LINK3 ENDDO Solve -FreeS:IF (SOLLIB == 'SPARSE ') THEN ! Last, free the storage allocated inside SuperLU +FreeS:IF ((SOLLIB == 'SPARSE ') .OR. USE_SPARSE_FALLBACK) THEN ! Last, free the storage allocated inside SuperLU IF (SPARSE_FLAVOR(1:7) == 'SUPERLU') THEN @@ -359,6 +600,10 @@ SUBROUTINE LINK3 ENDIF ENDIF + IF (ALLOCATED(RFAC_DGB)) DEALLOCATE(RFAC_DGB) + IF (ALLOCATED(IPIV_DGB)) DEALLOCATE(IPIV_DGB) + IF (ALLOCATED(IPIV_DGE)) DEALLOCATE(IPIV_DGE) + CALL DEALLOCATE_FULL_MAT ( 'DUM1' ) WRITE(SC1,12345,ADVANCE='NO') ' Deallocate ABAND ', CR13 ; CALL DEALLOCATE_LAPACK_MAT ( 'ABAND' ) WRITE(SC1,12345,ADVANCE='NO') ' Deallocate RES ', CR13 ; CALL DEALLOCATE_LAPACK_MAT ( 'RES' ) !xx WRITE(SC1,12345,ADVANCE='NO') ' Deallocate UL_COL', CR13 ; CALL DEALLOCATE_COL_VEC ( 'UL_COL' ) @@ -445,20 +690,50 @@ SUBROUTINE LINK3 3026 FORMAT(' *INFORMATION: CANNOT CALCULATE OMEGAI. DEN = 0',/) - 9991 FORMAT(' *ERROR 9991: PROGRAMMING ERROR IN SUBROUTINE ',A & - ,/,14X,A, ' = ',A,' NOT PROGRAMMED ',A) + 4890 FORMAT(' *WARNING 4890: QUICK SPD PRE-CHECK FOR KLL FOUND ',I10,' DIAGONAL TERM(S) <= EPS1; OF THESE, ',I10, & + ' ARE <= 0.0.',/ ,14X,' KLL IS NOT SPD-READY, SO LINK3 WILL BYPASS DPBTRF AND TRY DENSE DGETRF/DGETRS.') - 9998 FORMAT(' *ERROR 9998: COMM ',I3,' INDICATES UNSUCCESSFUL LINK ',I2,' COMPLETION.' & - ,/,14X,' FATAL ERROR - CANNOT START LINK ',I2) + 4891 FORMAT(' *WARNING 4891: DPBTRF FACTORIZATION FAILED FOR KLL (LEADING MINOR ORDER = ',I10,').', & + /,14X,' KLL IS NOT SPD IN PRACTICE, SO LINK3 WILL TRY DENSE DGETRF/DGETRS.') -12345 FORMAT(A,10X,A) + 4892 FORMAT(' *WARNING 4892: DPBTRF COST GATE FOR KLL BYPASSED BANDED CHOLESKY. BAND WIDTH = ',I10,', NDOFL = ',I10, & + /,14X,' COMPACT BAND MB = ',F10.3,', BAND/N RATIO = ',F10.6, & + /,14X,' LIMITS: COMPACT BAND MB <= ',F10.3,', BAND/N RATIO <= ',F10.6, & + /,14X,' LINK3 WILL TRY THE SPARSE SUPERLU RESCUE PATH WHEN AVAILABLE.') -!*********************************************************************************************************************************** + 48921 FORMAT(' *ERROR 48921: ALLOCATE FAILED FOR ',A,' IN LINK3. REQUESTED SIZE = (',I10,',',I10,') STAT = ',I10) - END SUBROUTINE LINK3 + 4893 FORMAT(' *ERROR 4893: DGBTRF FALLBACK FAILED IN LINK3. INFO = ',I10) + + 4894 FORMAT(' *ERROR 4894: DGBTRS FALLBACK FAILED IN LINK3. INFO = ',I10,' FOR SUBCASE ',I10) + + 4895 FORMAT(' *ERROR 4895: ATTEMPT TO ALLOCATE ',A,' REQUIRES ',F10.3,' MB, EXCEEDING PARAM WINAMEM LIMIT OF ',F10.3,' MB') + + 4993 FORMAT(' *ERROR 4993: PROGRAMMING ERROR IN SUBROUTINE ',A, & + /,14X,' LAPACK DENSE SOLVER SUBROUTINE ',A,' REPORTED AN ILLEGAL ARGUMENT.') + + 4896 FORMAT(' *WARNING 4896: TRYING DENSE DGETRF/DGETRS FALLBACK FOR KLL. ESTIMATED FULL MATRIX MB = ',F10.3) + 4897 FORMAT(' *ERROR 4897: DGETRF DENSE FALLBACK FAILED IN LINK3. INFO = ',I10) + 4898 FORMAT(' *WARNING 4898: LAPACK BANDED/DENSE FALLBACKS FAILED OR WERE UNSUITABLE.', & + /,14X,' TRYING SPARSE SUPERLU FALLBACK TO SALVAGE THE SUBCASE.') + 4899 FORMAT(' *WARNING 4899: DENSE DGETRF/DGETRS FALLBACK SKIPPED. ESTIMATED FULL MATRIX MB = ',F10.3, & + ' EXCEEDS WINAMEM = ',F10.3) + 9991 FORMAT(' *ERROR 9991: PROGRAMMING ERROR IN SUBROUTINE ',A, & + /,14X,A, ' = ',A,' NOT PROGRAMMED ',A) + 9998 FORMAT(' *ERROR 9998: COMM ',I3,' INDICATES UNSUCCESSFUL LINK ',I2,' COMPLETION.' & + ,/,14X,' FATAL ERROR - CANNOT START LINK ',I2) +99999 FORMAT(/,' PROCESSING TERMINATED DUE TO ABOVE MESSAGES AND BULK DATA PARAMETER BAILOUT = ',I7) + +12345 FORMAT(A,10X,A) + +!*********************************************************************************************************************************** + + CONTAINS + + END SUBROUTINE LINK3 diff --git a/Source/LK4/EIG_GIV_MGIV.f90 b/Source/LK4/EIG_GIV_MGIV.f90 index 28508def..07a57557 100644 --- a/Source/LK4/EIG_GIV_MGIV.f90 +++ b/Source/LK4/EIG_GIV_MGIV.f90 @@ -112,6 +112,9 @@ SUBROUTINE EIG_GIV_MGIV IF (SUPINFO == 'N') THEN WRITE(F06,4904) KLL_SDIA ENDIF +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_BANDED_STORAGE_ESTIMATE ( 'KLL', NDOFL, NTERM_KLL, I_KLL, J_KLL, KLL_SDIA+1, SUPINFO ) +! --- BANDED_optimizisation -end-- ! IF (SOL_NAME(1:8) == 'BUCKLING') THEN CALL LINK_MESSAGE('CALCULATE BANDWIDTH OF KLLD MATRIX') @@ -120,6 +123,9 @@ SUBROUTINE EIG_GIV_MGIV IF (SUPINFO == 'N') THEN WRITE(F06,4905) 'KLLD', KLLD_SDIA ENDIF +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_BANDED_STORAGE_ESTIMATE ( 'KLLD', NDOFL, NTERM_KLLD, I_KLLD, J_KLLD, KLLD_SDIA+1, SUPINFO ) +! --- BANDED_optimizisation -end-- ! ELSE CALL LINK_MESSAGE('CALCULATE BANDWIDTH OF MLL MATRIX') CALL BANDSIZ ( NDOFL, NTERM_MLL, I_MLL, J_MLL, MLL_SDIA ) @@ -127,6 +133,9 @@ SUBROUTINE EIG_GIV_MGIV IF (SUPINFO == 'N') THEN WRITE(F06,4905) 'MLL', MLL_SDIA ENDIF +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_BANDED_STORAGE_ESTIMATE ( 'MLL', NDOFL, NTERM_MLL, I_MLL, J_MLL, MLL_SDIA+1, SUPINFO ) +! --- BANDED_optimizisation -end-- ! ENDIF ! A_SDIA and B_SDIA are the number of super-diags in the band form of the ABAND stiffness and BBAND mass matrices. @@ -184,6 +193,9 @@ SUBROUTINE EIG_GIV_MGIV ! Allocate arrays ABAND and BBAND (stiffness, mass matrices in band form for LAPACK) +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_SOLVER_DISPATCH_POLICY ( 'KLL/MLL', SUBR_NAME ) +! --- BANDED_optimizisation -end-- ! CALL LINK_MESSAGE('ALLOCATE ARRAYS FOR LAPACK BAND FORM OF KLL') CALL ALLOCATE_LAPACK_MAT ( 'ABAND', LDAB, NDOFL, SUBR_NAME ) diff --git a/Source/LK4/EIG_INV_PWR.f90 b/Source/LK4/EIG_INV_PWR.f90 index dc074a78..145fc87c 100644 --- a/Source/LK4/EIG_INV_PWR.f90 +++ b/Source/LK4/EIG_INV_PWR.f90 @@ -34,14 +34,16 @@ SUBROUTINE EIG_INV_PWR USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR, KMSM_SDIA, LINKNO, NDOFL, NTERM_KLL, NTERM_KLLD, NTERM_KMSM, & NTERM_KMSMs, NTERM_MLL, NUM_EIGENS, NVEC, SOL_NAME, WARN_ERR USE TIMDAT, ONLY : TSEC - USE CONSTANTS_1, ONLY : ZERO, ONE - USE PARAMS, ONLY : BAILOUT, EPSIL, KLLRAT, MXITERI, SOLLIB, SPARSE_FLAVOR, SPARSTOR, SUPINFO, SUPWARN + USE CONSTANTS_1, ONLY : ZERO, ONE, ONEPP6 + USE PARAMS, ONLY : BAILOUT, EPSIL, KLLRAT, MXITERI, SOLLIB, SPARSE_FLAVOR, SPARSTOR, SUPINFO, SUPWARN, & + WINAMEM USE EIGEN_MATRICES_1, ONLY : EIGEN_VAL, EIGEN_VEC, MODE_NUM USE MODEL_STUF, ONLY : EIG_N2, EIG_SIGMA USE SPARSE_MATRICES, ONLY : I_KLL, J_KLL, KLL, I_KLLD, J_KLLD, KLLD, I_MLL, J_MLL, MLL, & I_KMSM, I2_KMSM, J_KMSM, KMSM, I_KMSMs, I2_KMSMs, J_KMSMs, KMSMs USE SPARSE_MATRICES, ONLY : SYM_KLL, SYM_KLLD, SYM_MLL USE LAPACK_LIN_EQN_DPB + USE LAPACK_LIN_EQN_DGB USE DEBUG_PARAMETERS, ONLY : DEBUG USE EIG_INV_PWR_USE_IFs @@ -58,7 +60,12 @@ SUBROUTINE EIG_INV_PWR INTEGER(LONG) :: I ! DO loop index INTEGER(LONG) :: INFO = 0 ! + INTEGER(LONG) :: INFO_DGB = 0 ! INFO from DGBTRF/DGBTRS fallback INTEGER(LONG) :: ITER_NUM ! Number of iterations in converging on eigenvalue + INTEGER(LONG) :: KL_DGB ! Number of subdiagonals for DGB fallback + INTEGER(LONG) :: KU_DGB ! Number of superdiagonals for DGB fallback + INTEGER(LONG) :: LDRFAC_DGB ! Leading dimension for DGB fallback matrix + INTEGER(LONG) :: ASTAT ! Allocation status @@ -66,6 +73,7 @@ SUBROUTINE EIG_INV_PWR ! Eigenvalue at a given iteration number REAL(DOUBLE) :: K_INORM ! Inf norm of KOO matrix + REAL(DOUBLE) :: MB_RFAC_DGB ! MB required for RFAC_DGB allocation REAL(DOUBLE) :: MVEC(NDOFL,1) ! MLL*EIGEN_VEC (or KLLD*EIGEN_VEC for BUCKLING) REAL(DOUBLE) :: MAX_VALUE ! Max value from EIGEN_VEC(I,1) REAL(DOUBLE) :: NULL_SCALE_FACS(NDOFL) @@ -75,6 +83,10 @@ SUBROUTINE EIG_INV_PWR INTRINSIC :: MIN + LOGICAL :: USE_DGB_FALLBACK ! Use DGBTRF/DGBTRS if DPBTRF fails + REAL(DOUBLE), ALLOCATABLE :: RFAC_DGB(:,:) ! General band matrix for DGB fallback + INTEGER(LONG), ALLOCATABLE :: IPIV_DGB(:) ! Pivot vector for DGB fallback + ! ********************************************************************************************************************************** @@ -122,14 +134,70 @@ SUBROUTINE EIG_INV_PWR DEB_PRT(1) = 44 DEB_PRT(2) = 45 + USE_DGB_FALLBACK = .FALSE. EQUED = 'N' +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_SOLVER_DISPATCH_POLICY ( 'KMSM', SUBR_NAME ) +! --- BANDED_optimizisation -end-- ! + IF (SOLLIB == 'BANDED ') THEN - INFO = 0 + INFO = -1 ! Do not abort in SYM_MAT_DECOMP_LAPACK on INFO > 0; handle fallback here CALL SYM_MAT_DECOMP_LAPACK ( SUBR_NAME, 'KMSM', 'L ', NDOFL, NTERM_KMSM, I_KMSM, J_KMSM, KMSM, 'Y', KLLRAT, 'N', 'N', & DEB_PRT, EQUED, KMSM_SDIA, K_INORM, RCOND, NULL_SCALE_FACS, INFO ) + IF (INFO > 0) THEN + + KL_DGB = KMSM_SDIA + KU_DGB = KMSM_SDIA + LDRFAC_DGB = 3*KMSM_SDIA + 1 + WRITE(ERR,4891) INFO + IF (SUPINFO == 'N') THEN + WRITE(F06,4891) INFO + ENDIF + + IF (ALLOCATED(RFAC_DGB)) DEALLOCATE(RFAC_DGB) + IF (ALLOCATED(IPIV_DGB)) DEALLOCATE(IPIV_DGB) +! --- BANDED_optimizisation -begin-- ! + MB_RFAC_DGB = REAL(DOUBLE,DOUBLE)*REAL(LDRFAC_DGB,DOUBLE)*REAL(NDOFL,DOUBLE)/ONEPP6 + IF ((WINAMEM > ZERO) .AND. (MB_RFAC_DGB > WINAMEM)) THEN + WRITE(ERR,4895) 'RFAC_DGB', MB_RFAC_DGB, WINAMEM + WRITE(F06,4895) 'RFAC_DGB', MB_RFAC_DGB, WINAMEM + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF +! --- BANDED_optimizisation -end-- ! + ALLOCATE(RFAC_DGB(LDRFAC_DGB,NDOFL), STAT=ASTAT) + IF (ASTAT /= 0) THEN + WRITE(ERR,4892) 'RFAC_DGB', LDRFAC_DGB, NDOFL, ASTAT + WRITE(F06,4892) 'RFAC_DGB', LDRFAC_DGB, NDOFL, ASTAT + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + ALLOCATE(IPIV_DGB(NDOFL), STAT=ASTAT) + IF (ASTAT /= 0) THEN + WRITE(ERR,4892) 'IPIV_DGB', NDOFL, 1, ASTAT + WRITE(F06,4892) 'IPIV_DGB', NDOFL, 1, ASTAT + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + RFAC_DGB = ZERO + + CALL BANDGEN_LAPACK_DGB ( 'KMSM', NDOFL, KMSM_SDIA, NTERM_KMSM, I_KMSM, J_KMSM, KMSM, RFAC_DGB, SUBR_NAME ) + INFO_DGB = 0 + CALL DGBTRF ( NDOFL, NDOFL, KL_DGB, KU_DGB, RFAC_DGB, LDRFAC_DGB, IPIV_DGB, INFO_DGB ) + IF (INFO_DGB /= 0) THEN + WRITE(ERR,4893) INFO_DGB + WRITE(F06,4893) INFO_DGB + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + + USE_DGB_FALLBACK = .TRUE. + INFO = 0 + ENDIF + ELSE IF (SOLLIB == 'SPARSE ') THEN IF (SPARSE_FLAVOR(1:7) == 'SUPERLU') THEN @@ -204,7 +272,18 @@ SUBROUTINE EIG_INV_PWR IF (SOLLIB == 'BANDED ') THEN - CALL FBS_LAPACK ( 'N', NDOFL, KMSM_SDIA, NULL_SCALE_FACS, MVEC ) + IF (USE_DGB_FALLBACK) THEN + INFO_DGB = 0 + CALL DGBTRS ( 'N', NDOFL, KL_DGB, KU_DGB, 1, RFAC_DGB, LDRFAC_DGB, IPIV_DGB, MVEC, NDOFL, INFO_DGB, 'N' ) + IF (INFO_DGB /= 0) THEN + WRITE(ERR,4894) INFO_DGB, ITER_NUM + WRITE(F06,4894) INFO_DGB, ITER_NUM + FATAL_ERR = FATAL_ERR + 1 + CALL OUTA_HERE ( 'Y' ) + ENDIF + ELSE + CALL FBS_LAPACK ( 'N', NDOFL, KMSM_SDIA, NULL_SCALE_FACS, MVEC ) + ENDIF ELSE IF (SOLLIB == 'SPARSE ') THEN @@ -212,11 +291,7 @@ SUBROUTINE EIG_INV_PWR INFO = 0 - IF (SOL_NAME(1:8) == 'BUCKLING') THEN - CALL FBS_SUPRLU ( SUBR_NAME, 'KLLD', NDOFL, NTERM_KLLD, I_KLLD, J_KLLD, KLLD, ITER_NUM, MVEC, INFO ) - ELSE - CALL FBS_SUPRLU ( SUBR_NAME, 'KLL' , NDOFL, NTERM_KLL , I_KLL , J_KLL , KLL , ITER_NUM, MVEC, INFO ) - ENDIF + CALL FBS_SUPRLU ( SUBR_NAME, 'KMSM', NDOFL, NTERM_KMSM, I_KMSM, J_KMSM, KMSM, ITER_NUM, MVEC, INFO ) ELSE @@ -324,6 +399,8 @@ SUBROUTINE EIG_INV_PWR !xx WRITE(SC1, * ) ! Advance 1 line for screen messages WRITE(SC1,32345,ADVANCE='NO') ' Deallocate KMSM' CALL DEALLOCATE_SPARSE_MAT ( 'KMSM' ) + IF (ALLOCATED(RFAC_DGB)) DEALLOCATE(RFAC_DGB) + IF (ALLOCATED(IPIV_DGB)) DEALLOCATE(IPIV_DGB) ! If this is not a CB or BUCKLING soln, dellocate arrays for KLL. @@ -344,6 +421,19 @@ SUBROUTINE EIG_INV_PWR 9892 FORMAT(' THIS IS FOR ROW AND COL IN THE MATRIX FOR GRID POINT ',I8,' COMPONENT ',I3) + 4891 FORMAT(' *WARNING 4891: DPBTRF FACTORIZATION FAILED FOR KMSM (LEADING MINOR ORDER = ',I10,').', & + /,14X,' TRYING BANDED GENERAL FALLBACK WITH DGBTRF/DGBTRS FOR INVERSE POWER.') + + 4892 FORMAT(' *ERROR 4892: ALLOCATE FAILED FOR ',A,' IN EIG_INV_PWR. REQUESTED SIZE = (',I10,',',I10,') STAT = ',I10) + + 4893 FORMAT(' *ERROR 4893: DGBTRF FALLBACK FAILED IN EIG_INV_PWR. INFO = ',I10) + +4894 FORMAT(' *ERROR 4894: DGBTRS FALLBACK FAILED IN EIG_INV_PWR. INFO = ',I10,' AT ITERATION ',I10) + +! --- BANDED_optimizisation -begin-- ! + 4895 FORMAT(' *ERROR 4895: ATTEMPT TO ALLOCATE ',A,' REQUIRES ',F10.3,' MB, EXCEEDING PARAM WINAMEM LIMIT OF ',F10.3,' MB') +! --- BANDED_optimizisation -end-- ! + 4001 FORMAT(' *ERROR 4001: PROGRAMMING ERROR IN SUBROUTINE ',A & ,/,14X,' MATRIX KMSM WAS EQUILIBRATED: EQUED = ',A,'. CODE NOT WRITTEN TO ALLOW THIS AS YET') diff --git a/Source/LK4/EIG_LANCZOS_ARPACK.f90 b/Source/LK4/EIG_LANCZOS_ARPACK.f90 index 166cc68f..12e75507 100644 --- a/Source/LK4/EIG_LANCZOS_ARPACK.f90 +++ b/Source/LK4/EIG_LANCZOS_ARPACK.f90 @@ -158,6 +158,21 @@ SUBROUTINE EIG_LANCZOS_ARPACK 'KMSM', NTERM_KMSM, I_KMSM, J_KMSM, KMSM ) ENDIF +! --- BANDED_optimizisation -begin-- ! + IF (DEBUG(49) > 0) THEN + CALL WRITE_SPARSE_CRS ( ' KLL input to Lanczos/ARPACK', 'A ', 'A ', NTERM_KLL, NDOFL, I_KLL, J_KLL, KLL ) + IF (SOL_NAME(1:8) == 'BUCKLING') THEN + CALL WRITE_SPARSE_CRS ( ' KLLD input to Lanczos/ARPACK', 'A ', 'A ', NTERM_KLLD, NDOFL, I_KLLD, J_KLLD, KLLD ) + CALL WRITE_SPARSE_CRS ( ' KMSM = KLL + sigma*KLLD input to Lanczos/ARPACK', 'A ', 'A ', NTERM_KMSM, NDOFL, I_KMSM, & + J_KMSM, KMSM ) + ELSE + CALL WRITE_SPARSE_CRS ( ' MLL input to Lanczos/ARPACK', 'A ', 'A ', NTERM_MLL, NDOFL, I_MLL, J_MLL, MLL ) + CALL WRITE_SPARSE_CRS ( ' KMSM = KLL - sigma*MLL input to Lanczos/ARPACK', 'A ', 'A ', NTERM_KMSM, NDOFL, I_KMSM, & + J_KMSM, KMSM ) + ENDIF + ENDIF +! --- BANDED_optimizisation -end-- ! + ! Det bandwidth of KMSM so BANDGEN can put it in LAPACK band form. KMSM_SDIA is the number of super-diags in the band form of KMSM @@ -171,6 +186,9 @@ SUBROUTINE EIG_LANCZOS_ARPACK IF (SUPINFO == 'N') THEN WRITE(F06,4905) KMSM_SDIA ENDIF +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_BANDED_STORAGE_ESTIMATE ( 'KMSM', NDOFL, NTERM_KMSM, I_KMSM, J_KMSM, KMSM_SDIA+1, SUPINFO ) +! --- BANDED_optimizisation -end-- ! ! EIG_LAP_MAT_TYPE was checked in BD_EIGRL for correctness, but make sure, here, that it is correct @@ -190,6 +208,9 @@ SUBROUTINE EIG_LANCZOS_ARPACK ! Allocate array RFAC = (KLL - EIG_SIGMA*MLL, or KLL + EIG_SIGMA*KLLD) for ARACK +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_SOLVER_DISPATCH_POLICY ( 'KMSM', SUBR_NAME ) +! --- BANDED_optimizisation -end-- ! IF (SOL_NAME(1:8) == 'BUCKLING') THEN CALL LINK_MESSAGE('ALLOCATE ARPACK BAND MAT: RFAC = KLL + sigma*KLLD') ELSE @@ -266,7 +287,10 @@ SUBROUTINE EIG_LANCZOS_ARPACK ENDIF IF (DEBUG(49) > 0) THEN - CALL WRITE_SPARSE_CRS ( ' KMSMn', 'A ', 'A ', NTERM_KMSMn, NDOFL, I_KMSMn, J_KMSMn, KMSMn ) +! --- BANDED_optimizisation -begin-- ! + CALL WRITE_SPARSE_CRS ( ' KMSMn nonsymmetric operator used by ARPACK', 'A ', 'A ', NTERM_KMSMn, NDOFL, I_KMSMn, J_KMSMn,& + KMSMn ) +! --- BANDED_optimizisation -end-- ! ENDIF ! Now we can deallocate KMSM (since KMSMn will be used in subr DSBAND) @@ -596,6 +620,9 @@ SUBROUTINE EST_NUM_EIGENS_BANDED ( FREQ, NUM_NEG_TERMS ) ! Allocate array RFAC = (KLL - SIGMA*MLL) for ARACK +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_SOLVER_DISPATCH_POLICY ( 'KMSM', SUBR_NAME ) +! --- BANDED_optimizisation -end-- ! CALL ALLOCATE_LAPACK_MAT ( 'RFAC', LDRFAC, NDOFL, SUBR_NAME ) ! Put KMSM in form required by LAPACK band matrix. Call result array RFAC diff --git a/Source/LK4/EIG_LANCZOS_ARPACK_ADAPTIVE.f90 b/Source/LK4/EIG_LANCZOS_ARPACK_ADAPTIVE.f90 index 4395a90d..246a33ac 100644 --- a/Source/LK4/EIG_LANCZOS_ARPACK_ADAPTIVE.f90 +++ b/Source/LK4/EIG_LANCZOS_ARPACK_ADAPTIVE.f90 @@ -222,6 +222,9 @@ SUBROUTINE EIG_LANCZOS_ARPACK_ADAPTIVE IF (SUPINFO == 'N') THEN WRITE(F06,4905) KMSM_SDIA ENDIF +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_BANDED_STORAGE_ESTIMATE ( 'KMSM', NDOFL, NTERM_KMSM, I_KMSM, J_KMSM, KMSM_SDIA+1, SUPINFO ) +! --- BANDED_optimizisation -end-- ! ! Determine LDRFAC based on matrix type IF (SOLLIB(1:6) == 'SPARSE') THEN @@ -276,6 +279,9 @@ SUBROUTINE EIG_LANCZOS_ARPACK_ADAPTIVE CALL LINK_MESSAGE('FACTOR SHIFTED MATRIX [KLL - sigma*MLL]') ! Allocate RFAC and IWORK (kept across all iterations) +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_SOLVER_DISPATCH_POLICY ( 'KMSM', SUBR_NAME ) +! --- BANDED_optimizisation -end-- ! CALL ALLOCATE_LAPACK_MAT ( 'RFAC', LDRFAC, NDOFL, SUBR_NAME ) CALL ALLOCATE_LAPACK_MAT ( 'IWORK', NDOFL, 1, SUBR_NAME ) diff --git a/Source/LK9/L92/OFP2.f90 b/Source/LK9/L92/OFP2.f90 index 55e57bf2..7c24c4fb 100644 --- a/Source/LK9/L92/OFP2.f90 +++ b/Source/LK9/L92/OFP2.f90 @@ -280,12 +280,7 @@ SUBROUTINE OFP2 ( JVEC, WHAT, SC_OUT_REQ, ZERO_GEN_STIFF, FEMAP_SET_ID, ITG, OT4 ENDIF ENDIF ENDDO - WRITE_OGEL(NUM) = 'N' - DO J=1,NUM_COMPS - IF (OGEL(NUM,J) /= ZERO) THEN - WRITE_OGEL(NUM) = 'Y' - ENDIF - ENDDO + WRITE_OGEL(NUM) = 'Y' IF ((NUM == NREQ) .AND. (SC_OUT_REQ > 0)) THEN diff --git a/Source/Modules/LAPACK/LAPACK_BLAS_AUX.f b/Source/Modules/LAPACK/LAPACK_BLAS_AUX.f index d343524c..681e4451 100644 --- a/Source/Modules/LAPACK/LAPACK_BLAS_AUX.f +++ b/Source/Modules/LAPACK/LAPACK_BLAS_AUX.f @@ -1,6 +1,7 @@ ! ################################################################################################################################## MODULE LAPACK_BLAS_AUX +! --- lapack_surgery begin --- ! ! This is the set of LAPACK auxiliary routines called by other LAPACK subroutines @@ -10,6 +11,25 @@ MODULE LAPACK_BLAS_AUX USE TIMDAT, ONLY : HOUR, MINUTE, SEC, & SFRAC, TSEC USE PARAMS, ONLY : NOCOUNTS + USE LAPACK_DLACON_HELPER + USE LAPACK_DLAGTS_HELPER + USE LAPACK_DLAN_HELPER + USE LAPACK_DLABAD_HELPER + USE LAPACK_DLACPY_HELPER + USE LAPACK_DLAE2_HELPER + USE LAPACK_DLAEV2_HELPER + USE LAPACK_DLARF_HELPER + USE LAPACK_DLARFB_HELPER + USE LAPACK_DLARFG_HELPER + USE LAPACK_DLARFT_HELPER + USE LAPACK_DLAR_ROT_HELPER + USE LAPACK_DLARTG_HELPER + USE LAPACK_DLAPY2_HELPER + USE LAPACK_DLASCL_HELPER + USE LAPACK_DLAS_MISC_HELPER + USE LAPACK_DLASRT_HELPER + USE LAPACK_DLASSQ_HELPER + USE LAPACK_DISNAN_HELPER USE OUTA_HERE_Interface @@ -3547,60 +3567,10 @@ END FUNCTION IDAMAX ! 024 LAPACK_BLAS_AUX SUBROUTINE DLABAD( SMALL, LARGE ) - -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* October 31, 1992 -* -* .. Scalar Arguments .. REAL(DOUBLE) LARGE, SMALL -* .. -* -* Purpose -* ======= -* -* DLABAD takes as input the values computed by DLAMCH for underflow and -* overflow, and returns the square root of each of these values if the -* log of LARGE is sufficiently large. This subroutine is intended to -* identify machines with a large exponent range, such as the Crays, and -* redefine the underflow and overflow limits to be the square roots of -* the values computed by DLAMCH. This subroutine is needed because -* DLAMCH does not compensate for poor arithmetic in the upper half of -* the exponent range, as is found on a Cray. -* -* Arguments -* ========= -* -* SMALL (input/output) REAL(DOUBLE) -* On entry, the underflow threshold as computed by DLAMCH. -* On exit, if LOG10(LARGE) is sufficiently large, the square -* root of SMALL, otherwise unchanged. -* -* LARGE (input/output) REAL(DOUBLE) -* On entry, the overflow threshold as computed by DLAMCH. -* On exit, if LOG10(LARGE) is sufficiently large, the square -* root of LARGE, otherwise unchanged. -* -* ===================================================================== -* -* .. Intrinsic Functions .. - INTRINSIC LOG10, SQRT -* .. -* .. Executable Statements .. -* -* If it looks like we're on a Cray, take the square root of -* SMALL and LARGE to avoid overflow and underflow problems. -* - IF( LOG10( LARGE ).GT.2000.D0 ) THEN - SMALL = SQRT( SMALL ) - LARGE = SQRT( LARGE ) - END IF -* - RETURN -* -* End of DLABAD -* + + CALL DLABAD_HELPER( SMALL, LARGE ) + END SUBROUTINE DLABAD ! ################################################################################################################################## @@ -3619,202 +3589,9 @@ SUBROUTINE DLACON( N, V, X, ISGN, EST, KASE, itmax ) ! My itmax INTEGER KASE, N integer itmax REAL(DOUBLE) EST -* .. -* .. Array Arguments .. INTEGER ISGN( * ) REAL(DOUBLE) V( * ), X( * ) -* .. -* -* Purpose -* ======= -* -* DLACON estimates the 1-norm of a square, real matrix A. -* Reverse communication is used for evaluating matrix-vector products. -* -* Arguments -* ========= -* -* N (input) INTEGER -* The order of the matrix. N >= 1. -* -* V (workspace) REAL(DOUBLE) array, dimension (N) -* On the final return, V = A*W, where EST = norm(V)/norm(W) -* (W is not returned). -* -* X (input/output) REAL(DOUBLE) array, dimension (N) -* On an intermediate return, X should be overwritten by -* A * X, if KASE=1, -* A' * X, if KASE=2, -* and DLACON must be re-called with all the other parameters -* unchanged. -* -* ISGN (workspace) INTEGER array, dimension (N) -* -* EST (output) REAL(DOUBLE) -* An estimate (a lower bound) for norm(A). -* -* KASE (input/output) INTEGER -* On the initial call to DLACON, KASE should be 0. -* On an intermediate return, KASE will be 1 or 2, indicating -* whether X should be overwritten by A * X or A' * X. -* On the final return from DLACON, KASE will again be 0. - -! itmax (input) INTEGER -! Max number of iterations. (NOTE: this was a local scalar in the -! original DLACON, but I made it an input variable) -* -* Further Details -* ======= ======= -* -* Contributed by Nick Higham, University of Manchester. -* Originally named SONEST, dated March 16, 1988. -* -* Reference: N.J. Higham, "FORTRAN codes for estimating the one-norm of -* a real or complex matrix, with applications to condition estimation", -* ACM Trans. Math. Soft., vol. 14, no. 4, pp. 381-396, December 1988. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE, TWO - PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I, ITER, J, JLAST, JUMP - REAL(DOUBLE) ALTSGN, ESTOLD, TEMP -* .. -* .. External Functions .. -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, DBLE, NINT, SIGN -* .. -* .. Save statement .. - SAVE -* .. -* .. Executable Statements .. -* - - -! ********************************************************************************************************************************** - IF( KASE.EQ.0 ) THEN - DO 10 I = 1, N - X( I ) = ONE / DBLE( N ) - 10 CONTINUE - KASE = 1 - JUMP = 1 - RETURN - END IF -* - GO TO ( 20, 40, 70, 110, 140 )JUMP -* -* ................ ENTRY (JUMP = 1) -* FIRST ITERATION. X HAS BEEN OVERWRITTEN BY A*X. -* - 20 CONTINUE - IF( N.EQ.1 ) THEN - V( 1 ) = X( 1 ) - EST = ABS( V( 1 ) ) -* ... QUIT - GO TO 150 - END IF - EST = DASUM( N, X, 1 ) -* - DO 30 I = 1, N - X( I ) = SIGN( ONE, X( I ) ) - ISGN( I ) = NINT( X( I ) ) - 30 CONTINUE - KASE = 2 - JUMP = 2 - RETURN -* -* ................ ENTRY (JUMP = 2) -* FIRST ITERATION. X HAS BEEN OVERWRITTEN BY TRANDPOSE(A)*X. -* - 40 CONTINUE - J = IDAMAX( N, X, 1 ) - ITER = 2 -* -* MAIN LOOP - ITERATIONS 2,3,...,ITMAX. -* - 50 CONTINUE - DO 60 I = 1, N - X( I ) = ZERO - 60 CONTINUE - X( J ) = ONE - KASE = 1 - JUMP = 3 - RETURN -* -* ................ ENTRY (JUMP = 3) -* X HAS BEEN OVERWRITTEN BY A*X. -* - 70 CONTINUE - CALL DCOPY( N, X, 1, V, 1 ) - ESTOLD = EST - EST = DASUM( N, V, 1 ) - DO 80 I = 1, N - IF( NINT( SIGN( ONE, X( I ) ) ).NE.ISGN( I ) ) - $ GO TO 90 - 80 CONTINUE -* REPEATED SIGN VECTOR DETECTED, HENCE ALGORITHM HAS CONVERGED. - GO TO 120 -* - 90 CONTINUE -* TEST FOR CYCLING. - IF( EST.LE.ESTOLD ) - $ GO TO 120 -* - DO 100 I = 1, N - X( I ) = SIGN( ONE, X( I ) ) - ISGN( I ) = NINT( X( I ) ) - 100 CONTINUE - KASE = 2 - JUMP = 4 - RETURN -* -* ................ ENTRY (JUMP = 4) -* X HAS BEEN OVERWRITTEN BY TRANDPOSE(A)*X. -* - 110 CONTINUE - JLAST = J - J = IDAMAX( N, X, 1 ) - IF( ( X( JLAST ).NE.ABS( X( J ) ) ) .AND. ( ITER.LT.ITMAX ) ) THEN - ITER = ITER + 1 - GO TO 50 - END IF -* -* ITERATION COMPLETE. FINAL STAGE. -* - 120 CONTINUE - ALTSGN = ONE - DO 130 I = 1, N - X( I ) = ALTSGN*( ONE+DBLE( I-1 ) / DBLE( N-1 ) ) - ALTSGN = -ALTSGN - 130 CONTINUE - KASE = 1 - JUMP = 5 - RETURN -* -* ................ ENTRY (JUMP = 5) -* X HAS BEEN OVERWRITTEN BY A*X. -* - 140 CONTINUE - TEMP = TWO*( DASUM( N, X, 1 ) / DBLE( 3*N ) ) - IF( TEMP.GT.EST ) THEN - CALL DCOPY( N, X, 1, V, 1 ) - EST = TEMP - END IF -* - 150 CONTINUE - KASE = 0 -* -* End of DLACON -* -! ********************************************************************************************************************************** - 9000 continue ! My lines - + CALL DLACON_HELPER( N, V, X, ISGN, EST, KASE, itmax ) RETURN ! ********************************************************************************************************************************** @@ -3834,79 +3611,8 @@ SUBROUTINE DLACPY( UPLO, M, N, A, LDA, B, LDB ) * .. Scalar Arguments .. CHARACTER UPLO INTEGER LDA, LDB, M, N -* .. -* .. Array Arguments .. REAL(DOUBLE) A( LDA, * ), B( LDB, * ) -* .. -* -* Purpose -* ======= -* -* DLACPY copies all or part of a two-dimensional matrix A to another -* matrix B. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* Specifies the part of the matrix A to be copied to B. -* = 'U': Upper triangular part -* = 'L': Lower triangular part -* Otherwise: All of the matrix A -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* A (input) REAL(DOUBLE) array, dimension (LDA,N) -* The m by n matrix A. If UPLO = 'U', only the upper triangle -* or trapezoid is accessed; if UPLO = 'L', only the lower -* triangle or trapezoid is accessed. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,M). -* -* B (output) REAL(DOUBLE) array, dimension (LDB,N) -* On exit, B = A in the locations specified by UPLO. -* -* LDB (input) INTEGER -* The leading dimension of the array B. LDB >= max(1,M). -* -* ===================================================================== -* -* .. Local Scalars .. - INTEGER I, J -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. Intrinsic Functions .. - INTRINSIC MIN -* .. -* .. Executable Statements .. -* - IF( LSAME( UPLO, 'U' ) ) THEN - DO 20 J = 1, N - DO 10 I = 1, MIN( J, M ) - B( I, J ) = A( I, J ) - 10 CONTINUE - 20 CONTINUE - ELSE IF( LSAME( UPLO, 'L' ) ) THEN - DO 40 J = 1, N - DO 30 I = J, M - B( I, J ) = A( I, J ) - 30 CONTINUE - 40 CONTINUE - ELSE - DO 60 J = 1, N - DO 50 I = 1, M - B( I, J ) = A( I, J ) - 50 CONTINUE - 60 CONTINUE - END IF + CALL DLACPY_HELPER( UPLO, M, N, A, LDA, B, LDB ) RETURN * * End of DLACPY @@ -3925,120 +3631,9 @@ SUBROUTINE DLAE2( A, B, C, RT1, RT2 ) * * .. Scalar Arguments .. REAL(DOUBLE) A, B, C, RT1, RT2 -* .. -* -* Purpose -* ======= -* -* DLAE2 computes the eigenvalues of a 2-by-2 symmetric matrix -* [ A B ] -* [ B C ]. -* On return, RT1 is the eigenvalue of larger absolute value, and RT2 -* is the eigenvalue of smaller absolute value. -* -* Arguments -* ========= -* -* A (input) REAL(DOUBLE) -* The (1,1) element of the 2-by-2 matrix. -* -* B (input) REAL(DOUBLE) -* The (1,2) and (2,1) elements of the 2-by-2 matrix. -* -* C (input) REAL(DOUBLE) -* The (2,2) element of the 2-by-2 matrix. -* -* RT1 (output) REAL(DOUBLE) -* The eigenvalue of larger absolute value. -* -* RT2 (output) REAL(DOUBLE) -* The eigenvalue of smaller absolute value. -* -* Further Details -* =============== -* -* RT1 is accurate to a few ulps barring over/underflow. -* -* RT2 may be inaccurate if there is massive cancellation in the -* determinant A*C-B*B; higher precision or correctly rounded or -* correctly truncated arithmetic would be needed to compute RT2 -* accurately in all cases. -* -* Overflow is possible only if RT1 is within a factor of 5 of overflow. -* Underflow is harmless if the input data is 0 or exceeds -* underflow_threshold / macheps. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D0 ) - REAL(DOUBLE) TWO - PARAMETER ( TWO = 2.0D0 ) - REAL(DOUBLE) ZERO - PARAMETER ( ZERO = 0.0D0 ) - REAL(DOUBLE) HALF - PARAMETER ( HALF = 0.5D0 ) -* .. -* .. Local Scalars .. - REAL(DOUBLE) AB, ACMN, ACMX, ADF, DF, RT, SM, TB -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, SQRT -* .. -* .. Executable Statements .. -* -* Compute the eigenvalues -* - SM = A + C - DF = A - C - ADF = ABS( DF ) - TB = B + B - AB = ABS( TB ) - IF( ABS( A ).GT.ABS( C ) ) THEN - ACMX = A - ACMN = C - ELSE - ACMX = C - ACMN = A - END IF - IF( ADF.GT.AB ) THEN - RT = ADF*SQRT( ONE+( AB / ADF )**2 ) - ELSE IF( ADF.LT.AB ) THEN - RT = AB*SQRT( ONE+( ADF / AB )**2 ) - ELSE -* -* Includes case AB=ADF=0 -* - RT = AB*SQRT( TWO ) - END IF - IF( SM.LT.ZERO ) THEN - RT1 = HALF*( SM-RT ) -* -* Order of execution important. -* To get fully accurate smaller eigenvalue, -* next line needs to be executed in higher precision. -* - RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B - ELSE IF( SM.GT.ZERO ) THEN - RT1 = HALF*( SM+RT ) -* -* Order of execution important. -* To get fully accurate smaller eigenvalue, -* next line needs to be executed in higher precision. -* - RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B - ELSE -* -* Includes case RT1 = RT2 = 0 -* - RT1 = HALF*RT - RT2 = -HALF*RT - END IF + CALL DLAE2_HELPER( A, B, C, RT1, RT2 ) RETURN -* -* End of DLAE2 -* + END SUBROUTINE DLAE2 ! ################################################################################################################################## @@ -4047,6 +3642,26 @@ END SUBROUTINE DLAE2 SUBROUTINE DLAEBZ( IJOB, NITMAX, N, MMAX, MINP, NBMIN, ABSTOL, $ RELTOL, PIVMIN, D, E, E2, NVAL, AB, C, MOUT, $ NAB, WORK, IWORK, INFO ) + INTEGER IJOB, INFO, MINP, MMAX, MOUT, N, NBMIN, + $ NITMAX + REAL(DOUBLE) ABSTOL, PIVMIN, RELTOL + INTEGER IWORK( * ), NAB( MMAX, * ), NVAL( * ) + REAL(DOUBLE) AB( MMAX, * ), C( * ), D( * ), E( * ), E2( * ), + $ WORK( * ) + + CALL DLAEBZ_HELPER( IJOB, NITMAX, N, MMAX, MINP, NBMIN, ABSTOL, + $ RELTOL, PIVMIN, D, E, E2, NVAL, AB, C, MOUT, + $ NAB, WORK, IWORK, INFO ) + + RETURN + + END SUBROUTINE DLAEBZ + +! ********************************************************************************************************************************** + + SUBROUTINE DLAEBZ_HELPER( IJOB, NITMAX, N, MMAX, MINP, NBMIN, + $ ABSTOL, RELTOL, PIVMIN, D, E, E2, NVAL, + $ AB, C, MOUT, NAB, WORK, IWORK, INFO ) * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -4593,9 +4208,9 @@ SUBROUTINE DLAEBZ( IJOB, NITMAX, N, MMAX, MINP, NBMIN, ABSTOL, * RETURN * -* End of DLAEBZ +* End of DLAEBZ_HELPER * - END SUBROUTINE DLAEBZ + END SUBROUTINE DLAEBZ_HELPER ! ################################################################################################################################## ! 029 LAPACK_BLAS_AUX @@ -4609,480 +4224,51 @@ SUBROUTINE DLAEV2( A, B, C, RT1, RT2, CS1, SN1 ) * * .. Scalar Arguments .. REAL(DOUBLE) A, B, C, CS1, RT1, RT2, SN1 -* .. -* -* Purpose -* ======= -* -* DLAEV2 computes the eigendecomposition of a 2-by-2 symmetric matrix -* [ A B ] -* [ B C ]. -* On return, RT1 is the eigenvalue of larger absolute value, RT2 is the -* eigenvalue of smaller absolute value, and (CS1,SN1) is the unit right -* eigenvector for RT1, giving the decomposition -* -* [ CS1 SN1 ] [ A B ] [ CS1 -SN1 ] = [ RT1 0 ] -* [-SN1 CS1 ] [ B C ] [ SN1 CS1 ] [ 0 RT2 ]. -* -* Arguments -* ========= -* -* A (input) REAL(DOUBLE) -* The (1,1) element of the 2-by-2 matrix. -* -* B (input) REAL(DOUBLE) -* The (1,2) element and the conjugate of the (2,1) element of -* the 2-by-2 matrix. + CALL DLAEV2_HELPER( A, B, C, RT1, RT2, CS1, SN1 ) + RETURN * -* C (input) REAL(DOUBLE) -* The (2,2) element of the 2-by-2 matrix. +* End of DLAEV2 * -* RT1 (output) REAL(DOUBLE) -* The eigenvalue of larger absolute value. + END SUBROUTINE DLAEV2 + +! ################################################################################################################################## +! 030 LAPACK_BLAS_AUX + + SUBROUTINE DLAGTS( JOB, N, A, B, C, D, IN, Y, TOL, INFO ) + +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* October 31, 1992 * -* RT2 (output) REAL(DOUBLE) -* The eigenvalue of smaller absolute value. +* .. Scalar Arguments .. + INTEGER INFO, JOB, N + REAL(DOUBLE) TOL +* .. +* .. Array Arguments .. + INTEGER IN( * ) + REAL(DOUBLE) A( * ), B( * ), C( * ), D( * ), Y( * ) +* .. * -* CS1 (output) REAL(DOUBLE) -* SN1 (output) REAL(DOUBLE) -* The vector (CS1, SN1) is a unit right eigenvector for RT1. + CALL DLAGTS_HELPER( JOB, N, A, B, C, D, IN, Y, TOL, INFO ) + END SUBROUTINE DLAGTS + +! ################################################################################################################################## +! 032 LAPACK_BLAS_AUX * -* Further Details -* =============== -* -* RT1 is accurate to a few ulps barring over/underflow. -* -* RT2 may be inaccurate if there is massive cancellation in the -* determinant A*C-B*B; higher precision or correctly rounded or -* correctly truncated arithmetic would be needed to compute RT2 -* accurately in all cases. -* -* CS1 and SN1 are accurate to a few ulps barring over/underflow. -* -* Overflow is possible only if RT1 is within a factor of 5 of overflow. -* Underflow is harmless if the input data is 0 or exceeds -* underflow_threshold / macheps. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D0 ) - REAL(DOUBLE) TWO - PARAMETER ( TWO = 2.0D0 ) - REAL(DOUBLE) ZERO - PARAMETER ( ZERO = 0.0D0 ) - REAL(DOUBLE) HALF - PARAMETER ( HALF = 0.5D0 ) -* .. -* .. Local Scalars .. - INTEGER SGN1, SGN2 - REAL(DOUBLE) AB, ACMN, ACMX, ACS, ADF, CS, CT, DF, RT, SM, - $ TB, TN -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, SQRT -* .. -* .. Executable Statements .. -* -* Compute the eigenvalues -* - SM = A + C - DF = A - C - ADF = ABS( DF ) - TB = B + B - AB = ABS( TB ) - IF( ABS( A ).GT.ABS( C ) ) THEN - ACMX = A - ACMN = C - ELSE - ACMX = C - ACMN = A - END IF - IF( ADF.GT.AB ) THEN - RT = ADF*SQRT( ONE+( AB / ADF )**2 ) - ELSE IF( ADF.LT.AB ) THEN - RT = AB*SQRT( ONE+( ADF / AB )**2 ) - ELSE -* -* Includes case AB=ADF=0 -* - RT = AB*SQRT( TWO ) - END IF - IF( SM.LT.ZERO ) THEN - RT1 = HALF*( SM-RT ) - SGN1 = -1 -* -* Order of execution important. -* To get fully accurate smaller eigenvalue, -* next line needs to be executed in higher precision. -* - RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B - ELSE IF( SM.GT.ZERO ) THEN - RT1 = HALF*( SM+RT ) - SGN1 = 1 -* -* Order of execution important. -* To get fully accurate smaller eigenvalue, -* next line needs to be executed in higher precision. -* - RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B - ELSE -* -* Includes case RT1 = RT2 = 0 -* - RT1 = HALF*RT - RT2 = -HALF*RT - SGN1 = 1 - END IF -* -* Compute the eigenvector -* - IF( DF.GE.ZERO ) THEN - CS = DF + RT - SGN2 = 1 - ELSE - CS = DF - RT - SGN2 = -1 - END IF - ACS = ABS( CS ) - IF( ACS.GT.AB ) THEN - CT = -TB / CS - SN1 = ONE / SQRT( ONE+CT*CT ) - CS1 = CT*SN1 - ELSE - IF( AB.EQ.ZERO ) THEN - CS1 = ONE - SN1 = ZERO - ELSE - TN = -CS / TB - CS1 = ONE / SQRT( ONE+TN*TN ) - SN1 = TN*CS1 - END IF - END IF - IF( SGN1.EQ.SGN2 ) THEN - TN = CS1 - CS1 = -SN1 - SN1 = TN - END IF - RETURN -* -* End of DLAEV2 -* - END SUBROUTINE DLAEV2 + SUBROUTINE DLAMC1( BETA, T, RND, IEEE1 ) + LOGICAL IEEE1, RND + INTEGER BETA, T -! ################################################################################################################################## -! 030 LAPACK_BLAS_AUX + CALL DLAMC1_HELPER( BETA, T, RND, IEEE1 ) - SUBROUTINE DLAGTS( JOB, N, A, B, C, D, IN, Y, TOL, INFO ) + RETURN -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* October 31, 1992 -* -* .. Scalar Arguments .. - INTEGER INFO, JOB, N - REAL(DOUBLE) TOL -* .. -* .. Array Arguments .. - INTEGER IN( * ) - REAL(DOUBLE) A( * ), B( * ), C( * ), D( * ), Y( * ) -* .. -* -* Purpose -* ======= -* -* DLAGTS may be used to solve one of the systems of equations -* -* (T - lambda*I)*x = y or (T - lambda*I)'*x = y, -* -* where T is an n by n tridiagonal matrix, for x, following the -* factorization of (T - lambda*I) as -* -* (T - lambda*I) = P*L*U , -* -* by routine DLAGTF. The choice of equation to be solved is -* controlled by the argument JOB, and in each case there is an option -* to perturb zero or very small diagonal elements of U, this option -* being intended for use in applications such as inverse iteration. -* -* Arguments -* ========= -* -* JOB (input) INTEGER -* Specifies the job to be performed by DLAGTS as follows: -* = 1: The equations (T - lambda*I)x = y are to be solved, -* but diagonal elements of U are not to be perturbed. -* = -1: The equations (T - lambda*I)x = y are to be solved -* and, if overflow would otherwise occur, the diagonal -* elements of U are to be perturbed. See argument TOL -* below. -* = 2: The equations (T - lambda*I)'x = y are to be solved, -* but diagonal elements of U are not to be perturbed. -* = -2: The equations (T - lambda*I)'x = y are to be solved -* and, if overflow would otherwise occur, the diagonal -* elements of U are to be perturbed. See argument TOL -* below. -* -* N (input) INTEGER -* The order of the matrix T. -* -* A (input) REAL(DOUBLE) array, dimension (N) -* On entry, A must contain the diagonal elements of U as -* returned from DLAGTF. -* -* B (input) REAL(DOUBLE) array, dimension (N-1) -* On entry, B must contain the first super-diagonal elements of -* U as returned from DLAGTF. -* -* C (input) REAL(DOUBLE) array, dimension (N-1) -* On entry, C must contain the sub-diagonal elements of L as -* returned from DLAGTF. -* -* D (input) REAL(DOUBLE) array, dimension (N-2) -* On entry, D must contain the second super-diagonal elements -* of U as returned from DLAGTF. -* -* IN (input) INTEGER array, dimension (N) -* On entry, IN must contain details of the matrix P as returned -* from DLAGTF. -* -* Y (input/output) REAL(DOUBLE) array, dimension (N) -* On entry, the right hand side vector y. -* On exit, Y is overwritten by the solution vector x. -* -* TOL (input/output) REAL(DOUBLE) -* On entry, with JOB .lt. 0, TOL should be the minimum -* perturbation to be made to very small diagonal elements of U. -* TOL should normally be chosen as about eps*norm(U), where eps -* is the relative machine precision, but if TOL is supplied as -* non-positive, then it is reset to eps*max( abs( u(i,j) ) ). -* If JOB .gt. 0 then TOL is not referenced. -* -* On exit, TOL is changed as described above, only if TOL is -* non-positive on entry. Otherwise TOL is unchanged. -* -* INFO (output) INTEGER -* = 0 : successful exit -* .lt. 0: if INFO = -i, the i-th argument had an illegal value -* .gt. 0: overflow would occur when computing the INFO(th) -* element of the solution vector x. This can only occur -* when JOB is supplied as positive and either means -* that a diagonal element of U is very small, or that -* the elements of the right-hand side vector y are very -* large. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER K - REAL(DOUBLE) ABSAK, AK, BIGNUM, EPS, PERT, SFMIN, TEMP -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, MAX, SIGN -* .. -* .. External Functions .. - REAL(DOUBLE) DLAMCH - EXTERNAL DLAMCH -* .. -* .. External Subroutines .. -* .. -* .. Executable Statements .. -* - INFO = 0 - IF( ( ABS( JOB ).GT.2 ) .OR. ( JOB.EQ.0 ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DLAGTS', -INFO ) - RETURN - END IF -* - IF( N.EQ.0 ) - $ RETURN -* - EPS = DLAMCH( 'Epsilon' ) - SFMIN = DLAMCH( 'Safe minimum' ) - BIGNUM = ONE / SFMIN -* - IF( JOB.LT.0 ) THEN - IF( TOL.LE.ZERO ) THEN - TOL = ABS( A( 1 ) ) - IF( N.GT.1 ) - $ TOL = MAX( TOL, ABS( A( 2 ) ), ABS( B( 1 ) ) ) - DO 10 K = 3, N - TOL = MAX( TOL, ABS( A( K ) ), ABS( B( K-1 ) ), - $ ABS( D( K-2 ) ) ) - 10 CONTINUE - TOL = TOL*EPS - IF( TOL.EQ.ZERO ) - $ TOL = EPS - END IF - END IF -* - IF( ABS( JOB ).EQ.1 ) THEN - DO 20 K = 2, N - IF( IN( K-1 ).EQ.0 ) THEN - Y( K ) = Y( K ) - C( K-1 )*Y( K-1 ) - ELSE - TEMP = Y( K-1 ) - Y( K-1 ) = Y( K ) - Y( K ) = TEMP - C( K-1 )*Y( K ) - END IF - 20 CONTINUE - IF( JOB.EQ.1 ) THEN - DO 30 K = N, 1, -1 - IF( K.LE.N-2 ) THEN - TEMP = Y( K ) - B( K )*Y( K+1 ) - D( K )*Y( K+2 ) - ELSE IF( K.EQ.N-1 ) THEN - TEMP = Y( K ) - B( K )*Y( K+1 ) - ELSE - TEMP = Y( K ) - END IF - AK = A( K ) - ABSAK = ABS( AK ) - IF( ABSAK.LT.ONE ) THEN - IF( ABSAK.LT.SFMIN ) THEN - IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) - $ THEN - INFO = K - RETURN - ELSE - TEMP = TEMP*BIGNUM - AK = AK*BIGNUM - END IF - ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN - INFO = K - RETURN - END IF - END IF - Y( K ) = TEMP / AK - 30 CONTINUE - ELSE - DO 50 K = N, 1, -1 - IF( K.LE.N-2 ) THEN - TEMP = Y( K ) - B( K )*Y( K+1 ) - D( K )*Y( K+2 ) - ELSE IF( K.EQ.N-1 ) THEN - TEMP = Y( K ) - B( K )*Y( K+1 ) - ELSE - TEMP = Y( K ) - END IF - AK = A( K ) - PERT = SIGN( TOL, AK ) - 40 CONTINUE - ABSAK = ABS( AK ) - IF( ABSAK.LT.ONE ) THEN - IF( ABSAK.LT.SFMIN ) THEN - IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) - $ THEN - AK = AK + PERT - PERT = 2*PERT - GO TO 40 - ELSE - TEMP = TEMP*BIGNUM - AK = AK*BIGNUM - END IF - ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN - AK = AK + PERT - PERT = 2*PERT - GO TO 40 - END IF - END IF - Y( K ) = TEMP / AK - 50 CONTINUE - END IF - ELSE -* -* Come to here if JOB = 2 or -2 -* - IF( JOB.EQ.2 ) THEN - DO 60 K = 1, N - IF( K.GE.3 ) THEN - TEMP = Y( K ) - B( K-1 )*Y( K-1 ) - D( K-2 )*Y( K-2 ) - ELSE IF( K.EQ.2 ) THEN - TEMP = Y( K ) - B( K-1 )*Y( K-1 ) - ELSE - TEMP = Y( K ) - END IF - AK = A( K ) - ABSAK = ABS( AK ) - IF( ABSAK.LT.ONE ) THEN - IF( ABSAK.LT.SFMIN ) THEN - IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) - $ THEN - INFO = K - RETURN - ELSE - TEMP = TEMP*BIGNUM - AK = AK*BIGNUM - END IF - ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN - INFO = K - RETURN - END IF - END IF - Y( K ) = TEMP / AK - 60 CONTINUE - ELSE - DO 80 K = 1, N - IF( K.GE.3 ) THEN - TEMP = Y( K ) - B( K-1 )*Y( K-1 ) - D( K-2 )*Y( K-2 ) - ELSE IF( K.EQ.2 ) THEN - TEMP = Y( K ) - B( K-1 )*Y( K-1 ) - ELSE - TEMP = Y( K ) - END IF - AK = A( K ) - PERT = SIGN( TOL, AK ) - 70 CONTINUE - ABSAK = ABS( AK ) - IF( ABSAK.LT.ONE ) THEN - IF( ABSAK.LT.SFMIN ) THEN - IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) - $ THEN - AK = AK + PERT - PERT = 2*PERT - GO TO 70 - ELSE - TEMP = TEMP*BIGNUM - AK = AK*BIGNUM - END IF - ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN - AK = AK + PERT - PERT = 2*PERT - GO TO 70 - END IF - END IF - Y( K ) = TEMP / AK - 80 CONTINUE - END IF -* - DO 90 K = N, 2, -1 - IF( IN( K-1 ).EQ.0 ) THEN - Y( K-1 ) = Y( K-1 ) - C( K-1 )*Y( K ) - ELSE - TEMP = Y( K-1 ) - Y( K-1 ) = Y( K ) - Y( K ) = TEMP - C( K-1 )*Y( K ) - END IF - 90 CONTINUE - END IF -* -* End of DLAGTS -* - END SUBROUTINE DLAGTS + END SUBROUTINE DLAMC1 -! ################################################################################################################################## -! 032 LAPACK_BLAS_AUX -* - SUBROUTINE DLAMC1( BETA, T, RND, IEEE1 ) +! ********************************************************************************************************************************** + + SUBROUTINE DLAMC1_HELPER( BETA, T, RND, IEEE1 ) * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -5261,14 +4447,28 @@ SUBROUTINE DLAMC1( BETA, T, RND, IEEE1 ) IEEE1 = LIEEE1 RETURN * -* End of DLAMC1 +* End of DLAMC1_HELPER * - END SUBROUTINE DLAMC1 + END SUBROUTINE DLAMC1_HELPER * ! ################################################################################################################################## ! 033 LAPACK_BLAS_AUX * SUBROUTINE DLAMC2( BETA, T, RND, EPS, EMIN, RMIN, EMAX, RMAX ) + LOGICAL RND + INTEGER BETA, EMAX, EMIN, T + REAL(DOUBLE) EPS, RMAX, RMIN + + CALL DLAMC2_HELPER( BETA, T, RND, EPS, EMIN, RMIN, EMAX, RMAX ) + + RETURN + + END SUBROUTINE DLAMC2 + +! ********************************************************************************************************************************** + + SUBROUTINE DLAMC2_HELPER( BETA, T, RND, EPS, EMIN, RMIN, EMAX, + $ RMAX ) * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -5518,9 +4718,9 @@ SUBROUTINE DLAMC2( BETA, T, RND, EPS, EMIN, RMIN, EMAX, RMAX ) $ / ' the IF block as marked within the code of routine', $ ' DLAMC2,', / ' otherwise supply EMIN explicitly.', / ) * -* End of DLAMC2 +* End of DLAMC2_HELPER * - END SUBROUTINE DLAMC2 + END SUBROUTINE DLAMC2_HELPER * ! ################################################################################################################################## ! 034 LAPACK_BLAS_AUX @@ -5534,26 +4734,7 @@ DOUBLE PRECISION FUNCTION DLAMC3( A, B ) * * .. Scalar Arguments .. REAL(DOUBLE) A, B -* .. -* -* Purpose -* ======= -* -* DLAMC3 is intended to force A and B to be stored prior to doing -* the addition of A and B , for use in situations where optimizers -* might hold one of these in a register. -* -* Arguments -* ========= -* -* A, B (input) REAL(DOUBLE) -* The values A and B. -* -* ===================================================================== -* -* .. Executable Statements .. -* - DLAMC3 = A + B + DLAMC3 = DLAMC3_HELPER( A, B ) * RETURN * @@ -5565,6 +4746,18 @@ END FUNCTION DLAMC3 ! 035 LAPACK_BLAS_AUX * SUBROUTINE DLAMC4( EMIN, START, BASE ) + INTEGER BASE, EMIN + REAL(DOUBLE) START + + CALL DLAMC4_HELPER( EMIN, START, BASE ) + + RETURN + + END SUBROUTINE DLAMC4 + +! ********************************************************************************************************************************** + + SUBROUTINE DLAMC4_HELPER( EMIN, START, BASE ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -5640,14 +4833,27 @@ SUBROUTINE DLAMC4( EMIN, START, BASE ) * RETURN * -* End of DLAMC4 +* End of DLAMC4_HELPER * - END SUBROUTINE DLAMC4 + END SUBROUTINE DLAMC4_HELPER * ! ################################################################################################################################## ! 036 LAPACK_BLAS_AUX * SUBROUTINE DLAMC5( BETA, P, EMIN, IEEE, EMAX, RMAX ) + LOGICAL IEEE + INTEGER BETA, EMAX, EMIN, P + REAL(DOUBLE) RMAX + + CALL DLAMC5_HELPER( BETA, P, EMIN, IEEE, EMAX, RMAX ) + + RETURN + + END SUBROUTINE DLAMC5 + +! ********************************************************************************************************************************** + + SUBROUTINE DLAMC5_HELPER( BETA, P, EMIN, IEEE, EMAX, RMAX ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -5801,9 +5007,9 @@ SUBROUTINE DLAMC5( BETA, P, EMIN, IEEE, EMAX, RMAX ) RMAX = Y RETURN * -* End of DLAMC5 +* End of DLAMC5_HELPER * - END SUBROUTINE DLAMC5 + END SUBROUTINE DLAMC5_HELPER ! ################################################################################################################################## ! 037 LAPACK_BLAS_AUX @@ -5819,377 +5025,30 @@ DOUBLE PRECISION FUNCTION DLANSB( NORM, UPLO, N, K, AB, LDAB, * .. Scalar Arguments .. CHARACTER NORM, UPLO INTEGER K, LDAB, N -* .. -* .. Array Arguments .. REAL(DOUBLE) AB( LDAB, * ), WORK( * ) -* .. + DLANSB = DLANSB_HELPER( NORM, UPLO, N, K, AB, LDAB, WORK ) + RETURN * -* Purpose -* ======= +* End of DLANSB + + END FUNCTION DLANSB + +! ################################################################################################################################## +! 039 LAPACK_BLAS_AUX + + DOUBLE PRECISION FUNCTION DLANSY( NORM, UPLO, N, A, LDA, WORK ) * -* DLANSB returns the value of the one norm, or the Frobenius norm, or -* the infinity norm, or the element of largest absolute value of an -* n by n symmetric band matrix A, with k super-diagonals. +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* October 31, 1992 * -* Description -* =========== -* -* DLANSB returns the value -* -* DLANSB = ( max(abs(A(i,j))), NORM = 'M' or 'm' -* ( -* ( norm1(A), NORM = '1', 'O' or 'o' -* ( -* ( normI(A), NORM = 'I' or 'i' -* ( -* ( normF(A), NORM = 'F', 'f', 'E' or 'e' -* -* where norm1 denotes the one norm of a matrix (maximum column sum), -* normI denotes the infinity norm of a matrix (maximum row sum) and -* normF denotes the Frobenius norm of a matrix (square root of sum of -* squares). Note that max(abs(A(i,j))) is not a matrix norm. -* -* Arguments -* ========= -* -* NORM (input) CHARACTER*1 -* Specifies the value to be returned in DLANSB as described -* above. -* -* UPLO (input) CHARACTER*1 -* Specifies whether the upper or lower triangular part of the -* band matrix A is supplied. -* = 'U': Upper triangular part is supplied -* = 'L': Lower triangular part is supplied -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. When N = 0, DLANSB is -* set to zero. -* -* K (input) INTEGER -* The number of super-diagonals or sub-diagonals of the -* band matrix A. K >= 0. -* -* AB (input) REAL(DOUBLE) array, dimension (LDAB,N) -* The upper or lower triangle of the symmetric band matrix A, -* stored in the first K+1 rows of AB. The j-th column of A is -* stored in the j-th column of the array AB as follows: -* if UPLO = 'U', AB(k+1+i-j,j) = A(i,j) for max(1,j-k)<=i<=j; -* if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+k). -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= K+1. -* -* WORK (workspace) REAL(DOUBLE) array, dimension (LWORK), -* where LWORK >= N when NORM = 'I' or '1' or 'O'; otherwise, -* WORK is not referenced. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I, J, L - REAL(DOUBLE) ABSA, SCALE, SUM, VALUE -* .. -* .. External Subroutines .. -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, MAX, MIN, SQRT -* .. -* .. Executable Statements .. -* - IF( N.EQ.0 ) THEN - VALUE = ZERO - ELSE IF( LSAME( NORM, 'M' ) ) THEN -* -* Find max(abs(A(i,j))). -* - VALUE = ZERO - IF( LSAME( UPLO, 'U' ) ) THEN - DO 20 J = 1, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j, n, cr13_lba - ENDIF - DO 10 I = MAX( K+2-J, 1 ), K + 1 - VALUE = MAX( VALUE, ABS( AB( I, J ) ) ) - 10 CONTINUE - 20 CONTINUE - ELSE - DO 40 J = 1, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j, n, cr13_lba - ENDIF - DO 30 I = 1, MIN( N+1-J, K+1 ) - VALUE = MAX( VALUE, ABS( AB( I, J ) ) ) - 30 CONTINUE - 40 CONTINUE - END IF - ELSE IF( ( LSAME( NORM, 'I' ) ) .OR. ( LSAME( NORM, 'O' ) ) .OR. - $ ( NORM.EQ.'1' ) ) THEN -* -* Find normI(A) ( = norm1(A), since A is symmetric). -* - VALUE = ZERO - IF( LSAME( UPLO, 'U' ) ) THEN - DO 60 J = 1, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j, n, cr13_lba - ENDIF - SUM = ZERO - L = K + 1 - J - DO 50 I = MAX( 1, J-K ), J - 1 - ABSA = ABS( AB( L+I, J ) ) - SUM = SUM + ABSA - WORK( I ) = WORK( I ) + ABSA - 50 CONTINUE - WORK( J ) = SUM + ABS( AB( K+1, J ) ) - 60 CONTINUE - DO 70 I = 1, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') i, n, cr13_lba - ENDIF - VALUE = MAX( VALUE, WORK( I ) ) - 70 CONTINUE - ELSE - DO 80 I = 1, N - WORK( I ) = ZERO - 80 CONTINUE - DO 100 J = 1, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j, n, cr13_lba - ENDIF - SUM = WORK( J ) + ABS( AB( 1, J ) ) - L = 1 - J - DO 90 I = J + 1, MIN( N, J+K ) - ABSA = ABS( AB( L+I, J ) ) - SUM = SUM + ABSA - WORK( I ) = WORK( I ) + ABSA - 90 CONTINUE - VALUE = MAX( VALUE, SUM ) - 100 CONTINUE - END IF - ELSE IF( ( LSAME( NORM, 'F' ) ) .OR. ( LSAME( NORM, 'E' ) ) ) THEN -* -* Find normF(A). -* - SCALE = ZERO - SUM = ONE - IF( K.GT.0 ) THEN - IF( LSAME( UPLO, 'U' ) ) THEN - DO 110 J = 2, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j, n, cr13_lba - ENDIF - CALL DLASSQ( MIN( J-1, K ), AB( MAX( K+2-J, 1 ), J ), - $ 1, SCALE, SUM ) - 110 CONTINUE - L = K + 1 - ELSE - DO 120 J = 1, N - 1 - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j, n, cr13_lba - ENDIF - CALL DLASSQ( MIN( N-J, K ), AB( 2, J ), 1, SCALE, - $ SUM ) - 120 CONTINUE - L = 1 - END IF - SUM = 2*SUM - ELSE - L = 1 - END IF - CALL DLASSQ( N, AB( L, 1 ), LDAB, SCALE, SUM ) - VALUE = SCALE*SQRT( SUM ) - END IF -* - DLANSB = VALUE - RETURN -* -* End of DLANSB -* -12345 format(5x,'Row ',i8,' of ',i8, a) - - END FUNCTION DLANSB - -! ################################################################################################################################## -! 039 LAPACK_BLAS_AUX - - DOUBLE PRECISION FUNCTION DLANSY( NORM, UPLO, N, A, LDA, WORK ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* October 31, 1992 -* -* .. Scalar Arguments .. - CHARACTER NORM, UPLO - INTEGER LDA, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), WORK( * ) -* .. -* -* Purpose -* ======= -* -* DLANSY returns the value of the one norm, or the Frobenius norm, or -* the infinity norm, or the element of largest absolute value of a -* real symmetric matrix A. -* -* Description -* =========== -* -* DLANSY returns the value -* -* DLANSY = ( max(abs(A(i,j))), NORM = 'M' or 'm' -* ( -* ( norm1(A), NORM = '1', 'O' or 'o' -* ( -* ( normI(A), NORM = 'I' or 'i' -* ( -* ( normF(A), NORM = 'F', 'f', 'E' or 'e' -* -* where norm1 denotes the one norm of a matrix (maximum column sum), -* normI denotes the infinity norm of a matrix (maximum row sum) and -* normF denotes the Frobenius norm of a matrix (square root of sum of -* squares). Note that max(abs(A(i,j))) is not a matrix norm. -* -* Arguments -* ========= -* -* NORM (input) CHARACTER*1 -* Specifies the value to be returned in DLANSY as described -* above. -* -* UPLO (input) CHARACTER*1 -* Specifies whether the upper or lower triangular part of the -* symmetric matrix A is to be referenced. -* = 'U': Upper triangular part of A is referenced -* = 'L': Lower triangular part of A is referenced -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. When N = 0, DLANSY is -* set to zero. -* -* A (input) REAL(DOUBLE) array, dimension (LDA,N) -* The symmetric matrix A. If UPLO = 'U', the leading n by n -* upper triangular part of A contains the upper triangular part -* of the matrix A, and the strictly lower triangular part of A -* is not referenced. If UPLO = 'L', the leading n by n lower -* triangular part of A contains the lower triangular part of -* the matrix A, and the strictly upper triangular part of A is -* not referenced. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(N,1). -* -* WORK (workspace) REAL(DOUBLE) array, dimension (LWORK), -* where LWORK >= N when NORM = 'I' or '1' or 'O'; otherwise, -* WORK is not referenced. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I, J - REAL(DOUBLE) ABSA, SCALE, SUM, VALUE -* .. -* .. External Subroutines .. -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, MAX, SQRT -* .. -* .. Executable Statements .. -* - IF( N.EQ.0 ) THEN - VALUE = ZERO - ELSE IF( LSAME( NORM, 'M' ) ) THEN -* -* Find max(abs(A(i,j))). -* - VALUE = ZERO - IF( LSAME( UPLO, 'U' ) ) THEN - DO 20 J = 1, N - DO 10 I = 1, J - VALUE = MAX( VALUE, ABS( A( I, J ) ) ) - 10 CONTINUE - 20 CONTINUE - ELSE - DO 40 J = 1, N - DO 30 I = J, N - VALUE = MAX( VALUE, ABS( A( I, J ) ) ) - 30 CONTINUE - 40 CONTINUE - END IF - ELSE IF( ( LSAME( NORM, 'I' ) ) .OR. ( LSAME( NORM, 'O' ) ) .OR. - $ ( NORM.EQ.'1' ) ) THEN -* -* Find normI(A) ( = norm1(A), since A is symmetric). -* - VALUE = ZERO - IF( LSAME( UPLO, 'U' ) ) THEN - DO 60 J = 1, N - SUM = ZERO - DO 50 I = 1, J - 1 - ABSA = ABS( A( I, J ) ) - SUM = SUM + ABSA - WORK( I ) = WORK( I ) + ABSA - 50 CONTINUE - WORK( J ) = SUM + ABS( A( J, J ) ) - 60 CONTINUE - DO 70 I = 1, N - VALUE = MAX( VALUE, WORK( I ) ) - 70 CONTINUE - ELSE - DO 80 I = 1, N - WORK( I ) = ZERO - 80 CONTINUE - DO 100 J = 1, N - SUM = WORK( J ) + ABS( A( J, J ) ) - DO 90 I = J + 1, N - ABSA = ABS( A( I, J ) ) - SUM = SUM + ABSA - WORK( I ) = WORK( I ) + ABSA - 90 CONTINUE - VALUE = MAX( VALUE, SUM ) - 100 CONTINUE - END IF - ELSE IF( ( LSAME( NORM, 'F' ) ) .OR. ( LSAME( NORM, 'E' ) ) ) THEN -* -* Find normF(A). -* - SCALE = ZERO - SUM = ONE - IF( LSAME( UPLO, 'U' ) ) THEN - DO 110 J = 2, N - CALL DLASSQ( J-1, A( 1, J ), 1, SCALE, SUM ) - 110 CONTINUE - ELSE - DO 120 J = 1, N - 1 - CALL DLASSQ( N-J, A( J+1, J ), 1, SCALE, SUM ) - 120 CONTINUE - END IF - SUM = 2*SUM - CALL DLASSQ( N, A, LDA+1, SCALE, SUM ) - VALUE = SCALE*SQRT( SUM ) - END IF -* - DLANSY = VALUE - RETURN +* .. Scalar Arguments .. + CHARACTER NORM, UPLO + INTEGER LDA, N + REAL(DOUBLE) A( LDA, * ), WORK( * ) + DLANSY = DLANSY_HELPER( NORM, UPLO, N, A, LDA, WORK ) + RETURN * * End of DLANSY * @@ -6207,50 +5066,9 @@ DOUBLE PRECISION FUNCTION DLAPY2( X, Y ) * * .. Scalar Arguments .. REAL(DOUBLE) X, Y -* .. -* -* Purpose -* ======= -* -* DLAPY2 returns sqrt(x**2+y**2), taking care not to cause unnecessary -* overflow. -* -* Arguments -* ========= -* -* X (input) REAL(DOUBLE) -* Y (input) REAL(DOUBLE) -* X and Y specify the values x and y. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO - PARAMETER ( ZERO = 0.0D0 ) - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D0 ) -* .. -* .. Local Scalars .. - REAL(DOUBLE) W, XABS, YABS, Z -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, MAX, MIN, SQRT -* .. -* .. Executable Statements .. -* - XABS = ABS( X ) - YABS = ABS( Y ) - W = MAX( XABS, YABS ) - Z = MIN( XABS, YABS ) - IF( Z.EQ.ZERO ) THEN - DLAPY2 = W - ELSE - DLAPY2 = W*SQRT( ONE+( Z / W )**2 ) - END IF + DLAPY2 = DLAPY2_HELPER( X, Y ) RETURN -* -* End of DLAPY2 -* + END FUNCTION DLAPY2 ! ################################################################################################################################## @@ -6420,6 +5238,15 @@ END SUBROUTINE DLAQSB ! 042 LAPACK_BLAS_AUX SUBROUTINE DLAR2V( N, X, Y, Z, INCX, C, S, INCC ) + INTEGER INCC, INCX, N + REAL(DOUBLE) C( * ), S( * ), X( * ), Y( * ), Z( * ) + + CALL DLAR2V_HELPER( N, X, Y, Z, INCX, C, S, INCC ) + RETURN + + END SUBROUTINE DLAR2V + + SUBROUTINE DLAR2V_HELPER( N, X, Y, Z, INCX, C, S, INCC ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -6506,7 +5333,7 @@ SUBROUTINE DLAR2V( N, X, Y, Z, INCX, C, S, INCC ) * RETURN - END SUBROUTINE DLAR2V + END SUBROUTINE DLAR2V_HELPER ! ################################################################################################################################## ! 043 LAPACK_BLAS_AUX @@ -6527,1064 +5354,170 @@ SUBROUTINE DLARF( SIDE, M, N, V, INCV, TAU, C, LDC, WORK ) REAL(DOUBLE) C( LDC, * ), V( * ), WORK( * ) * .. * -* Purpose -* ======= + CALL DLARF_HELPER( SIDE, M, N, V, INCV, TAU, C, LDC, WORK ) + END SUBROUTINE DLARF + +! ################################################################################################################################## +! 044 LAPACK_BLAS_AUX + + SUBROUTINE DLARFB( SIDE, TRANS, DIRECT, STOREV, M, N, K, V, LDV, + $ T, LDT, C, LDC, WORK, LDWORK ) * -* DLARF applies a real elementary reflector H to a real m by n matrix -* C, from either the left or the right. H is represented in the form +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 * -* H = I - tau * v * v' +* .. Scalar Arguments .. + CHARACTER DIRECT, SIDE, STOREV, TRANS + INTEGER K, LDC, LDT, LDV, LDWORK, M, N +* .. +* .. Array Arguments .. +! B 02/07/04 //////////////////////////////////////////////////////////B + REAL(DOUBLE) C( LDC, * ), T( LDT, * ), V( LDV, * ), + $ WORK( LDWORK, * ) +! Double the first dimension + +! REAL(DOUBLE) C( LDC, * ), T( LDT, * ), V( LDV, * ), +! $ WORK( 2*LDWORK, * ) +! E ////////////////////////////////////////////////////////////////////E +* .. * -* where tau is a real scalar and v is a real vector. +* Purpose +* ======= * -* If tau = 0, then H is taken to be the unit matrix. +* DLARFB applies a real block reflector H or its transpose H' to a +* real m by n matrix C, from either the left or the right. * * Arguments * ========= * * SIDE (input) CHARACTER*1 -* = 'L': form H * C -* = 'R': form C * H -* -* M (input) INTEGER -* The number of rows of the matrix C. +* = 'L': apply H or H' from the Left +* = 'R': apply H or H' from the Right * -* N (input) INTEGER -* The number of columns of the matrix C. -* -* V (input) REAL(DOUBLE) array, dimension -* (1 + (M-1)*abs(INCV)) if SIDE = 'L' -* or (1 + (N-1)*abs(INCV)) if SIDE = 'R' -* The vector v in the representation of H. V is not used if -* TAU = 0. -* -* INCV (input) INTEGER -* The increment between elements of v. INCV <> 0. -* -* TAU (input) REAL(DOUBLE) -* The value tau in the representation of H. -* -* C (input/output) REAL(DOUBLE) array, dimension (LDC,N) -* On entry, the m by n matrix C. -* On exit, C is overwritten by the matrix H * C if SIDE = 'L', -* or C * H if SIDE = 'R'. -* -* LDC (input) INTEGER -* The leading dimension of the array C. LDC >= max(1,M). -* -* WORK (workspace) REAL(DOUBLE) array, dimension -* (N) if SIDE = 'L' -* or (M) if SIDE = 'R' -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. External Subroutines .. -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. Executable Statements .. -* - IF( LSAME( SIDE, 'L' ) ) THEN -* -* Form H * C -* - IF( TAU.NE.ZERO ) THEN -* -* w := C' * v -* - CALL DGEMV( 'Transpose', M, N, ONE, C, LDC, V, INCV, ZERO, - $ WORK, 1 ) -* -* C := C - v * w' -* - CALL DGER( M, N, -TAU, V, INCV, WORK, 1, C, LDC ) - END IF - ELSE -* -* Form C * H -* - IF( TAU.NE.ZERO ) THEN -* -* w := C * v -* - CALL DGEMV( 'No transpose', M, N, ONE, C, LDC, V, INCV, - $ ZERO, WORK, 1 ) -* -* C := C - w * v' -* - CALL DGER( M, N, -TAU, WORK, 1, V, INCV, C, LDC ) - END IF - END IF - RETURN -* -* End of DLARF -* - END SUBROUTINE DLARF - -! ################################################################################################################################## -! 044 LAPACK_BLAS_AUX - - SUBROUTINE DLARFB( SIDE, TRANS, DIRECT, STOREV, M, N, K, V, LDV, - $ T, LDT, C, LDC, WORK, LDWORK ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. - CHARACTER DIRECT, SIDE, STOREV, TRANS - INTEGER K, LDC, LDT, LDV, LDWORK, M, N -* .. -* .. Array Arguments .. -! B 02/07/04 //////////////////////////////////////////////////////////B - REAL(DOUBLE) C( LDC, * ), T( LDT, * ), V( LDV, * ), - $ WORK( LDWORK, * ) -! Double the first dimension - -! REAL(DOUBLE) C( LDC, * ), T( LDT, * ), V( LDV, * ), -! $ WORK( 2*LDWORK, * ) -! E ////////////////////////////////////////////////////////////////////E -* .. -* -* Purpose -* ======= -* -* DLARFB applies a real block reflector H or its transpose H' to a -* real m by n matrix C, from either the left or the right. -* -* Arguments -* ========= -* -* SIDE (input) CHARACTER*1 -* = 'L': apply H or H' from the Left -* = 'R': apply H or H' from the Right -* -* TRANS (input) CHARACTER*1 -* = 'N': apply H (No transpose) -* = 'T': apply H' (Transpose) -* -* DIRECT (input) CHARACTER*1 -* Indicates how H is formed from a product of elementary -* reflectors -* = 'F': H = H(1) H(2) . . . H(k) (Forward) -* = 'B': H = H(k) . . . H(2) H(1) (Backward) -* -* STOREV (input) CHARACTER*1 -* Indicates how the vectors which define the elementary -* reflectors are stored: -* = 'C': Columnwise -* = 'R': Rowwise -* -* M (input) INTEGER -* The number of rows of the matrix C. -* -* N (input) INTEGER -* The number of columns of the matrix C. -* -* K (input) INTEGER -* The order of the matrix T (= the number of elementary -* reflectors whose product defines the block reflector). -* -* V (input) REAL(DOUBLE) array, dimension -* (LDV,K) if STOREV = 'C' -* (LDV,M) if STOREV = 'R' and SIDE = 'L' -* (LDV,N) if STOREV = 'R' and SIDE = 'R' -* The matrix V. See further details. -* -* LDV (input) INTEGER -* The leading dimension of the array V. -* If STOREV = 'C' and SIDE = 'L', LDV >= max(1,M); -* if STOREV = 'C' and SIDE = 'R', LDV >= max(1,N); -* if STOREV = 'R', LDV >= K. -* -* T (input) REAL(DOUBLE) array, dimension (LDT,K) -* The triangular k by k matrix T in the representation of the -* block reflector. -* -* LDT (input) INTEGER -* The leading dimension of the array T. LDT >= K. -* -* C (input/output) REAL(DOUBLE) array, dimension (LDC,N) -* On entry, the m by n matrix C. -* On exit, C is overwritten by H*C or H'*C or C*H or C*H'. -* -* LDC (input) INTEGER -* The leading dimension of the array C. LDA >= max(1,M). -* -* WORK (workspace) REAL(DOUBLE) array, dimension (LDWORK,K) -* -* LDWORK (input) INTEGER -* The leading dimension of the array WORK. -* If SIDE = 'L', LDWORK >= max(1,N); -* if SIDE = 'R', LDWORK >= max(1,M). -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - CHARACTER TRANST - INTEGER I, J -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Executable Statements .. -* -* Quick return if possible -* - IF( M.LE.0 .OR. N.LE.0 ) - $ RETURN -* - IF( LSAME( TRANS, 'N' ) ) THEN - TRANST = 'T' - ELSE - TRANST = 'N' - END IF -* - IF( LSAME( STOREV, 'C' ) ) THEN -* - IF( LSAME( DIRECT, 'F' ) ) THEN -* -* Let V = ( V1 ) (first K rows) -* ( V2 ) -* where V1 is unit lower triangular. -* - IF( LSAME( SIDE, 'L' ) ) THEN -* -* Form H * C or H' * C where C = ( C1 ) -* ( C2 ) -* -* W := C' * V = (C1'*V1 + C2'*V2) (stored in WORK) -* -* W := C1' -* - DO 10 J = 1, K - CALL DCOPY( N, C( J, 1 ), LDC, WORK( 1, J ), 1 ) - 10 CONTINUE -* -* W := W * V1 -* - CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', N, - $ K, ONE, V, LDV, WORK, LDWORK ) - IF( M.GT.K ) THEN -* -* W := W + C2'*V2 -* - CALL DGEMM( 'Transpose', 'No transpose', N, K, M-K, - $ ONE, C( K+1, 1 ), LDC, V( K+1, 1 ), LDV, - $ ONE, WORK, LDWORK ) - END IF -* -* W := W * T' or W * T -* - CALL DTRMM( 'Right', 'Upper', TRANST, 'Non-unit', N, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - V * W' -* - IF( M.GT.K ) THEN -* -* C2 := C2 - V2 * W' -* - CALL DGEMM( 'No transpose', 'Transpose', M-K, N, K, - $ -ONE, V( K+1, 1 ), LDV, WORK, LDWORK, ONE, - $ C( K+1, 1 ), LDC ) - END IF -* -* W := W * V1' -* - CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', N, K, - $ ONE, V, LDV, WORK, LDWORK ) -* -* C1 := C1 - W' -* - DO 30 J = 1, K - DO 20 I = 1, N - C( J, I ) = C( J, I ) - WORK( I, J ) - 20 CONTINUE - 30 CONTINUE -* - ELSE IF( LSAME( SIDE, 'R' ) ) THEN -* -* Form C * H or C * H' where C = ( C1 C2 ) -* -* W := C * V = (C1*V1 + C2*V2) (stored in WORK) -* -* W := C1 -* - DO 40 J = 1, K - CALL DCOPY( M, C( 1, J ), 1, WORK( 1, J ), 1 ) - 40 CONTINUE -* -* W := W * V1 -* - CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', M, - $ K, ONE, V, LDV, WORK, LDWORK ) - IF( N.GT.K ) THEN -* -* W := W + C2 * V2 -* - CALL DGEMM( 'No transpose', 'No transpose', M, K, N-K, - $ ONE, C( 1, K+1 ), LDC, V( K+1, 1 ), LDV, - $ ONE, WORK, LDWORK ) - END IF -* -* W := W * T or W * T' -* - CALL DTRMM( 'Right', 'Upper', TRANS, 'Non-unit', M, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - W * V' -* - IF( N.GT.K ) THEN -* -* C2 := C2 - W * V2' -* - CALL DGEMM( 'No transpose', 'Transpose', M, N-K, K, - $ -ONE, WORK, LDWORK, V( K+1, 1 ), LDV, ONE, - $ C( 1, K+1 ), LDC ) - END IF -* -* W := W * V1' -* - CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', M, K, - $ ONE, V, LDV, WORK, LDWORK ) -* -* C1 := C1 - W -* - DO 60 J = 1, K - DO 50 I = 1, M - C( I, J ) = C( I, J ) - WORK( I, J ) - 50 CONTINUE - 60 CONTINUE - END IF -* - ELSE -* -* Let V = ( V1 ) -* ( V2 ) (last K rows) -* where V2 is unit upper triangular. -* - IF( LSAME( SIDE, 'L' ) ) THEN -* -* Form H * C or H' * C where C = ( C1 ) -* ( C2 ) -* -* W := C' * V = (C1'*V1 + C2'*V2) (stored in WORK) -* -* W := C2' -* - DO 70 J = 1, K - CALL DCOPY( N, C( M-K+J, 1 ), LDC, WORK( 1, J ), 1 ) - 70 CONTINUE -* -* W := W * V2 -* - CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', N, - $ K, ONE, V( M-K+1, 1 ), LDV, WORK, LDWORK ) - IF( M.GT.K ) THEN -* -* W := W + C1'*V1 -* - CALL DGEMM( 'Transpose', 'No transpose', N, K, M-K, - $ ONE, C, LDC, V, LDV, ONE, WORK, LDWORK ) - END IF -* -* W := W * T' or W * T -* - CALL DTRMM( 'Right', 'Lower', TRANST, 'Non-unit', N, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - V * W' -* - IF( M.GT.K ) THEN -* -* C1 := C1 - V1 * W' -* - CALL DGEMM( 'No transpose', 'Transpose', M-K, N, K, - $ -ONE, V, LDV, WORK, LDWORK, ONE, C, LDC ) - END IF -* -* W := W * V2' -* - CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', N, K, - $ ONE, V( M-K+1, 1 ), LDV, WORK, LDWORK ) -* -* C2 := C2 - W' -* - DO 90 J = 1, K - DO 80 I = 1, N - C( M-K+J, I ) = C( M-K+J, I ) - WORK( I, J ) - 80 CONTINUE - 90 CONTINUE -* - ELSE IF( LSAME( SIDE, 'R' ) ) THEN -* -* Form C * H or C * H' where C = ( C1 C2 ) -* -* W := C * V = (C1*V1 + C2*V2) (stored in WORK) -* -* W := C2 -* - DO 100 J = 1, K - CALL DCOPY( M, C( 1, N-K+J ), 1, WORK( 1, J ), 1 ) - 100 CONTINUE -* -* W := W * V2 -* - CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', M, - $ K, ONE, V( N-K+1, 1 ), LDV, WORK, LDWORK ) - IF( N.GT.K ) THEN -* -* W := W + C1 * V1 -* - CALL DGEMM( 'No transpose', 'No transpose', M, K, N-K, - $ ONE, C, LDC, V, LDV, ONE, WORK, LDWORK ) - END IF -* -* W := W * T or W * T' -* - CALL DTRMM( 'Right', 'Lower', TRANS, 'Non-unit', M, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - W * V' -* - IF( N.GT.K ) THEN -* -* C1 := C1 - W * V1' -* - CALL DGEMM( 'No transpose', 'Transpose', M, N-K, K, - $ -ONE, WORK, LDWORK, V, LDV, ONE, C, LDC ) - END IF -* -* W := W * V2' -* - CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', M, K, - $ ONE, V( N-K+1, 1 ), LDV, WORK, LDWORK ) -* -* C2 := C2 - W -* - DO 120 J = 1, K - DO 110 I = 1, M - C( I, N-K+J ) = C( I, N-K+J ) - WORK( I, J ) - 110 CONTINUE - 120 CONTINUE - END IF - END IF -* - ELSE IF( LSAME( STOREV, 'R' ) ) THEN -* - IF( LSAME( DIRECT, 'F' ) ) THEN -* -* Let V = ( V1 V2 ) (V1: first K columns) -* where V1 is unit upper triangular. -* - IF( LSAME( SIDE, 'L' ) ) THEN -* -* Form H * C or H' * C where C = ( C1 ) -* ( C2 ) -* -* W := C' * V' = (C1'*V1' + C2'*V2') (stored in WORK) -* -* W := C1' -* - DO 130 J = 1, K - CALL DCOPY( N, C( J, 1 ), LDC, WORK( 1, J ), 1 ) - 130 CONTINUE -* -* W := W * V1' -* - CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', N, K, - $ ONE, V, LDV, WORK, LDWORK ) - IF( M.GT.K ) THEN -* -* W := W + C2'*V2' -* - CALL DGEMM( 'Transpose', 'Transpose', N, K, M-K, ONE, - $ C( K+1, 1 ), LDC, V( 1, K+1 ), LDV, ONE, - $ WORK, LDWORK ) - END IF -* -* W := W * T' or W * T -* - CALL DTRMM( 'Right', 'Upper', TRANST, 'Non-unit', N, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - V' * W' -* - IF( M.GT.K ) THEN -* -* C2 := C2 - V2' * W' -* - CALL DGEMM( 'Transpose', 'Transpose', M-K, N, K, -ONE, - $ V( 1, K+1 ), LDV, WORK, LDWORK, ONE, - $ C( K+1, 1 ), LDC ) - END IF -* -* W := W * V1 -* - CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', N, - $ K, ONE, V, LDV, WORK, LDWORK ) -* -* C1 := C1 - W' -* - DO 150 J = 1, K - DO 140 I = 1, N - C( J, I ) = C( J, I ) - WORK( I, J ) - 140 CONTINUE - 150 CONTINUE -* - ELSE IF( LSAME( SIDE, 'R' ) ) THEN -* -* Form C * H or C * H' where C = ( C1 C2 ) -* -* W := C * V' = (C1*V1' + C2*V2') (stored in WORK) -* -* W := C1 -* - DO 160 J = 1, K - CALL DCOPY( M, C( 1, J ), 1, WORK( 1, J ), 1 ) - 160 CONTINUE -* -* W := W * V1' -* - CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', M, K, - $ ONE, V, LDV, WORK, LDWORK ) - IF( N.GT.K ) THEN -* -* W := W + C2 * V2' -* - CALL DGEMM( 'No transpose', 'Transpose', M, K, N-K, - $ ONE, C( 1, K+1 ), LDC, V( 1, K+1 ), LDV, - $ ONE, WORK, LDWORK ) - END IF -* -* W := W * T or W * T' -* - CALL DTRMM( 'Right', 'Upper', TRANS, 'Non-unit', M, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - W * V -* - IF( N.GT.K ) THEN -* -* C2 := C2 - W * V2 -* - CALL DGEMM( 'No transpose', 'No transpose', M, N-K, K, - $ -ONE, WORK, LDWORK, V( 1, K+1 ), LDV, ONE, - $ C( 1, K+1 ), LDC ) - END IF -* -* W := W * V1 -* - CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', M, - $ K, ONE, V, LDV, WORK, LDWORK ) -* -* C1 := C1 - W -* - DO 180 J = 1, K - DO 170 I = 1, M - C( I, J ) = C( I, J ) - WORK( I, J ) - 170 CONTINUE - 180 CONTINUE -* - END IF -* - ELSE -* -* Let V = ( V1 V2 ) (V2: last K columns) -* where V2 is unit lower triangular. -* - IF( LSAME( SIDE, 'L' ) ) THEN -* -* Form H * C or H' * C where C = ( C1 ) -* ( C2 ) -* -* W := C' * V' = (C1'*V1' + C2'*V2') (stored in WORK) -* -* W := C2' -* - DO 190 J = 1, K - CALL DCOPY( N, C( M-K+J, 1 ), LDC, WORK( 1, J ), 1 ) - 190 CONTINUE -* -* W := W * V2' -* - CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', N, K, - $ ONE, V( 1, M-K+1 ), LDV, WORK, LDWORK ) - IF( M.GT.K ) THEN -* -* W := W + C1'*V1' -* - CALL DGEMM( 'Transpose', 'Transpose', N, K, M-K, ONE, - $ C, LDC, V, LDV, ONE, WORK, LDWORK ) - END IF -* -* W := W * T' or W * T -* - CALL DTRMM( 'Right', 'Lower', TRANST, 'Non-unit', N, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - V' * W' -* - IF( M.GT.K ) THEN -* -* C1 := C1 - V1' * W' -* - CALL DGEMM( 'Transpose', 'Transpose', M-K, N, K, -ONE, - $ V, LDV, WORK, LDWORK, ONE, C, LDC ) - END IF -* -* W := W * V2 -* - CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', N, - $ K, ONE, V( 1, M-K+1 ), LDV, WORK, LDWORK ) -* -* C2 := C2 - W' -* - DO 210 J = 1, K - DO 200 I = 1, N - C( M-K+J, I ) = C( M-K+J, I ) - WORK( I, J ) - 200 CONTINUE - 210 CONTINUE -* - ELSE IF( LSAME( SIDE, 'R' ) ) THEN -* -* Form C * H or C * H' where C = ( C1 C2 ) -* -* W := C * V' = (C1*V1' + C2*V2') (stored in WORK) -* -* W := C2 -* - DO 220 J = 1, K - CALL DCOPY( M, C( 1, N-K+J ), 1, WORK( 1, J ), 1 ) - 220 CONTINUE -* -* W := W * V2' -* - CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', M, K, - $ ONE, V( 1, N-K+1 ), LDV, WORK, LDWORK ) - IF( N.GT.K ) THEN -* -* W := W + C1 * V1' -* - CALL DGEMM( 'No transpose', 'Transpose', M, K, N-K, - $ ONE, C, LDC, V, LDV, ONE, WORK, LDWORK ) - END IF -* -* W := W * T or W * T' -* - CALL DTRMM( 'Right', 'Lower', TRANS, 'Non-unit', M, K, - $ ONE, T, LDT, WORK, LDWORK ) -* -* C := C - W * V -* - IF( N.GT.K ) THEN -* -* C1 := C1 - W * V1 -* - CALL DGEMM( 'No transpose', 'No transpose', M, N-K, K, - $ -ONE, WORK, LDWORK, V, LDV, ONE, C, LDC ) - END IF -* -* W := W * V2 -* - CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', M, - $ K, ONE, V( 1, N-K+1 ), LDV, WORK, LDWORK ) -* -* C1 := C1 - W -* - DO 240 J = 1, K - DO 230 I = 1, M - C( I, N-K+J ) = C( I, N-K+J ) - WORK( I, J ) - 230 CONTINUE - 240 CONTINUE -* - END IF -* - END IF - END IF -* - RETURN -* -* End of DLARFB -* - END SUBROUTINE DLARFB - -! ################################################################################################################################## -! 045 LAPACK_BLAS_AUX - - SUBROUTINE DLARFG( N, ALPHA, X, INCX, TAU ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. - INTEGER INCX, N - REAL(DOUBLE) ALPHA, TAU -* .. -* .. Array Arguments .. - REAL(DOUBLE) X( * ) -* .. -* -* Purpose -* ======= -* -* DLARFG generates a real elementary reflector H of order n, such -* that -* -* H * ( alpha ) = ( beta ), H' * H = I. -* ( x ) ( 0 ) -* -* where alpha and beta are scalars, and x is an (n-1)-element real -* vector. H is represented in the form -* -* H = I - tau * ( 1 ) * ( 1 v' ) , -* ( v ) -* -* where tau is a real scalar and v is a real (n-1)-element -* vector. -* -* If the elements of x are all zero, then tau = 0 and H is taken to be -* the unit matrix. -* -* Otherwise 1 <= tau <= 2. -* -* Arguments -* ========= -* -* N (input) INTEGER -* The order of the elementary reflector. -* -* ALPHA (input/output) REAL(DOUBLE) -* On entry, the value alpha. -* On exit, it is overwritten with the value beta. -* -* X (input/output) REAL(DOUBLE) array, dimension -* (1+(N-2)*abs(INCX)) -* On entry, the vector x. -* On exit, it is overwritten with the vector v. -* -* INCX (input) INTEGER -* The increment between elements of X. INCX > 0. -* -* TAU (output) REAL(DOUBLE) -* The value tau. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER J, KNT - REAL(DOUBLE) BETA, RSAFMN, SAFMIN, XNORM -* .. -* .. External Functions .. - - REAL(DOUBLE) DLAMCH - EXTERNAL DLAMCH -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, SIGN -* .. -* .. External Subroutines .. -* .. -* .. Executable Statements .. -* - IF( N.LE.1 ) THEN - TAU = ZERO - RETURN - END IF -* - XNORM = DNRM2( N-1, X, INCX ) -* - IF( XNORM.EQ.ZERO ) THEN -* -* H = I -* - TAU = ZERO - ELSE -* -* general case -* - BETA = -SIGN( DLAPY2( ALPHA, XNORM ), ALPHA ) - SAFMIN = DLAMCH( 'S' ) / DLAMCH( 'E' ) - IF( ABS( BETA ).LT.SAFMIN ) THEN -* -* XNORM, BETA may be inaccurate; scale X and recompute them -* - RSAFMN = ONE / SAFMIN - KNT = 0 - 10 CONTINUE - KNT = KNT + 1 - CALL DSCAL( N-1, RSAFMN, X, INCX ) - BETA = BETA*RSAFMN - ALPHA = ALPHA*RSAFMN - IF( ABS( BETA ).LT.SAFMIN ) - $ GO TO 10 -* -* New BETA is at most 1, at least SAFMIN -* - XNORM = DNRM2( N-1, X, INCX ) - BETA = -SIGN( DLAPY2( ALPHA, XNORM ), ALPHA ) - TAU = ( BETA-ALPHA ) / BETA - CALL DSCAL( N-1, ONE / ( ALPHA-BETA ), X, INCX ) -* -* If ALPHA is subnormal, it may lose relative accuracy -* - ALPHA = BETA - DO 20 J = 1, KNT - ALPHA = ALPHA*SAFMIN - 20 CONTINUE - ELSE - TAU = ( BETA-ALPHA ) / BETA - CALL DSCAL( N-1, ONE / ( ALPHA-BETA ), X, INCX ) - ALPHA = BETA - END IF - END IF -* - RETURN -* -* End of DLARFG -* - END SUBROUTINE DLARFG - -! ################################################################################################################################## -! 046 LAPACK_BLAS_AUX - - SUBROUTINE DLARFT( DIRECT, STOREV, N, K, V, LDV, TAU, T, LDT ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. - CHARACTER DIRECT, STOREV - INTEGER K, LDT, LDV, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) T( LDT, * ), TAU( * ), V( LDV, * ) -* .. -* -* Purpose -* ======= -* -* DLARFT forms the triangular factor T of a real block reflector H -* of order n, which is defined as a product of k elementary reflectors. -* -* If DIRECT = 'F', H = H(1) H(2) . . . H(k) and T is upper triangular; -* -* If DIRECT = 'B', H = H(k) . . . H(2) H(1) and T is lower triangular. -* -* If STOREV = 'C', the vector which defines the elementary reflector -* H(i) is stored in the i-th column of the array V, and -* -* H = I - V * T * V' -* -* If STOREV = 'R', the vector which defines the elementary reflector -* H(i) is stored in the i-th row of the array V, and -* -* H = I - V' * T * V -* -* Arguments -* ========= +* TRANS (input) CHARACTER*1 +* = 'N': apply H (No transpose) +* = 'T': apply H' (Transpose) * * DIRECT (input) CHARACTER*1 -* Specifies the order in which the elementary reflectors are -* multiplied to form the block reflector: +* Indicates how H is formed from a product of elementary +* reflectors * = 'F': H = H(1) H(2) . . . H(k) (Forward) * = 'B': H = H(k) . . . H(2) H(1) (Backward) * * STOREV (input) CHARACTER*1 -* Specifies how the vectors which define the elementary -* reflectors are stored (see also Further Details): -* = 'C': columnwise -* = 'R': rowwise +* Indicates how the vectors which define the elementary +* reflectors are stored: +* = 'C': Columnwise +* = 'R': Rowwise +* +* M (input) INTEGER +* The number of rows of the matrix C. * * N (input) INTEGER -* The order of the block reflector H. N >= 0. +* The number of columns of the matrix C. * * K (input) INTEGER -* The order of the triangular factor T (= the number of -* elementary reflectors). K >= 1. +* The order of the matrix T (= the number of elementary +* reflectors whose product defines the block reflector). * -* V (input/output) REAL(DOUBLE) array, dimension -* (LDV,K) if STOREV = 'C' -* (LDV,N) if STOREV = 'R' +* V (input) REAL(DOUBLE) array, dimension +* (LDV,K) if STOREV = 'C' +* (LDV,M) if STOREV = 'R' and SIDE = 'L' +* (LDV,N) if STOREV = 'R' and SIDE = 'R' * The matrix V. See further details. * * LDV (input) INTEGER * The leading dimension of the array V. -* If STOREV = 'C', LDV >= max(1,N); if STOREV = 'R', LDV >= K. -* -* TAU (input) REAL(DOUBLE) array, dimension (K) -* TAU(i) must contain the scalar factor of the elementary -* reflector H(i). +* If STOREV = 'C' and SIDE = 'L', LDV >= max(1,M); +* if STOREV = 'C' and SIDE = 'R', LDV >= max(1,N); +* if STOREV = 'R', LDV >= K. * -* T (output) REAL(DOUBLE) array, dimension (LDT,K) -* The k by k triangular factor T of the block reflector. -* If DIRECT = 'F', T is upper triangular; if DIRECT = 'B', T is -* lower triangular. The rest of the array is not used. +* T (input) REAL(DOUBLE) array, dimension (LDT,K) +* The triangular k by k matrix T in the representation of the +* block reflector. * * LDT (input) INTEGER * The leading dimension of the array T. LDT >= K. * -* Further Details -* =============== -* -* The shape of the matrix V and the storage of the vectors which define -* the H(i) is best illustrated by the following example with n = 5 and -* k = 3. The elements equal to 1 are not stored; the corresponding -* array elements are modified but restored on exit. The rest of the -* array is not used. -* -* DIRECT = 'F' and STOREV = 'C': DIRECT = 'F' and STOREV = 'R': +* C (input/output) REAL(DOUBLE) array, dimension (LDC,N) +* On entry, the m by n matrix C. +* On exit, C is overwritten by H*C or H'*C or C*H or C*H'. * -* V = ( 1 ) V = ( 1 v1 v1 v1 v1 ) -* ( v1 1 ) ( 1 v2 v2 v2 ) -* ( v1 v2 1 ) ( 1 v3 v3 ) -* ( v1 v2 v3 ) -* ( v1 v2 v3 ) +* LDC (input) INTEGER +* The leading dimension of the array C. LDA >= max(1,M). * -* DIRECT = 'B' and STOREV = 'C': DIRECT = 'B' and STOREV = 'R': +* WORK (workspace) REAL(DOUBLE) array, dimension (LDWORK,K) * -* V = ( v1 v2 v3 ) V = ( v1 v1 1 ) -* ( v1 v2 v3 ) ( v2 v2 v2 1 ) -* ( 1 v2 v3 ) ( v3 v3 v3 v3 1 ) -* ( 1 v3 ) -* ( 1 ) +* LDWORK (input) INTEGER +* The leading dimension of the array WORK. +* If SIDE = 'L', LDWORK >= max(1,N); +* if SIDE = 'R', LDWORK >= max(1,M). * * ===================================================================== * * .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D+0 ) * .. * .. Local Scalars .. + CHARACTER TRANST INTEGER I, J - REAL(DOUBLE) VII -* .. -* .. External Subroutines .. * .. * .. External Functions .. LOGICAL LSAME EXTERNAL LSAME * .. +* .. External Subroutines .. +* .. * .. Executable Statements .. * -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* - IF( LSAME( DIRECT, 'F' ) ) THEN - DO 20 I = 1, K - IF( TAU( I ).EQ.ZERO ) THEN -* -* H(i) = I -* - DO 10 J = 1, I - T( J, I ) = ZERO - 10 CONTINUE - ELSE -* -* general case -* - VII = V( I, I ) - V( I, I ) = ONE - IF( LSAME( STOREV, 'C' ) ) THEN -* -* T(1:i-1,i) := - tau(i) * V(i:n,1:i-1)' * V(i:n,i) -* - CALL DGEMV( 'Transpose', N-I+1, I-1, -TAU( I ), - $ V( I, 1 ), LDV, V( I, I ), 1, ZERO, - $ T( 1, I ), 1 ) - ELSE -* -* T(1:i-1,i) := - tau(i) * V(1:i-1,i:n) * V(i,i:n)' -* - CALL DGEMV( 'No transpose', I-1, N-I+1, -TAU( I ), - $ V( 1, I ), LDV, V( I, I ), LDV, ZERO, - $ T( 1, I ), 1 ) - END IF - V( I, I ) = VII -* -* T(1:i-1,i) := T(1:i-1,1:i-1) * T(1:i-1,i) -* - CALL DTRMV( 'Upper', 'No transpose', 'Non-unit', I-1, T, - $ LDT, T( 1, I ), 1 ) - T( I, I ) = TAU( I ) - END IF - 20 CONTINUE - ELSE - DO 40 I = K, 1, -1 - IF( TAU( I ).EQ.ZERO ) THEN -* -* H(i) = I -* - DO 30 J = I, K - T( J, I ) = ZERO - 30 CONTINUE - ELSE -* -* general case -* - IF( I.LT.K ) THEN - IF( LSAME( STOREV, 'C' ) ) THEN - VII = V( N-K+I, I ) - V( N-K+I, I ) = ONE -* -* T(i+1:k,i) := -* - tau(i) * V(1:n-k+i,i+1:k)' * V(1:n-k+i,i) -* - CALL DGEMV( 'Transpose', N-K+I, K-I, -TAU( I ), - $ V( 1, I+1 ), LDV, V( 1, I ), 1, ZERO, - $ T( I+1, I ), 1 ) - V( N-K+I, I ) = VII - ELSE - VII = V( I, N-K+I ) - V( I, N-K+I ) = ONE + CALL DLARFB_HELPER( SIDE, TRANS, DIRECT, STOREV, M, N, K, V, LDV, + $ T, LDT, C, LDC, WORK, LDWORK ) + RETURN + END SUBROUTINE DLARFB + +! ################################################################################################################################## +! 045 LAPACK_BLAS_AUX + + SUBROUTINE DLARFG( N, ALPHA, X, INCX, TAU ) * -* T(i+1:k,i) := -* - tau(i) * V(i+1:k,1:n-k+i) * V(i,1:n-k+i)' +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* September 30, 1994 * - CALL DGEMV( 'No transpose', K-I, N-K+I, -TAU( I ), - $ V( I+1, 1 ), LDV, V( I, 1 ), LDV, ZERO, - $ T( I+1, I ), 1 ) - V( I, N-K+I ) = VII - END IF +* .. Scalar Arguments .. + INTEGER INCX, N + REAL(DOUBLE) ALPHA, TAU +* .. +* .. Array Arguments .. + REAL(DOUBLE) X( * ) +* .. * -* T(i+1:k,i) := T(i+1:k,i+1:k) * T(i+1:k,i) + CALL DLARFG_HELPER( N, ALPHA, X, INCX, TAU ) + END SUBROUTINE DLARFG + +! ################################################################################################################################## +! 046 LAPACK_BLAS_AUX + + SUBROUTINE DLARFT( DIRECT, STOREV, N, K, V, LDV, TAU, T, LDT ) * - CALL DTRMV( 'Lower', 'No transpose', 'Non-unit', K-I, - $ T( I+1, I+1 ), LDT, T( I+1, I ), 1 ) - END IF - T( I, I ) = TAU( I ) - END IF - 40 CONTINUE - END IF - RETURN +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 * -* End of DLARFT +* .. Scalar Arguments .. + CHARACTER DIRECT, STOREV + INTEGER K, LDT, LDV, N +* .. +* .. Array Arguments .. + REAL(DOUBLE) T( LDT, * ), TAU( * ), V( LDV, * ) +* .. * + CALL DLARFT_HELPER( DIRECT, STOREV, N, K, V, LDV, TAU, T, LDT ) END SUBROUTINE DLARFT ! ################################################################################################################################## @@ -7604,97 +5537,26 @@ SUBROUTINE DLARGV( N, X, INCX, Y, INCY, C, INCC ) REAL(DOUBLE) C( * ), X( * ), Y( * ) * .. * -* Purpose -* ======= -* -* DLARGV generates a vector of real plane rotations, determined by -* elements of the real vectors x and y. For i = 1,2,...,n -* -* ( c(i) s(i) ) ( x(i) ) = ( a(i) ) -* ( -s(i) c(i) ) ( y(i) ) = ( 0 ) -* -* Arguments -* ========= -* -* N (input) INTEGER -* The number of plane rotations to be generated. -* -* X (input/output) REAL(DOUBLE) array, -* dimension (1+(N-1)*INCX) -* On entry, the vector x. -* On exit, x(i) is overwritten by a(i), for i = 1,...,n. -* -* INCX (input) INTEGER -* The increment between elements of X. INCX > 0. -* -* Y (input/output) REAL(DOUBLE) array, -* dimension (1+(N-1)*INCY) -* On entry, the vector y. -* On exit, the sines of the plane rotations. -* -* INCY (input) INTEGER -* The increment between elements of Y. INCY > 0. -* -* C (output) REAL(DOUBLE) array, dimension (1+(N-1)*INCC) -* The cosines of the plane rotations. -* -* INCC (input) INTEGER -* The increment between elements of C. INCC > 0. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE - PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I, IC, IX, IY - REAL(DOUBLE) F, G, T, TT -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, SQRT -* .. -* .. Executable Statements .. -* - IX = 1 - IY = 1 - IC = 1 - DO 10 I = 1, N - F = X( IX ) - G = Y( IY ) - IF( G.EQ.ZERO ) THEN - C( IC ) = ONE - ELSE IF( F.EQ.ZERO ) THEN - C( IC ) = ZERO - Y( IY ) = ONE - X( IX ) = G - ELSE IF( ABS( F ).GT.ABS( G ) ) THEN - T = G / F - TT = SQRT( ONE+T*T ) - C( IC ) = ONE / TT - Y( IY ) = T*C( IC ) - X( IX ) = F*TT - ELSE - T = F / G - TT = SQRT( ONE+T*T ) - Y( IY ) = ONE / TT - C( IC ) = T*Y( IY ) - X( IX ) = G*TT - END IF - IC = IC + INCC - IY = IY + INCY - IX = IX + INCX - 10 CONTINUE - RETURN -* -* End of DLARGV -* + CALL DLARGV_HELPER( N, X, INCX, Y, INCY, C, INCC ) END SUBROUTINE DLARGV ! ################################################################################################################################## ! 048 LAPACK_BLAS_AUX SUBROUTINE DLARNV( IDIST, ISEED, N, X ) + INTEGER IDIST, N + INTEGER ISEED( 4 ) + REAL(DOUBLE) X( * ) + + CALL DLARNV_HELPER( IDIST, ISEED, N, X ) + + RETURN + + END SUBROUTINE DLARNV + +! ********************************************************************************************************************************** + + SUBROUTINE DLARNV_HELPER( IDIST, ISEED, N, X ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -7795,166 +5657,36 @@ SUBROUTINE DLARNV( IDIST, ISEED, N, X ) X( IV+I-1 ) = TWO*U( I ) - ONE 20 CONTINUE ELSE IF( IDIST.EQ.3 ) THEN -* -* Convert generated numbers to normal (0,1) distribution -* - DO 30 I = 1, IL - X( IV+I-1 ) = SQRT( -TWO*LOG( U( 2*I-1 ) ) )* - $ COS( TWOPI*U( 2*I ) ) - 30 CONTINUE - END IF - 40 CONTINUE - RETURN -* -* End of DLARNV -* - END SUBROUTINE DLARNV - -! ################################################################################################################################## -! 049 LAPACK_BLAS_AUX - - SUBROUTINE DLARTG( F, G, CS, SN, R ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. - REAL(DOUBLE) CS, F, G, R, SN -* .. -* -* Purpose -* ======= -* -* DLARTG generate a plane rotation so that -* -* [ CS SN ] . [ F ] = [ R ] where CS**2 + SN**2 = 1. -* [ -SN CS ] [ G ] [ 0 ] -* -* This is a slower, more accurate version of the BLAS1 routine DROTG, -* with the following other differences: -* F and G are unchanged on return. -* If G=0, then CS=1 and SN=0. -* If F=0 and (G .ne. 0), then CS=0 and SN=1 without doing any -* floating point operations (saves work in DBDSQR when -* there are zeros on the diagonal). -* -* If F exceeds G in magnitude, CS will be positive. -* -* Arguments -* ========= -* -* F (input) REAL(DOUBLE) -* The first component of vector to be rotated. -* -* G (input) REAL(DOUBLE) -* The second component of vector to be rotated. -* -* CS (output) REAL(DOUBLE) -* The cosine of the rotation. -* -* SN (output) REAL(DOUBLE) -* The sine of the rotation. -* -* R (output) REAL(DOUBLE) -* The nonzero component of the rotated vector. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO - PARAMETER ( ZERO = 0.0D0 ) - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D0 ) - REAL(DOUBLE) TWO - PARAMETER ( TWO = 2.0D0 ) -* .. -* .. Local Scalars .. - LOGICAL FIRST - INTEGER COUNT, I - REAL(DOUBLE) EPS, F1, G1, SAFMIN, SAFMN2, SAFMX2, SCALE -* .. -* .. External Functions .. - REAL(DOUBLE) DLAMCH - EXTERNAL DLAMCH -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, INT, LOG, MAX, SQRT -* .. -* .. Save statement .. - SAVE FIRST, SAFMX2, SAFMIN, SAFMN2 -* .. -* .. Data statements .. - DATA FIRST / .TRUE. / -* .. -* .. Executable Statements .. -* - IF( FIRST ) THEN - FIRST = .FALSE. - SAFMIN = DLAMCH( 'S' ) - EPS = DLAMCH( 'E' ) - SAFMN2 = DLAMCH( 'B' )**INT( LOG( SAFMIN / EPS ) / - $ LOG( DLAMCH( 'B' ) ) / TWO ) - SAFMX2 = ONE / SAFMN2 - END IF - IF( G.EQ.ZERO ) THEN - CS = ONE - SN = ZERO - R = F - ELSE IF( F.EQ.ZERO ) THEN - CS = ZERO - SN = ONE - R = G - ELSE - F1 = F - G1 = G - SCALE = MAX( ABS( F1 ), ABS( G1 ) ) - IF( SCALE.GE.SAFMX2 ) THEN - COUNT = 0 - 10 CONTINUE - COUNT = COUNT + 1 - F1 = F1*SAFMN2 - G1 = G1*SAFMN2 - SCALE = MAX( ABS( F1 ), ABS( G1 ) ) - IF( SCALE.GE.SAFMX2 ) - $ GO TO 10 - R = SQRT( F1**2+G1**2 ) - CS = F1 / R - SN = G1 / R - DO 20 I = 1, COUNT - R = R*SAFMX2 - 20 CONTINUE - ELSE IF( SCALE.LE.SAFMN2 ) THEN - COUNT = 0 +* +* Convert generated numbers to normal (0,1) distribution +* + DO 30 I = 1, IL + X( IV+I-1 ) = SQRT( -TWO*LOG( U( 2*I-1 ) ) )* + $ COS( TWOPI*U( 2*I ) ) 30 CONTINUE - COUNT = COUNT + 1 - F1 = F1*SAFMX2 - G1 = G1*SAFMX2 - SCALE = MAX( ABS( F1 ), ABS( G1 ) ) - IF( SCALE.LE.SAFMN2 ) - $ GO TO 30 - R = SQRT( F1**2+G1**2 ) - CS = F1 / R - SN = G1 / R - DO 40 I = 1, COUNT - R = R*SAFMN2 - 40 CONTINUE - ELSE - R = SQRT( F1**2+G1**2 ) - CS = F1 / R - SN = G1 / R END IF - IF( ABS( F ).GT.ABS( G ) .AND. CS.LT.ZERO ) THEN - CS = -CS - SN = -SN - R = -R - END IF - END IF + 40 CONTINUE RETURN * -* End of DLARTG +* End of DLARNV_HELPER +* + END SUBROUTINE DLARNV_HELPER + +! ################################################################################################################################## +! 049 LAPACK_BLAS_AUX + + SUBROUTINE DLARTG( F, G, CS, SN, R ) +* +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* September 30, 1994 +* +* .. Scalar Arguments .. + REAL(DOUBLE) CS, F, G, R, SN +* .. * + CALL DLARTG_HELPER( F, G, CS, SN, R ) END SUBROUTINE DLARTG ! ################################################################################################################################## @@ -7974,74 +5706,26 @@ SUBROUTINE DLARTV( N, X, INCX, Y, INCY, C, S, INCC ) REAL(DOUBLE) C( * ), S( * ), X( * ), Y( * ) * .. * -* Purpose -* ======= -* -* DLARTV applies a vector of real plane rotations to elements of the -* real vectors x and y. For i = 1,2,...,n -* -* ( x(i) ) := ( c(i) s(i) ) ( x(i) ) -* ( y(i) ) ( -s(i) c(i) ) ( y(i) ) -* -* Arguments -* ========= -* -* N (input) INTEGER -* The number of plane rotations to be applied. -* -* X (input/output) REAL(DOUBLE) array, -* dimension (1+(N-1)*INCX) -* The vector x. -* -* INCX (input) INTEGER -* The increment between elements of X. INCX > 0. -* -* Y (input/output) REAL(DOUBLE) array, -* dimension (1+(N-1)*INCY) -* The vector y. -* -* INCY (input) INTEGER -* The increment between elements of Y. INCY > 0. -* -* C (input) REAL(DOUBLE) array, dimension (1+(N-1)*INCC) -* The cosines of the plane rotations. -* -* S (input) REAL(DOUBLE) array, dimension (1+(N-1)*INCC) -* The sines of the plane rotations. -* -* INCC (input) INTEGER -* The increment between elements of C and S. INCC > 0. -* -* ===================================================================== -* -* .. Local Scalars .. - INTEGER I, IC, IX, IY - REAL(DOUBLE) XI, YI -* .. -* .. Executable Statements .. -* - IX = 1 - IY = 1 - IC = 1 - DO 10 I = 1, N - XI = X( IX ) - YI = Y( IY ) - X( IX ) = C( IC )*XI + S( IC )*YI - Y( IY ) = C( IC )*YI - S( IC )*XI - IX = IX + INCX - IY = IY + INCY - IC = IC + INCC - 10 CONTINUE - RETURN -* -* End of DLARTV -* + CALL DLARTV_HELPER( N, X, INCX, Y, INCY, C, S, INCC ) END SUBROUTINE DLARTV ! ################################################################################################################################## ! 051 LAPACK_BLAS_AUX SUBROUTINE DLARUV( ISEED, N, X ) + INTEGER N + INTEGER ISEED( 4 ) + REAL(DOUBLE) X( * ) + + CALL DLARUV_HELPER( ISEED, N, X ) + + RETURN + + END SUBROUTINE DLARUV + +! ********************************************************************************************************************************** + + SUBROUTINE DLARUV_HELPER( ISEED, N, X ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -8350,333 +6034,82 @@ SUBROUTINE DLARUV( ISEED, N, X ) DATA ( MM( 119, J ), J = 1, 4 ) / 1217, 3638, 2058, $ 2685 / DATA ( MM( 120, J ), J = 1, 4 ) / 1822, 3661, 692, - $ 3745 / - DATA ( MM( 121, J ), J = 1, 4 ) / 1245, 327, 1194, - $ 2325 / - DATA ( MM( 122, J ), J = 1, 4 ) / 2252, 3660, 20, - $ 3609 / - DATA ( MM( 123, J ), J = 1, 4 ) / 3904, 716, 3285, - $ 3821 / - DATA ( MM( 124, J ), J = 1, 4 ) / 2774, 1842, 2046, - $ 3537 / - DATA ( MM( 125, J ), J = 1, 4 ) / 997, 3987, 2107, - $ 517 / - DATA ( MM( 126, J ), J = 1, 4 ) / 2573, 1368, 3508, - $ 3017 / - DATA ( MM( 127, J ), J = 1, 4 ) / 1148, 1848, 3525, - $ 2141 / - DATA ( MM( 128, J ), J = 1, 4 ) / 545, 2366, 3801, - $ 1537 / -* .. -* .. Executable Statements .. -* - I1 = ISEED( 1 ) - I2 = ISEED( 2 ) - I3 = ISEED( 3 ) - I4 = ISEED( 4 ) -* - DO 10 I = 1, MIN( N, LV ) -* -* Multiply the seed by i-th power of the multiplier modulo 2**48 -* - IT4 = I4*MM( I, 4 ) - IT3 = IT4 / IPW2 - IT4 = IT4 - IPW2*IT3 - IT3 = IT3 + I3*MM( I, 4 ) + I4*MM( I, 3 ) - IT2 = IT3 / IPW2 - IT3 = IT3 - IPW2*IT2 - IT2 = IT2 + I2*MM( I, 4 ) + I3*MM( I, 3 ) + I4*MM( I, 2 ) - IT1 = IT2 / IPW2 - IT2 = IT2 - IPW2*IT1 - IT1 = IT1 + I1*MM( I, 4 ) + I2*MM( I, 3 ) + I3*MM( I, 2 ) + - $ I4*MM( I, 1 ) - IT1 = MOD( IT1, IPW2 ) -* -* Convert 48-bit integer to a real number in the interval (0,1) -* - X( I ) = R*( DBLE( IT1 )+R*( DBLE( IT2 )+R*( DBLE( IT3 )+R* - $ DBLE( IT4 ) ) ) ) - 10 CONTINUE -* -* Return final value of seed -* - ISEED( 1 ) = IT1 - ISEED( 2 ) = IT2 - ISEED( 3 ) = IT3 - ISEED( 4 ) = IT4 - RETURN -* -* End of DLARUV -* - END SUBROUTINE DLARUV - -! ################################################################################################################################## -! 052 LAPACK_BLAS_AUX - - SUBROUTINE DLASCL( TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. - CHARACTER TYPE - INTEGER INFO, KL, KU, LDA, M, N - REAL(DOUBLE) CFROM, CTO -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DLASCL multiplies the M by N real matrix A by the real scalar -* CTO/CFROM. This is done without over/underflow as long as the final -* result CTO*A(I,J)/CFROM does not over/underflow. TYPE specifies that -* A may be full, upper triangular, lower triangular, upper Hessenberg, -* or banded. -* -* Arguments -* ========= -* -* TYPE (input) CHARACTER*1 -* TYPE indices the storage type of the input matrix. -* = 'G': A is a full matrix. -* = 'L': A is a lower triangular matrix. -* = 'U': A is an upper triangular matrix. -* = 'H': A is an upper Hessenberg matrix. -* = 'B': A is a symmetric band matrix with lower bandwidth KL -* and upper bandwidth KU and with the only the lower -* half stored. -* = 'Q': A is a symmetric band matrix with lower bandwidth KL -* and upper bandwidth KU and with the only the upper -* half stored. -* = 'Z': A is a band matrix with lower bandwidth KL and upper -* bandwidth KU. -* -* KL (input) INTEGER -* The lower bandwidth of A. Referenced only if TYPE = 'B', -* 'Q' or 'Z'. -* -* KU (input) INTEGER -* The upper bandwidth of A. Referenced only if TYPE = 'B', -* 'Q' or 'Z'. -* -* CFROM (input) REAL(DOUBLE) -* CTO (input) REAL(DOUBLE) -* The matrix A is multiplied by CTO/CFROM. A(I,J) is computed -* without over/underflow if the final result CTO*A(I,J)/CFROM -* can be represented without over/underflow. CFROM must be -* nonzero. -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,M) -* The matrix to be multiplied by CTO/CFROM. See TYPE for the -* storage type. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,M). -* -* INFO (output) INTEGER -* 0 - successful exit -* <0 - if INFO = -i, the i-th argument had an illegal value. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE - PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) -* .. -* .. Local Scalars .. - LOGICAL DONE - INTEGER I, ITYPE, J, K1, K2, K3, K4 - REAL(DOUBLE) BIGNUM, CFROM1, CFROMC, CTO1, CTOC, MUL, SMLNUM -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - - REAL(DOUBLE) DLAMCH - EXTERNAL DLAMCH -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS, MAX, MIN -* .. -* .. External Subroutines .. -* .. -* .. Executable Statements .. -* -* Test the input arguments -* - INFO = 0 -* - IF( LSAME( TYPE, 'G' ) ) THEN - ITYPE = 0 - ELSE IF( LSAME( TYPE, 'L' ) ) THEN - ITYPE = 1 - ELSE IF( LSAME( TYPE, 'U' ) ) THEN - ITYPE = 2 - ELSE IF( LSAME( TYPE, 'H' ) ) THEN - ITYPE = 3 - ELSE IF( LSAME( TYPE, 'B' ) ) THEN - ITYPE = 4 - ELSE IF( LSAME( TYPE, 'Q' ) ) THEN - ITYPE = 5 - ELSE IF( LSAME( TYPE, 'Z' ) ) THEN - ITYPE = 6 - ELSE - ITYPE = -1 - END IF -* - IF( ITYPE.EQ.-1 ) THEN - INFO = -1 - ELSE IF( CFROM.EQ.ZERO ) THEN - INFO = -4 - ELSE IF( M.LT.0 ) THEN - INFO = -6 - ELSE IF( N.LT.0 .OR. ( ITYPE.EQ.4 .AND. N.NE.M ) .OR. - $ ( ITYPE.EQ.5 .AND. N.NE.M ) ) THEN - INFO = -7 - ELSE IF( ITYPE.LE.3 .AND. LDA.LT.MAX( 1, M ) ) THEN - INFO = -9 - ELSE IF( ITYPE.GE.4 ) THEN - IF( KL.LT.0 .OR. KL.GT.MAX( M-1, 0 ) ) THEN - INFO = -2 - ELSE IF( KU.LT.0 .OR. KU.GT.MAX( N-1, 0 ) .OR. - $ ( ( ITYPE.EQ.4 .OR. ITYPE.EQ.5 ) .AND. KL.NE.KU ) ) - $ THEN - INFO = -3 - ELSE IF( ( ITYPE.EQ.4 .AND. LDA.LT.KL+1 ) .OR. - $ ( ITYPE.EQ.5 .AND. LDA.LT.KU+1 ) .OR. - $ ( ITYPE.EQ.6 .AND. LDA.LT.2*KL+KU+1 ) ) THEN - INFO = -9 - END IF - END IF -* - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DLASCL', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 .OR. M.EQ.0 ) - $ RETURN -* -* Get machine parameters -* - SMLNUM = DLAMCH( 'S' ) - BIGNUM = ONE / SMLNUM -* - CFROMC = CFROM - CTOC = CTO -* - 10 CONTINUE - CFROM1 = CFROMC*SMLNUM - CTO1 = CTOC / BIGNUM - IF( ABS( CFROM1 ).GT.ABS( CTOC ) .AND. CTOC.NE.ZERO ) THEN - MUL = SMLNUM - DONE = .FALSE. - CFROMC = CFROM1 - ELSE IF( ABS( CTO1 ).GT.ABS( CFROMC ) ) THEN - MUL = BIGNUM - DONE = .FALSE. - CTOC = CTO1 - ELSE - MUL = CTOC / CFROMC - DONE = .TRUE. - END IF -* - IF( ITYPE.EQ.0 ) THEN -* -* Full matrix -* - DO 30 J = 1, N - DO 20 I = 1, M - A( I, J ) = A( I, J )*MUL - 20 CONTINUE - 30 CONTINUE -* - ELSE IF( ITYPE.EQ.1 ) THEN -* -* Lower triangular matrix -* - DO 50 J = 1, N - DO 40 I = J, M - A( I, J ) = A( I, J )*MUL - 40 CONTINUE - 50 CONTINUE -* - ELSE IF( ITYPE.EQ.2 ) THEN -* -* Upper triangular matrix -* - DO 70 J = 1, N - DO 60 I = 1, MIN( J, M ) - A( I, J ) = A( I, J )*MUL - 60 CONTINUE - 70 CONTINUE -* - ELSE IF( ITYPE.EQ.3 ) THEN -* -* Upper Hessenberg matrix -* - DO 90 J = 1, N - DO 80 I = 1, MIN( J+1, M ) - A( I, J ) = A( I, J )*MUL - 80 CONTINUE - 90 CONTINUE + $ 3745 / + DATA ( MM( 121, J ), J = 1, 4 ) / 1245, 327, 1194, + $ 2325 / + DATA ( MM( 122, J ), J = 1, 4 ) / 2252, 3660, 20, + $ 3609 / + DATA ( MM( 123, J ), J = 1, 4 ) / 3904, 716, 3285, + $ 3821 / + DATA ( MM( 124, J ), J = 1, 4 ) / 2774, 1842, 2046, + $ 3537 / + DATA ( MM( 125, J ), J = 1, 4 ) / 997, 3987, 2107, + $ 517 / + DATA ( MM( 126, J ), J = 1, 4 ) / 2573, 1368, 3508, + $ 3017 / + DATA ( MM( 127, J ), J = 1, 4 ) / 1148, 1848, 3525, + $ 2141 / + DATA ( MM( 128, J ), J = 1, 4 ) / 545, 2366, 3801, + $ 1537 / +* .. +* .. Executable Statements .. * - ELSE IF( ITYPE.EQ.4 ) THEN + I1 = ISEED( 1 ) + I2 = ISEED( 2 ) + I3 = ISEED( 3 ) + I4 = ISEED( 4 ) * -* Lower half of a symmetric band matrix + DO 10 I = 1, MIN( N, LV ) * - K3 = KL + 1 - K4 = N + 1 - DO 110 J = 1, N - DO 100 I = 1, MIN( K3, K4-J ) - A( I, J ) = A( I, J )*MUL - 100 CONTINUE - 110 CONTINUE +* Multiply the seed by i-th power of the multiplier modulo 2**48 * - ELSE IF( ITYPE.EQ.5 ) THEN + IT4 = I4*MM( I, 4 ) + IT3 = IT4 / IPW2 + IT4 = IT4 - IPW2*IT3 + IT3 = IT3 + I3*MM( I, 4 ) + I4*MM( I, 3 ) + IT2 = IT3 / IPW2 + IT3 = IT3 - IPW2*IT2 + IT2 = IT2 + I2*MM( I, 4 ) + I3*MM( I, 3 ) + I4*MM( I, 2 ) + IT1 = IT2 / IPW2 + IT2 = IT2 - IPW2*IT1 + IT1 = IT1 + I1*MM( I, 4 ) + I2*MM( I, 3 ) + I3*MM( I, 2 ) + + $ I4*MM( I, 1 ) + IT1 = MOD( IT1, IPW2 ) * -* Upper half of a symmetric band matrix +* Convert 48-bit integer to a real number in the interval (0,1) * - K1 = KU + 2 - K3 = KU + 1 - DO 130 J = 1, N - DO 120 I = MAX( K1-J, 1 ), K3 - A( I, J ) = A( I, J )*MUL - 120 CONTINUE - 130 CONTINUE + X( I ) = R*( DBLE( IT1 )+R*( DBLE( IT2 )+R*( DBLE( IT3 )+R* + $ DBLE( IT4 ) ) ) ) + 10 CONTINUE * - ELSE IF( ITYPE.EQ.6 ) THEN +* Return final value of seed * -* Band matrix + ISEED( 1 ) = IT1 + ISEED( 2 ) = IT2 + ISEED( 3 ) = IT3 + ISEED( 4 ) = IT4 + RETURN * - K1 = KL + KU + 2 - K2 = KL + 1 - K3 = 2*KL + KU + 1 - K4 = KL + KU + 1 + M - DO 150 J = 1, N - DO 140 I = MAX( K1-J, K2 ), MIN( K3, K4-J ) - A( I, J ) = A( I, J )*MUL - 140 CONTINUE - 150 CONTINUE +* End of DLARUV_HELPER * - END IF + END SUBROUTINE DLARUV_HELPER + +! ################################################################################################################################## +! 052 LAPACK_BLAS_AUX + + SUBROUTINE DLASCL( TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO ) * - IF( .NOT.DONE ) - $ GO TO 10 +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 * +* .. Scalar Arguments .. + CHARACTER TYPE + INTEGER INFO, KL, KU, LDA, M, N + REAL(DOUBLE) CFROM, CTO + REAL(DOUBLE) A( LDA, * ) + CALL DLASCL_HELPER( TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO ) RETURN * * End of DLASCL @@ -8697,104 +6130,8 @@ SUBROUTINE DLASET( UPLO, M, N, ALPHA, BETA, A, LDA ) CHARACTER UPLO INTEGER LDA, M, N REAL(DOUBLE) ALPHA, BETA -* .. -* .. Array Arguments .. REAL(DOUBLE) A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DLASET initializes an m-by-n matrix A to BETA on the diagonal and -* ALPHA on the offdiagonals. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* Specifies the part of the matrix A to be set. -* = 'U': Upper triangular part is set; the strictly lower -* triangular part of A is not changed. -* = 'L': Lower triangular part is set; the strictly upper -* triangular part of A is not changed. -* Otherwise: All of the matrix A is set. -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* ALPHA (input) REAL(DOUBLE) -* The constant to which the offdiagonal elements are to be set. -* -* BETA (input) REAL(DOUBLE) -* The constant to which the diagonal elements are to be set. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On exit, the leading m-by-n submatrix of A is set as follows: -* -* if UPLO = 'U', A(i,j) = ALPHA, 1<=i<=j-1, 1<=j<=n, -* if UPLO = 'L', A(i,j) = ALPHA, j+1<=i<=m, 1<=j<=n, -* otherwise, A(i,j) = ALPHA, 1<=i<=m, 1<=j<=n, i.ne.j, -* -* and, for all UPLO, A(i,i) = BETA, 1<=i<=min(m,n). -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,M). -* -* ===================================================================== -* -* .. Local Scalars .. - INTEGER I, J -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. Intrinsic Functions .. - INTRINSIC MIN -* .. -* .. Executable Statements .. -* - IF( LSAME( UPLO, 'U' ) ) THEN -* -* Set the strictly upper triangular or trapezoidal part of the -* array to ALPHA. -* - DO 20 J = 2, N - DO 10 I = 1, MIN( J-1, M ) - A( I, J ) = ALPHA - 10 CONTINUE - 20 CONTINUE -* - ELSE IF( LSAME( UPLO, 'L' ) ) THEN -* -* Set the strictly lower triangular or trapezoidal part of the -* array to ALPHA. -* - DO 40 J = 1, MIN( M, N ) - DO 30 I = J + 1, M - A( I, J ) = ALPHA - 30 CONTINUE - 40 CONTINUE -* - ELSE -* -* Set the leading m-by-n submatrix to ALPHA. -* - DO 60 J = 1, N - DO 50 I = 1, M - A( I, J ) = ALPHA - 50 CONTINUE - 60 CONTINUE - END IF -* -* Set the first min(M,N) diagonal elements to BETA. -* - DO 70 I = 1, MIN( M, N ) - A( I, I ) = BETA - 70 CONTINUE + CALL DLASET_HELPER( UPLO, M, N, ALPHA, BETA, A, LDA ) * RETURN * @@ -8806,6 +6143,20 @@ END SUBROUTINE DLASET ! 054 LAPACK_BLAS_AUX SUBROUTINE DLASR( SIDE, PIVOT, DIRECT, M, N, C, S, A, LDA ) + CHARACTER DIRECT, PIVOT, SIDE + INTEGER LDA, M, N + REAL(DOUBLE) A( LDA, * ), C( * ), S( * ) + + CALL DLASR_HELPER( SIDE, PIVOT, DIRECT, M, N, C, S, A, LDA ) + + RETURN + + END SUBROUTINE DLASR + +! ********************************************************************************************************************************** + + SUBROUTINE DLASR_HELPER( SIDE, PIVOT, DIRECT, M, N, C, S, A, + $ LDA ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -9086,291 +6437,65 @@ SUBROUTINE DLASR( SIDE, PIVOT, DIRECT, M, N, C, S, A, LDA ) DO 200 J = N, 2, -1 CTEMP = C( J-1 ) STEMP = S( J-1 ) - IF( ( CTEMP.NE.ONE ) .OR. ( STEMP.NE.ZERO ) ) THEN - DO 190 I = 1, M - TEMP = A( I, J ) - A( I, J ) = CTEMP*TEMP - STEMP*A( I, 1 ) - A( I, 1 ) = STEMP*TEMP + CTEMP*A( I, 1 ) - 190 CONTINUE - END IF - 200 CONTINUE - END IF - ELSE IF( LSAME( PIVOT, 'B' ) ) THEN - IF( LSAME( DIRECT, 'F' ) ) THEN - DO 220 J = 1, N - 1 - CTEMP = C( J ) - STEMP = S( J ) - IF( ( CTEMP.NE.ONE ) .OR. ( STEMP.NE.ZERO ) ) THEN - DO 210 I = 1, M - TEMP = A( I, J ) - A( I, J ) = STEMP*A( I, N ) + CTEMP*TEMP - A( I, N ) = CTEMP*A( I, N ) - STEMP*TEMP - 210 CONTINUE - END IF - 220 CONTINUE - ELSE IF( LSAME( DIRECT, 'B' ) ) THEN - DO 240 J = N - 1, 1, -1 - CTEMP = C( J ) - STEMP = S( J ) - IF( ( CTEMP.NE.ONE ) .OR. ( STEMP.NE.ZERO ) ) THEN - DO 230 I = 1, M - TEMP = A( I, J ) - A( I, J ) = STEMP*A( I, N ) + CTEMP*TEMP - A( I, N ) = CTEMP*A( I, N ) - STEMP*TEMP - 230 CONTINUE - END IF - 240 CONTINUE - END IF - END IF - END IF -* - RETURN -* -* End of DLASR -* - END SUBROUTINE DLASR - -! ################################################################################################################################## -! 055 LAPACK_BLAS_AUX - - SUBROUTINE DLASRT( ID, N, D, INFO ) -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. - CHARACTER ID - INTEGER INFO, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) D( * ) -* .. -* -* Purpose -* ======= -* -* Sort the numbers in D in increasing order (if ID = 'I') or -* in decreasing order (if ID = 'D' ). -* -* Use Quick Sort, reverting to Insertion sort on arrays of -* size <= 20. Dimension of STACK limits N to about 2**32. -* -* Arguments -* ========= -* -* ID (input) CHARACTER*1 -* = 'I': sort D in increasing order; -* = 'D': sort D in decreasing order. -* -* N (input) INTEGER -* The length of the array D. -* -* D (input/output) REAL(DOUBLE) array, dimension (N) -* On entry, the array to be sorted. -* On exit, D has been sorted into increasing order -* (D(1) <= ... <= D(N) ) or into decreasing order -* (D(1) >= ... >= D(N) ), depending on ID. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -* ===================================================================== -* -* .. Parameters .. - INTEGER SELECT - PARAMETER ( SELECT = 20 ) -* .. -* .. Local Scalars .. - INTEGER DIR, ENDD, I, J, START, STKPNT - REAL(DOUBLE) D1, D2, D3, DMNMX, TMP -* .. -* .. Local Arrays .. - INTEGER STACK( 2, 32 ) -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Executable Statements .. -* -* Test the input paramters. -* - INFO = 0 - DIR = -1 - IF( LSAME( ID, 'D' ) ) THEN - DIR = 0 - ELSE IF( LSAME( ID, 'I' ) ) THEN - DIR = 1 - END IF - IF( DIR.EQ.-1 ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DLASRT', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.LE.1 ) - $ RETURN -* - STKPNT = 1 - STACK( 1, 1 ) = 1 - STACK( 2, 1 ) = N - 10 CONTINUE - START = STACK( 1, STKPNT ) - ENDD = STACK( 2, STKPNT ) - STKPNT = STKPNT - 1 - IF( ENDD-START.LE.SELECT .AND. ENDD-START.GT.0 ) THEN -* -* Do Insertion sort on D( START:ENDD ) -* - IF( DIR.EQ.0 ) THEN -* -* Sort into decreasing order -* - DO 30 I = START + 1, ENDD - DO 20 J = I, START + 1, -1 - IF( D( J ).GT.D( J-1 ) ) THEN - DMNMX = D( J ) - D( J ) = D( J-1 ) - D( J-1 ) = DMNMX - ELSE - GO TO 30 - END IF - 20 CONTINUE - 30 CONTINUE -* - ELSE -* -* Sort into increasing order -* - DO 50 I = START + 1, ENDD - DO 40 J = I, START + 1, -1 - IF( D( J ).LT.D( J-1 ) ) THEN - DMNMX = D( J ) - D( J ) = D( J-1 ) - D( J-1 ) = DMNMX - ELSE - GO TO 50 + IF( ( CTEMP.NE.ONE ) .OR. ( STEMP.NE.ZERO ) ) THEN + DO 190 I = 1, M + TEMP = A( I, J ) + A( I, J ) = CTEMP*TEMP - STEMP*A( I, 1 ) + A( I, 1 ) = STEMP*TEMP + CTEMP*A( I, 1 ) + 190 CONTINUE END IF - 40 CONTINUE - 50 CONTINUE -* - END IF -* - ELSE IF( ENDD-START.GT.SELECT ) THEN -* -* Partition D( START:ENDD ) and stack parts, largest one first -* -* Choose partition entry as median of 3 -* - D1 = D( START ) - D2 = D( ENDD ) - I = ( START+ENDD ) / 2 - D3 = D( I ) - IF( D1.LT.D2 ) THEN - IF( D3.LT.D1 ) THEN - DMNMX = D1 - ELSE IF( D3.LT.D2 ) THEN - DMNMX = D3 - ELSE - DMNMX = D2 + 200 CONTINUE END IF - ELSE - IF( D3.LT.D2 ) THEN - DMNMX = D2 - ELSE IF( D3.LT.D1 ) THEN - DMNMX = D3 - ELSE - DMNMX = D1 + ELSE IF( LSAME( PIVOT, 'B' ) ) THEN + IF( LSAME( DIRECT, 'F' ) ) THEN + DO 220 J = 1, N - 1 + CTEMP = C( J ) + STEMP = S( J ) + IF( ( CTEMP.NE.ONE ) .OR. ( STEMP.NE.ZERO ) ) THEN + DO 210 I = 1, M + TEMP = A( I, J ) + A( I, J ) = STEMP*A( I, N ) + CTEMP*TEMP + A( I, N ) = CTEMP*A( I, N ) - STEMP*TEMP + 210 CONTINUE + END IF + 220 CONTINUE + ELSE IF( LSAME( DIRECT, 'B' ) ) THEN + DO 240 J = N - 1, 1, -1 + CTEMP = C( J ) + STEMP = S( J ) + IF( ( CTEMP.NE.ONE ) .OR. ( STEMP.NE.ZERO ) ) THEN + DO 230 I = 1, M + TEMP = A( I, J ) + A( I, J ) = STEMP*A( I, N ) + CTEMP*TEMP + A( I, N ) = CTEMP*A( I, N ) - STEMP*TEMP + 230 CONTINUE + END IF + 240 CONTINUE END IF END IF + END IF * - IF( DIR.EQ.0 ) THEN + RETURN * -* Sort into decreasing order +* End of DLASR_HELPER * - I = START - 1 - J = ENDD + 1 - 60 CONTINUE - 70 CONTINUE - J = J - 1 - IF( D( J ).LT.DMNMX ) - $ GO TO 70 - 80 CONTINUE - I = I + 1 - IF( D( I ).GT.DMNMX ) - $ GO TO 80 - IF( I.LT.J ) THEN - TMP = D( I ) - D( I ) = D( J ) - D( J ) = TMP - GO TO 60 - END IF - IF( J-START.GT.ENDD-J-1 ) THEN - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = START - STACK( 2, STKPNT ) = J - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = J + 1 - STACK( 2, STKPNT ) = ENDD - ELSE - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = J + 1 - STACK( 2, STKPNT ) = ENDD - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = START - STACK( 2, STKPNT ) = J - END IF - ELSE + END SUBROUTINE DLASR_HELPER + +! ################################################################################################################################## +! 055 LAPACK_BLAS_AUX + + SUBROUTINE DLASRT( ID, N, D, INFO ) * -* Sort into increasing order +* -- LAPACK routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* September 30, 1994 * - I = START - 1 - J = ENDD + 1 - 90 CONTINUE - 100 CONTINUE - J = J - 1 - IF( D( J ).GT.DMNMX ) - $ GO TO 100 - 110 CONTINUE - I = I + 1 - IF( D( I ).LT.DMNMX ) - $ GO TO 110 - IF( I.LT.J ) THEN - TMP = D( I ) - D( I ) = D( J ) - D( J ) = TMP - GO TO 90 - END IF - IF( J-START.GT.ENDD-J-1 ) THEN - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = START - STACK( 2, STKPNT ) = J - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = J + 1 - STACK( 2, STKPNT ) = ENDD - ELSE - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = J + 1 - STACK( 2, STKPNT ) = ENDD - STKPNT = STKPNT + 1 - STACK( 1, STKPNT ) = START - STACK( 2, STKPNT ) = J - END IF - END IF - END IF - IF( STKPNT.GT.0 ) - $ GO TO 10 +* .. Scalar Arguments .. + CHARACTER ID + INTEGER INFO, N + REAL(DOUBLE) D( * ) + CALL DLASRT_HELPER( ID, N, D, INFO ) RETURN * * End of DLASRT @@ -9390,80 +6515,8 @@ SUBROUTINE DLASSQ( N, X, INCX, SCALE, SUMSQ ) * .. Scalar Arguments .. INTEGER INCX, N REAL(DOUBLE) SCALE, SUMSQ -* .. -* .. Array Arguments .. REAL(DOUBLE) X( * ) -* .. -* -* Purpose -* ======= -* -* DLASSQ returns the values scl and smsq such that -* -* ( scl**2 )*smsq = x( 1 )**2 +...+ x( n )**2 + ( scale**2 )*sumsq, -* -* where x( i ) = X( 1 + ( i - 1 )*INCX ). The value of sumsq is -* assumed to be non-negative and scl returns the value -* -* scl = max( scale, abs( x( i ) ) ). -* -* scale and sumsq must be supplied in SCALE and SUMSQ and -* scl and smsq are overwritten on SCALE and SUMSQ respectively. -* -* The routine makes only one pass through the vector x. -* -* Arguments -* ========= -* -* N (input) INTEGER -* The number of elements to be used from the vector X. -* -* X (input) REAL(DOUBLE) array, dimension (N) -* The vector for which a scaled sum of squares is computed. -* x( i ) = X( 1 + ( i - 1 )*INCX ), 1 <= i <= n. -* -* INCX (input) INTEGER -* The increment between successive values of the vector X. -* INCX > 0. -* -* SCALE (input/output) REAL(DOUBLE) -* On entry, the value scale in the equation above. -* On exit, SCALE is overwritten with scl , the scaling factor -* for the sum of squares. -* -* SUMSQ (input/output) REAL(DOUBLE) -* On entry, the value sumsq in the equation above. -* On exit, SUMSQ is overwritten with smsq , the basic sum of -* squares from which scl has been factored out. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO - PARAMETER ( ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER IX - REAL(DOUBLE) ABSXI -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS -* .. -* .. Executable Statements .. -* - IF( N.GT.0 ) THEN - DO 10 IX = 1, 1 + ( N-1 )*INCX, INCX - IF( X( IX ).NE.ZERO ) THEN - ABSXI = ABS( X( IX ) ) - IF( SCALE.LT.ABSXI ) THEN - SUMSQ = 1 + SUMSQ*( SCALE / ABSXI )**2 - SCALE = ABSXI - ELSE - SUMSQ = SUMSQ + ( ABSXI / SCALE )**2 - END IF - END IF - 10 CONTINUE - END IF + CALL DLASSQ_HELPER( N, X, INCX, SCALE, SUMSQ ) RETURN * * End of DLASSQ @@ -9482,111 +6535,9 @@ SUBROUTINE DLASWP( N, A, LDA, K1, K2, IPIV, INCX ) * * .. Scalar Arguments .. INTEGER INCX, K1, K2, LDA, N -* .. -* .. Array Arguments .. INTEGER IPIV( * ) REAL(DOUBLE) A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DLASWP performs a series of row interchanges on the matrix A. -* One row interchange is initiated for each of rows K1 through K2 of A. -* -* Arguments -* ========= -* -* N (input) INTEGER -* The number of columns of the matrix A. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On entry, the matrix of column dimension N to which the row -* interchanges will be applied. -* On exit, the permuted matrix. -* -* LDA (input) INTEGER -* The leading dimension of the array A. -* -* K1 (input) INTEGER -* The first element of IPIV for which a row interchange will -* be done. -* -* K2 (input) INTEGER -* The last element of IPIV for which a row interchange will -* be done. -* -* IPIV (input) INTEGER array, dimension (M*abs(INCX)) -* The vector of pivot indices. Only the elements in positions -* K1 through K2 of IPIV are accessed. -* IPIV(K) = L implies rows K and L are to be interchanged. -* -* INCX (input) INTEGER -* The increment between successive values of IPIV. If IPIV -* is negative, the pivots are applied in reverse order. -* -* Further Details -* =============== -* -* Modified by -* R. C. Whaley, Computer Science Dept., Univ. of Tenn., Knoxville, USA -* -* ===================================================================== -* -* .. Local Scalars .. - INTEGER I, I1, I2, INC, IP, IX, IX0, J, K, N32 - REAL(DOUBLE) TEMP -* .. -* .. Executable Statements .. -* -* Interchange row I with row IPIV(I) for each of rows K1 through K2. -* - IF( INCX.GT.0 ) THEN - IX0 = K1 - I1 = K1 - I2 = K2 - INC = 1 - ELSE IF( INCX.LT.0 ) THEN - IX0 = 1 + ( 1-K2 )*INCX - I1 = K2 - I2 = K1 - INC = -1 - ELSE - RETURN - END IF -* - N32 = ( N / 32 )*32 - IF( N32.NE.0 ) THEN - DO 30 J = 1, N32, 32 - IX = IX0 - DO 20 I = I1, I2, INC - IP = IPIV( IX ) - IF( IP.NE.I ) THEN - DO 10 K = J, J + 31 - TEMP = A( I, K ) - A( I, K ) = A( IP, K ) - A( IP, K ) = TEMP - 10 CONTINUE - END IF - IX = IX + INCX - 20 CONTINUE - 30 CONTINUE - END IF - IF( N32.NE.N ) THEN - N32 = N32 + 1 - IX = IX0 - DO 50 I = I1, I2, INC - IP = IPIV( IX ) - IF( IP.NE.I ) THEN - DO 40 K = N32, N - TEMP = A( I, K ) - A( I, K ) = A( IP, K ) - A( IP, K ) = TEMP - 40 CONTINUE - END IF - IX = IX + INCX - 50 CONTINUE - END IF + CALL DLASWP_HELPER( N, A, LDA, K1, K2, IPIV, INCX ) * RETURN * @@ -9599,9 +6550,28 @@ END SUBROUTINE DLASWP SUBROUTINE DLATBS( UPLO, TRANS, DIAG, NORMIN, N, KD, AB, LDAB, X, $ SCALE, CNORM, INFO, iter_num, dtbsv_msg ) + CHARACTER DIAG, NORMIN, TRANS, UPLO + character*1 dtbsv_msg + INTEGER INFO, KD, LDAB, N + integer iter_num + REAL(DOUBLE) SCALE + REAL(DOUBLE) AB( LDAB, * ), CNORM( * ), X( * ) + + CALL DLATBS_HELPER( UPLO, TRANS, DIAG, NORMIN, N, KD, AB, LDAB, + $ X, SCALE, CNORM, INFO, iter_num, dtbsv_msg ) + + RETURN + + END SUBROUTINE DLATBS + +! ********************************************************************************************************************************** + + SUBROUTINE DLATBS_HELPER( UPLO, TRANS, DIAG, NORMIN, N, KD, AB, + $ LDAB, X, SCALE, CNORM, INFO, iter_num, + $ dtbsv_msg ) USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DLATBS' + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DLATBS_HELPER' * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -10337,7 +7307,7 @@ SUBROUTINE DLATBS( UPLO, TRANS, DIAG, NORMIN, N, KD, AB, LDAB, X, CALL DSCAL( N, ONE / TSCAL, CNORM, 1 ) END IF * -* End of DLATBS +* End of DLATBS_HELPER * 12345 format(5X,'Iteration number ',i4,' : J = ',i8,' to ',i8, a) @@ -10346,12 +7316,25 @@ SUBROUTINE DLATBS( UPLO, TRANS, DIAG, NORMIN, N, KD, AB, LDAB, X, ! ********************************************************************************************************************************** - END SUBROUTINE DLATBS + END SUBROUTINE DLATBS_HELPER ! ################################################################################################################################## ! 059 LAPACK_BLAS_AUX SUBROUTINE DLATRD( UPLO, N, NB, A, LDA, E, TAU, W, LDW ) + CHARACTER UPLO + INTEGER LDA, LDW, N, NB + REAL(DOUBLE) A( LDA, * ), E( * ), TAU( * ), W( LDW, * ) + + CALL DLATRD_HELPER( UPLO, N, NB, A, LDA, E, TAU, W, LDW ) + + RETURN + + END SUBROUTINE DLATRD + +! ********************************************************************************************************************************** + + SUBROUTINE DLATRD_HELPER( UPLO, N, NB, A, LDA, E, TAU, W, LDW ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -10606,14 +7589,26 @@ SUBROUTINE DLATRD( UPLO, N, NB, A, LDA, E, TAU, W, LDW ) * RETURN * -* End of DLATRD +* End of DLATRD_HELPER * - END SUBROUTINE DLATRD + END SUBROUTINE DLATRD_HELPER ! ################################################################################################################################## ! 060 LAPACK_BLAS_AUX SUBROUTINE DORG2L( M, N, K, A, LDA, TAU, WORK, INFO ) + INTEGER INFO, K, LDA, M, N + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( * ) + + CALL DORG2L_HELPER( M, N, K, A, LDA, TAU, WORK, INFO ) + + RETURN + + END SUBROUTINE DORG2L + +! ********************************************************************************************************************************** + + SUBROUTINE DORG2L_HELPER( M, N, K, A, LDA, TAU, WORK, INFO ) * * -- LAPACK routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -10737,14 +7732,26 @@ SUBROUTINE DORG2L( M, N, K, A, LDA, TAU, WORK, INFO ) 40 CONTINUE RETURN * -* End of DORG2L +* End of DORG2L_HELPER * - END SUBROUTINE DORG2L + END SUBROUTINE DORG2L_HELPER ! ################################################################################################################################## ! 061 LAPACK_BLAS_AUX SUBROUTINE DORG2R( M, N, K, A, LDA, TAU, WORK, INFO ) + INTEGER INFO, K, LDA, M, N + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( * ) + + CALL DORG2R_HELPER( M, N, K, A, LDA, TAU, WORK, INFO ) + + RETURN + + END SUBROUTINE DORG2R + +! ********************************************************************************************************************************** + + SUBROUTINE DORG2R_HELPER( M, N, K, A, LDA, TAU, WORK, INFO ) * * -- LAPACK routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -10870,14 +7877,27 @@ SUBROUTINE DORG2R( M, N, K, A, LDA, TAU, WORK, INFO ) 40 CONTINUE RETURN * -* End of DORG2R +* End of DORG2R_HELPER * - END SUBROUTINE DORG2R + END SUBROUTINE DORG2R_HELPER ! ################################################################################################################################## ! 062 LAPACK_BLAS_AUX SUBROUTINE DORGQL( M, N, K, A, LDA, TAU, WORK, LWORK, INFO ) + INTEGER INFO, K, LDA, LWORK, M, N + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( * ) + + CALL DORGQL_HELPER( M, N, K, A, LDA, TAU, WORK, LWORK, INFO ) + + RETURN + + END SUBROUTINE DORGQL + +! ********************************************************************************************************************************** + + SUBROUTINE DORGQL_HELPER( M, N, K, A, LDA, TAU, WORK, LWORK, + $ INFO ) * * -- LAPACK routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -11087,14 +8107,27 @@ SUBROUTINE DORGQL( M, N, K, A, LDA, TAU, WORK, LWORK, INFO ) WORK( 1 ) = IWS RETURN * -* End of DORGQL +* End of DORGQL_HELPER * - END SUBROUTINE DORGQL + END SUBROUTINE DORGQL_HELPER ! ################################################################################################################################## ! 063 LAPACK_BLAS_AUX SUBROUTINE DORGQR( M, N, K, A, LDA, TAU, WORK, LWORK, INFO ) + INTEGER INFO, K, LDA, LWORK, M, N + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( * ) + + CALL DORGQR_HELPER( M, N, K, A, LDA, TAU, WORK, LWORK, INFO ) + + RETURN + + END SUBROUTINE DORGQR + +! ********************************************************************************************************************************** + + SUBROUTINE DORGQR_HELPER( M, N, K, A, LDA, TAU, WORK, LWORK, + $ INFO ) * * -- LAPACK routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -11307,14 +8340,27 @@ SUBROUTINE DORGQR( M, N, K, A, LDA, TAU, WORK, LWORK, INFO ) WORK( 1 ) = IWS RETURN * -* End of DORGQR +* End of DORGQR_HELPER * - END SUBROUTINE DORGQR + END SUBROUTINE DORGQR_HELPER ! ################################################################################################################################## ! 064 LAPACK_BLAS_AUX SUBROUTINE DRSCL( N, SA, SX, INCX ) + INTEGER INCX, N + REAL(DOUBLE) SA + REAL(DOUBLE) SX( * ) + + CALL DRSCL_HELPER( N, SA, SX, INCX ) + + RETURN + + END SUBROUTINE DRSCL + +! ********************************************************************************************************************************** + + SUBROUTINE DRSCL_HELPER( N, SA, SX, INCX ) * * -- LAPACK auxiliary routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -11425,14 +8471,27 @@ SUBROUTINE DRSCL( N, SA, SX, INCX ) * RETURN * -* End of DRSCL +* End of DRSCL_HELPER * - END SUBROUTINE DRSCL + END SUBROUTINE DRSCL_HELPER ! ################################################################################################################################## ! 065 LAPACK_BLAS_AUX SUBROUTINE DSYTD2( UPLO, N, A, LDA, D, E, TAU, INFO ) + CHARACTER UPLO + INTEGER INFO, LDA, N + REAL(DOUBLE) A( LDA, * ), D( * ), E( * ), TAU( * ) + + CALL DSYTD2_HELPER( UPLO, N, A, LDA, D, E, TAU, INFO ) + + RETURN + + END SUBROUTINE DSYTD2 + +! ********************************************************************************************************************************** + + SUBROUTINE DSYTD2_HELPER( UPLO, N, A, LDA, D, E, TAU, INFO ) * * -- LAPACK routine (version 3.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., @@ -11677,9 +8736,9 @@ SUBROUTINE DSYTD2( UPLO, N, A, LDA, D, E, TAU, INFO ) * RETURN * -* End of DSYTD2 +* End of DSYTD2_HELPER * - END SUBROUTINE DSYTD2 + END SUBROUTINE DSYTD2_HELPER ! ################################################################################################################################# ! 069 LAPACK_BLAS_AUX @@ -11709,10 +8768,7 @@ LOGICAL FUNCTION DISNAN(DIN) * * ===================================================================== * -* .. External Functions .. -* .. -* .. Executable Statements .. - DISNAN = DLAISNAN(DIN,DIN) + DISNAN = DISNAN_HELPER(DIN) RETURN END FUNCTION DISNAN @@ -11756,9 +8812,9 @@ LOGICAL FUNCTION DLAISNAN(DIN1,DIN2) * * ===================================================================== * -* .. Executable Statements .. - DLAISNAN = (DIN1.NE.DIN2) + DLAISNAN = DLAISNAN_HELPER(DIN1,DIN2) RETURN END FUNCTION DLAISNAN +! --- lapack_surgery end --- ! END MODULE LAPACK_BLAS_AUX diff --git a/Source/Modules/LAPACK/LAPACK_DGETF2_HELPER.f b/Source/Modules/LAPACK/LAPACK_DGETF2_HELPER.f new file mode 100644 index 00000000..befe83bd --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DGETF2_HELPER.f @@ -0,0 +1,65 @@ +! ################################################################################################################################## + + MODULE LAPACK_DGETF2_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE PARAMS, ONLY : EPSIL + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DGETF2_HELPER( M, N, A, LDA, IPIV, INFO ) + + INTEGER INFO, LDA, M, N + INTEGER IPIV( * ) + REAL(DOUBLE) A( LDA, * ) + + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) + + INTEGER J, JP + + INTRINSIC MAX, MIN + + INFO = 0 + IF( M.LT.0 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, M ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGETF2', -INFO ) + RETURN + END IF + + IF( M.EQ.0 .OR. N.EQ.0 ) + $ RETURN + + DO 10 J = 1, MIN( M, N ) + JP = J - 1 + IDAMAX( M-J+1, A( J, J ), 1 ) + IPIV( J ) = JP + IF( DABS(A( JP, J )) > EPSIL(2)) THEN + IF( JP.NE.J ) + $ CALL DSWAP( N, A( J, 1 ), LDA, A( JP, 1 ), LDA ) + IF( J.LT.M ) + $ CALL DSCAL( M-J, ONE / A( J, J ), A( J+1, J ), 1 ) + ELSE IF( INFO.EQ.0 ) THEN + INFO = J + END IF + IF( J.LT.MIN( M, N ) ) THEN + CALL DGER( M-J, N-J, -ONE, A( J+1, J ), 1, A( J, J+1 ), LDA, + $ A( J+1, J+1 ), LDA ) + END IF + 10 CONTINUE + + RETURN + + END SUBROUTINE DGETF2_HELPER +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DGETF2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DGETRF_HELPER.f b/Source/Modules/LAPACK/LAPACK_DGETRF_HELPER.f new file mode 100644 index 00000000..225dfd63 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DGETRF_HELPER.f @@ -0,0 +1,90 @@ +! ################################################################################################################################## + + MODULE LAPACK_DGETRF_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + USE LAPACK_DGETF2_HELPER + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DGETRF_HELPER( M, N, A, LDA, IPIV, INFO ) + + INTEGER INFO, LDA, M, N + INTEGER IPIV( * ) + REAL(DOUBLE) A( LDA, * ) + + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D+0 ) + + INTEGER I, IINFO, J, JB, NB + + INTEGER ILAENV + EXTERNAL ILAENV + + INTRINSIC MAX, MIN + + INFO = 0 + IF( M.LT.0 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, M ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGETRF', -INFO ) + RETURN + END IF + + IF( M.EQ.0 .OR. N.EQ.0 ) + $ RETURN + + NB = ILAENV( 1, 'DGETRF', ' ', M, N, -1, -1 ) + IF( NB.LE.1 .OR. NB.GE.MIN( M, N ) ) THEN + + CALL DGETF2_HELPER( M, N, A, LDA, IPIV, INFO ) + ELSE + + DO 20 J = 1, MIN( M, N ), NB + JB = MIN( MIN( M, N )-J+1, NB ) + + CALL DGETF2_HELPER( M-J+1, JB, A( J, J ), LDA, IPIV( J ), + $ IINFO ) + + IF( INFO.EQ.0 .AND. IINFO.GT.0 ) + $ INFO = IINFO + J - 1 + DO 10 I = J, MIN( M, J+JB-1 ) + IPIV( I ) = J - 1 + IPIV( I ) + 10 CONTINUE + + CALL DLASWP( J-1, A, LDA, J, J+JB-1, IPIV, 1 ) + + IF( J+JB.LE.N ) THEN + + CALL DLASWP( N-J-JB+1, A( 1, J+JB ), LDA, J, J+JB-1, + $ IPIV, 1 ) + + CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', JB, + $ N-J-JB+1, ONE, A( J, J ), LDA, A( J, J+JB ), + $ LDA ) + IF( J+JB.LE.M ) THEN + + CALL DGEMM( 'No transpose', 'No transpose', M-J-JB+1, + $ N-J-JB+1, JB, -ONE, A( J+JB, J ), LDA, + $ A( J, J+JB ), LDA, ONE, A( J+JB, J+JB ), + $ LDA ) + END IF + END IF + 20 CONTINUE + END IF + + RETURN + + END SUBROUTINE DGETRF_HELPER +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DGETRF_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DGETRI_HELPER.f b/Source/Modules/LAPACK/LAPACK_DGETRI_HELPER.f new file mode 100644 index 00000000..1436ee48 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DGETRI_HELPER.f @@ -0,0 +1,116 @@ +! ################################################################################################################################## + + MODULE LAPACK_DGETRI_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DGETRI_HELPER( N, A, LDA, IPIV, WORK, LWORK, INFO ) + + INTEGER INFO, LDA, LWORK, N + INTEGER IPIV( * ) + DOUBLE PRECISION A( LDA, * ), WORK( * ) + + DOUBLE PRECISION ZERO, ONE + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) + + LOGICAL LQUERY + INTEGER I, IWS, J, JB, JJ, JP, LDWORK, LWKOPT, NB, + $ NBMIN, NN + + INTEGER ILAENV + EXTERNAL ILAENV + EXTERNAL DTRTRI, XERBLA + + INTRINSIC MAX, MIN + + INFO = 0 + NB = ILAENV( 1, 'DGETRI', ' ', N, -1, -1, -1 ) + LWKOPT = N*NB + WORK( 1 ) = LWKOPT + LQUERY = ( LWORK.EQ.-1 ) + IF( N.LT.0 ) THEN + INFO = -1 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -3 + ELSE IF( LWORK.LT.MAX( 1, N ) .AND. .NOT.LQUERY ) THEN + INFO = -6 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGETRI', -INFO ) + RETURN + ELSE IF( LQUERY ) THEN + RETURN + END IF + + IF( N.EQ.0 ) + $ RETURN + + CALL DTRTRI( 'Upper', 'Non-unit', N, A, LDA, INFO ) + IF( INFO.GT.0 ) + $ RETURN + + NBMIN = 2 + LDWORK = N + IF( NB.GT.1 .AND. NB.LT.N ) THEN + IWS = MAX( LDWORK*NB, 1 ) + IF( LWORK.LT.IWS ) THEN + NB = LWORK / LDWORK + NBMIN = MAX( 2, ILAENV( 2, 'DGETRI', ' ', N, -1, -1, -1 ) ) + END IF + ELSE + IWS = N + END IF + + IF( NB.LT.NBMIN .OR. NB.GE.N ) THEN + + DO 20 J = N, 1, -1 + DO 10 I = J + 1, N + WORK( I ) = A( I, J ) + A( I, J ) = ZERO + 10 CONTINUE + + IF( J.LT.N ) + $ CALL DGEMV( 'No transpose', N, N-J, -ONE, A( 1, J+1 ), + $ LDA, WORK( J+1 ), 1, ONE, A( 1, J ), 1 ) + 20 CONTINUE + ELSE + + NN = ( ( N-1 ) / NB )*NB + 1 + DO 50 J = NN, 1, -NB + JB = MIN( NB, N-J+1 ) + + DO 40 JJ = J, J + JB - 1 + DO 30 I = JJ + 1, N + WORK( I+( JJ-J )*LDWORK ) = A( I, JJ ) + A( I, JJ ) = ZERO + 30 CONTINUE + 40 CONTINUE + + IF( J+JB.LE.N ) + $ CALL DGEMM( 'No transpose', 'No transpose', N, JB, + $ N-J-JB+1, -ONE, A( 1, J+JB ), LDA, + $ WORK( J+JB ), LDWORK, ONE, A( 1, J ), LDA ) + CALL DTRSM( 'Right', 'Lower', 'No transpose', 'Unit', N, JB, + $ ONE, WORK( J ), LDWORK, A( 1, J ), LDA ) + 50 CONTINUE + END IF + + DO 60 J = N - 1, 1, -1 + JP = IPIV( J ) + IF( JP.NE.J ) + $ CALL DSWAP( N, A( 1, J ), 1, A( 1, JP ), 1 ) + 60 CONTINUE + + WORK( 1 ) = IWS + RETURN + + END SUBROUTINE DGETRI_HELPER +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DGETRI_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DGETRS_HELPER.f b/Source/Modules/LAPACK/LAPACK_DGETRS_HELPER.f new file mode 100644 index 00000000..91de4521 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DGETRS_HELPER.f @@ -0,0 +1,78 @@ +! ################################################################################################################################## + + MODULE LAPACK_DGETRS_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DGETRS_HELPER( TRANS, N, NRHS, A, LDA, IPIV, B, LDB, + $ INFO ) + + CHARACTER TRANS + INTEGER INFO, LDA, LDB, N, NRHS + INTEGER IPIV( * ) + REAL(DOUBLE) A( LDA, * ), B( LDB, * ) + + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D+0 ) + + LOGICAL NOTRAN + + LOGICAL LSAME + EXTERNAL LSAME + + INTRINSIC MAX + + INFO = 0 + NOTRAN = LSAME( TRANS, 'N' ) + IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) .AND. .NOT. + $ LSAME( TRANS, 'C' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( NRHS.LT.0 ) THEN + INFO = -3 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -5 + ELSE IF( LDB.LT.MAX( 1, N ) ) THEN + INFO = -8 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGETRS', -INFO ) + RETURN + END IF + + IF( N.EQ.0 .OR. NRHS.EQ.0 ) + $ RETURN + + IF( NOTRAN ) THEN + + CALL DLASWP( NRHS, B, LDB, 1, N, IPIV, 1 ) + + CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', N, NRHS, + $ ONE, A, LDA, B, LDB ) + + CALL DTRSM( 'Left', 'Upper', 'No transpose', 'Non-unit', N, + $ NRHS, ONE, A, LDA, B, LDB ) + ELSE + + CALL DTRSM( 'Left', 'Upper', 'Transpose', 'Non-unit', N, NRHS, + $ ONE, A, LDA, B, LDB ) + + CALL DTRSM( 'Left', 'Lower', 'Transpose', 'Unit', N, NRHS, ONE, + $ A, LDA, B, LDB ) + + CALL DLASWP( NRHS, B, LDB, 1, N, IPIV, -1 ) + END IF + + RETURN + + END SUBROUTINE DGETRS_HELPER +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DGETRS_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DISNAN_HELPER.f b/Source/Modules/LAPACK/LAPACK_DISNAN_HELPER.f new file mode 100644 index 00000000..e1a7a9f2 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DISNAN_HELPER.f @@ -0,0 +1,27 @@ +! ################################################################################################################################## + + MODULE LAPACK_DISNAN_HELPER + + CONTAINS + +! ################################################################################################################################## + + LOGICAL FUNCTION DLAISNAN_HELPER(DIN1,DIN2) + + DOUBLE PRECISION DIN1,DIN2 + + DLAISNAN_HELPER = (DIN1.NE.DIN2) + RETURN + END FUNCTION DLAISNAN_HELPER + +! ################################################################################################################################## + + LOGICAL FUNCTION DISNAN_HELPER(DIN) + + DOUBLE PRECISION DIN + + DISNAN_HELPER = DLAISNAN_HELPER(DIN,DIN) + RETURN + END FUNCTION DISNAN_HELPER + + END MODULE LAPACK_DISNAN_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLABAD_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLABAD_HELPER.f new file mode 100644 index 00000000..8c21faba --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLABAD_HELPER.f @@ -0,0 +1,26 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLABAD_HELPER + + USE PENTIUM_II_KIND, ONLY : DOUBLE + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DLABAD_HELPER( SMALL, LARGE ) + + REAL(DOUBLE) LARGE, SMALL + + INTRINSIC LOG10, SQRT + + IF( LOG10( LARGE ).GT.2000.D0 ) THEN + SMALL = SQRT( SMALL ) + LARGE = SQRT( LARGE ) + END IF + + RETURN + + END SUBROUTINE DLABAD_HELPER + + END MODULE LAPACK_DLABAD_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLACON_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLACON_HELPER.f90 new file mode 100644 index 00000000..4342f74c --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLACON_HELPER.f90 @@ -0,0 +1,117 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLACON_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLACON_HELPER( N, V, X, ISGN, EST, KASE, itmax ) + +INTEGER KASE, N +INTEGER itmax +REAL(DOUBLE) EST +INTEGER ISGN( * ) +REAL(DOUBLE) V( * ), X( * ) +REAL(DOUBLE) ZERO, ONE, TWO +PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0 ) +INTEGER I, ITER, J, JLAST, JUMP +REAL(DOUBLE) ALTSGN, ESTOLD, TEMP + +INTRINSIC ABS, DBLE, NINT, SIGN +SAVE + +IF( KASE.EQ.0 ) THEN + DO 10 I = 1, N + X( I ) = ONE / DBLE( N ) +10 CONTINUE + KASE = 1 + JUMP = 1 + RETURN +END IF + +GO TO ( 20, 40, 70, 110, 140 ) JUMP + +20 CONTINUE +IF( N.EQ.1 ) THEN + V( 1 ) = X( 1 ) + EST = ABS( V( 1 ) ) + GO TO 150 +END IF +EST = DASUM( N, X, 1 ) + +DO 30 I = 1, N + X( I ) = SIGN( ONE, X( I ) ) + ISGN( I ) = NINT( X( I ) ) +30 CONTINUE +KASE = 2 +JUMP = 2 +RETURN + +40 CONTINUE +J = IDAMAX( N, X, 1 ) +ITER = 2 + +50 CONTINUE +DO 60 I = 1, N + X( I ) = ZERO +60 CONTINUE +X( J ) = ONE +KASE = 1 +JUMP = 3 +RETURN + +70 CONTINUE +CALL DCOPY( N, X, 1, V, 1 ) +ESTOLD = EST +EST = DASUM( N, V, 1 ) +DO 80 I = 1, N + IF( NINT( SIGN( ONE, X( I ) ) ).NE.ISGN( I ) ) GO TO 90 +80 CONTINUE +GO TO 120 + +90 CONTINUE +IF( EST.LE.ESTOLD ) GO TO 120 + +DO 100 I = 1, N + X( I ) = SIGN( ONE, X( I ) ) + ISGN( I ) = NINT( X( I ) ) +100 CONTINUE +KASE = 2 +JUMP = 4 +RETURN + +110 CONTINUE +JLAST = J +J = IDAMAX( N, X, 1 ) +IF( ( X( JLAST ).NE.ABS( X( J ) ) ) .AND. ( ITER.LT.ITMAX ) ) THEN + ITER = ITER + 1 + GO TO 50 +END IF + +120 CONTINUE +ALTSGN = ONE +DO 130 I = 1, N + X( I ) = ALTSGN*( ONE+DBLE( I-1 ) / DBLE( N-1 ) ) + ALTSGN = -ALTSGN +130 CONTINUE +KASE = 1 +JUMP = 5 +RETURN + +140 CONTINUE +TEMP = TWO*( DASUM( N, X, 1 ) / DBLE( 3*N ) ) +IF( TEMP.GT.EST ) THEN + CALL DCOPY( N, X, 1, V, 1 ) + EST = TEMP +END IF + +150 CONTINUE +KASE = 0 +RETURN + +END SUBROUTINE DLACON_HELPER + +END MODULE LAPACK_DLACON_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLACPY_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLACPY_HELPER.f new file mode 100644 index 00000000..614cedf8 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLACPY_HELPER.f @@ -0,0 +1,45 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLACPY_HELPER + + USE PENTIUM_II_KIND, ONLY : DOUBLE + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DLACPY_HELPER( UPLO, M, N, A, LDA, B, LDB ) + + CHARACTER UPLO + INTEGER LDA, LDB, M, N + REAL(DOUBLE) A( LDA, * ), B( LDB, * ) + INTEGER I, J + LOGICAL LSAME + EXTERNAL LSAME + + INTRINSIC MIN + + IF( LSAME( UPLO, 'U' ) ) THEN + DO 20 J = 1, N + DO 10 I = 1, MIN( J, M ) + B( I, J ) = A( I, J ) + 10 CONTINUE + 20 CONTINUE + ELSE IF( LSAME( UPLO, 'L' ) ) THEN + DO 40 J = 1, N + DO 30 I = J, M + B( I, J ) = A( I, J ) + 30 CONTINUE + 40 CONTINUE + ELSE + DO 60 J = 1, N + DO 50 I = 1, M + B( I, J ) = A( I, J ) + 50 CONTINUE + 60 CONTINUE + END IF + RETURN + + END SUBROUTINE DLACPY_HELPER + + END MODULE LAPACK_DLACPY_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAE2_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLAE2_HELPER.f new file mode 100644 index 00000000..09fa094b --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAE2_HELPER.f @@ -0,0 +1,59 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLAE2_HELPER + + USE PENTIUM_II_KIND, ONLY : DOUBLE + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DLAE2_HELPER( A, B, C, RT1, RT2 ) + + REAL(DOUBLE) A, B, C, RT1, RT2 + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D0 ) + REAL(DOUBLE) TWO + PARAMETER ( TWO = 2.0D0 ) + REAL(DOUBLE) ZERO + PARAMETER ( ZERO = 0.0D0 ) + REAL(DOUBLE) HALF + PARAMETER ( HALF = 0.5D0 ) + REAL(DOUBLE) AB, ACMN, ACMX, ADF, DF, RT, SM, TB + + INTRINSIC ABS, SQRT + + SM = A + C + DF = A - C + ADF = ABS( DF ) + TB = B + B + AB = ABS( TB ) + IF( ABS( A ).GT.ABS( C ) ) THEN + ACMX = A + ACMN = C + ELSE + ACMX = C + ACMN = A + END IF + IF( ADF.GT.AB ) THEN + RT = ADF*SQRT( ONE+( AB / ADF )**2 ) + ELSE IF( ADF.LT.AB ) THEN + RT = AB*SQRT( ONE+( ADF / AB )**2 ) + ELSE + RT = AB*SQRT( TWO ) + END IF + IF( SM.LT.ZERO ) THEN + RT1 = HALF*( SM-RT ) + RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B + ELSE IF( SM.GT.ZERO ) THEN + RT1 = HALF*( SM+RT ) + RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B + ELSE + RT1 = HALF*RT + RT2 = -HALF*RT + END IF + RETURN + + END SUBROUTINE DLAE2_HELPER + + END MODULE LAPACK_DLAE2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAEV2_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLAEV2_HELPER.f new file mode 100644 index 00000000..feef259d --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAEV2_HELPER.f @@ -0,0 +1,92 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLAEV2_HELPER + + USE PENTIUM_II_KIND, ONLY : DOUBLE + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DLAEV2_HELPER( A, B, C, RT1, RT2, CS1, SN1 ) + + REAL(DOUBLE) A, B, C, CS1, RT1, RT2, SN1 + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D0 ) + REAL(DOUBLE) TWO + PARAMETER ( TWO = 2.0D0 ) + REAL(DOUBLE) ZERO + PARAMETER ( ZERO = 0.0D0 ) + REAL(DOUBLE) HALF + PARAMETER ( HALF = 0.5D0 ) + INTEGER SGN1, SGN2 + REAL(DOUBLE) AB, ACMN, ACMX, ACS, ADF, CS, CT, DF, RT, SM, + $ TB, TN + + INTRINSIC ABS, SQRT + + SM = A + C + DF = A - C + ADF = ABS( DF ) + TB = B + B + AB = ABS( TB ) + IF( ABS( A ).GT.ABS( C ) ) THEN + ACMX = A + ACMN = C + ELSE + ACMX = C + ACMN = A + END IF + IF( ADF.GT.AB ) THEN + RT = ADF*SQRT( ONE+( AB / ADF )**2 ) + ELSE IF( ADF.LT.AB ) THEN + RT = AB*SQRT( ONE+( ADF / AB )**2 ) + ELSE + RT = AB*SQRT( TWO ) + END IF + IF( SM.LT.ZERO ) THEN + RT1 = HALF*( SM-RT ) + SGN1 = -1 + RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B + ELSE IF( SM.GT.ZERO ) THEN + RT1 = HALF*( SM+RT ) + SGN1 = 1 + RT2 = ( ACMX / RT1 )*ACMN - ( B / RT1 )*B + ELSE + RT1 = HALF*RT + RT2 = -HALF*RT + SGN1 = 1 + END IF + + IF( DF.GE.ZERO ) THEN + CS = DF + RT + SGN2 = 1 + ELSE + CS = DF - RT + SGN2 = -1 + END IF + ACS = ABS( CS ) + IF( ACS.GT.AB ) THEN + CT = -TB / CS + SN1 = ONE / SQRT( ONE+CT*CT ) + CS1 = CT*SN1 + ELSE + IF( AB.EQ.ZERO ) THEN + CS1 = ONE + SN1 = ZERO + ELSE + TN = -CS / TB + CS1 = ONE / SQRT( ONE+TN*TN ) + SN1 = TN*CS1 + END IF + END IF + IF( SGN1.EQ.SGN2 ) THEN + TN = CS1 + CS1 = -SN1 + SN1 = TN + END IF + RETURN + + END SUBROUTINE DLAEV2_HELPER + + END MODULE LAPACK_DLAEV2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAGTS_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLAGTS_HELPER.f90 new file mode 100644 index 00000000..08752058 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAGTS_HELPER.f90 @@ -0,0 +1,258 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLAGTS_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLAGTS_HELPER( JOB, N, A, B, C, D, IN, Y, TOL, INFO ) + +INTEGER INFO, JOB, N +REAL(DOUBLE) TOL +INTEGER IN( * ) +REAL(DOUBLE) A( * ), B( * ), C( * ), D( * ), Y( * ) +REAL(DOUBLE) ZERO, ONE +PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) +LOGICAL NOTRAN +INTEGER K +REAL(DOUBLE) AK, PERT, SFMIN, TEMP, ABSAK, BIGNUM + +INTRINSIC ABS, SIGN +REAL(DOUBLE) DLAMCH +EXTERNAL DLAMCH + +INFO = 0 +IF( N.EQ.0 ) RETURN + +IF( JOB.LT.-2 .OR. JOB.GT.2 ) THEN + INFO = -1 + CALL XERBLA( 'DLAGTS', -INFO ) + RETURN +END IF + +IF( JOB.EQ.0 ) THEN + TOL = ZERO +END IF + +NOTRAN = JOB.LT.0 +IF( ABS(TOL).LE.ZERO ) THEN + TOL = ABS( A( 1 ) ) + IF( N.GT.1 ) TOL = MAX( TOL, ABS( A( 2 ) ), ABS( B( 1 ) ) ) + DO 10 K = 3, N + TOL = MAX( TOL, ABS( A( K ) ), ABS( B( K-1 ) ), ABS( D( K-2 ) ) ) +10 CONTINUE + TOL = TOL*DLAMCH( 'Epsilon' ) + IF( TOL.EQ.ZERO ) TOL = DLAMCH( 'Safe minimum' ) +END IF + +SFMIN = DLAMCH( 'Safe minimum' ) +BIGNUM = ONE / SFMIN + +IF( NOTRAN ) THEN + DO 20 K = 2, N + IF( IN( K-1 ).EQ.0 ) THEN + Y( K ) = Y( K ) - C( K-1 )*Y( K-1 ) + ELSE + TEMP = Y( K-1 ) + Y( K-1 ) = Y( K ) + Y( K ) = TEMP - C( K-1 )*Y( K ) + END IF +20 CONTINUE + + IF( JOB.EQ.-1 ) THEN + DO 30 K = N, 1, -1 + IF( K.LE.N-2 ) THEN + TEMP = Y( K ) - B( K )*Y( K+1 ) - D( K )*Y( K+2 ) + ELSE IF( K.EQ.N-1 ) THEN + TEMP = Y( K ) - B( K )*Y( K+1 ) + ELSE + TEMP = Y( K ) + END IF + Y( K ) = TEMP / A( K ) +30 CONTINUE + ELSE + DO 50 K = N, 1, -1 + IF( K.LE.N-2 ) THEN + TEMP = Y( K ) - B( K )*Y( K+1 ) - D( K )*Y( K+2 ) + ELSE IF( K.EQ.N-1 ) THEN + TEMP = Y( K ) - B( K )*Y( K+1 ) + ELSE + TEMP = Y( K ) + END IF + AK = A( K ) + PERT = SIGN( TOL, AK ) +40 CONTINUE + ABSAK = ABS( AK ) + IF( ABSAK.LT.ONE ) THEN + IF( ABSAK.LT.SFMIN ) THEN + IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 40 + ELSE + TEMP = TEMP*BIGNUM + AK = AK*BIGNUM + END IF + ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 40 + END IF + END IF + Y( K ) = TEMP / AK +50 CONTINUE + END IF +ELSE + IF( JOB.EQ.2 ) THEN + K = 1 + TEMP = Y( 1 ) + AK = A( 1 ) + ABSAK = ABS( AK ) + IF( ABSAK.LT.ONE ) THEN + IF( ABSAK.LT.SFMIN ) THEN + IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) THEN + INFO = K + RETURN + ELSE + TEMP = TEMP*BIGNUM + AK = AK*BIGNUM + END IF + ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN + INFO = K + RETURN + END IF + END IF + Y( 1 ) = TEMP / AK + IF( N.GE.2 ) THEN + K = 2 + TEMP = Y( 2 ) - B( 1 )*Y( 1 ) + AK = A( 2 ) + ABSAK = ABS( AK ) + IF( ABSAK.LT.ONE ) THEN + IF( ABSAK.LT.SFMIN ) THEN + IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) THEN + INFO = K + RETURN + ELSE + TEMP = TEMP*BIGNUM + AK = AK*BIGNUM + END IF + ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN + INFO = K + RETURN + END IF + END IF + Y( 2 ) = TEMP / AK + END IF + DO 60 K = 3, N + TEMP = Y( K ) - B( K-1 )*Y( K-1 ) - D( K-2 )*Y( K-2 ) + AK = A( K ) + ABSAK = ABS( AK ) + IF( ABSAK.LT.ONE ) THEN + IF( ABSAK.LT.SFMIN ) THEN + IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) THEN + INFO = K + RETURN + ELSE + TEMP = TEMP*BIGNUM + AK = AK*BIGNUM + END IF + ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN + INFO = K + RETURN + END IF + END IF + Y( K ) = TEMP / AK +60 CONTINUE + ELSE + K = 1 + TEMP = Y( 1 ) + AK = A( 1 ) + PERT = SIGN( TOL, AK ) +70 CONTINUE + ABSAK = ABS( AK ) + IF( ABSAK.LT.ONE ) THEN + IF( ABSAK.LT.SFMIN ) THEN + IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 70 + ELSE + TEMP = TEMP*BIGNUM + AK = AK*BIGNUM + END IF + ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 70 + END IF + END IF + Y( 1 ) = TEMP / AK + IF( N.GE.2 ) THEN + K = 2 + TEMP = Y( 2 ) - B( 1 )*Y( 1 ) + AK = A( 2 ) + PERT = SIGN( TOL, AK ) +75 CONTINUE + ABSAK = ABS( AK ) + IF( ABSAK.LT.ONE ) THEN + IF( ABSAK.LT.SFMIN ) THEN + IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 75 + ELSE + TEMP = TEMP*BIGNUM + AK = AK*BIGNUM + END IF + ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 75 + END IF + END IF + Y( 2 ) = TEMP / AK + END IF + DO 80 K = 3, N + TEMP = Y( K ) - B( K-1 )*Y( K-1 ) - D( K-2 )*Y( K-2 ) + AK = A( K ) + PERT = SIGN( TOL, AK ) +77 CONTINUE + ABSAK = ABS( AK ) + IF( ABSAK.LT.ONE ) THEN + IF( ABSAK.LT.SFMIN ) THEN + IF( ABSAK.EQ.ZERO .OR. ABS( TEMP )*SFMIN.GT.ABSAK ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 77 + ELSE + TEMP = TEMP*BIGNUM + AK = AK*BIGNUM + END IF + ELSE IF( ABS( TEMP ).GT.ABSAK*BIGNUM ) THEN + AK = AK + PERT + PERT = 2*PERT + GO TO 77 + END IF + END IF + Y( K ) = TEMP / AK +80 CONTINUE + END IF + + DO 90 K = N, 2, -1 + IF( IN( K-1 ).EQ.0 ) THEN + Y( K-1 ) = Y( K-1 ) - C( K-1 )*Y( K ) + ELSE + TEMP = Y( K-1 ) + Y( K-1 ) = Y( K ) + Y( K ) = TEMP - C( K-1 )*Y( K ) + END IF +90 CONTINUE +END IF + +END SUBROUTINE DLAGTS_HELPER + +END MODULE LAPACK_DLAGTS_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAN_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLAN_HELPER.f90 new file mode 100644 index 00000000..3ef05808 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAN_HELPER.f90 @@ -0,0 +1,211 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLAN_HELPER + +USE PENTIUM_II_KIND, ONLY : BYTE, DOUBLE +USE IOUNT1, ONLY : SC1 +USE PARAMS, ONLY : NOCOUNTS +USE LAPACK_DLASSQ_HELPER + +CHARACTER(1*BYTE), PARAMETER :: cr13_lba_dlan = CHAR(13) + +CONTAINS + +! ################################################################################################################################## + +DOUBLE PRECISION FUNCTION DLANSB_HELPER( NORM, UPLO, N, K, AB, LDAB, WORK ) + +CHARACTER NORM, UPLO +INTEGER K, LDAB, N +REAL(DOUBLE) AB( LDAB, * ), WORK( * ) +REAL(DOUBLE) ONE, ZERO +PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +INTEGER I, J, L +REAL(DOUBLE) ABSA, SCALE, SUM, VALUE +LOGICAL LSAME +EXTERNAL LSAME +INTRINSIC ABS, MAX, MIN, SQRT + +IF( N.EQ.0 ) THEN + VALUE = ZERO +ELSE IF( LSAME( NORM, 'M' ) ) THEN + VALUE = ZERO + IF( LSAME( UPLO, 'U' ) ) THEN + DO 20 J = 1, N + IF (NOCOUNTS .NE. 'Y') THEN + WRITE(SC1,12345,ADVANCE='NO') J, N, cr13_lba_dlan + ENDIF + DO 10 I = MAX( K+2-J, 1 ), K + 1 + VALUE = MAX( VALUE, ABS( AB( I, J ) ) ) +10 CONTINUE +20 CONTINUE + ELSE + DO 40 J = 1, N + IF (NOCOUNTS .NE. 'Y') THEN + WRITE(SC1,12345,ADVANCE='NO') J, N, cr13_lba_dlan + ENDIF + DO 30 I = 1, MIN( N+1-J, K+1 ) + VALUE = MAX( VALUE, ABS( AB( I, J ) ) ) +30 CONTINUE +40 CONTINUE + END IF +ELSE IF( ( LSAME( NORM, 'I' ) ) .OR. ( LSAME( NORM, 'O' ) ) .OR. ( NORM.EQ.'1' ) ) THEN + VALUE = ZERO + IF( LSAME( UPLO, 'U' ) ) THEN + DO 60 J = 1, N + IF (NOCOUNTS .NE. 'Y') THEN + WRITE(SC1,12345,ADVANCE='NO') J, N, cr13_lba_dlan + ENDIF + SUM = ZERO + L = K + 1 - J + DO 50 I = MAX( 1, J-K ), J - 1 + ABSA = ABS( AB( L+I, J ) ) + SUM = SUM + ABSA + WORK( I ) = WORK( I ) + ABSA +50 CONTINUE + WORK( J ) = SUM + ABS( AB( K+1, J ) ) +60 CONTINUE + DO 70 I = 1, N + IF (NOCOUNTS .NE. 'Y') THEN + WRITE(SC1,12345,ADVANCE='NO') I, N, cr13_lba_dlan + ENDIF + VALUE = MAX( VALUE, WORK( I ) ) +70 CONTINUE + ELSE + DO 80 I = 1, N + WORK( I ) = ZERO +80 CONTINUE + DO 100 J = 1, N + IF (NOCOUNTS .NE. 'Y') THEN + WRITE(SC1,12345,ADVANCE='NO') J, N, cr13_lba_dlan + ENDIF + SUM = WORK( J ) + ABS( AB( 1, J ) ) + L = 1 - J + DO 90 I = J + 1, MIN( N, J+K ) + ABSA = ABS( AB( L+I, J ) ) + SUM = SUM + ABSA + WORK( I ) = WORK( I ) + ABSA +90 CONTINUE + VALUE = MAX( VALUE, SUM ) +100 CONTINUE + END IF +ELSE IF( ( LSAME( NORM, 'F' ) ) .OR. ( LSAME( NORM, 'E' ) ) ) THEN + SCALE = ZERO + SUM = ONE + IF( K.GT.0 ) THEN + IF( LSAME( UPLO, 'U' ) ) THEN + DO 110 J = 2, N + IF (NOCOUNTS .NE. 'Y') THEN + WRITE(SC1,12345,ADVANCE='NO') J, N, cr13_lba_dlan + ENDIF + CALL DLASSQ_HELPER( MIN( J-1, K ), AB( MAX( K+2-J, 1 ), J ), 1, SCALE, SUM ) +110 CONTINUE + L = K + 1 + ELSE + DO 120 J = 1, N - 1 + IF (NOCOUNTS .NE. 'Y') THEN + WRITE(SC1,12345,ADVANCE='NO') J, N, cr13_lba_dlan + ENDIF + CALL DLASSQ_HELPER( MIN( N-J, K ), AB( 2, J ), 1, SCALE, SUM ) +120 CONTINUE + L = 1 + END IF + SUM = 2*SUM + ELSE + L = 1 + END IF + CALL DLASSQ_HELPER( N, AB( L, 1 ), LDAB, SCALE, SUM ) + VALUE = SCALE*SQRT( SUM ) +END IF + +DLANSB_HELPER = VALUE +RETURN + +12345 FORMAT(5X,'Row ',I8,' of ',I8, A) + +END FUNCTION DLANSB_HELPER + +! ################################################################################################################################## + +DOUBLE PRECISION FUNCTION DLANSY_HELPER( NORM, UPLO, N, A, LDA, WORK ) + +CHARACTER NORM, UPLO +INTEGER LDA, N +REAL(DOUBLE) A( LDA, * ), WORK( * ) +REAL(DOUBLE) ONE, ZERO +PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +INTEGER I, J +REAL(DOUBLE) ABSA, SCALE, SUM, VALUE +LOGICAL LSAME +EXTERNAL LSAME +INTRINSIC ABS, MAX, SQRT + +IF( N.EQ.0 ) THEN + VALUE = ZERO +ELSE IF( LSAME( NORM, 'M' ) ) THEN + VALUE = ZERO + IF( LSAME( UPLO, 'U' ) ) THEN + DO 20 J = 1, N + DO 10 I = 1, J + VALUE = MAX( VALUE, ABS( A( I, J ) ) ) +10 CONTINUE +20 CONTINUE + ELSE + DO 40 J = 1, N + DO 30 I = J, N + VALUE = MAX( VALUE, ABS( A( I, J ) ) ) +30 CONTINUE +40 CONTINUE + END IF +ELSE IF( ( LSAME( NORM, 'I' ) ) .OR. ( LSAME( NORM, 'O' ) ) .OR. ( NORM.EQ.'1' ) ) THEN + VALUE = ZERO + IF( LSAME( UPLO, 'U' ) ) THEN + DO 60 J = 1, N + SUM = ZERO + DO 50 I = 1, J - 1 + ABSA = ABS( A( I, J ) ) + SUM = SUM + ABSA + WORK( I ) = WORK( I ) + ABSA +50 CONTINUE + WORK( J ) = SUM + ABS( A( J, J ) ) +60 CONTINUE + DO 70 I = 1, N + VALUE = MAX( VALUE, WORK( I ) ) +70 CONTINUE + ELSE + DO 80 I = 1, N + WORK( I ) = ZERO +80 CONTINUE + DO 100 J = 1, N + SUM = WORK( J ) + ABS( A( J, J ) ) + DO 90 I = J + 1, N + ABSA = ABS( A( I, J ) ) + SUM = SUM + ABSA + WORK( I ) = WORK( I ) + ABSA +90 CONTINUE + VALUE = MAX( VALUE, SUM ) +100 CONTINUE + END IF +ELSE IF( ( LSAME( NORM, 'F' ) ) .OR. ( LSAME( NORM, 'E' ) ) ) THEN + SCALE = ZERO + SUM = ONE + IF( LSAME( UPLO, 'U' ) ) THEN + DO 110 J = 2, N + CALL DLASSQ_HELPER( J-1, A( 1, J ), 1, SCALE, SUM ) +110 CONTINUE + ELSE + DO 120 J = 1, N - 1 + CALL DLASSQ_HELPER( N-J, A( J+1, J ), 1, SCALE, SUM ) +120 CONTINUE + END IF + SUM = 2*SUM + CALL DLASSQ_HELPER( N, A, LDA+1, SCALE, SUM ) + VALUE = SCALE*SQRT( SUM ) +END IF + +DLANSY_HELPER = VALUE +RETURN + +END FUNCTION DLANSY_HELPER + +END MODULE LAPACK_DLAN_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAPY2_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLAPY2_HELPER.f new file mode 100644 index 00000000..76af5946 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAPY2_HELPER.f @@ -0,0 +1,35 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLAPY2_HELPER + + USE PENTIUM_II_KIND, ONLY : DOUBLE + + CONTAINS + +! ################################################################################################################################## + + DOUBLE PRECISION FUNCTION DLAPY2_HELPER( X, Y ) + + REAL(DOUBLE) X, Y + REAL(DOUBLE) ZERO + PARAMETER ( ZERO = 0.0D0 ) + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D0 ) + REAL(DOUBLE) W, XABS, YABS, Z + + INTRINSIC ABS, MAX, MIN, SQRT + + XABS = ABS( X ) + YABS = ABS( Y ) + W = MAX( XABS, YABS ) + Z = MIN( XABS, YABS ) + IF( Z.EQ.ZERO ) THEN + DLAPY2_HELPER = W + ELSE + DLAPY2_HELPER = W*SQRT( ONE+( Z / W )**2 ) + END IF + RETURN + + END FUNCTION DLAPY2_HELPER + + END MODULE LAPACK_DLAPY2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLARFB_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLARFB_HELPER.f90 new file mode 100644 index 00000000..d82c9463 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLARFB_HELPER.f90 @@ -0,0 +1,302 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLARFB_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLARFB_HELPER( SIDE, TRANS, DIRECT, STOREV, M, N, K, V, LDV, & + T, LDT, C, LDC, WORK, LDWORK ) + +CHARACTER SIDE, TRANS, DIRECT, STOREV +INTEGER M, N, K, LDV, LDT, LDC, LDWORK +REAL(DOUBLE) V( LDV, * ), T( LDT, * ), C( LDC, * ), WORK( LDWORK, * ) +REAL(DOUBLE) ONE +PARAMETER ( ONE = 1.0D+0 ) +CHARACTER TRANST +INTEGER I, J +LOGICAL LSAME +EXTERNAL LSAME +EXTERNAL DCOPY, DTRMM, DGEMM + +IF( M.LE.0 .OR. N.LE.0 ) RETURN + +IF( LSAME( TRANS, 'N' ) ) THEN + TRANST = 'T' +ELSE + TRANST = 'N' +END IF + +IF( LSAME( STOREV, 'C' ) ) THEN + + IF( LSAME( DIRECT, 'F' ) ) THEN + + IF( LSAME( SIDE, 'L' ) ) THEN + + DO 10 J = 1, K + CALL DCOPY( N, C( J, 1 ), LDC, WORK( 1, J ), 1 ) +10 CONTINUE + + CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', N, & + K, ONE, V, LDV, WORK, LDWORK ) + IF( M.GT.K ) THEN + CALL DGEMM( 'Transpose', 'No transpose', N, K, M-K, & + ONE, C( K+1, 1 ), LDC, V( K+1, 1 ), LDV, & + ONE, WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Upper', TRANST, 'Non-unit', N, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( M.GT.K ) THEN + CALL DGEMM( 'No transpose', 'Transpose', M-K, N, K, & + -ONE, V( K+1, 1 ), LDV, WORK, LDWORK, ONE, & + C( K+1, 1 ), LDC ) + END IF + + CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', N, K, & + ONE, V, LDV, WORK, LDWORK ) + + DO 30 J = 1, K + DO 20 I = 1, N + C( J, I ) = C( J, I ) - WORK( I, J ) +20 CONTINUE +30 CONTINUE + + ELSE IF( LSAME( SIDE, 'R' ) ) THEN + + DO 40 J = 1, K + CALL DCOPY( M, C( 1, J ), 1, WORK( 1, J ), 1 ) +40 CONTINUE + + CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', M, & + K, ONE, V, LDV, WORK, LDWORK ) + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'No transpose', M, K, N-K, & + ONE, C( 1, K+1 ), LDC, V( K+1, 1 ), LDV, & + ONE, WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Upper', TRANS, 'Non-unit', M, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'Transpose', M, N-K, K, & + -ONE, WORK, LDWORK, V( K+1, 1 ), LDV, ONE, & + C( 1, K+1 ), LDC ) + END IF + + CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', M, K, & + ONE, V, LDV, WORK, LDWORK ) + + DO 60 J = 1, K + DO 50 I = 1, M + C( I, J ) = C( I, J ) - WORK( I, J ) +50 CONTINUE +60 CONTINUE + END IF + + ELSE + + IF( LSAME( SIDE, 'L' ) ) THEN + + DO 70 J = 1, K + CALL DCOPY( N, C( M-K+J, 1 ), LDC, WORK( 1, J ), 1 ) +70 CONTINUE + + CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', N, & + K, ONE, V( M-K+1, 1 ), LDV, WORK, LDWORK ) + IF( M.GT.K ) THEN + CALL DGEMM( 'Transpose', 'No transpose', N, K, M-K, & + ONE, C, LDC, V, LDV, ONE, WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Lower', TRANST, 'Non-unit', N, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( M.GT.K ) THEN + CALL DGEMM( 'No transpose', 'Transpose', M-K, N, K, & + -ONE, V, LDV, WORK, LDWORK, ONE, C, LDC ) + END IF + + CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', N, K, & + ONE, V( M-K+1, 1 ), LDV, WORK, LDWORK ) + + DO 90 J = 1, K + DO 80 I = 1, N + C( M-K+J, I ) = C( M-K+J, I ) - WORK( I, J ) +80 CONTINUE +90 CONTINUE + + ELSE IF( LSAME( SIDE, 'R' ) ) THEN + + DO 100 J = 1, K + CALL DCOPY( M, C( 1, N-K+J ), 1, WORK( 1, J ), 1 ) +100 CONTINUE + + CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', M, & + K, ONE, V( N-K+1, 1 ), LDV, WORK, LDWORK ) + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'No transpose', M, K, N-K, & + ONE, C, LDC, V, LDV, ONE, WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Lower', TRANS, 'Non-unit', M, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'Transpose', M, N-K, K, & + -ONE, WORK, LDWORK, V, LDV, ONE, C, LDC ) + END IF + + CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', M, K, & + ONE, V( N-K+1, 1 ), LDV, WORK, LDWORK ) + + DO 120 J = 1, K + DO 110 I = 1, M + C( I, N-K+J ) = C( I, N-K+J ) - WORK( I, J ) +110 CONTINUE +120 CONTINUE + END IF + END IF + +ELSE IF( LSAME( STOREV, 'R' ) ) THEN + + IF( LSAME( DIRECT, 'F' ) ) THEN + + IF( LSAME( SIDE, 'L' ) ) THEN + + DO 130 J = 1, K + CALL DCOPY( N, C( J, 1 ), LDC, WORK( 1, J ), 1 ) +130 CONTINUE + + CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', N, K, & + ONE, V, LDV, WORK, LDWORK ) + IF( M.GT.K ) THEN + CALL DGEMM( 'Transpose', 'Transpose', N, K, M-K, ONE, & + C( K+1, 1 ), LDC, V( 1, K+1 ), LDV, ONE, & + WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Upper', TRANST, 'Non-unit', N, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( M.GT.K ) THEN + CALL DGEMM( 'Transpose', 'Transpose', M-K, N, K, -ONE, & + V( 1, K+1 ), LDV, WORK, LDWORK, ONE, & + C( K+1, 1 ), LDC ) + END IF + + CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', N, K, & + ONE, V, LDV, WORK, LDWORK ) + + DO 150 J = 1, K + DO 140 I = 1, N + C( J, I ) = C( J, I ) - WORK( I, J ) +140 CONTINUE +150 CONTINUE + + ELSE IF( LSAME( SIDE, 'R' ) ) THEN + + DO 160 J = 1, K + CALL DCOPY( M, C( 1, J ), 1, WORK( 1, J ), 1 ) +160 CONTINUE + + CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Unit', M, K, & + ONE, V, LDV, WORK, LDWORK ) + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'Transpose', M, K, N-K, & + ONE, C( 1, K+1 ), LDC, V( 1, K+1 ), LDV, & + ONE, WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Upper', TRANS, 'Non-unit', M, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'No transpose', M, N-K, K, & + -ONE, WORK, LDWORK, V( 1, K+1 ), LDV, ONE, & + C( 1, K+1 ), LDC ) + END IF + + CALL DTRMM( 'Right', 'Upper', 'No transpose', 'Unit', M, K, & + ONE, V, LDV, WORK, LDWORK ) + + DO 180 J = 1, K + DO 170 I = 1, M + C( I, J ) = C( I, J ) - WORK( I, J ) +170 CONTINUE +180 CONTINUE + END IF + + ELSE + + IF( LSAME( SIDE, 'L' ) ) THEN + + DO 190 J = 1, K + CALL DCOPY( N, C( M-K+J, 1 ), LDC, WORK( 1, J ), 1 ) +190 CONTINUE + + CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', N, K, & + ONE, V( 1, M-K+1 ), LDV, WORK, LDWORK ) + IF( M.GT.K ) THEN + CALL DGEMM( 'Transpose', 'Transpose', N, K, M-K, ONE, & + C, LDC, V, LDV, ONE, WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Lower', TRANST, 'Non-unit', N, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( M.GT.K ) THEN + CALL DGEMM( 'Transpose', 'Transpose', M-K, N, K, -ONE, & + V, LDV, WORK, LDWORK, ONE, C, LDC ) + END IF + + CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', N, K, & + ONE, V( 1, M-K+1 ), LDV, WORK, LDWORK ) + + DO 210 J = 1, K + DO 200 I = 1, N + C( M-K+J, I ) = C( M-K+J, I ) - WORK( I, J ) +200 CONTINUE +210 CONTINUE + + ELSE IF( LSAME( SIDE, 'R' ) ) THEN + + DO 220 J = 1, K + CALL DCOPY( M, C( 1, N-K+J ), 1, WORK( 1, J ), 1 ) +220 CONTINUE + + CALL DTRMM( 'Right', 'Lower', 'Transpose', 'Unit', M, K, & + ONE, V( 1, N-K+1 ), LDV, WORK, LDWORK ) + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'Transpose', M, K, N-K, & + ONE, C, LDC, V, LDV, ONE, WORK, LDWORK ) + END IF + + CALL DTRMM( 'Right', 'Lower', TRANS, 'Non-unit', M, K, & + ONE, T, LDT, WORK, LDWORK ) + + IF( N.GT.K ) THEN + CALL DGEMM( 'No transpose', 'No transpose', M, N-K, K, & + -ONE, WORK, LDWORK, V, LDV, ONE, C, LDC ) + END IF + + CALL DTRMM( 'Right', 'Lower', 'No transpose', 'Unit', M, K, & + ONE, V( 1, N-K+1 ), LDV, WORK, LDWORK ) + + DO 240 J = 1, K + DO 230 I = 1, M + C( I, N-K+J ) = C( I, N-K+J ) - WORK( I, J ) +230 CONTINUE +240 CONTINUE + END IF + END IF +END IF + +END SUBROUTINE DLARFB_HELPER + +END MODULE LAPACK_DLARFB_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLARFG_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLARFG_HELPER.f90 new file mode 100644 index 00000000..69c90715 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLARFG_HELPER.f90 @@ -0,0 +1,67 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLARFG_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE +USE LAPACK_DLAPY2_HELPER, ONLY : DLAPY2_HELPER + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLARFG_HELPER( N, ALPHA, X, INCX, TAU ) + +INTEGER INCX, N +REAL(DOUBLE) ALPHA, TAU +REAL(DOUBLE) X( * ) +REAL(DOUBLE) ONE, ZERO +PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +INTEGER J, KNT +REAL(DOUBLE) BETA, RSAFMN, SAFMIN, XNORM + +INTRINSIC ABS, SIGN +REAL(DOUBLE) DLAMCH, DNRM2 +EXTERNAL DLAMCH, DNRM2 +EXTERNAL DSCAL + +IF( N.LE.1 ) THEN + TAU = ZERO + RETURN +END IF + +XNORM = DNRM2( N-1, X, INCX ) + +IF( XNORM.EQ.ZERO ) THEN + TAU = ZERO +ELSE + BETA = -SIGN( DLAPY2_HELPER( ALPHA, XNORM ), ALPHA ) + SAFMIN = DLAMCH( 'S' ) / DLAMCH( 'E' ) + IF( ABS( BETA ).LT.SAFMIN ) THEN + RSAFMN = ONE / SAFMIN + KNT = 0 +10 CONTINUE + KNT = KNT + 1 + CALL DSCAL( N-1, RSAFMN, X, INCX ) + BETA = BETA*RSAFMN + ALPHA = ALPHA*RSAFMN + IF( ABS( BETA ).LT.SAFMIN ) GO TO 10 + + XNORM = DNRM2( N-1, X, INCX ) + BETA = -SIGN( DLAPY2_HELPER( ALPHA, XNORM ), ALPHA ) + TAU = ( BETA-ALPHA ) / BETA + CALL DSCAL( N-1, ONE / ( ALPHA-BETA ), X, INCX ) + + ALPHA = BETA + DO 20 J = 1, KNT + ALPHA = ALPHA*SAFMIN +20 CONTINUE + ELSE + TAU = ( BETA-ALPHA ) / BETA + CALL DSCAL( N-1, ONE / ( ALPHA-BETA ), X, INCX ) + ALPHA = BETA + END IF +END IF + +END SUBROUTINE DLARFG_HELPER + +END MODULE LAPACK_DLARFG_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLARFT_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLARFT_HELPER.f90 new file mode 100644 index 00000000..03b955fa --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLARFT_HELPER.f90 @@ -0,0 +1,83 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLARFT_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLARFT_HELPER( DIRECT, STOREV, N, K, V, LDV, TAU, T, LDT ) + +CHARACTER DIRECT, STOREV +INTEGER K, LDT, LDV, N +REAL(DOUBLE) T( LDT, * ), TAU( * ), V( LDV, * ) +REAL(DOUBLE) ONE, ZERO +PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +INTEGER I, J +REAL(DOUBLE) VII +LOGICAL LSAME +EXTERNAL LSAME +EXTERNAL DGEMV, DTRMV + +IF( N.EQ.0 ) RETURN + +IF( LSAME( DIRECT, 'F' ) ) THEN + DO 20 I = 1, K + IF( TAU( I ).EQ.ZERO ) THEN + DO 10 J = 1, I + T( J, I ) = ZERO +10 CONTINUE + ELSE + VII = V( I, I ) + V( I, I ) = ONE + IF( LSAME( STOREV, 'C' ) ) THEN + CALL DGEMV( 'Transpose', N-I+1, I-1, -TAU( I ), & + V( I, 1 ), LDV, V( I, I ), 1, ZERO, & + T( 1, I ), 1 ) + ELSE + CALL DGEMV( 'No transpose', I-1, N-I+1, -TAU( I ), & + V( 1, I ), LDV, V( I, I ), LDV, ZERO, & + T( 1, I ), 1 ) + END IF + V( I, I ) = VII + CALL DTRMV( 'Upper', 'No transpose', 'Non-unit', I-1, T, & + LDT, T( 1, I ), 1 ) + T( I, I ) = TAU( I ) + END IF +20 CONTINUE +ELSE + DO 40 I = K, 1, -1 + IF( TAU( I ).EQ.ZERO ) THEN + DO 30 J = I, K + T( J, I ) = ZERO +30 CONTINUE + ELSE + IF( I.LT.K ) THEN + IF( LSAME( STOREV, 'C' ) ) THEN + VII = V( N-K+I, I ) + V( N-K+I, I ) = ONE + CALL DGEMV( 'Transpose', N-K+I, K-I, -TAU( I ), & + V( 1, I+1 ), LDV, V( 1, I ), 1, ZERO, & + T( I+1, I ), 1 ) + V( N-K+I, I ) = VII + ELSE + VII = V( I, N-K+I ) + V( I, N-K+I ) = ONE + CALL DGEMV( 'No transpose', K-I, N-K+I, -TAU( I ), & + V( I+1, 1 ), LDV, V( I, 1 ), LDV, ZERO, & + T( I+1, I ), 1 ) + V( I, N-K+I ) = VII + END IF + CALL DTRMV( 'Lower', 'No transpose', 'Non-unit', K-I, & + T( I+1, I+1 ), LDT, T( I+1, I ), 1 ) + END IF + T( I, I ) = TAU( I ) + END IF +40 CONTINUE +END IF + +END SUBROUTINE DLARFT_HELPER + +END MODULE LAPACK_DLARFT_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLARF_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLARF_HELPER.f90 new file mode 100644 index 00000000..2a4a0911 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLARF_HELPER.f90 @@ -0,0 +1,39 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLARF_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLARF_HELPER( SIDE, M, N, V, INCV, TAU, C, LDC, WORK ) + +CHARACTER SIDE +INTEGER INCV, LDC, M, N +REAL(DOUBLE) TAU +REAL(DOUBLE) C( LDC, * ), V( * ), WORK( * ) +REAL(DOUBLE) ONE, ZERO +PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +LOGICAL LSAME +EXTERNAL LSAME +EXTERNAL DGEMV, DGER + +IF( LSAME( SIDE, 'L' ) ) THEN + IF( TAU.NE.ZERO ) THEN + CALL DGEMV( 'Transpose', M, N, ONE, C, LDC, V, INCV, ZERO, & + WORK, 1 ) + CALL DGER( M, N, -TAU, V, INCV, WORK, 1, C, LDC ) + END IF +ELSE + IF( TAU.NE.ZERO ) THEN + CALL DGEMV( 'No transpose', M, N, ONE, C, LDC, V, INCV, ZERO, & + WORK, 1 ) + CALL DGER( M, N, -TAU, WORK, 1, V, INCV, C, LDC ) + END IF +END IF + +END SUBROUTINE DLARF_HELPER + +END MODULE LAPACK_DLARF_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLARTG_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLARTG_HELPER.f90 new file mode 100644 index 00000000..edc2ba89 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLARTG_HELPER.f90 @@ -0,0 +1,93 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLARTG_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLARTG_HELPER( F, G, CS, SN, R ) + +REAL(DOUBLE) CS, F, G, R, SN +REAL(DOUBLE) ZERO +PARAMETER ( ZERO = 0.0D0 ) +REAL(DOUBLE) ONE +PARAMETER ( ONE = 1.0D0 ) +REAL(DOUBLE) TWO +PARAMETER ( TWO = 2.0D0 ) +LOGICAL FIRST +INTEGER COUNT, I +REAL(DOUBLE) EPS, F1, G1, SAFMIN, SAFMN2, SAFMX2, SCALE + +REAL(DOUBLE) DLAMCH +EXTERNAL DLAMCH +INTRINSIC ABS, INT, LOG, MAX, SQRT + +SAVE FIRST, SAFMX2, SAFMIN, SAFMN2 +DATA FIRST / .TRUE. / + +IF( FIRST ) THEN + FIRST = .FALSE. + SAFMIN = DLAMCH( 'S' ) + EPS = DLAMCH( 'E' ) + SAFMN2 = DLAMCH( 'B' )**INT( LOG( SAFMIN / EPS ) / & + LOG( DLAMCH( 'B' ) ) / TWO ) + SAFMX2 = ONE / SAFMN2 +END IF +IF( G.EQ.ZERO ) THEN + CS = ONE + SN = ZERO + R = F +ELSE IF( F.EQ.ZERO ) THEN + CS = ZERO + SN = ONE + R = G +ELSE + F1 = F + G1 = G + SCALE = MAX( ABS( F1 ), ABS( G1 ) ) + IF( SCALE.GE.SAFMX2 ) THEN + COUNT = 0 +10 CONTINUE + COUNT = COUNT + 1 + F1 = F1*SAFMN2 + G1 = G1*SAFMN2 + SCALE = MAX( ABS( F1 ), ABS( G1 ) ) + IF( SCALE.GE.SAFMX2 ) GO TO 10 + R = SQRT( F1**2+G1**2 ) + CS = F1 / R + SN = G1 / R + DO 20 I = 1, COUNT + R = R*SAFMX2 +20 CONTINUE + ELSE IF( SCALE.LE.SAFMN2 ) THEN + COUNT = 0 +30 CONTINUE + COUNT = COUNT + 1 + F1 = F1*SAFMX2 + G1 = G1*SAFMX2 + SCALE = MAX( ABS( F1 ), ABS( G1 ) ) + IF( SCALE.LE.SAFMN2 ) GO TO 30 + R = SQRT( F1**2+G1**2 ) + CS = F1 / R + SN = G1 / R + DO 40 I = 1, COUNT + R = R*SAFMN2 +40 CONTINUE + ELSE + R = SQRT( F1**2+G1**2 ) + CS = F1 / R + SN = G1 / R + END IF + IF( ABS( F ).GT.ABS( G ) .AND. CS.LT.ZERO ) THEN + CS = -CS + SN = -SN + R = -R + END IF +END IF + +END SUBROUTINE DLARTG_HELPER + +END MODULE LAPACK_DLARTG_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAR_ROT_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLAR_ROT_HELPER.f90 new file mode 100644 index 00000000..12873d01 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAR_ROT_HELPER.f90 @@ -0,0 +1,78 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLAR_ROT_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLARGV_HELPER( N, X, INCX, Y, INCY, C, INCC ) + +INTEGER INCC, INCX, INCY, N +REAL(DOUBLE) C( * ), X( * ), Y( * ) +REAL(DOUBLE) ZERO, ONE +PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) +INTEGER I, IC, IX, IY +REAL(DOUBLE) F, G, T, TT + +INTRINSIC ABS, SQRT + +IX = 1 +IY = 1 +IC = 1 +DO 10 I = 1, N + F = X( IX ) + G = Y( IY ) + IF( G.EQ.ZERO ) THEN + C( IC ) = ONE + ELSE IF( F.EQ.ZERO ) THEN + C( IC ) = ZERO + Y( IY ) = ONE + X( IX ) = G + ELSE IF( ABS( F ).GT.ABS( G ) ) THEN + T = G / F + TT = SQRT( ONE+T*T ) + C( IC ) = ONE / TT + Y( IY ) = T*C( IC ) + X( IX ) = F*TT + ELSE + T = F / G + TT = SQRT( ONE+T*T ) + Y( IY ) = ONE / TT + C( IC ) = T*Y( IY ) + X( IX ) = G*TT + END IF + IC = IC + INCC + IY = IY + INCY + IX = IX + INCX +10 CONTINUE + +END SUBROUTINE DLARGV_HELPER + +! ################################################################################################################################## + +SUBROUTINE DLARTV_HELPER( N, X, INCX, Y, INCY, C, S, INCC ) + +INTEGER INCC, INCX, INCY, N +REAL(DOUBLE) C( * ), S( * ), X( * ), Y( * ) +INTEGER I, IC, IX, IY +REAL(DOUBLE) XI, YI + +IX = 1 +IY = 1 +IC = 1 +DO 10 I = 1, N + XI = X( IX ) + YI = Y( IY ) + X( IX ) = C( IC )*XI + S( IC )*YI + Y( IY ) = C( IC )*YI - S( IC )*XI + IX = IX + INCX + IY = IY + INCY + IC = IC + INCC +10 CONTINUE + +END SUBROUTINE DLARTV_HELPER + +END MODULE LAPACK_DLAR_ROT_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLASCL_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLASCL_HELPER.f90 new file mode 100644 index 00000000..b1323a97 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLASCL_HELPER.f90 @@ -0,0 +1,159 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLASCL_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +SUBROUTINE DLASCL_HELPER( TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO ) + +CHARACTER TYPE +INTEGER INFO, KL, KU, LDA, M, N +REAL(DOUBLE) CFROM, CTO +REAL(DOUBLE) A( LDA, * ) +REAL(DOUBLE) ZERO, ONE +PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) +LOGICAL DONE +INTEGER I, ITYPE, J, K1, K2, K3, K4 +REAL(DOUBLE) BIGNUM, CFROM1, CFROMC, CTO1, CTOC, MUL, SMLNUM +LOGICAL LSAME +EXTERNAL LSAME +REAL(DOUBLE) DLAMCH +EXTERNAL DLAMCH +INTRINSIC ABS, MAX, MIN + +INFO = 0 + +IF( LSAME( TYPE, 'G' ) ) THEN + ITYPE = 0 +ELSE IF( LSAME( TYPE, 'L' ) ) THEN + ITYPE = 1 +ELSE IF( LSAME( TYPE, 'U' ) ) THEN + ITYPE = 2 +ELSE IF( LSAME( TYPE, 'H' ) ) THEN + ITYPE = 3 +ELSE IF( LSAME( TYPE, 'B' ) ) THEN + ITYPE = 4 +ELSE IF( LSAME( TYPE, 'Q' ) ) THEN + ITYPE = 5 +ELSE IF( LSAME( TYPE, 'Z' ) ) THEN + ITYPE = 6 +ELSE + ITYPE = -1 +END IF + +IF( ITYPE.EQ.-1 ) THEN + INFO = -1 +ELSE IF( CFROM.EQ.ZERO ) THEN + INFO = -4 +ELSE IF( M.LT.0 ) THEN + INFO = -6 +ELSE IF( N.LT.0 .OR. ( ITYPE.EQ.4 .AND. N.NE.M ) .OR. & + ( ITYPE.EQ.5 .AND. N.NE.M ) ) THEN + INFO = -7 +ELSE IF( ITYPE.LE.3 .AND. LDA.LT.MAX( 1, M ) ) THEN + INFO = -9 +ELSE IF( ITYPE.GE.4 ) THEN + IF( KL.LT.0 .OR. KL.GT.MAX( M-1, 0 ) ) THEN + INFO = -2 + ELSE IF( KU.LT.0 .OR. KU.GT.MAX( N-1, 0 ) .OR. & + ( ( ITYPE.EQ.4 .OR. ITYPE.EQ.5 ) .AND. KL.NE.KU ) ) THEN + INFO = -3 + ELSE IF( ( ITYPE.EQ.4 .AND. LDA.LT.KL+1 ) .OR. & + ( ITYPE.EQ.5 .AND. LDA.LT.KU+1 ) .OR. & + ( ITYPE.EQ.6 .AND. LDA.LT.2*KL+KU+1 ) ) THEN + INFO = -9 + END IF +END IF + +IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASCL', -INFO ) + RETURN +END IF + +IF( N.EQ.0 .OR. M.EQ.0 ) RETURN + +SMLNUM = DLAMCH( 'S' ) +BIGNUM = ONE / SMLNUM + +CFROMC = CFROM +CTOC = CTO + +10 CONTINUE +CFROM1 = CFROMC*SMLNUM +CTO1 = CTOC / BIGNUM +IF( ABS( CFROM1 ).GT.ABS( CTOC ) .AND. CTOC.NE.ZERO ) THEN + MUL = SMLNUM + DONE = .FALSE. + CFROMC = CFROM1 +ELSE IF( ABS( CTO1 ).GT.ABS( CFROMC ) ) THEN + MUL = BIGNUM + DONE = .FALSE. + CTOC = CTO1 +ELSE + MUL = CTOC / CFROMC + DONE = .TRUE. +END IF + +IF( ITYPE.EQ.0 ) THEN + DO 30 J = 1, N + DO 20 I = 1, M + A( I, J ) = A( I, J )*MUL +20 CONTINUE +30 CONTINUE +ELSE IF( ITYPE.EQ.1 ) THEN + DO 50 J = 1, N + DO 40 I = J, M + A( I, J ) = A( I, J )*MUL +40 CONTINUE +50 CONTINUE +ELSE IF( ITYPE.EQ.2 ) THEN + DO 70 J = 1, N + DO 60 I = 1, MIN( J, M ) + A( I, J ) = A( I, J )*MUL +60 CONTINUE +70 CONTINUE +ELSE IF( ITYPE.EQ.3 ) THEN + DO 90 J = 1, N + DO 80 I = 1, MIN( J+1, M ) + A( I, J ) = A( I, J )*MUL +80 CONTINUE +90 CONTINUE +ELSE IF( ITYPE.EQ.4 ) THEN + K3 = KL + 1 + K4 = N + 1 + DO 110 J = 1, N + DO 100 I = 1, MIN( K3, K4-J ) + A( I, J ) = A( I, J )*MUL +100 CONTINUE +110 CONTINUE +ELSE IF( ITYPE.EQ.5 ) THEN + K1 = KU + 2 + K3 = KU + 1 + DO 130 J = 1, N + DO 120 I = MAX( K1-J, 1 ), K3 + A( I, J ) = A( I, J )*MUL +120 CONTINUE +130 CONTINUE +ELSE IF( ITYPE.EQ.6 ) THEN + K1 = KL + KU + 2 + K2 = KL + 1 + K3 = 2*KL + KU + 1 + K4 = KL + KU + 1 + M + DO 150 J = 1, N + DO 140 I = MAX( K1-J, K2 ), MIN( K3, K4-J ) + A( I, J ) = A( I, J )*MUL +140 CONTINUE +150 CONTINUE +END IF + +IF( .NOT.DONE ) GO TO 10 + +RETURN + +END SUBROUTINE DLASCL_HELPER + +END MODULE LAPACK_DLASCL_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLASRT_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLASRT_HELPER.f new file mode 100644 index 00000000..cb2ab655 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLASRT_HELPER.f @@ -0,0 +1,175 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLASRT_HELPER + + USE PENTIUM_II_KIND, ONLY : DOUBLE + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DLASRT_HELPER( ID, N, D, INFO ) + + CHARACTER ID + INTEGER INFO, N + REAL(DOUBLE) D( * ) + INTEGER SELECT + PARAMETER ( SELECT = 20 ) + INTEGER DIR, ENDD, I, J, START, STKPNT + REAL(DOUBLE) D1, D2, D3, DMNMX, TMP + INTEGER STACK( 2, 32 ) + LOGICAL LSAME + EXTERNAL LSAME + + INFO = 0 + DIR = -1 + IF( LSAME( ID, 'D' ) ) THEN + DIR = 0 + ELSE IF( LSAME( ID, 'I' ) ) THEN + DIR = 1 + END IF + IF( DIR.EQ.-1 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASRT', -INFO ) + RETURN + END IF + + IF( N.LE.1 ) + $ RETURN + + STKPNT = 1 + STACK( 1, 1 ) = 1 + STACK( 2, 1 ) = N + 10 CONTINUE + START = STACK( 1, STKPNT ) + ENDD = STACK( 2, STKPNT ) + STKPNT = STKPNT - 1 + IF( ENDD-START.LE.SELECT .AND. ENDD-START.GT.0 ) THEN + IF( DIR.EQ.0 ) THEN + DO 30 I = START + 1, ENDD + DO 20 J = I, START + 1, -1 + IF( D( J ).GT.D( J-1 ) ) THEN + DMNMX = D( J ) + D( J ) = D( J-1 ) + D( J-1 ) = DMNMX + ELSE + GO TO 30 + END IF + 20 CONTINUE + 30 CONTINUE + ELSE + DO 50 I = START + 1, ENDD + DO 40 J = I, START + 1, -1 + IF( D( J ).LT.D( J-1 ) ) THEN + DMNMX = D( J ) + D( J ) = D( J-1 ) + D( J-1 ) = DMNMX + ELSE + GO TO 50 + END IF + 40 CONTINUE + 50 CONTINUE + END IF + + ELSE IF( ENDD-START.GT.SELECT ) THEN + D1 = D( START ) + D2 = D( ENDD ) + I = ( START+ENDD ) / 2 + D3 = D( I ) + IF( D1.LT.D2 ) THEN + IF( D3.LT.D1 ) THEN + DMNMX = D1 + ELSE IF( D3.LT.D2 ) THEN + DMNMX = D3 + ELSE + DMNMX = D2 + END IF + ELSE + IF( D3.LT.D2 ) THEN + DMNMX = D2 + ELSE IF( D3.LT.D1 ) THEN + DMNMX = D3 + ELSE + DMNMX = D1 + END IF + END IF + + IF( DIR.EQ.0 ) THEN + I = START - 1 + J = ENDD + 1 + 60 CONTINUE + 70 CONTINUE + J = J - 1 + IF( D( J ).LT.DMNMX ) + $ GO TO 70 + 80 CONTINUE + I = I + 1 + IF( D( I ).GT.DMNMX ) + $ GO TO 80 + IF( I.LT.J ) THEN + TMP = D( I ) + D( I ) = D( J ) + D( J ) = TMP + GO TO 60 + END IF + IF( J-START.GT.ENDD-J-1 ) THEN + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = START + STACK( 2, STKPNT ) = J + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = J + 1 + STACK( 2, STKPNT ) = ENDD + ELSE + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = J + 1 + STACK( 2, STKPNT ) = ENDD + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = START + STACK( 2, STKPNT ) = J + END IF + ELSE + I = START - 1 + J = ENDD + 1 + 90 CONTINUE + 100 CONTINUE + J = J - 1 + IF( D( J ).GT.DMNMX ) + $ GO TO 100 + 110 CONTINUE + I = I + 1 + IF( D( I ).LT.DMNMX ) + $ GO TO 110 + IF( I.LT.J ) THEN + TMP = D( I ) + D( I ) = D( J ) + D( J ) = TMP + GO TO 90 + END IF + IF( J-START.GT.ENDD-J-1 ) THEN + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = START + STACK( 2, STKPNT ) = J + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = J + 1 + STACK( 2, STKPNT ) = ENDD + ELSE + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = J + 1 + STACK( 2, STKPNT ) = ENDD + STKPNT = STKPNT + 1 + STACK( 1, STKPNT ) = START + STACK( 2, STKPNT ) = J + END IF + END IF + END IF + IF( STKPNT.GT.0 ) + $ GO TO 10 + RETURN + + END SUBROUTINE DLASRT_HELPER + + END MODULE LAPACK_DLASRT_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLASSQ_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLASSQ_HELPER.f new file mode 100644 index 00000000..254b943d --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLASSQ_HELPER.f @@ -0,0 +1,40 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLASSQ_HELPER + + USE PENTIUM_II_KIND, ONLY : DOUBLE + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DLASSQ_HELPER( N, X, INCX, SCALE, SUMSQ ) + + INTEGER INCX, N + REAL(DOUBLE) SCALE, SUMSQ + REAL(DOUBLE) X( * ) + REAL(DOUBLE) ZERO + PARAMETER ( ZERO = 0.0D+0 ) + INTEGER IX + REAL(DOUBLE) ABSXI + + INTRINSIC ABS + + IF( N.GT.0 ) THEN + DO 10 IX = 1, 1 + ( N-1 )*INCX, INCX + IF( X( IX ).NE.ZERO ) THEN + ABSXI = ABS( X( IX ) ) + IF( SCALE.LT.ABSXI ) THEN + SUMSQ = 1 + SUMSQ*( SCALE / ABSXI )**2 + SCALE = ABSXI + ELSE + SUMSQ = SUMSQ + ( ABSXI / SCALE )**2 + END IF + END IF + 10 CONTINUE + END IF + RETURN + + END SUBROUTINE DLASSQ_HELPER + + END MODULE LAPACK_DLASSQ_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAS_MISC_HELPER.f90 b/Source/Modules/LAPACK/LAPACK_DLAS_MISC_HELPER.f90 new file mode 100644 index 00000000..1ea0873e --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAS_MISC_HELPER.f90 @@ -0,0 +1,122 @@ +! ################################################################################################################################## + +MODULE LAPACK_DLAS_MISC_HELPER + +USE PENTIUM_II_KIND, ONLY : DOUBLE + +CONTAINS + +! ################################################################################################################################## + +DOUBLE PRECISION FUNCTION DLAMC3_HELPER( A, B ) + +REAL(DOUBLE) A, B + +DLAMC3_HELPER = A + B +RETURN + +END FUNCTION DLAMC3_HELPER + +! ################################################################################################################################## + +SUBROUTINE DLASET_HELPER( UPLO, M, N, ALPHA, BETA, A, LDA ) + +CHARACTER UPLO +INTEGER LDA, M, N +REAL(DOUBLE) ALPHA, BETA +REAL(DOUBLE) A( LDA, * ) +INTEGER I, J +LOGICAL LSAME +EXTERNAL LSAME +INTRINSIC MIN + +IF( LSAME( UPLO, 'U' ) ) THEN + DO 20 J = 2, N + DO 10 I = 1, MIN( J-1, M ) + A( I, J ) = ALPHA +10 CONTINUE +20 CONTINUE +ELSE IF( LSAME( UPLO, 'L' ) ) THEN + DO 40 J = 1, MIN( M, N ) + DO 30 I = J + 1, M + A( I, J ) = ALPHA +30 CONTINUE +40 CONTINUE +ELSE + DO 60 J = 1, N + DO 50 I = 1, M + A( I, J ) = ALPHA +50 CONTINUE +60 CONTINUE +END IF + +DO 70 I = 1, MIN( M, N ) + A( I, I ) = BETA +70 CONTINUE + +RETURN + +END SUBROUTINE DLASET_HELPER + +! ################################################################################################################################## + +SUBROUTINE DLASWP_HELPER( N, A, LDA, K1, K2, IPIV, INCX ) + +INTEGER INCX, K1, K2, LDA, N +INTEGER IPIV( * ) +REAL(DOUBLE) A( LDA, * ) +INTEGER I, I1, I2, INC, IP, IX, IX0, J, K, N32 +REAL(DOUBLE) TEMP + +IF( INCX.GT.0 ) THEN + IX0 = K1 + I1 = K1 + I2 = K2 + INC = 1 +ELSE IF( INCX.LT.0 ) THEN + IX0 = 1 + ( 1-K2 )*INCX + I1 = K2 + I2 = K1 + INC = -1 +ELSE + RETURN +END IF + +N32 = ( N / 32 )*32 +IF( N32.NE.0 ) THEN + DO 30 J = 1, N32, 32 + IX = IX0 + DO 20 I = I1, I2, INC + IP = IPIV( IX ) + IF( IP.NE.I ) THEN + DO 10 K = J, J + 31 + TEMP = A( I, K ) + A( I, K ) = A( IP, K ) + A( IP, K ) = TEMP +10 CONTINUE + END IF + IX = IX + INCX +20 CONTINUE +30 CONTINUE +END IF +IF( N32.NE.N ) THEN + N32 = N32 + 1 + IX = IX0 + DO 50 I = I1, I2, INC + IP = IPIV( IX ) + IF( IP.NE.I ) THEN + DO 40 K = N32, N + TEMP = A( I, K ) + A( I, K ) = A( IP, K ) + A( IP, K ) = TEMP +40 CONTINUE + END IF + IX = IX + INCX +50 CONTINUE +END IF + +RETURN + +END SUBROUTINE DLASWP_HELPER + +END MODULE LAPACK_DLAS_MISC_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DLAUUM_HELPER.f b/Source/Modules/LAPACK/LAPACK_DLAUUM_HELPER.f new file mode 100644 index 00000000..b8c44dda --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DLAUUM_HELPER.f @@ -0,0 +1,160 @@ +! ################################################################################################################################## + + MODULE LAPACK_DLAUUM_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DLAUUM( UPLO, N, A, LDA, INFO ) +* +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 +* + CHARACTER UPLO + INTEGER INFO, LDA, N + DOUBLE PRECISION A( LDA, * ) + DOUBLE PRECISION ONE + PARAMETER ( ONE = 1.0D+0 ) + LOGICAL UPPER + INTEGER I, IB, NB + LOGICAL LSAME + EXTERNAL LSAME + INTEGER ILAENV + EXTERNAL ILAENV + INTRINSIC MAX, MIN + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLAUUM', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) + $ RETURN + + NB = ILAENV( 1, 'DLAUUM', UPLO, N, -1, -1, -1 ) + + IF( NB.LE.1 .OR. NB.GE.N ) THEN + CALL DLAUU2( UPLO, N, A, LDA, INFO ) + ELSE + IF( UPPER ) THEN + DO 10 I = 1, N, NB + IB = MIN( NB, N-I+1 ) + CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Non-unit', + $ I-1, IB, ONE, A( I, I ), LDA, A( 1, I ), + $ LDA ) + CALL DLAUU2( 'Upper', IB, A( I, I ), LDA, INFO ) + IF( I+IB.LE.N ) THEN + CALL DGEMM( 'No transpose', 'Transpose', I-1, IB, + $ N-I-IB+1, ONE, A( 1, I+IB ), LDA, + $ A( I, I+IB ), LDA, ONE, A( 1, I ), LDA ) + CALL DSYRK( 'Upper', 'No transpose', IB, N-I-IB+1, + $ ONE, A( I, I+IB ), LDA, ONE, A( I, I ), + $ LDA ) + END IF + 10 CONTINUE + ELSE + DO 20 I = 1, N, NB + IB = MIN( NB, N-I+1 ) + CALL DTRMM( 'Left', 'Lower', 'Transpose', 'Non-unit', IB, + $ I-1, ONE, A( I, I ), LDA, A( I, 1 ), LDA ) + CALL DLAUU2( 'Lower', IB, A( I, I ), LDA, INFO ) + IF( I+IB.LE.N ) THEN + CALL DGEMM( 'Transpose', 'No transpose', IB, I-1, + $ N-I-IB+1, ONE, A( I+IB, I ), LDA, + $ A( I+IB, 1 ), LDA, ONE, A( I, 1 ), LDA ) + CALL DSYRK( 'Lower', 'Transpose', IB, N-I-IB+1, ONE, + $ A( I+IB, I ), LDA, ONE, A( I, I ), LDA ) + END IF + 20 CONTINUE + END IF + END IF + + RETURN + END SUBROUTINE DLAUUM +! --- lapack_peeloff end --- ! + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DLAUU2( UPLO, N, A, LDA, INFO ) +* +* -- LAPACK auxiliary routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 +* + CHARACTER UPLO + INTEGER INFO, LDA, N + DOUBLE PRECISION A( LDA, * ) + DOUBLE PRECISION ONE + PARAMETER ( ONE = 1.0D+0 ) + LOGICAL UPPER + INTEGER I + DOUBLE PRECISION AII + LOGICAL LSAME + EXTERNAL LSAME + INTRINSIC MAX + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLAUU2', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) + $ RETURN + + IF( UPPER ) THEN + DO 10 I = 1, N + AII = A( I, I ) + IF( I.LT.N ) THEN + A( I, I ) = DDOT( N-I+1, A( I, I ), LDA, A( I, I ), LDA ) + CALL DGEMV( 'No transpose', I-1, N-I, ONE, A( 1, I+1 ), + $ LDA, A( I, I+1 ), LDA, AII, A( 1, I ), 1 ) + ELSE + CALL DSCAL( I, AII, A( 1, I ), 1 ) + END IF + 10 CONTINUE + ELSE + DO 20 I = 1, N + AII = A( I, I ) + IF( I.LT.N ) THEN + A( I, I ) = DDOT( N-I+1, A( I, I ), 1, A( I, I ), 1 ) + CALL DGEMV( 'Transpose', N-I, I-1, ONE, A( I+1, 1 ), LDA, + $ A( I+1, I ), 1, AII, A( I, 1 ), LDA ) + ELSE + CALL DSCAL( I, AII, A( I, 1 ), LDA ) + END IF + 20 CONTINUE + END IF + + RETURN + END SUBROUTINE DLAUU2 +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DLAUUM_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DPBCON_HELPER.f b/Source/Modules/LAPACK/LAPACK_DPBCON_HELPER.f new file mode 100644 index 00000000..5dc1400e --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPBCON_HELPER.f @@ -0,0 +1,120 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPBCON_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE SCONTR, ONLY : BLNK_SUB_NAM + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPBCON( UPLO, N, KD, AB, LDAB, ANORM, RCOND, WORK, + $ IWORK, INFO, itmax, dtbsv_msg ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPBCON' + + CHARACTER UPLO + CHARACTER*1 dtbsv_msg + INTEGER INFO, KD, LDAB, N + REAL(DOUBLE) ANORM, RCOND + INTEGER IWORK( * ) + REAL(DOUBLE) AB( LDAB, * ), WORK( * ) + + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) + + LOGICAL UPPER + CHARACTER NORMIN + INTEGER IX, KASE + INTEGER iter_num, itmax + REAL(DOUBLE) AINVNM, SCALE, SCALEL, SCALEU, SMLNUM + + LOGICAL LSAME + EXTERNAL LSAME + + REAL(DOUBLE) DLAMCH + EXTERNAL DLAMCH + + INTRINSIC ABS + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N < 0 ) THEN + INFO = -2 + ELSE IF( KD < 0 ) THEN + INFO = -3 + ELSE IF( LDAB < KD+1 ) THEN + INFO = -5 + ELSE IF( ANORM < ZERO ) THEN + INFO = -6 + END IF + IF( INFO /= 0 ) THEN + CALL XERBLA( 'DPBCON', -INFO ) + GO TO 9000 + END IF + + RCOND = ZERO + IF( N == 0 ) THEN + RCOND = ONE + GO TO 9000 + ELSE IF( ANORM == ZERO ) THEN + GO TO 9000 + END IF + + SMLNUM = DLAMCH( 'Safe minimum' ) + + KASE = 0 + iter_num = 0 + NORMIN = 'N' + 10 CONTINUE + iter_num = iter_num + 1 + CALL DLACON( N, WORK( N+1 ), WORK, IWORK, AINVNM, KASE, itmax ) + IF( KASE /= 0 ) THEN + IF( UPPER ) THEN + CALL DLATBS( 'Upper', 'Transpose', 'Non-unit', NORMIN, N, + $ KD, AB, LDAB, WORK, SCALEL, WORK( 2*N+1 ), + $ INFO, iter_num, dtbsv_msg ) + NORMIN = 'Y' + CALL DLATBS( 'Upper', 'No transpose', 'Non-unit', NORMIN, N, + $ KD, AB, LDAB, WORK, SCALEU, WORK( 2*N+1 ), + $ INFO, iter_num, dtbsv_msg ) + ELSE + CALL DLATBS( 'Lower', 'No transpose', 'Non-unit', NORMIN, N, + $ KD, AB, LDAB, WORK, SCALEL, WORK( 2*N+1 ), + $ INFO, iter_num, dtbsv_msg ) + NORMIN = 'Y' + CALL DLATBS( 'Lower', 'Transpose', 'Non-unit', NORMIN, N, + $ KD, AB, LDAB, WORK, SCALEU, WORK( 2*N+1 ), + $ INFO, iter_num, dtbsv_msg ) + END IF + + SCALE = SCALEL*SCALEU + IF( SCALE /= ONE ) THEN + IX = IDAMAX( N, WORK, 1 ) + IF( SCALE < ABS( WORK( IX ) )*SMLNUM .OR. SCALE == ZERO ) + $ GO TO 20 + CALL DRSCL( N, SCALE, WORK, 1 ) + END IF + GO TO 10 + END IF + + IF( AINVNM /= ZERO ) + $ RCOND = ( ONE / AINVNM ) / ANORM + + 20 CONTINUE + + 9000 CONTINUE + + RETURN + + END SUBROUTINE DPBCON +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DPBCON_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DPBEQU_HELPER.f b/Source/Modules/LAPACK/LAPACK_DPBEQU_HELPER.f new file mode 100644 index 00000000..07013765 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPBEQU_HELPER.f @@ -0,0 +1,94 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPBEQU_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPBEQU( UPLO, N, KD, AB, LDAB, S, SCOND, AMAX, INFO ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER UPLO + INTEGER INFO, KD, LDAB, N + REAL(DOUBLE) AMAX, SCOND + REAL(DOUBLE) AB( LDAB, * ), S( * ) + + REAL(DOUBLE) ZERO, ONE + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) + + LOGICAL UPPER + INTEGER I, J + REAL(DOUBLE) SMIN + + LOGICAL LSAME + EXTERNAL LSAME + + INTRINSIC MAX, MIN, SQRT + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N < 0 ) THEN + INFO = -2 + ELSE IF( KD < 0 ) THEN + INFO = -3 + ELSE IF( LDAB < KD+1 ) THEN + INFO = -5 + END IF + IF( INFO /= 0 ) THEN + CALL XERBLA( 'DPBEQU', -INFO ) + GO TO 9000 + END IF + + IF( N == 0 ) THEN + SCOND = ONE + AMAX = ZERO + GO TO 9000 + END IF + + IF( UPPER ) THEN + J = KD + 1 + ELSE + J = 1 + END IF + + S( 1 ) = AB( J, 1 ) + SMIN = S( 1 ) + AMAX = S( 1 ) + + DO 10 I = 2, N + S( I ) = AB( J, I ) + SMIN = MIN( SMIN, S( I ) ) + AMAX = MAX( AMAX, S( I ) ) + 10 CONTINUE + + IF( SMIN <= ZERO ) THEN + DO 20 I = 1, N + IF( S( I ) <= ZERO ) THEN + INFO = I + GO TO 9000 + END IF + 20 CONTINUE + ELSE + DO 30 I = 1, N + S( I ) = ONE / SQRT( S( I ) ) + 30 CONTINUE + + SCOND = SQRT( SMIN ) / SQRT( AMAX ) + END IF + + 9000 CONTINUE + + RETURN + + END SUBROUTINE DPBEQU +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DPBEQU_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DPBSTF_HELPER.f b/Source/Modules/LAPACK/LAPACK_DPBSTF_HELPER.f new file mode 100644 index 00000000..429d4b22 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPBSTF_HELPER.f @@ -0,0 +1,110 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPBSTF_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DPBSTF_HELPER( UPLO, N, KD, AB, LDAB, INFO ) + + CHARACTER UPLO + INTEGER INFO, KD, LDAB, N + REAL(DOUBLE) AB( LDAB, * ) + + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) + + LOGICAL UPPER + INTEGER J, KLD, KM, M + REAL(DOUBLE) AJJ + + LOGICAL LSAME + EXTERNAL LSAME + + INTRINSIC MAX, MIN, SQRT + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( KD.LT.0 ) THEN + INFO = -3 + ELSE IF( LDAB.LT.KD+1 ) THEN + INFO = -5 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DPBSTF', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) RETURN + + KLD = MAX( 1, LDAB-1 ) + M = ( N+KD ) / 2 + + IF( UPPER ) THEN + + DO 10 J = N, M + 1, -1 + AJJ = AB( KD+1, J ) + IF( AJJ.LE.ZERO ) GO TO 50 + AJJ = SQRT( AJJ ) + AB( KD+1, J ) = AJJ + KM = MIN( J-1, KD ) + CALL DSCAL( KM, ONE / AJJ, AB( KD+1-KM, J ), 1 ) + CALL DSYR( 'Upper', KM, -ONE, AB( KD+1-KM, J ), 1, + $ AB( KD+1, J-KM ), KLD ) + 10 CONTINUE + + DO 20 J = 1, M + AJJ = AB( KD+1, J ) + IF( AJJ.LE.ZERO ) GO TO 50 + AJJ = SQRT( AJJ ) + AB( KD+1, J ) = AJJ + KM = MIN( KD, M-J ) + IF( KM.GT.0 ) THEN + CALL DSCAL( KM, ONE / AJJ, AB( KD, J+1 ), KLD ) + CALL DSYR( 'Upper', KM, -ONE, AB( KD, J+1 ), KLD, + $ AB( KD+1, J+1 ), KLD ) + END IF + 20 CONTINUE + ELSE + + DO 30 J = N, M + 1, -1 + AJJ = AB( 1, J ) + IF( AJJ.LE.ZERO ) GO TO 50 + AJJ = SQRT( AJJ ) + AB( 1, J ) = AJJ + KM = MIN( J-1, KD ) + CALL DSCAL( KM, ONE / AJJ, AB( KM+1, J-KM ), KLD ) + CALL DSYR( 'Lower', KM, -ONE, AB( KM+1, J-KM ), KLD, + $ AB( 1, J-KM ), KLD ) + 30 CONTINUE + + DO 40 J = 1, M + AJJ = AB( 1, J ) + IF( AJJ.LE.ZERO ) GO TO 50 + AJJ = SQRT( AJJ ) + AB( 1, J ) = AJJ + KM = MIN( KD, M-J ) + IF( KM.GT.0 ) THEN + CALL DSCAL( KM, ONE / AJJ, AB( 2, J ), 1 ) + CALL DSYR( 'Lower', KM, -ONE, AB( 2, J ), 1, + $ AB( 1, J+1 ), KLD ) + END IF + 40 CONTINUE + END IF + RETURN + + 50 CONTINUE + INFO = J + + RETURN + + END SUBROUTINE DPBSTF_HELPER + + END MODULE LAPACK_DPBSTF_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DPBTF2_HELPER.f b/Source/Modules/LAPACK/LAPACK_DPBTF2_HELPER.f new file mode 100644 index 00000000..181792cc --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPBTF2_HELPER.f @@ -0,0 +1,98 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPBTF2_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPBTF2( UPLO, N, KD, AB, LDAB, INFO ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER UPLO + INTEGER INFO, KD, LDAB, N + REAL(DOUBLE) AB( LDAB, * ) + + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) + + LOGICAL UPPER + INTEGER J, KLD, KN + REAL(DOUBLE) AJJ + + LOGICAL LSAME + EXTERNAL LSAME + + INTRINSIC MAX, MIN, SQRT + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N < 0 ) THEN + INFO = -2 + ELSE IF( KD < 0 ) THEN + INFO = -3 + ELSE IF( LDAB < KD+1 ) THEN + INFO = -5 + END IF + IF( INFO /= 0 ) THEN + CALL XERBLA( 'DPBTF2', -INFO ) + GO TO 9000 + END IF + + IF( N == 0 ) + & GO TO 9000 + + KLD = MAX( 1, LDAB-1 ) + + IF( UPPER ) THEN + DO 10 J = 1, N + AJJ = AB( KD+1, J ) + IF( AJJ <= ZERO ) THEN + GO TO 30 + END IF + AJJ = SQRT( AJJ ) + AB( KD+1, J ) = AJJ + + KN = MIN( KD, N-J ) + IF( KN > 0 ) THEN + CALL DSCAL( KN, ONE / AJJ, AB( KD, J+1 ), KLD ) + CALL DSYR( 'Upper', KN, -ONE, AB( KD, J+1 ), KLD, + $ AB( KD+1, J+1 ), KLD ) + END IF + 10 CONTINUE + ELSE + DO 20 J = 1, N + AJJ = AB( 1, J ) + IF( AJJ <= ZERO ) + $ GO TO 30 + AJJ = SQRT( AJJ ) + AB( 1, J ) = AJJ + + KN = MIN( KD, N-J ) + IF( KN > 0 ) THEN + CALL DSCAL( KN, ONE / AJJ, AB( 2, J ), 1 ) + CALL DSYR( 'Lower', KN, -ONE, AB( 2, J ), 1, + $ AB( 1, J+1 ), KLD ) + END IF + 20 CONTINUE + END IF + GO TO 9000 + + 30 CONTINUE + INFO = J + + 9000 CONTINUE + + RETURN + + END SUBROUTINE DPBTF2 +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DPBTF2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DPBTRF_KERNEL.f b/Source/Modules/LAPACK/LAPACK_DPBTRF_KERNEL.f new file mode 100644 index 00000000..4d1dd48b --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPBTRF_KERNEL.f @@ -0,0 +1,193 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPBTRF_KERNEL + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + USE LAPACK_DPBTF2_HELPER, ONLY : DPBTF2 + USE LAPACK_POTF2_HELPER, ONLY : DPOTF2 + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPBTRF( UPLO, N, KD, AB, LDAB, INFO ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER UPLO + INTEGER INFO, KD, LDAB, N + REAL(DOUBLE) AB( LDAB, * ) + + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) + INTEGER NBMAX, LDWORK + PARAMETER ( NBMAX = 32, LDWORK = NBMAX+1 ) + + INTEGER I, I2, I3, IB, II, J, JJ, NB + REAL(DOUBLE) WORK( LDWORK, NBMAX ) + + LOGICAL LSAME + EXTERNAL LSAME + + INTEGER ILAENV + EXTERNAL ILAENV + + INTRINSIC MIN + + INFO = 0 + IF( ( .NOT.LSAME( UPLO, 'U' ) ) .AND. + $ ( .NOT.LSAME( UPLO, 'L' ) ) ) THEN + INFO = -1 + ELSE IF( N < 0 ) THEN + INFO = -2 + ELSE IF( KD < 0 ) THEN + INFO = -3 + ELSE IF( LDAB < KD+1 ) THEN + INFO = -5 + END IF + IF( INFO /= 0 ) THEN + CALL XERBLA( 'DPBTRF', -INFO ) + GO TO 9000 + END IF + + IF( N == 0 ) + & GO TO 9000 + + NB = ILAENV( 1, 'DPBTRF', UPLO, N, KD, -1, -1 ) + NB = MIN( NB, NBMAX ) + + IF( NB <= 1 .OR. NB > KD ) THEN + CALL DPBTF2( UPLO, N, KD, AB, LDAB, INFO ) + ELSE + IF( LSAME( UPLO, 'U' ) ) THEN + DO 20 J = 1, NB + DO 10 I = 1, J - 1 + WORK( I, J ) = ZERO + 10 CONTINUE + 20 CONTINUE + + DO 70 I = 1, N, NB + IB = MIN( NB, N-I+1 ) + + CALL DPOTF2( UPLO, IB, AB( KD+1, I ), LDAB-1, II ) + IF( II /= 0 ) THEN + INFO = I + II - 1 + GO TO 150 + END IF + IF( I+IB <= N ) THEN + I2 = MIN( KD-IB, N-I-IB+1 ) + I3 = MIN( IB, N-I-KD+1 ) + + IF( I2 > 0 ) THEN + CALL DTRSM( 'Left', 'Upper', 'Transpose', + $ 'Non-unit', IB, I2, ONE, AB( KD+1, I ), + $ LDAB-1, AB( KD+1-IB, I+IB ), LDAB-1 ) + + CALL DSYRK( 'Upper', 'Transpose', I2, IB, -ONE, + $ AB( KD+1-IB, I+IB ), LDAB-1, ONE, + $ AB( KD+1, I+IB ), LDAB-1 ) + END IF + + IF( I3 > 0 ) THEN + DO 40 JJ = 1, I3 + DO 30 II = JJ, IB + WORK( II, JJ ) = AB( II-JJ+1, JJ+I+KD-1 ) + 30 CONTINUE + 40 CONTINUE + + CALL DTRSM( 'Left', 'Upper', 'Transpose', + $ 'Non-unit', IB, I3, ONE, AB( KD+1, I ), + $ LDAB-1, WORK, LDWORK ) + + IF( I2 > 0 ) + $ CALL DGEMM( 'Transpose', 'No Transpose', I2, I3, + $ IB, -ONE, AB( KD+1-IB, I+IB ), + $ LDAB-1, WORK, LDWORK, ONE, + $ AB( 1+IB, I+KD ), LDAB-1 ) + + CALL DSYRK( 'Upper', 'Transpose', I3, IB, -ONE, + $ WORK, LDWORK, ONE, AB( KD+1, I+KD ), + $ LDAB-1 ) + + DO 60 JJ = 1, I3 + DO 50 II = JJ, IB + AB( II-JJ+1, JJ+I+KD-1 ) = WORK( II, JJ ) + 50 CONTINUE + 60 CONTINUE + END IF + END IF + 70 CONTINUE + ELSE + DO 90 J = 1, NB + DO 80 I = J + 1, NB + WORK( I, J ) = ZERO + 80 CONTINUE + 90 CONTINUE + + DO 140 I = 1, N, NB + IB = MIN( NB, N-I+1 ) + + CALL DPOTF2( UPLO, IB, AB( 1, I ), LDAB-1, II ) + IF( II /= 0 ) THEN + INFO = I + II - 1 + GO TO 150 + END IF + IF( I+IB <= N ) THEN + I2 = MIN( KD-IB, N-I-IB+1 ) + I3 = MIN( IB, N-I-KD+1 ) + + IF( I2 > 0 ) THEN + CALL DTRSM( 'Right', 'Lower', 'Transpose', + $ 'Non-unit', I2, IB, ONE, AB( 1, I ), + $ LDAB-1, AB( 1+IB, I ), LDAB-1 ) + + CALL DSYRK( 'Lower', 'No Transpose', I2, IB, -ONE, + $ AB( 1+IB, I ), LDAB-1, ONE, + $ AB( 1, I+IB ), LDAB-1 ) + END IF + + IF( I3 > 0 ) THEN + DO 110 JJ = 1, IB + DO 100 II = 1, MIN( JJ, I3 ) + WORK( II, JJ ) = AB( KD+1-JJ+II, JJ+I-1 ) + 100 CONTINUE + 110 CONTINUE + + CALL DTRSM( 'Right', 'Lower', 'Transpose', + $ 'Non-unit', I3, IB, ONE, AB( 1, I ), + $ LDAB-1, WORK, LDWORK ) + + IF( I2 > 0 ) + $ CALL DGEMM( 'No transpose', 'Transpose', I3, I2, + $ IB, -ONE, WORK, LDWORK, + $ AB( 1+IB, I ), LDAB-1, ONE, + $ AB( 1+KD-IB, I+IB ), LDAB-1 ) + + CALL DSYRK( 'Lower', 'No Transpose', I3, IB, -ONE, + $ WORK, LDWORK, ONE, AB( 1, I+KD ), + $ LDAB-1 ) + + DO 130 JJ = 1, IB + DO 120 II = 1, MIN( JJ, I3 ) + AB( KD+1-JJ+II, JJ+I-1 ) = WORK( II, JJ ) + 120 CONTINUE + 130 CONTINUE + END IF + END IF + 140 CONTINUE + END IF + END IF + GO TO 9000 + + 150 CONTINUE + + 9000 CONTINUE + + RETURN + + END SUBROUTINE DPBTRF +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DPBTRF_KERNEL diff --git a/Source/Modules/LAPACK/LAPACK_DPBTRS_HELPER.f b/Source/Modules/LAPACK/LAPACK_DPBTRS_HELPER.f new file mode 100644 index 00000000..e700651e --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPBTRS_HELPER.f @@ -0,0 +1,77 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPBTRS_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPBTRS( UPLO, N, KD, NRHS, AB, LDAB, B, LDB, INFO, + & dtbsv_msg ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER UPLO + CHARACTER*1 dtbsv_msg + INTEGER INFO, KD, LDAB, LDB, N, NRHS + REAL(DOUBLE) AB( LDAB, * ), B( LDB, * ) + + LOGICAL UPPER + INTEGER J + + LOGICAL LSAME + EXTERNAL LSAME + + INTRINSIC MAX + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N < 0 ) THEN + INFO = -2 + ELSE IF( KD < 0 ) THEN + INFO = -3 + ELSE IF( NRHS < 0 ) THEN + INFO = -4 + ELSE IF( LDAB < KD+1 ) THEN + INFO = -6 + ELSE IF( LDB < MAX( 1, N ) ) THEN + INFO = -8 + END IF + IF( INFO /= 0 ) THEN + CALL XERBLA( 'DPBTRS', -INFO ) + GO TO 9000 + END IF + + IF( N == 0 .OR. NRHS == 0 ) + & GO TO 9000 + + IF( UPPER ) THEN + DO 10 J = 1, NRHS + CALL DTBSV( 'Upper', 'Transpose', 'Non-unit', N, KD, AB, + $ LDAB, B( 1, J ), 1, dtbsv_msg ) + CALL DTBSV( 'Upper', 'No transpose', 'Non-unit', N, KD, AB, + $ LDAB, B( 1, J ), 1, dtbsv_msg ) + 10 CONTINUE + ELSE + DO 20 J = 1, NRHS + CALL DTBSV( 'Lower', 'No transpose', 'Non-unit', N, KD, AB, + $ LDAB, B( 1, J ), 1, dtbsv_msg ) + CALL DTBSV( 'Lower', 'Transpose', 'Non-unit', N, KD, AB, + $ LDAB, B( 1, J ), 1, dtbsv_msg ) + 20 CONTINUE + END IF + + 9000 CONTINUE + + RETURN + + END SUBROUTINE DPBTRS +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DPBTRS_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DPOTRF_HELPER.f b/Source/Modules/LAPACK/LAPACK_DPOTRF_HELPER.f new file mode 100644 index 00000000..751d9139 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPOTRF_HELPER.f @@ -0,0 +1,185 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPOTRF_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + USE LAPACK_BLAS_AUX + USE LAPACK_POTF2_HELPER, ONLY : DPOTF2 + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPOTRF( UPLO, N, A, LDA, INFO ) +* +* -- LAPACK routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* March 31, 1993 +* +* .. Scalar Arguments .. + CHARACTER UPLO + INTEGER INFO, LDA, N +* .. +* .. Array Arguments .. + DOUBLE PRECISION A( LDA, * ) +* .. +* +* Purpose +* ======= +* +* DPOTRF computes the Cholesky factorization of a real symmetric +* positive definite matrix A. +* +* The factorization has the form +* A = U**T * U, if UPLO = 'U', or +* A = L * L**T, if UPLO = 'L', +* where U is an upper triangular matrix and L is lower triangular. +* +* This is the block version of the algorithm, calling Level 3 BLAS. +* +* Arguments +* ========= +* +* UPLO (input) CHARACTER*1 +* = 'U': Upper triangle of A is stored; +* = 'L': Lower triangle of A is stored. +* +* N (input) INTEGER +* The order of the matrix A. N >= 0. +* +* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) +* On entry, the symmetric matrix A. If UPLO = 'U', the leading +* N-by-N upper triangular part of A contains the upper +* triangular part of the matrix A, and the strictly lower +* triangular part of A is not referenced. If UPLO = 'L', the +* leading N-by-N lower triangular part of A contains the lower +* triangular part of the matrix A, and the strictly upper +* triangular part of A is not referenced. +* +* On exit, if INFO = 0, the factor U or L from the Cholesky +* factorization A = U**T*U or A = L*L**T. +* +* LDA (input) INTEGER +* The leading dimension of the array A. LDA >= max(1,N). +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -i, the i-th argument had an illegal value +* > 0: if INFO = i, the leading minor of order i is not +* positive definite, and the factorization could not be +* completed. +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ONE + PARAMETER ( ONE = 1.0D+0 ) +* .. +* .. Local Scalars .. + LOGICAL UPPER + INTEGER J, JB, NB +* .. +* .. External Functions .. + LOGICAL LSAME + EXTERNAL LSAME + + INTEGER ILAENV + EXTERNAL ILAENV + +* .. +* .. Intrinsic Functions .. + INTRINSIC MAX, MIN +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DPOTRF', -INFO ) + RETURN + END IF +* +* Quick return if possible +* + IF( N.EQ.0 ) + $ RETURN +* +* Determine the block size for this environment. +* + NB = ILAENV( 1, 'DPOTRF', UPLO, N, -1, -1, -1 ) + IF( NB.LE.1 .OR. NB.GE.N ) THEN +* +* Use unblocked code. +* + CALL DPOTF2( UPLO, N, A, LDA, INFO ) + ELSE +* +* Use blocked code. +* + IF( UPPER ) THEN +* +* Compute the Cholesky factorization A = U'*U. +* + DO 10 J = 1, N, NB + JB = MIN( NB, N-J+1 ) + CALL DSYRK( 'Upper', 'Transpose', JB, J-1, -ONE, + $ A( 1, J ), LDA, ONE, A( J, J ), LDA ) + CALL DPOTF2( 'Upper', JB, A( J, J ), LDA, INFO ) + IF( INFO.NE.0 ) + $ GO TO 30 + IF( J+JB.LE.N ) THEN + CALL DGEMM( 'Transpose', 'No transpose', JB, N-J-JB+1, + $ J-1, -ONE, A( 1, J ), LDA, A( 1, J+JB ), + $ LDA, ONE, A( J, J+JB ), LDA ) + CALL DTRSM( 'Left', 'Upper', 'Transpose', 'Non-unit', + $ JB, N-J-JB+1, ONE, A( J, J ), LDA, + $ A( J, J+JB ), LDA ) + END IF + 10 CONTINUE +* + ELSE +* +* Compute the Cholesky factorization A = L*L'. +* + DO 20 J = 1, N, NB + JB = MIN( NB, N-J+1 ) + CALL DSYRK( 'Lower', 'No transpose', JB, J-1, -ONE, + $ A( J, 1 ), LDA, ONE, A( J, J ), LDA ) + CALL DPOTF2( 'Lower', JB, A( J, J ), LDA, INFO ) + IF( INFO.NE.0 ) + $ GO TO 30 + IF( J+JB.LE.N ) THEN + CALL DGEMM( 'No transpose', 'Transpose', N-J-JB+1, JB, + $ J-1, -ONE, A( J+JB, 1 ), LDA, A( J, 1 ), + $ LDA, ONE, A( J+JB, J ), LDA ) + CALL DTRSM( 'Right', 'Lower', 'Transpose', 'Non-unit', + $ N-J-JB+1, JB, ONE, A( J, J ), LDA, + $ A( J+JB, J ), LDA ) + END IF + 20 CONTINUE + END IF + END IF + GO TO 40 +* + 30 CONTINUE + INFO = INFO + J - 1 +* + 40 CONTINUE + RETURN +* + END SUBROUTINE DPOTRF +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DPOTRF_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DPOTRI_HELPER.f b/Source/Modules/LAPACK/LAPACK_DPOTRI_HELPER.f new file mode 100644 index 00000000..37fc86f6 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DPOTRI_HELPER.f @@ -0,0 +1,55 @@ +! ################################################################################################################################## + + MODULE LAPACK_DPOTRI_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + USE LAPACK_DLAUUM_HELPER, ONLY : DLAUUM + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPOTRI( UPLO, N, A, LDA, INFO ) +* +* -- LAPACK routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* March 31, 1993 +* + CHARACTER UPLO + INTEGER INFO, LDA, N + DOUBLE PRECISION A( LDA, * ) + LOGICAL LSAME + EXTERNAL LSAME + EXTERNAL DTRTRI + INTRINSIC MAX + + INFO = 0 + IF( .NOT.LSAME( UPLO, 'U' ) .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DPOTRI', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) + $ RETURN + + CALL DTRTRI( UPLO, 'Non-unit', N, A, LDA, INFO ) + IF( INFO.GT.0 ) + $ RETURN + + CALL DLAUUM( UPLO, N, A, LDA, INFO ) + + RETURN + END SUBROUTINE DPOTRI +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DPOTRI_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DSTEV_HELPER.f b/Source/Modules/LAPACK/LAPACK_DSTEV_HELPER.f new file mode 100644 index 00000000..74659a39 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DSTEV_HELPER.f @@ -0,0 +1,96 @@ +! ################################################################################################################################## + + MODULE LAPACK_DSTEV_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DSTEV( JOBZ, N, D, E, Z, LDZ, WORK, INFO ) + + CHARACTER JOBZ + INTEGER INFO, LDZ, N + DOUBLE PRECISION D( * ), E( * ), WORK( * ), Z( LDZ, * ) + DOUBLE PRECISION ZERO, ONE + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) + LOGICAL WANTZ + INTEGER IMAX, ISCALE + DOUBLE PRECISION BIGNUM, EPS, RMAX, RMIN, SAFMIN, SIGMA, SMLNUM, + $ TNRM + LOGICAL LSAME + DOUBLE PRECISION DLAMCH, DLANST + EXTERNAL LSAME, DLAMCH, DLANST + EXTERNAL DSCAL, DSTEQR, DSTERF, XERBLA + INTRINSIC SQRT + + WANTZ = LSAME( JOBZ, 'V' ) + + INFO = 0 + IF( .NOT.( WANTZ .OR. LSAME( JOBZ, 'N' ) ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDZ.LT.1 .OR. ( WANTZ .AND. LDZ.LT.N ) ) THEN + INFO = -6 + END IF + + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DSTEV ', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) + $ RETURN + + IF( N.EQ.1 ) THEN + IF( WANTZ ) + $ Z( 1, 1 ) = ONE + RETURN + END IF + + SAFMIN = DLAMCH( 'Safe minimum' ) + EPS = DLAMCH( 'Precision' ) + SMLNUM = SAFMIN / EPS + BIGNUM = ONE / SMLNUM + RMIN = SQRT( SMLNUM ) + RMAX = SQRT( BIGNUM ) + + ISCALE = 0 + TNRM = DLANST( 'M', N, D, E ) + IF( TNRM.GT.ZERO .AND. TNRM.LT.RMIN ) THEN + ISCALE = 1 + SIGMA = RMIN / TNRM + ELSE IF( TNRM.GT.RMAX ) THEN + ISCALE = 1 + SIGMA = RMAX / TNRM + END IF + IF( ISCALE.EQ.1 ) THEN + CALL DSCAL( N, SIGMA, D, 1 ) + CALL DSCAL( N-1, SIGMA, E( 1 ), 1 ) + END IF + + IF( .NOT.WANTZ ) THEN + CALL DSTERF( N, D, E, INFO ) + ELSE + CALL DSTEQR( 'I', N, D, E, Z, LDZ, WORK, INFO ) + END IF + + IF( ISCALE.EQ.1 ) THEN + IF( INFO.EQ.0 ) THEN + IMAX = N + ELSE + IMAX = INFO - 1 + END IF + CALL DSCAL( IMAX, ONE / SIGMA, D, 1 ) + END IF + + RETURN + END SUBROUTINE DSTEV +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DSTEV_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DSYTF2_HELPER.f b/Source/Modules/LAPACK/LAPACK_DSYTF2_HELPER.f new file mode 100644 index 00000000..101a7f37 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DSYTF2_HELPER.f @@ -0,0 +1,361 @@ +! ################################################################################################################################## + + MODULE LAPACK_DSYTF2_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE SCONTR, ONLY : BLNK_SUB_NAM + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DSYTF2( UPLO, N, A, LDA, IPIV, INFO ) + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSYTF2' +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + CHARACTER UPLO + INTEGER INFO, LDA, N +* .. +* .. Array Arguments .. + INTEGER IPIV( * ) + DOUBLE PRECISION A( LDA, * ) +* .. +* +* Purpose +* ======= +* +* DSYTF2 computes the factorization of a real symmetric matrix A using +* the Bunch-Kaufman diagonal pivoting method: +* +* A = U*D*U' or A = L*D*L' +* +* where U (or L) is a product of permutation and unit upper (lower) +* triangular matrices, U' is the transpose of U, and D is symmetric and +* block diagonal with 1-by-1 and 2-by-2 diagonal blocks. +* +* This is the unblocked version of the algorithm, calling Level 2 BLAS. +* +* Arguments +* ========= +* +* UPLO (input) CHARACTER*1 +* Specifies whether the upper or lower triangular part of the +* symmetric matrix A is stored: +* = 'U': Upper triangular +* = 'L': Lower triangular +* +* N (input) INTEGER +* The order of the matrix A. N >= 0. +* +* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) +* On entry, the symmetric matrix A. If UPLO = 'U', the leading +* n-by-n upper triangular part of A contains the upper +* triangular part of the matrix A, and the strictly lower +* triangular part of A is not referenced. If UPLO = 'L', the +* leading n-by-n lower triangular part of A contains the lower +* triangular part of the matrix A, and the strictly upper +* triangular part of A is not referenced. +* +* On exit, the block diagonal matrix D and the multipliers used +* to obtain the factor U or L (see below for further details). +* +* LDA (input) INTEGER +* The leading dimension of the array A. LDA >= max(1,N). +* +* IPIV (output) INTEGER array, dimension (N) +* Details of the interchanges and the block structure of D. +* If IPIV(k) > 0, then rows and columns k and IPIV(k) were +* interchanged and D(k,k) is a 1-by-1 diagonal block. +* If UPLO = 'U' and IPIV(k) = IPIV(k-1) < 0, then rows and +* columns k-1 and -IPIV(k) were interchanged and D(k-1:k,k-1:k) +* is a 2-by-2 diagonal block. If UPLO = 'L' and IPIV(k) = +* IPIV(k+1) < 0, then rows and columns k+1 and -IPIV(k) were +* interchanged and D(k:k+1,k:k+1) is a 2-by-2 diagonal block. +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -k, the k-th argument had an illegal value +* > 0: if INFO = k, D(k,k) is exactly zero. The factorization +* has been completed, but the block diagonal matrix D is +* exactly singular, and division by zero will occur if it +* is used to solve a system of equations. +* +* Further Details +* =============== +* +* 09-29-06 - patch from +* Bobby Cheng, MathWorks +* +* Replace l.204 and l.372 +* IF( MAX( ABSAKK, COLMAX ).EQ.ZERO ) THEN +* by +* IF( (MAX( ABSAKK, COLMAX ).EQ.ZERO) .OR. DISNAN(ABSAKK) ) THEN +* +* 01-01-96 - Based on modifications by +* J. Lewis, Boeing Computer Services Company +* A. Petitet, Computer Science Dept., Univ. of Tenn., Knoxville, USA +* 1-96 - Based on modifications by J. Lewis, Boeing Computer Services +* Company +* +* If UPLO = 'U', then A = U*D*U', where +* U = P(n)*U(n)* ... *P(k)U(k)* ..., +* i.e., U is a product of terms P(k)*U(k), where k decreases from n to +* 1 in steps of 1 or 2, and D is a block diagonal matrix with 1-by-1 +* and 2-by-2 diagonal blocks D(k). P(k) is a permutation matrix as +* defined by IPIV(k), and U(k) is a unit upper triangular matrix, such +* that if the diagonal block D(k) is of order s (s = 1 or 2), then +* +* ( I v 0 ) k-s +* U(k) = ( 0 I 0 ) s +* ( 0 0 I ) n-k +* k-s s n-k +* +* If s = 1, D(k) overwrites A(k,k), and v overwrites A(1:k-1,k). +* If s = 2, the upper triangle of D(k) overwrites A(k-1,k-1), A(k-1,k), +* and A(k,k), and v overwrites A(1:k-2,k-1:k). +* +* If UPLO = 'L', then A = L*D*L', where +* L = P(1)*L(1)* ... *P(k)*L(k)* ..., +* i.e., L is a product of terms P(k)*L(k), where k increases from 1 to +* n in steps of 1 or 2, and D is a block diagonal matrix with 1-by-1 +* and 2-by-2 diagonal blocks D(k). P(k) is a permutation matrix as +* defined by IPIV(k), and L(k) is a unit lower triangular matrix, such +* that if the diagonal block D(k) is of order s (s = 1 or 2), then +* +* ( I 0 0 ) k-1 +* L(k) = ( 0 I 0 ) s +* ( 0 v I ) n-k-s+1 +* k-1 s n-k-s+1 +* +* If s = 1, D(k) overwrites A(k,k), and v overwrites A(k+1:n,k). +* If s = 2, the lower triangle of D(k) overwrites A(k,k), A(k+1,k), +* and A(k+1,k+1), and v overwrites A(k+2:n,k:k+1). +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) + DOUBLE PRECISION EIGHT, SEVTEN + PARAMETER ( EIGHT = 8.0D+0, SEVTEN = 17.0D+0 ) +* .. +* .. Local Scalars .. + LOGICAL UPPER + INTEGER I, IMAX, J, JMAX, K, KK, KP, KSTEP + DOUBLE PRECISION ABSAKK, ALPHA, COLMAX, D11, D12, D21, D22, R1, + $ ROWMAX, T, WK, WKM1, WKP1 +* .. +* .. External Functions .. + LOGICAL LSAME + EXTERNAL LSAME +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, MAX, SQRT +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DSYTF2', -INFO ) + RETURN + END IF +* +* Initialize ALPHA for use in choosing pivot block size. +* + ALPHA = ( ONE+SQRT( SEVTEN ) ) / EIGHT +* + IF( UPPER ) THEN + K = N + 10 CONTINUE + IF( K.LT.1 ) + $ GO TO 70 + KSTEP = 1 + ABSAKK = ABS( A( K, K ) ) + IF( K.GT.1 ) THEN + IMAX = IDAMAX( K-1, A( 1, K ), 1 ) + COLMAX = ABS( A( IMAX, K ) ) + ELSE + COLMAX = ZERO + END IF + IF( (MAX( ABSAKK, COLMAX ).EQ.ZERO) .OR. DISNAN(ABSAKK) ) THEN + IF( INFO.EQ.0 ) + $ INFO = K + KP = K + ELSE + IF( ABSAKK.GE.ALPHA*COLMAX ) THEN + KP = K + ELSE + JMAX = IMAX + IDAMAX( K-IMAX, A( IMAX, IMAX+1 ), LDA ) + ROWMAX = ABS( A( IMAX, JMAX ) ) + IF( IMAX.GT.1 ) THEN + JMAX = IDAMAX( IMAX-1, A( 1, IMAX ), 1 ) + ROWMAX = MAX( ROWMAX, ABS( A( JMAX, IMAX ) ) ) + END IF + IF( ABSAKK.GE.ALPHA*COLMAX*( COLMAX / ROWMAX ) ) THEN + KP = K + ELSE IF( ABS( A( IMAX, IMAX ) ).GE.ALPHA*ROWMAX ) THEN + KP = IMAX + ELSE + KP = IMAX + KSTEP = 2 + END IF + END IF + KK = K - KSTEP + 1 + IF( KP.NE.KK ) THEN + CALL DSWAP( KP-1, A( 1, KK ), 1, A( 1, KP ), 1 ) + CALL DSWAP( KK-KP-1, A( KP+1, KK ), 1, A( KP, KP+1 ), + $ LDA ) + T = A( KK, KK ) + A( KK, KK ) = A( KP, KP ) + A( KP, KP ) = T + IF( KSTEP.EQ.2 ) THEN + T = A( K-1, K ) + A( K-1, K ) = A( KP, K ) + A( KP, K ) = T + END IF + END IF + IF( KSTEP.EQ.1 ) THEN + R1 = ONE / A( K, K ) + CALL DSYR( UPLO, K-1, -R1, A( 1, K ), 1, A, LDA ) + CALL DSCAL( K-1, R1, A( 1, K ), 1 ) + ELSE + IF( K.GT.2 ) THEN + D12 = A( K-1, K ) + D22 = A( K-1, K-1 ) / D12 + D11 = A( K, K ) / D12 + T = ONE / ( D11*D22-ONE ) + D12 = T / D12 + DO 30 J = K - 2, 1, -1 + WKM1 = D12*( D11*A( J, K-1 )-A( J, K ) ) + WK = D12*( D22*A( J, K )-A( J, K-1 ) ) + DO 20 I = J, 1, -1 + A( I, J ) = A( I, J ) - A( I, K )*WK - + $ A( I, K-1 )*WKM1 + 20 CONTINUE + A( J, K ) = WK + A( J, K-1 ) = WKM1 + 30 CONTINUE + END IF + END IF + END IF + IF( KSTEP.EQ.1 ) THEN + IPIV( K ) = KP + ELSE + IPIV( K ) = -KP + IPIV( K-1 ) = -KP + END IF + K = K - KSTEP + GO TO 10 + ELSE + K = 1 + 40 CONTINUE + IF( K.GT.N ) + $ GO TO 70 + KSTEP = 1 + ABSAKK = ABS( A( K, K ) ) + IF( K.LT.N ) THEN + IMAX = K + IDAMAX( N-K, A( K+1, K ), 1 ) + COLMAX = ABS( A( IMAX, K ) ) + ELSE + COLMAX = ZERO + END IF + IF( (MAX( ABSAKK, COLMAX ).EQ.ZERO) .OR. DISNAN(ABSAKK) ) THEN + IF( INFO.EQ.0 ) + $ INFO = K + KP = K + ELSE + IF( ABSAKK.GE.ALPHA*COLMAX ) THEN + KP = K + ELSE + JMAX = K - 1 + IDAMAX( IMAX-K, A( IMAX, K ), LDA ) + ROWMAX = ABS( A( IMAX, JMAX ) ) + IF( IMAX.LT.N ) THEN + JMAX = IMAX + IDAMAX( N-IMAX, A( IMAX+1, IMAX ), 1 ) + ROWMAX = MAX( ROWMAX, ABS( A( JMAX, IMAX ) ) ) + END IF + IF( ABSAKK.GE.ALPHA*COLMAX*( COLMAX / ROWMAX ) ) THEN + KP = K + ELSE IF( ABS( A( IMAX, IMAX ) ).GE.ALPHA*ROWMAX ) THEN + KP = IMAX + ELSE + KP = IMAX + KSTEP = 2 + END IF + END IF + KK = K + KSTEP - 1 + IF( KP.NE.KK ) THEN + IF( KP.LT.N ) + $ CALL DSWAP( N-KP, A( KP+1, KK ), 1, A( KP+1, KP ), 1 ) + CALL DSWAP( KP-KK-1, A( KK+1, KK ), 1, A( KP, KK+1 ), + $ LDA ) + T = A( KK, KK ) + A( KK, KK ) = A( KP, KP ) + A( KP, KP ) = T + IF( KSTEP.EQ.2 ) THEN + T = A( K+1, K ) + A( K+1, K ) = A( KP, K ) + A( KP, K ) = T + END IF + END IF + IF( KSTEP.EQ.1 ) THEN + IF( K.LT.N ) THEN + D11 = ONE / A( K, K ) + CALL DSYR( UPLO, N-K, -D11, A( K+1, K ), 1, + $ A( K+1, K+1 ), LDA ) + CALL DSCAL( N-K, D11, A( K+1, K ), 1 ) + END IF + ELSE + IF( K.LT.N-1 ) THEN + D21 = A( K+1, K ) + D11 = A( K+1, K+1 ) / D21 + D22 = A( K, K ) / D21 + T = ONE / ( D11*D22-ONE ) + D21 = T / D21 + DO 60 J = K + 2, N + WK = D21*( D11*A( J, K )-A( J, K+1 ) ) + WKP1 = D21*( D22*A( J, K+1 )-A( J, K ) ) + DO 50 I = J, N + A( I, J ) = A( I, J ) - A( I, K )*WK - + $ A( I, K+1 )*WKP1 + 50 CONTINUE + A( J, K ) = WK + A( J, K+1 ) = WKP1 + 60 CONTINUE + END IF + END IF + END IF + IF( KSTEP.EQ.1 ) THEN + IPIV( K ) = KP + ELSE + IPIV( K ) = -KP + IPIV( K+1 ) = -KP + END IF + K = K + KSTEP + GO TO 40 + END IF +* + 70 CONTINUE +* + RETURN +* +* End of DSYTF2 +* + END SUBROUTINE DSYTF2 +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DSYTF2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DTRTI2_HELPER.f b/Source/Modules/LAPACK/LAPACK_DTRTI2_HELPER.f new file mode 100644 index 00000000..19856547 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DTRTI2_HELPER.f @@ -0,0 +1,82 @@ +! ################################################################################################################################## + + MODULE LAPACK_DTRTI2_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DTRTI2( UPLO, DIAG, N, A, LDA, INFO ) +* +* -- LAPACK routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 +* + CHARACTER DIAG, UPLO + INTEGER INFO, LDA, N + DOUBLE PRECISION A( LDA, * ) + DOUBLE PRECISION ONE + PARAMETER ( ONE = 1.0D+0 ) + LOGICAL NOUNIT, UPPER + INTEGER J + DOUBLE PRECISION AJJ + LOGICAL LSAME + EXTERNAL LSAME + INTRINSIC MAX + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + NOUNIT = LSAME( DIAG, 'N' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN + INFO = -2 + ELSE IF( N.LT.0 ) THEN + INFO = -3 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -5 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DTRTI2', -INFO ) + RETURN + END IF + + IF( UPPER ) THEN + DO 10 J = 1, N + IF( NOUNIT ) THEN + A( J, J ) = ONE / A( J, J ) + AJJ = -A( J, J ) + ELSE + AJJ = -ONE + END IF + CALL DTRMV( 'Upper', 'No transpose', DIAG, J-1, A, LDA, + $ A( 1, J ), 1 ) + CALL DSCAL( J-1, AJJ, A( 1, J ), 1 ) + 10 CONTINUE + ELSE + DO 20 J = N, 1, -1 + IF( NOUNIT ) THEN + A( J, J ) = ONE / A( J, J ) + AJJ = -A( J, J ) + ELSE + AJJ = -ONE + END IF + IF( J.LT.N ) THEN + CALL DTRMV( 'Lower', 'No transpose', DIAG, N-J, + $ A( J+1, J+1 ), LDA, A( J+1, J ), 1 ) + CALL DSCAL( N-J, AJJ, A( J+1, J ), 1 ) + END IF + 20 CONTINUE + END IF + + RETURN + END SUBROUTINE DTRTI2 +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DTRTI2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_DTRTRS_HELPER.f b/Source/Modules/LAPACK/LAPACK_DTRTRS_HELPER.f new file mode 100644 index 00000000..d2fabc89 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_DTRTRS_HELPER.f @@ -0,0 +1,67 @@ +! ################################################################################################################################## + + MODULE LAPACK_DTRTRS_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DTRTRS( UPLO, TRANS, DIAG, N, NRHS, A, LDA, B, LDB, + $ INFO ) + + CHARACTER DIAG, TRANS, UPLO + INTEGER INFO, LDA, LDB, N, NRHS + DOUBLE PRECISION A( LDA, * ), B( LDB, * ) + DOUBLE PRECISION ZERO, ONE + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) + LOGICAL NOUNIT + LOGICAL LSAME + EXTERNAL LSAME + EXTERNAL DTRSM, XERBLA + INTRINSIC MAX + + INFO = 0 + NOUNIT = LSAME( DIAG, 'N' ) + IF( .NOT.LSAME( UPLO, 'U' ) .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( .NOT.LSAME( TRANS, 'N' ) .AND. .NOT. + $ LSAME( TRANS, 'T' ) .AND. .NOT.LSAME( TRANS, 'C' ) ) THEN + INFO = -2 + ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN + INFO = -3 + ELSE IF( N.LT.0 ) THEN + INFO = -4 + ELSE IF( NRHS.LT.0 ) THEN + INFO = -5 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -7 + ELSE IF( LDB.LT.MAX( 1, N ) ) THEN + INFO = -9 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DTRTRS', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) + $ RETURN + + IF( NOUNIT ) THEN + DO 10 INFO = 1, N + IF( A( INFO, INFO ).EQ.ZERO ) + $ RETURN + 10 CONTINUE + END IF + INFO = 0 + + CALL DTRSM( 'Left', UPLO, TRANS, DIAG, N, NRHS, ONE, A, LDA, B, + $ LDB ) + + RETURN + END SUBROUTINE DTRTRS +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_DTRTRS_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_GIV_MGIV_EIG.f b/Source/Modules/LAPACK/LAPACK_GIV_MGIV_EIG.f index ffe588a8..c50cf0b0 100644 --- a/Source/Modules/LAPACK/LAPACK_GIV_MGIV_EIG.f +++ b/Source/Modules/LAPACK/LAPACK_GIV_MGIV_EIG.f @@ -1,6 +1,7 @@ ! ################################################################################################################################## MODULE LAPACK_GIV_MGIV_EIG +! --- lapack_surgery begin --- ! USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE USE IOUNT1, ONLY : ERR, F06, SC1 @@ -8,6 +9,8 @@ MODULE LAPACK_GIV_MGIV_EIG USE TIMDAT, ONLY : TSEC USE LAPACK_BLAS_AUX USE LAPACK_MISCEL ! This contains DSTEQR, used in this module + USE LAPACK_GIV_MGIV_EIG_HELPER + USE LAPACK_DPBSTF_HELPER USE OURTIM_Interface USE OUTA_HERE_Interface @@ -522,313 +525,22 @@ END SUBROUTINE DSBGVX_GIV_MGIV ! 002 LAPACK_GIV_MGIV_EIG SUBROUTINE EIGENVALUE_CONVERGENCE_FAILURE ( RANGE, INFO ) - - USE PARAMS, ONLY : SUPINFO - character range - integer info - Write(err,9902) - if (supinfo == 'N') then - Write(f06,9902) - endif - - if ((info == 1) .or. (info == 3) .and. (range /= 'I')) then - Write(err,99021) - Write(f06,99021) - else if ((info == 2) .or. (info == 3) .and. (range == 'I')) then - Write(err,99022) - Write(f06,99022) - else if (( info == 4) .and. (range == 'I')) then - Write(err,803) - Write(f06,803) - fatal_err = fatal_err + 1 - call outa_here ( 'Y' ) - endif - - 9902 format(' *INFORMATION: SOME OR ALL OF THE EIGENVALUES FAILED TO CO - &NVERGE OR WERE NOT COMPUTED IN LAPACK SUBROUTINE DSTEBZ:') - -99021 format(15x,'BISECTION FAILED TO CONVERGE FOR SOME EIGENVALUES; THE - &SE EIGENVALUES ARE FLAGGED BY A NEGATIVE BLOCK NUMBER.',/,15X, - &'THE EFFECT IS THAT THE EIGENVALUES MAY NOT BE AS ACCURATE AS THE - &ABSOLUTE AND RELATIVE TOLERANCES.',/,15X, - &'THIS IS GENERALLY CAUSED BY UNEXPECTEDLY INACCURATE ARITHMETIC.' - &,/) - -99022 format(15x,'NOT ALL OF THE EIGENVALUES IN THE RANGE REQUESTED WERE - & FOUND:',/,15X, - &'CAUSE: NON-MONOTONIC ARITHMETIC, CAUSING THE STURM SEQUENCE TO BE - & NON-MONOTONIC.',/,15X, - &'CURE : RECALCULATE, REQUESTING ALL EIGENVALUES',/) - - 803 format(' *ERROR 803: PROGRAMMING ERROR IN SUBROUTINE DSTEBZ.' - &,/,15X,'NO EIGENVALUES WERE COMPUTED BY LAPACK SUBROUTINE DSTEBZ. - &THE GERSHGORIN INTERVAL INITIALLY USED WAS TOO SMALL.',/,15X, - &'PROBABLE CAUSE: YOUR MACHINE HAS SLOPPY FLOATING-POINT ARITHMETIC - &',/,15X,'CURE : INCREASE THE PARAMETER "FUDGE" IN LAPACK - &SUBROUTINE DSTEBZ, RECOMPILE, AND TRY AGAIN',/) + CALL EIGENVALUE_CONVERGENCE_FAILURE_HELPER ( RANGE, INFO ) END SUBROUTINE EIGENVALUE_CONVERGENCE_FAILURE ! ################################################################################################################################## ! 003 LAPACK_GIV_MGIV_EIG SUBROUTINE DPBSTF( UPLO, N, KD, AB, LDAB, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPDSTF' -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, KD, LDAB, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) AB( LDAB, * ) -* .. -* -* Purpose -* ======= -* -* DPBSTF computes a split Cholesky factorization of a real -* symmetric positive definite band matrix A. -* -* This routine is designed to be used in conjunction with DSBGST. -* -* The factorization has the form A = S**T*S where S is a band matrix -* of the same bandwidth as A and the following structure: -* -* S = ( U ) -* ( M L ) -* -* where U is upper triangular of order m = (n+kd)/2, and L is lower -* triangular of order n-m. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangle of A is stored; -* = 'L': Lower triangle of A is stored. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* KD (input) INTEGER -* The number of superdiagonals of the matrix A if UPLO = 'U', -* or the number of subdiagonals if UPLO = 'L'. KD >= 0. -* -* AB (input/output) REAL(DOUBLE) array, dimension (LDAB,N) -* On entry, the upper or lower triangle of the symmetric band -* matrix A, stored in the first kd+1 rows of the array. The -* j-th column of A is stored in the j-th column of the array AB -* as follows: -* if UPLO = 'U', AB(kd+1+i-j,j) = A(i,j) for max(1,j-kd)<=i<=j; -* if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+kd). -* -* On exit, if INFO = 0, the factor S from the split Cholesky -* factorization A = S**T*S. See Further Details. -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= KD+1. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, the factorization could not be completed, -* because the updated element a(i,i) was negative; the -* matrix A is not positive definite. -* -* Further Details -* =============== -* -* The band storage scheme is illustrated by the following example, when -* N = 7, KD = 2: -* -* S = ( s11 s12 s13 ) -* ( s22 s23 s24 ) -* ( s33 s34 ) -* ( s44 ) -* ( s53 s54 s55 ) -* ( s64 s65 s66 ) -* ( s75 s76 s77 ) -* -* If UPLO = 'U', the array AB holds: -* -* on entry: on exit: -* -* * * a13 a24 a35 a46 a57 * * s13 s24 s53 s64 s75 -* * a12 a23 a34 a45 a56 a67 * s12 s23 s34 s54 s65 s76 -* a11 a22 a33 a44 a55 a66 a77 s11 s22 s33 s44 s55 s66 s77 -* -* If UPLO = 'L', the array AB holds: -* -* on entry: on exit: -* -* a11 a22 a33 a44 a55 a66 a77 s11 s22 s33 s44 s55 s66 s77 -* a21 a32 a43 a54 a65 a76 * s12 s23 s34 s54 s65 s76 * -* a31 a42 a53 a64 a64 * * s13 s24 s53 s64 s75 * * -* -* Array elements marked * are not used by the routine. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER J, KLD, KM, M - REAL(DOUBLE) AJJ -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN, SQRT -* .. -* .. Executable Statements .. + REAL(DOUBLE) AB( LDAB, * ) -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( KD.LT.0 ) THEN - INFO = -3 - ELSE IF( LDAB.LT.KD+1 ) THEN - INFO = -5 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DPBSTF', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* - KLD = MAX( 1, LDAB-1 ) -* -* Set the splitting point m. -* - M = ( N+KD ) / 2 -* - IF( UPPER ) THEN -* -* Factorize A(m+1:n,m+1:n) as L**T*L, and update A(1:m,1:m). -* - DO 10 J = N, M + 1, -1 -* -* Compute s(j,j) and test for non-positive-definiteness. -* - AJJ = AB( KD+1, J ) - IF( AJJ.LE.ZERO ) - $ GO TO 50 - AJJ = SQRT( AJJ ) - AB( KD+1, J ) = AJJ - KM = MIN( J-1, KD ) -* -* Compute elements j-km:j-1 of the j-th column and update the -* the leading submatrix within the band. -* - CALL DSCAL( KM, ONE / AJJ, AB( KD+1-KM, J ), 1 ) - CALL DSYR( 'Upper', KM, -ONE, AB( KD+1-KM, J ), 1, - $ AB( KD+1, J-KM ), KLD ) - 10 CONTINUE -* -* Factorize the updated submatrix A(1:m,1:m) as U**T*U. -* - DO 20 J = 1, M -* -* Compute s(j,j) and test for non-positive-definiteness. -* - AJJ = AB( KD+1, J ) - IF( AJJ.LE.ZERO ) - $ GO TO 50 - AJJ = SQRT( AJJ ) - AB( KD+1, J ) = AJJ - KM = MIN( KD, M-J ) -* -* Compute elements j+1:j+km of the j-th row and update the -* trailing submatrix within the band. -* - IF( KM.GT.0 ) THEN - CALL DSCAL( KM, ONE / AJJ, AB( KD, J+1 ), KLD ) - CALL DSYR( 'Upper', KM, -ONE, AB( KD, J+1 ), KLD, - $ AB( KD+1, J+1 ), KLD ) - END IF - 20 CONTINUE - ELSE -* -* Factorize A(m+1:n,m+1:n) as L**T*L, and update A(1:m,1:m). -* - DO 30 J = N, M + 1, -1 -* -* Compute s(j,j) and test for non-positive-definiteness. -* - AJJ = AB( 1, J ) - IF( AJJ.LE.ZERO ) - $ GO TO 50 - AJJ = SQRT( AJJ ) - AB( 1, J ) = AJJ - KM = MIN( J-1, KD ) -* -* Compute elements j-km:j-1 of the j-th row and update the -* trailing submatrix within the band. -* - CALL DSCAL( KM, ONE / AJJ, AB( KM+1, J-KM ), KLD ) - CALL DSYR( 'Lower', KM, -ONE, AB( KM+1, J-KM ), KLD, - $ AB( 1, J-KM ), KLD ) - 30 CONTINUE -* -* Factorize the updated submatrix A(1:m,1:m) as U**T*U. -* - DO 40 J = 1, M -* -* Compute s(j,j) and test for non-positive-definiteness. -* - AJJ = AB( 1, J ) - IF( AJJ.LE.ZERO ) - $ GO TO 50 - AJJ = SQRT( AJJ ) - AB( 1, J ) = AJJ - KM = MIN( KD, M-J ) -* -* Compute elements j+1:j+km of the j-th column and update the -* trailing submatrix within the band. -* - IF( KM.GT.0 ) THEN - CALL DSCAL( KM, ONE / AJJ, AB( 2, J ), 1 ) - CALL DSYR( 'Lower', KM, -ONE, AB( 2, J ), 1, - $ AB( 1, J+1 ), KLD ) - END IF - 40 CONTINUE - END IF - RETURN -* - 50 CONTINUE - INFO = J + CALL DPBSTF_HELPER( UPLO, N, KD, AB, LDAB, INFO ) - RETURN -* -* End of DPBSTF -* END SUBROUTINE DPBSTF ! ################################################################################################################################## @@ -837,6 +549,19 @@ END SUBROUTINE DPBSTF SUBROUTINE DSBGST( VECT, UPLO, N, KA, KB, AB, LDAB, BB, LDBB, X, $ LDX, WORK, INFO ) + CHARACTER UPLO, VECT + INTEGER INFO, KA, KB, LDAB, LDBB, LDX, N + REAL(DOUBLE) AB( LDAB, * ), BB( LDBB, * ), WORK( * ), + $ X( LDX, * ) + + CALL DSBGST_HELPER( VECT, UPLO, N, KA, KB, AB, LDAB, BB, LDBB, + $ X, LDX, WORK, INFO ) + + END SUBROUTINE DSBGST + + SUBROUTINE DSBGST_HELPER( VECT, UPLO, N, KA, KB, AB, LDAB, BB, + $ LDBB, X, LDX, WORK, INFO ) + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSBGST' @@ -2225,7 +1950,7 @@ SUBROUTINE DSBGST( VECT, UPLO, N, KA, KB, AB, LDAB, BB, LDBB, X, ! ********************************************************************************************************************************** - END SUBROUTINE DSBGST + END SUBROUTINE DSBGST_HELPER ! ################################################################################################################################## ! 005 LAPACK_GIV_MGIV_EIG @@ -2233,6 +1958,19 @@ END SUBROUTINE DSBGST SUBROUTINE DSBTRD( VECT, UPLO, N, KD, AB, LDAB, D, E, Q, LDQ, $ WORK, INFO ) + CHARACTER UPLO, VECT + INTEGER INFO, KD, LDAB, LDQ, N + REAL(DOUBLE) AB( LDAB, * ), D( * ), E( * ), Q( LDQ, * ), + $ WORK( * ) + + CALL DSBTRD_HELPER( VECT, UPLO, N, KD, AB, LDAB, D, E, Q, LDQ, + $ WORK, INFO ) + + END SUBROUTINE DSBTRD + + SUBROUTINE DSBTRD_HELPER( VECT, UPLO, N, KD, AB, LDAB, D, E, Q, + $ LDQ, WORK, INFO ) + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSBTRD' @@ -2795,7 +2533,7 @@ SUBROUTINE DSBTRD( VECT, UPLO, N, KD, AB, LDAB, D, E, Q, LDQ, ! ********************************************************************************************************************************** - END SUBROUTINE DSBTRD + END SUBROUTINE DSBTRD_HELPER ! ################################################################################################################################## ! 006 LAPACK_GIV_MGIV_EIG @@ -4091,4 +3829,5 @@ SUBROUTINE DLAGTF( N, A, LAMBDA, B, C, TOL, D, IN, INFO ) * END SUBROUTINE DLAGTF +! --- lapack_surgery end --- ! END MODULE LAPACK_GIV_MGIV_EIG diff --git a/Source/Modules/LAPACK/LAPACK_GIV_MGIV_EIG_HELPER.f b/Source/Modules/LAPACK/LAPACK_GIV_MGIV_EIG_HELPER.f new file mode 100644 index 00000000..c9f8f26f --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_GIV_MGIV_EIG_HELPER.f @@ -0,0 +1,64 @@ +! ################################################################################################################################## + + MODULE LAPACK_GIV_MGIV_EIG_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE IOUNT1, ONLY : ERR, F06 + USE SCONTR, ONLY : FATAL_ERR + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE EIGENVALUE_CONVERGENCE_FAILURE_HELPER ( RANGE, INFO ) + + USE PARAMS, ONLY : SUPINFO + + character range + + integer info + + Write(err,9902) + if (supinfo == 'N') then + Write(f06,9902) + endif + + if ((info == 1) .or. (info == 3) .and. (range /= 'I')) then + Write(err,99021) + Write(f06,99021) + else if ((info == 2) .or. (info == 3) .and. (range == 'I')) then + Write(err,99022) + Write(f06,99022) + else if (( info == 4) .and. (range == 'I')) then + Write(err,803) + Write(f06,803) + fatal_err = fatal_err + 1 + call outa_here ( 'Y' ) + endif + + 9902 format(' *INFORMATION: SOME OR ALL OF THE EIGENVALUES FAILED TO CO + &NVERGE OR WERE NOT COMPUTED IN LAPACK SUBROUTINE DSTEBZ:') + +99021 format(15x,'BISECTION FAILED TO CONVERGE FOR SOME EIGENVALUES; THE + &SE EIGENVALUES ARE FLAGGED BY A NEGATIVE BLOCK NUMBER.',/,15X, + &'THE EFFECT IS THAT THE EIGENVALUES MAY NOT BE AS ACCURATE AS THE + &ABSOLUTE AND RELATIVE TOLERANCES.',/,15X, + &'THIS IS GENERALLY CAUSED BY UNEXPECTEDLY INACCURATE ARITHMETIC.' + &,/) + +99022 format(15x,'NOT ALL OF THE EIGENVALUES IN THE RANGE REQUESTED WERE + & FOUND:',/,15X, + &'CAUSE: NON-MONOTONIC ARITHMETIC, CAUSING THE STURM SEQUENCE TO BE + & NON-MONOTONIC.',/,15X, + &'CURE : RECALCULATE, REQUESTING ALL EIGENVALUES',/) + + 803 format(' *ERROR 803: PROGRAMMING ERROR IN SUBROUTINE DSTEBZ.' + &,/,15X,'NO EIGENVALUES WERE COMPUTED BY LAPACK SUBROUTINE DSTEBZ. + &THE GERSHGORIN INTERVAL INITIALLY USED WAS TOO SMALL.',/,15X, + &'PROBABLE CAUSE: YOUR MACHINE HAS SLOPPY FLOATING-POINT ARITHMETIC + &',/,15X,'CURE : INCREASE THE PARAMETER "FUDGE" IN LAPACK + &SUBROUTINE DSTEBZ, RECOMPILE, AND TRY AGAIN',/) + + END SUBROUTINE EIGENVALUE_CONVERGENCE_FAILURE_HELPER + + END MODULE LAPACK_GIV_MGIV_EIG_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_LANCZOS_EIG.f b/Source/Modules/LAPACK/LAPACK_LANCZOS_EIG.f index 70f72ee1..68565142 100644 --- a/Source/Modules/LAPACK/LAPACK_LANCZOS_EIG.f +++ b/Source/Modules/LAPACK/LAPACK_LANCZOS_EIG.f @@ -1,370 +1,34 @@ ! ################################################################################################################################## MODULE LAPACK_LANCZOS_EIG +! --- lapack_surgery begin --- ! - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE SCONTR, ONLY : BLNK_SUB_NAM - USE TIMDAT, ONLY : TSEC - USE LAPACK_BLAS_AUX - - USE OURTIM_Interface - -! This is the set of LAPACK routines used by the Lanczos algorithm contained in module ARPACK_LANCZOS_EIG -! The following routines are contained: - -! DGEQR2: to compute a QR factorization of a real m by n matrix A: -! A = Q * R. - -! DORM2R: to overwrite the general real m by n matrix C with -! -! Q * C if SIDE = 'L' and TRANS = 'N', or -! -! Q'* C if SIDE = 'L' and TRANS = 'T', or -! -! C * Q if SIDE = 'R' and TRANS = 'N', or -! -! C * Q' if SIDE = 'R' and TRANS = 'T', -! -! where Q is a real orthogonal matrix defined as the product of k elementary reflectors + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_LANCZOS_EIG_HELPER, ONLY : DGEQR2_HELPER => DGEQR2, & + & DORM2R_HELPER => DORM2R CONTAINS ! ################################################################################################################################## -! 001 LAPACK_LANCZOS_EIG SUBROUTINE DGEQR2( M, N, A, LDA, TAU, WORK, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSTEQR' -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. INTEGER INFO, LDA, M, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( * ) -* .. -* -* Purpose -* ======= -* -* DGEQR2 computes a QR factorization of a real m by n matrix A: -* A = Q * R. -* -* Arguments -* ========= -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On entry, the m by n matrix A. -* On exit, the elements on and above the diagonal of the array -* contain the min(m,n) by n upper trapezoidal matrix R (R is -* upper triangular if m >= n); the elements below the diagonal, -* with the array TAU, represent the orthogonal matrix Q as a -* product of elementary reflectors (see Further Details). -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,M). -* -* TAU (output) REAL(DOUBLE) array, dimension (min(M,N)) -* The scalar factors of the elementary reflectors (see Further -* Details). -* -* WORK (workspace) REAL(DOUBLE) array, dimension (N) -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -* Further Details -* =============== -* -* The matrix Q is represented as a product of elementary reflectors -* -* Q = H(1) H(2) . . . H(k), where k = min(m,n). -* -* Each H(i) has the form -* -* H(i) = I - tau * v * v' -* -* where tau is a real scalar, and v is a real vector with -* v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), -* and tau in TAU(i). -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I, K - REAL(DOUBLE) AII -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. - -* -* Test the input arguments -* - INFO = 0 - IF( M.LT.0 ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, M ) ) THEN - INFO = -4 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGEQR2', -INFO ) - RETURN - END IF -* - K = MIN( M, N ) -* - DO 10 I = 1, K -* -* Generate elementary reflector H(i) to annihilate A(i+1:m,i) -* - CALL DLARFG( M-I+1, A( I, I ), A( MIN( I+1, M ), I ), 1, - $ TAU( I ) ) - IF( I.LT.N ) THEN -* -* Apply H(i) to A(i:m,i+1:n) from the left -* - AII = A( I, I ) - A( I, I ) = ONE - CALL DLARF( 'Left', M-I+1, N-I, A( I, I ), 1, TAU( I ), - $ A( I, I+1 ), LDA, WORK ) - A( I, I ) = AII - END IF - 10 CONTINUE - + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( * ) + CALL DGEQR2_HELPER( M, N, A, LDA, TAU, WORK, INFO ) RETURN -* -* End of DGEQR2 -* END SUBROUTINE DGEQR2 ! ################################################################################################################################## -! 002 LAPACK_LANCZOS_EIG SUBROUTINE DORM2R( SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC, $ WORK, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = '' -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. CHARACTER SIDE, TRANS INTEGER INFO, K, LDA, LDC, M, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), C( LDC, * ), TAU( * ), WORK( * ) -* .. -* -* Purpose -* ======= -* -* DORM2R overwrites the general real m by n matrix C with -* -* Q * C if SIDE = 'L' and TRANS = 'N', or -* -* Q'* C if SIDE = 'L' and TRANS = 'T', or -* -* C * Q if SIDE = 'R' and TRANS = 'N', or -* -* C * Q' if SIDE = 'R' and TRANS = 'T', -* -* where Q is a real orthogonal matrix defined as the product of k -* elementary reflectors -* -* Q = H(1) H(2) . . . H(k) -* -* as returned by DGEQRF. Q is of order m if SIDE = 'L' and of order n -* if SIDE = 'R'. -* -* Arguments -* ========= -* -* SIDE (input) CHARACTER*1 -* = 'L': apply Q or Q' from the Left -* = 'R': apply Q or Q' from the Right -* -* TRANS (input) CHARACTER*1 -* = 'N': apply Q (No transpose) -* = 'T': apply Q' (Transpose) -* -* M (input) INTEGER -* The number of rows of the matrix C. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix C. N >= 0. -* -* K (input) INTEGER -* The number of elementary reflectors whose product defines -* the matrix Q. -* If SIDE = 'L', M >= K >= 0; -* if SIDE = 'R', N >= K >= 0. -* -* A (input) REAL(DOUBLE) array, dimension (LDA,K) -* The i-th column must contain the vector which defines the -* elementary reflector H(i), for i = 1,2,...,k, as returned by -* DGEQRF in the first k columns of its array argument A. -* A is modified by the routine but restored on exit. -* -* LDA (input) INTEGER -* The leading dimension of the array A. -* If SIDE = 'L', LDA >= max(1,M); -* if SIDE = 'R', LDA >= max(1,N). -* -* TAU (input) REAL(DOUBLE) array, dimension (K) -* TAU(i) must contain the scalar factor of the elementary -* reflector H(i), as returned by DGEQRF. -* -* C (input/output) REAL(DOUBLE) array, dimension (LDC,N) -* On entry, the m by n matrix C. -* On exit, C is overwritten by Q*C or Q'*C or C*Q' or C*Q. -* -* LDC (input) INTEGER -* The leading dimension of the array C. LDC >= max(1,M). -* -* WORK (workspace) REAL(DOUBLE) array, dimension -* (N) if SIDE = 'L', -* (M) if SIDE = 'R' -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL LEFT, NOTRAN - INTEGER I, I1, I2, I3, IC, JC, MI, NI, NQ - REAL(DOUBLE) AII -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. - -* -* Test the input arguments -* - INFO = 0 - LEFT = LSAME( SIDE, 'L' ) - NOTRAN = LSAME( TRANS, 'N' ) -* -* NQ is the order of Q -* - IF( LEFT ) THEN - NQ = M - ELSE - NQ = N - END IF - IF( .NOT.LEFT .AND. .NOT.LSAME( SIDE, 'R' ) ) THEN - INFO = -1 - ELSE IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) ) THEN - INFO = -2 - ELSE IF( M.LT.0 ) THEN - INFO = -3 - ELSE IF( N.LT.0 ) THEN - INFO = -4 - ELSE IF( K.LT.0 .OR. K.GT.NQ ) THEN - INFO = -5 - ELSE IF( LDA.LT.MAX( 1, NQ ) ) THEN - INFO = -7 - ELSE IF( LDC.LT.MAX( 1, M ) ) THEN - INFO = -10 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DORM2R', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( M.EQ.0 .OR. N.EQ.0 .OR. K.EQ.0 ) - $ RETURN -* - IF( ( LEFT .AND. .NOT.NOTRAN ) .OR. ( .NOT.LEFT .AND. NOTRAN ) ) - $ THEN - I1 = 1 - I2 = K - I3 = 1 - ELSE - I1 = K - I2 = 1 - I3 = -1 - END IF -* - IF( LEFT ) THEN - NI = N - JC = 1 - ELSE - MI = M - IC = 1 - END IF -* - DO 10 I = I1, I2, I3 - IF( LEFT ) THEN -* -* H(i) is applied to C(i:m,1:n) -* - MI = M - I + 1 - IC = I - ELSE -* -* H(i) is applied to C(1:m,i:n) -* - NI = N - I + 1 - JC = I - END IF -* -* Apply H(i) -* - AII = A( I, I ) - A( I, I ) = ONE - CALL DLARF( SIDE, MI, NI, A( I, I ), 1, TAU( I ), C( IC, JC ), - $ LDC, WORK ) - A( I, I ) = AII - 10 CONTINUE - + REAL(DOUBLE) A( LDA, * ), C( LDC, * ), TAU( * ), WORK( * ) + CALL DORM2R_HELPER( SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC, + $ WORK, INFO ) RETURN -* -* End of DORM2R -* END SUBROUTINE DORM2R +! --- lapack_surgery end --- ! END MODULE LAPACK_LANCZOS_EIG diff --git a/Source/Modules/LAPACK/LAPACK_LANCZOS_EIG_HELPER.f b/Source/Modules/LAPACK/LAPACK_LANCZOS_EIG_HELPER.f new file mode 100644 index 00000000..ed9c4245 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_LANCZOS_EIG_HELPER.f @@ -0,0 +1,139 @@ +! ################################################################################################################################## + + MODULE LAPACK_LANCZOS_EIG_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + + SUBROUTINE DGEQR2( M, N, A, LDA, TAU, WORK, INFO ) + + INTEGER INFO, LDA, M, N + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( * ) + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D+0 ) + INTEGER I, K + REAL(DOUBLE) AII + INTRINSIC MAX, MIN + + INFO = 0 + IF( M.LT.0 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, M ) ) THEN + INFO = -4 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGEQR2', -INFO ) + RETURN + END IF + + K = MIN( M, N ) + + DO 10 I = 1, K + CALL DLARFG( M-I+1, A( I, I ), A( MIN( I+1, M ), I ), 1, + $ TAU( I ) ) + IF( I.LT.N ) THEN + AII = A( I, I ) + A( I, I ) = ONE + CALL DLARF( 'Left', M-I+1, N-I, A( I, I ), 1, TAU( I ), + $ A( I, I+1 ), LDA, WORK ) + A( I, I ) = AII + END IF + 10 CONTINUE + + RETURN + END SUBROUTINE DGEQR2 + +! ################################################################################################################################## + + SUBROUTINE DORM2R( SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC, + $ WORK, INFO ) + + CHARACTER SIDE, TRANS + INTEGER INFO, K, LDA, LDC, M, N + REAL(DOUBLE) A( LDA, * ), C( LDC, * ), TAU( * ), WORK( * ) + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D+0 ) + LOGICAL LEFT, NOTRAN + INTEGER I, I1, I2, I3, IC, JC, MI, NI, NQ + REAL(DOUBLE) AII + LOGICAL LSAME + EXTERNAL LSAME + INTRINSIC MAX + + INFO = 0 + LEFT = LSAME( SIDE, 'L' ) + NOTRAN = LSAME( TRANS, 'N' ) + + IF( LEFT ) THEN + NQ = M + ELSE + NQ = N + END IF + IF( .NOT.LEFT .AND. .NOT.LSAME( SIDE, 'R' ) ) THEN + INFO = -1 + ELSE IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) ) THEN + INFO = -2 + ELSE IF( M.LT.0 ) THEN + INFO = -3 + ELSE IF( N.LT.0 ) THEN + INFO = -4 + ELSE IF( K.LT.0 .OR. K.GT.NQ ) THEN + INFO = -5 + ELSE IF( LDA.LT.MAX( 1, NQ ) ) THEN + INFO = -7 + ELSE IF( LDC.LT.MAX( 1, M ) ) THEN + INFO = -10 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DORM2R', -INFO ) + RETURN + END IF + + IF( M.EQ.0 .OR. N.EQ.0 .OR. K.EQ.0 ) + $ RETURN + + IF( ( LEFT .AND. .NOT.NOTRAN ) .OR. ( .NOT.LEFT .AND. NOTRAN ) ) + $ THEN + I1 = 1 + I2 = K + I3 = 1 + ELSE + I1 = K + I2 = 1 + I3 = -1 + END IF + + IF( LEFT ) THEN + NI = N + JC = 1 + ELSE + MI = M + IC = 1 + END IF + + DO 10 I = I1, I2, I3 + IF( LEFT ) THEN + MI = M - I + 1 + IC = I + ELSE + NI = N - I + 1 + JC = I + END IF + + AII = A( I, I ) + A( I, I ) = ONE + CALL DLARF( SIDE, MI, NI, A( I, I ), 1, TAU( I ), C( IC, JC ), + $ LDC, WORK ) + A( I, I ) = AII + 10 CONTINUE + + RETURN + END SUBROUTINE DORM2R + + END MODULE LAPACK_LANCZOS_EIG_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGB.f b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGB.f index 58e4ee50..8196a65d 100644 --- a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGB.f +++ b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGB.f @@ -1,905 +1,77 @@ ! ################################################################################################################################## MODULE LAPACK_LIN_EQN_DGB +! --- lapack_surgery begin --- ! USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06, SC1 - USE SCONTR, ONLY : LINKNO, BLNK_SUB_NAM - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, STIME, TSEC - USE LAPACK_BLAS_AUX + USE LAPACK_LIN_EQN_DGB_KERNEL, ONLY: DGBTRF_KERNEL => DGBTRF, + & DGBTRS_KERNEL => DGBTRS, + & DGBTF2_KERNEL => DGBTF2 - USE OURTIM_Interface - USE OUTA_HERE_Interface - - character(1*byte), parameter :: cr13_dgb = char(13) - CHARACTER(44*BYTE), PRIVATE :: MODNAM ! Name to write to screen to describe module being run. - -! This is a set of LAPACK routines for factorization and solution of linear eqns for general band matrices - -! DGBTRF: Driver to compute a LU factorization of a real m-by-n band matrix A using partial pivoting with row interchanges. - -! DGBTRS: to solve a system of linear equations -! A * X = B or A' * X = B -! with a general band matrix A using the LU factorization computed by DGBTRF. - -! DGBTF2: to compute a LU factorization of a real m-by-n band matrix A using partial pivoting with row interchanges. -! This subroutine is called by DGBTRF +! This facade preserves the historical MYSTRAN module name while delegating +! all numerical work to LAPACK_LIN_EQN_DGB_KERNEL. CONTAINS ! ################################################################################################################################## ! 001 LAPACK_LINEAR_EQN_DGB +! --- lapack_peeloff begin --- ! SUBROUTINE DGBTRF( M, N, KL, KU, AB, LDAB, IPIV, INFO ) USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGBTRF' -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. INTEGER INFO, KL, KU, LDAB, M, N -* .. -* .. Array Arguments .. INTEGER IPIV( * ) - REAL(DOUBLE) AB( LDAB, * ) -* .. -* -* Purpose -* ======= -* -* DGBTRF computes an LU factorization of a real m-by-n band matrix A -* using partial pivoting with row interchanges. -* -* This is the blocked version of the algorithm, calling Level 3 BLAS. -* -* Arguments -* ========= -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* KL (input) INTEGER -* The number of subdiagonals within the band of A. KL >= 0. -* -* KU (input) INTEGER -* The number of superdiagonals within the band of A. KU >= 0. -* -* AB (input/output) REAL(DOUBLE) array, dimension (LDAB,N) -* On entry, the matrix A in band storage, in rows KL+1 to -* 2*KL+KU+1; rows 1 to KL of the array need not be set. -* The j-th column of A is stored in the j-th column of the -* array AB as follows: -* AB(kl+ku+1+i-j,j) = A(i,j) for max(1,j-ku)<=i<=min(m,j+kl) -* -* On exit, details of the factorization: U is stored as an -* upper triangular band matrix with KL+KU superdiagonals in -* rows 1 to KL+KU+1, and the multipliers used during the -* factorization are stored in rows KL+KU+2 to 2*KL+KU+1. -* See below for further details. -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= 2*KL+KU+1. -* -* IPIV (output) INTEGER array, dimension (min(M,N)) -* The pivot indices; for 1 <= i <= min(M,N), row i of the -* matrix was interchanged with row IPIV(i). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = +i, U(i,i) is exactly zero. The factorization -* has been completed, but the factor U is exactly -* singular, and division by zero will occur if it is used -* to solve a system of equations. -* -* Further Details -* =============== -* -* The band storage scheme is illustrated by the following example, when -* M = N = 6, KL = 2, KU = 1: -* -* On entry: On exit: -* -* * * * + + + * * * u14 u25 u36 -* * * + + + + * * u13 u24 u35 u46 -* * a12 a23 a34 a45 a56 * u12 u23 u34 u45 u56 -* a11 a22 a33 a44 a55 a66 u11 u22 u33 u44 u55 u66 -* a21 a32 a43 a54 a65 * m21 m32 m43 m54 m65 * -* a31 a42 a53 a64 * * m31 m42 m53 m64 * * -* -* Array elements marked * are not used by the routine; elements marked -* + need not be set on entry, but are required by the routine to store -* elements of U because of fill-in resulting from the row interchanges. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) - INTEGER NBMAX, LDWORK - PARAMETER ( NBMAX = 64, LDWORK = NBMAX+1 ) -* .. -* .. Local Scalars .. - INTEGER I, I2, I3, II, IP, J, J2, J3, JB, JJ, JM, JP, - $ JU, K2, KM, KV, NB, NW -! /////////////////////////////////////////////////////////////////////B -! Add this so we can use IMPLICIT NONE (switch "-in" when compiling) - integer numblk, iblock -! /////////////////////////////////////////////////////////////////////E - REAL(DOUBLE) TEMP -* .. -* .. Local Arrays .. - REAL(DOUBLE) WORK13( LDWORK, NBMAX ), - $ WORK31( LDWORK, NBMAX ) -* .. -* .. External Functions .. - INTEGER ILAENV - EXTERNAL ILAENV - + REAL(DOUBLE) AB( LDAB, * ) -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. - -* -* KV is the number of superdiagonals in the factor U, allowing for -* fill-in -* - KV = KU + KL -* -* Test the input parameters. -* - INFO = 0 - IF( M.LT.0 ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( KL.LT.0 ) THEN - INFO = -3 - ELSE IF( KU.LT.0 ) THEN - INFO = -4 - ELSE IF( LDAB.LT.KL+KV+1 ) THEN - INFO = -6 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGBTRF', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( M.EQ.0 .OR. N.EQ.0 ) - $ RETURN -* -* Determine the block size for this environment -* - NB = ILAENV( 1, 'DGBTRF', ' ', M, N, KL, KU ) -* -* The block size must not exceed the limit set by the size of the -* local arrays WORK13 and WORK31. -* - NB = MIN( NB, NBMAX ) -* - IF( NB.LE.1 .OR. NB.GT.KL ) THEN -* -* Use unblocked code -* - CALL DGBTF2( M, N, KL, KU, AB, LDAB, IPIV, INFO ) - ELSE -* -* Use blocked code -* -* Zero the superdiagonal elements of the work array WORK13 -* - DO 20 J = 1, NB - DO 10 I = 1, J - 1 - WORK13( I, J ) = ZERO - 10 CONTINUE - 20 CONTINUE -* -* Zero the subdiagonal elements of the work array WORK31 -* - DO 40 J = 1, NB - DO 30 I = J + 1, NB - WORK31( I, J ) = ZERO - 30 CONTINUE - 40 CONTINUE -* -* Gaussian elimination with partial pivoting -* -* Set fill-in elements in columns KU+2 to KV to zero -* - DO 60 J = KU + 2, MIN( KV, N ) - DO 50 I = KV - J + 2, KL - AB( I, J ) = ZERO - 50 CONTINUE - 60 CONTINUE -* -* JU is the index of the last column affected by the current -* stage of the factorization -* - JU = 1 -* - numblk = int(MIN(M,n)/nb) + 1 - iblock = 0 - WRITE (*,*) - DO 180 J = 1, MIN( M, N ), NB - iblock = iblock + 1 - JB = MIN( NB, MIN( M, N )-J+1 ) - write(sc1,12345,advance='no') iblock,numblk,jb,cr13_dgb -* -* The active part of the matrix is partitioned -* -* A11 A12 A13 -* A21 A22 A23 -* A31 A32 A33 -* -* Here A11, A21 and A31 denote the current block of JB columns -* which is about to be factorized. The number of rows in the -* partitioning are JB, I2, I3 respectively, and the numbers -* of columns are JB, J2, J3. The superdiagonal elements of A13 -* and the subdiagonal elements of A31 lie outside the band. -* - I2 = MIN( KL-JB, M-J-JB+1 ) - I3 = MIN( JB, M-J-KL+1 ) -* -* J2 and J3 are computed after JU has been updated. -* -* Factorize the current block of JB columns -* - DO 80 JJ = J, J + JB - 1 -* -* Set fill-in elements in column JJ+KV to zero -* - IF( JJ+KV.LE.N ) THEN - DO 70 I = 1, KL - AB( I, JJ+KV ) = ZERO - 70 CONTINUE - END IF -* -* Find pivot and test for singularity. KM is the number of -* subdiagonal elements in the current column. -* - KM = MIN( KL, M-JJ ) - JP = IDAMAX( KM+1, AB( KV+1, JJ ), 1 ) - IPIV( JJ ) = JP + JJ - J - IF( AB( KV+JP, JJ ).NE.ZERO ) THEN - JU = MAX( JU, MIN( JJ+KU+JP-1, N ) ) - IF( JP.NE.1 ) THEN -* -* Apply interchange to columns J to J+JB-1 -* - IF( JP+JJ-1.LT.J+KL ) THEN -* - CALL DSWAP( JB, AB( KV+1+JJ-J, J ), LDAB-1, - $ AB( KV+JP+JJ-J, J ), LDAB-1 ) - ELSE -* -* The interchange affects columns J to JJ-1 of A31 -* which are stored in the work array WORK31 -* - CALL DSWAP( JJ-J, AB( KV+1+JJ-J, J ), LDAB-1, - $ WORK31( JP+JJ-J-KL, 1 ), LDWORK ) - CALL DSWAP( J+JB-JJ, AB( KV+1, JJ ), LDAB-1, - $ AB( KV+JP, JJ ), LDAB-1 ) - END IF - END IF -* -* Compute multipliers -* - CALL DSCAL( KM, ONE / AB( KV+1, JJ ), AB( KV+2, JJ ), - $ 1 ) -* -* Update trailing submatrix within the band and within -* the current block. JM is the index of the last column -* which needs to be updated. -* - JM = MIN( JU, J+JB-1 ) - IF( JM.GT.JJ ) - $ CALL DGER( KM, JM-JJ, -ONE, AB( KV+2, JJ ), 1, - $ AB( KV, JJ+1 ), LDAB-1, - $ AB( KV+1, JJ+1 ), LDAB-1 ) - ELSE -* -* If pivot is zero, set INFO to the index of the pivot -* unless a zero pivot has already been found. -* - IF( INFO.EQ.0 ) - $ INFO = JJ - END IF -* -* Copy current column of A31 into the work array WORK31 -* - NW = MIN( JJ-J+1, I3 ) - IF( NW.GT.0 ) - $ CALL DCOPY( NW, AB( KV+KL+1-JJ+J, JJ ), 1, - $ WORK31( 1, JJ-J+1 ), 1 ) - 80 CONTINUE - IF( J+JB.LE.N ) THEN -* -* Apply the row interchanges to the other blocks. -* - J2 = MIN( JU-J+1, KV ) - JB - J3 = MAX( 0, JU-J-KV+1 ) -* -* Use DLASWP to apply the row interchanges to A12, A22, and -* A32. -* - CALL DLASWP( J2, AB( KV+1-JB, J+JB ), LDAB-1, 1, JB, - $ IPIV( J ), 1 ) -* -* Adjust the pivot indices. -* - DO 90 I = J, J + JB - 1 - IPIV( I ) = IPIV( I ) + J - 1 - 90 CONTINUE -* -* Apply the row interchanges to A13, A23, and A33 -* columnwise. -* - K2 = J - 1 + JB + J2 - DO 110 I = 1, J3 - JJ = K2 + I - DO 100 II = J + I - 1, J + JB - 1 - IP = IPIV( II ) - IF( IP.NE.II ) THEN - TEMP = AB( KV+1+II-JJ, JJ ) - AB( KV+1+II-JJ, JJ ) = AB( KV+1+IP-JJ, JJ ) - AB( KV+1+IP-JJ, JJ ) = TEMP - END IF - 100 CONTINUE - 110 CONTINUE -* -* Update the relevant part of the trailing submatrix -* - IF( J2.GT.0 ) THEN -* -* Update A12 -* - CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', - $ JB, J2, ONE, AB( KV+1, J ), LDAB-1, - $ AB( KV+1-JB, J+JB ), LDAB-1 ) -* - IF( I2.GT.0 ) THEN -* -* Update A22 -* - CALL DGEMM( 'No transpose', 'No transpose', I2, J2, - $ JB, -ONE, AB( KV+1+JB, J ), LDAB-1, - $ AB( KV+1-JB, J+JB ), LDAB-1, ONE, - $ AB( KV+1, J+JB ), LDAB-1 ) - END IF -* - IF( I3.GT.0 ) THEN -* -* Update A32 -* - CALL DGEMM( 'No transpose', 'No transpose', I3, J2, - $ JB, -ONE, WORK31, LDWORK, - $ AB( KV+1-JB, J+JB ), LDAB-1, ONE, - $ AB( KV+KL+1-JB, J+JB ), LDAB-1 ) - END IF - END IF -* - IF( J3.GT.0 ) THEN -* -* Copy the lower triangle of A13 into the work array -* WORK13 -* - DO 130 JJ = 1, J3 - DO 120 II = JJ, JB - WORK13( II, JJ ) = AB( II-JJ+1, JJ+J+KV-1 ) - 120 CONTINUE - 130 CONTINUE -* -* Update A13 in the work array -* - CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', - $ JB, J3, ONE, AB( KV+1, J ), LDAB-1, - $ WORK13, LDWORK ) -* - IF( I2.GT.0 ) THEN -* -* Update A23 -* - CALL DGEMM( 'No transpose', 'No transpose', I2, J3, - $ JB, -ONE, AB( KV+1+JB, J ), LDAB-1, - $ WORK13, LDWORK, ONE, AB( 1+JB, J+KV ), - $ LDAB-1 ) - END IF -* - IF( I3.GT.0 ) THEN -* -* Update A33 -* - CALL DGEMM( 'No transpose', 'No transpose', I3, J3, - $ JB, -ONE, WORK31, LDWORK, WORK13, - $ LDWORK, ONE, AB( 1+KL, J+KV ), LDAB-1 ) - END IF -* -* Copy the lower triangle of A13 back into place -* - DO 150 JJ = 1, J3 - DO 140 II = JJ, JB - AB( II-JJ+1, JJ+J+KV-1 ) = WORK13( II, JJ ) - 140 CONTINUE - 150 CONTINUE - END IF - ELSE -* -* Adjust the pivot indices. -* - DO 160 I = J, J + JB - 1 - IPIV( I ) = IPIV( I ) + J - 1 - 160 CONTINUE - END IF -* -* Partially undo the interchanges in the current block to -* restore the upper triangular form of A31 and copy the upper -* triangle of A31 back into place -* - DO 170 JJ = J + JB - 1, J, -1 - JP = IPIV( JJ ) - JJ + 1 - IF( JP.NE.1 ) THEN -* -* Apply interchange to columns J to JJ-1 -* - IF( JP+JJ-1.LT.J+KL ) THEN -* -* The interchange does not affect A31 -* - CALL DSWAP( JJ-J, AB( KV+1+JJ-J, J ), LDAB-1, - $ AB( KV+JP+JJ-J, J ), LDAB-1 ) - ELSE -* -* The interchange does affect A31 -* - CALL DSWAP( JJ-J, AB( KV+1+JJ-J, J ), LDAB-1, - $ WORK31( JP+JJ-J-KL, 1 ), LDWORK ) - END IF - END IF -* -* Copy the current column of A31 back into place -* - NW = MIN( I3, JJ-J+1 ) - IF( NW.GT.0 ) - $ CALL DCOPY( NW, WORK31( 1, JJ-J+1 ), 1, - $ AB( KV+KL+1-JJ+J, JJ ), 1 ) - 170 CONTINUE - 180 CONTINUE - END IF -* -12345 format(5X,'Block ',i8,' of ',i8,'. Factoring rows 1 thru: ',i8,a) + CALL DGBTRF_KERNEL( M, N, KL, KU, AB, LDAB, IPIV, INFO ) RETURN -* -* End of DGBTRF -* + END SUBROUTINE DGBTRF +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 002 LAPACK_LINEAR_EQN_DGB +! --- lapack_peeloff begin --- ! SUBROUTINE DGBTRS( TRANS, N, KL, KU, NRHS, AB, LDAB, IPIV, B, LDB, - $ INFO, dtbsv_msg ) ! I added dtbsv_msg + $ INFO, dtbsv_msg ) USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGBTRS' -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. - CHARACTER TRANS - character*1 dtbsv_msg + CHARACTER TRANS, dtbsv_msg INTEGER INFO, KL, KU, LDAB, LDB, N, NRHS -* .. -* .. Array Arguments .. INTEGER IPIV( * ) - REAL(DOUBLE) AB( LDAB, * ), B( LDB, * ) -* .. -* -* Purpose -* ======= -* -* DGBTRS solves a system of linear equations -* A * X = B or A' * X = B -* with a general band matrix A using the LU factorization computed -* by DGBTRF. -* -* Arguments -* ========= -* -* TRANS (input) CHARACTER*1 -* Specifies the form of the system of equations. -* = 'N': A * X = B (No transpose) -* = 'T': A'* X = B (Transpose) -* = 'C': A'* X = B (Conjugate transpose = Transpose) -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* KL (input) INTEGER -* The number of subdiagonals within the band of A. KL >= 0. -* -* KU (input) INTEGER -* The number of superdiagonals within the band of A. KU >= 0. -* -* NRHS (input) INTEGER -* The number of right hand sides, i.e., the number of columns -* of the matrix B. NRHS >= 0. -* -* AB (input) REAL(DOUBLE) array, dimension (LDAB,N) -* Details of the LU factorization of the band matrix A, as -* computed by DGBTRF. U is stored as an upper triangular band -* matrix with KL+KU superdiagonals in rows 1 to KL+KU+1, and -* the multipliers used during the factorization are stored in -* rows KL+KU+2 to 2*KL+KU+1. -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= 2*KL+KU+1. -* -* IPIV (input) INTEGER array, dimension (N) -* The pivot indices; for 1 <= i <= N, row i of the matrix was -* interchanged with row IPIV(i). -* -* B (input/output) REAL(DOUBLE) array, dimension (LDB,NRHS) -* On entry, the right hand side matrix B. -* On exit, the solution matrix X. -* -* LDB (input) INTEGER -* The leading dimension of the array B. LDB >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -! dtbsv_msg (input) CHARACTER -! = 'Y', have subr DTBSV print Fwd, Back pass messages - -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL LNOTI, NOTRAN - INTEGER I, J, KD, L, LM -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. + REAL(DOUBLE) AB( LDAB, * ), B( LDB, * ) -* -* Test the input parameters. -* - INFO = 0 - NOTRAN = LSAME( TRANS, 'N' ) - IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) .AND. .NOT. - $ LSAME( TRANS, 'C' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( KL.LT.0 ) THEN - INFO = -3 - ELSE IF( KU.LT.0 ) THEN - INFO = -4 - ELSE IF( NRHS.LT.0 ) THEN - INFO = -5 - ELSE IF( LDAB.LT.( 2*KL+KU+1 ) ) THEN - INFO = -7 - ELSE IF( LDB.LT.MAX( 1, N ) ) THEN - INFO = -10 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGBTRS', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 .OR. NRHS.EQ.0 ) - $ RETURN -* - KD = KU + KL + 1 - LNOTI = KL.GT.0 -* - IF( NOTRAN ) THEN -* -* Solve A*X = B. -* -* Solve L*X = B, overwriting B with X. -* -* L is represented as a product of permutations and unit lower -* triangular matrices L = P(1) * L(1) * ... * P(n-1) * L(n-1), -* where each transformation L(i) is a rank-one modification of -* the identity matrix. -* - IF( LNOTI ) THEN - DO 10 J = 1, N - 1 - LM = MIN( KL, N-J ) - L = IPIV( J ) - IF( L.NE.J ) - $ CALL DSWAP( NRHS, B( L, 1 ), LDB, B( J, 1 ), LDB ) - CALL DGER( LM, NRHS, -ONE, AB( KD+1, J ), 1, B( J, 1 ), - $ LDB, B( J+1, 1 ), LDB ) - 10 CONTINUE - END IF -* - DO 20 I = 1, NRHS -* -* Solve U*X = B, overwriting B with X. -* - CALL DTBSV( 'Upper', 'No transpose', 'Non-unit', N, KL+KU, - $ AB, LDAB, B( 1, I ), 1, dtbsv_msg ) - 20 CONTINUE -* - ELSE -* -* Solve A'*X = B. -* - DO 30 I = 1, NRHS -* -* Solve U'*X = B, overwriting B with X. -* - write(sc1,*) - CALL DTBSV( 'Upper', 'Transpose', 'Non-unit', N, KL+KU, AB, - $ LDAB, B( 1, I ), 1, dtbsv_msg ) - 30 CONTINUE -* -* Solve L'*X = B, overwriting B with X. -* - IF( LNOTI ) THEN - DO 40 J = N - 1, 1, -1 - LM = MIN( KL, N-J ) - CALL DGEMV( 'Transpose', LM, NRHS, -ONE, B( J+1, 1 ), - $ LDB, AB( KD+1, J ), 1, ONE, B( J, 1 ), LDB ) - L = IPIV( J ) - IF( L.NE.J ) - $ CALL DSWAP( NRHS, B( L, 1 ), LDB, B( J, 1 ), LDB ) - 40 CONTINUE - END IF - END IF + CALL DGBTRS_KERNEL( TRANS, N, KL, KU, NRHS, AB, LDAB, IPIV, B, + $ LDB, INFO, dtbsv_msg ) RETURN -* -* End of DGBTRS -* + END SUBROUTINE DGBTRS +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 003 LAPACK_LINEAR_EQN_DGB +! --- lapack_peeloff begin --- ! SUBROUTINE DGBTF2( M, N, KL, KU, AB, LDAB, IPIV, INFO ) USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGBTF2' -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. INTEGER INFO, KL, KU, LDAB, M, N -* .. -* .. Array Arguments .. INTEGER IPIV( * ) - REAL(DOUBLE) AB( LDAB, * ) -* .. -* -* Purpose -* ======= -* -* DGBTF2 computes an LU factorization of a real m-by-n band matrix A -* using partial pivoting with row interchanges. -* -* This is the unblocked version of the algorithm, calling Level 2 BLAS. -* -* Arguments -* ========= -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* KL (input) INTEGER -* The number of subdiagonals within the band of A. KL >= 0. -* -* KU (input) INTEGER -* The number of superdiagonals within the band of A. KU >= 0. -* -* AB (input/output) REAL(DOUBLE) array, dimension (LDAB,N) -* On entry, the matrix A in band storage, in rows KL+1 to -* 2*KL+KU+1; rows 1 to KL of the array need not be set. -* The j-th column of A is stored in the j-th column of the -* array AB as follows: -* AB(kl+ku+1+i-j,j) = A(i,j) for max(1,j-ku)<=i<=min(m,j+kl) -* -* On exit, details of the factorization: U is stored as an -* upper triangular band matrix with KL+KU superdiagonals in -* rows 1 to KL+KU+1, and the multipliers used during the -* factorization are stored in rows KL+KU+2 to 2*KL+KU+1. -* See below for further details. -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= 2*KL+KU+1. -* -* IPIV (output) INTEGER array, dimension (min(M,N)) -* The pivot indices; for 1 <= i <= min(M,N), row i of the -* matrix was interchanged with row IPIV(i). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = +i, U(i,i) is exactly zero. The factorization -* has been completed, but the factor U is exactly -* singular, and division by zero will occur if it is used -* to solve a system of equations. -* -* Further Details -* =============== -* -* The band storage scheme is illustrated by the following example, when -* M = N = 6, KL = 2, KU = 1: -* -* On entry: On exit: -* -* * * * + + + * * * u14 u25 u36 -* * * + + + + * * u13 u24 u35 u46 -* * a12 a23 a34 a45 a56 * u12 u23 u34 u45 u56 -* a11 a22 a33 a44 a55 a66 u11 u22 u33 u44 u55 u66 -* a21 a32 a43 a54 a65 * m21 m32 m43 m54 m65 * -* a31 a42 a53 a64 * * m31 m42 m53 m64 * * -* -* Array elements marked * are not used by the routine; elements marked -* + need not be set on entry, but are required by the routine to store -* elements of U, because of fill-in resulting from the row -* interchanges. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I, J, JP, JU, KM, KV -* .. -* .. External Functions .. -!!1 EXTERNAL IDAMAX -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. + REAL(DOUBLE) AB( LDAB, * ) -* -* KV is the number of superdiagonals in the factor U, allowing for -* fill-in. -* - KV = KU + KL -* -* Test the input parameters. -* - INFO = 0 - IF( M.LT.0 ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( KL.LT.0 ) THEN - INFO = -3 - ELSE IF( KU.LT.0 ) THEN - INFO = -4 - ELSE IF( LDAB.LT.KL+KV+1 ) THEN - INFO = -6 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGBTF2', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( M.EQ.0 .OR. N.EQ.0 ) - $ RETURN -* -* Gaussian elimination with partial pivoting -* -* Set fill-in elements in columns KU+2 to KV to zero. -* - DO 20 J = KU + 2, MIN( KV, N ) - DO 10 I = KV - J + 2, KL - AB( I, J ) = ZERO - 10 CONTINUE - 20 CONTINUE -* -* JU is the index of the last column affected by the current stage -* of the factorization. -* - JU = 1 -* - DO 40 J = 1, MIN( M, N ) -* -* Set fill-in elements in column J+KV to zero. -* - IF( J+KV.LE.N ) THEN - DO 30 I = 1, KL - AB( I, J+KV ) = ZERO - 30 CONTINUE - END IF -* -* Find pivot and test for singularity. KM is the number of -* subdiagonal elements in the current column. -* - KM = MIN( KL, M-J ) - JP = IDAMAX( KM+1, AB( KV+1, J ), 1 ) - IPIV( J ) = JP + J - 1 - IF( AB( KV+JP, J ).NE.ZERO ) THEN - JU = MAX( JU, MIN( J+KU+JP-1, N ) ) -* -* Apply interchange to columns J to JU. -* - IF( JP.NE.1 ) - $ CALL DSWAP( JU-J+1, AB( KV+JP, J ), LDAB-1, - $ AB( KV+1, J ), LDAB-1 ) -* - IF( KM.GT.0 ) THEN -* -* Compute multipliers. -* - CALL DSCAL( KM, ONE / AB( KV+1, J ), AB( KV+2, J ), 1 ) -* -* Update trailing submatrix within the band. -* - IF( JU.GT.J ) - $ CALL DGER( KM, JU-J, -ONE, AB( KV+2, J ), 1, - $ AB( KV, J+1 ), LDAB-1, AB( KV+1, J+1 ), - $ LDAB-1 ) - END IF - ELSE -* -* If pivot is zero, set INFO to the index of the pivot -* unless a zero pivot has already been found. -* - IF( INFO.EQ.0 ) - $ INFO = J - END IF - 40 CONTINUE + CALL DGBTF2_KERNEL( M, N, KL, KU, AB, LDAB, IPIV, INFO ) RETURN -* -* End of DGBTF2 -* + END SUBROUTINE DGBTF2 +! --- lapack_peeloff end --- ! +! --- lapack_surgery end --- ! END MODULE LAPACK_LIN_EQN_DGB diff --git a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGB_KERNEL.f b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGB_KERNEL.f new file mode 100644 index 00000000..286de03e --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGB_KERNEL.f @@ -0,0 +1,900 @@ +! ################################################################################################################################## + + MODULE LAPACK_LIN_EQN_DGB_KERNEL +! --- lapack_surgery begin --- ! + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE SCONTR, ONLY : BLNK_SUB_NAM + USE LAPACK_BLAS_AUX + +! This is a set of LAPACK routines for factorization and solution of linear eqns for general band matrices + +! DGBTRF: Driver to compute a LU factorization of a real m-by-n band matrix A using partial pivoting with row interchanges. + +! DGBTRS: to solve a system of linear equations +! A * X = B or A' * X = B +! with a general band matrix A using the LU factorization computed by DGBTRF. + +! DGBTF2: to compute a LU factorization of a real m-by-n band matrix A using partial pivoting with row interchanges. +! This subroutine is called by DGBTRF + + CONTAINS + +! ################################################################################################################################## +! 001 LAPACK_LINEAR_EQN_DGB + +! --- lapack_peeloff begin --- ! + SUBROUTINE DGBTRF( M, N, KL, KU, AB, LDAB, IPIV, INFO ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGBTRF' +* +* -- LAPACK routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 +* +* .. Scalar Arguments .. + INTEGER INFO, KL, KU, LDAB, M, N +* .. +* .. Array Arguments .. + INTEGER IPIV( * ) + REAL(DOUBLE) AB( LDAB, * ) +* .. +* +* Purpose +* ======= +* +* DGBTRF computes an LU factorization of a real m-by-n band matrix A +* using partial pivoting with row interchanges. +* +* This is the blocked version of the algorithm, calling Level 3 BLAS. +* +* Arguments +* ========= +* +* M (input) INTEGER +* The number of rows of the matrix A. M >= 0. +* +* N (input) INTEGER +* The number of columns of the matrix A. N >= 0. +* +* KL (input) INTEGER +* The number of subdiagonals within the band of A. KL >= 0. +* +* KU (input) INTEGER +* The number of superdiagonals within the band of A. KU >= 0. +* +* AB (input/output) REAL(DOUBLE) array, dimension (LDAB,N) +* On entry, the matrix A in band storage, in rows KL+1 to +* 2*KL+KU+1; rows 1 to KL of the array need not be set. +* The j-th column of A is stored in the j-th column of the +* array AB as follows: +* AB(kl+ku+1+i-j,j) = A(i,j) for max(1,j-ku)<=i<=min(m,j+kl) +* +* On exit, details of the factorization: U is stored as an +* upper triangular band matrix with KL+KU superdiagonals in +* rows 1 to KL+KU+1, and the multipliers used during the +* factorization are stored in rows KL+KU+2 to 2*KL+KU+1. +* See below for further details. +* +* LDAB (input) INTEGER +* The leading dimension of the array AB. LDAB >= 2*KL+KU+1. +* +* IPIV (output) INTEGER array, dimension (min(M,N)) +* The pivot indices; for 1 <= i <= min(M,N), row i of the +* matrix was interchanged with row IPIV(i). +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -i, the i-th argument had an illegal value +* > 0: if INFO = +i, U(i,i) is exactly zero. The factorization +* has been completed, but the factor U is exactly +* singular, and division by zero will occur if it is used +* to solve a system of equations. +* +* Further Details +* =============== +* +* The band storage scheme is illustrated by the following example, when +* M = N = 6, KL = 2, KU = 1: +* +* On entry: On exit: +* +* * * * + + + * * * u14 u25 u36 +* * * + + + + * * u13 u24 u35 u46 +* * a12 a23 a34 a45 a56 * u12 u23 u34 u45 u56 +* a11 a22 a33 a44 a55 a66 u11 u22 u33 u44 u55 u66 +* a21 a32 a43 a54 a65 * m21 m32 m43 m54 m65 * +* a31 a42 a53 a64 * * m31 m42 m53 m64 * * +* +* Array elements marked * are not used by the routine; elements marked +* + need not be set on entry, but are required by the routine to store +* elements of U because of fill-in resulting from the row interchanges. +* +* ===================================================================== +* +* .. Parameters .. + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) + INTEGER NBMAX, LDWORK + PARAMETER ( NBMAX = 64, LDWORK = NBMAX+1 ) +* .. +* .. Local Scalars .. + INTEGER I, I2, I3, II, IP, J, J2, J3, JB, JJ, JM, JP, + $ JU, K2, KM, KV, NB, NW +! /////////////////////////////////////////////////////////////////////B +! Add this so we can use IMPLICIT NONE (switch "-in" when compiling) + integer numblk, iblock +! /////////////////////////////////////////////////////////////////////E + REAL(DOUBLE) TEMP +* .. +* .. Local Arrays .. + REAL(DOUBLE) WORK13( LDWORK, NBMAX ), + $ WORK31( LDWORK, NBMAX ) +* .. +* .. External Functions .. + INTEGER ILAENV + EXTERNAL ILAENV + + +* .. +* .. External Subroutines .. +* .. +* .. Intrinsic Functions .. + INTRINSIC MAX, MIN +* .. +* .. Executable Statements .. + +* +* KV is the number of superdiagonals in the factor U, allowing for +* fill-in +* + KV = KU + KL +* +* Test the input parameters. +* + INFO = 0 + IF( M.LT.0 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( KL.LT.0 ) THEN + INFO = -3 + ELSE IF( KU.LT.0 ) THEN + INFO = -4 + ELSE IF( LDAB.LT.KL+KV+1 ) THEN + INFO = -6 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGBTRF', -INFO ) + RETURN + END IF +* +* Quick return if possible +* + IF( M.EQ.0 .OR. N.EQ.0 ) + $ RETURN +* +* Determine the block size for this environment +* + NB = ILAENV( 1, 'DGBTRF', ' ', M, N, KL, KU ) +* +* The block size must not exceed the limit set by the size of the +* local arrays WORK13 and WORK31. +* + NB = MIN( NB, NBMAX ) +* + IF( NB.LE.1 .OR. NB.GT.KL ) THEN +* +* Use unblocked code +* + CALL DGBTF2( M, N, KL, KU, AB, LDAB, IPIV, INFO ) + ELSE +* +* Use blocked code +* +* Zero the superdiagonal elements of the work array WORK13 +* + DO 20 J = 1, NB + DO 10 I = 1, J - 1 + WORK13( I, J ) = ZERO + 10 CONTINUE + 20 CONTINUE +* +* Zero the subdiagonal elements of the work array WORK31 +* + DO 40 J = 1, NB + DO 30 I = J + 1, NB + WORK31( I, J ) = ZERO + 30 CONTINUE + 40 CONTINUE +* +* Gaussian elimination with partial pivoting +* +* Set fill-in elements in columns KU+2 to KV to zero +* + DO 60 J = KU + 2, MIN( KV, N ) + DO 50 I = KV - J + 2, KL + AB( I, J ) = ZERO + 50 CONTINUE + 60 CONTINUE +* +* JU is the index of the last column affected by the current +* stage of the factorization +* + JU = 1 +* + numblk = int(MIN(M,n)/nb) + 1 + iblock = 0 + WRITE (*,*) + DO 180 J = 1, MIN( M, N ), NB + iblock = iblock + 1 + JB = MIN( NB, MIN( M, N )-J+1 ) +* +* The active part of the matrix is partitioned +* +* A11 A12 A13 +* A21 A22 A23 +* A31 A32 A33 +* +* Here A11, A21 and A31 denote the current block of JB columns +* which is about to be factorized. The number of rows in the +* partitioning are JB, I2, I3 respectively, and the numbers +* of columns are JB, J2, J3. The superdiagonal elements of A13 +* and the subdiagonal elements of A31 lie outside the band. +* + I2 = MIN( KL-JB, M-J-JB+1 ) + I3 = MIN( JB, M-J-KL+1 ) +* +* J2 and J3 are computed after JU has been updated. +* +* Factorize the current block of JB columns +* + DO 80 JJ = J, J + JB - 1 +* +* Set fill-in elements in column JJ+KV to zero +* + IF( JJ+KV.LE.N ) THEN + DO 70 I = 1, KL + AB( I, JJ+KV ) = ZERO + 70 CONTINUE + END IF +* +* Find pivot and test for singularity. KM is the number of +* subdiagonal elements in the current column. +* + KM = MIN( KL, M-JJ ) + JP = IDAMAX( KM+1, AB( KV+1, JJ ), 1 ) + IPIV( JJ ) = JP + JJ - J + IF( AB( KV+JP, JJ ).NE.ZERO ) THEN + JU = MAX( JU, MIN( JJ+KU+JP-1, N ) ) + IF( JP.NE.1 ) THEN +* +* Apply interchange to columns J to J+JB-1 +* + IF( JP+JJ-1.LT.J+KL ) THEN +* + CALL DSWAP( JB, AB( KV+1+JJ-J, J ), LDAB-1, + $ AB( KV+JP+JJ-J, J ), LDAB-1 ) + ELSE +* +* The interchange affects columns J to JJ-1 of A31 +* which are stored in the work array WORK31 +* + CALL DSWAP( JJ-J, AB( KV+1+JJ-J, J ), LDAB-1, + $ WORK31( JP+JJ-J-KL, 1 ), LDWORK ) + CALL DSWAP( J+JB-JJ, AB( KV+1, JJ ), LDAB-1, + $ AB( KV+JP, JJ ), LDAB-1 ) + END IF + END IF +* +* Compute multipliers +* + CALL DSCAL( KM, ONE / AB( KV+1, JJ ), AB( KV+2, JJ ), + $ 1 ) +* +* Update trailing submatrix within the band and within +* the current block. JM is the index of the last column +* which needs to be updated. +* + JM = MIN( JU, J+JB-1 ) + IF( JM.GT.JJ ) + $ CALL DGER( KM, JM-JJ, -ONE, AB( KV+2, JJ ), 1, + $ AB( KV, JJ+1 ), LDAB-1, + $ AB( KV+1, JJ+1 ), LDAB-1 ) + ELSE +* +* If pivot is zero, set INFO to the index of the pivot +* unless a zero pivot has already been found. +* + IF( INFO.EQ.0 ) + $ INFO = JJ + END IF +* +* Copy current column of A31 into the work array WORK31 +* + NW = MIN( JJ-J+1, I3 ) + IF( NW.GT.0 ) + $ CALL DCOPY( NW, AB( KV+KL+1-JJ+J, JJ ), 1, + $ WORK31( 1, JJ-J+1 ), 1 ) + 80 CONTINUE + IF( J+JB.LE.N ) THEN +* +* Apply the row interchanges to the other blocks. +* + J2 = MIN( JU-J+1, KV ) - JB + J3 = MAX( 0, JU-J-KV+1 ) +* +* Use DLASWP to apply the row interchanges to A12, A22, and +* A32. +* + CALL DLASWP( J2, AB( KV+1-JB, J+JB ), LDAB-1, 1, JB, + $ IPIV( J ), 1 ) +* +* Adjust the pivot indices. +* + DO 90 I = J, J + JB - 1 + IPIV( I ) = IPIV( I ) + J - 1 + 90 CONTINUE +* +* Apply the row interchanges to A13, A23, and A33 +* columnwise. +* + K2 = J - 1 + JB + J2 + DO 110 I = 1, J3 + JJ = K2 + I + DO 100 II = J + I - 1, J + JB - 1 + IP = IPIV( II ) + IF( IP.NE.II ) THEN + TEMP = AB( KV+1+II-JJ, JJ ) + AB( KV+1+II-JJ, JJ ) = AB( KV+1+IP-JJ, JJ ) + AB( KV+1+IP-JJ, JJ ) = TEMP + END IF + 100 CONTINUE + 110 CONTINUE +* +* Update the relevant part of the trailing submatrix +* + IF( J2.GT.0 ) THEN +* +* Update A12 +* + CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', + $ JB, J2, ONE, AB( KV+1, J ), LDAB-1, + $ AB( KV+1-JB, J+JB ), LDAB-1 ) +* + IF( I2.GT.0 ) THEN +* +* Update A22 +* + CALL DGEMM( 'No transpose', 'No transpose', I2, J2, + $ JB, -ONE, AB( KV+1+JB, J ), LDAB-1, + $ AB( KV+1-JB, J+JB ), LDAB-1, ONE, + $ AB( KV+1, J+JB ), LDAB-1 ) + END IF +* + IF( I3.GT.0 ) THEN +* +* Update A32 +* + CALL DGEMM( 'No transpose', 'No transpose', I3, J2, + $ JB, -ONE, WORK31, LDWORK, + $ AB( KV+1-JB, J+JB ), LDAB-1, ONE, + $ AB( KV+KL+1-JB, J+JB ), LDAB-1 ) + END IF + END IF +* + IF( J3.GT.0 ) THEN +* +* Copy the lower triangle of A13 into the work array +* WORK13 +* + DO 130 JJ = 1, J3 + DO 120 II = JJ, JB + WORK13( II, JJ ) = AB( II-JJ+1, JJ+J+KV-1 ) + 120 CONTINUE + 130 CONTINUE +* +* Update A13 in the work array +* + CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', + $ JB, J3, ONE, AB( KV+1, J ), LDAB-1, + $ WORK13, LDWORK ) +* + IF( I2.GT.0 ) THEN +* +* Update A23 +* + CALL DGEMM( 'No transpose', 'No transpose', I2, J3, + $ JB, -ONE, AB( KV+1+JB, J ), LDAB-1, + $ WORK13, LDWORK, ONE, AB( 1+JB, J+KV ), + $ LDAB-1 ) + END IF +* + IF( I3.GT.0 ) THEN +* +* Update A33 +* + CALL DGEMM( 'No transpose', 'No transpose', I3, J3, + $ JB, -ONE, WORK31, LDWORK, WORK13, + $ LDWORK, ONE, AB( 1+KL, J+KV ), LDAB-1 ) + END IF +* +* Copy the lower triangle of A13 back into place +* + DO 150 JJ = 1, J3 + DO 140 II = JJ, JB + AB( II-JJ+1, JJ+J+KV-1 ) = WORK13( II, JJ ) + 140 CONTINUE + 150 CONTINUE + END IF + ELSE +* +* Adjust the pivot indices. +* + DO 160 I = J, J + JB - 1 + IPIV( I ) = IPIV( I ) + J - 1 + 160 CONTINUE + END IF +* +* Partially undo the interchanges in the current block to +* restore the upper triangular form of A31 and copy the upper +* triangle of A31 back into place +* + DO 170 JJ = J + JB - 1, J, -1 + JP = IPIV( JJ ) - JJ + 1 + IF( JP.NE.1 ) THEN +* +* Apply interchange to columns J to JJ-1 +* + IF( JP+JJ-1.LT.J+KL ) THEN +* +* The interchange does not affect A31 +* + CALL DSWAP( JJ-J, AB( KV+1+JJ-J, J ), LDAB-1, + $ AB( KV+JP+JJ-J, J ), LDAB-1 ) + ELSE +* +* The interchange does affect A31 +* + CALL DSWAP( JJ-J, AB( KV+1+JJ-J, J ), LDAB-1, + $ WORK31( JP+JJ-J-KL, 1 ), LDWORK ) + END IF + END IF +* +* Copy the current column of A31 back into place +* + NW = MIN( I3, JJ-J+1 ) + IF( NW.GT.0 ) + $ CALL DCOPY( NW, WORK31( 1, JJ-J+1 ), 1, + $ AB( KV+KL+1-JJ+J, JJ ), 1 ) + 170 CONTINUE + 180 CONTINUE + END IF +* + RETURN +* +* End of DGBTRF +* + END SUBROUTINE DGBTRF +! --- lapack_peeloff end --- ! + +! ################################################################################################################################## +! 002 LAPACK_LINEAR_EQN_DGB + +! --- lapack_peeloff begin --- ! + SUBROUTINE DGBTRS( TRANS, N, KL, KU, NRHS, AB, LDAB, IPIV, B, LDB, + $ INFO, dtbsv_msg ) ! I added dtbsv_msg + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGBTRS' +* +* -- LAPACK routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* March 31, 1993 +* +* .. Scalar Arguments .. + CHARACTER TRANS + character*1 dtbsv_msg + INTEGER INFO, KL, KU, LDAB, LDB, N, NRHS +* .. +* .. Array Arguments .. + INTEGER IPIV( * ) + REAL(DOUBLE) AB( LDAB, * ), B( LDB, * ) +* .. +* +* Purpose +* ======= +* +* DGBTRS solves a system of linear equations +* A * X = B or A' * X = B +* with a general band matrix A using the LU factorization computed +* by DGBTRF. +* +* Arguments +* ========= +* +* TRANS (input) CHARACTER*1 +* Specifies the form of the system of equations. +* = 'N': A * X = B (No transpose) +* = 'T': A'* X = B (Transpose) +* = 'C': A'* X = B (Conjugate transpose = Transpose) +* +* N (input) INTEGER +* The order of the matrix A. N >= 0. +* +* KL (input) INTEGER +* The number of subdiagonals within the band of A. KL >= 0. +* +* KU (input) INTEGER +* The number of superdiagonals within the band of A. KU >= 0. +* +* NRHS (input) INTEGER +* The number of right hand sides, i.e., the number of columns +* of the matrix B. NRHS >= 0. +* +* AB (input) REAL(DOUBLE) array, dimension (LDAB,N) +* Details of the LU factorization of the band matrix A, as +* computed by DGBTRF. U is stored as an upper triangular band +* matrix with KL+KU superdiagonals in rows 1 to KL+KU+1, and +* the multipliers used during the factorization are stored in +* rows KL+KU+2 to 2*KL+KU+1. +* +* LDAB (input) INTEGER +* The leading dimension of the array AB. LDAB >= 2*KL+KU+1. +* +* IPIV (input) INTEGER array, dimension (N) +* The pivot indices; for 1 <= i <= N, row i of the matrix was +* interchanged with row IPIV(i). +* +* B (input/output) REAL(DOUBLE) array, dimension (LDB,NRHS) +* On entry, the right hand side matrix B. +* On exit, the solution matrix X. +* +* LDB (input) INTEGER +* The leading dimension of the array B. LDB >= max(1,N). +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -i, the i-th argument had an illegal value +* +! dtbsv_msg (input) CHARACTER +! = 'Y', have subr DTBSV print Fwd, Back pass messages + +* ===================================================================== +* +* .. Parameters .. + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D+0 ) +* .. +* .. Local Scalars .. + LOGICAL LNOTI, NOTRAN + INTEGER I, J, KD, L, LM +* .. +* .. External Functions .. + LOGICAL LSAME + EXTERNAL LSAME +* .. +* .. External Subroutines .. +* .. +* .. Intrinsic Functions .. + INTRINSIC MAX, MIN +* .. +* .. Executable Statements .. + +* +* Test the input parameters. +* + INFO = 0 + NOTRAN = LSAME( TRANS, 'N' ) + IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) .AND. .NOT. + $ LSAME( TRANS, 'C' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( KL.LT.0 ) THEN + INFO = -3 + ELSE IF( KU.LT.0 ) THEN + INFO = -4 + ELSE IF( NRHS.LT.0 ) THEN + INFO = -5 + ELSE IF( LDAB.LT.( 2*KL+KU+1 ) ) THEN + INFO = -7 + ELSE IF( LDB.LT.MAX( 1, N ) ) THEN + INFO = -10 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGBTRS', -INFO ) + RETURN + END IF +* +* Quick return if possible +* + IF( N.EQ.0 .OR. NRHS.EQ.0 ) + $ RETURN +* + KD = KU + KL + 1 + LNOTI = KL.GT.0 +* + IF( NOTRAN ) THEN +* +* Solve A*X = B. +* +* Solve L*X = B, overwriting B with X. +* +* L is represented as a product of permutations and unit lower +* triangular matrices L = P(1) * L(1) * ... * P(n-1) * L(n-1), +* where each transformation L(i) is a rank-one modification of +* the identity matrix. +* + IF( LNOTI ) THEN + DO 10 J = 1, N - 1 + LM = MIN( KL, N-J ) + L = IPIV( J ) + IF( L.NE.J ) + $ CALL DSWAP( NRHS, B( L, 1 ), LDB, B( J, 1 ), LDB ) + CALL DGER( LM, NRHS, -ONE, AB( KD+1, J ), 1, B( J, 1 ), + $ LDB, B( J+1, 1 ), LDB ) + 10 CONTINUE + END IF +* + DO 20 I = 1, NRHS +* +* Solve U*X = B, overwriting B with X. +* + CALL DTBSV( 'Upper', 'No transpose', 'Non-unit', N, KL+KU, + $ AB, LDAB, B( 1, I ), 1, dtbsv_msg ) + 20 CONTINUE +* + ELSE +* +* Solve A'*X = B. +* + DO 30 I = 1, NRHS +* +* Solve U'*X = B, overwriting B with X. +* + CALL DTBSV( 'Upper', 'Transpose', 'Non-unit', N, KL+KU, AB, + $ LDAB, B( 1, I ), 1, dtbsv_msg ) + 30 CONTINUE +* +* Solve L'*X = B, overwriting B with X. +* + IF( LNOTI ) THEN + DO 40 J = N - 1, 1, -1 + LM = MIN( KL, N-J ) + CALL DGEMV( 'Transpose', LM, NRHS, -ONE, B( J+1, 1 ), + $ LDB, AB( KD+1, J ), 1, ONE, B( J, 1 ), LDB ) + L = IPIV( J ) + IF( L.NE.J ) + $ CALL DSWAP( NRHS, B( L, 1 ), LDB, B( J, 1 ), LDB ) + 40 CONTINUE + END IF + END IF + + RETURN +* +* End of DGBTRS +* + END SUBROUTINE DGBTRS +! --- lapack_peeloff end --- ! + +! ################################################################################################################################## +! 003 LAPACK_LINEAR_EQN_DGB + +! --- lapack_peeloff begin --- ! + SUBROUTINE DGBTF2( M, N, KL, KU, AB, LDAB, IPIV, INFO ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGBTF2' +* +* -- LAPACK routine (version 3.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 +* +* .. Scalar Arguments .. + INTEGER INFO, KL, KU, LDAB, M, N +* .. +* .. Array Arguments .. + INTEGER IPIV( * ) + REAL(DOUBLE) AB( LDAB, * ) +* .. +* +* Purpose +* ======= +* +* DGBTF2 computes an LU factorization of a real m-by-n band matrix A +* using partial pivoting with row interchanges. +* +* This is the unblocked version of the algorithm, calling Level 2 BLAS. +* +* Arguments +* ========= +* +* M (input) INTEGER +* The number of rows of the matrix A. M >= 0. +* +* N (input) INTEGER +* The number of columns of the matrix A. N >= 0. +* +* KL (input) INTEGER +* The number of subdiagonals within the band of A. KL >= 0. +* +* KU (input) INTEGER +* The number of superdiagonals within the band of A. KU >= 0. +* +* AB (input/output) REAL(DOUBLE) array, dimension (LDAB,N) +* On entry, the matrix A in band storage, in rows KL+1 to +* 2*KL+KU+1; rows 1 to KL of the array need not be set. +* The j-th column of A is stored in the j-th column of the +* array AB as follows: +* AB(kl+ku+1+i-j,j) = A(i,j) for max(1,j-ku)<=i<=min(m,j+kl) +* +* On exit, details of the factorization: U is stored as an +* upper triangular band matrix with KL+KU superdiagonals in +* rows 1 to KL+KU+1, and the multipliers used during the +* factorization are stored in rows KL+KU+2 to 2*KL+KU+1. +* See below for further details. +* +* LDAB (input) INTEGER +* The leading dimension of the array AB. LDAB >= 2*KL+KU+1. +* +* IPIV (output) INTEGER array, dimension (min(M,N)) +* The pivot indices; for 1 <= i <= min(M,N), row i of the +* matrix was interchanged with row IPIV(i). +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -i, the i-th argument had an illegal value +* > 0: if INFO = +i, U(i,i) is exactly zero. The factorization +* has been completed, but the factor U is exactly +* singular, and division by zero will occur if it is used +* to solve a system of equations. +* +* Further Details +* =============== +* +* The band storage scheme is illustrated by the following example, when +* M = N = 6, KL = 2, KU = 1: +* +* On entry: On exit: +* +* * * * + + + * * * u14 u25 u36 +* * * + + + + * * u13 u24 u35 u46 +* * a12 a23 a34 a45 a56 * u12 u23 u34 u45 u56 +* a11 a22 a33 a44 a55 a66 u11 u22 u33 u44 u55 u66 +* a21 a32 a43 a54 a65 * m21 m32 m43 m54 m65 * +* a31 a42 a53 a64 * * m31 m42 m53 m64 * * +* +* Array elements marked * are not used by the routine; elements marked +* + need not be set on entry, but are required by the routine to store +* elements of U, because of fill-in resulting from the row +* interchanges. +* +* ===================================================================== +* +* .. Parameters .. + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +* .. +* .. Local Scalars .. + INTEGER I, J, JP, JU, KM, KV +* .. +* .. External Functions .. +!!1 EXTERNAL IDAMAX +* .. +* .. External Subroutines .. +* .. +* .. Intrinsic Functions .. + INTRINSIC MAX, MIN +* .. +* .. Executable Statements .. + +* +* KV is the number of superdiagonals in the factor U, allowing for +* fill-in. +* + KV = KU + KL +* +* Test the input parameters. +* + INFO = 0 + IF( M.LT.0 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( KL.LT.0 ) THEN + INFO = -3 + ELSE IF( KU.LT.0 ) THEN + INFO = -4 + ELSE IF( LDAB.LT.KL+KV+1 ) THEN + INFO = -6 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGBTF2', -INFO ) + RETURN + END IF +* +* Quick return if possible +* + IF( M.EQ.0 .OR. N.EQ.0 ) + $ RETURN +* +* Gaussian elimination with partial pivoting +* +* Set fill-in elements in columns KU+2 to KV to zero. +* + DO 20 J = KU + 2, MIN( KV, N ) + DO 10 I = KV - J + 2, KL + AB( I, J ) = ZERO + 10 CONTINUE + 20 CONTINUE +* +* JU is the index of the last column affected by the current stage +* of the factorization. +* + JU = 1 +* + DO 40 J = 1, MIN( M, N ) +* +* Set fill-in elements in column J+KV to zero. +* + IF( J+KV.LE.N ) THEN + DO 30 I = 1, KL + AB( I, J+KV ) = ZERO + 30 CONTINUE + END IF +* +* Find pivot and test for singularity. KM is the number of +* subdiagonal elements in the current column. +* + KM = MIN( KL, M-J ) + JP = IDAMAX( KM+1, AB( KV+1, J ), 1 ) + IPIV( J ) = JP + J - 1 + IF( AB( KV+JP, J ).NE.ZERO ) THEN + JU = MAX( JU, MIN( J+KU+JP-1, N ) ) +* +* Apply interchange to columns J to JU. +* + IF( JP.NE.1 ) + $ CALL DSWAP( JU-J+1, AB( KV+JP, J ), LDAB-1, + $ AB( KV+1, J ), LDAB-1 ) +* + IF( KM.GT.0 ) THEN +* +* Compute multipliers. +* + CALL DSCAL( KM, ONE / AB( KV+1, J ), AB( KV+2, J ), 1 ) +* +* Update trailing submatrix within the band. +* + IF( JU.GT.J ) + $ CALL DGER( KM, JU-J, -ONE, AB( KV+2, J ), 1, + $ AB( KV, J+1 ), LDAB-1, AB( KV+1, J+1 ), + $ LDAB-1 ) + END IF + ELSE +* +* If pivot is zero, set INFO to the index of the pivot +* unless a zero pivot has already been found. +* + IF( INFO.EQ.0 ) + $ INFO = J + END IF + 40 CONTINUE + + RETURN +* +* End of DGBTF2 +* + END SUBROUTINE DGBTF2 +! --- lapack_peeloff end --- ! + +! --- lapack_surgery end --- ! + END MODULE LAPACK_LIN_EQN_DGB_KERNEL diff --git a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGE.f b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGE.f index 60fb6e62..91e06de7 100644 --- a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGE.f +++ b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGE.f @@ -1,19 +1,13 @@ ! ################################################################################################################################## MODULE LAPACK_LIN_EQN_DGE +! --- lapack_surgery begin --- ! - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06 - USE SCONTR, ONLY : LINKNO, BLNK_SUB_NAM - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, STIME, TSEC - USE PARAMS, ONLY : EPSIL - USE LAPACK_BLAS_AUX - - USE OURTIM_Interface - USE OUTA_HERE_Interface - - CHARACTER(44*BYTE), PRIVATE :: MODNAM ! Name to write to screen to describe module being run. + USE PENTIUM_II_KIND, ONLY : DOUBLE + USE LAPACK_DGETF2_HELPER + USE LAPACK_DGETRF_HELPER + USE LAPACK_DGETRI_HELPER + USE LAPACK_DGETRS_HELPER ! This is the set of LAPACK routines for solving equations @@ -37,669 +31,59 @@ MODULE LAPACK_LIN_EQN_DGE ! ################################################################################################################################## ! 001 LAPACK_LINEAR_EQN_DGE +! --- lapack_peeloff begin --- ! SUBROUTINE DGETRF( M, N, A, LDA, IPIV, INFO ) -* - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGETRF' - -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. INTEGER INFO, LDA, M, N -* .. -* .. Array Arguments .. INTEGER IPIV( * ) - REAL(DOUBLE) A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DGETRF computes an LU factorization of a general M-by-N matrix A -* using partial pivoting with row interchanges. -* -* The factorization has the form -* A = P * L * U -* where P is a permutation matrix, L is lower triangular with unit -* diagonal elements (lower trapezoidal if m > n), and U is upper -* triangular (upper trapezoidal if m < n). -* -* This is the right-looking Level 3 BLAS version of the algorithm. -* -* Arguments -* ========= -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On entry, the M-by-N matrix to be factored. -* On exit, the factors L and U from the factorization -* A = P*L*U; the unit diagonal elements of L are not stored. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,M). -* -* IPIV (output) INTEGER array, dimension (min(M,N)) -* The pivot indices; for 1 <= i <= min(M,N), row i of the -* matrix was interchanged with row IPIV(i). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, U(i,i) is exactly zero. The factorization -* has been completed, but the factor U is exactly -* singular, and division by zero will occur if it is used -* to solve a system of equations. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER I, IINFO, J, JB, NB -* .. -* .. External Subroutines .. -* - INTEGER ILAENV - EXTERNAL ILAENV -* .. -* .. External Functions .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. + REAL(DOUBLE) A( LDA, * ) + + CALL DGETRF_HELPER( M, N, A, LDA, IPIV, INFO ) -* -* Test the input parameters. -* - INFO = 0 - IF( M.LT.0 ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, M ) ) THEN - INFO = -4 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGETRF', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( M.EQ.0 .OR. N.EQ.0 ) - $ RETURN -* -* Determine the block size for this environment. -* - NB = ILAENV( 1, 'DGETRF', ' ', M, N, -1, -1 ) - IF( NB.LE.1 .OR. NB.GE.MIN( M, N ) ) THEN -* -* Use unblocked code. -* - CALL DGETF2( M, N, A, LDA, IPIV, INFO ) - ELSE -* -* Use blocked code. -* - DO 20 J = 1, MIN( M, N ), NB - JB = MIN( MIN( M, N )-J+1, NB ) -* -* Factor diagonal and subdiagonal blocks and test for exact -* singularity. -* - CALL DGETF2( M-J+1, JB, A( J, J ), LDA, IPIV( J ), IINFO ) -* -* Adjust INFO and the pivot indices. -* - IF( INFO.EQ.0 .AND. IINFO.GT.0 ) - $ INFO = IINFO + J - 1 - DO 10 I = J, MIN( M, J+JB-1 ) - IPIV( I ) = J - 1 + IPIV( I ) - 10 CONTINUE -* -* Apply interchanges to columns 1:J-1. -* - CALL DLASWP( J-1, A, LDA, J, J+JB-1, IPIV, 1 ) -* - IF( J+JB.LE.N ) THEN -* -* Apply interchanges to columns J+JB:N. -* - CALL DLASWP( N-J-JB+1, A( 1, J+JB ), LDA, J, J+JB-1, - $ IPIV, 1 ) -* -* Compute block row of U. -* - CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', JB, - $ N-J-JB+1, ONE, A( J, J ), LDA, A( J, J+JB ), - $ LDA ) - IF( J+JB.LE.M ) THEN -* -* Update trailing submatrix. -* - CALL DGEMM( 'No transpose', 'No transpose', M-J-JB+1, - $ N-J-JB+1, JB, -ONE, A( J+JB, J ), LDA, - $ A( J, J+JB ), LDA, ONE, A( J+JB, J+JB ), - $ LDA ) - END IF - END IF - 20 CONTINUE - END IF - RETURN -* -* End of DGETRF -* END SUBROUTINE DGETRF +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 002 LAPACK_LINEAR_EQN_DGE +! --- lapack_peeloff begin --- ! SUBROUTINE DGETRI( N, A, LDA, IPIV, WORK, LWORK, INFO ) -* - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGETRF' - -* -- LAPACK routine (version 3.1) -- -* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. -* November 2006 -* -* .. Scalar Arguments .. INTEGER INFO, LDA, LWORK, N -* .. -* .. Array Arguments .. INTEGER IPIV( * ) DOUBLE PRECISION A( LDA, * ), WORK( * ) -* .. -* -* Purpose -* ======= -* -* DGETRI computes the inverse of a matrix using the LU factorization -* computed by DGETRF. -* -* This method inverts U and then computes inv(A) by solving the system -* inv(A)*L = inv(U) for inv(A). -* -* Arguments -* ========= -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) -* On entry, the factors L and U from the factorization -* A = P*L*U as computed by DGETRF. -* On exit, if INFO = 0, the inverse of the original matrix A. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* IPIV (input) INTEGER array, dimension (N) -* The pivot indices from DGETRF; for 1<=i<=N, row i of the -* matrix was interchanged with row IPIV(i). -* -* WORK (workspace/output) DOUBLE PRECISION array, dimension (MAX(1,LWORK)) -* On exit, if INFO=0, then WORK(1) returns the optimal LWORK. -* -* LWORK (input) INTEGER -* The dimension of the array WORK. LWORK >= max(1,N). -* For optimal performance LWORK >= N*NB, where NB is -* the optimal blocksize returned by ILAENV. -* -* If LWORK = -1, then a workspace query is assumed; the routine -* only calculates the optimal size of the WORK array, returns -* this value as the first entry of the WORK array, and no error -* message related to LWORK is issued by XERBLA. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, U(i,i) is exactly zero; the matrix is -* singular and its inverse could not be computed. -* -* ===================================================================== -* -* .. Parameters .. - DOUBLE PRECISION ZERO, ONE - PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL LQUERY - INTEGER I, IWS, J, JB, JJ, JP, LDWORK, LWKOPT, NB, - $ NBMIN, NN -* .. -* .. External Functions .. - INTEGER ILAENV - EXTERNAL ILAENV -* .. -* .. External Subroutines .. - EXTERNAL DGEMM, DGEMV, DSWAP, DTRSM, DTRTRI, XERBLA -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - NB = ILAENV( 1, 'DGETRI', ' ', N, -1, -1, -1 ) - LWKOPT = N*NB - WORK( 1 ) = LWKOPT - LQUERY = ( LWORK.EQ.-1 ) - IF( N.LT.0 ) THEN - INFO = -1 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -3 - ELSE IF( LWORK.LT.MAX( 1, N ) .AND. .NOT.LQUERY ) THEN - INFO = -6 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGETRI', -INFO ) - RETURN - ELSE IF( LQUERY ) THEN - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* -* Form inv(U). If INFO > 0 from DTRTRI, then U is singular, -* and the inverse is not computed. -* - CALL DTRTRI( 'Upper', 'Non-unit', N, A, LDA, INFO ) - IF( INFO.GT.0 ) - $ RETURN -* - NBMIN = 2 - LDWORK = N - IF( NB.GT.1 .AND. NB.LT.N ) THEN - IWS = MAX( LDWORK*NB, 1 ) - IF( LWORK.LT.IWS ) THEN - NB = LWORK / LDWORK - NBMIN = MAX( 2, ILAENV( 2, 'DGETRI', ' ', N, -1, -1, -1 ) ) - END IF - ELSE - IWS = N - END IF -* -* Solve the equation inv(A)*L = inv(U) for inv(A). -* - IF( NB.LT.NBMIN .OR. NB.GE.N ) THEN -* -* Use unblocked code. -* - DO 20 J = N, 1, -1 -* -* Copy current column of L to WORK and replace with zeros. -* - DO 10 I = J + 1, N - WORK( I ) = A( I, J ) - A( I, J ) = ZERO - 10 CONTINUE -* -* Compute current column of inv(A). -* - IF( J.LT.N ) - $ CALL DGEMV( 'No transpose', N, N-J, -ONE, A( 1, J+1 ), - $ LDA, WORK( J+1 ), 1, ONE, A( 1, J ), 1 ) - 20 CONTINUE - ELSE -* -* Use blocked code. -* - NN = ( ( N-1 ) / NB )*NB + 1 - DO 50 J = NN, 1, -NB - JB = MIN( NB, N-J+1 ) -* -* Copy current block column of L to WORK and replace with -* zeros. -* - DO 40 JJ = J, J + JB - 1 - DO 30 I = JJ + 1, N - WORK( I+( JJ-J )*LDWORK ) = A( I, JJ ) - A( I, JJ ) = ZERO - 30 CONTINUE - 40 CONTINUE -* -* Compute current block column of inv(A). -* - IF( J+JB.LE.N ) - $ CALL DGEMM( 'No transpose', 'No transpose', N, JB, - $ N-J-JB+1, -ONE, A( 1, J+JB ), LDA, - $ WORK( J+JB ), LDWORK, ONE, A( 1, J ), LDA ) - CALL DTRSM( 'Right', 'Lower', 'No transpose', 'Unit', N, JB, - $ ONE, WORK( J ), LDWORK, A( 1, J ), LDA ) - 50 CONTINUE - END IF -* -* Apply column interchanges. -* - DO 60 J = N - 1, 1, -1 - JP = IPIV( J ) - IF( JP.NE.J ) - $ CALL DSWAP( N, A( 1, J ), 1, A( 1, JP ), 1 ) - 60 CONTINUE -* - WORK( 1 ) = IWS - RETURN -* -* End of DGETRF -* + CALL DGETRI_HELPER( N, A, LDA, IPIV, WORK, LWORK, INFO ) + END SUBROUTINE DGETRI +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 003 LAPACK_LINEAR_EQN_DGE +! --- lapack_peeloff begin --- ! SUBROUTINE DGETRS( TRANS, N, NRHS, A, LDA, IPIV, B, LDB, INFO ) -* - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGETRS' - -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. CHARACTER TRANS INTEGER INFO, LDA, LDB, N, NRHS -* .. -* .. Array Arguments .. INTEGER IPIV( * ) - REAL(DOUBLE) A( LDA, * ), B( LDB, * ) -* .. -* -* Purpose -* ======= -* -* DGETRS solves a system of linear equations -* A * X = B or A' * X = B -* with a general N-by-N matrix A using the LU factorization computed -* by DGETRF. -* -* Arguments -* ========= -* -* TRANS (input) CHARACTER*1 -* Specifies the form of the system of equations: -* = 'N': A * X = B (No transpose) -* = 'T': A'* X = B (Transpose) -* = 'C': A'* X = B (Conjugate transpose = Transpose) -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* NRHS (input) INTEGER -* The number of right hand sides, i.e., the number of columns -* of the matrix B. NRHS >= 0. -* -* A (input) REAL(DOUBLE) array, dimension (LDA,N) -* The factors L and U from the factorization A = P*L*U -* as computed by DGETRF. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* IPIV (input) INTEGER array, dimension (N) -* The pivot indices from DGETRF; for 1<=i<=N, row i of the -* matrix was interchanged with row IPIV(i). -* -* B (input/output) REAL(DOUBLE) array, dimension (LDB,NRHS) -* On entry, the right hand side matrix B. -* On exit, the solution matrix X. -* -* LDB (input) INTEGER -* The leading dimension of the array B. LDB >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL NOTRAN -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - NOTRAN = LSAME( TRANS, 'N' ) - IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) .AND. .NOT. - $ LSAME( TRANS, 'C' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( NRHS.LT.0 ) THEN - INFO = -3 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -5 - ELSE IF( LDB.LT.MAX( 1, N ) ) THEN - INFO = -8 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGETRS', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 .OR. NRHS.EQ.0 ) - $ RETURN -* - IF( NOTRAN ) THEN -* -* Solve A * X = B. -* -* Apply row interchanges to the right hand sides. -* - CALL DLASWP( NRHS, B, LDB, 1, N, IPIV, 1 ) -* -* Solve L*X = B, overwriting B with X. -* - CALL DTRSM( 'Left', 'Lower', 'No transpose', 'Unit', N, NRHS, - $ ONE, A, LDA, B, LDB ) -* -* Solve U*X = B, overwriting B with X. -* - CALL DTRSM( 'Left', 'Upper', 'No transpose', 'Non-unit', N, - $ NRHS, ONE, A, LDA, B, LDB ) - ELSE -* -* Solve A' * X = B. -* -* Solve U'*X = B, overwriting B with X. -* - CALL DTRSM( 'Left', 'Upper', 'Transpose', 'Non-unit', N, NRHS, - $ ONE, A, LDA, B, LDB ) -* -* Solve L'*X = B, overwriting B with X. -* - CALL DTRSM( 'Left', 'Lower', 'Transpose', 'Unit', N, NRHS, ONE, - $ A, LDA, B, LDB ) -* -* Apply row interchanges to the solution vectors. -* - CALL DLASWP( NRHS, B, LDB, 1, N, IPIV, -1 ) - END IF -* - RETURN -* -* End of DGETRS -* + REAL(DOUBLE) A( LDA, * ), B( LDB, * ) + + CALL DGETRS_HELPER( TRANS, N, NRHS, A, LDA, IPIV, B, LDB, INFO ) + END SUBROUTINE DGETRS +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 004 LAPACK_LINEAR_EQN_DGE +! --- lapack_peeloff begin --- ! SUBROUTINE DGETF2( M, N, A, LDA, IPIV, INFO ) -* - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DGETF2' - -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* June 30, 1992 -* -* .. Scalar Arguments .. INTEGER INFO, LDA, M, N -* .. -* .. Array Arguments .. INTEGER IPIV( * ) - REAL(DOUBLE) A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DGETF2 computes an LU factorization of a general m-by-n matrix A -* using partial pivoting with row interchanges. -* -* The factorization has the form -* A = P * L * U -* where P is a permutation matrix, L is lower triangular with unit -* diagonal elements (lower trapezoidal if m > n), and U is upper -* triangular (upper trapezoidal if m < n). -* -* This is the right-looking Level 2 BLAS version of the algorithm. -* -* Arguments -* ========= -* -* M (input) INTEGER -* The number of rows of the matrix A. M >= 0. -* -* N (input) INTEGER -* The number of columns of the matrix A. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On entry, the m by n matrix to be factored. -* On exit, the factors L and U from the factorization -* A = P*L*U; the unit diagonal elements of L are not stored. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,M). -* -* IPIV (output) INTEGER array, dimension (min(M,N)) -* The pivot indices; for 1 <= i <= min(M,N), row i of the -* matrix was interchanged with row IPIV(i). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -k, the k-th argument had an illegal value -* > 0: if INFO = k, U(k,k) is exactly zero. The factorization -* has been completed, but the factor U is exactly -* singular, and division by zero will occur if it is used -* to solve a system of equations. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - INTEGER J, JP -* .. -* .. External Functions .. -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - IF( M.LT.0 ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, M ) ) THEN - INFO = -4 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DGETF2', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( M.EQ.0 .OR. N.EQ.0 ) - $ RETURN -* - DO 10 J = 1, MIN( M, N ) -* -* Find pivot and test for singularity. -* - JP = J - 1 + IDAMAX( M-J+1, A( J, J ), 1 ) - IPIV( J ) = JP -!xxx IF( A( JP, J ).NE.ZERO ) THEN - IF( DABS(A( JP, J )) > EPSIL(2)) THEN -* -* Apply the interchange to columns 1:N. -* - IF( JP.NE.J ) - $ CALL DSWAP( N, A( J, 1 ), LDA, A( JP, 1 ), LDA ) -* -* Compute elements J+1:M of J-th column. -* - IF( J.LT.M ) - $ CALL DSCAL( M-J, ONE / A( J, J ), A( J+1, J ), 1 ) -* - ELSE IF( INFO.EQ.0 ) THEN -* - INFO = J - END IF -* - IF( J.LT.MIN( M, N ) ) THEN -* -* Update trailing submatrix. -* - CALL DGER( M-J, N-J, -ONE, A( J+1, J ), 1, A( J, J+1 ), LDA, - $ A( J+1, J+1 ), LDA ) - END IF - 10 CONTINUE - RETURN -* -* End of DGETF2 -* + REAL(DOUBLE) A( LDA, * ) + + CALL DGETF2_HELPER( M, N, A, LDA, IPIV, INFO ) + END SUBROUTINE DGETF2 +! --- lapack_peeloff end --- ! +! --- lapack_surgery end --- ! END MODULE LAPACK_LIN_EQN_DGE diff --git a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGE_ext.f90 b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGE_ext.f90 new file mode 100644 index 00000000..662a71fb --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DGE_ext.f90 @@ -0,0 +1,37 @@ +module LAPACK_LIN_EQN_DGE + + use PENTIUM_II_KIND, only : BYTE, LONG, DOUBLE + + implicit none + + interface + subroutine DGETRF(M, N, A, LDA, IPIV, INFO) + use PENTIUM_II_KIND, only : LONG, DOUBLE + implicit none + integer(LONG), intent(in) :: M, N, LDA + integer(LONG), intent(out) :: IPIV(*), INFO + real(DOUBLE), intent(inout) :: A(LDA,*) + end subroutine DGETRF + + subroutine DGETRS(TRANS, N, NRHS, A, LDA, IPIV, B, LDB, INFO) + use PENTIUM_II_KIND, only : LONG, DOUBLE + implicit none + character(1), intent(in) :: TRANS + integer(LONG), intent(in) :: N, NRHS, LDA, LDB + integer(LONG), intent(in) :: IPIV(*) + integer(LONG), intent(out) :: INFO + real(DOUBLE), intent(in) :: A(LDA,*) + real(DOUBLE), intent(inout) :: B(LDB,*) + end subroutine DGETRS + + subroutine DGETRI(N, A, LDA, IPIV, WORK, LWORK, INFO) + use PENTIUM_II_KIND, only : LONG, DOUBLE + implicit none + integer(LONG), intent(in) :: N, LDA, LWORK + integer(LONG), intent(in) :: IPIV(*) + integer(LONG), intent(out) :: INFO + real(DOUBLE), intent(inout) :: A(LDA,*), WORK(*) + end subroutine DGETRI + end interface + +end module LAPACK_LIN_EQN_DGE diff --git a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DPB.f b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DPB.f index a29e198f..fdf7c722 100644 --- a/Source/Modules/LAPACK/LAPACK_LIN_EQN_DPB.f +++ b/Source/Modules/LAPACK/LAPACK_LIN_EQN_DPB.f @@ -1,6 +1,7 @@ ! ################################################################################################################################## MODULE LAPACK_LIN_EQN_DPB +! --- lapack_surgery begin --- ! USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE USE IOUNT1, ONLY : ERR, F06 @@ -8,6 +9,13 @@ MODULE LAPACK_LIN_EQN_DPB USE TIMDAT, ONLY : HOUR, MINUTE, SEC, & SFRAC, TSEC USE LAPACK_BLAS_AUX + USE LAPACK_DPBCON_HELPER, ONLY : DPBCON_HELPER => DPBCON + USE LAPACK_DPBEQU_HELPER, ONLY : DPBEQU_HELPER => DPBEQU + USE LAPACK_DPBTF2_HELPER, ONLY : DPBTF2_HELPER => DPBTF2 + USE LAPACK_DPBTRF_KERNEL, ONLY : DPBTRF_KERNEL => DPBTRF + USE LAPACK_DPBTRS_HELPER, ONLY : DPBTRS_HELPER => DPBTRS + USE LAPACK_DSYTF2_HELPER, ONLY : DSYTF2_HELPER => DSYTF2 + USE LAPACK_POTF2_HELPER, ONLY : DPOTF2_HELPER => DPOTF2 USE PARAMS, ONLY : NOCOUNTS character(1*byte), parameter :: cr13_dpb = char(13) @@ -42,1367 +50,126 @@ MODULE LAPACK_LIN_EQN_DPB ! ################################################################################################################################## ! 001 LAPACK_LINEAR_EQN_DPB +! --- lapack_peeloff begin --- ! SUBROUTINE DPBEQU( UPLO, N, KD, AB, LDAB, S, SCOND, AMAX, INFO ) USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPBEQU' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, KD, LDAB, N REAL(DOUBLE) AMAX, SCOND -* .. -* .. Array Arguments .. REAL(DOUBLE) AB( LDAB, * ), S( * ) -* .. -* -* Purpose -* ======= -* -* DPBEQU computes row and column scalings intended to equilibrate a -* symmetric positive definite band matrix A and reduce its condition -* number (with respect to the two-norm). S contains the scale factors, -* S(i) = 1/sqrt(A(i,i)), chosen so that the scaled matrix B with -* elements B(i,j) = S(i)*A(i,j)*S(j) has ones on the diagonal. This -* choice of S puts the condition number of B within a factor N of the -* smallest possible condition number over all possible diagonal -* scalings. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangular of A is stored; -* = 'L': Lower triangular of A is stored. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* KD (input) INTEGER -* The number of superdiagonals of the matrix A if UPLO = 'U', -* or the number of subdiagonals if UPLO = 'L'. KD >= 0. -* -* AB (input) REAL(DOUBLE) array, dimension (LDAB,N) -* The upper or lower triangle of the symmetric band matrix A, -* stored in the first KD+1 rows of the array. The j-th column -* of A is stored in the j-th column of the array AB as follows: -* if UPLO = 'U', AB(kd+1+i-j,j) = A(i,j) for max(1,j-kd)<=i<=j; -* if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+kd). -* -* LDAB (input) INTEGER -* The leading dimension of the array A. LDAB >= KD+1. -* -* S (output) REAL(DOUBLE) array, dimension (N) -* If INFO = 0, S contains the scale factors for A. -* -* SCOND (output) REAL(DOUBLE) -* If INFO = 0, S contains the ratio of the smallest S(i) to -* the largest S(i). If SCOND >= 0.1 and AMAX is neither too -* large nor too small, it is not worth scaling by S. -* -* AMAX (output) REAL(DOUBLE) -* Absolute value of largest matrix element. If AMAX is very -* close to overflow or very close to underflow, the matrix -* should be scaled. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value. -* > 0: if INFO = i, the i-th diagonal element is nonpositive. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE - PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER I, J - REAL(DOUBLE) SMIN -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN, SQRT -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N < 0 ) THEN - INFO = -2 - ELSE IF( KD < 0 ) THEN - INFO = -3 - ELSE IF( LDAB < KD+1 ) THEN - INFO = -5 - END IF - IF( INFO /= 0 ) THEN - CALL XERBLA( 'DPBEQU', -INFO ) - go to 9000 ! My line - END IF -* -* Quick return if possible -* - IF( N == 0 ) THEN - SCOND = ONE - AMAX = ZERO - go to 9000 ! My line - END IF -* - IF( UPPER ) THEN - J = KD + 1 - ELSE - J = 1 - END IF -* -* Initialize SMIN and AMAX. -* - S( 1 ) = AB( J, 1 ) - SMIN = S( 1 ) - AMAX = S( 1 ) -* -* Find the minimum and maximum diagonal elements. -* - DO 10 I = 2, N - S( I ) = AB( J, I ) - SMIN = MIN( SMIN, S( I ) ) - AMAX = MAX( AMAX, S( I ) ) - 10 CONTINUE -* - IF( SMIN <= ZERO ) THEN -* -* Find the first non-positive diagonal element and return. -* - DO 20 I = 1, N - IF( S( I ) <= ZERO ) THEN - INFO = I - go to 9000 ! My line - END IF - 20 CONTINUE - ELSE -* -* Set the scale factors to the reciprocals -* of the diagonal elements. -* - DO 30 I = 1, N - S( I ) = ONE / SQRT( S( I ) ) - 30 CONTINUE -* -* Compute SCOND = min(S(I)) / max(S(I)) -* - SCOND = SQRT( SMIN ) / SQRT( AMAX ) - END IF - go to 9000 ! My line -* -* End of DPBEQU -* -! ********************************************************************************************************************************** - 9000 continue - + CALL DPBEQU_HELPER( UPLO, N, KD, AB, LDAB, S, SCOND, AMAX, INFO ) RETURN - -! ********************************************************************************************************************************** - END SUBROUTINE DPBEQU +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 002 LAPACK_LINEAR_EQN_DPB +! --- lapack_peeloff begin --- ! SUBROUTINE DPBTRF( UPLO, N, KD, AB, LDAB, INFO ) - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPBTRF' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, KD, LDAB, N -* .. -* .. Array Arguments .. REAL(DOUBLE) AB( LDAB, * ) -* .. -* -* Purpose -* ======= -* -* DPBTRF computes the Cholesky factorization of a real symmetric -* positive definite band matrix A. -* -* The factorization has the form -* A = U**T * U, if UPLO = 'U', or -* A = L * L**T, if UPLO = 'L', -* where U is an upper triangular matrix and L is lower triangular. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangle of A is stored; -* = 'L': Lower triangle of A is stored. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* KD (input) INTEGER -* The number of superdiagonals of the matrix A if UPLO = 'U', -* or the number of subdiagonals if UPLO = 'L'. KD >= 0. -* -* AB (input/output) REAL(DOUBLE) array, dimension (LDAB,N) -* On entry, the upper or lower triangle of the symmetric band -* matrix A, stored in the first KD+1 rows of the array. The -* j-th column of A is stored in the j-th column of the array AB -* as follows: -* if UPLO = 'U', AB(kd+1+i-j,j) = A(i,j) for max(1,j-kd)<=i<=j; -* if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+kd). -* -* On exit, if INFO = 0, the triangular factor U or L from the -* Cholesky factorization A = U**T*U or A = L*L**T of the band -* matrix A, in the same storage format as A. -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= KD+1. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, the leading minor of order i is not -* positive definite, and the factorization could not be -* completed. -* -* iblock (local) INTEGER -* Block number that the calling program (DPBTRF) is working on -* -* numblk (local) INTEGER -* Total number of diagonal blocks that the calling program -* (DPBTRF) has to work on -* -* Further Details -* =============== -* -* The band storage scheme is illustrated by the following example, when -* N = 6, KD = 2, and UPLO = 'U': -* -* On entry: On exit: -* -* * * a13 a24 a35 a46 * * u13 u24 u35 u46 -* * a12 a23 a34 a45 a56 * u12 u23 u34 u45 u56 -* a11 a22 a33 a44 a55 a66 u11 u22 u33 u44 u55 u66 -* -* Similarly, if UPLO = 'L' the format of A is as follows: -* -* On entry: On exit: -* -* a11 a22 a33 a44 a55 a66 l11 l22 l33 l44 l55 l66 -* a21 a32 a43 a54 a65 * l21 l32 l43 l54 l65 * -* a31 a42 a53 a64 * * l31 l42 l53 l64 * * -* -* Array elements marked * are not used by the routine. -* -* Contributed by -* Peter Mayes and Giuseppe Radicati, IBM ECSEC, Rome, March 23, 1989 -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) - INTEGER NBMAX, LDWORK - PARAMETER ( NBMAX = 32, LDWORK = NBMAX+1 ) -* .. -* .. Local Scalars .. - INTEGER I, I2, I3, IB, II, J, JJ, NB, iblock - & , numblk -* .. -* .. Local Arrays .. - REAL(DOUBLE) WORK( LDWORK, NBMAX ) -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - - INTEGER ILAENV - EXTERNAL ILAENV -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MIN -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - IF( ( .NOT.LSAME( UPLO, 'U' ) ) .AND. - $ ( .NOT.LSAME( UPLO, 'L' ) ) ) THEN - INFO = -1 - ELSE IF( N < 0 ) THEN - INFO = -2 - ELSE IF( KD < 0 ) THEN - INFO = -3 - ELSE IF( LDAB < KD+1 ) THEN - INFO = -5 - END IF - IF( INFO /= 0 ) THEN - CALL XERBLA( 'DPBTRF', -INFO ) - go to 9000 ! My line - END IF -* -* Quick return if possible -* - IF( N == 0 ) - & go to 9000 ! My line -* -* Determine the block size for this environment -* - NB = ILAENV( 1, 'DPBTRF', UPLO, N, KD, -1, -1 ) -* -* The block size must not exceed the semi-bandwidth KD, and must not -* exceed the limit set by the size of the local array WORK. -* - NB = MIN( NB, NBMAX ) -* - IF( NB <= 1 .OR. NB > KD ) THEN -* -* Use unblocked code -* - CALL DPBTF2( UPLO, N, KD, AB, LDAB, INFO ) - ELSE -* -* Use blocked code -* - IF( LSAME( UPLO, 'U' ) ) THEN -* -* Compute the Cholesky factorization of a symmetric band -* matrix, given the upper triangle of the matrix in band -* storage. -* -* Zero the upper triangle of the work array. -* - DO 20 J = 1, NB - DO 10 I = 1, J - 1 - WORK( I, J ) = ZERO - 10 CONTINUE - 20 CONTINUE -* -* Process the band matrix one diagonal block at a time. -* - numblk = int(n/nb) + 1 - iblock = 0 - WRITE (*,*) - DO 70 I = 1, N, NB - iblock = iblock + 1 - IB = MIN( NB, N-I+1 ) -* -* Factorize the diagonal block -* - write(sc1,12345,advance='no') iblock,numblk,ib,cr13_dpb - CALL DPOTF2( UPLO, IB, AB( KD+1, I ), LDAB-1, II ) - IF( II /= 0 ) THEN - INFO = I + II - 1 - GO TO 150 - END IF - IF( I+IB <= N ) THEN -* -* Update the relevant part of the trailing submatrix. -* If A11 denotes the diagonal block which has just been -* factorized, then we need to update the remaining -* blocks in the diagram: -* -* A11 A12 A13 -* A22 A23 -* A33 -* -* The numbers of rows and columns in the partitioning -* are IB, I2, I3 respectively. The blocks A12, A22 and -* A23 are empty if IB = KD. The upper triangle of A13 -* lies outside the band. -* - I2 = MIN( KD-IB, N-I-IB+1 ) - I3 = MIN( IB, N-I-KD+1 ) -* - IF( I2 > 0 ) THEN -* -* Update A12 -* - CALL DTRSM( 'Left', 'Upper', 'Transpose', - $ 'Non-unit', IB, I2, ONE, AB( KD+1, I ), - $ LDAB-1, AB( KD+1-IB, I+IB ), LDAB-1 ) -* -* Update A22 -* - CALL DSYRK( 'Upper', 'Transpose', I2, IB, -ONE, - $ AB( KD+1-IB, I+IB ), LDAB-1, ONE, - $ AB( KD+1, I+IB ), LDAB-1 ) - END IF -* - IF( I3 > 0 ) THEN -* -* Copy the lower triangle of A13 into the work array. -* - DO 40 JJ = 1, I3 - DO 30 II = JJ, IB - WORK( II, JJ ) = AB( II-JJ+1, JJ+I+KD-1 ) - 30 CONTINUE - 40 CONTINUE -* -* Update A13 (in the work array). -* - CALL DTRSM( 'Left', 'Upper', 'Transpose', - $ 'Non-unit', IB, I3, ONE, AB( KD+1, I ), - $ LDAB-1, WORK, LDWORK ) -* -* Update A23 -* - IF( I2 > 0 ) - $ CALL DGEMM( 'Transpose', 'No Transpose', I2, I3, - $ IB, -ONE, AB( KD+1-IB, I+IB ), - $ LDAB-1, WORK, LDWORK, ONE, - $ AB( 1+IB, I+KD ), LDAB-1 ) -* -* Update A33 -* - CALL DSYRK( 'Upper', 'Transpose', I3, IB, -ONE, - $ WORK, LDWORK, ONE, AB( KD+1, I+KD ), - $ LDAB-1 ) -* -* Copy the lower triangle of A13 back into place. -* - DO 60 JJ = 1, I3 - DO 50 II = JJ, IB - AB( II-JJ+1, JJ+I+KD-1 ) = WORK( II, JJ ) - 50 CONTINUE - 60 CONTINUE - END IF - END IF - 70 CONTINUE - ELSE -* -* Compute the Cholesky factorization of a symmetric band -* matrix, given the lower triangle of the matrix in band -* storage. -* -* Zero the lower triangle of the work array. -* - DO 90 J = 1, NB - DO 80 I = J + 1, NB - WORK( I, J ) = ZERO - 80 CONTINUE - 90 CONTINUE -* -* Process the band matrix one diagonal block at a time. -* - numblk = int(n/nb) + 1 - iblock = 0 - WRITE (*,*) - DO 140 I = 1, N, NB - iblock = iblock + 1 - IB = MIN( NB, N-I+1 ) -* -* Factorize the diagonal block -* - write(sc1,12345,advance='no') iblock,numblk,ib,cr13_dpb - CALL DPOTF2( UPLO, IB, AB( 1, I ), LDAB-1, II ) - IF( II /= 0 ) THEN - INFO = I + II - 1 - GO TO 150 - END IF - IF( I+IB <= N ) THEN -* -* Update the relevant part of the trailing submatrix. -* If A11 denotes the diagonal block which has just been -* factorized, then we need to update the remaining -* blocks in the diagram: -* -* A11 -* A21 A22 -* A31 A32 A33 -* -* The numbers of rows and columns in the partitioning -* are IB, I2, I3 respectively. The blocks A21, A22 and -* A32 are empty if IB = KD. The lower triangle of A31 -* lies outside the band. -* - I2 = MIN( KD-IB, N-I-IB+1 ) - I3 = MIN( IB, N-I-KD+1 ) -* - IF( I2 > 0 ) THEN -* -* Update A21 -* - CALL DTRSM( 'Right', 'Lower', 'Transpose', - $ 'Non-unit', I2, IB, ONE, AB( 1, I ), - $ LDAB-1, AB( 1+IB, I ), LDAB-1 ) -* -* Update A22 -* - CALL DSYRK( 'Lower', 'No Transpose', I2, IB, -ONE, - $ AB( 1+IB, I ), LDAB-1, ONE, - $ AB( 1, I+IB ), LDAB-1 ) - END IF -* - IF( I3 > 0 ) THEN -* -* Copy the upper triangle of A31 into the work array. -* - DO 110 JJ = 1, IB - DO 100 II = 1, MIN( JJ, I3 ) - WORK( II, JJ ) = AB( KD+1-JJ+II, JJ+I-1 ) - 100 CONTINUE - 110 CONTINUE -* -* Update A31 (in the work array). -* - CALL DTRSM( 'Right', 'Lower', 'Transpose', - $ 'Non-unit', I3, IB, ONE, AB( 1, I ), - $ LDAB-1, WORK, LDWORK ) -* -* Update A32 -* - IF( I2 > 0 ) - $ CALL DGEMM( 'No transpose', 'Transpose', I3, I2, - $ IB, -ONE, WORK, LDWORK, - $ AB( 1+IB, I ), LDAB-1, ONE, - $ AB( 1+KD-IB, I+IB ), LDAB-1 ) -* -* Update A33 -* - CALL DSYRK( 'Lower', 'No Transpose', I3, IB, -ONE, - $ WORK, LDWORK, ONE, AB( 1, I+KD ), - $ LDAB-1 ) -* -* Copy the upper triangle of A31 back into place. -* - DO 130 JJ = 1, IB - DO 120 II = 1, MIN( JJ, I3 ) - AB( KD+1-JJ+II, JJ+I-1 ) = WORK( II, JJ ) - 120 CONTINUE - 130 CONTINUE - END IF - END IF - 140 CONTINUE - END IF - END IF - go to 9000 ! My line -* - 150 CONTINUE - go to 9000 ! My line -* -12345 format(5X,'Block ',i8,' of ',i8,'. Factoring rows 1 thru: ',i8,a) - -* End of DPBTRF -* -! ********************************************************************************************************************************** - 9000 continue ! My lines - - RETURN - -! ********************************************************************************************************************************** - - END SUBROUTINE DPBTRF -! ################################################################################################################################## -! 003 LAPACK_LINEAR_EQN_DPB - - SUBROUTINE DPBTF2( UPLO, N, KD, AB, LDAB, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPBTF2' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. - CHARACTER UPLO - INTEGER INFO, KD, LDAB, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) AB( LDAB, * ) -* .. -* -* Purpose -* ======= -* -* DPBTF2 computes the Cholesky factorization of a real symmetric -* positive definite band matrix A. -* -* The factorization has the form -* A = U' * U , if UPLO = 'U', or -* A = L * L', if UPLO = 'L', -* where U is an upper triangular matrix, U' is the transpose of U, and -* L is lower triangular. -* -* This is the unblocked version of the algorithm, calling Level 2 BLAS. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* Specifies whether the upper or lower triangular part of the -* symmetric matrix A is stored: -* = 'U': Upper triangular -* = 'L': Lower triangular -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* KD (input) INTEGER -* The number of super-diagonals of the matrix A if UPLO = 'U', -* or the number of sub-diagonals if UPLO = 'L'. KD >= 0. -* -* AB (input/output) REAL(DOUBLE) array, dimension (LDAB,N) -* On entry, the upper or lower triangle of the symmetric band -* matrix A, stored in the first KD+1 rows of the array. The -* j-th column of A is stored in the j-th column of the array AB -* as follows: -* if UPLO = 'U', AB(kd+1+i-j,j) = A(i,j) for max(1,j-kd)<=i<=j; -* if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+kd). -* -* On exit, if INFO = 0, the triangular factor U or L from the -* Cholesky factorization A = U'*U or A = L*L' of the band -* matrix A, in the same storage format as A. -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= KD+1. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -k, the k-th argument had an illegal value -* > 0: if INFO = k, the leading minor of order k is not -* positive definite, and the factorization could not be -* completed. -* -* Further Details -* =============== -* -* The band storage scheme is illustrated by the following example, when -* N = 6, KD = 2, and UPLO = 'U': -* -* On entry: On exit: -* -* * * a13 a24 a35 a46 * * u13 u24 u35 u46 -* * a12 a23 a34 a45 a56 * u12 u23 u34 u45 u56 -* a11 a22 a33 a44 a55 a66 u11 u22 u33 u44 u55 u66 -* -* Similarly, if UPLO = 'L' the format of A is as follows: -* -* On entry: On exit: -* -* a11 a22 a33 a44 a55 a66 l11 l22 l33 l44 l55 l66 -* a21 a32 a43 a54 a65 * l21 l32 l43 l54 l65 * -* a31 a42 a53 a64 * * l31 l42 l53 l64 * * -* -* Array elements marked * are not used by the routine. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER i,J, KLD, KN - REAL(DOUBLE) AJJ -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN, SQRT -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N < 0 ) THEN - INFO = -2 - ELSE IF( KD < 0 ) THEN - INFO = -3 - ELSE IF( LDAB < KD+1 ) THEN - INFO = -5 - END IF - IF( INFO /= 0 ) THEN - CALL XERBLA( 'DPBTF2', -INFO ) - go to 9000 ! My line - END IF -* -* Quick return if possible -* - IF( N == 0 ) - & go to 9000 ! My line -* - KLD = MAX( 1, LDAB-1 ) -* - IF( UPPER ) THEN -* -* Compute the Cholesky factorization A = U'*U. -* - WRITE(SC1,*) - DO 10 J = 1, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j,n,cr13_dpb - ENDIF -* -* Compute U(J,J) and test for non-positive-definiteness. -* - AJJ = AB( KD+1, J ) - IF( AJJ <= ZERO ) THEN - GO TO 30 - ENDIF - AJJ = SQRT( AJJ ) - AB( KD+1, J ) = AJJ -* -* Compute elements J+1:J+KN of row J and update the -* trailing submatrix within the band. -* - KN = MIN( KD, N-J ) - IF( KN > 0 ) THEN - CALL DSCAL( KN, ONE / AJJ, AB( KD, J+1 ), KLD ) - CALL DSYR( 'Upper', KN, -ONE, AB( KD, J+1 ), KLD, - $ AB( KD+1, J+1 ), KLD ) - END IF - 10 CONTINUE - ELSE -* -* Compute the Cholesky factorization A = L*L'. -* - WRITE(SC1,*) - DO 20 J = 1, N - IF (NOCOUNTS .NE. 'Y') THEN - write(sc1,12345,advance='no') j,n,cr13_dpb - ENDIF -* -* Compute L(J,J) and test for non-positive-definiteness. -* - AJJ = AB( 1, J ) - IF( AJJ <= ZERO ) - $ GO TO 30 - AJJ = SQRT( AJJ ) - AB( 1, J ) = AJJ -* -* Compute elements J+1:J+KN of column J and update the -* trailing submatrix within the band. -* - KN = MIN( KD, N-J ) - IF( KN > 0 ) THEN - CALL DSCAL( KN, ONE / AJJ, AB( 2, J ), 1 ) - CALL DSYR( 'Lower', KN, -ONE, AB( 2, J ), 1, - $ AB( 1, J+1 ), KLD ) - END IF - 20 CONTINUE - END IF - go to 9000 ! My line -* - 30 CONTINUE - INFO = J - go to 9000 ! My line -* -12345 format(5x,'DPBTF2: Unblocked code. Factoring row ',i8,' of ',i8,a) - -* End of DPBTF2 -* -! ********************************************************************************************************************************** - 9000 continue ! My lines - - RETURN - -! ********************************************************************************************************************************** - - END SUBROUTINE DPBTF2 - -! ################################################################################################################################## -! 004 LAPACK_LINEAR_EQN_DPB - - SUBROUTINE DPOTF2( UPLO, N, A, LDA, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPOTF2' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. - CHARACTER UPLO - INTEGER INFO, LDA, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DPOTF2 computes the Cholesky factorization of a real symmetric -* positive definite matrix A. -* -* The factorization has the form -* A = U' * U , if UPLO = 'U', or -* A = L * L', if UPLO = 'L', -* where U is an upper triangular matrix and L is lower triangular. -* -* This is the unblocked version of the algorithm, calling Level 2 BLAS. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* Specifies whether the upper or lower triangular part of the -* symmetric matrix A is stored. -* = 'U': Upper triangular -* = 'L': Lower triangular -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On entry, the symmetric matrix A. If UPLO = 'U', the leading -* n by n upper triangular part of A contains the upper -* triangular part of the matrix A, and the strictly lower -* triangular part of A is not referenced. If UPLO = 'L', the -* leading n by n lower triangular part of A contains the lower -* triangular part of the matrix A, and the strictly upper -* triangular part of A is not referenced. -* -* On exit, if INFO = 0, the factor U or L from the Cholesky -* factorization A = U'*U or A = L*L'. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -k, the k-th argument had an illegal value -* > 0: if INFO = k, the leading minor of order k is not -* positive definite, and the factorization could not be -* completed. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER J - REAL(DOUBLE) AJJ,dda -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, SQRT -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N < 0 ) THEN - INFO = -2 - ELSE IF( LDA < MAX( 1, N ) ) THEN - INFO = -4 - END IF - IF( INFO /= 0 ) THEN - CALL XERBLA( 'DPOTF2', -INFO ) - go to 9000 ! My line - END IF -* -* Quick return if possible -* - IF( N == 0 ) - & go to 9000 ! My line -* - IF( UPPER ) THEN -* -* Compute the Cholesky factorization A = U'*U. -* - DO 10 J = 1, N -* -* Compute U(J,J) and test for non-positive-definiteness. -* - AJJ = A( J, J ) - DDOT( J-1, A( 1, J ), 1, A( 1, J ), 1 ) - IF( AJJ <= ZERO ) THEN - AJJ = ONE - A( J, J ) = AJJ - GO TO 30 - END IF - AJJ = SQRT( AJJ ) - A( J, J ) = AJJ -* -* Compute elements J+1:N of row J. -* - IF( J < N ) THEN - CALL DGEMV( 'Transpose', J-1, N-J, -ONE, A( 1, J+1 ), - $ LDA, A( 1, J ), 1, ONE, A( J, J+1 ), LDA ) - CALL DSCAL( N-J, ONE / AJJ, A( J, J+1 ), LDA ) - END IF - 10 CONTINUE - ELSE -* -* Compute the Cholesky factorization A = L*L'. -* - DO 20 J = 1, N -* -* Compute L(J,J) and test for non-positive-definiteness. -* - AJJ = A( J, J ) - DDOT( J-1, A( J, 1 ), LDA, A( J, 1 ), - $ LDA ) - IF( AJJ <= ZERO ) THEN - A( J, J ) = AJJ - GO TO 30 - END IF - AJJ = SQRT( AJJ ) - A( J, J ) = AJJ -* -* Compute elements J+1:N of column J. -* - IF( J < N ) THEN - CALL DGEMV( 'No transpose', N-J, J-1, -ONE, A( J+1, 1 ), - $ LDA, A( J, 1 ), LDA, ONE, A( J+1, J ), 1 ) - CALL DSCAL( N-J, ONE / AJJ, A( J+1, J ), 1 ) - END IF - 20 CONTINUE - END IF - GO TO 40 -* - 30 CONTINUE - INFO = J -* - 40 CONTINUE -* -* End of DPOTF2 -* -! ********************************************************************************************************************************** - 9000 continue ! My lines - - RETURN - -! ********************************************************************************************************************************** - - END SUBROUTINE DPOTF2 - -! ################################################################################################################################## -! 005 LAPACK_LINEAR_EQN_DPB - - SUBROUTINE DPBCON( UPLO, N, KD, AB, LDAB, ANORM, RCOND, WORK, - $ IWORK, INFO, itmax, dtbsv_msg ) - -! I added itmax, dtbsv_msg - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPBCON' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. - CHARACTER UPLO - character*1 dtbsv_msg - INTEGER INFO, KD, LDAB, N - REAL(DOUBLE) ANORM, RCOND -* .. -* .. Array Arguments .. - INTEGER IWORK( * ) - REAL(DOUBLE) AB( LDAB, * ), WORK( * ) -* .. -* -* Purpose -* ======= -* -* DPBCON estimates the reciprocal of the condition number (in the -* 1-norm) of a real symmetric positive definite band matrix using the -* Cholesky factorization A = U**T*U or A = L*L**T computed by DPBTRF. -* -* An estimate is obtained for norm(inv(A)), and the reciprocal of the -* condition number is computed as RCOND = 1 / (ANORM * norm(inv(A))). -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangular factor stored in AB; -* = 'L': Lower triangular factor stored in AB. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* KD (input) INTEGER -* The number of superdiagonals of the matrix A if UPLO = 'U', -* or the number of subdiagonals if UPLO = 'L'. KD >= 0. -* -* AB (input) REAL(DOUBLE) array, dimension (LDAB,N) -* The triangular factor U or L from the Cholesky factorization -* A = U**T*U or A = L*L**T of the band matrix A, stored in the -* first KD+1 rows of the array. The j-th column of U or L is -* stored in the j-th column of the array AB as follows: -* if UPLO ='U', AB(kd+1+i-j,j) = U(i,j) for max(1,j-kd)<=i<=j; -* if UPLO ='L', AB(1+i-j,j) = L(i,j) for j<=i<=min(n,j+kd). -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= KD+1. -* -* ANORM (input) REAL(DOUBLE) -* The 1-norm (or infinity-norm) of the symmetric band matrix A. -* -* RCOND (output) REAL(DOUBLE) -* The reciprocal of the condition number of the matrix A, -* computed as RCOND = 1/(ANORM * AINVNM), where AINVNM is an -* estimate of the 1-norm of inv(A) computed in this routine. -* -* WORK (workspace) REAL(DOUBLE) array, dimension (3*N) -* -* IWORK (workspace) INTEGER array, dimension (N) -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -! dtbsv_msg (input) CHARACTER -! = 'Y', have subr DTBSV print Fwd, Back pass messages - -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE, ZERO - PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - CHARACTER NORMIN - CHARACTER*48 MODNAM - INTEGER IX, KASE - integer iter_num, itmax - REAL(DOUBLE) AINVNM, SCALE, SCALEL, SCALEU, SMLNUM -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - - REAL(DOUBLE) DLAMCH - EXTERNAL DLAMCH -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC ABS -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N < 0 ) THEN - INFO = -2 - ELSE IF( KD < 0 ) THEN - INFO = -3 - ELSE IF( LDAB < KD+1 ) THEN - INFO = -5 - ELSE IF( ANORM < ZERO ) THEN - INFO = -6 - END IF - IF( INFO /= 0 ) THEN - CALL XERBLA( 'DPBCON', -INFO ) - go to 9000 ! My line - END IF -* -* Quick return if possible -* - RCOND = ZERO - IF( N == 0 ) THEN - RCOND = ONE - go to 9000 ! My line - ELSE IF( ANORM == ZERO ) THEN - go to 9000 ! My line - END IF -* - SMLNUM = DLAMCH( 'Safe minimum' ) -* -* Estimate the 1-norm of the inverse. -* - KASE = 0 - iter_num = 0 - NORMIN = 'N' - 10 CONTINUE - iter_num = iter_num + 1 - CALL DLACON( N, WORK( N+1 ), WORK, IWORK, AINVNM, KASE, itmax )! my ITMAX - IF( KASE /= 0 ) THEN - IF( UPPER ) THEN -* -* Multiply by inv(U'). -* - CALL DLATBS( 'Upper', 'Transpose', 'Non-unit', NORMIN, N, - $ KD, AB, LDAB, WORK, SCALEL, WORK( 2*N+1 ), - $ INFO, iter_num, dtbsv_msg ) - NORMIN = 'Y' -* -* Multiply by inv(U). -* - CALL DLATBS( 'Upper', 'No transpose', 'Non-unit', NORMIN, N, - $ KD, AB, LDAB, WORK, SCALEU, WORK( 2*N+1 ), - $ INFO, iter_num, dtbsv_msg ) - ELSE -* -* Multiply by inv(L). -* - CALL DLATBS( 'Lower', 'No transpose', 'Non-unit', NORMIN, N, - $ KD, AB, LDAB, WORK, SCALEL, WORK( 2*N+1 ), - $ INFO, iter_num, dtbsv_msg ) - NORMIN = 'Y' -* -* Multiply by inv(L'). -* - CALL DLATBS( 'Lower', 'Transpose', 'Non-unit', NORMIN, N, - $ KD, AB, LDAB, WORK, SCALEU, WORK( 2*N+1 ), - $ INFO, iter_num, dtbsv_msg ) - END IF -* -* Multiply by 1/SCALE if doing so will not cause overflow. -* - SCALE = SCALEL*SCALEU - IF( SCALE /= ONE ) THEN - IX = IDAMAX( N, WORK, 1 ) - IF( SCALE < ABS( WORK( IX ) )*SMLNUM .OR. SCALE == ZERO ) - $ GO TO 20 - CALL DRSCL( N, SCALE, WORK, 1 ) + INTEGER I, IB, NB, iblock, numblk + INTEGER ILAENV + EXTERNAL ILAENV + INTRINSIC MIN + + INFO = 0 + IF( N > 0 .AND. KD >= 0 .AND. LDAB >= KD+1 ) THEN + NB = ILAENV( 1, 'DPBTRF', UPLO, N, KD, -1, -1 ) + NB = MIN( NB, 32 ) + IF( NB > 1 .AND. NB <= KD ) THEN + numblk = int(n/nb) + 1 + iblock = 0 + WRITE (*,*) + DO I = 1, N, NB + iblock = iblock + 1 + IB = MIN( NB, N-I+1 ) + IF (NOCOUNTS .NE. 'Y') THEN + write(sc1,12345) iblock,numblk,ib + ENDIF + END DO END IF - GO TO 10 END IF -* -* Compute the estimate of the reciprocal condition number. -* - IF( AINVNM /= ZERO ) - $ RCOND = ( ONE / AINVNM ) / ANORM -* - 20 CONTINUE -* - go to 9000 ! My line -* -* End of DPBCON -* -! ********************************************************************************************************************************** - 9000 continue ! My lines + + CALL DPBTRF_KERNEL( UPLO, N, KD, AB, LDAB, INFO ) + +12345 format(5X,'Block ',i8,' of ',i8,'. Factoring rows 1 thru: ',i8) RETURN + END SUBROUTINE DPBTRF +! --- lapack_peeloff end --- ! -! ********************************************************************************************************************************** +! ################################################################################################################################## +! 003 LAPACK_LINEAR_EQN_DPB + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPBTF2( UPLO, N, KD, AB, LDAB, INFO ) + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + CHARACTER UPLO + INTEGER INFO, KD, LDAB, N + REAL(DOUBLE) AB( LDAB, * ) + CALL DPBTF2_HELPER( UPLO, N, KD, AB, LDAB, INFO ) + RETURN + END SUBROUTINE DPBTF2 +! --- lapack_peeloff end --- ! + +! ################################################################################################################################## +! 004 LAPACK_LINEAR_EQN_DPB + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPOTF2( UPLO, N, A, LDA, INFO ) + CHARACTER UPLO + INTEGER INFO, LDA, N + REAL(DOUBLE) A( LDA, * ) + CALL DPOTF2_HELPER( UPLO, N, A, LDA, INFO ) + RETURN + END SUBROUTINE DPOTF2 +! --- lapack_peeloff end --- ! + +! ################################################################################################################################## +! 005 LAPACK_LINEAR_EQN_DPB +! --- lapack_peeloff begin --- ! + SUBROUTINE DPBCON( UPLO, N, KD, AB, LDAB, ANORM, RCOND, WORK, + $ IWORK, INFO, itmax, dtbsv_msg ) + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + CHARACTER UPLO + CHARACTER*1 dtbsv_msg + INTEGER INFO, KD, LDAB, N + REAL(DOUBLE) ANORM, RCOND + INTEGER IWORK( * ) + REAL(DOUBLE) AB( LDAB, * ), WORK( * ) + INTEGER itmax + CALL DPBCON_HELPER( UPLO, N, KD, AB, LDAB, ANORM, RCOND, WORK, + $ IWORK, INFO, itmax, dtbsv_msg ) END SUBROUTINE DPBCON +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 006 LAPACK_LINEAR_EQN_DPB +! --- lapack_peeloff begin --- ! SUBROUTINE DPBTRS( UPLO, N, KD, NRHS, AB, LDAB, B, LDB, INFO, - & dtbsv_msg ) ! my addition + $ dtbsv_msg ) ! my addition USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPBTRS' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. CHARACTER UPLO character*1 dtbsv_msg INTEGER INFO, KD, LDAB, LDB, N, NRHS -* .. -* .. Array Arguments .. REAL(DOUBLE) AB( LDAB, * ), B( LDB, * ) -* .. -* -* Purpose -* ======= -* -* DPBTRS solves a system of linear equations A*X = B with a symmetric -* positive definite band matrix A using the Cholesky factorization -* A = U**T*U or A = L*L**T computed by DPBTRF. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangular factor stored in AB; -* = 'L': Lower triangular factor stored in AB. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* KD (input) INTEGER -* The number of superdiagonals of the matrix A if UPLO = 'U', -* or the number of subdiagonals if UPLO = 'L'. KD >= 0. -* -* NRHS (input) INTEGER -* The number of right hand sides, i.e., the number of columns -* of the matrix B. NRHS >= 0. -* -* AB (input) REAL(DOUBLE) array, dimension (LDAB,N) -* The triangular factor U or L from the Cholesky factorization -* A = U**T*U or A = L*L**T of the band matrix A, stored in the -* first KD+1 rows of the array. The j-th column of U or L is -* stored in the j-th column of the array AB as follows: -* if UPLO ='U', AB(kd+1+i-j,j) = U(i,j) for max(1,j-kd)<=i<=j; -* if UPLO ='L', AB(1+i-j,j) = L(i,j) for j<=i<=min(n,j+kd). -* -* LDAB (input) INTEGER -* The leading dimension of the array AB. LDAB >= KD+1. -* -* B (input/output) REAL(DOUBLE) array, dimension (LDB,NRHS) -* On entry, the right hand side matrix B. -* On exit, the solution matrix X. -* -* LDB (input) INTEGER -* The leading dimension of the array B. LDB >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -! dtbsv_msg (input) CHARACTER -! = 'Y', have subr DTBSV print Fwd, Back pass messages - -* ===================================================================== -* -* .. Local Scalars .. - LOGICAL UPPER - INTEGER J -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N < 0 ) THEN - INFO = -2 - ELSE IF( KD < 0 ) THEN - INFO = -3 - ELSE IF( NRHS < 0 ) THEN - INFO = -4 - ELSE IF( LDAB < KD+1 ) THEN - INFO = -6 - ELSE IF( LDB < MAX( 1, N ) ) THEN - INFO = -8 - END IF - IF( INFO /= 0 ) THEN - CALL XERBLA( 'DPBTRS', -INFO ) - go to 9000 ! My line - END IF -* -* Quick return if possible -* - IF( N == 0 .OR. NRHS == 0 ) - & go to 9000 ! My line -* - IF( UPPER ) THEN -* -* Solve A*X = B where A = U'*U. -* - DO 10 J = 1, NRHS - write(f06,*) 'In DPBTRS calling DTBSV with j = ',j -* -* Solve U'*X = B, overwriting B with X. -* - CALL DTBSV( 'Upper', 'Transpose', 'Non-unit', N, KD, AB, - $ LDAB, B( 1, J ), 1, dtbsv_msg ) -* -* Solve U*X = B, overwriting B with X. -* - CALL DTBSV( 'Upper', 'No transpose', 'Non-unit', N, KD, AB, - $ LDAB, B( 1, J ), 1, dtbsv_msg ) - 10 CONTINUE - ELSE -* -* Solve A*X = B where A = L*L'. -* - DO 20 J = 1, NRHS -* -* Solve L*X = B, overwriting B with X. -* - CALL DTBSV( 'Lower', 'No transpose', 'Non-unit', N, KD, AB, - $ LDAB, B( 1, J ), 1, dtbsv_msg ) -* -* Solve L'*X = B, overwriting B with X. -* - CALL DTBSV( 'Lower', 'Transpose', 'Non-unit', N, KD, AB, - $ LDAB, B( 1, J ), 1, dtbsv_msg ) - 20 CONTINUE - END IF -* - go to 9000 ! My line -* -* End of DPBTRS -* -! ********************************************************************************************************************************** - 9000 continue ! My lines - + CALL DPBTRS_HELPER( UPLO, N, KD, NRHS, AB, LDAB, B, LDB, INFO, + $ dtbsv_msg ) RETURN - -! ********************************************************************************************************************************** -98711 FORMAT(' ido = ',I3,', ipntr(1) = ',I8,', ipntr(2) = ',I8) - -98712 FORMAT(' I, AB, B = ',I8,2(1ES15.6)) - END SUBROUTINE DPBTRS +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 007 LAPACK_LINEAR_EQN_DPB @@ -1581,6 +348,7 @@ END SUBROUTINE DPTTRF_MYSTRAN ! ################################################################################################################################# ! 008 LAPACK_LINEAR_EQN_DPB +! --- lapack_peeloff begin --- ! SUBROUTINE DSYTF2( UPLO, N, A, LDA, IPIV, INFO ) CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSYTF2' @@ -1734,379 +502,13 @@ SUBROUTINE DSYTF2( UPLO, N, A, LDA, IPIV, INFO ) * * Test the input parameters. * - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -4 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DSYTF2', -INFO ) - RETURN - END IF -* -* Initialize ALPHA for use in choosing pivot block size. -* - ALPHA = ( ONE+SQRT( SEVTEN ) ) / EIGHT -* - IF( UPPER ) THEN -* -* Factorize A as U*D*U' using the upper triangle of A -* -* K is the main loop index, decreasing from N to 1 in steps of -* 1 or 2 -* - K = N - 10 CONTINUE -* -* If K < 1, exit from loop -* - IF( K.LT.1 ) - $ GO TO 70 - KSTEP = 1 -* -* Determine rows and columns to be interchanged and whether -* a 1-by-1 or 2-by-2 pivot block will be used -* - ABSAKK = ABS( A( K, K ) ) -* -* IMAX is the row-index of the largest off-diagonal element in -* column K, and COLMAX is its absolute value -* - IF( K.GT.1 ) THEN - IMAX = IDAMAX( K-1, A( 1, K ), 1 ) - COLMAX = ABS( A( IMAX, K ) ) - ELSE - COLMAX = ZERO - END IF -* - IF( (MAX( ABSAKK, COLMAX ).EQ.ZERO) .OR. DISNAN(ABSAKK) ) THEN -* -* Column K is zero or contains a NaN: set INFO and continue -* - IF( INFO.EQ.0 ) - $ INFO = K - KP = K - ELSE - IF( ABSAKK.GE.ALPHA*COLMAX ) THEN -* -* no interchange, use 1-by-1 pivot block -* - KP = K - ELSE -* -* JMAX is the column-index of the largest off-diagonal -* element in row IMAX, and ROWMAX is its absolute value -* - JMAX = IMAX + IDAMAX( K-IMAX, A( IMAX, IMAX+1 ), LDA ) - ROWMAX = ABS( A( IMAX, JMAX ) ) - IF( IMAX.GT.1 ) THEN - JMAX = IDAMAX( IMAX-1, A( 1, IMAX ), 1 ) - ROWMAX = MAX( ROWMAX, ABS( A( JMAX, IMAX ) ) ) - END IF -* - IF( ABSAKK.GE.ALPHA*COLMAX*( COLMAX / ROWMAX ) ) THEN -* -* no interchange, use 1-by-1 pivot block -* - KP = K - ELSE IF( ABS( A( IMAX, IMAX ) ).GE.ALPHA*ROWMAX ) THEN -* -* interchange rows and columns K and IMAX, use 1-by-1 -* pivot block -* - KP = IMAX - ELSE -* -* interchange rows and columns K-1 and IMAX, use 2-by-2 -* pivot block -* - KP = IMAX - KSTEP = 2 - END IF - END IF -* - KK = K - KSTEP + 1 - IF( KP.NE.KK ) THEN -* -* Interchange rows and columns KK and KP in the leading -* submatrix A(1:k,1:k) -* - CALL DSWAP( KP-1, A( 1, KK ), 1, A( 1, KP ), 1 ) - CALL DSWAP( KK-KP-1, A( KP+1, KK ), 1, A( KP, KP+1 ), - $ LDA ) - T = A( KK, KK ) - A( KK, KK ) = A( KP, KP ) - A( KP, KP ) = T - IF( KSTEP.EQ.2 ) THEN - T = A( K-1, K ) - A( K-1, K ) = A( KP, K ) - A( KP, K ) = T - END IF - END IF -* -* Update the leading submatrix -* - IF( KSTEP.EQ.1 ) THEN -* -* 1-by-1 pivot block D(k): column k now holds -* -* W(k) = U(k)*D(k) -* -* where U(k) is the k-th column of U -* -* Perform a rank-1 update of A(1:k-1,1:k-1) as -* -* A := A - U(k)*D(k)*U(k)' = A - W(k)*1/D(k)*W(k)' -* - R1 = ONE / A( K, K ) - CALL DSYR( UPLO, K-1, -R1, A( 1, K ), 1, A, LDA ) -* -* Store U(k) in column k -* - CALL DSCAL( K-1, R1, A( 1, K ), 1 ) - ELSE -* -* 2-by-2 pivot block D(k): columns k and k-1 now hold -* -* ( W(k-1) W(k) ) = ( U(k-1) U(k) )*D(k) -* -* where U(k) and U(k-1) are the k-th and (k-1)-th columns -* of U -* -* Perform a rank-2 update of A(1:k-2,1:k-2) as -* -* A := A - ( U(k-1) U(k) )*D(k)*( U(k-1) U(k) )' -* = A - ( W(k-1) W(k) )*inv(D(k))*( W(k-1) W(k) )' -* - IF( K.GT.2 ) THEN -* - D12 = A( K-1, K ) - D22 = A( K-1, K-1 ) / D12 - D11 = A( K, K ) / D12 - T = ONE / ( D11*D22-ONE ) - D12 = T / D12 -* - DO 30 J = K - 2, 1, -1 - WKM1 = D12*( D11*A( J, K-1 )-A( J, K ) ) - WK = D12*( D22*A( J, K )-A( J, K-1 ) ) - DO 20 I = J, 1, -1 - A( I, J ) = A( I, J ) - A( I, K )*WK - - $ A( I, K-1 )*WKM1 - 20 CONTINUE - A( J, K ) = WK - A( J, K-1 ) = WKM1 - 30 CONTINUE -* - END IF -* - END IF - END IF -* -* Store details of the interchanges in IPIV -* - IF( KSTEP.EQ.1 ) THEN - IPIV( K ) = KP - ELSE - IPIV( K ) = -KP - IPIV( K-1 ) = -KP - END IF -* -* Decrease K and return to the start of the main loop -* - K = K - KSTEP - GO TO 10 -* - ELSE -* -* Factorize A as L*D*L' using the lower triangle of A -* -* K is the main loop index, increasing from 1 to N in steps of -* 1 or 2 -* - K = 1 - 40 CONTINUE -* -* If K > N, exit from loop -* - IF( K.GT.N ) - $ GO TO 70 - KSTEP = 1 -* -* Determine rows and columns to be interchanged and whether -* a 1-by-1 or 2-by-2 pivot block will be used -* - ABSAKK = ABS( A( K, K ) ) -* -* IMAX is the row-index of the largest off-diagonal element in -* column K, and COLMAX is its absolute value -* - IF( K.LT.N ) THEN - IMAX = K + IDAMAX( N-K, A( K+1, K ), 1 ) - COLMAX = ABS( A( IMAX, K ) ) - ELSE - COLMAX = ZERO - END IF -* - IF( (MAX( ABSAKK, COLMAX ).EQ.ZERO) .OR. DISNAN(ABSAKK) ) THEN -* -* Column K is zero or contains a NaN: set INFO and continue -* - IF( INFO.EQ.0 ) - $ INFO = K - KP = K - ELSE - IF( ABSAKK.GE.ALPHA*COLMAX ) THEN -* -* no interchange, use 1-by-1 pivot block -* - KP = K - ELSE -* -* JMAX is the column-index of the largest off-diagonal -* element in row IMAX, and ROWMAX is its absolute value -* - JMAX = K - 1 + IDAMAX( IMAX-K, A( IMAX, K ), LDA ) - ROWMAX = ABS( A( IMAX, JMAX ) ) - IF( IMAX.LT.N ) THEN - JMAX = IMAX + IDAMAX( N-IMAX, A( IMAX+1, IMAX ), 1 ) - ROWMAX = MAX( ROWMAX, ABS( A( JMAX, IMAX ) ) ) - END IF -* - IF( ABSAKK.GE.ALPHA*COLMAX*( COLMAX / ROWMAX ) ) THEN -* -* no interchange, use 1-by-1 pivot block -* - KP = K - ELSE IF( ABS( A( IMAX, IMAX ) ).GE.ALPHA*ROWMAX ) THEN -* -* interchange rows and columns K and IMAX, use 1-by-1 -* pivot block -* - KP = IMAX - ELSE -* -* interchange rows and columns K+1 and IMAX, use 2-by-2 -* pivot block -* - KP = IMAX - KSTEP = 2 - END IF - END IF -* - KK = K + KSTEP - 1 - IF( KP.NE.KK ) THEN -* -* Interchange rows and columns KK and KP in the trailing -* submatrix A(k:n,k:n) -* - IF( KP.LT.N ) - $ CALL DSWAP( N-KP, A( KP+1, KK ), 1, A( KP+1, KP ), 1 ) - CALL DSWAP( KP-KK-1, A( KK+1, KK ), 1, A( KP, KK+1 ), - $ LDA ) - T = A( KK, KK ) - A( KK, KK ) = A( KP, KP ) - A( KP, KP ) = T - IF( KSTEP.EQ.2 ) THEN - T = A( K+1, K ) - A( K+1, K ) = A( KP, K ) - A( KP, K ) = T - END IF - END IF -* -* Update the trailing submatrix -* - IF( KSTEP.EQ.1 ) THEN -* -* 1-by-1 pivot block D(k): column k now holds -* -* W(k) = L(k)*D(k) -* -* where L(k) is the k-th column of L -* - IF( K.LT.N ) THEN -* -* Perform a rank-1 update of A(k+1:n,k+1:n) as -* -* A := A - L(k)*D(k)*L(k)' = A - W(k)*(1/D(k))*W(k)' -* - D11 = ONE / A( K, K ) - CALL DSYR( UPLO, N-K, -D11, A( K+1, K ), 1, - $ A( K+1, K+1 ), LDA ) -* -* Store L(k) in column K -* - CALL DSCAL( N-K, D11, A( K+1, K ), 1 ) - END IF - ELSE -* -* 2-by-2 pivot block D(k) -* - IF( K.LT.N-1 ) THEN -* -* Perform a rank-2 update of A(k+2:n,k+2:n) as -* -* A := A - ( (A(k) A(k+1))*D(k)**(-1) ) * (A(k) A(k+1))' -* -* where L(k) and L(k+1) are the k-th and (k+1)-th -* columns of L -* - D21 = A( K+1, K ) - D11 = A( K+1, K+1 ) / D21 - D22 = A( K, K ) / D21 - T = ONE / ( D11*D22-ONE ) - D21 = T / D21 -* - DO 60 J = K + 2, N -* - WK = D21*( D11*A( J, K )-A( J, K+1 ) ) - WKP1 = D21*( D22*A( J, K+1 )-A( J, K ) ) -* - DO 50 I = J, N - A( I, J ) = A( I, J ) - A( I, K )*WK - - $ A( I, K+1 )*WKP1 - 50 CONTINUE -* - A( J, K ) = WK - A( J, K+1 ) = WKP1 -* - 60 CONTINUE - END IF - END IF - END IF -* -* Store details of the interchanges in IPIV -* - IF( KSTEP.EQ.1 ) THEN - IPIV( K ) = KP - ELSE - IPIV( K ) = -KP - IPIV( K+1 ) = -KP - END IF -* -* Increase K and return to the start of the main loop -* - K = K + KSTEP - GO TO 40 -* - END IF -* - 70 CONTINUE -* - RETURN -* -* End of DSYTF2 -* -! ********************************************************************************************************************************** - 9000 continue ! My lines - + CALL DSYTF2_HELPER( UPLO, N, A, LDA, IPIV, INFO ) RETURN ! ********************************************************************************************************************************** END SUBROUTINE DSYTF2 +! --- lapack_peeloff end --- ! +! --- lapack_surgery end --- ! END MODULE LAPACK_LIN_EQN_DPB diff --git a/Source/Modules/LAPACK/LAPACK_MISCEL.f b/Source/Modules/LAPACK/LAPACK_MISCEL.f index d298926f..9ad5c7b5 100644 --- a/Source/Modules/LAPACK/LAPACK_MISCEL.f +++ b/Source/Modules/LAPACK/LAPACK_MISCEL.f @@ -1,16 +1,12 @@ ! ################################################################################################################################## MODULE LAPACK_MISCEL +! --- lapack_surgery begin --- ! USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE SCONTR, ONLY : BLNK_SUB_NAM - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & TSEC - USE LAPACK_BLAS_AUX - USE LAPACK_LIN_EQN_DPB - - USE OURTIM_Interface + USE LAPACK_DSTEV_HELPER, ONLY : DSTEV_HELPER => DSTEV + USE LAPACK_DTRTRS_HELPER, ONLY : DTRTRS_HELPER => DTRTRS ! This is a set of LAPACK routines that are used in several other modules but are not BLAS or auxiliary routines ! The routines included herein are: @@ -33,330 +29,31 @@ MODULE LAPACK_MISCEL ! ################################################################################################################################## ! 003 LAPACK_MISCEL +! --- lapack_peeloff begin --- ! SUBROUTINE DSTEV( JOBZ, N, D, E, Z, LDZ, WORK, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSTERF' -* -* -- LAPACK driver routine (version 3.1) -- -* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. -* November 2006 -* -* .. Scalar Arguments .. CHARACTER JOBZ INTEGER INFO, LDZ, N -* .. -* .. Array Arguments .. DOUBLE PRECISION D( * ), E( * ), WORK( * ), Z( LDZ, * ) -* .. -* -* Purpose -* ======= -* -* DSTEV computes all eigenvalues and, optionally, eigenvectors of a -* real symmetric tridiagonal matrix A. -* -* Arguments -* ========= -* -* JOBZ (input) CHARACTER*1 -* = 'N': Compute eigenvalues only; -* = 'V': Compute eigenvalues and eigenvectors. -* -* N (input) INTEGER -* The order of the matrix. N >= 0. -* -* D (input/output) DOUBLE PRECISION array, dimension (N) -* On entry, the n diagonal elements of the tridiagonal matrix -* A. -* On exit, if INFO = 0, the eigenvalues in ascending order. -* -* E (input/output) DOUBLE PRECISION array, dimension (N-1) -* On entry, the (n-1) subdiagonal elements of the tridiagonal -* matrix A, stored in elements 1 to N-1 of E. -* On exit, the contents of E are destroyed. -* -* Z (output) DOUBLE PRECISION array, dimension (LDZ, N) -* If JOBZ = 'V', then if INFO = 0, Z contains the orthonormal -* eigenvectors of the matrix A, with the i-th column of Z -* holding the eigenvector associated with D(i). -* If JOBZ = 'N', then Z is not referenced. -* -* LDZ (input) INTEGER -* The leading dimension of the array Z. LDZ >= 1, and if -* JOBZ = 'V', LDZ >= max(1,N). -* -* WORK (workspace) DOUBLE PRECISION array, dimension (max(1,2*N-2)) -* If JOBZ = 'N', WORK is not referenced. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, the algorithm failed to converge; i -* off-diagonal elements of E did not converge to zero. -* -* ===================================================================== -* -* .. Parameters .. - DOUBLE PRECISION ZERO, ONE - PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) -* .. -* .. Local Scalars .. - LOGICAL WANTZ - INTEGER IMAX, ISCALE - DOUBLE PRECISION BIGNUM, EPS, RMAX, RMIN, SAFMIN, SIGMA, SMLNUM, - $ TNRM -* .. -* .. External Functions .. - LOGICAL LSAME - DOUBLE PRECISION DLAMCH, DLANST - EXTERNAL LSAME, DLAMCH, DLANST -* .. -* .. External Subroutines .. - EXTERNAL DSCAL, DSTEQR, DSTERF, XERBLA -* .. -* .. Intrinsic Functions .. - INTRINSIC SQRT -* .. -* .. Executable Statements .. - -* -* Test the input parameters. -* - WANTZ = LSAME( JOBZ, 'V' ) -* - INFO = 0 - IF( .NOT.( WANTZ .OR. LSAME( JOBZ, 'N' ) ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDZ.LT.1 .OR. ( WANTZ .AND. LDZ.LT.N ) ) THEN - INFO = -6 - END IF -* - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DSTEV ', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* - IF( N.EQ.1 ) THEN - IF( WANTZ ) - $ Z( 1, 1 ) = ONE - RETURN - END IF -* -* Get machine constants. -* - SAFMIN = DLAMCH( 'Safe minimum' ) - EPS = DLAMCH( 'Precision' ) - SMLNUM = SAFMIN / EPS - BIGNUM = ONE / SMLNUM - RMIN = SQRT( SMLNUM ) - RMAX = SQRT( BIGNUM ) -* -* Scale matrix to allowable range, if necessary. -* - ISCALE = 0 - TNRM = DLANST( 'M', N, D, E ) - IF( TNRM.GT.ZERO .AND. TNRM.LT.RMIN ) THEN - ISCALE = 1 - SIGMA = RMIN / TNRM - ELSE IF( TNRM.GT.RMAX ) THEN - ISCALE = 1 - SIGMA = RMAX / TNRM - END IF - IF( ISCALE.EQ.1 ) THEN - CALL DSCAL( N, SIGMA, D, 1 ) - CALL DSCAL( N-1, SIGMA, E( 1 ), 1 ) - END IF -* -* For eigenvalues only, call DSTERF. For eigenvalues and -* eigenvectors, call DSTEQR. -* - IF( .NOT.WANTZ ) THEN - CALL DSTERF( N, D, E, INFO ) - ELSE - CALL DSTEQR( 'I', N, D, E, Z, LDZ, WORK, INFO ) - END IF -* -* If matrix was scaled, then rescale eigenvalues appropriately. -* - IF( ISCALE.EQ.1 ) THEN - IF( INFO.EQ.0 ) THEN - IMAX = N - ELSE - IMAX = INFO - 1 - END IF - CALL DSCAL( IMAX, ONE / SIGMA, D, 1 ) - END IF -* + CALL DSTEV_HELPER( JOBZ, N, D, E, Z, LDZ, WORK, INFO ) RETURN -* -* End of DSTEV -* END SUBROUTINE DSTEV +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 004 LAPACK_MISCEL +! --- lapack_peeloff begin --- ! SUBROUTINE DTRTRS( UPLO, TRANS, DIAG, N, NRHS, A, LDA, B, LDB, $ INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSTERF' -* -* -- LAPACK routine (version 3.1) -- -* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. -* November 2006 -* -* .. Scalar Arguments .. CHARACTER DIAG, TRANS, UPLO INTEGER INFO, LDA, LDB, N, NRHS -* .. -* .. Array Arguments .. DOUBLE PRECISION A( LDA, * ), B( LDB, * ) -* .. -* -* Purpose -* ======= -* -* DTRTRS solves a triangular system of the form -* -* A * X = B or A**T * X = B, -* -* where A is a triangular matrix of order N, and B is an N-by-NRHS -* matrix. A check is made to verify that A is nonsingular. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': A is upper triangular; -* = 'L': A is lower triangular. -* -* TRANS (input) CHARACTER*1 -* Specifies the form of the system of equations: -* = 'N': A * X = B (No transpose) -* = 'T': A**T * X = B (Transpose) -* = 'C': A**H * X = B (Conjugate transpose = Transpose) -* -* DIAG (input) CHARACTER*1 -* = 'N': A is non-unit triangular; -* = 'U': A is unit triangular. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* NRHS (input) INTEGER -* The number of right hand sides, i.e., the number of columns -* of the matrix B. NRHS >= 0. -* -* A (input) DOUBLE PRECISION array, dimension (LDA,N) -* The triangular matrix A. If UPLO = 'U', the leading N-by-N -* upper triangular part of the array A contains the upper -* triangular matrix, and the strictly lower triangular part of -* A is not referenced. If UPLO = 'L', the leading N-by-N lower -* triangular part of the array A contains the lower triangular -* matrix, and the strictly upper triangular part of A is not -* referenced. If DIAG = 'U', the diagonal elements of A are -* also not referenced and are assumed to be 1. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* B (input/output) DOUBLE PRECISION array, dimension (LDB,NRHS) -* On entry, the right hand side matrix B. -* On exit, if INFO = 0, the solution matrix X. -* -* LDB (input) INTEGER -* The leading dimension of the array B. LDB >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, the i-th diagonal element of A is zero, -* indicating that the matrix is singular and the solutions -* X have not been computed. -* -* ===================================================================== -* -* .. Parameters .. - DOUBLE PRECISION ZERO, ONE - PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL NOUNIT -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. - EXTERNAL DTRSM, XERBLA -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. - -* -* Test the input parameters. -* - INFO = 0 - NOUNIT = LSAME( DIAG, 'N' ) - IF( .NOT.LSAME( UPLO, 'U' ) .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( .NOT.LSAME( TRANS, 'N' ) .AND. .NOT. - $ LSAME( TRANS, 'T' ) .AND. .NOT.LSAME( TRANS, 'C' ) ) THEN - INFO = -2 - ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN - INFO = -3 - ELSE IF( N.LT.0 ) THEN - INFO = -4 - ELSE IF( NRHS.LT.0 ) THEN - INFO = -5 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -7 - ELSE IF( LDB.LT.MAX( 1, N ) ) THEN - INFO = -9 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DTRTRS', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* -* Check for singularity. -* - IF( NOUNIT ) THEN - DO 10 INFO = 1, N - IF( A( INFO, INFO ).EQ.ZERO ) - $ RETURN - 10 CONTINUE - END IF - INFO = 0 -* -* Solve A * x = b or A' * x = b. -* - CALL DTRSM( 'Left', UPLO, TRANS, DIAG, N, NRHS, ONE, A, LDA, B, - $ LDB ) -* + CALL DTRTRS_HELPER( UPLO, TRANS, DIAG, N, NRHS, A, LDA, B, LDB, + $ INFO ) RETURN -* -* End of DTRTRS -* END SUBROUTINE DTRTRS +! --- lapack_peeloff end --- ! +! --- lapack_surgery end --- ! END MODULE LAPACK_MISCEL diff --git a/Source/Modules/LAPACK/LAPACK_MISCEL_ext.f90 b/Source/Modules/LAPACK/LAPACK_MISCEL_ext.f90 new file mode 100644 index 00000000..8b4afa20 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_MISCEL_ext.f90 @@ -0,0 +1,37 @@ +module LAPACK_MISCEL + + use PENTIUM_II_KIND, only : BYTE, LONG, DOUBLE + + implicit none + + interface + subroutine DSTEQR(COMPZ, N, D, E, Z, LDZ, WORK, INFO) + use PENTIUM_II_KIND, only : LONG, DOUBLE + implicit none + character(1), intent(in) :: COMPZ + integer(LONG), intent(in) :: N, LDZ + integer(LONG), intent(out) :: INFO + real(DOUBLE), intent(inout) :: D(*), E(*), Z(LDZ,*), WORK(*) + end subroutine DSTEQR + + subroutine DSTEV(JOBZ, N, D, E, Z, LDZ, WORK, INFO) + use PENTIUM_II_KIND, only : LONG, DOUBLE + implicit none + character(1), intent(in) :: JOBZ + integer(LONG), intent(in) :: N, LDZ + integer(LONG), intent(out) :: INFO + real(DOUBLE), intent(inout) :: D(*), E(*), Z(LDZ,*), WORK(*) + end subroutine DSTEV + + subroutine DTRTRS(UPLO, TRANS, DIAG, N, NRHS, A, LDA, B, LDB, INFO) + use PENTIUM_II_KIND, only : LONG, DOUBLE + implicit none + character(1), intent(in) :: UPLO, TRANS, DIAG + integer(LONG), intent(in) :: N, NRHS, LDA, LDB + integer(LONG), intent(out) :: INFO + real(DOUBLE), intent(in) :: A(LDA,*) + real(DOUBLE), intent(inout) :: B(LDB,*) + end subroutine DTRTRS + end interface + +end module LAPACK_MISCEL diff --git a/Source/Modules/LAPACK/LAPACK_POTF2_HELPER.f b/Source/Modules/LAPACK/LAPACK_POTF2_HELPER.f new file mode 100644 index 00000000..a6d5e658 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_POTF2_HELPER.f @@ -0,0 +1,188 @@ +! ################################################################################################################################## + + MODULE LAPACK_POTF2_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE SCONTR, ONLY : BLNK_SUB_NAM + USE LAPACK_BLAS_AUX + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DPOTF2( UPLO, N, A, LDA, INFO ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DPOTF2' +* +* -- LAPACK routine (version 2.0) -- +* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., +* Courant Institute, Argonne National Lab, and Rice University +* February 29, 1992 +* +* .. Scalar Arguments .. + CHARACTER UPLO + INTEGER INFO, LDA, N +* .. +* .. Array Arguments .. + REAL(DOUBLE) A( LDA, * ) +* .. +* +* Purpose +* ======= +* +* DPOTF2 computes the Cholesky factorization of a real symmetric +* positive definite matrix A. +* +* The factorization has the form +* A = U' * U , if UPLO = 'U', or +* A = L * L', if UPLO = 'L', +* where U is an upper triangular matrix and L is lower triangular. +* +* This is the unblocked version of the algorithm, calling Level 2 BLAS. +* +* Arguments +* ========= +* +* UPLO (input) CHARACTER*1 +* Specifies whether the upper or lower triangular part of the +* symmetric matrix A is stored. +* = 'U': Upper triangular +* = 'L': Lower triangular +* +* N (input) INTEGER +* The order of the matrix A. N >= 0. +* +* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) +* On entry, the symmetric matrix A. If UPLO = 'U', the leading +* n by n upper triangular part of A contains the upper +* triangular part of the matrix A, and the strictly lower +* triangular part of A is not referenced. If UPLO = 'L', the +* leading n by n lower triangular part of A contains the lower +* triangular part of the matrix A, and the strictly upper +* triangular part of A is not referenced. +* +* On exit, if INFO = 0, the factor U or L from the Cholesky +* factorization A = U'*U or A = L*L'. +* +* LDA (input) INTEGER +* The leading dimension of the array A. LDA >= max(1,N). +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -k, the k-th argument had an illegal value +* > 0: if INFO = k, the leading minor of order k is not +* positive definite, and the factorization could not be +* completed. +* +* ===================================================================== +* +* .. Parameters .. + REAL(DOUBLE) ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +* .. +* .. Local Scalars .. + LOGICAL UPPER + INTEGER J + REAL(DOUBLE) AJJ,dda +* .. +* .. External Functions .. + LOGICAL LSAME + EXTERNAL LSAME +* .. +* .. Intrinsic Functions .. + INTRINSIC MAX, SQRT +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N < 0 ) THEN + INFO = -2 + ELSE IF( LDA < MAX( 1, N ) ) THEN + INFO = -4 + END IF + IF( INFO /= 0 ) THEN + CALL XERBLA( 'DPOTF2', -INFO ) + GO TO 9000 + END IF +* +* Quick return if possible +* + IF( N == 0 ) + & GO TO 9000 +* + IF( UPPER ) THEN +* +* Compute the Cholesky factorization A = U'*U. +* + DO 10 J = 1, N +* +* Compute U(J,J) and test for non-positive-definiteness. +* + AJJ = A( J, J ) - DDOT( J-1, A( 1, J ), 1, A( 1, J ), 1 ) + IF( AJJ <= ZERO ) THEN + AJJ = ONE + A( J, J ) = AJJ + GO TO 30 + END IF + AJJ = SQRT( AJJ ) + A( J, J ) = AJJ +* +* Compute elements J+1:N of row J. +* + IF( J < N ) THEN + CALL DGEMV( 'Transpose', J-1, N-J, -ONE, A( 1, J+1 ), + $ LDA, A( 1, J ), 1, ONE, A( J, J+1 ), LDA ) + CALL DSCAL( N-J, ONE / AJJ, A( J, J+1 ), LDA ) + END IF + 10 CONTINUE + ELSE +* +* Compute the Cholesky factorization A = L*L'. +* + DO 20 J = 1, N +* +* Compute L(J,J) and test for non-positive-definiteness. +* + AJJ = A( J, J ) - DDOT( J-1, A( J, 1 ), LDA, A( J, 1 ), + $ LDA ) + IF( AJJ <= ZERO ) THEN + A( J, J ) = AJJ + GO TO 30 + END IF + AJJ = SQRT( AJJ ) + A( J, J ) = AJJ +* +* Compute elements J+1:N of column J. +* + IF( J < N ) THEN + CALL DGEMV( 'No transpose', N-J, J-1, -ONE, A( J+1, 1 ), + $ LDA, A( J, 1 ), LDA, ONE, A( J+1, J ), 1 ) + CALL DSCAL( N-J, ONE / AJJ, A( J+1, J ), 1 ) + END IF + 20 CONTINUE + END IF + GO TO 40 +* + 30 CONTINUE + INFO = J +* + 40 CONTINUE +* +* End of DPOTF2 +* + 9000 CONTINUE + + RETURN + + END SUBROUTINE DPOTF2 +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_POTF2_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_STD_EIG_1.f b/Source/Modules/LAPACK/LAPACK_STD_EIG_1.f index 28ed0fae..164d8363 100644 --- a/Source/Modules/LAPACK/LAPACK_STD_EIG_1.f +++ b/Source/Modules/LAPACK/LAPACK_STD_EIG_1.f @@ -1,718 +1,52 @@ ! ################################################################################################################################## MODULE LAPACK_STD_EIG_1 +! --- lapack_surgery begin --- ! USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06 - USE SCONTR, ONLY : LINKNO, BLNK_SUB_NAM - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & SFRAC, STIME, TSEC - USE LAPACK_BLAS_AUX - USE LAPACK_MISCEL ! This contains DSTEQR, used in this module - - USE OURTIM_Interface - USE OUTA_HERE_Interface - - CHARACTER(44*BYTE), PRIVATE :: MODNAM ! Name to write to screen to describe module being run. - -! This is a set of LAPACK routines for solving for all of the eigenvalues -! and, possibly, all eigenvectors of: - -! Ax = (Lambda)x (1) - -! where A is real and symmetric. - -! This module contains LAPACK subroutines described below: - -! DSYEV : Main driver for solving (1) for eigenvalues and eigenvectors - -! DSYEV calls the LAPACK subroutines included herein and described below to do the main computations. - -! DSYTRD: To reduce A to tridiagonal form, and -! DSTERF: To calc eigenvalues if NO eigenvectors are sought, or -! NOTE: DSTEQR is not in this module, it is in module LAPACK_MISCEL, since it is used in several modules -! DORGTR: To generate orthogonal matrices - -! DSYEV also uses - -! DSTEQR: to compute all eigenvalues and all eigenvectors of the tridiagonal matrix (if eigenvectors are requested). -! NOTE: DSTEQR is not in this module, it is in module LAPACK_MISCEL, since it is used in several other modules - -! In addition, other LAPACK procedures are called from module LAPACK_BLAS_AUX_1 + USE LAPACK_STD_EIG_1_HELPER, ONLY : DSYEV_HELPER => DSYEV, & + & DSYTRD_HELPER => DSYTRD, & + & DORGTR_HELPER => DORGTR CONTAINS ! ################################################################################################################################## -! 001 LAPACK_STD_EIG_1 +! --- lapack_peeloff begin --- ! SUBROUTINE DSYEV( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSYEV' -* -* -- LAPACK driver routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. CHARACTER JOBZ, UPLO INTEGER INFO, LDA, LWORK, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), W( * ), WORK( * ) -* .. -* -* Purpose -* ======= -* -* DSYEV computes all eigenvalues and, optionally, eigenvectors of a -* real symmetric matrix A. -* -* Arguments -* ========= -* -* JOBZ (input) CHARACTER*1 -* = 'N': Compute eigenvalues only; -* = 'V': Compute eigenvalues and eigenvectors. -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangle of A is stored; -* = 'L': Lower triangle of A is stored. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA, N) -* On entry, the symmetric matrix A. If UPLO = 'U', the -* leading N-by-N upper triangular part of A contains the -* upper triangular part of the matrix A. If UPLO = 'L', -* the leading N-by-N lower triangular part of A contains -* the lower triangular part of the matrix A. -* On exit, if JOBZ = 'V', then if INFO = 0, A contains the -* orthonormal eigenvectors of the matrix A. -* If JOBZ = 'N', then on exit the lower triangle (if UPLO='L') -* or the upper triangle (if UPLO='U') of A, including the -* diagonal, is destroyed. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* W (output) REAL(DOUBLE) array, dimension (N) -* If INFO = 0, the eigenvalues in ascending order. -* -* WORK (workspace/output) REAL(DOUBLE) array, dimension (LWORK) -* On exit, if INFO = 0, WORK(1) returns the optimal LWORK. -* -* LWORK (input) INTEGER -* The length of the array WORK. LWORK >= max(1,3*N-1). -* For optimal efficiency, LWORK >= (NB+2)*N, -* where NB is the blocksize for DSYTRD returned by ILAENV. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, the algorithm failed to converge; i -* off-diagonal elements of an intermediate tridiagonal -* form did not converge to zero. -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE - PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) -* .. -* .. Local Scalars .. - LOGICAL LOWER, WANTZ - INTEGER IINFO, IMAX, INDE, INDTAU, INDWRK, ISCALE, - $ LLWORK, LOPT - REAL(DOUBLE) ANRM, BIGNUM, EPS, RMAX, RMIN, SAFMIN, SIGMA, - $ SMLNUM -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - - INTEGER ILAENV - EXTERNAL ILAENV - - REAL(DOUBLE) DLAMCH - EXTERNAL DLAMCH -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, SQRT -* .. -* .. Executable Statements .. - -* -* Test the input parameters. -* - WANTZ = LSAME( JOBZ, 'V' ) - LOWER = LSAME( UPLO, 'L' ) -* - INFO = 0 - IF( .NOT.( WANTZ .OR. LSAME( JOBZ, 'N' ) ) ) THEN - INFO = -1 - ELSE IF( .NOT.( LOWER .OR. LSAME( UPLO, 'U' ) ) ) THEN - INFO = -2 - ELSE IF( N.LT.0 ) THEN - INFO = -3 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -5 - ELSE IF( LWORK.LT.MAX( 1, 3*N-1 ) ) THEN - INFO = -8 - END IF -* - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DSYEV ', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) THEN - WORK( 1 ) = 1 - RETURN - END IF -* - IF( N.EQ.1 ) THEN - W( 1 ) = A( 1, 1 ) - WORK( 1 ) = 3 - IF( WANTZ ) - $ A( 1, 1 ) = ONE - RETURN - END IF -* -* Get machine constants. -* - SAFMIN = DLAMCH( 'Safe minimum' ) - EPS = DLAMCH( 'Precision' ) - SMLNUM = SAFMIN / EPS - BIGNUM = ONE / SMLNUM - RMIN = SQRT( SMLNUM ) - RMAX = SQRT( BIGNUM ) -* -* Scale matrix to allowable range, if necessary. -* - ANRM = DLANSY( 'M', UPLO, N, A, LDA, WORK ) - ISCALE = 0 - IF( ANRM.GT.ZERO .AND. ANRM.LT.RMIN ) THEN - ISCALE = 1 - SIGMA = RMIN / ANRM - ELSE IF( ANRM.GT.RMAX ) THEN - ISCALE = 1 - SIGMA = RMAX / ANRM - END IF - IF( ISCALE.EQ.1 ) - $ CALL DLASCL( UPLO, 0, 0, ONE, SIGMA, N, N, A, LDA, INFO ) -* -* Call DSYTRD to reduce symmetric matrix to tridiagonal form. -* - INDE = 1 - INDTAU = INDE + N - INDWRK = INDTAU + N - LLWORK = LWORK - INDWRK + 1 - CALL DSYTRD( UPLO, N, A, LDA, W, WORK( INDE ), WORK( INDTAU ), - $ WORK( INDWRK ), LLWORK, IINFO ) - LOPT = 2*N + WORK( INDWRK ) -* -* For eigenvalues only, call DSTERF. For eigenvectors, first call -* DORGTR to generate the orthogonal matrix, then call DSTEQR. -* - IF( .NOT.WANTZ ) THEN - CALL DSTERF( N, W, WORK( INDE ), INFO ) - ELSE - CALL DORGTR( UPLO, N, A, LDA, WORK( INDTAU ), WORK( INDWRK ), - $ LLWORK, IINFO ) - CALL DSTEQR( JOBZ, N, W, WORK( INDE ), A, LDA, WORK( INDTAU ), - $ INFO ) - END IF -* -* If matrix was scaled, then rescale eigenvalues appropriately. -* - IF( ISCALE.EQ.1 ) THEN - IF( INFO.EQ.0 ) THEN - IMAX = N - ELSE - IMAX = INFO - 1 - END IF - CALL DSCAL( IMAX, ONE / SIGMA, W, 1 ) - END IF -* -* Set WORK(1) to optimal workspace size. -* - WORK( 1 ) = MAX( 3*N-1, LOPT ) -* - + REAL(DOUBLE) A( LDA, * ), W( * ), WORK( * ) + CALL DSYEV_HELPER( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO ) RETURN -* -* End of DSYEV -* END SUBROUTINE DSYEV +! --- lapack_peeloff end --- ! ! ################################################################################################################################## -! 002 LAPACK_STD_EIG_1 +! --- lapack_peeloff begin --- ! SUBROUTINE DSYTRD( UPLO, N, A, LDA, D, E, TAU, WORK, LWORK, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DSYTRD' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, LDA, LWORK, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), D( * ), E( * ), TAU( * ), + REAL(DOUBLE) A( LDA, * ), D( * ), E( * ), TAU( * ), $ WORK( * ) -* .. -* -* Purpose -* ======= -* -* DSYTRD reduces a real symmetric matrix A to real symmetric -* tridiagonal form T by an orthogonal similarity transformation: -* Q**T * A * Q = T. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangle of A is stored; -* = 'L': Lower triangle of A is stored. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On entry, the symmetric matrix A. If UPLO = 'U', the leading -* N-by-N upper triangular part of A contains the upper -* triangular part of the matrix A, and the strictly lower -* triangular part of A is not referenced. If UPLO = 'L', the -* leading N-by-N lower triangular part of A contains the lower -* triangular part of the matrix A, and the strictly upper -* triangular part of A is not referenced. -* On exit, if UPLO = 'U', the diagonal and first superdiagonal -* of A are overwritten by the corresponding elements of the -* tridiagonal matrix T, and the elements above the first -* superdiagonal, with the array TAU, represent the orthogonal -* matrix Q as a product of elementary reflectors; if UPLO -* = 'L', the diagonal and first subdiagonal of A are over- -* written by the corresponding elements of the tridiagonal -* matrix T, and the elements below the first subdiagonal, with -* the array TAU, represent the orthogonal matrix Q as a product -* of elementary reflectors. See Further Details. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* D (output) REAL(DOUBLE) array, dimension (N) -* The diagonal elements of the tridiagonal matrix T: -* D(i) = A(i,i). -* -* E (output) REAL(DOUBLE) array, dimension (N-1) -* The off-diagonal elements of the tridiagonal matrix T: -* E(i) = A(i,i+1) if UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. -* -* TAU (output) REAL(DOUBLE) array, dimension (N-1) -* The scalar factors of the elementary reflectors (see Further -* Details). -* -* WORK (workspace/output) REAL(DOUBLE) array, dimension (LWORK) -* On exit, if INFO = 0, WORK(1) returns the optimal LWORK. -* -* LWORK (input) INTEGER -* The dimension of the array WORK. LWORK >= 1. -* For optimum performance LWORK >= N*NB, where NB is the -* optimal blocksize. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -* Further Details -* =============== -* -* If UPLO = 'U', the matrix Q is represented as a product of elementary -* reflectors -* -* Q = H(n-1) . . . H(2) H(1). -* -* Each H(i) has the form -* -* H(i) = I - tau * v * v' -* -* where tau is a real scalar, and v is a real vector with -* v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in -* A(1:i-1,i+1), and tau in TAU(i). -* -* If UPLO = 'L', the matrix Q is represented as a product of elementary -* reflectors -* -* Q = H(1) H(2) . . . H(n-1). -* -* Each H(i) has the form -* -* H(i) = I - tau * v * v' -* -* where tau is a real scalar, and v is a real vector with -* v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), -* and tau in TAU(i). -* -* The contents of A on exit are illustrated by the following examples -* with n = 5: -* -* if UPLO = 'U': if UPLO = 'L': -* -* ( d e v2 v3 v4 ) ( d ) -* ( d e v3 v4 ) ( e d ) -* ( d e v4 ) ( v1 e d ) -* ( d e ) ( v1 v2 e d ) -* ( d ) ( v1 v2 v3 e d ) -* -* where d and e denote diagonal and off-diagonal elements of T, and vi -* denotes an element of the vector defining H(i). -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ONE - PARAMETER ( ONE = 1.0D0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER I, IINFO, IWS, J, KK, LDWORK, NB, NBMIN, NX -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - - INTEGER ILAENV - EXTERNAL ILAENV -* .. -* .. Executable Statements .. - -* -* Test the input parameters -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -4 - ELSE IF( LWORK.LT.1 ) THEN - INFO = -9 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DSYTRD', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) THEN - WORK( 1 ) = 1 - RETURN - END IF -* -* Determine the block size. -* - NB = ILAENV( 1, 'DSYTRD', UPLO, N, -1, -1, -1 ) - NX = N - IWS = 1 - IF( NB.GT.1 .AND. NB.LT.N ) THEN -* -* Determine when to cross over from blocked to unblocked code -* (last block is always handled by unblocked code). -* - NX = MAX( NB, ILAENV( 3, 'DSYTRD', UPLO, N, -1, -1, -1 ) ) - IF( NX.LT.N ) THEN -* -* Determine if workspace is large enough for blocked code. -* - LDWORK = N - IWS = LDWORK*NB - IF( LWORK.LT.IWS ) THEN -* -* Not enough workspace to use optimal NB: determine the -* minimum value of NB, and reduce NB or force use of -* unblocked code by setting NX = N. -* - NB = MAX( LWORK / LDWORK, 1 ) - NBMIN = ILAENV( 2, 'DSYTRD', UPLO, N, -1, -1, -1 ) - IF( NB.LT.NBMIN ) - $ NX = N - END IF - ELSE - NX = N - END IF - ELSE - NB = 1 - END IF -* - IF( UPPER ) THEN -* -* Reduce the upper triangle of A. -* Columns 1:kk are handled by the unblocked method. -* - KK = N - ( ( N-NX+NB-1 ) / NB )*NB - DO 20 I = N - NB + 1, KK + 1, -NB -* -* Reduce columns i:i+nb-1 to tridiagonal form and form the -* matrix W which is needed to update the unreduced part of -* the matrix -* - CALL DLATRD( UPLO, I+NB-1, NB, A, LDA, E, TAU, WORK, - $ LDWORK ) -* -* Update the unreduced submatrix A(1:i-1,1:i-1), using an -* update of the form: A := A - V*W' - W*V' -* - CALL DSYR2K( UPLO, 'No transpose', I-1, NB, -ONE, A( 1, I ), - $ LDA, WORK, LDWORK, ONE, A, LDA ) -* -* Copy superdiagonal elements back into A, and diagonal -* elements into D -* - DO 10 J = I, I + NB - 1 - A( J-1, J ) = E( J-1 ) - D( J ) = A( J, J ) - 10 CONTINUE - 20 CONTINUE -* -* Use unblocked code to reduce the last or only block -* - CALL DSYTD2( UPLO, KK, A, LDA, D, E, TAU, IINFO ) - ELSE -* -* Reduce the lower triangle of A -* - DO 40 I = 1, N - NX, NB -* -* Reduce columns i:i+nb-1 to tridiagonal form and form the -* matrix W which is needed to update the unreduced part of -* the matrix -* - CALL DLATRD( UPLO, N-I+1, NB, A( I, I ), LDA, E( I ), - $ TAU( I ), WORK, LDWORK ) -* -* Update the unreduced submatrix A(i+ib:n,i+ib:n), using -* an update of the form: A := A - V*W' - W*V' -* - CALL DSYR2K( UPLO, 'No transpose', N-I-NB+1, NB, -ONE, - $ A( I+NB, I ), LDA, WORK( NB+1 ), LDWORK, ONE, - $ A( I+NB, I+NB ), LDA ) -* -* Copy subdiagonal elements back into A, and diagonal -* elements into D -* - DO 30 J = I, I + NB - 1 - A( J+1, J ) = E( J ) - D( J ) = A( J, J ) - 30 CONTINUE - 40 CONTINUE -* -* Use unblocked code to reduce the last or only block -* - CALL DSYTD2( UPLO, N-I+1, A( I, I ), LDA, D( I ), E( I ), - $ TAU( I ), IINFO ) - END IF -* - WORK( 1 ) = IWS - + CALL DSYTRD_HELPER( UPLO, N, A, LDA, D, E, TAU, WORK, LWORK, + $ INFO ) RETURN -* -* End of DSYTRD -* END SUBROUTINE DSYTRD +! --- lapack_peeloff end --- ! ! ################################################################################################################################## -! 003 LAPACK_STD_EIG_1 +! --- lapack_peeloff begin --- ! SUBROUTINE DORGTR( UPLO, N, A, LDA, TAU, WORK, LWORK, INFO ) - - USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - - CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: subr_name = 'DORGTR' -* -* -- LAPACK routine (version 2.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* September 30, 1994 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, LDA, LWORK, N -* .. -* .. Array Arguments .. - REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( LWORK ) -* .. -* -* Purpose -* ======= -* -* DORGTR generates a real orthogonal matrix Q which is defined as the -* product of n-1 elementary reflectors of order N, as returned by -* DSYTRD: -* -* if UPLO = 'U', Q = H(n-1) . . . H(2) H(1), -* -* if UPLO = 'L', Q = H(1) H(2) . . . H(n-1). -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangle of A contains elementary reflectors -* from DSYTRD; -* = 'L': Lower triangle of A contains elementary reflectors -* from DSYTRD. -* -* N (input) INTEGER -* The order of the matrix Q. N >= 0. -* -* A (input/output) REAL(DOUBLE) array, dimension (LDA,N) -* On entry, the vectors which define the elementary reflectors, -* as returned by DSYTRD. -* On exit, the N-by-N orthogonal matrix Q. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* TAU (input) REAL(DOUBLE) array, dimension (N-1) -* TAU(i) must contain the scalar factor of the elementary -* reflector H(i), as returned by DSYTRD. -* -* WORK (workspace/output) REAL(DOUBLE) array, dimension (LWORK) -* On exit, if INFO = 0, WORK(1) returns the optimal LWORK. -* -* LWORK (input) INTEGER -* The dimension of the array WORK. LWORK >= max(1,N-1). -* For optimum performance LWORK >= (N-1)*NB, where NB is -* the optimal blocksize. -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* -* ===================================================================== -* -* .. Parameters .. - REAL(DOUBLE) ZERO, ONE - PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER I, IINFO, J -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. - -* -* Test the input arguments -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -4 - ELSE IF( LWORK.LT.MAX( 1, N-1 ) ) THEN - INFO = -7 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DORGTR', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) THEN - WORK( 1 ) = 1 - RETURN - END IF -* - IF( UPPER ) THEN -* -* Q was determined by a call to DSYTRD with UPLO = 'U' -* -* Shift the vectors which define the elementary reflectors one -* column to the left, and set the last row and column of Q to -* those of the unit matrix -* - DO 20 J = 1, N - 1 - DO 10 I = 1, J - 1 - A( I, J ) = A( I, J+1 ) - 10 CONTINUE - A( N, J ) = ZERO - 20 CONTINUE - DO 30 I = 1, N - 1 - A( I, N ) = ZERO - 30 CONTINUE - A( N, N ) = ONE -* -* Generate Q(1:n-1,1:n-1) -* - CALL DORGQL( N-1, N-1, N-1, A, LDA, TAU, WORK, LWORK, IINFO ) -* - ELSE -* -* Q was determined by a call to DSYTRD with UPLO = 'L'. -* -* Shift the vectors which define the elementary reflectors one -* column to the right, and set the first row and column of Q to -* those of the unit matrix -* - DO 50 J = N, 2, -1 - A( 1, J ) = ZERO - DO 40 I = J + 1, N - A( I, J ) = A( I, J-1 ) - 40 CONTINUE - 50 CONTINUE - A( 1, 1 ) = ONE - DO 60 I = 2, N - A( I, 1 ) = ZERO - 60 CONTINUE - IF( N.GT.1 ) THEN -* -* Generate Q(2:n,2:n) -* - CALL DORGQR( N-1, N-1, N-1, A( 2, 2 ), LDA, TAU, WORK, - $ LWORK, IINFO ) - END IF - END IF - + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( LWORK ) + CALL DORGTR_HELPER( UPLO, N, A, LDA, TAU, WORK, LWORK, INFO ) RETURN -* -* End of DORGTR -* END SUBROUTINE DORGTR +! --- lapack_peeloff end --- ! +! --- lapack_surgery end --- ! END MODULE LAPACK_STD_EIG_1 diff --git a/Source/Modules/LAPACK/LAPACK_STD_EIG_1_HELPER.f b/Source/Modules/LAPACK/LAPACK_STD_EIG_1_HELPER.f new file mode 100644 index 00000000..6a97aca8 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_STD_EIG_1_HELPER.f @@ -0,0 +1,285 @@ +! ################################################################################################################################## + + MODULE LAPACK_STD_EIG_1_HELPER + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE LAPACK_BLAS_AUX + USE LAPACK_MISCEL + + CONTAINS + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DSYEV( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO ) + + CHARACTER JOBZ, UPLO + INTEGER INFO, LDA, LWORK, N + REAL(DOUBLE) A( LDA, * ), W( * ), WORK( * ) + REAL(DOUBLE) ZERO, ONE + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) + LOGICAL LOWER, WANTZ + INTEGER IINFO, IMAX, INDE, INDTAU, INDWRK, ISCALE, + $ LLWORK, LOPT + REAL(DOUBLE) ANRM, BIGNUM, EPS, RMAX, RMIN, SAFMIN, SIGMA, + $ SMLNUM + LOGICAL LSAME + EXTERNAL LSAME + INTEGER ILAENV + EXTERNAL ILAENV + REAL(DOUBLE) DLAMCH + EXTERNAL DLAMCH + INTRINSIC MAX, SQRT + + WANTZ = LSAME( JOBZ, 'V' ) + LOWER = LSAME( UPLO, 'L' ) + + INFO = 0 + IF( .NOT.( WANTZ .OR. LSAME( JOBZ, 'N' ) ) ) THEN + INFO = -1 + ELSE IF( .NOT.( LOWER .OR. LSAME( UPLO, 'U' ) ) ) THEN + INFO = -2 + ELSE IF( N.LT.0 ) THEN + INFO = -3 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -5 + ELSE IF( LWORK.LT.MAX( 1, 3*N-1 ) ) THEN + INFO = -8 + END IF + + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DSYEV ', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) THEN + WORK( 1 ) = 1 + RETURN + END IF + + IF( N.EQ.1 ) THEN + W( 1 ) = A( 1, 1 ) + WORK( 1 ) = 3 + IF( WANTZ ) + $ A( 1, 1 ) = ONE + RETURN + END IF + + SAFMIN = DLAMCH( 'Safe minimum' ) + EPS = DLAMCH( 'Precision' ) + SMLNUM = SAFMIN / EPS + BIGNUM = ONE / SMLNUM + RMIN = SQRT( SMLNUM ) + RMAX = SQRT( BIGNUM ) + + ANRM = DLANSY( 'M', UPLO, N, A, LDA, WORK ) + ISCALE = 0 + IF( ANRM.GT.ZERO .AND. ANRM.LT.RMIN ) THEN + ISCALE = 1 + SIGMA = RMIN / ANRM + ELSE IF( ANRM.GT.RMAX ) THEN + ISCALE = 1 + SIGMA = RMAX / ANRM + END IF + IF( ISCALE.EQ.1 ) + $ CALL DLASCL( UPLO, 0, 0, ONE, SIGMA, N, N, A, LDA, INFO ) + + INDE = 1 + INDTAU = INDE + N + INDWRK = INDTAU + N + LLWORK = LWORK - INDWRK + 1 + CALL DSYTRD( UPLO, N, A, LDA, W, WORK( INDE ), WORK( INDTAU ), + $ WORK( INDWRK ), LLWORK, IINFO ) + LOPT = 2*N + WORK( INDWRK ) + + IF( .NOT.WANTZ ) THEN + CALL DSTERF( N, W, WORK( INDE ), INFO ) + ELSE + CALL DORGTR( UPLO, N, A, LDA, WORK( INDTAU ), WORK( INDWRK ), + $ LLWORK, IINFO ) + CALL DSTEQR( JOBZ, N, W, WORK( INDE ), A, LDA, WORK( INDTAU ), + $ INFO ) + END IF + + IF( ISCALE.EQ.1 ) THEN + IF( INFO.EQ.0 ) THEN + IMAX = N + ELSE + IMAX = INFO - 1 + END IF + CALL DSCAL( IMAX, ONE / SIGMA, W, 1 ) + END IF + + WORK( 1 ) = MAX( 3*N-1, LOPT ) + + RETURN + END SUBROUTINE DSYEV +! --- lapack_peeloff end --- ! + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DSYTRD( UPLO, N, A, LDA, D, E, TAU, WORK, LWORK, INFO ) + + CHARACTER UPLO + INTEGER INFO, LDA, LWORK, N + REAL(DOUBLE) A( LDA, * ), D( * ), E( * ), TAU( * ), + $ WORK( * ) + REAL(DOUBLE) ONE + PARAMETER ( ONE = 1.0D0 ) + LOGICAL UPPER + INTEGER I, IINFO, IWS, J, KK, LDWORK, NB, NBMIN, NX + INTRINSIC MAX + LOGICAL LSAME + EXTERNAL LSAME + INTEGER ILAENV + EXTERNAL ILAENV + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -4 + ELSE IF( LWORK.LT.1 ) THEN + INFO = -9 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DSYTRD', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) THEN + WORK( 1 ) = 1 + RETURN + END IF + + NB = ILAENV( 1, 'DSYTRD', UPLO, N, -1, -1, -1 ) + NX = N + IWS = 1 + IF( NB.GT.1 .AND. NB.LT.N ) THEN + NX = MAX( NB, ILAENV( 3, 'DSYTRD', UPLO, N, -1, -1, -1 ) ) + IF( NX.LT.N ) THEN + LDWORK = N + IWS = LDWORK*NB + IF( LWORK.LT.IWS ) THEN + NB = MAX( LWORK / LDWORK, 1 ) + NBMIN = ILAENV( 2, 'DSYTRD', UPLO, N, -1, -1, -1 ) + IF( NB.LT.NBMIN ) + $ NX = N + END IF + ELSE + NX = N + END IF + ELSE + NB = 1 + END IF + + IF( UPPER ) THEN + KK = N - ( ( N-NX+NB-1 ) / NB )*NB + DO 20 I = N - NB + 1, KK + 1, -NB + CALL DLATRD( UPLO, I+NB-1, NB, A, LDA, E, TAU, WORK, + $ LDWORK ) + CALL DSYR2K( UPLO, 'No transpose', I-1, NB, -ONE, A( 1, I ), + $ LDA, WORK, LDWORK, ONE, A, LDA ) + DO 10 J = I, I + NB - 1 + A( J-1, J ) = E( J-1 ) + D( J ) = A( J, J ) + 10 CONTINUE + 20 CONTINUE + CALL DSYTD2( UPLO, KK, A, LDA, D, E, TAU, IINFO ) + ELSE + DO 40 I = 1, N - NX, NB + CALL DLATRD( UPLO, N-I+1, NB, A( I, I ), LDA, E( I ), + $ TAU( I ), WORK, LDWORK ) + CALL DSYR2K( UPLO, 'No transpose', N-I-NB+1, NB, -ONE, + $ A( I+NB, I ), LDA, WORK( NB+1 ), LDWORK, ONE, + $ A( I+NB, I+NB ), LDA ) + DO 30 J = I, I + NB - 1 + A( J+1, J ) = E( J ) + D( J ) = A( J, J ) + 30 CONTINUE + 40 CONTINUE + CALL DSYTD2( UPLO, N-I+1, A( I, I ), LDA, D( I ), E( I ), + $ TAU( I ), IINFO ) + END IF + + WORK( 1 ) = IWS + + RETURN + END SUBROUTINE DSYTRD +! --- lapack_peeloff end --- ! + +! ################################################################################################################################## + +! --- lapack_peeloff begin --- ! + SUBROUTINE DORGTR( UPLO, N, A, LDA, TAU, WORK, LWORK, INFO ) + + CHARACTER UPLO + INTEGER INFO, LDA, LWORK, N + REAL(DOUBLE) A( LDA, * ), TAU( * ), WORK( LWORK ) + REAL(DOUBLE) ZERO, ONE + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) + LOGICAL UPPER + INTEGER I, IINFO, J + LOGICAL LSAME + EXTERNAL LSAME + INTRINSIC MAX + + INFO = 0 + UPPER = LSAME( UPLO, 'U' ) + IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( LDA.LT.MAX( 1, N ) ) THEN + INFO = -4 + ELSE IF( LWORK.LT.MAX( 1, N-1 ) ) THEN + INFO = -7 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DORGTR', -INFO ) + RETURN + END IF + + IF( N.EQ.0 ) THEN + WORK( 1 ) = 1 + RETURN + END IF + + IF( UPPER ) THEN + DO 20 J = 1, N - 1 + DO 10 I = 1, J - 1 + A( I, J ) = A( I, J+1 ) + 10 CONTINUE + A( N, J ) = ZERO + 20 CONTINUE + DO 30 I = 1, N - 1 + A( I, N ) = ZERO + 30 CONTINUE + A( N, N ) = ONE + CALL DORGQL( N-1, N-1, N-1, A, LDA, TAU, WORK, LWORK, IINFO ) + ELSE + DO 50 J = N, 2, -1 + A( 1, J ) = ZERO + DO 40 I = J + 1, N + A( I, J ) = A( I, J-1 ) + 40 CONTINUE + 50 CONTINUE + A( 1, 1 ) = ONE + DO 60 I = 2, N + A( I, 1 ) = ZERO + 60 CONTINUE + IF( N.GT.1 ) THEN + CALL DORGQR( N-1, N-1, N-1, A( 2, 2 ), LDA, TAU, WORK, + $ LWORK, IINFO ) + END IF + END IF + + RETURN + END SUBROUTINE DORGTR +! --- lapack_peeloff end --- ! + + END MODULE LAPACK_STD_EIG_1_HELPER diff --git a/Source/Modules/LAPACK/LAPACK_STD_EIG_1_ext.f90 b/Source/Modules/LAPACK/LAPACK_STD_EIG_1_ext.f90 new file mode 100644 index 00000000..992db3c1 --- /dev/null +++ b/Source/Modules/LAPACK/LAPACK_STD_EIG_1_ext.f90 @@ -0,0 +1,20 @@ +module LAPACK_STD_EIG_1 + + use PENTIUM_II_KIND, only : BYTE, LONG, DOUBLE + + implicit none + + interface + subroutine DSYEV(JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO) + use PENTIUM_II_KIND, only : LONG, DOUBLE + implicit none + character(1), intent(in) :: JOBZ, UPLO + integer(LONG), intent(in) :: N, LDA, LWORK + integer(LONG), intent(out) :: INFO + real(DOUBLE), intent(inout) :: A(LDA,*) + real(DOUBLE), intent(out) :: W(*) + real(DOUBLE), intent(inout) :: WORK(*) + end subroutine DSYEV + end interface + +end module LAPACK_STD_EIG_1 diff --git a/Source/Modules/LAPACK/LAPACK_SYM_MAT_INV.f b/Source/Modules/LAPACK/LAPACK_SYM_MAT_INV.f index 9f942c4e..e7631370 100644 --- a/Source/Modules/LAPACK/LAPACK_SYM_MAT_INV.f +++ b/Source/Modules/LAPACK/LAPACK_SYM_MAT_INV.f @@ -1,16 +1,15 @@ ! ################################################################################################################################## MODULE LAPACK_SYM_MAT_INV +! --- lapack_surgery begin --- ! USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE SCONTR, ONLY : BLNK_SUB_NAM - USE TIMDAT, ONLY : HOUR, MINUTE, SEC, - & TSEC USE LAPACK_BLAS_AUX - USE LAPACK_LIN_EQN_DPB ! Subr DPOTF2 - - USE OURTIM_Interface + USE LAPACK_DLAUUM_HELPER, ONLY : DLAUUM_HELPER => DLAUUM + USE LAPACK_DPOTRF_HELPER, ONLY : DPOTRF_HELPER => DPOTRF + USE LAPACK_DPOTRI_HELPER, ONLY : DPOTRI_HELPER => DPOTRI + USE LAPACK_DTRTI2_HELPER, ONLY : DTRTI2_HELPER => DTRTI2 ! This is a set of LAPACK routines that are used in inverting symmetric matrices (not band matrices) @@ -19,458 +18,46 @@ MODULE LAPACK_SYM_MAT_INV ! ################################################################################################################################## ! 001 LAPACK_SYM_MAT_INV +! --- lapack_peeloff begin --- ! SUBROUTINE DPOTRF( UPLO, N, A, LDA, INFO ) -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, LDA, N -* .. -* .. Array Arguments .. DOUBLE PRECISION A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DPOTRF computes the Cholesky factorization of a real symmetric -* positive definite matrix A. -* -* The factorization has the form -* A = U**T * U, if UPLO = 'U', or -* A = L * L**T, if UPLO = 'L', -* where U is an upper triangular matrix and L is lower triangular. -* -* This is the block version of the algorithm, calling Level 3 BLAS. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangle of A is stored; -* = 'L': Lower triangle of A is stored. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) -* On entry, the symmetric matrix A. If UPLO = 'U', the leading -* N-by-N upper triangular part of A contains the upper -* triangular part of the matrix A, and the strictly lower -* triangular part of A is not referenced. If UPLO = 'L', the -* leading N-by-N lower triangular part of A contains the lower -* triangular part of the matrix A, and the strictly upper -* triangular part of A is not referenced. -* -* On exit, if INFO = 0, the factor U or L from the Cholesky -* factorization A = U**T*U or A = L*L**T. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, the leading minor of order i is not -* positive definite, and the factorization could not be -* completed. -* -* ===================================================================== -* -* .. Parameters .. - DOUBLE PRECISION ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER J, JB, NB -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - - INTEGER ILAENV - EXTERNAL ILAENV - -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -4 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DPOTRF', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* -* Determine the block size for this environment. -* - NB = ILAENV( 1, 'DPOTRF', UPLO, N, -1, -1, -1 ) - IF( NB.LE.1 .OR. NB.GE.N ) THEN -* -* Use unblocked code. -* - CALL DPOTF2( UPLO, N, A, LDA, INFO ) - ELSE -* -* Use blocked code. -* - IF( UPPER ) THEN -* -* Compute the Cholesky factorization A = U'*U. -* - DO 10 J = 1, N, NB -* -* Update and factorize the current diagonal block and test -* for non-positive-definiteness. -* - JB = MIN( NB, N-J+1 ) - CALL DSYRK( 'Upper', 'Transpose', JB, J-1, -ONE, - $ A( 1, J ), LDA, ONE, A( J, J ), LDA ) - CALL DPOTF2( 'Upper', JB, A( J, J ), LDA, INFO ) - IF( INFO.NE.0 ) - $ GO TO 30 - IF( J+JB.LE.N ) THEN -* -* Compute the current block row. -* - CALL DGEMM( 'Transpose', 'No transpose', JB, N-J-JB+1, - $ J-1, -ONE, A( 1, J ), LDA, A( 1, J+JB ), - $ LDA, ONE, A( J, J+JB ), LDA ) - CALL DTRSM( 'Left', 'Upper', 'Transpose', 'Non-unit', - $ JB, N-J-JB+1, ONE, A( J, J ), LDA, - $ A( J, J+JB ), LDA ) - END IF - 10 CONTINUE -* - ELSE -* -* Compute the Cholesky factorization A = L*L'. -* - DO 20 J = 1, N, NB -* -* Update and factorize the current diagonal block and test -* for non-positive-definiteness. -* - JB = MIN( NB, N-J+1 ) - CALL DSYRK( 'Lower', 'No transpose', JB, J-1, -ONE, - $ A( J, 1 ), LDA, ONE, A( J, J ), LDA ) - CALL DPOTF2( 'Lower', JB, A( J, J ), LDA, INFO ) - IF( INFO.NE.0 ) - $ GO TO 30 - IF( J+JB.LE.N ) THEN -* -* Compute the current block column. -* - CALL DGEMM( 'No transpose', 'Transpose', N-J-JB+1, JB, - $ J-1, -ONE, A( J+JB, 1 ), LDA, A( J, 1 ), - $ LDA, ONE, A( J+JB, J ), LDA ) - CALL DTRSM( 'Right', 'Lower', 'Transpose', 'Non-unit', - $ N-J-JB+1, JB, ONE, A( J, J ), LDA, - $ A( J+JB, J ), LDA ) - END IF - 20 CONTINUE - END IF - END IF - GO TO 40 -* - 30 CONTINUE - INFO = INFO + J - 1 -* - 40 CONTINUE + CALL DPOTRF_HELPER( UPLO, N, A, LDA, INFO ) RETURN -* -* End of DPOTRF -* END SUBROUTINE DPOTRF +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 002 LAPACK_SYM_MAT_INV +! --- lapack_peeloff begin --- ! SUBROUTINE DPOTRI( UPLO, N, A, LDA, INFO ) -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* March 31, 1993 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, LDA, N -* .. -* .. Array Arguments .. DOUBLE PRECISION A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DPOTRI computes the inverse of a real symmetric positive definite -* matrix A using the Cholesky factorization A = U**T*U or A = L*L**T -* computed by DPOTRF. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* = 'U': Upper triangle of A is stored; -* = 'L': Lower triangle of A is stored. -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) -* On entry, the triangular factor U or L from the Cholesky -* factorization A = U**T*U or A = L*L**T, as computed by -* DPOTRF. -* On exit, the upper or lower triangle of the (symmetric) -* inverse of A, overwriting the input factor U or L. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -i, the i-th argument had an illegal value -* > 0: if INFO = i, the (i,i) element of the factor U or L is -* zero, and the inverse could not be computed. -* -* ===================================================================== -* -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - IF( .NOT.LSAME( UPLO, 'U' ) .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -4 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DPOTRI', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* -* Invert the triangular Cholesky factor U or L. -* - CALL DTRTRI( UPLO, 'Non-unit', N, A, LDA, INFO ) - IF( INFO.GT.0 ) - $ RETURN -* -* Form inv(U)*inv(U)' or inv(L)'*inv(L). -* - CALL DLAUUM( UPLO, N, A, LDA, INFO ) -* + CALL DPOTRI_HELPER( UPLO, N, A, LDA, INFO ) RETURN -* -* End of DPOTRI -* END SUBROUTINE DPOTRI +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 003 LAPACK_SYM_MAT_INV +! --- lapack_peeloff begin --- ! SUBROUTINE DLAUUM( UPLO, N, A, LDA, INFO ) -* -* -- LAPACK auxiliary routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. CHARACTER UPLO INTEGER INFO, LDA, N -* .. -* .. Array Arguments .. DOUBLE PRECISION A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DLAUUM computes the product U * U' or L' * L, where the triangular -* factor U or L is stored in the upper or lower triangular part of -* the array A. -* -* If UPLO = 'U' or 'u' then the upper triangle of the result is stored, -* overwriting the factor U in A. -* If UPLO = 'L' or 'l' then the lower triangle of the result is stored, -* overwriting the factor L in A. -* -* This is the blocked form of the algorithm, calling Level 3 BLAS. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* Specifies whether the triangular factor stored in the array A -* is upper or lower triangular: -* = 'U': Upper triangular -* = 'L': Lower triangular -* -* N (input) INTEGER -* The order of the triangular factor U or L. N >= 0. -* -* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) -* On entry, the triangular factor U or L. -* On exit, if UPLO = 'U', the upper triangle of A is -* overwritten with the upper triangle of the product U * U'; -* if UPLO = 'L', the lower triangle of A is overwritten with -* the lower triangle of the product L' * L. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -k, the k-th argument had an illegal value -* -* ===================================================================== -* -* .. Parameters .. - DOUBLE PRECISION ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL UPPER - INTEGER I, IB, NB -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME - - INTEGER ILAENV - EXTERNAL ILAENV - -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX, MIN -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( N.LT.0 ) THEN - INFO = -2 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -4 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DLAUUM', -INFO ) - RETURN - END IF -* -* Quick return if possible -* - IF( N.EQ.0 ) - $ RETURN -* -* Determine the block size for this environment. -* - NB = ILAENV( 1, 'DLAUUM', UPLO, N, -1, -1, -1 ) -* - IF( NB.LE.1 .OR. NB.GE.N ) THEN -* -* Use unblocked code -* - CALL DLAUU2( UPLO, N, A, LDA, INFO ) - ELSE -* -* Use blocked code -* - IF( UPPER ) THEN -* -* Compute the product U * U'. -* - DO 10 I = 1, N, NB - IB = MIN( NB, N-I+1 ) - CALL DTRMM( 'Right', 'Upper', 'Transpose', 'Non-unit', - $ I-1, IB, ONE, A( I, I ), LDA, A( 1, I ), - $ LDA ) - CALL DLAUU2( 'Upper', IB, A( I, I ), LDA, INFO ) - IF( I+IB.LE.N ) THEN - CALL DGEMM( 'No transpose', 'Transpose', I-1, IB, - $ N-I-IB+1, ONE, A( 1, I+IB ), LDA, - $ A( I, I+IB ), LDA, ONE, A( 1, I ), LDA ) - CALL DSYRK( 'Upper', 'No transpose', IB, N-I-IB+1, - $ ONE, A( I, I+IB ), LDA, ONE, A( I, I ), - $ LDA ) - END IF - 10 CONTINUE - ELSE -* -* Compute the product L' * L. -* - DO 20 I = 1, N, NB - IB = MIN( NB, N-I+1 ) - CALL DTRMM( 'Left', 'Lower', 'Transpose', 'Non-unit', IB, - $ I-1, ONE, A( I, I ), LDA, A( I, 1 ), LDA ) - CALL DLAUU2( 'Lower', IB, A( I, I ), LDA, INFO ) - IF( I+IB.LE.N ) THEN - CALL DGEMM( 'Transpose', 'No transpose', IB, I-1, - $ N-I-IB+1, ONE, A( I+IB, I ), LDA, - $ A( I+IB, 1 ), LDA, ONE, A( I, 1 ), LDA ) - CALL DSYRK( 'Lower', 'Transpose', IB, N-I-IB+1, ONE, - $ A( I+IB, I ), LDA, ONE, A( I, I ), LDA ) - END IF - 20 CONTINUE - END IF - END IF -* + CALL DLAUUM_HELPER( UPLO, N, A, LDA, INFO ) RETURN -* -* End of DLAUUM -* END SUBROUTINE DLAUUM +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 005 LAPACK_SYM_MAT_INV +! --- lapack_peeloff begin --- ! SUBROUTINE DLAUU2( UPLO, N, A, LDA, INFO ) * * -- LAPACK auxiliary routine (version 3.0) -- @@ -607,156 +194,21 @@ SUBROUTINE DLAUU2( UPLO, N, A, LDA, INFO ) * End of DLAUU2 * END SUBROUTINE DLAUU2 +! --- lapack_peeloff end --- ! ! ################################################################################################################################## ! 006 LAPACK_SYM_MAT_INV +! --- lapack_peeloff begin --- ! SUBROUTINE DTRTI2( UPLO, DIAG, N, A, LDA, INFO ) -* -* -- LAPACK routine (version 3.0) -- -* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., -* Courant Institute, Argonne National Lab, and Rice University -* February 29, 1992 -* -* .. Scalar Arguments .. CHARACTER DIAG, UPLO INTEGER INFO, LDA, N -* .. -* .. Array Arguments .. DOUBLE PRECISION A( LDA, * ) -* .. -* -* Purpose -* ======= -* -* DTRTI2 computes the inverse of a real upper or lower triangular -* matrix. -* -* This is the Level 2 BLAS version of the algorithm. -* -* Arguments -* ========= -* -* UPLO (input) CHARACTER*1 -* Specifies whether the matrix A is upper or lower triangular. -* = 'U': Upper triangular -* = 'L': Lower triangular -* -* DIAG (input) CHARACTER*1 -* Specifies whether or not the matrix A is unit triangular. -* = 'N': Non-unit triangular -* = 'U': Unit triangular -* -* N (input) INTEGER -* The order of the matrix A. N >= 0. -* -* A (input/output) DOUBLE PRECISION array, dimension (LDA,N) -* On entry, the triangular matrix A. If UPLO = 'U', the -* leading n by n upper triangular part of the array A contains -* the upper triangular matrix, and the strictly lower -* triangular part of A is not referenced. If UPLO = 'L', the -* leading n by n lower triangular part of the array A contains -* the lower triangular matrix, and the strictly upper -* triangular part of A is not referenced. If DIAG = 'U', the -* diagonal elements of A are also not referenced and are -* assumed to be 1. -* -* On exit, the (triangular) inverse of the original matrix, in -* the same storage format. -* -* LDA (input) INTEGER -* The leading dimension of the array A. LDA >= max(1,N). -* -* INFO (output) INTEGER -* = 0: successful exit -* < 0: if INFO = -k, the k-th argument had an illegal value -* -* ===================================================================== -* -* .. Parameters .. - DOUBLE PRECISION ONE - PARAMETER ( ONE = 1.0D+0 ) -* .. -* .. Local Scalars .. - LOGICAL NOUNIT, UPPER - INTEGER J - DOUBLE PRECISION AJJ -* .. -* .. External Functions .. - LOGICAL LSAME - EXTERNAL LSAME -* .. -* .. External Subroutines .. -* .. -* .. Intrinsic Functions .. - INTRINSIC MAX -* .. -* .. Executable Statements .. -* -* Test the input parameters. -* - INFO = 0 - UPPER = LSAME( UPLO, 'U' ) - NOUNIT = LSAME( DIAG, 'N' ) - IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN - INFO = -1 - ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN - INFO = -2 - ELSE IF( N.LT.0 ) THEN - INFO = -3 - ELSE IF( LDA.LT.MAX( 1, N ) ) THEN - INFO = -5 - END IF - IF( INFO.NE.0 ) THEN - CALL XERBLA( 'DTRTI2', -INFO ) - RETURN - END IF -* - IF( UPPER ) THEN -* -* Compute inverse of upper triangular matrix. -* - DO 10 J = 1, N - IF( NOUNIT ) THEN - A( J, J ) = ONE / A( J, J ) - AJJ = -A( J, J ) - ELSE - AJJ = -ONE - END IF -* -* Compute elements 1:j-1 of j-th column. -* - CALL DTRMV( 'Upper', 'No transpose', DIAG, J-1, A, LDA, - $ A( 1, J ), 1 ) - CALL DSCAL( J-1, AJJ, A( 1, J ), 1 ) - 10 CONTINUE - ELSE -* -* Compute inverse of lower triangular matrix. -* - DO 20 J = N, 1, -1 - IF( NOUNIT ) THEN - A( J, J ) = ONE / A( J, J ) - AJJ = -A( J, J ) - ELSE - AJJ = -ONE - END IF - IF( J.LT.N ) THEN -* -* Compute elements j+1:n of j-th column. -* - CALL DTRMV( 'Lower', 'No transpose', DIAG, N-J, - $ A( J+1, J+1 ), LDA, A( J+1, J ), 1 ) - CALL DSCAL( N-J, AJJ, A( J+1, J ), 1 ) - END IF - 20 CONTINUE - END IF -* + CALL DTRTI2_HELPER( UPLO, DIAG, N, A, LDA, INFO ) RETURN -* -* End of DTRTI2 -* END SUBROUTINE DTRTI2 +! --- lapack_peeloff end --- ! +! --- lapack_surgery end --- ! END MODULE LAPACK_SYM_MAT_INV diff --git a/Source/Modules/PARAMS.f90 b/Source/Modules/PARAMS.f90 index bb894b2d..bf12ea31 100644 --- a/Source/Modules/PARAMS.f90 +++ b/Source/Modules/PARAMS.f90 @@ -48,6 +48,10 @@ MODULE PARAMS ! NOTE: if a value of -1. is input on a PARAM ARP_TOL entry, then the ! Lanczos algorithm will use machine precision for ARP_TOL +! ---------------------------------------------------------------------------------------------------------------------------------- +! --- chase_feast_add --- begin ! + CHARACTER( 8*BYTE) :: LANCMETH = ' '! Deprecated alias for EIGRL extract method selection. +! --- chase_feast_add --- end ! ! ---------------------------------------------------------------------------------------------------------------------------------- CHARACTER( 1*BYTE) :: ART_KED = 'N' ! Indicates whether to add artificial differ stiff terms to KED REAL(DOUBLE) :: ART_TRAN_KED = ONEPM6 ! Artificial differ stiff for translational DOF's @@ -74,6 +78,9 @@ MODULE PARAMS ! ---------------------------------------------------------------------------------------------------------------------------------- INTEGER(LONG) :: BAILOUT = 0 ! If >= 0 quit if a singularity in decomposing a matrix is detected +! ---------------------------------------------------------------------------------------------------------------------------------- + CHARACTER( 1*BYTE) :: BANDEDOPT = 'N' ! 'Y'/'N' enable experimental banded-order optimization path + ! ---------------------------------------------------------------------------------------------------------------------------------- REAL(DOUBLE) :: CBMIN3 = TWO ! Trans shear factor for MIN3 triangle elems (TRIA3) @@ -161,6 +168,7 @@ MODULE PARAMS ! BANDIT for bandit auto grid swquencing ! GRID for grid numerical order ! INPUT for grid input order +! RCM for reserved/add-on RCM path (currently mapped to INPUT flow) CHARACTER( 1*BYTE) :: SEQQUIT = 'N' !*'Y', 'N' indicator to stop processing if G.P. auto sequencing failed ! (goes in field 4 of PARAM GRIDSEQ entry) CHARACTER( 1*BYTE) :: SEQPRT = 'N' !*'Y', 'N' indicator to print SEQGP card images from bandit @@ -482,7 +490,11 @@ MODULE PARAMS INTEGER(LONG) :: USR_JCT = 0 ! User supplied (PARAM B.D. card) value for JCT - used in sort subr's ! ---------------------------------------------------------------------------------------------------------------------------------- - REAL(DOUBLE) :: WINAMEM =2147.483647! Max MB of memory that Windows XP allows for arrays +! --- BANDED_optimizisation -begin-- ! + REAL(DOUBLE) :: WINAMEM = ZERO ! Optional per-array MB cap. If > 0, large allocations are blocked +! before ALLOCATE; if <= 0, rely on 64-bit OS/compiler allocation +! failure handling instead of the old Windows XP 2 GB cap. +! --- BANDED_optimizisation -end-- ! ! ---------------------------------------------------------------------------------------------------------------------------------- REAL(DOUBLE) :: WTMASS = ONE ! Value from PARAM WTMASS Bulk Data card diff --git a/Source/USE_IFs/EIG_INV_PWR_USE_IFs.f90 b/Source/USE_IFs/EIG_INV_PWR_USE_IFs.f90 index 0e5b084b..6cbd798a 100644 --- a/Source/USE_IFs/EIG_INV_PWR_USE_IFs.f90 +++ b/Source/USE_IFs/EIG_INV_PWR_USE_IFs.f90 @@ -32,6 +32,9 @@ MODULE EIG_INV_PWR_USE_IFs USE ALLOCATE_SPARSE_MAT_Interface USE MATADD_SSS_Interface USE ALLOCATE_EIGEN1_MAT_Interface +! !--- matrix classifier --- begin! + USE BANDGEN_LAPACK_DGB_Interface +! !--- matrix classifier --- end! USE SYM_MAT_DECOMP_LAPACK_Interface USE SYM_MAT_DECOMP_SUPRLU_Interface USE OUTA_HERE_Interface diff --git a/Source/USE_IFs/SEQ_PROC_USE_IFs.f90 b/Source/USE_IFs/SEQ_PROC_USE_IFs.f90 index 2c7b00a4..d905c832 100644 --- a/Source/USE_IFs/SEQ_PROC_USE_IFs.f90 +++ b/Source/USE_IFs/SEQ_PROC_USE_IFs.f90 @@ -27,9 +27,12 @@ MODULE SEQ_PROC_USE_IFs ! USE Interface statements for all subroutines called by SUBROUTINE SEQ_PROC +! !--- RCM BANDED ADD-ON --- begin! USE OURTIM_Interface USE GET_ARRAY_ROW_NUM_Interface + USE GET_ELGP_Interface USE OUTA_HERE_Interface +! !--- RCM BANDED ADD-ON --- end! USE SORT_INT1_REAL1_Interface USE SORT_REAL1_INT1_Interface USE SORT_INT2_REAL1_Interface diff --git a/Source/UTIL/ALLOCATE_LAPACK_MAT.f90 b/Source/UTIL/ALLOCATE_LAPACK_MAT.f90 index fd79255f..47d62728 100644 --- a/Source/UTIL/ALLOCATE_LAPACK_MAT.f90 +++ b/Source/UTIL/ALLOCATE_LAPACK_MAT.f90 @@ -34,7 +34,7 @@ SUBROUTINE ALLOCATE_LAPACK_MAT ( NAME, NROWS, NCOLS, CALLING_SUBR ) USE SCONTR, ONLY : BLNK_SUB_NAM, FATAL_ERR, TOT_MB_MEM_ALLOC USE TIMDAT, ONLY : TSEC USE DEBUG_PARAMETERS, ONLY : DEBUG - USE PARAMS, ONLY : WINAMEM + USE PARAMS, ONLY : SUPINFO, WINAMEM USE ARPACK_MATRICES_1 , ONLY : IWORK, RFAC, RESID, SELECT, VBAS, WORKD, WORKL USE LAPACK_DPB_MATRICES, ONLY : ABAND, BBAND, LAPACK_S, RES @@ -85,11 +85,20 @@ SUBROUTINE ALLOCATE_LAPACK_MAT ( NAME, NROWS, NCOLS, CALLING_SUBR ) ELSE MB_ALLOCATED = REAL(DOUBLE)*RNROWS*RNCOLS/ONEPP6 WRITE(SC1,9199) NAME, MB_ALLOCATED - IF (MB_ALLOCATED > WINAMEM) THEN +! --- BANDED_optimizisation -begin-- ! + IF (WINAMEM > ZERO) THEN + WRITE(ERR,941) NAME, NROWS, NCOLS, MB_ALLOCATED, WINAMEM + IF (SUPINFO == 'N') WRITE(F06,941) NAME, NROWS, NCOLS, MB_ALLOCATED, WINAMEM + ELSE + WRITE(ERR,942) NAME, NROWS, NCOLS, MB_ALLOCATED + IF (SUPINFO == 'N') WRITE(F06,942) NAME, NROWS, NCOLS, MB_ALLOCATED + ENDIF + IF ((WINAMEM > ZERO) .AND. (MB_ALLOCATED > WINAMEM)) THEN WRITE(ERR,940) MB_ALLOCATED, NAME, WINAMEM WRITE(F06,940) MB_ALLOCATED, NAME, WINAMEM CALL OUTA_HERE ( 'Y' ) ENDIF +! --- BANDED_optimizisation -end-- ! ALLOCATE (ABAND(NROWS,NCOLS),STAT=IERR) IF (IERR == 0) THEN !xx WRITE(SC1, * ) ! Advance 1 line for screen messages @@ -119,6 +128,20 @@ SUBROUTINE ALLOCATE_LAPACK_MAT ( NAME, NROWS, NCOLS, CALLING_SUBR ) ELSE MB_ALLOCATED = REAL(DOUBLE)*RNROWS*RNCOLS/ONEPP6 WRITE(SC1,9199) NAME, MB_ALLOCATED +! --- BANDED_optimizisation -begin-- ! + IF (WINAMEM > ZERO) THEN + WRITE(ERR,941) NAME, NROWS, NCOLS, MB_ALLOCATED, WINAMEM + IF (SUPINFO == 'N') WRITE(F06,941) NAME, NROWS, NCOLS, MB_ALLOCATED, WINAMEM + ELSE + WRITE(ERR,942) NAME, NROWS, NCOLS, MB_ALLOCATED + IF (SUPINFO == 'N') WRITE(F06,942) NAME, NROWS, NCOLS, MB_ALLOCATED + ENDIF + IF ((WINAMEM > ZERO) .AND. (MB_ALLOCATED > WINAMEM)) THEN + WRITE(ERR,940) MB_ALLOCATED, NAME, WINAMEM + WRITE(F06,940) MB_ALLOCATED, NAME, WINAMEM + CALL OUTA_HERE ( 'Y' ) + ENDIF +! --- BANDED_optimizisation -end-- ! ALLOCATE (BBAND(NROWS,NCOLS),STAT=IERR) IF (IERR == 0) THEN !xx WRITE(SC1, * ) ! Advance 1 line for screen messages @@ -260,6 +283,20 @@ SUBROUTINE ALLOCATE_LAPACK_MAT ( NAME, NROWS, NCOLS, CALLING_SUBR ) ELSE MB_ALLOCATED = REAL(DOUBLE)*RNROWS*RNCOLS/ONEPP6 WRITE(SC1,9199) NAME, MB_ALLOCATED +! --- BANDED_optimizisation -begin-- ! + IF (WINAMEM > ZERO) THEN + WRITE(ERR,941) NAME, NROWS, NCOLS, MB_ALLOCATED, WINAMEM + IF (SUPINFO == 'N') WRITE(F06,941) NAME, NROWS, NCOLS, MB_ALLOCATED, WINAMEM + ELSE + WRITE(ERR,942) NAME, NROWS, NCOLS, MB_ALLOCATED + IF (SUPINFO == 'N') WRITE(F06,942) NAME, NROWS, NCOLS, MB_ALLOCATED + ENDIF + IF ((WINAMEM > ZERO) .AND. (MB_ALLOCATED > WINAMEM)) THEN + WRITE(ERR,940) MB_ALLOCATED, NAME, WINAMEM + WRITE(F06,940) MB_ALLOCATED, NAME, WINAMEM + CALL OUTA_HERE ( 'Y' ) + ENDIF +! --- BANDED_optimizisation -end-- ! ALLOCATE (RFAC(NROWS,NCOLS),STAT=IERR) IF (IERR == 0) THEN !xx WRITE(SC1, * ) ! Advance 1 line for screen messages @@ -379,9 +416,17 @@ SUBROUTINE ALLOCATE_LAPACK_MAT ( NAME, NROWS, NCOLS, CALLING_SUBR ) 915 FORMAT(' *ERROR 915: PROGRAMMING ERROR IN SUBROUTINE ',A & ,/,14X,' NAME OF ARRAY TO BE ',A,' IS INCORRECT. INPUT NAME WAS ',A) - 940 FORMAT(' *ERROR 940: ATTEMPT TO ALLOCATE ',F10.3,' MB OF MEMORY TO ARRAY ',A,' EXCEEDING WINDOWS MAX MEMORY ALLOWED OF ', & + 940 FORMAT(' *ERROR 940: ATTEMPT TO ALLOCATE ',F10.3,' MB OF MEMORY TO ARRAY ',A,' EXCEEDS PARAM WINAMEM LIMIT OF ', & F10.3,' MB') +! --- BANDED_optimizisation -begin-- ! + 941 FORMAT(' *INFORMATION: BANDED ALLOCATION REQUEST FOR ',A,': ROWS = ',I12,', COLS = ',I12,', ESTIMATED MB = ',F13.3, & + /,14X,' PARAM WINAMEM LIMIT IS ACTIVE AT ',F13.3,' MB; EXCEEDANCE IS FATAL FOR BANDED ALLOCATION') + + 942 FORMAT(' *INFORMATION: BANDED ALLOCATION REQUEST FOR ',A,': ROWS = ',I12,', COLS = ',I12,', ESTIMATED MB = ',F13.3, & + /,14X,' PARAM WINAMEM LIMIT IS DISABLED; ALLOCATE/STAT WILL REPORT MEMORY FAILURE') +! --- BANDED_optimizisation -end-- ! + 990 FORMAT(' *ERROR 990: PROGRAMMING ERROR IN SUBROUTINE ',A & ,/,14X,' CANNOT ALLOCATE MEMORY TO ARRAY ',A,'. IT IS ALREADY ALLOCATED') diff --git a/Source/UTIL/REPORT_BANDED_STORAGE_ESTIMATE.f90 b/Source/UTIL/REPORT_BANDED_STORAGE_ESTIMATE.f90 new file mode 100644 index 00000000..ad044d95 --- /dev/null +++ b/Source/UTIL/REPORT_BANDED_STORAGE_ESTIMATE.f90 @@ -0,0 +1,116 @@ +! ################################################################################################################################## +! Begin MIT license text. +! _______________________________________________________________________________________________________ + +! Copyright 2022 Dr William R Case, Jr (mystransolver@gmail.com) + +! Permission is hereby granted, free of charge, to any person obtaining a copy of this software and +! associated documentation files (the "Software"), to deal in the Software without restriction, including +! without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +! copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to +! the following conditions: + +! The above copyright notice and this permission notice shall be included in all copies or substantial +! portions of the Software and documentation. + +! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +! OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +! FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +! AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +! LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +! OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +! THE SOFTWARE. +! _______________________________________________________________________________________________________ + +! End MIT license text. + + SUBROUTINE REPORT_BANDED_STORAGE_ESTIMATE ( MAT_NAME, NROWS, NTERMS, I_MAT, J_MAT, BAND_WIDTH, SUPINFO_IN ) + +! --- BANDED_optimizisation -begin-- ! +! Reports Stage 1 storage estimates for banded solver planning. This routine does not change solver dispatch. +! --- BANDED_optimizisation -end-- ! + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE CONSTANTS_1, ONLY : ZERO, ONEPP6 + USE IOUNT1, ONLY : ERR, F06 + + IMPLICIT NONE + + CHARACTER(LEN=*), INTENT(IN) :: MAT_NAME + CHARACTER(LEN=*), INTENT(IN) :: SUPINFO_IN + + INTEGER(LONG), INTENT(IN) :: BAND_WIDTH + INTEGER(LONG), INTENT(IN) :: NROWS + INTEGER(LONG), INTENT(IN) :: NTERMS + INTEGER(LONG), INTENT(IN) :: I_MAT(NROWS+1) + INTEGER(LONG), INTENT(IN) :: J_MAT(NTERMS) + + INTEGER(LONG) :: I + INTEGER(LONG) :: IERR + INTEGER(LONG) :: J + INTEGER(LONG) :: K + INTEGER(LONG) :: PROFILE_HEIGHT + INTEGER(LONG) :: PROFILE_MAX_HEIGHT + INTEGER(LONG), ALLOCATABLE :: PROFILE_FIRST_ROW(:) + + REAL(DOUBLE) :: BAND_MB_EST + REAL(DOUBLE) :: BAND_TERMS_EST + REAL(DOUBLE) :: CSR_MB_EST + REAL(DOUBLE) :: PROFILE_MB_EST + REAL(DOUBLE) :: PROFILE_TERMS_EST + +! ********************************************************************************************************************************** + BAND_TERMS_EST = REAL(BAND_WIDTH,DOUBLE)*REAL(NROWS,DOUBLE) + BAND_MB_EST = REAL(DOUBLE,DOUBLE)*BAND_TERMS_EST/ONEPP6 + CSR_MB_EST = (REAL(LONG,DOUBLE)*REAL(NROWS+1,DOUBLE) + REAL(LONG+DOUBLE,DOUBLE)*REAL(NTERMS,DOUBLE))/ONEPP6 + PROFILE_TERMS_EST = ZERO + PROFILE_MB_EST = ZERO + PROFILE_MAX_HEIGHT = 0 + + ALLOCATE (PROFILE_FIRST_ROW(NROWS),STAT=IERR) + IF (IERR == 0) THEN + DO I=1,NROWS + PROFILE_FIRST_ROW(I) = I + ENDDO + + K = 0 + DO I=1,NROWS + DO J=1,I_MAT(I+1)-I_MAT(I) + K = K + 1 + IF ((K <= NTERMS) .AND. (J_MAT(K) >= 1) .AND. (J_MAT(K) <= NROWS)) THEN + IF (I < PROFILE_FIRST_ROW(J_MAT(K))) THEN + PROFILE_FIRST_ROW(J_MAT(K)) = I + ENDIF + ENDIF + ENDDO + ENDDO + + DO I=1,NROWS + PROFILE_HEIGHT = I - PROFILE_FIRST_ROW(I) + 1 + IF (PROFILE_HEIGHT < 1) PROFILE_HEIGHT = 1 + PROFILE_TERMS_EST = PROFILE_TERMS_EST + REAL(PROFILE_HEIGHT,DOUBLE) + IF (PROFILE_HEIGHT > PROFILE_MAX_HEIGHT) PROFILE_MAX_HEIGHT = PROFILE_HEIGHT + ENDDO + + PROFILE_MB_EST = REAL(DOUBLE,DOUBLE)*PROFILE_TERMS_EST/ONEPP6 + DEALLOCATE (PROFILE_FIRST_ROW) + ENDIF + + WRITE(ERR,3095) MAT_NAME, NROWS, NTERMS, BAND_WIDTH, BAND_TERMS_EST, BAND_MB_EST, CSR_MB_EST, & + PROFILE_TERMS_EST, PROFILE_MB_EST, PROFILE_MAX_HEIGHT + IF (SUPINFO_IN == 'N') THEN + WRITE(F06,3095) MAT_NAME, NROWS, NTERMS, BAND_WIDTH, BAND_TERMS_EST, BAND_MB_EST, CSR_MB_EST, & + PROFILE_TERMS_EST, PROFILE_MB_EST, PROFILE_MAX_HEIGHT + ENDIF + + RETURN + +! ********************************************************************************************************************************** + 3095 FORMAT(' *INFORMATION: STORAGE ESTIMATE FOR MATRIX ',A11,': N = ',I12,', STORED NNZ = ',I12, & + /,14X,' BAND WIDTH = ',I12,', COMPACT BAND TERMS = ',1ES13.6,', COMPACT BAND MB = ',F13.3, & + /,14X,' CSR MEMORY ESTIMATE MB = ',F13.3,', SKYLINE/PROFILE TERMS = ',1ES13.6,', SKYLINE/PROFILE MB = ',F13.3,& + /,14X,' MAX SKYLINE/PROFILE HEIGHT = ',I12,' (ESTIMATOR ONLY; NO SOLVER DISPATCH CHANGE)',/) + +! ********************************************************************************************************************************** + + END SUBROUTINE REPORT_BANDED_STORAGE_ESTIMATE diff --git a/Source/UTIL/REPORT_SOLVER_DISPATCH_POLICY.f90 b/Source/UTIL/REPORT_SOLVER_DISPATCH_POLICY.f90 new file mode 100644 index 00000000..d8e3e030 --- /dev/null +++ b/Source/UTIL/REPORT_SOLVER_DISPATCH_POLICY.f90 @@ -0,0 +1,68 @@ +! ################################################################################################################################## +! Begin MIT license text. +! _______________________________________________________________________________________________________ + +! Copyright 2022 Dr William R Case, Jr (mystransolver@gmail.com) + +! Permission is hereby granted, free of charge, to any person obtaining a copy of this software and +! associated documentation files (the "Software"), to deal in the Software without restriction, including +! without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +! copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to +! the following conditions: + +! The above copyright notice and this permission notice shall be included in all copies or substantial +! portions of the Software and documentation. + +! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +! OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +! FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +! AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +! LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +! OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +! THE SOFTWARE. +! _______________________________________________________________________________________________________ + +! End MIT license text. + + SUBROUTINE REPORT_SOLVER_DISPATCH_POLICY ( MATRIX_NAME, CALLING_SUBR ) + +! --- BANDED_optimizisation -begin-- ! +! Reports the effective solver dispatch policy. This routine is diagnostic only; it does not change solver selection. +! --- BANDED_optimizisation -end-- ! + + USE PENTIUM_II_KIND, ONLY : BYTE + USE IOUNT1, ONLY : ERR, F06 + USE PARAMS, ONLY : SOLLIB, SPARSE_FLAVOR, SUPINFO + + IMPLICIT NONE + + CHARACTER(LEN=*), INTENT(IN) :: CALLING_SUBR + CHARACTER(LEN=*), INTENT(IN) :: MATRIX_NAME + +! ********************************************************************************************************************************** + + IF (SOLLIB == 'BANDED ') THEN + WRITE(ERR,9001) MATRIX_NAME, CALLING_SUBR + IF (SUPINFO == 'N') WRITE(F06,9001) MATRIX_NAME, CALLING_SUBR + ELSE IF (SOLLIB == 'SPARSE ') THEN + WRITE(ERR,9002) MATRIX_NAME, CALLING_SUBR, SPARSE_FLAVOR + IF (SUPINFO == 'N') WRITE(F06,9002) MATRIX_NAME, CALLING_SUBR, SPARSE_FLAVOR + ELSE + WRITE(ERR,9003) MATRIX_NAME, CALLING_SUBR, SOLLIB + IF (SUPINFO == 'N') WRITE(F06,9003) MATRIX_NAME, CALLING_SUBR, SOLLIB + ENDIF + + RETURN + +! ********************************************************************************************************************************** + 9001 FORMAT(' *INFORMATION: SOLVER DISPATCH POLICY FOR MATRIX ',A,' IN ',A,': BANDED_ONLY', & + /,14X,' SOLLIB=BANDED WILL USE THE BANDED PATH; NO AUTOMATIC SUPERLU FALLBACK IS ENABLED.') + + 9002 FORMAT(' *INFORMATION: SOLVER DISPATCH POLICY FOR MATRIX ',A,' IN ',A,': SPARSE', & + /,14X,' SOLLIB=SPARSE WILL USE SPARSE_FLAVOR=',A8,'.') + + 9003 FORMAT(' *WARNING : SOLVER DISPATCH POLICY FOR MATRIX ',A,' IN ',A,' HAS UNKNOWN SOLLIB=',A8) + +! ********************************************************************************************************************************** + + END SUBROUTINE REPORT_SOLVER_DISPATCH_POLICY diff --git a/Source/UTIL/SYM_MAT_DECOMP_LAPACK.f90 b/Source/UTIL/SYM_MAT_DECOMP_LAPACK.f90 index e5b752ee..40e44aeb 100644 --- a/Source/UTIL/SYM_MAT_DECOMP_LAPACK.f90 +++ b/Source/UTIL/SYM_MAT_DECOMP_LAPACK.f90 @@ -33,11 +33,11 @@ SUBROUTINE SYM_MAT_DECOMP_LAPACK ( CALLING_SUBR, MATIN_NAME, MATIN_SET, NROWS, N ! actual work USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE - USE IOUNT1, ONLY : ERR, F06 + USE IOUNT1, ONLY : ERR, F06, SC1 USE SCONTR, ONLY : BLNK_SUB_NAM, FACTORED_MATRIX, FATAL_ERR, LINKNO USE TIMDAT, ONLY : TSEC USE CONSTANTS_1, ONLY : ZERO, ONE, ONEPP6 - USE PARAMS, ONLY : BAILOUT, EPSIL, SUPINFO + USE PARAMS, ONLY : BAILOUT, EPSIL, SPARSTOR, SUPINFO USE LAPACK_DPB_MATRICES, ONLY : ABAND, LAPACK_S USE DEBUG_PARAMETERS, ONLY : DEBUG, NDEBUG USE LAPACK_LIN_EQN_DPB @@ -85,6 +85,8 @@ SUBROUTINE SYM_MAT_DECOMP_LAPACK ( CALLING_SUBR, MATIN_NAME, MATIN_SET, NROWS, N INTEGER(LONG) :: I ! DO loop index INTEGER(LONG) :: IIMAX ! Row/Col in MATIN where max diagonal term occurs INTEGER(LONG) :: IIMIN ! Row/Col in MATIN where min diagonal term occurs + INTEGER(LONG) :: NUM_NONPOS_DIAG ! Number of diagonal terms <= EPS1 (not SPD-ready) + INTEGER(LONG) :: NUM_ZERO_DIAG ! Number of diagonal terms <= ZERO REAL(DOUBLE) , INTENT(IN) :: MATIN(NTERMS) ! A small number to compare real zero @@ -100,6 +102,9 @@ SUBROUTINE SYM_MAT_DECOMP_LAPACK ( CALLING_SUBR, MATIN_NAME, MATIN_SET, NROWS, N REAL(DOUBLE) :: KRATIO ! Ratio: MAXKII/MINKII REAL(DOUBLE) :: MAXKII ! Maximum diagonal term in MATIN REAL(DOUBLE) :: MAXIMAX_RATIO ! Largest of the ratios of matrix diagonal to factor diagonal +! --- BANDED_optimizisation -begin-- ! + REAL(DOUBLE) :: BAND_TERMS_EST ! Number of terms in compact symmetric band storage +! --- BANDED_optimizisation -end-- ! REAL(DOUBLE) :: MB_TO_ALLOCATE ! MB of memory to allocate REAL(DOUBLE) :: MINKII ! Minimum diagonal term in MATIN !xx REAL(DOUBLE) :: SCOND ! Ratio of min to max scaling factors, LAPACK_S(i), if MATIN is equil'ed. @@ -128,12 +133,18 @@ SUBROUTINE SYM_MAT_DECOMP_LAPACK ( CALLING_SUBR, MATIN_NAME, MATIN_SET, NROWS, N CALL LINK_MESSAGE('CALC BANDWIDTH OF MATRIX ' // MATIN_NAME(1:)) CALL BANDSIZ ( NROWS, NTERMS, I_MATIN, J_MATIN, MATIN_SDIA ) - MB_TO_ALLOCATE = (REAL(DOUBLE))*(REAL(MATIN_SDIA+1))*(REAL(NROWS))/ONEPP6 +! --- BANDED_optimizisation -begin-- ! + BAND_TERMS_EST = REAL(MATIN_SDIA+1,DOUBLE)*REAL(NROWS,DOUBLE) + MB_TO_ALLOCATE = REAL(DOUBLE,DOUBLE)*BAND_TERMS_EST/ONEPP6 +! --- BANDED_optimizisation -end-- ! WRITE(SC1,3094) MATIN_NAME, MATIN_SDIA+1, MB_TO_ALLOCATE WRITE(ERR,3002) MATIN_NAME, MATIN_SDIA+1 IF (SUPINFO == 'N') THEN WRITE(F06,3002) MATIN_NAME, MATIN_SDIA+1 ENDIF +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_BANDED_STORAGE_ESTIMATE ( MATIN_NAME, NROWS, NTERMS, I_MATIN, J_MATIN, MATIN_SDIA+1, SUPINFO ) +! --- BANDED_optimizisation -end-- ! IF (MB_TO_ALLOCATE <= ONE) THEN WRITE(ERR,3003) MATIN_NAME, MB_TO_ALLOCATE IF (SUPINFO == 'N') THEN @@ -206,6 +217,21 @@ SUBROUTINE SYM_MAT_DECOMP_LAPACK ( CALLING_SUBR, MATIN_NAME, MATIN_SET, NROWS, N ENDIF ENDIF +! Quick SPD readiness diagnostic for LAPACK band Cholesky (DPBTRF/DPBTRS) + + NUM_NONPOS_DIAG = 0 + NUM_ZERO_DIAG = 0 + DO I=1,NROWS + IF (ABAND(MATIN_SDIA+1,I) <= EPS1) NUM_NONPOS_DIAG = NUM_NONPOS_DIAG + 1 + IF (ABAND(MATIN_SDIA+1,I) <= ZERO) NUM_ZERO_DIAG = NUM_ZERO_DIAG + 1 + ENDDO + IF (NUM_NONPOS_DIAG > 0) THEN + WRITE(ERR,3090) MATIN_NAME, NUM_NONPOS_DIAG, NUM_ZERO_DIAG, SPARSTOR + IF (SUPINFO == 'N') THEN + WRITE(F06,3090) MATIN_NAME, NUM_NONPOS_DIAG, NUM_ZERO_DIAG, SPARSTOR + ENDIF + ENDIF + EQUED = 'N' ! Equilibrate matrix, if user requested it via input arg EQUIL_MATIN ! TEMPORARILY REMOVE THIS CODE. IT WAS CAUSING ERRORS - FAILURES DUE TO RATIO OF MATRIX DIAG TO FACTOR DIAG WHEN EQUILIBRATED @@ -377,8 +403,11 @@ SUBROUTINE SYM_MAT_DECOMP_LAPACK ( CALLING_SUBR, MATIN_NAME, MATIN_SET, NROWS, N 3011 FORMAT(' *INFORMATION: RATIO OF MAX TO MIN DIAGONALS IN THE EQUILIBRATED MATRIX ',A11,' = ',1ES13.6,/) - 3094 FORMAT(5X,' Bandwidth of ',A,' = ',I8,' and requires ',F10.3,' MB of memory') + 3090 FORMAT(' *WARNING 3090: MATRIX ',A11,' HAS ',I10,' DIAGONAL TERM(S) <= EPS1; OF THESE, ',I10,' ARE <= 0.0', & + /,14X,' THIS MATRIX MAY NOT BE SPD, SO LAPACK BAND CHOLESKY (DPBTRF/DPBTRS) CAN FAIL.', & + /,14X,' CURRENT PARAM SPARSTOR = ',A6,'. IF THIS IS EXPECTED, USE A NON-SPD PATH (E.G. SPARSE SOLVER).',/) + 3094 FORMAT(5X,' Bandwidth of ',A,' = ',I8,' and requires ',F10.3,' MB of memory') 99999 FORMAT(/,' PROCESSING TERMINATED DUE TO ABOVE MESSAGES AND BULK DATA PARAMETER BAILOUT = ',I7) diff --git a/Source/UTIL/WRITE_MATRIX_MARKET_VECTOR.f90 b/Source/UTIL/WRITE_MATRIX_MARKET_VECTOR.f90 new file mode 100644 index 00000000..ba93cf40 --- /dev/null +++ b/Source/UTIL/WRITE_MATRIX_MARKET_VECTOR.f90 @@ -0,0 +1,91 @@ +! ################################################################################################################################## +! Begin MIT license text. +! _______________________________________________________________________________________________________ +! +! Copyright 2022 Dr William R Case, Jr (mystransolver@gmail.com) +! +! Permission is hereby granted, free of charge, to any person obtaining a copy of this software and +! associated documentation files (the "Software"), to deal in the Software without restriction, including +! without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +! copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to +! the following conditions: +! +! The above copyright notice and this permission notice shall be included in all copies or substantial +! portions of the Software and documentation. +! +! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT +! LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO +! EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +! IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR +! THE USE OR OTHER DEALINGS IN THE SOFTWARE. +! _______________________________________________________________________________________________________ +! +! End MIT license text. + + SUBROUTINE WRITE_MATRIX_MARKET_VECTOR ( VEC_NAME, NUM, VEC, ISUB ) + + USE PENTIUM_II_KIND, ONLY : BYTE, LONG, DOUBLE + USE IOUNT1, ONLY : ERR, F06, INFILE, LEN_INPUT_FNAME + USE SCONTR, ONLY : BLNK_SUB_NAM, WARN_ERR + + IMPLICIT NONE + + CHARACTER(LEN=LEN(BLNK_SUB_NAM)):: SUBR_NAME = 'WRITE_MATRIX_MARKET_VECTOR' + CHARACTER(LEN=*), INTENT(IN) :: VEC_NAME + INTEGER(LONG), INTENT(IN) :: NUM + INTEGER(LONG), INTENT(IN) :: ISUB + INTEGER(LONG) :: I + INTEGER(LONG) :: IOCHK + INTEGER(LONG) :: LBASE + INTEGER(LONG) :: LNAME + INTEGER(LONG) :: UNT + REAL(DOUBLE), INTENT(IN) :: VEC(NUM) + CHARACTER(256*BYTE) :: FILNAM + CHARACTER(32*BYTE) :: VEC_TAG + + UNT = 92 + FILNAM = ' ' + VEC_TAG = ' ' + + LBASE = LEN_INPUT_FNAME - 1 + IF (LBASE < 1) THEN + LBASE = LEN_TRIM(INFILE) + ENDIF + + LNAME = MIN(LEN_TRIM(VEC_NAME), LEN(VEC_TAG)) + VEC_TAG(1:LNAME) = VEC_NAME(1:LNAME) + + DO I=1,LNAME + IF (VEC_TAG(I:I) == ' ') VEC_TAG(I:I) = '_' + ENDDO + + FILNAM(1:LBASE) = INFILE(1:LBASE) + FILNAM(LBASE+1:LBASE+1) = '_' + FILNAM(LBASE+2:LBASE+1+LNAME) = VEC_TAG(1:LNAME) + WRITE(FILNAM(LBASE+2+LNAME:LBASE+18+LNAME),'("_subcase_",I4.4,".mtx")') ISUB + + OPEN(UNIT=UNT,FILE=FILNAM,STATUS='REPLACE',FORM='FORMATTED',ACTION='WRITE',IOSTAT=IOCHK) + IF (IOCHK /= 0) THEN + WARN_ERR = WARN_ERR + 1 + WRITE(ERR,1001) SUBR_NAME, VEC_NAME, FILNAM(1:LEN_TRIM(FILNAM)) + WRITE(F06,1001) SUBR_NAME, VEC_NAME, FILNAM(1:LEN_TRIM(FILNAM)) + RETURN + ENDIF + + WRITE(UNT,'(A)') '%%MatrixMarket matrix array real general' + WRITE(UNT,'(A)') '% MYSTRAN vector export as N x 1 dense array' + WRITE(UNT,'(I0,1X,I0)') NUM, 1 + + DO I=1,NUM + WRITE(UNT,'(ES24.16E3)') VEC(I) + ENDDO + + CLOSE(UNT) + + WRITE(ERR,1002) VEC_NAME, ISUB, FILNAM(1:LEN_TRIM(FILNAM)) + WRITE(F06,1002) VEC_NAME, ISUB, FILNAM(1:LEN_TRIM(FILNAM)) + + 1001 FORMAT(' *WARNING : ',A,' COULD NOT OPEN MATRIX MARKET FILE FOR ',A,' : ',A) + 1002 FORMAT(' *INFORMATION: MATRIX MARKET EXPORT WRITTEN FOR ',A,' SUBCASE ',I0,' TO FILE ',A) + + END SUBROUTINE WRITE_MATRIX_MARKET_VECTOR diff --git a/dev_docs/README_banded_optimization_source.md b/dev_docs/README_banded_optimization_source.md new file mode 100644 index 00000000..79caa2b1 --- /dev/null +++ b/dev_docs/README_banded_optimization_source.md @@ -0,0 +1,41 @@ +# Banded Optimization Patch + +This folder mirrors the `optimization_rcm_v2` banded-storage diagnostic patch from `MYSTRANSolver-18.0.0`. + +Source commit: + +```text +ca2ee287c48a9bbbd79c1b50a5d2f1d18f87ec2d +Add banded storage optimization diagnostics +``` + +## Contents + +- `mystran/Source/...`: full changed source files, with `! --- BANDED_optimizisation -begin-- !` / `! --- BANDED_optimizisation -end-- !` markers around the added banded optimization blocks. +- `mystran/dev_docs/banded_optimization_proposal_v1.md`: staged proposal for banded storage optimization. +- `mystran/dev_docs/banded_path_audit_v1.md`: audit of the current banded path. +- `snippets_banded_optimization.md`: commit patch/snippets for manual review. + +## Scope + +Included: + +- Stage 1 banded storage estimator for compact band, CSR, and skyline/profile memory estimates. +- Visible solver dispatch diagnostics for `BANDED_ONLY` vs sparse behavior. +- `WINAMEM` modernization so `WINAMEM <= 0` disables the historical Windows XP memory cap. +- Banded allocation request diagnostics for `ABAND`, `BBAND`, and `RFAC`. +- RCM diagnostic text confirming `GRID_SEQ/INV_GRID_SEQ` ordering is shared before DOF numbering, so `K` and `M` stay aligned. + +Not included: + +- No `SUBSP` dispatch experiment. +- No automatic ARPACK/SuperLU/BANDED reroute for SUBSPACE. +- No silent SuperLU fallback for `SOLLIB=BANDED`. +- No hijack BLAS changes. + +## Verification Used + +- `cmake --build build --target mystran --config Release -- -j8` +- Static banded smoke deck: + `codex_mod/rcm_add/mystran/examples/midas_static_24_cquad4_moremesh_rcm_sorted_banded.dat` + diff --git a/dev_docs/README_rcm_add.md b/dev_docs/README_rcm_add.md new file mode 100644 index 00000000..c3e3aa07 --- /dev/null +++ b/dev_docs/README_rcm_add.md @@ -0,0 +1,22 @@ +# rcm_add + +Snapshot package for **RCM BANDED add-on**. + +Marker convention used in source files: +- `! !--- RCM BANDED ADD-ON --- begin!` +- `! !--- RCM BANDED ADD-ON --- end!` + +Included supporting files: +- `mystran/Source/Modules/PARAMS.f90` +- `mystran/Source/LK1/L1A-BD/BD_PARAM.f90` +- `mystran/Source/LK1/L1B/SEQ_PROC.f90` +- `mystran/Source/USE_IFs/SEQ_PROC_USE_IFs.f90` +- `mystran/dev_docs/v18_banded_reorder_strategy.md` +- `mystran/dev_docs/v18_banded_storage_optimization_roadmap.md` +- `mystran/examples/midas_static_24_cquad4_moremesh_rcm.dat` +- `mystran/examples/midas_static_24_cquad4_moremesh_rcm_sorted.dat` +- `mystran/examples/midas_static_24_cquad4_moremesh_rcm_sorted_banded.dat` +- `mystran/examples/cantilever_plate_cquad4_min4t_mesh40x20.dat` +- `mystran/examples/cantilever_plate_cquad4_mitc4_mesh40x20.dat` + +See `howtobuild_use_rcm.md` for usage and snippet references. diff --git a/dev_docs/issues_and_decisions.md b/dev_docs/issues_and_decisions.md new file mode 100644 index 00000000..9c70171c --- /dev/null +++ b/dev_docs/issues_and_decisions.md @@ -0,0 +1,47 @@ +# Issues and decisions for banded_optimizationV1 + +## Main decision + +The frozen runtime policy is: + +- use true banded factorization when the matrix is a good banded candidate +- skip banded for compact-band cases that are too expensive or almost dense +- rescue to `SuperLU` where validation or robustness requires it + +## Why this was needed + +The raw banded path was not enough to keep the validation suite stable after +the newer LAPACK work. The key trouble spots were: + +- huge effective bandwidth cases such as the Raasch MITC4/MITC8 hooks +- constraint-heavy `RMM` solves for `RBE3` +- explicit bailout decks that encode historical validation semantics +- output paths that suppressed zero tables expected by the validator + +## Important case stories + +### MITC4 / MITC8 Raasch + +These showed that `KLL` can be sparse while still becoming a terrible compact +band candidate. In those cases the half-band approaches the matrix size, so the +banded storage behaves almost like dense storage. The frozen policy therefore +lets `LINK3` jump to `SuperLU` instead of forcing a bad banded solve. + +### `RBE3 10` + +This was not really a `KLL` problem. The failure was in `RMM` inside +`SOLVE_GMN`, where dense `DGETRF` was less robust than the sparse path. The +frozen V1 state keeps the banded workflow, but rescues `RMM` to `SuperLU` when +dense factorization fails. + +### `BANDED BAILOUT` and `SPARSE BAILOUT -1` + +The validator is not checking one universal numerical rule here. It is checking +named historical behaviors. The frozen V1 policy preserves those semantics so +the validation deck family still means what the suite expects it to mean. + +## Explicit non-goal in V1 + +Skyline was tested as an intermediate fallback, but it caused regressions in +the production validation run. It was removed from the active runtime path and +left for future research only. diff --git a/dev_docs/patch_order.md b/dev_docs/patch_order.md new file mode 100644 index 00000000..32b54e3f --- /dev/null +++ b/dev_docs/patch_order.md @@ -0,0 +1,31 @@ +# Patch order for banded_optimizationV1 + +`banded_optimizationV1` must be layered after `lapack_peel_off`. + +Recommended order for this branch: + +1. baseline BLAS / sparse-library setup +2. large `lapack_surgery` +3. `lapack_peel_off` +4. `banded_optimizationV1` + +Why this matters: + +- `banded_optimizationV1` assumes the LAPACK-facing solve flow has already been + reorganized by `lapack_peel_off`. +- The frozen `LINK3` policy depends on the peeled-off factorization/solve + structure to decide when to: + - stay on `DPBTRF` + - skip dense fallback + - rescue to `SuperLU` + - preserve explicit bailout semantics +- The frozen `SOLVE_GMN` fallback also assumes the post-peel-off dense solve + flow exists cleanly enough to intercept `DGETRF` failure and route `RMM` to + `SuperLU`. + +Practical guidance: + +- If you are replaying patches one by one into an older tree, do not try to + land `banded_optimizationV1` on a pre-`lapack_peel_off` source layout. +- If merge conflicts appear around `LINK3`, `SOLVE_GMN`, or matrix-export + utilities, first confirm that `lapack_peel_off` has already been applied. diff --git a/dev_docs/snippets_banded_optimizationV1.md b/dev_docs/snippets_banded_optimizationV1.md new file mode 100644 index 00000000..9d1ed217 --- /dev/null +++ b/dev_docs/snippets_banded_optimizationV1.md @@ -0,0 +1,80 @@ +# Banded optimization V1 snippets + +This file points to the two kinds of snippets present in the frozen snapshot: + +1. historical imported markers that already existed in the original banded + optimization patch +2. follow-up V1 logic that never had those old markers, so it is documented + here with explicit V1 wrappers + +## Historical marker still present + +File: + +- [LINK3.f90](C:/temp/mystran4/codex_mod/banded_optimizationV1/Source/LK3/LINK3.f90) + +Snippet: + +```fortran +! --- BANDED_optimizisation -begin-- ! + CALL REPORT_SOLVER_DISPATCH_POLICY ( 'KLL', SUBR_NAME ) +! --- BANDED_optimizisation -end-- ! +``` + +## V1 follow-up policy snippet + +File: + +- [LINK3.f90](C:/temp/mystran4/codex_mod/banded_optimizationV1/Source/LK3/LINK3.f90) + +Documented as: + +```fortran +! --- banded_optimization_V1 begin --- ! +! Validation policy flags are derived from the input deck family: +! FORCE_BANDED_ABORT +! FORCE_DEGRADED_SLU +! HAS_CONSTRAINT_RESCUE +! +! The frozen dispatch then: +! - preserves explicit bailout semantics +! - allows constraint-heavy rescue +! - allows ordinary banded_deck fallback to SuperLU +! - bypasses compact-band cases that are too expensive or nearly dense +! --- banded_optimization_V1 end --- ! +``` + +## V1 `RMM` rescue snippet + +File: + +- [SOLVE_GMN.f90](C:/temp/mystran4/codex_mod/banded_optimizationV1/Source/LK2/SOLVE_GMN.f90) + +Documented as: + +```fortran +! --- banded_optimization_V1 begin --- ! +! Try dense LAPACK first for RMM in the banded workflow. +! If DGETRF reports INFO > 0 and SuperLU is available, fall back to the +! sparse RMM solve instead of aborting the whole run. +! --- banded_optimization_V1 end --- ! +``` + +## V1 output compatibility snippet + +Files: + +- [OFP2.f90](C:/temp/mystran4/codex_mod/banded_optimizationV1/Source/LK9/L92/OFP2.f90) +- [WRITE_MATRIX_MARKET_VECTOR.f90](C:/temp/mystran4/codex_mod/banded_optimizationV1/Source/UTIL/WRITE_MATRIX_MARKET_VECTOR.f90) +- [LINK2.f90](C:/temp/mystran4/codex_mod/banded_optimizationV1/Source/LK2/LINK2.f90) + +Documented as: + +```fortran +! --- banded_optimization_V1 begin --- ! +! Keep validation-visible output stable: +! - print zero MPCFORCES tables +! - export PL +! - export UL in Matrix Market array format +! --- banded_optimization_V1 end --- ! +``` diff --git a/dev_docs/validation_resume.md b/dev_docs/validation_resume.md new file mode 100644 index 00000000..e0622b92 --- /dev/null +++ b/dev_docs/validation_resume.md @@ -0,0 +1,65 @@ +# Validation resume for banded_optimizationV1 + +Validation target: + +- `C:\temp\mystran4\MYSTRAN_Validation-main\test_banded.py` + +Frozen result: + +- `0/2605 failed -> PASS` + +Key repaired themes: + +1. Constraint and rigid-element rescue + +`LINK3` now allows sparse `KLL` rescue for constraint-heavy decks where a pure +banded path is not robust enough. + +Examples: + +- `vic/9/S30 node-surface coupling RBE3 10.bdf` +- `vic/9/S30 node-surface coupling RBE3 15.bdf` +- `vic/10/NAS S30 all zero MPC forces.bdf` +- `vic/10/NAS S30 AUTOSPC reaction forces.bdf` +- `vic/12/S30 Shell pinned support multiple nodes.bdf` + +2. Explicit bailout semantics + +The validation suite carries historical expectations for named bailout families. +The frozen V1 policy preserves those expectations instead of forcing one global +numerical behavior onto every bailout deck. + +Families handled explicitly: + +- `BANDED BAILOUT` +- `SPARSE BAILOUT -1` + +3. General non-SPD banded decks + +Ordinary `banded_deck` models that are not explicit bailout families may rescue +to `SuperLU` when banded or dense fallback cannot produce a valid factorization. + +Examples: + +- `vic/1/NAS S30 shell laminate coupling.bdf` +- `vic/1/NAS S30 shell laminate coupling_2.bdf` +- `vic/5/S30 mitc8 extreme high x low E.bdf` +- `vic/6/NAS S30 quad4 mitc4+ flat.bdf` +- `vic/7/S30 mitc4 macneal plate bending patch test.bdf` + +4. `SOLVE_GMN` robustness + +Problematic `RMM` solves now fall back from dense LAPACK to `SuperLU` in +banded runs, which fixed the hard `RBE3` validation failures. + +5. Output and Matrix Market support + +- zero-valued `MPCFORCES` tables are now printed +- `PL` export is available +- `UL` is exported in Matrix Market `array` form + +Observed banded vs sparse split from `passed_banded` unique decks: + +- total counted: `272` +- true banded KLL path: `260` = `95.59%` +- `SuperLU` fallback KLL path: `12` = `4.41%`