Version 1.2, January 2004:
Changes since version 1.1:
- Added missing documentation in dlanbpro.f and dritzvec.f and in other
places such that at least the non-trivial parts of the code are now
documented in some detail :-).
- Extended the programs in the Examples directory to handle more matrix
formats. In addition to the Harwell-Boeing format, they now handle dense
and diagonal matrices as well as sparse matrices stored in coordinate
format. The routines handling matrix I/O and matrix-vector multiply
can be found in the file matvec.F. I stress that these routines are not
written to achieve production level performance for sparse matrix-vector
multiplication, but are primarily meant to illustrate how to use DLANSVD
and DLANSVD_IRL and to make it easy to explore the numerical properties
of the algorithms with test matrices without having to write any new code.
- Changed the installation procedure by introducing the shell script
"configure" that examines the OS and CPU type of the system and
automatically generates a (hopefully) appropriate make.inc file
with the system- and compiler-dependent options. Some manual hacking
of make.linux_gcc_ia32 is still required to fine-tune the gcc
optimization flags for various flavors of ia32 processors
(AMD processors, older Pentiums, and such).
- The code has been parallelized using OpenMP. It was tested on
fairly small SMP systems including ia32 (dual Xeon system) with
the Intel compiler version 7.1, on ia64 (SGI Altix 3300 system) with
the Intel compiler version 8.0, and on IRIX with the MIPSpro compilers
version 7.30 (SGI Origin 2000 system). If you run PROPACK on larger
or different SMP systems I would be interested in hearing how well it
scales. I am working on getting an MPI version ready for public
consumption, and hopefully will find the time to get it into shape
for version 1.3.
- Added support for the ia64 (Itanium) platform. I highly recommend
using the Intel compilers for this platform if available (run
the configure script with option "-icc"), since gcc generates
terribly slow code for the Itanium processor.
- After experiencing endless problems with performance bugs,
incorrect results, and linking failures I decided to include
all LAPACK routines used by PROPACK as source code instead
of relying on pre-built LAPACK libraries. This also eliminates
the problem with older systems only having LAPACK version 2.0
installed. See known problems under version 1.1 for more info.
- Fixed a bug that prevented the singular vectors from being
computed when an invariant subspace was found and the dimension
of the subspace was smaller than the requested number of singular
values. Thanks to Dr. Wolfgang Duemmler, Siemens AG, Erlangen, for
reporting this. In addition, an exit code of info == 0 was returned
instead of info == dimension of the invariant subspace. This was
also reported by Eugene M. Fluder, Jr., Merck & Co., Inc.
- Changed error-bound refinement using the gap theorem to be
more robust (pessimistic). The old version would only look at
the gap |\theta_i-\theta_{i+1}| when refining \theta_i, while
strictly speaking,
min( |\theta_i-\theta_{i+1}|, |\theta_i-\theta_{i-1}| )
(minus slack from existing error bounds) should be used.
Here I define \theta_{0}=+infinity, \theta_{n+1}=0 when
refining the extreme Ritz values. This adds refinement to the
last Ritz value, which was previously missed in the case when
the dimension of the Krylov subspace was equal to min(m,n). This
solves a problem where PROPACK could get stuck when trying to
compute all singular values for a matrix with a tiny smallest
singular value.
- Fixed a bug where the last left singular vector would be reported
as zero when computing all singular values and vectors of a matrix
of rank min(m,n)-1, even though it could have been computed accurately
from the available Lanczos bidiagonalization.
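The two-sided gap rule described in the error-bound refinement item above can be sketched as follows. This is an illustrative Python sketch only, not the Fortran code shipped with PROPACK. The function name refine_bounds is hypothetical, and two details are assumptions rather than statements from these notes: the "slack" is taken to be the neighbouring Ritz value's current bound, and the refined bound is taken to be bound^2/gap, applied only when the gap exceeds the current bound.

```python
import math

def refine_bounds(theta, bound):
    """Refine error bounds on Ritz values using a two-sided gap.

    theta : Ritz values sorted in decreasing order (theta_1 >= ... >= theta_n)
    bound : current error bounds, one per Ritz value
    Returns a new list of bounds; inputs are left unchanged.
    """
    n = len(theta)
    refined = list(bound)
    for i in range(n):
        # Gap to the larger neighbour; theta_0 = +infinity by convention.
        if i > 0:
            upper_gap = abs(theta[i] - theta[i - 1]) - bound[i - 1]
        else:
            upper_gap = math.inf
        # Gap to the smaller neighbour; theta_{n+1} = 0 by convention.
        if i < n - 1:
            lower_gap = abs(theta[i] - theta[i + 1]) - bound[i + 1]
        else:
            lower_gap = abs(theta[i])
        gap = min(upper_gap, lower_gap)
        # Assumed update rule: only refine when the gap is trustworthy,
        # i.e. larger than the bound being refined.
        if gap > bound[i]:
            refined[i] = bound[i] * (bound[i] / gap)
    return refined
```

Because the minimum of both gaps is used, a Ritz value that sits close to either neighbour keeps its original (pessimistic) bound, and the last Ritz value is refined against the \theta_{n+1}=0 sentinel.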
Known Problems:
- DLANSVD_IRL computes incorrect results if WHICH='S' and P>DIM/2.
==============================================================================
Version 1.1, June 2003:
Changes since version 1.0:
- Fixed two bugs where dgetu0 and dreorth were being called with the
wrong number of parameters. Thanks to Jerzy Czaplicki, Institut de
Pharmacologie et de Biologie Structurale CNRS, Université Paul
Sabatier, Toulouse for reporting this.
- Added experimental support for computing the smallest singular
values in the implicitly restarted version of PROPACK. The
subroutine DLANSVD_IRL now takes an additional argument "WHICH",
which can have the values 'L' or 'S'. If WHICH is 'L' then
the NEIG largest singular values are computed. If WHICH is 'S'
then DLANSVD_IRL attempts to compute the NEIG smallest singular
values by repeatedly filtering out the largest Ritz values when
restarting (using them as shifts) until convergence.
NOTICE: Be aware that for large and ill-conditioned matrices the
convergence can be very slow and the algorithm may even fail to
converge at all.
- Added support for the Intel compilers under Linux.
- Split options for GCC and the Intel compilers into separate files
make.linux_gcc and make.linux_intel.
- The minimum length of the integer workspace IWORK as specified in the
interface of DLANSVD and DLANSVD_IRL was incorrect and inconsistent
with the length used in the example programs. Thanks to Tom Schweiger,
Acxiom Corporation for reporting this.
- Fixed bugs in example programs:
o Dimensions of array arguments x and y were reversed in the Harwell-Boeing
matrix-vector multiply subroutine atvHB(m, n, x, y) used by the example
program. Thanks to Hannes Schwarzl of Institute of Geophysics and Planetary
Physics, UCLA, for reporting this.
o The COLPTR array in HB.h should be of length NMAX+1, not NMAX.
- Changed the order in which libraries are linked with the example programs
to ensure that the platform optimized version of the ILAENV subroutine
provided by a commercial LAPACK implementation is not overwritten by the
default values in the file supplied with PROPACK. The divide-and-conquer
code in the DC directory is only meant as a backup for systems that have
a LAPACK library older than version 3.0 installed.
- Made a small modification of the divide-and-conquer SVD code in dbdsdc.f
to manually set the SMLSIZ parameter to 25, if run in combination with
version 2.0 of ILAENV.
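The role of WHICH in choosing restart shifts, as described in the DLANSVD_IRL item above, can be sketched as follows. This is a hedged Python illustration, not the PROPACK implementation. The function name select_shifts is hypothetical. The 'S' branch follows the description above (the largest Ritz values are used as shifts); the 'L' branch, which uses the smallest Ritz values as shifts, is an assumption based on the usual implicit-restart convention.

```python
def select_shifts(ritz_values, p, which):
    """Pick p Ritz values to filter out (use as shifts) at a restart.

    ritz_values : current Ritz values, in any order
    p           : number of shifts to apply
    which       : 'L' to keep the largest values, 'S' to keep the smallest
    """
    if p < 0 or p > len(ritz_values):
        raise ValueError("p must be between 0 and the number of Ritz values")
    ordered = sorted(ritz_values, reverse=True)  # largest first
    if which == 'L':
        # Assumed convention: discard the p smallest Ritz values.
        return ordered[len(ordered) - p:]
    if which == 'S':
        # As described above: discard the p largest Ritz values.
        return ordered[:p]
    raise ValueError("which must be 'L' or 'S'")
```

For example, select_shifts([3.0, 1.0, 4.0, 2.0], 2, 'S') selects the two largest values as shifts, leaving the smallest part of the spectrum to converge over repeated restarts.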
Known Problems:
- We have observed two problems when using the Intel Math Kernel Library (TM)
(MKL) and the Intel compiler on the ia32 platform under Linux:
1) The performance of the LAPACK routines DBDSQR and DBDSDC from MKL is
severely crippled, presumably to ensure thread safety. This is a
problem in MKL, not PROPACK, but we mention it since it can severely
reduce performance.
2) The LAPACK divide-and-conquer source code (DBDSDC) supplied with
PROPACK generates incorrect singular vectors when compiled with the
Intel compiler version 7.0. The version in the Intel Math Kernel
Library (TM) works correctly (albeit very slowly).
To get the best performance with the Intel compiler and MKL on the ia32
platform we recommend using only the BLAS routines in MKL in combination
with either the pre-compiled LAPACK 3.0 libraries available from NETLIB
or LAPACK 3.0 compiled with GCC from source code.
- DLANSVD_IRL computes incorrect results if WHICH='S' and P>DIM/2.
==============================================================================
Version 1.0: Initial version.