% Source file: lapack-3.9.1/DOCS/lawn81.tex (25 Mar 2021, 71100 bytes), from lapack-3.9.1.tar.gz

\documentclass[11pt]{report}

\usepackage{indentfirst}
\usepackage[body={6in,8.5in}]{geometry}
\usepackage{hyperref}
\usepackage{graphicx}
\DeclareGraphicsRule{.ps}{eps}{}{}

\renewcommand{\thesection}{\arabic{section}}
\setcounter{tocdepth}{3}
\setcounter{secnumdepth}{3}

\begin{document}
\begin{center}
  {\Large LAPACK Working Note 81\\
  Quick Installation Guide for LAPACK on Unix Systems\footnote{This work was
 supported by NSF Grant No. ASC-8715728  and NSF Grant No. 0444486}}
\end{center}
\begin{center}
%  Edward Anderson\footnote{Current address:  Cray Research Inc.,
%                           655F Lone Oak Drive, Eagan, MN  55121},
  The LAPACK Authors\\
  Department of Computer Science \\
  University of Tennessee \\
  Knoxville, Tennessee  37996-1301 \\
\end{center}
\begin{center}
  REVISED:  VERSION 3.1.1, February 2007 \\
  REVISED:  VERSION 3.2.0, November 2008
\end{center}

\begin{center}
Abstract
\end{center}
This working note describes how to install and test version 3.2.0
of LAPACK, a linear algebra package for high-performance
computers, on a Unix system.  The timing routines are not included in
release 3.2.0; the parts of this LAWN that describe them refer to release 3.0.  Also,
version 3.2.0 contains many prototype routines needing user feedback.
Non-Unix installation instructions and
further details of the testing and timing suites are contained only in
LAPACK Working Note 41, and not in this abbreviated version.
%Separate instructions are provided for the Unix and non-Unix
%versions of the test package.
%Further details are also given on the design of the test and timing
%programs.
\newpage

\tableofcontents

\newpage
% Introduction to Implementation Guide

\section{Introduction}

LAPACK is a linear algebra library for high-performance
computers.
The library includes Fortran subroutines for
the analysis and solution of systems of simultaneous linear algebraic
equations, linear least-squares problems, and matrix eigenvalue
problems.
Our approach to achieving high efficiency is based on the use of
a standard set of Basic Linear Algebra Subprograms (the BLAS),
which can be optimized for each computing environment.
By confining most of the computational work to the BLAS,
the subroutines should be
transportable and efficient across a wide range of computers.

This working note describes how to install, test, and time this
release of LAPACK on a Unix system.

The instructions for installing, testing, and
timing\footnote{Timing routines are provided only in LAPACK 3.0 and earlier.}
are designed for a person whose
responsibility is the maintenance of a mathematical software library.
We assume the installer has experience in compiling and running
Fortran programs and in creating object libraries.
The installation process involves untarring the file, creating a set of
libraries, and compiling and running the test and timing
programs\footnotemark[\value{footnote}].

%This guide combines the instructions for the Unix and non-Unix
%versions of the LAPACK test package (the non-Unix version is in Appendix
%~\ref{appendixe}).
%At this time, the non-Unix version of LAPACK can only be obtained
%after first untarring the Unix tar tape and then following the instructions in
%Appendix ~\ref{appendixe}.

Section~\ref{fileformat} describes how the files are organized in the
distribution file, and
Section~\ref{overview} gives a general overview of the parts of the test package.
Step-by-step instructions appear in Section~\ref{installation}.
%for the Unix version and in the appendix for the non-Unix version.

Further details of the testing and timing suites are given in LAPACK
Working Note 41.
% Sections~\ref{moretesting}
%and ~\ref{moretiming} give
%details of the test and timing programs and their input files.
%Appendices ~\ref{appendixa} and ~\ref{appendixb} briefly describe
%the LAPACK routines and auxiliary routines provided
%in this release.
%Appendix ~\ref{appendixc} lists the operation counts we have computed
%for the BLAS and for some of the LAPACK routines.
Appendix~\ref{appendixd}, entitled ``Caveats'', is a compendium of the known
problems from our own experiences, with suggestions on how to
overcome them.

\textbf{We strongly advise reading Appendix
A before proceeding with the installation process.}
%Appendix E contains the execution times of the different test
%and timing runs on two sample machines.
%Appendix ~\ref{appendixe} contains the instructions to install LAPACK on a non-Unix
%system.

\section{Revisions Since the First Public Release}

Since its first public release in February 1992, LAPACK has had
several updates, which have introduced new routines
and extended the functionality of existing routines.  The first
update,
June 30, 1992, was version 1.0a; the second update, October 31, 1992,
was version 1.0b; the third update, March 31, 1993, was version 1.1;
version 2.0 on September 30, 1994, coincided with the release of the
Second Edition of the LAPACK Users' Guide;
version 3.0 on June 30, 1999, coincided with the release of the Third Edition of
the LAPACK Users' Guide;
version 3.1 was released in November 2006;
version 3.1.1 was released in February 2007;
and version 3.2.0 was released in November 2008.

All LAPACK routines reflect the current version number with the date
on which the routine was last modified.
For more information on the revisions in a release, please refer
to the \texttt{revisions.info} file in the lapack directory on netlib.
\begin{quote}
\url{http://www.netlib.org/lapack/revisions.info}
\end{quote}

%The distribution \texttt{tar} file \texttt{lapack.tar.z} that is
%available on netlib is always the most up-to-date.
%
%On-line manpages (troff files) for LAPACK driver and computational
%routines, as well as most of the BLAS routines, are available via
%the \texttt{lapack} index on netlib.

\section{File Format}\label{fileformat}

The software for LAPACK is distributed in the form of a
gzipped tar file (via anonymous ftp or the World Wide Web),
which contains the Fortran source for LAPACK,
the Basic Linear Algebra Subprograms
(the Level 1, 2, and 3 BLAS) needed by LAPACK, the testing programs,
and the timing programs\footnotemark[\value{footnote}].
Users who wish to have a non-Unix installation should refer to LAPACK
Working Note 41,
although the overview in Section~\ref{overview} applies to both the Unix and non-Unix
versions.
%Users who wish to have a non-Unix installation should go to Appendix ~\ref{appendixe},
%although the overview in section ~\ref{overview} applies to both the Unix and non-Unix
%versions.

The package may be accessed via the World Wide Web through
the URL
\begin{quote}
\url{http://www.netlib.org/lapack/lapack.tgz}
\end{quote}

Or, you can retrieve the file via anonymous ftp at netlib:

\begin{verbatim}
     ftp ftp.netlib.org
     cd lapack
     binary
     get lapack.tgz
     quit
\end{verbatim}

The software in the \texttt{tar} file
is organized in a number of essential directories as shown
in Figure 1.  Please note that this figure does not reflect everything
that is contained in the \texttt{LAPACK} directory.  Input and instructional
files are also located at various levels.
\begin{figure}
\vspace{11pt}
\centerline{\includegraphics[width=6.5in,height=3in]{org2.ps}}
\caption{Unix organization of LAPACK 3.0}
\vspace{11pt}
\end{figure}
Libraries are created in the LAPACK directory and
executable files are created in one of the directories BLAS, TESTING,
or TIMING\footnotemark[\value{footnote}].  Input files for the test and
timing\footnotemark[\value{footnote}] programs are also
found in these three directories so that testing may be carried out
in the directories LAPACK/BLAS, LAPACK/TESTING, and LAPACK/TIMING\footnotemark[\value{footnote}].
A top-level makefile in the LAPACK directory is provided to perform the
entire installation procedure.

\section{Overview of Tape Contents}\label{overview}

Most routines in LAPACK occur in four versions: REAL,
DOUBLE PRECISION, COMPLEX, and COMPLEX*16.
The first three versions (REAL, DOUBLE PRECISION, and COMPLEX)
are written in standard Fortran and are completely portable;
the COMPLEX*16 version is provided for
those compilers that allow this data type.
Some routines use features of Fortran 90.
For convenience, we often refer to routines by their single precision
names; the leading `S' can be replaced by a `D' for double precision,
a `C' for complex, or a `Z' for complex*16.
For LAPACK use and testing you must decide which version(s)
of the package you intend to install at your site (for example,
REAL and COMPLEX on a Cray computer or DOUBLE PRECISION and
COMPLEX*16 on an IBM computer).

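\noindent
As an illustration of this naming convention, the four versions of one
routine are listed below.  \texttt{SGESV}, the driver for general systems of
linear equations, is chosen only as a familiar example; it is not singled out
anywhere else in this note.
\begin{center}
\begin{tabular}{lll}
Prefix & Data type & Routine \\ \hline
S & REAL & \texttt{SGESV} \\
D & DOUBLE PRECISION & \texttt{DGESV} \\
C & COMPLEX & \texttt{CGESV} \\
Z & COMPLEX*16 & \texttt{ZGESV}
\end{tabular}
\end{center}
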
\subsection{LAPACK Routines}

There are three classes of LAPACK routines:
\begin{itemize}

\item \textbf{driver} routines solve a complete problem, such as solving
a system of linear equations or computing the eigenvalues of a real
symmetric matrix.  Users are encouraged to use a driver routine if there
is one that meets their requirements.  The driver routines are listed
in LAPACK Working Note 41~\cite{WN41} and the LAPACK Users' Guide~\cite{LUG}.
%in Appendix ~\ref{appendixa}.

\item \textbf{computational} routines, also called simply LAPACK routines,
perform a distinct computational task, such as computing
the $LU$ decomposition of an $m$-by-$n$ matrix or finding the
eigenvalues and eigenvectors of a symmetric tridiagonal matrix using
the $QR$ algorithm.
The LAPACK routines are listed in LAPACK Working Note 41~\cite{WN41}
and the LAPACK Users' Guide~\cite{LUG}.
%The LAPACK routines are listed in Appendix ~\ref{appendixa}; see also LAPACK
%Working Note \#5 \cite{WN5}.

\item \textbf{auxiliary} routines are all the other subroutines called
by the driver routines and computational routines.
%Among them are subroutines to perform subtasks of block algorithms,
%in particular, the unblocked versions of the block algorithms;
%extensions to the BLAS, such as matrix-vector operations involving
%complex symmetric matrices;
%the special routines LSAME and XERBLA which first appeared with the
%BLAS;
%and a number of routines to perform common low-level computations,
%such as computing a matrix norm, generating an elementary Householder
%transformation, and applying a sequence of plane rotations.
%Many of the auxiliary routines may be of use to numerical analysts
%or software developers, so we have documented the Fortran source for
%these routines with the same level of detail used for the LAPACK
%routines and driver routines.
The auxiliary routines are listed in LAPACK Working Note 41~\cite{WN41}
and the LAPACK Users' Guide~\cite{LUG}.
%The auxiliary routines are listed in Appendix ~\ref{appendixb}.
\end{itemize}

\subsection{Level 1, 2, and 3 BLAS}

The BLAS are a set of Basic Linear Algebra Subprograms that perform
vector-vector, matrix-vector, and matrix-matrix operations.
LAPACK is designed around the Level 1, 2, and 3 BLAS, and nearly all
of the parallelism in the LAPACK routines is contained in the BLAS.
Therefore,
the key to getting good performance from LAPACK lies in having an
efficient version of the BLAS optimized for your particular machine.
Optimized BLAS libraries are available on a variety of architectures;
refer to the BLAS FAQ on netlib for further information.
\begin{quote}
\url{http://www.netlib.org/blas/faq.html}
\end{quote}
There are also freely available BLAS generators that automatically
tune a subset of the BLAS for a given architecture, for example
\begin{quote}
\url{http://www.netlib.org/atlas/}
\end{quote}
And, if all else fails, there is the Fortran~77 reference implementation
of the Level 1, 2, and 3 BLAS available on netlib (also included in
the LAPACK distribution tar file).
\begin{quote}
\url{http://www.netlib.org/blas/blas.tgz}
\end{quote}
No matter which BLAS library is used, the BLAS test programs should
always be run.

Users should not expect too much from the Fortran~77 reference implementation
BLAS; these versions were written to define the basic operations and do not
employ the standard tricks for optimizing Fortran code.

The formal definitions of the Level 1, 2, and 3 BLAS
are in \cite{BLAS1}, \cite{BLAS2}, and \cite{BLAS3}.
The BLAS Quick Reference card is available on netlib.

\subsection{Mixed- and Extended-Precision BLAS: XBLAS}

The XBLAS extend the BLAS to work with mixed input and output
precisions as well as using extra precision internally.  The XBLAS are
used in the prototype extra-precise iterative refinement codes.

The current release of the XBLAS is available through
Netlib\footnote{Development versions may be available through
  \url{http://www.cs.berkeley.edu/~yozo/} or
  \url{http://www.nersc.gov/~xiaoye/XBLAS/}.}  at
\begin{quote}
  \url{http://www.netlib.org/xblas}
\end{quote}
Their formal definition is in \cite{XBLAS}.

\subsection{LAPACK Test Routines}

This release contains two distinct test programs for LAPACK routines
in each data type.  One test program tests the routines for solving
linear equations and linear least squares problems,
and the other tests routines for the matrix eigenvalue problem.
The routines for generating test matrices are compiled into a separate
library that is shared by both test programs.

\subsection{LAPACK Timing Routines (for LAPACK 3.0 and before)}

This release also contains two distinct timing programs for the
LAPACK routines in each data type.
The linear equation timing program gathers performance data in
megaflops on the factor, solve, and inverse routines for solving
linear systems, the routines to generate or apply an orthogonal matrix
given as a sequence of elementary transformations, and the reductions
to bidiagonal, tridiagonal, or Hessenberg form for eigenvalue
computations.
The operation counts used in computing the megaflop rates are computed
from a formula;
see LAPACK Working Note 41~\cite{WN41}.
% see Appendix ~\ref{appendixc}.
The eigenvalue timing program is used with the eigensystem routines
and returns the execution time, number of floating point operations, and
megaflop rate for each of the requested subroutines.
In this program, the number of operations is computed while the
code is executing using special instrumented versions of the LAPACK
subroutines.

\section{Installing LAPACK on a Unix System}\label{installation}

Installing, testing, and timing\footnotemark[\value{footnote}] the Unix version of LAPACK
involves the following steps:
\begin{enumerate}
\item Gunzip and untar the file.

\item Copy \texttt{LAPACK/make.inc.example} to \texttt{LAPACK/make.inc} and edit it.

\item Edit the file \texttt{LAPACK/Makefile} and type \texttt{make}.

%\item Test and Install the Machine-Dependent Routines \\
%\emph{(WARNING:  You may need to supply a correct version of second.f and
%{\tt
%\begin{list}{}{}
%\item cd LAPACK
%\item make install
%\end{list} }
%
%\item Create the BLAS Library, \emph{if necessary} \\
%\emph{(NOTE:  For best performance, it is recommended you use the manufacturers' BLAS)}
%{\tt
%\begin{list}{}{}
%\item \texttt{cd LAPACK}
%\item \texttt{make blaslib}
%\end{list} }
%
%\item Run the Level 1, 2, and 3 BLAS Test Programs
%\begin{list}{}{}
%\item \texttt{cd LAPACK}
%\item \texttt{make blas\_testing}
%\end{list}
%
%\item Create the LAPACK Library
%\begin{list}{}{}
%\item \texttt{cd LAPACK}
%\item \texttt{make lapacklib}
%\end{list}
%
%\item Create the Library of Test Matrix Generators
%\begin{list}{}{}
%\item \texttt{cd LAPACK}
%\item \texttt{make tmglib}
%\end{list}
%
%\item Run the LAPACK Test Programs
%\begin{list}{}{}
%\item \texttt{cd LAPACK}
%\item \texttt{make testing}
%\end{list}
%
%\item Run the LAPACK Timing Programs
%\begin{list}{}{}
%\item \texttt{cd LAPACK}
%\item \texttt{make timing}
%\end{list}
%
%\item Run the BLAS Timing Programs
%\begin{list}{}{}
%\item \texttt{cd LAPACK}
%\item \texttt{make blas\_timing}
%\end{list}
\end{enumerate}

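\noindent
As a compact sketch, the three steps correspond to the session below.  It
assumes that the archive \texttt{lapack.tgz} is in the current directory and
that the settings copied from \texttt{make.inc.example} already suit your
machine; in practice \texttt{make.inc} and the \texttt{Makefile} usually need
the edits described in the following subsections before \texttt{make} is run.
\begin{verbatim}
gunzip -c lapack.tgz | tar xvf -
cp LAPACK/make.inc.example LAPACK/make.inc
cd LAPACK
make
\end{verbatim}
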
\subsection{Untar the File}

If you received a tar file of LAPACK via the World Wide
Web or anonymous ftp, enter the following command:

\begin{list}{}{}
\item{\texttt{gunzip -c lapack.tgz | tar xvf -}}
\end{list}

\noindent
This will create a top-level directory called \texttt{LAPACK}, which
requires approximately 34 Mbytes of disk space.
The total space requirement, including the object files and executables,
is approximately 100 Mbytes for all four data types.

\subsection{Copy \texttt{LAPACK/make.inc.example} to \texttt{LAPACK/make.inc} and edit it}

Before the libraries can be built, or the testing and timing\footnotemark[\value{footnote}] programs
run, you must define all machine-specific parameters for the
architecture on which you are installing LAPACK.  All machine-specific
parameters are contained in the file \texttt{LAPACK/make.inc}.
An example of \texttt{LAPACK/make.inc} for a Linux machine with GNU compilers is given
in \texttt{LAPACK/make.inc.example}.  Copy that file to \texttt{LAPACK/make.inc} by entering the following command:

\begin{list}{}{}
\item{\texttt{cp LAPACK/make.inc.example LAPACK/make.inc}}
\end{list}

\noindent
Now modify your \texttt{LAPACK/make.inc} by applying the following recommendations.
The first line of this \texttt{make.inc} file is:
\begin{quote}
SHELL = /bin/sh
\end{quote}
and it will need to be modified to \texttt{SHELL = /sbin/sh} if you are
installing LAPACK on an SGI architecture.
Next, you will need to modify \texttt{FC}, \texttt{FFLAGS},
\texttt{FFLAGS\_DRV}, \texttt{FFLAGS\_NOOPT}, and \texttt{LDFLAGS} to specify
the compiler, compiler options, compiler options for the testing and
timing\footnotemark[\value{footnote}] main programs, compiler options for
routines that are built without optimization, and linker options.
Next, choose the timing function that the
\texttt{SECOND} and \texttt{DSECND} routines will call.
\begin{verbatim}
#  Default:  SECOND and DSECND will use a call to the
#  EXTERNAL FUNCTION ETIME
#TIMER = EXT_ETIME
#  For RS6K:  SECOND and DSECND will use a call to the
#  EXTERNAL FUNCTION ETIME_
#TIMER = EXT_ETIME_
#  For gfortran compiler:  SECOND and DSECND will use a call to the
#  INTERNAL FUNCTION ETIME
TIMER = INT_ETIME
#  If your Fortran compiler does not provide etime (like Nag Fortran
#  Compiler, etc...) SECOND and DSECND will use a call to the
#  INTERNAL FUNCTION CPU_TIME
#TIMER = INT_CPU_TIME
#  If none of these work, you can use the NONE value.
#  In that case, SECOND and DSECND will always return 0.
#TIMER = NONE
\end{verbatim}

Next, you will need to modify \texttt{AR}, \texttt{ARFLAGS}, and \texttt{RANLIB} to specify the archiver,
archiver options, and \texttt{ranlib} command for your machine.  If your architecture
does not require \texttt{ranlib} to be run after each archive command (as
is the case with CRAY computers running UNICOS, Hewlett Packard
computers running HP-UX, or SUN SPARCstations running Solaris), set
\texttt{RANLIB = echo}.  Finally, you must
modify the \texttt{BLASLIB} definition to specify the BLAS library to which
you will be linking.  If an optimized version of the BLAS is available
on your machine, we strongly recommend linking to that library.
Otherwise, by default, \texttt{BLASLIB} is set to the Fortran~77 reference version.

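\noindent
To make these settings concrete, a \texttt{make.inc} fragment for a Linux
machine with \texttt{gfortran} might look as follows.  This is an illustrative
sketch only: the optimization flags and the \texttt{BLASLIB} path are
assumptions chosen for the example rather than values quoted from the
distributed \texttt{make.inc.example}, so compare against that file before use.
\begin{verbatim}
SHELL        = /bin/sh
FC           = gfortran
FFLAGS       = -O2
FFLAGS_DRV   = $(FFLAGS)
FFLAGS_NOOPT = -O0
LDFLAGS      =
TIMER        = INT_ETIME
AR           = ar
ARFLAGS      = cr
RANLIB       = ranlib
# Hypothetical location of an optimized BLAS library
BLASLIB      = /usr/local/lib/libblas.a
\end{verbatim}
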
If you want to enable the XBLAS, define the variable \texttt{USEXBLAS}
to some value, for example \texttt{USEXBLAS = Yes}.  Then set the
variable \texttt{XBLASLIB} to point at the XBLAS library.  Note that
the prototype iterative refinement routines and their testers will not
be built unless \texttt{USEXBLAS} is defined.

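\noindent
For example, the corresponding lines in \texttt{make.inc} could read as
follows, where the library path is a placeholder for wherever the XBLAS were
built on your system:
\begin{verbatim}
USEXBLAS = Yes
# Placeholder path; substitute the location of your XBLAS library
XBLASLIB = /path/to/libxblas.a
\end{verbatim}
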
\textbf{NOTE:}  Example \texttt{make.inc} include files are contained in the
\texttt{LAPACK/INSTALL} directory.  Please refer to
Appendix~\ref{appendixd} for machine-specific installation hints, and/or
the \texttt{release\_notes} file on \texttt{netlib}.
\begin{quote}
\url{http://www.netlib.org/lapack/release\_notes}
\end{quote}

\subsection{Edit the file \texttt{LAPACK/Makefile}}\label{toplevelmakefile}

This \texttt{Makefile} can be modified to perform as much of the
installation process as the user desires.  Ideally, this is the ONLY
makefile the user must modify.  However, modification of lower-level
makefiles may be necessary if a specific routine needs to be compiled
with a different level of optimization.

First, edit the definitions of \texttt{blaslib}, \texttt{lapacklib},
\texttt{tmglib}, \texttt{lapack\_testing}, and \texttt{timing}\footnotemark[\value{footnote}] in the file \texttt{LAPACK/Makefile}
to specify the data types desired.  For example,
if you only wish to compile the single precision real version of the
LAPACK library, you would modify the \texttt{lapacklib} definition to be:

\begin{verbatim}
lapacklib:
        $(MAKE) -C SRC single
\end{verbatim}

Likewise, you could specify \texttt{double}, \texttt{complex}, or \texttt{complex16} to
build the double precision real, single precision complex, or double
precision complex libraries, respectively.  By default, when no
argument follows the \texttt{make} command, all four data types
are built.
The make command can be run more than once to add another
data type to the library if necessary.

%If you are installing LAPACK on a Silicon Graphics machine, you must
%modify the respective definitions of \texttt{testing} and \texttt{timing} to be
%\begin{verbatim}
%testing:
%        ( cd TESTING; $(MAKE) -f Makefile.sgi )
%\end{verbatim}
%and
%\begin{verbatim}
%timing:
%        ( cd TIMING; $(MAKE) -f Makefile.sgi )
%\end{verbatim}

Next, if you will be using a locally available BLAS library, you will need
to remove \texttt{blaslib} from the \texttt{lib} definition.  And finally,
if you do not wish to build all of the libraries individually and
likewise run all of the testing and timing separately, you can
modify the \texttt{all} definition to specify the amount of the
installation process that you want performed.  By default,
the \texttt{all} definition is set to
\begin{verbatim}
all: lapack_install lib lapack_testing blas_testing
\end{verbatim}
which will perform all phases of the installation
process -- testing of machine-dependent routines, building the libraries,
BLAS testing and LAPACK testing.

The entire installation process will then be performed by typing
\texttt{make}.

Questions and/or comments can be directed to the
authors as described in Section~\ref{sendresults}.  If test failures
occur, please refer to the appropriate subsection in
Section~\ref{furtherdetails}.

If disk space is limited, we suggest building each data type separately
and/or deleting all object files after building the libraries.  Likewise, all
testing and timing executables can be deleted after the testing and timing
process is completed.  The removal of all object files and executables
can be accomplished by the following:

\begin{list}{}{}
\item \texttt{cd LAPACK}
\item \texttt{make cleanobj}
\end{list}

\section{Further Details of the Installation Process}\label{furtherdetails}

Alternatively, you can choose to run each of the phases of the
installation process separately.  The following sections give details
on how this may be achieved.
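As a rough outline, and assuming the target names quoted in this note are
unchanged in your copy of \texttt{LAPACK/Makefile}, the separate phases can be
run in the following order (the trailing comments are explanatory only):
\begin{verbatim}
cd LAPACK
make lapack_install   # test the machine-dependent routines
make blaslib          # reference BLAS; omit if using a local BLAS
make blas_testing     # run the BLAS test programs
make lapacklib        # build the LAPACK library
make tmglib           # build the test matrix generator library
make lapack_testing   # run the LAPACK test programs
\end{verbatim}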

\subsection{Test and Install the Machine-Dependent Routines.}

There are six machine-dependent functions in the test and timing
package, at least three of which must be installed.  They are

\begin{tabbing}
MONOMO \= DOUBLE PRECYSION \= \kill
LSAME \> LOGICAL \> Test if two characters are the same regardless of case \\
SLAMCH \> REAL \> Determine machine-dependent parameters \\
DLAMCH \> DOUBLE PRECISION \> Determine machine-dependent parameters \\
SECOND \> REAL \> Return time in seconds from a fixed starting time \\
DSECND \> DOUBLE PRECISION \> Return time in seconds from a fixed starting time\\
ILAENV \> INTEGER \> Checks that NaN and infinity arithmetic are IEEE-754 compliant
\end{tabbing}

\noindent
If you are working only in single precision, you do not need to install
DLAMCH and DSECND, and if you are working only in double precision,
you do not need to install SLAMCH and SECOND.

These six subroutines are provided in \texttt{LAPACK/INSTALL},
along with six test programs.
To compile the six test programs and run the tests, go to \texttt{LAPACK} and
type \texttt{make lapack\_install}.  The test programs are called
\texttt{testlsame, testslamch, testdlamch, testsecond, testdsecnd} and
\texttt{testieee}.
If you do not wish to run all tests, you will need to modify the
\texttt{lapack\_install} definition in the \texttt{LAPACK/Makefile} to only include the
tests you wish to run.  Otherwise, all tests will be performed.
The expected results of each test program are described below.

\subsubsection{Installing LSAME}

LSAME is a logical function with two character parameters, A and B.
It returns .TRUE. if A and B are the same regardless of case, or .FALSE.
if they are different.
For example, the expression

\begin{list}{}{}
\item \texttt{LSAME( UPLO, 'U' )}
\end{list}
\noindent
is equivalent to
\begin{list}{}{}
\item \texttt{( UPLO.EQ.'U' ).OR.( UPLO.EQ.'u' )}
\end{list}

The test program in \texttt{lsametst.f} tests all combinations of
the same character in upper and lower case for A and B, and two
cases where A and B are different characters.

Run the test program by typing \texttt{testlsame}.
If LSAME works correctly, the only message you should see after the
execution of \texttt{testlsame} is
\begin{verbatim}
 ASCII character set
 Tests completed
\end{verbatim}
The file \texttt{lsame.f} is automatically copied to
\texttt{LAPACK/BLAS/SRC/} and \texttt{LAPACK/SRC/}.
The function LSAME is needed by both the BLAS and LAPACK, so it is safer
to have it in both libraries as long as this does not cause trouble
in the link phase when both libraries are used.

\subsubsection{Installing SLAMCH and DLAMCH}

SLAMCH and DLAMCH are real functions with a single character parameter
that indicates the machine parameter to be returned.  The test
program in \texttt{slamchtst.f}
simply prints out the different values computed by SLAMCH,
so you need to know something about what the values should be.
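For orientation, the relations below give the values to expect when single
precision is IEEE-754 binary32 with round-to-nearest.  That is an assumption
about the target machine, not something SLAMCH guarantees; the symbols
$\beta$, $t$, $e_{\min}$, and $e_{\max}$ are introduced here only for this
sketch.  With base $\beta = 2$, $t = 24$ mantissa digits, minimum exponent
$e_{\min} = -125$, and largest exponent $e_{\max} = 128$:
\[
\begin{array}{lclcl}
\mbox{Epsilon} & = & \frac{1}{2}\,\beta^{1-t} & = & 2^{-24} \approx 5.96046 \times 10^{-8} \\
\mbox{Precision} & = & \mbox{Epsilon} \times \beta & = & 2^{-23} \approx 1.19209 \times 10^{-7} \\
\mbox{Underflow threshold} & = & \beta^{e_{\min}-1} & = & 2^{-126} \approx 1.17549 \times 10^{-38} \\
\mbox{Overflow threshold} & = & \beta^{e_{\max}} ( 1 - \beta^{-t} ) & = & 2^{128} ( 1 - 2^{-24} ) \approx 3.40282 \times 10^{38}
\end{array}
\]
Under these assumptions $1/(\mathrm{overflow}) \approx 2.9 \times 10^{-39}$ is
smaller than the underflow threshold, so the safe minimum equals the underflow
threshold and its reciprocal is $2^{126} \approx 8.50706 \times 10^{37}$.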
For example, the output of the test program executable \texttt{testslamch}
for SLAMCH on a Sun SPARCstation is
\begin{verbatim}
 Epsilon                      =   5.96046E-08
 Safe minimum                 =   1.17549E-38
 Base                         =   2.00000
 Precision                    =   1.19209E-07
 Number of digits in mantissa =   24.0000
 Rounding mode                =   1.00000
 Minimum exponent             =  -125.000
 Underflow threshold          =   1.17549E-38
 Largest exponent             =   128.000
 Overflow threshold           =   3.40282E+38
 Reciprocal of safe minimum   =   8.50706E+37
\end{verbatim}
On a Cray machine, the safe minimum underflows its output
representation and the overflow threshold overflows its output
representation, so the safe minimum is printed as 0.00000 and overflow
is printed as R.  This is normal.
If you would prefer to print a representable number, you can modify
the test program to print SFMIN*100. and RMAX/100. for the safe
minimum and overflow thresholds.

Likewise, the test executable \texttt{testdlamch} is run for DLAMCH.

If both tests were successful, go to Section~\ref{second}.

If SLAMCH (or DLAMCH) returns an invalid value, you will have to create
your own version of this function.  The following options are used in
LAPACK and must be set:

\begin{list}{}{}
\item {`B': } Base of the machine
\item {`E': } Epsilon (relative machine precision)
\item {`O': } Overflow threshold
\item {`P': } Precision = Epsilon*Base
\item {`S': } Safe minimum (often same as underflow threshold)
\item {`U': } Underflow threshold
\end{list}

Some people may be familiar with R1MACH (D1MACH), a primitive
routine for setting machine parameters in which the user must
uncomment the appropriate assignment statements for the target
machine.
If a version of R1MACH is on hand, the assignments in 687 SLAMCH can be made to refer to R1MACH using the correspondence 688 689 \begin{list}{}{} 690 \item {SLAMCH( U' )}$=$R1MACH( 1 ) 691 \item {SLAMCH( O' )}$=$R1MACH( 2 ) 692 \item {SLAMCH( E' )}$=$R1MACH( 3 ) 693 \item {SLAMCH( B' )}$=$R1MACH( 5 ) 694 \end{list} 695 696 \noindent 697 The safe minimum returned by SLAMCH( 'S' ) is initially set to the 698 underflow value, but if$1/(\mathrm{overflow}) \geq (\mathrm{underflow})$699 it is recomputed as$(1/(\mathrm{overflow})) * ( 1 + \varepsilon )$, 700 where$\varepsilon$is the machine precision. 701 702 BE AWARE that the initial call to SLAMCH or DLAMCH is expensive. 703 We suggest that installers run it once, save the results, and hard-code 704 the constants in the version they put in their library. 705 706 \subsubsection{Installing SECOND and DSECND}\label{second} 707 708 Both the timing routines\footnotemark[\value{footnote}] and the test routines call SECOND 709 (DSECND), a real function with no arguments that returns the time 710 in seconds from some fixed starting time. 711 Our version of this routine 712 returns only user time'', and not user time$+$system time''. 713 The following version of SECOND in \texttt{second\_EXT\_ETIME.f, second\_INT\_ETIME.f} calls 714 ETIME, a Fortran library routine available on some computer systems. 715 If ETIME is not available or a better local timing function exists, 716 you will have to provide the correct interface to SECOND and DSECND 717 on your machine. 718 719 Since LAPACK 3.1.1 we provide 5 different flavours of the SECOND and DSECND routines. 720 The version that will be used depends on the value of the TIMER variable in the make.inc 721 722 \begin{itemize} 723 \item If ETIME is available as an external function, set the value of the TIMER variable in your 724 make.inc to \texttt{EXT\_ETIME}: \texttt{second\_EXT\_ETIME.f} and \texttt{dsecnd\_EXT\_ETIME.f} will be used. 
Usually on HPPA architectures,
the compiler and linker flag \texttt{+U77} should be included to access
the function \texttt{ETIME}.

\item If ETIME\_ is available as an external function, set the value of the TIMER variable in your make.inc
to \texttt{EXT\_ETIME\_}: \texttt{second\_EXT\_ETIME\_.f} and \texttt{dsecnd\_EXT\_ETIME\_.f} will be used.
This is the case on some IBM architectures such as IBM RS/6000s.

\item If ETIME is available as an internal function, set the value of the TIMER variable in your make.inc
to \texttt{INT\_ETIME}: \texttt{second\_INT\_ETIME.f} and \texttt{dsecnd\_INT\_ETIME.f} will be used.
This is the case with gfortran.

\item If CPU\_TIME is available as an internal function, set the value of the TIMER variable in your make.inc
to \texttt{INT\_CPU\_TIME}: \texttt{second\_INT\_CPU\_TIME.f} and \texttt{dsecnd\_INT\_CPU\_TIME.f} will be used.

\item If none of these functions is available, set the value of the TIMER variable in your make.inc
to \texttt{NONE}: \texttt{second\_NONE.f} and \texttt{dsecnd\_NONE.f} will be used.
These routines will always return zero.
\end{itemize}

The test program in \texttt{secondtst.f}
performs a million operations using 5000 iterations of
the SAXPY operation $y := y + \alpha x$ on a vector of length 100.
The total time and megaflops for this test are reported, then
the operation is repeated including a call to SECOND on each of
the 5000 iterations to determine the overhead due to calling SECOND.
The test program executable is called \texttt{testsecond} (or \texttt{testdsecnd}).
There is no single right answer, but the times
in seconds should be positive and the megaflop ratios should be
appropriate for your machine.

\subsubsection{Testing IEEE arithmetic and ILAENV}\label{testieee}

%\textbf{If you are installing LAPACK on a non-IEEE machine, you MUST
%modify ILAENV!
%Otherwise, ILAENV will crash.  By default, ILAENV
%assumes an IEEE machine, and does a test for IEEE-754 compliance.}

As some new routines in LAPACK rely on IEEE-754 compliance,
two settings (\texttt{ISPEC=10} and \texttt{ISPEC=11}) have been added to ILAENV
(\texttt{LAPACK/SRC/ilaenv.f}) to denote IEEE-754 compliance for NaN and
infinity arithmetic, respectively.  By default, ILAENV assumes an IEEE
machine, and does a test for IEEE-754 compliance.  \textbf{NOTE:  If you
are installing LAPACK on a non-IEEE machine, you MUST modify ILAENV,
as this test inside ILAENV will crash!}

If \texttt{ILAENV( 10, $\ldots$ )} or \texttt{ILAENV( 11, $\ldots$ )} is
issued, then \texttt{ILAENV=1} is returned to signal IEEE-754 compliance,
and \texttt{ILAENV=0} if the architecture is non-IEEE-754 compliant.

Thus, for non-IEEE machines, the user must hard-code the setting of
(\texttt{ILAENV=0}) for (\texttt{ISPEC=10} and \texttt{ISPEC=11}) in the version
of \texttt{LAPACK/SRC/ilaenv.f} to be put in
his library.  There are also specialized testing and timing\footnotemark[\value{footnote}] versions of
ILAENV that will also need to be modified.
\begin{itemize}
\item Testing/timing version of \texttt{LAPACK/TESTING/LIN/ilaenv.f}
\item Testing/timing version of \texttt{LAPACK/TESTING/EIG/ilaenv.f}
\item Testing/timing version of \texttt{LAPACK/TIMING/LIN/ilaenv.f}
\item Testing/timing version of \texttt{LAPACK/TIMING/EIG/ilaenv.f}
\end{itemize}

%Some new routines in LAPACK rely on IEEE-754 compliance, and if non-compliance
%is detected (via a call to the function ILAENV), alternative (slower)
%algorithms will be chosen.
%For further details, refer to the leading comments of routines such
%as \texttt{LAPACK/SRC/sstevr.f}.

The test program in \texttt{LAPACK/INSTALL/tstiee.f} checks an installation
architecture
to see if infinity arithmetic and NaN arithmetic are IEEE-754 compliant.
A warning message to the user is printed if non-compliance is detected.
This same test is performed inside the function ILAENV.  If
\texttt{ILAENV( 10, $\ldots$ )} or \texttt{ILAENV( 11, $\ldots$ )} is
issued, then \texttt{ILAENV=1} is returned to signal IEEE-754 compliance,
and \texttt{ILAENV=0} if the architecture is non-IEEE-754 compliant.

To avoid this IEEE test being run every time you call
\texttt{ILAENV( 10, $\ldots$ )} or \texttt{ILAENV( 11, $\ldots$ )}, we suggest
that the user hard-code the setting of
\texttt{ILAENV=1} or \texttt{ILAENV=0} in the version of \texttt{LAPACK/SRC/ilaenv.f} to be put in
his library.  As mentioned above, there are also specialized testing and
timing\footnotemark[\value{footnote}] versions of ILAENV that will also need to be modified.

\subsection{Create the BLAS Library}

Ideally, a highly optimized version of the BLAS library already
exists on your machine.
In this case you can go directly to Section~\ref{testblas} to
make the BLAS test programs.

\begin{itemize}
\item[a)]
Go to \texttt{LAPACK} and edit the definition of \texttt{blaslib} in the
file \texttt{Makefile} to specify the data types desired, as in the example
in Section~\ref{toplevelmakefile}.

If you already have some of the BLAS, you will need to edit the file
\texttt{LAPACK/BLAS/SRC/Makefile} to comment out the lines
defining the BLAS you have.

\item[b)]
Type \texttt{make blaslib}.
The make command can be run more than once to add another
data type to the library if necessary.
\end{itemize}

\noindent
The BLAS library is created in \texttt{LAPACK/librefblas.a},
or in the user-defined location specified by \texttt{BLASLIB} in the file
\texttt{LAPACK/make.inc}.

\subsection{Run the BLAS Test Programs}\label{testblas}

Test programs for the Level 1, 2, and 3 BLAS are in the directory
\texttt{LAPACK/BLAS/TESTING}.

To compile and run the Level 1, 2, and 3 BLAS test programs,
go to \texttt{LAPACK} and type \texttt{make blas\_testing}.  The executable
files are called \texttt{xblat\_s}, \texttt{xblat\_d}, \texttt{xblat\_c}, and
\texttt{xblat\_z}, where the \_ (underscore) is replaced by 1, 2, or 3,
depending upon the level of BLAS that it is testing.  All executable and
output files are created in \texttt{LAPACK/BLAS/}.
For the Level 1 BLAS tests, the output file names are \texttt{sblat1.out},
\texttt{dblat1.out}, \texttt{cblat1.out}, and \texttt{zblat1.out}.  For the Level
2 and 3 BLAS, the name of the output file is indicated on the first line of the
input file and is currently defined to be \texttt{sblat2.out} for
the Level 2 REAL version, and \texttt{sblat3.out} for the Level 3 REAL
version, with similar names for the other data types.

If the tests using the supplied data files were completed successfully,
consider whether the tests were sufficiently thorough.
For example, on a machine with vector registers, at least one value
of $N$ greater than the length of the vector registers should be used;
otherwise, important parts of the compiled code may not be
exercised by the tests.
If the tests were not successful, either because the program did not
finish or the test ratios did not pass the threshold, you will
probably have to find and correct the problem before continuing.
If you have been testing a system-specific
BLAS library, try using the Fortran BLAS for the routines that
did not pass the tests.
For more details on the BLAS test programs,
see \cite{BLAS2-test} and \cite{BLAS3-test}.

\subsection{Create the LAPACK Library}

\begin{itemize}
\item[a)]
Go to the directory \texttt{LAPACK} and edit the definition of
\texttt{lapacklib} in the file \texttt{Makefile} to specify the data types desired,
as in the example in Section~\ref{toplevelmakefile}.

\item[b)]
Type \texttt{make lapacklib}.
The make command can be run more than once to add another
data type to the library if necessary.

\end{itemize}

\noindent
The LAPACK library is created in \texttt{LAPACK/liblapack.a},
or in the user-defined location specified by \texttt{LAPACKLIB} in the file
\texttt{LAPACK/make.inc}.

\subsection{Create the Test Matrix Generator Library}

\begin{itemize}
\item[a)]
Go to the directory \texttt{LAPACK} and edit the definition of \texttt{tmglib}
in the file \texttt{Makefile} to specify the data types desired, as in the
example in Section~\ref{toplevelmakefile}.

\item[b)]
Type \texttt{make tmglib}.
The make command can be run more than once to add another
data type to the library if necessary.

\end{itemize}

\noindent
The test matrix generator library is created in \texttt{LAPACK/libtmglib.a},
or in the user-defined location specified by \texttt{TMGLIB} in the file
\texttt{LAPACK/make.inc}.

\subsection{Run the LAPACK Test Programs}

There are two distinct test programs for LAPACK routines
in each data type, one for the linear equation routines and
one for the eigensystem routines.
In each data type, there is one input file for testing the linear
equation routines and eighteen input files for testing the eigenvalue
routines.
The input files reside in \texttt{LAPACK/TESTING}.
For more information on the test programs and how to modify the
input files, please refer to LAPACK Working Note 41~\cite{WN41}.
% see Section~\ref{moretesting}.

If you do not wish to run each of the tests individually, you can
go to \texttt{LAPACK}, edit the definition \texttt{lapack\_testing} in the file
\texttt{Makefile} to specify the data types desired, and type \texttt{make
lapack\_testing}.  This will
compile and run the tests as described in Sections~\ref{testlin}
and~\ref{testeig}.

%If you are installing LAPACK on a Silicon Graphics machine, you must
%modify the definition of \texttt{testing} to be
%\begin{verbatim}
%testing:
%        ( cd TESTING; $(MAKE) -f Makefile.sgi )
%\end{verbatim}

\subsubsection{Testing the Linear Equations Routines}\label{testlin}

\begin{itemize}

\item[a)]
Go to \texttt{LAPACK/TESTING/LIN} and type \texttt{make} followed by the data types
desired.  The executable files are called \texttt{xlintsts, xlintstc,
xlintstd}, and \texttt{xlintstz} and are created in \texttt{LAPACK/TESTING}.

\item[b)]
Go to \texttt{LAPACK/TESTING} and run the tests for each data type.
For the REAL version, the command is
\begin{list}{}{}
\item{} \texttt{xlintsts  < stest.in > stest.out}
\end{list}

\noindent
The tests using \texttt{xlintstd}, \texttt{xlintstc}, and \texttt{xlintstz} are similar
with the leading `s' in the input and output file names replaced
by `d', `c', or `z'.

\end{itemize}

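Step b) can be repeated for all four data types with a short shell loop.
The following is only a sketch: it assumes a Bourne-compatible shell, that
all four executables were built in step a), and that the current directory
is \texttt{LAPACK/TESTING}.  The loop variable \texttt{t} is introduced here
for illustration, and the leading \texttt{./} assumes that the current
directory is not on your search path.
\begin{verbatim}
# Run the linear equation tests for each data type in turn.
for t in s d c z; do
    ./xlintst$t < ${t}test.in > ${t}test.out
done
\end{verbatim}
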
If you encountered failures in this phase of the testing process, please
refer to Section~\ref{sendresults}.

\subsubsection{Testing the Eigensystem Routines}\label{testeig}

\begin{itemize}

\item[a)]
Go to \texttt{LAPACK/TESTING/EIG} and type \texttt{make} followed by the data types
desired.  The executable files are called \texttt{xeigtsts,
xeigtstc, xeigtstd}, and \texttt{xeigtstz} and are created
in \texttt{LAPACK/TESTING}.

\item[b)]
Go to \texttt{LAPACK/TESTING} and run the tests for each data type.
The tests for the eigensystem routines use eighteen separate input files
for testing the nonsymmetric eigenvalue problem,
the symmetric eigenvalue problem, the banded symmetric eigenvalue
problem, the generalized symmetric eigenvalue
problem, the generalized nonsymmetric eigenvalue problem, the
singular value decomposition, the banded singular value decomposition,
the generalized singular value
decomposition, the generalized QR and RQ factorizations, the generalized
linear regression model, and the constrained linear least squares
problem.
The tests for the REAL version are as follows:
\begin{list}{}{}
\item \texttt{xeigtsts  < nep.in > snep.out}
\item \texttt{xeigtsts  < sep.in > ssep.out}
\item \texttt{xeigtsts  < svd.in > ssvd.out}
\item \texttt{xeigtsts  < sec.in > sec.out}
\item \texttt{xeigtsts  < sed.in > sed.out}
\item \texttt{xeigtsts  < sgg.in > sgg.out}
\item \texttt{xeigtsts  < sgd.in > sgd.out}
\item \texttt{xeigtsts  < ssg.in > ssg.out}
\item \texttt{xeigtsts  < ssb.in > ssb.out}
\item \texttt{xeigtsts  < sbb.in > sbb.out}
\item \texttt{xeigtsts  < sbal.in > sbal.out}
\item \texttt{xeigtsts  < sbak.in > sbak.out}
\item \texttt{xeigtsts  < sgbal.in > sgbal.out}
\item \texttt{xeigtsts  < sgbak.in > sgbak.out}
\item \texttt{xeigtsts  < glm.in > sglm.out}
\item \texttt{xeigtsts  < gqr.in > sgqr.out}
\item \texttt{xeigtsts  < gsv.in > sgsv.out}
\item \texttt{xeigtsts  < lse.in > slse.out}
\end{list}
The tests using \texttt{xeigtstc}, \texttt{xeigtstd}, and \texttt{xeigtstz} also
use the input files \texttt{nep.in}, \texttt{sep.in}, \texttt{svd.in},
\texttt{glm.in}, \texttt{gqr.in}, \texttt{gsv.in}, and \texttt{lse.in},
but the leading `s' in the other input file names must be changed
to `c', `d', or `z'.
\end{itemize}

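The eighteen runs in step b) follow two naming patterns, which can be
sketched as a pair of shell loops.  This is an illustration under stated
assumptions, not part of the distribution: it assumes a Bourne-compatible
shell and that the current directory is \texttt{LAPACK/TESTING}; the
variables \texttt{t} (data type letter) and \texttt{f} (file stem) are
hypothetical names chosen here.  Set \texttt{t} to \texttt{s}, \texttt{d},
\texttt{c}, or \texttt{z}.
\begin{verbatim}
t=d    # data type letter: s, d, c, or z

# Input files shared by all data types; only the output
# file name carries the data type letter.
for f in nep sep svd glm gqr gsv lse; do
    ./xeigtst$t < ${f}.in > ${t}${f}.out
done

# Input files whose names start with the data type letter.
for f in ec ed gg gd sg sb bb bal bak gbal gbak; do
    ./xeigtst$t < ${t}${f}.in > ${t}${f}.out
done
\end{verbatim}
With \texttt{t=s} the two loops reproduce the eighteen commands listed
in step b).
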
If you encountered failures in this phase of the testing process, please
refer to Section~\ref{sendresults}.

\subsection{Run the LAPACK Timing Programs (For LAPACK 3.0 and before)}

There are two distinct timing programs for LAPACK routines
in each data type, one for the linear equation routines and
one for the eigensystem routines.  The timing program for the
linear equation routines is also used to time the BLAS.
We encourage you to conduct these timing experiments
in REAL and COMPLEX or in DOUBLE PRECISION and COMPLEX*16; it is
not necessary to send timing results in all four data types.

Two sets of input files are provided, a small set and a large set.
The small data sets are appropriate for a standard workstation or
other non-vector machine.
The large data sets are appropriate for supercomputers, vector
computers, and high-performance workstations.
We are mainly interested in results from the large data sets, and
it is not necessary to run both the large and small sets.
The values of N in the large data sets are about five times larger
than those in the small data set,
and the large data sets use additional values for parameters such as the
block size NB and the leading array dimension LDA.
The names of the small data sets end in \_small, such as
\texttt{stime\_small.in}, and the names of the large data sets end in \_large,
such as \texttt{stime\_large.in}.
Except as noted, the leading `s' in the input file name must be
replaced by `d', `c', or `z' for the other data types.

We encourage you to obtain timing results with the large data sets,
as this allows us to compare different machines.
If this would take too much time, suggestions for paring back the large
data sets are given in the instructions below.
We also encourage you to experiment with these timing
programs and send us any interesting results, such as results for
larger problems or for a wider range of block sizes.
The main programs are dimensioned for the large data sets,
so the parameters in the main program may have to be reduced in order
to run the small data sets on a small machine, or increased to run
experiments with larger problems.

The minimum time each subroutine will be timed is set to 0.0 in
the large data files and to 0.05 in the small data files, and on
many machines this value should be increased.
If the timing interval is not long
enough, the time for the subroutine after subtracting the overhead
may be very small or zero, resulting in megaflop rates that are
very large or zero.  (To avoid division by zero, the megaflop rate is
set to zero if the time is less than or equal to zero.)
The minimum time that should be used depends on the machine and the
resolution of the clock.
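
As a worked illustration of why the minimum time matters, consider a
hypothetical routine that performs $2 \times 10^{6}$ floating-point
operations in about $0.01$ seconds, timed with a clock whose resolution is
also $0.01$ seconds.  Both numbers are assumptions chosen for this example,
not measurements.  The megaflop rate is
\[
  \mathrm{megaflops} =
  \frac{\mathrm{operations}}{10^{6} \times \mathrm{time\ in\ seconds}} ,
\]
so a single call that is measured as $0$, $0.01$, or $0.02$ seconds is
reported as $0$, $200$, or $100$ megaflops, respectively.  If instead the
minimum time is set to $1$ second, and assuming the timing program repeats
the call until that much time has accumulated, roughly $100$ calls
($2 \times 10^{8}$ operations) are timed together, and the same clock error
of $\pm 0.01$ seconds changes the result only from about $198$ to about
$202$ megaflops.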

For more information on the timing programs and how to modify the
input files, please refer to LAPACK Working Note 41~\cite{WN41}.
% see Section~\ref{moretiming}.

If you do not wish to run each of the timings individually, you can
go to \texttt{LAPACK}, edit the definition \texttt{lapack\_timing} in the file
\texttt{Makefile} to specify the data types desired, and type \texttt{make
lapack\_timing}.  This will compile
and run the timings for the linear equation routines and the eigensystem
routines (see Sections~\ref{timelin} and~\ref{timeeig}).

%If you are installing LAPACK on a Silicon Graphics machine, you must
%modify the definition of \texttt{timing} to be
%\begin{verbatim}
%timing:
%        ( cd TIMING; $(MAKE) -f Makefile.sgi )
%\end{verbatim}

If you encounter failures in any phase of the timing process, please
feel free to contact the authors as directed in Section~\ref{sendresults}.
Tell us the
type of machine on which the tests were run, the version of the operating
system, the compiler and compiler options that were used,
and details of the BLAS library or libraries that you used.  You should
also include a copy of the output file in which the failure occurs.

Please note that the BLAS
timing runs will still need to be run as instructed in Section~\ref{timeblas}.

\subsubsection{Timing the Linear Equations Routines}\label{timelin}

The linear equation timing program is found in \texttt{LAPACK/TIMING/LIN}
and the input files are in \texttt{LAPACK/TIMING}.
Three input files are provided in each data type for timing the
linear equation routines, one for square matrices, one for band
matrices, and one for rectangular matrices.  The small data sets for the REAL version
are \texttt{stime\_small.in}, \texttt{sband\_small.in}, and \texttt{stime2\_small.in}, respectively,
and the large data sets are
\texttt{stime\_large.in}, \texttt{sband\_large.in}, and \texttt{stime2\_large.in}.

The timing program for the least squares routines uses special instrumented
versions of the LAPACK routines to time individual sections of the code.
The first step in compiling the timing program is therefore to make a library
of the instrumented routines.

\begin{itemize}
\item[a)]
\begin{sloppypar}
To make a library of the instrumented LAPACK routines, first
go to \texttt{LAPACK/TIMING/LIN/LINSRC} and type \texttt{make} followed
by the data types desired, as in the examples of Section~\ref{toplevelmakefile}.
The library of instrumented code is created in
\texttt{LAPACK/TIMING/LIN/linsrc.a}.
\end{sloppypar}

\item[b)]
To make the linear equation timing programs,
go to \texttt{LAPACK/TIMING/LIN} and type \texttt{make} followed by the data
types desired, as in the examples in Section~\ref{toplevelmakefile}.
The executable files are called \texttt{xlintims},
\texttt{xlintimc}, \texttt{xlintimd}, and \texttt{xlintimz} and are created
in \texttt{LAPACK/TIMING}.

\item[c)]
Go to \texttt{LAPACK/TIMING} and
make any necessary modifications to the input files.
You may need to set the minimum time a subroutine will
be timed to a positive value, or to restrict the size of the tests
if you are using a computer with performance in between that of a
workstation and that of a supercomputer.
The computational requirements can be cut in half by using only one
value of LDA.
If it is necessary to also reduce the matrix sizes or the values of
the blocksize, corresponding changes should be made to the
BLAS input files (see Section~\ref{timeblas}).

\item[d)]
Run the programs for each data type you are using.
For the REAL version, the commands for the small data sets are

\begin{list}{}{}
\item{} \texttt{xlintims < stime\_small.in > stime\_small.out }
\item{} \texttt{xlintims < sband\_small.in > sband\_small.out }
\item{} \texttt{xlintims < stime2\_small.in > stime2\_small.out }
\end{list}
or the commands for the large data sets are
\begin{list}{}{}
\item{} \texttt{xlintims < stime\_large.in > stime\_large.out }
\item{} \texttt{xlintims < sband\_large.in > sband\_large.out }
\item{} \texttt{xlintims < stime2\_large.in > stime2\_large.out }
\end{list}

\noindent
Similar commands should be used for the other data types.
\end{itemize}

\subsubsection{Timing the BLAS}\label{timeblas}

The linear equation timing program is also used to time the BLAS.
Three input files are provided in each data type for timing the Level
2 and 3 BLAS.
These input files time the BLAS using the matrix shapes encountered
in the LAPACK routines, and we will use the results to analyze the
performance of the LAPACK routines.
For the REAL version, the small data files are
\texttt{sblasa\_small.in}, \texttt{sblasb\_small.in}, and \texttt{sblasc\_small.in}
and the large data files are
\texttt{sblasa\_large.in}, \texttt{sblasb\_large.in}, and \texttt{sblasc\_large.in}.
There are three sets of inputs because there are three
parameters in the Level 3 BLAS, M, N, and K, and
in most applications one of these parameters is small (on the order
of the blocksize) while the other two are large (on the order of the
matrix size).
In \texttt{sblasa\_small.in}, M and N are large but K is
small, while in \texttt{sblasb\_small.in} the small parameter is M, and
in \texttt{sblasc\_small.in} the small parameter is N.
The Level 2 BLAS are timed only in the first data set, where K
is also used as the bandwidth for the banded routines.

\begin{itemize}

\item[a)]
Go to \texttt{LAPACK/TIMING} and
make any necessary modifications to the input files.
You may need to set the minimum time a subroutine will
be timed to a positive value.
If you modified the values of N or NB
in Section~\ref{timelin}, set M, N, and K accordingly.
The large parameters among M, N, and K
should be the same as the matrix sizes used in timing the linear
equation routines,
and the small parameter should be the same as the
blocksizes used in timing the linear equation routines.
If necessary, the large data set can be simplified by using only one
value of LDA.

\item[b)]
Run the programs for each data type you are using.
For the REAL version, the commands for the small data sets are

\begin{list}{}{}
\item{} \texttt{xlintims < sblasa\_small.in > sblasa\_small.out }
\item{} \texttt{xlintims < sblasb\_small.in > sblasb\_small.out }
\item{} \texttt{xlintims < sblasc\_small.in > sblasc\_small.out }
\end{list}
or the commands for the large data sets are
\begin{list}{}{}
\item{} \texttt{xlintims < sblasa\_large.in > sblasa\_large.out }
\item{} \texttt{xlintims < sblasb\_large.in > sblasb\_large.out }
\item{} \texttt{xlintims < sblasc\_large.in > sblasc\_large.out }
\end{list}

\noindent
Similar commands should be used for the other data types.
\end{itemize}

\subsubsection{Timing the Eigensystem Routines}\label{timeeig}

The eigensystem timing program is found in \texttt{LAPACK/TIMING/EIG}
and the input files are in \texttt{LAPACK/TIMING}.
Four input files are provided in each data type for timing the
eigensystem routines,
one for the generalized nonsymmetric eigenvalue problem,
one for the nonsymmetric eigenvalue problem,
one for the symmetric and generalized symmetric eigenvalue problem,
and one for the singular value decomposition.
For the REAL version, the small data sets are called \texttt{sgeptim\_small.in},
\texttt{sneptim\_small.in}, \texttt{sseptim\_small.in}, and \texttt{ssvdtim\_small.in}, respectively,
and the large data sets are called \texttt{sgeptim\_large.in}, \texttt{sneptim\_large.in},
\texttt{sseptim\_large.in}, and \texttt{ssvdtim\_large.in}.
Each of the four input files reads a different set of parameters,
and the format of the input is indicated by a 3-character code
on the first line.

The timing program for eigenvalue/singular value routines accumulates
the operation count as the routines are executing using special
instrumented versions of the LAPACK routines.
The first step in
compiling the timing program is therefore to make a library of the
instrumented routines.

\begin{itemize}
\item[a)]
\begin{sloppypar}
To make a library of the instrumented LAPACK routines, first
go to \texttt{LAPACK/TIMING/EIG/EIGSRC} and type \texttt{make} followed
by the data types desired, as in the examples of Section~\ref{toplevelmakefile}.
The library of instrumented code is created in
\texttt{LAPACK/TIMING/EIG/eigsrc.a}.
\end{sloppypar}

\item[b)]
To make the eigensystem timing programs,
go to \texttt{LAPACK/TIMING/EIG} and
type \texttt{make} followed by the data types desired, as in the examples
of Section~\ref{toplevelmakefile}.  The executable files are called
\texttt{xeigtims}, \texttt{xeigtimc}, \texttt{xeigtimd}, and \texttt{xeigtimz}
and are created in \texttt{LAPACK/TIMING}.

\item[c)]
Go to \texttt{LAPACK/TIMING} and
make any necessary modifications to the input files.
You may need to set the minimum time a subroutine will
be timed to a positive value, or to restrict the number of tests
if you are using a computer with performance in between that of a
workstation and that of a supercomputer.
Instead of decreasing the matrix dimensions to reduce the time,
it would be better to reduce the number of matrix types to be timed,
since the performance varies more with the matrix size than with the
type.  For example, for the nonsymmetric eigenvalue routines,
you could use only one matrix of type 4 instead of four matrices of
types 1, 3, 4, and 6.
Refer to LAPACK Working Note 41~\cite{WN41} for further details.
% See Section~\ref{moretiming} for further details.

\item[d)]
Run the programs for each data type you are using.
For the REAL version, the commands for the small data sets are

\begin{list}{}{}
\item{} \texttt{xeigtims < sgeptim\_small.in > sgeptim\_small.out }
\item{} \texttt{xeigtims < sneptim\_small.in > sneptim\_small.out }
\item{} \texttt{xeigtims < sseptim\_small.in > sseptim\_small.out }
\item{} \texttt{xeigtims < ssvdtim\_small.in > ssvdtim\_small.out }
\end{list}
or the commands for the large data sets are
\begin{list}{}{}
\item{} \texttt{xeigtims < sgeptim\_large.in > sgeptim\_large.out }
\item{} \texttt{xeigtims < sneptim\_large.in > sneptim\_large.out }
\item{} \texttt{xeigtims < sseptim\_large.in > sseptim\_large.out }
\item{} \texttt{xeigtims < ssvdtim\_large.in > ssvdtim\_large.out }
\end{list}

\noindent
Similar commands should be used for the other data types.
\end{itemize}

\subsection{Send the Results to Tennessee}\label{sendresults}

Congratulations!  You have now finished installing, testing, and
timing LAPACK.  If you encountered failures in any phase of the
testing or timing process, please
consult our \texttt{release\_notes} file on netlib.
\begin{quote}
\url{http://www.netlib.org/lapack/release\_notes}
\end{quote}
This file contains machine-dependent installation clues which hopefully will
alleviate your difficulties or at least let you know that other users
have had similar difficulties on that machine.  If there is not an entry
for your machine or the suggestions do not fix your problem, please feel
free to contact the authors at
\begin{list}{}{}
\item \href{mailto:lapack@cs.utk.edu}{\texttt{lapack@cs.utk.edu}}.
\end{list}
Tell us the
type of machine on which the tests were run, the version of the operating
system, the compiler and compiler options that were used,
and details of the BLAS library or libraries that you used.
You should
also include a copy of the output file in which the failure occurs.

We would like to keep our \texttt{release\_notes} file as up-to-date as possible.
Therefore, if you do not see an entry for your machine, please contact us
with your testing results.

Comments and suggestions are also welcome.

We encourage you to make the LAPACK library available to your
users and provide us with feedback from their experiences.
%This release of LAPACK is not guaranteed to be compatible
%with any previous test release.

\subsection{Get support}\label{getsupport}
First, take a look at the complete installation manual in the LAPACK Working Note 41~\cite{WN41}.
If you still cannot solve your problem, you have two options:
\begin{itemize}
\item
either send a post in the LAPACK forum
\begin{quote}
\url{http://icl.cs.utk.edu/lapack-forum}
\end{quote}
\item
or send an email to the LAPACK mailing list:
\begin{list}{}{}
\item \href{mailto:lapack@cs.utk.edu}{\texttt{lapack@cs.utk.edu}}.
\end{list}
\end{itemize}
\section*{Acknowledgments}

Ed Anderson and Susan Blackford contributed to previous versions of this report.

\appendix

\chapter{Caveats}\label{appendixd}

In this appendix we list a few of the machine-specific difficulties we
have
encountered in our own experience with LAPACK.  A more detailed list
of machine-dependent problems, bugs, and compiler errors encountered
in the LAPACK installation process is maintained
on \emph{netlib}.
\begin{quote}
\url{http://www.netlib.org/lapack/release\_notes}
\end{quote}

We assume the user has installed the machine-specific routines
correctly and that the Level 1, 2 and 3 BLAS test programs have run
successfully, so we do not list any warnings associated with those
routines.

\section{\texttt{LAPACK/make.inc}}

All machine-specific
parameters are specified in the file \texttt{LAPACK/make.inc}.

The first line of this \texttt{make.inc} file is:
\begin{quote}
SHELL = /bin/sh
\end{quote}
and will need to be modified to \texttt{SHELL = /sbin/sh} if you are
installing LAPACK on an SGI architecture.

\section{ETIME}

On HPPA architectures,
the compiler and linker flag \texttt{+U77} should be included to access
the function \texttt{ETIME}.

\section{ILAENV and IEEE-754 compliance}

%By default, ILAENV (\texttt{LAPACK/SRC/ilaenv.f}) assumes an IEEE and IEEE-754
%compliant architecture, and thus sets (\texttt{ILAENV=1}) for (\texttt{ISPEC=10})
%and (\texttt{ISPEC=11}) settings in ILAENV.
%
%If you are installing LAPACK on a non-IEEE machine, you MUST modify ILAENV,
%as this test inside ILAENV will crash!

As some new routines in LAPACK rely on IEEE-754 compliance,
two settings (\texttt{ISPEC=10} and \texttt{ISPEC=11}) have been added to ILAENV
(\texttt{LAPACK/SRC/ilaenv.f}) to denote IEEE-754 compliance for NaN and
infinity arithmetic, respectively.  By default, ILAENV assumes an IEEE
machine, and does a test for IEEE-754 compliance.  \textbf{NOTE:  If you
are installing LAPACK on a non-IEEE machine, you MUST modify ILAENV,
as this test inside ILAENV will crash!}

Thus, for non-IEEE machines, the user must hard-code the setting of
(\texttt{ILAENV=0}) for (\texttt{ISPEC=10} and \texttt{ISPEC=11}) in the version
of \texttt{LAPACK/SRC/ilaenv.f} to be put in
his library.  For further details, refer to Section~\ref{testieee}.

Be aware
that some IEEE compilers by default do not enforce IEEE-754 compliance, and
a compiler flag must be explicitly set by the user.

On SGIs, for example, you must set the \texttt{-OPT:IEEE\_NaN\_inf=ON} compiler
flag to enable IEEE-754 compliance.

Lastly, the test inside ILAENV to detect IEEE-754 compliance will
result in IEEE exceptions for ``Divide by Zero'' and ``Invalid Operation''.
Thus, if the user is installing on a machine that issues IEEE exception
warning messages (like a Sun SPARCstation), the user can disregard these
messages.  To avoid these messages, the user can hard-code the values
inside ILAENV as explained in section~\ref{testieee}.

\section{Lack of \texttt{/tmp} space}

If \texttt{/tmp} space is small (i.e., less than approximately 16 MB) on your
architecture, you may run out of space
when compiling.  There are a few possible solutions to this problem.
\begin{enumerate}
\item You can ask your system administrator to increase the size of the
\texttt{/tmp} partition.
\item You can change the environment variable \texttt{TMPDIR} to point to
your home directory for temporary space.  E.g.,
\begin{quote}
\texttt{setenv TMPDIR /home/userid/}
\end{quote}
where \texttt{/home/userid/} is the user's home directory.
\item If your archive command has an \texttt{l} option, you can change the
archive command to \texttt{ar crl} so that the
archive command will only place temporary files in the current working
directory rather than in the default temporary directory \texttt{/tmp}.
\end{enumerate}

\section{BLAS}

If you suspect a BLAS-related problem and you are linking
with an optimized version of the BLAS, we would strongly suggest
as a first step that you link to the Fortran~77 version of
the suspected BLAS routine and see if the error has disappeared.

We have included test programs for the Level 1 BLAS.
Users should therefore beware of a common problem in machine-specific
implementations of xNRM2,
the function to compute the 2-norm of a vector.
The Fortran version of xNRM2 avoids underflow or overflow
by scaling intermediate results, but some library versions of xNRM2
are not so careful about scaling.
If xNRM2 is implemented without scaling intermediate results, some of
the LAPACK test ratios may be unusually high, or
a floating point exception may occur in the problems scaled near
underflow or overflow.
The solution to these problems is to link the Fortran version of
xNRM2 with the test program.  \emph{On some CRAY architectures, the Fortran~77
version of xNRM2 should be used.}

\section{Optimization}

If a large number of test failures occur for a specific matrix type
or operation, it could be that there is an optimization problem with
your compiler.  Thus, the user could try reducing the level of
optimization or eliminating optimization entirely for those routines
to see if the failures disappear when the tests are rerun.

%LAPACK is written in Fortran 77.  Prospective users with only a
%Fortran 66 compiler will not be able to use this package.

\section{Compiling testing/timing drivers}

The testing and timing main programs (xCHKAA, xCHKEE, xTIMAA, and
xTIMEE)
allocate large amounts of local variables.  Therefore, it is vitally
important that the user know whether the compiler by default allocates local
variables statically or on the stack.  It is not uncommon for those
compilers which place local variables on the stack to cause a stack
overflow at runtime in the testing or timing process.  The user then
has two options:  increase the stack size, or force all local variables
to be allocated statically.

On HPPA architectures, the
compiler and linker flag \texttt{-K} should be used when compiling these testing
and timing main programs to avoid such a stack overflow.  I.e., set
\texttt{FFLAGS\_DRV = -K} in the \texttt{LAPACK/make.inc} file.

For similar reasons,
on SGI architectures, the compiler and linker flag \texttt{-static} should be
used.  I.e., set \texttt{FFLAGS\_DRV = -static} in the \texttt{LAPACK/make.inc} file.

\section{IEEE arithmetic}

Some of our test matrices are scaled near overflow or underflow,
but on the Crays, problems with the arithmetic near overflow and
underflow forced us to scale by only the square root of overflow
and underflow.
The LAPACK auxiliary routine SLABAD (or DLABAD) is called to
take the square root of underflow and overflow in cases where it
could cause difficulties.
We assume we are on a Cray if $\log_{10} (\mathrm{overflow})$
is greater than 2000
and take the square root of underflow and overflow in this case.
The test in SLABAD is as follows:
\begin{verbatim}
      IF( LOG10( LARGE ).GT.2000. ) THEN
         SMALL = SQRT( SMALL )
         LARGE = SQRT( LARGE )
      END IF
\end{verbatim}
Users of other machines with similar restrictions on the effective
range of usable numbers may have to modify this test so that the
square roots are done on their machine as well.  \emph{Usually on
HPPA architectures, a similar restriction in SLABAD should be enforced
for all testing involving complex arithmetic.}
SLABAD is located in \texttt{LAPACK/SRC}.
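
As an illustration of the kind of modification described above, the
following sketch shows a locally modified copy of SLABAD in which the
test on \texttt{LARGE} has been removed, so that the square roots are
always taken.  This is only one possible approach and is an untested
assumption, not a recommended change; whether it is appropriate
depends on the arithmetic of the machine in question.
\begin{verbatim}
      SUBROUTINE SLABAD( SMALL, LARGE )
*
*     Sketch of a hypothetical local modification:  the test
*     IF( LOG10( LARGE ).GT.2000. ) has been removed so that the
*     square roots are taken on every machine.
*
      REAL               LARGE, SMALL
      INTRINSIC          SQRT
*
      SMALL = SQRT( SMALL )
      LARGE = SQRT( LARGE )
*
      RETURN
      END
\end{verbatim}
DLABAD could be changed in the same way, with \texttt{DOUBLE PRECISION}
arguments.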

For machines which have a narrow exponent range or lack gradual
underflow (DEC VAXes for example), it is not uncommon to experience
failures in sec.out and/or dec.out with SLAQTR/DLAQTR or DTRSYL.
The failures in SLAQTR/DLAQTR and DTRSYL
occur with test problems which are very badly scaled when the norm of
the solution is very close to the underflow
threshold (or even underflows to zero).  We believe that these failures
could probably be avoided by an even greater degree of care in scaling,
but we did not want to delay the release of LAPACK any further.  These
tests pass successfully on most other machines.  An example failure in
dec.out on a MicroVAX II looks like the following:

\begin{verbatim}
Tests of the Nonsymmetric eigenproblem condition estimation routines
DLALN2, DLASY2, DLANV2, DLAEXC, DTRSYL, DTREXC, DTRSNA, DTRSEN, DLAQTR

Relative machine precision (EPS) =     0.277556D-16
Safe minimum (SFMIN)             =     0.587747D-38

Routines pass computational tests if test ratio is less than   20.00

DEC routines passed the tests of the error exits ( 35 tests done)
Error in DTRSYL: RMAX =   0.155D+07
LMAX =     5323 NINFO=    1600 KNT=   27648
Error in DLAQTR: RMAX =   0.344D+04
LMAX =    15792 NINFO=   26720 KNT=   45000
\end{verbatim}

\section{Timing programs}

In the eigensystem timing program, calls are made to the LINPACK
and EISPACK equivalents of the LAPACK routines to allow a direct
comparison of performance measures.
In some cases we have increased the minimum number of
iterations in the LINPACK and EISPACK routines to allow
them to converge for our test problems, but
even this may not be enough.
One goal of the LAPACK project is to improve the convergence
properties of these routines, so error messages in the output
file indicating that a LINPACK or EISPACK routine did not
converge should not be regarded with alarm.

In the eigensystem timing program, we have equivalenced some work
arrays and then passed them to a subroutine, where both arrays are
modified.  This is a violation of the Fortran~77 standard, which
says ``if a subprogram reference causes a dummy argument in the
referenced subprogram to become associated with another dummy
argument in the referenced subprogram, neither dummy argument may
become defined during execution of the subprogram.''
\footnote{ ANSI X3.9-1978, sec. 15.9.3.6}
If this causes any difficulties, the equivalence
can be commented out as explained in the comments for the main
eigensystem timing programs.
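
The pattern described above can be illustrated with a small
free-standing program.  The names \texttt{ALIAS}, \texttt{FILL},
\texttt{WORK1}, and \texttt{WORK2} are hypothetical and are not taken
from the timing programs; the sketch only shows the kind of argument
association that the standard prohibits.
\begin{verbatim}
      PROGRAM ALIAS
*
*     Illustrative sketch only; not part of the LAPACK distribution.
*
      REAL               WORK1( 4 ), WORK2( 4 )
      EQUIVALENCE        ( WORK1, WORK2 )
*
*     Both actual arguments occupy the same storage, so the dummy
*     arguments A and B in FILL become associated with each other.
*
      CALL FILL( 4, WORK1, WORK2 )
      WRITE( *, * ) WORK1
      END
*
      SUBROUTINE FILL( N, A, B )
      INTEGER            N, I
      REAL               A( N ), B( N )
*
*     Defining both A and B here is what the standard prohibits
*     when the two dummy arguments are associated.
*
      DO 10 I = 1, N
         A( I ) = 1.0
         B( I ) = 2.0
   10 CONTINUE
      RETURN
      END
\end{verbatim}
Because \texttt{WORK1} and \texttt{WORK2} share storage, the values
printed may depend on the compiler.  Commenting out the
\texttt{EQUIVALENCE} statement gives the two arrays separate storage
and removes the violation.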

%\section*{MACHINE-SPECIFIC DIFFICULTIES}
%Some IBM compilers do not recognize DBLE as a generic function as used
%in LAPACK.  The software tools we use to convert from single precision
%to double precision convert REAL(C) and AIMAG(C), where C is COMPLEX,
%to DBLE(Z) and DIMAG(Z), where Z is COMPLEX*16, but
%IBM compilers use DREAL(Z) and DIMAG(Z) to take the real and
%imaginary parts of a double complex number.
%IBM users can fix this problem by changing DBLE to DREAL when the
%argument of DBLE is COMPLEX*16.
%
%IBM compilers do not permit the data type COMPLEX*16 in a FUNCTION
%subprogram definition.  The data type on the first line of the
%function subprogram must be changed from COMPLEX*16 to DOUBLE COMPLEX
%for the following functions:
%
%\begin{tabbing}
%\dent ZLATMOO \= from the test matrix generator library \kill
%\dent ZBEG \> from the Level 2 BLAS test program  \\
%\dent ZBEG \> from the Level 3 BLAS test program  \\
%\dent ZLADIV \> from the LAPACK library \\
%\dent ZLARND \> from the test matrix generator library \\
%\dent ZLATM2 \> from the test matrix generator library \\
%\dent ZLATM3 \> from the test matrix generator library
%\end{tabbing}
%The functions ZDOTC and ZDOTU from the Level 1 BLAS are already
%declared DOUBLE COMPLEX.  If that doesn't work, try the declaration
%COMPLEX FUNCTION*16.


\newpage

\begin{thebibliography}{9}

\bibitem{LUG}
E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra,
J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney,
S. Ostrouchov, and D. Sorensen,
\textit{LAPACK Users' Guide}, Second Edition,

\bibitem{WN16}
E. Anderson and J. Dongarra,
\textit{LAPACK Working Note 16:
Results from the Initial Release of LAPACK},
University of Tennessee, CS-89-89, November 1989.

\bibitem{WN41}
E. Anderson, J. Dongarra, and S. Ostrouchov,
\textit{LAPACK Working Note 41:
Installation Guide for LAPACK},
University of Tennessee, CS-92-151, February 1992 (revised June 1999).

\bibitem{WN5}
C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum,
S. Hammarling, and D. Sorensen,
\textit{LAPACK Working Note \#5:  Provisional Contents},
Argonne National Laboratory, ANL-88-38, September 1988.

\bibitem{WN13}
Z. Bai, J. Demmel, and A. McKenney,
\textit{LAPACK Working Note \#13: On the Conditioning of the Nonsymmetric
Eigenvalue Problem:  Theory and Software},
University of Tennessee, CS-89-86, October 1989.

\bibitem{XBLAS}
X. S. Li, J. W. Demmel, D. H. Bailey, G. Henry, Y. Hida, J. Iskandar,
W. Kahan, S. Y. Kang, A. Kapur, M. C. Martin, B. J. Thompson, T. Tung,
and D. J. Yoo, \textit{Design, implementation and testing of extended
  and mixed precision BLAS},
\textit{ACM Trans. Math. Soft.}, 28, 2:152--205, June 2002.

\bibitem{BLAS3}
J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling,
``A Set of Level 3 Basic Linear Algebra Subprograms,''
\textit{ACM Trans. Math. Soft.}, 16, 1:1--17, March 1990.
%Argonne National Laboratory, ANL-MCS-P88-1, August 1988.

\bibitem{BLAS3-test}
J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling,
``A Set of Level 3 Basic Linear Algebra Subprograms:
Model Implementation and Test Programs,''
\textit{ACM Trans. Math. Soft.}, 16, 1:18--28, March 1990.
%Argonne National Laboratory, ANL-MCS-TM-119, June 1988.

\bibitem{BLAS2}
J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson,
``An Extended Set of Fortran Basic Linear Algebra Subprograms,''
\textit{ACM Trans. Math. Soft.}, 14, 1:1--17, March 1988.

\bibitem{BLAS2-test}
J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson,
``An Extended Set of Fortran Basic Linear Algebra Subprograms:
Model Implementation and Test Programs,''
\textit{ACM Trans. Math. Soft.}, 14, 1:18--32, March 1988.

\bibitem{BLAS1}
C. L. Lawson, R. J. Hanson, D. R. Kincaid, and F. T. Krogh,
``Basic Linear Algebra Subprograms for Fortran Usage,''
\textit{ACM Trans. Math. Soft.}, 5, 3:308--323, September 1979.

\end{thebibliography}

\end{document}