% GitHub Repository: ElmerCSC/elmerfem
% Path: blob/devel/fhutiter/doc/hutidoc.tex
%
% Documentation for the HUT-Iter library
%

\documentclass[11pt,a4paper,english,oneside]{report}
\usepackage[us]{datetime}
\usepackage[latin1]{inputenc}
\usepackage{float,graphicx,t1enc}

\title{HUTI - HUT Iter Library, User's Guide}

\author{Jouni Malinen\\
CSC - IT Center for Science Ltd.\\
P.O.BOX 405, FIN-02101 Espoo, Finland\\
Jouni.Malinen@csc.fi}

\date{\formatdate{01}{08}{1997}\\
Version 1.0
}

% ------------ Here begins the actual document
% Standard stuff

\begin{document}

% Miscellaneous settings

\setcounter{secnumdepth}{4}
\setcounter{tocdepth}{4}

\pagenumbering{roman}
\pagestyle{plain}

\maketitle

\tableofcontents
% \listoffigures
\listoftables

% ------------ The actual text begins here

% ------------------------------------------------------------------------
% ------------------------------------------------------------------------

\chapter{Introduction}
\label{ch:intro}
\pagenumbering{arabic}
\pagestyle{headings}

\section{General}

Many computational problems require the solution of linear systems of
equations $Ax = b$, where $A$ is the coefficient matrix, $b$ is the
{\em right-hand side\/} and $x$ is the solution.

There are several methods for solving linear systems of equations. These
can be divided into two classes: direct and iterative methods. In
general, direct methods are preferred because of their predictable
behaviour and robustness. However, the need to solve very large linear
systems in a reasonable time and with limited resources has been one of
the key reasons for the development of iterative solvers.
We now know more about the behaviour and suitability of these methods
and are able to use them in different kinds of applications.

HUTI is an effort to provide an efficient and well-structured library
containing a collection of iterative methods. The methods implemented
in the library are:

\begin{itemize}
\item Conjugate Gradient (CG) \cite{Bar93}
\item Conjugate Gradient Squared (CGS) \cite{Bar93}
\item Bi-Conjugate Gradient Stabilized (Bi-CGSTAB) \cite{Bar93}
\item Bi-Conjugate Gradient Stabilized (2) (Bi-CGSTAB(2))
\item Quasi-Minimal Residual (QMR) \cite{Bar93,Fre91,Fre94,Buc96}
\item Transpose-Free Quasi-Minimal Residual (TFQMR) \cite{Fre93b}
\item Generalized Minimum Residual (GMRES) \cite{Bar93,Saa96}
\end{itemize}

This library supports both serial and parallel execution. The
parallelisation targets a distributed memory environment and uses
message passing for communication between processes. The user has the
same interface to the library in both execution models; the model can
be selected with special library routines or via environment variables.

\section{Why HUTI?}

There are already several implementations of various iterative
methods, both as libraries and as ``plain code''
\cite{Cun95,Bal95,Saa95,Fre96}.
HUTI differs from these implementations in that it has been
specially tuned for the parallel architecture it runs on and is
not meant to be a general-purpose code.

Another reason for writing this library is the author's master's
thesis. HUTI has also been incorporated into a software package called
ELMER, which in turn is developed for
VIRKE\footnote{VIRtauslaskentaohjelmiston KEhitt\"{a}minen}, a project
funded by TEKES\footnote{Technology Development Center in Finland}.

The name HUTI comes from Helsinki University of Technology (HUT) and
Iterative solvers.

% ------------------------------------------------------------------------
% ------------------------------------------------------------------------

\chapter{Iterative Methods}
\label{ch:methods}

This chapter presents the characteristics of the different iterative
methods and their suitability for different problem areas.

\section{Overview of the Methods}
\section{Preconditioning}
\section{Stopping Criteria}
\section{Convergence}
\section{Parallelism}

% ------------------------------------------------------------------------
% ------------------------------------------------------------------------

\chapter{Using HUTI}
\label{ch:using}

\section{Naming Conventions}

All HUTI routine names and variables start with the

\begin{minipage}{1in}
\begin{center}
\bigskip
{\ttfamily huti\_} \\
or \\
{\ttfamily HUTI\_}
\bigskip
\end{center}
\end{minipage}

\noindent prefix. In the routine names the precision is denoted by an
appropriate character: {\ttfamily s} for {\em single precision},
{\ttfamily d} for {\em double precision}, {\ttfamily c} for {\em complex}
and {\ttfamily z} for {\em double complex}.

% ------------------------------------------------------------------------

\section{Driver Routines}

The key idea in HUTI is that all iterator routines have the same calling
convention regardless of the selected method.
All matrix-related operations are done externally to the iterator
library, so the solver does not need to know the exact matrix structure.
The matrix can be stored, for example, in the well-known Compressed Row
Storage (CRS) or Compressed Column Storage (CCS) formats. This eases
the optimization of memory usage in each particular case.

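For concreteness, the CRS format mentioned above can be sketched as
follows. The array names and the example matrix are ours, for
illustration only; they are not part of the HUTI interface.

```c
#include <assert.h>

/* CRS (Compressed Row Storage) of the 3x3 sparse matrix
 *       [ 4  0  1 ]
 *   A = [ 0  3  0 ]
 *       [ 2  0  5 ]
 * Only the nonzeros are stored, row by row, in `val`; `col_ind` gives
 * the column of each stored entry, and row i occupies the index range
 * row_ptr[i] .. row_ptr[i+1]-1 of val/col_ind.
 */
static const double val[]     = { 4.0, 1.0, 3.0, 2.0, 5.0 };
static const int    col_ind[] = { 0,   2,   1,   0,   2   };
static const int    row_ptr[] = { 0, 2, 3, 5 };   /* length n+1 */
```

A CCS representation is analogous, with the roles of rows and columns
interchanged.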
In a parallel setting it is the user's responsibility to define the
storage convention for the distribution of matrices and vectors.
Well-known distribution concepts include block-cyclic decomposition
and domain-based decompositions; more information can be found in
\cite{Kum94,Saa96}. See also Chapter \ref{ch:examples} for an example
of a user-supplied distribution of data.

Solver routines are called in the following way:

\noindent
\begin{tabbing}
{\ttfamily CALL HUTI\_$*$\_{\em SOLVER\_TYPE}} {\ttfamily (} \=
{\ttfamily X, RHS, IPAR, DPAR, WORK, MATVEC,} \\
\> {\ttfamily PCONDL, PCONDR, DOTPROD, NORM,} \\
\> {\ttfamily STOPC)} \\
\end{tabbing}

\noindent
\begin{tabular*}{\textwidth}{lll}
where & $*$ & is either {\ttfamily S, D, C} or {\ttfamily Z}
depending on the precision. \\
& {\ttfamily {\em SOLVER\_TYPE}} & is either {\ttfamily CG, CGS, BICGSTAB,
BICGSTAB\_2, QMR, TFQMR} or {\ttfamily GMRES} \\
& & depending on the method. \\
\end{tabular*}

Table \ref{table:solver-param} describes the parameters for the solver
routines.

\begin{table}[H]
\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Argument} & {\bfseries Type} & {\bfseries Description} \\
\hline
X & vector of & Vector $x$, the current iterate \\
& type $*$ & \\
RHS & vector of & $b$, the right-hand side \\
& type $*$ & \\
IPAR & vector of type & IPAR-structure, see section \ref{sec:ipars} \\
& integer & \\
DPAR & vector of type & DPAR-structure, see section \ref{sec:dpars} \\
& double prec. & \\
WORK & matrix of & User allocated working array, size varies \\
& type $*$ & depending on the method, see table \ref{table:ipar-input} \\
MATVEC & subroutine & User supplied external routine, \\
& & must perform the matrix-vector product \\
PCONDL & subroutine & User supplied routine for left side \\
& & preconditioning \\
PCONDR & subroutine & User supplied routine for right side \\
& & preconditioning \\
DOTPROD & function & User supplied routine to perform the dot \\
& & product \\
NORM & function & User supplied routine returning the norm \\
& & of a vector \\
STOPC & function & User supplied routine to perform stopping \\
& & criterion testing \\
\hline\hline
\end{tabular*}
\caption{Parameters for the solver routines}
\label{table:solver-param}
\end{table}

The external routine {\ttfamily MATVEC} is the only routine that must be
supplied when calling a solver; it performs the matrix-vector product.
Using zeros in place of the other external routine names forces the
library to use default routines applicable to the selected execution
model. For example, the {\em double complex} Conjugate Gradient method
could be called from a Fortran program in the following way:

\medskip
\noindent
{\ttfamily CALL HUTI\_Z\_CG (X, RHS, IPAR, DPAR, WORK, MATVEC, 0, 0, 0, 0, 0)}
\medskip

\noindent
where {\ttfamily X, RHS, IPAR, DPAR} are user supplied vectors and
{\ttfamily WORK} is the user allocated work space (an array) for the
iterator. In this case the library would use BLAS-1 calls for
{\ttfamily DOTPROD} and {\ttfamily NORM} if executed in serial mode,
and no preconditioning would be applied. The {\ttfamily IPAR} and
{\ttfamily DPAR} structures must contain user supplied information
about the dimensions of the vectors and the work array, as well as
certain control information for the iterators.

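As an illustration of the control information involved, the following C
sketch fills {\ttfamily IPAR} and {\ttfamily DPAR} for a double
precision CG run according to the tables of section \ref{sec:ipars}.
The mapping of the 1-based table elements to 0-based C indices, and the
function name, are our assumptions for illustration; real code should
use the named definitions from {\ttfamily huti\_defs.h} instead of raw
indices.

```c
#include <assert.h>

#define IPAR_LEN 50   /* IPAR element 1: length of the IPAR structure */
#define DPAR_LEN 10   /* IPAR element 2: length of the DPAR structure */

/* Fill IPAR/DPAR for a double precision CG run on an n-by-n system,
 * following the IPAR/DPAR tables: element k of the tables is index
 * k-1 here. */
static void setup_cg_params(int ipar[IPAR_LEN], double dpar[DPAR_LEN],
                            int n)
{
    ipar[0]  = IPAR_LEN;  /* element 1: length of IPAR           */
    ipar[1]  = DPAR_LEN;  /* element 2: length of DPAR           */
    ipar[2]  = n;         /* element 3: leading dimension        */
    ipar[3]  = 4;         /* element 4: CG needs 4 work vectors  */
    ipar[9]  = 5000;      /* element 10: maximum iterations      */
    ipar[11] = 0;         /* element 12: criterion ||r_n|| < eps */
    ipar[12] = 0;         /* element 13: no preconditioning      */
    ipar[13] = 0;         /* element 14: random initial x_0      */
    dpar[0]  = 1e-6;      /* element 1: tolerance eps            */
}
```

The {\ttfamily WORK} array for CG would then be an $n \times 4$ array
of the same precision as the solution vector.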
\section{External Routines}

This section describes the external routines that can be given as
arguments to the solver routine. Only the {\ttfamily MATVEC} routine is
required; the other routines are optional.

These routines are called from the solver; the types and order of the
arguments are presented for each routine below.

The matrix $A$ can be stored in any format, because it is entirely the
user's responsibility to make it available to the external routines.

The {\ttfamily IPAR} structure is passed to some of the external routines
and is used to carry certain control variables from the solver routine.
For example, the {\ttfamily IPAR} structure contains the assumed form of
the matrix in an external operation. This applies to both the
matrix-vector operation $Au = v$ and the preconditioning operations
$M_{1}^{-1}u = v$ and $M_{2}^{-1}u = v$.

\subsection{Matrix-Vector Operation}

The arguments for the external matrix-vector operation {\ttfamily MATVEC}
are given in Table \ref{table:matvec-param}. This routine should perform
the matrix-vector product. In the {\ttfamily IPAR} structure the iterator
provides information about the matrix form, i.e.\ whether it should be
transposed or not. Only non-transposed forms are used in the CG, CGS,
Bi-CGSTAB, TFQMR and GMRES methods; only QMR needs a transposed
matrix-vector product, that is $A^{T}u = v$.

The calling convention for {\ttfamily MATVEC} is:

\bigskip
\noindent
{\ttfamily SUBROUTINE MATVEC ( U, V, IPAR )}
\bigskip

\begin{table}[H]
\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Argument} & {\bfseries Type} & {\bfseries Description} \\
\hline
U & vector of & Vector $u$ in $Au = v$ \\
& type $*$ & \\
V & vector of & Vector $v$ in $Au = v$ \\
& type $*$ & \\
IPAR & vector of type & IPAR-structure, see section \ref{sec:ipars} \\
& integer & \\
\hline\hline
\end{tabular*}
\caption{Parameters for the external MATVEC subroutine}
\label{table:matvec-param}
\end{table}

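To make the division of labour concrete, here is a minimal C sketch of
a {\ttfamily MATVEC}-style routine for a matrix held in CRS format. The
file-scope matrix data and the serial, non-transposed treatment (the
{\ttfamily IPAR} transpose flag is ignored) are simplifications of
ours, not HUTI requirements.

```c
#include <assert.h>

static const double *crs_val;  /* stored nonzeros, row by row          */
static const int *crs_col;     /* column index of each nonzero         */
static const int *crs_rowptr;  /* row i spans rowptr[i]..rowptr[i+1]-1 */
static int crs_n;              /* number of rows                       */

/* A MATVEC-style routine computing v = A*u for a CRS matrix. The
 * solver never sees this storage; it only calls the routine. */
static void matvec_crs(const double *u, double *v, const int *ipar)
{
    (void)ipar;  /* transpose flag ignored in this sketch */
    for (int i = 0; i < crs_n; i++) {
        double sum = 0.0;
        for (int k = crs_rowptr[i]; k < crs_rowptr[i + 1]; k++)
            sum += crs_val[k] * u[crs_col[k]];
        v[i] = sum;
    }
}
```

With this convention the same solver call works for any storage
scheme; only {\ttfamily MATVEC} changes.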
\subsection{Preconditioning}

The routines {\ttfamily PCONDL} and {\ttfamily PCONDR} should solve
$M_{1}u = v$ and $M_{2}u = v$, respectively, if the preconditioning
matrix is split into two parts. If only one preconditioning matrix $M$ is
available, the {\ttfamily PCONDL} routine should solve $Mu = v$ and
{\ttfamily PCONDR} should not be supplied to the solver (the argument must
be zero).

The arguments for the external preconditioning operations
{\ttfamily PCONDL} and {\ttfamily PCONDR} are given in
Table \ref{table:pcond-param}. The preconditioning routines should use
the information in the {\ttfamily IPAR} structure to apply a transposed
or non-transposed solve when needed. Only the QMR method needs the
$M^{-T}u = v$ operation.

The calling convention for {\ttfamily PCONDL} is

\bigskip
\noindent
{\ttfamily SUBROUTINE PCONDL ( U, V, IPAR )}
\bigskip

\noindent
and for {\ttfamily PCONDR}

\bigskip
\noindent
{\ttfamily SUBROUTINE PCONDR ( U, V, IPAR )}
\bigskip

\begin{table}[H]
\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Argument} & {\bfseries Type} & {\bfseries Description} \\
\hline
U & vector of & Vector $u$ in $Mu = v$ \\
& type $*$ & \\
V & vector of & Vector $v$ in $Mu = v$ \\
& type $*$ & \\
IPAR & vector of type & IPAR-structure, see section \ref{sec:ipars} \\
& integer & \\
\hline\hline
\end{tabular*}
\caption{Parameters for the external PCONDL and PCONDR subroutines}
\label{table:pcond-param}
\end{table}

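As a sketch of what a {\ttfamily PCONDL}-style routine can look like,
the following C fragment applies a Jacobi (diagonal) preconditioner,
i.e.\ it solves $Mu = v$ with $M = \mathrm{diag}(A)$. The choice of
Jacobi and the file-scope diagonal data are our illustration; HUTI
accepts any user supplied preconditioner.

```c
#include <assert.h>

static const double *jac_diag;  /* diagonal of A, set up by the user */
static int jac_n;               /* vector length                     */

/* A PCONDL-style routine applying a Jacobi preconditioner: it solves
 * M u = v with M = diag(A), i.e. u_i = v_i / a_ii. The IPAR transpose
 * flag can be ignored here, since a diagonal matrix equals its
 * transpose. */
static void pcondl_jacobi(double *u, const double *v, const int *ipar)
{
    (void)ipar;
    for (int i = 0; i < jac_n; i++)
        u[i] = v[i] / jac_diag[i];
}
```

With symmetric splitting the same idea applies to {\ttfamily PCONDR}
and $M_{2}$.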
\subsection{Global Dot Product}

The external function {\ttfamily DOTPROD} performs the global dot
product of two given vectors. In the serial case this routine is by
default the corresponding BLAS-1 routine. In the parallel case this is
the place to compute the global product, for example by using the
{\ttfamily MPI\_ALLREDUCE} function to sum up the local products
computed with a BLAS-1 routine.

The calling convention for the function {\ttfamily DOTPROD} is

\bigskip
\noindent
{\ttfamily FUNCTION DOTPROD ( NDIM, X, INCX, Y, INCY )}
\bigskip

\begin{table}[H]
\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Argument} & {\bfseries Type} & {\bfseries Description} \\
\hline
NDIM & integer & Dimension of vectors X and Y \\
X & vector of & Vector $x$ in $x \cdot y$ \\
& type $*$ & \\
INCX & integer & The increment for the elements of X \\
Y & vector of & Vector $y$ in $x \cdot y$ \\
& type $*$ & \\
INCY & integer & The increment for the elements of Y \\
\hline\hline
\end{tabular*}
\caption{Parameters for the external DOTPROD function}
\label{table:dotprod-param}
\end{table}

The function {\ttfamily DOTPROD} must return a value of the same type as
the argument vectors.

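A C sketch of a {\ttfamily DOTPROD}-style function with the argument
list above might look as follows; it mirrors the serial BLAS-1 default
(negative increments are not handled in this simplification of ours).
In a parallel run, the per-process partial sums would additionally be
combined, e.g.\ with {\ttfamily MPI\_Allreduce}.

```c
#include <assert.h>

/* A DOTPROD-style function: the dot product of x and y with strides
 * INCX and INCY, as the serial BLAS-1 default would compute it. A
 * parallel version would return the globally reduced sum of these
 * per-process partial results instead. */
static double dotprod(int ndim, const double *x, int incx,
                      const double *y, int incy)
{
    double sum = 0.0;
    for (int i = 0; i < ndim; i++)
        sum += x[i * incx] * y[i * incy];
    return sum;
}
```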
\subsection{Global Vector Norm}

The external routine {\ttfamily NORM} is used to produce the global
vector norm, usually the vector 2-norm $\|x\|_{2}$. In the serial case
this routine is by default the corresponding BLAS-1 routine. In the
parallel case this is very similar to the {\ttfamily DOTPROD} function.

The calling convention for the function {\ttfamily NORM} is

\bigskip
\noindent
{\ttfamily FUNCTION NORM ( NDIM, X, INCX )}
\bigskip

\begin{table}[H]
\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Argument} & {\bfseries Type} & {\bfseries Description} \\
\hline
NDIM & integer & Dimension of vector X \\
X & vector of & Vector $x$ in $\|x\|$ \\
& type $*$ & \\
INCX & integer & The increment for the elements of X \\
\hline\hline
\end{tabular*}
\caption{Parameters for the external NORM function}
\label{table:norm-param}
\end{table}

The function {\ttfamily NORM} must return a value that is real if X is
single precision ({\em real or complex}) and double precision if X is
double precision ({\em double precision or double complex}).

\subsection{Stopping Criterion}

The stopping criterion can be selected from the built-in stopping
criteria or it can be supplied by the user. The built-in alternatives
are listed in Table \ref{table:ipar-input}.

The calling convention for the user supplied function {\ttfamily STOPC} is

\bigskip
\noindent
{\ttfamily FUNCTION STOPC ( X, B, R, IPAR, DPAR )}
\bigskip

\begin{table}[H]
\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Argument} & {\bfseries Type} & {\bfseries Description} \\
\hline
X & vector of & Current iterate $x_{n}$ \\
& type $*$ & \\
B & vector of & The original right-hand side \\
& type $*$ & \\
R & vector of & Current residual vector $r_{n}$ \\
& type $*$ & \\
IPAR & vector of type & IPAR-structure, see section \ref{sec:ipars} \\
& integer & \\
DPAR & vector of type & DPAR-structure, see section \ref{sec:dpars} \\
& double precision & \\
\hline\hline
\end{tabular*}
\caption{Parameters for the external STOPC function}
\label{table:stopc-param}
\end{table}

The function {\ttfamily STOPC} must return a value of the same type as
the {\ttfamily NORM} function for the selected precision; see the
previous section.

The returned value should describe how close the current iterate is to
convergence. It will be tested against the user supplied tolerance and
printed if requested.

% ------------------------------------------------------------------------

\section{Iteration Parameters}

\subsection{IPAR Structure}
\label{sec:ipars}

The {\ttfamily IPAR} structure is used to control the progress and
behaviour of the iterator routine and to get status information back
from it. {\ttfamily IPAR} is also passed on to some of the user
supplied routines.

The input parameters are described in Table \ref{table:ipar-input} along
with their default values; the output parameters are in Table
\ref{table:ipar-output}.

A more detailed description of the various parameters and output values
for each solver is given on the corresponding reference pages.

\begin{table}[H]

\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Element} & {\bfseries Description} & {\bfseries Default} \\
\hline\hline
& {\em General parameters} & \\
\hline
1 & Length of the IPAR structure & 50 \\
2 & Length of the DPAR structure & 10 \\
3 & Leading dimension of the matrix (and vectors) & \\
4 & Number of vectors in the {\ttfamily WORK} array: & \\
& CG: 4 & \\
& CGS: 7 & \\
& Bi-CGSTAB: 8 & \\
& Bi-CGSTAB\_2: 8 & \\
& QMR: 14 & \\
& TFQMR: 10 & \\
& GMRES: 7 + number of restart vectors & \\
5 & Number of iterations between debug output & 0 \\
6 & Assumed matrix type in external operations & \\
& 0: Matrix must {\em not} be transposed & \\
& 1: Matrix must be transposed & \\
\hline
& {\em Iteration parameters} & \\
\hline
10 & Maximum number of iterations allowed & 5000 \\
12 & Stopping criterion used: & 0 \\
& ($\epsilon$ is the tolerance given by the user, see table \ref{table:dpar}) & \\
& 0: $\|r_{n}\| < \epsilon$ & \\
& 1: $\|r_{n}\| < \epsilon \|b\|$ & \\
& 2: $\|z_{n}\| < \epsilon$ & \\
& 3: $\|z_{n}\| < \epsilon \|b\|$ & \\
& 4: $\|z_{n}\| < \epsilon \|M^{-1}b\|$ & \\
& 5: $\|x_{n} - x_{n-1}\| < \epsilon$ & \\
& 6: {\em upper bound} $< \epsilon$ (only with TFQMR) & \\
& 10: Use the user supplied routine {\ttfamily STOPC} & \\
13 & Preconditioning technique used: & 0 \\
& 0: None & \\
& 1: Right preconditioning & \\
& 2: Left preconditioning & \\
& 3: Symmetric preconditioning & \\
14 & Initial $x_{0}$, starting vector: & 0 \\
& 0: Random $x_{0}$ & \\
& 1: User supplied $x_{0}$, vector in {\ttfamily XVEC} & \\
15 & Number of restart vectors in GMRES(m) & 1 \\
\hline
& {\em Parallel environment parameters} & \\
\hline
20 & Processor identification number for specific process & \\
21 & Number of processors & 1 \\
\hline\hline
\end{tabular*}

\caption{IPAR-structure, input parameters}
\label{table:ipar-input}
\end{table}

\begin{table}[H]

\begin{tabular*}{\textwidth}{ll}
\hline\hline
{\bfseries Element} & {\bfseries Description} \\
\hline\hline
& {\em General parameters} \\
\hline
30 & Status information: \\
& 0: No change \\
& 1: Iteration converged \\
& 2: Maximum number of iterations reached \\
& 10: QMR breakdown in $\rho$ or $\psi$ \\
& 11: QMR breakdown in $\delta$ \\
& 12: QMR breakdown in $\epsilon$ \\
& 13: QMR breakdown in $\beta$ \\
& 14: QMR breakdown in $\gamma$ \\
& 20: CG breakdown in $\rho$ \\
& 25: CGS breakdown in $\rho$ \\
& 30: TFQMR breakdown in $\rho$ \\
& 35: Bi-CGSTAB breakdown in $\rho$ \\
& 36: Bi-CGSTAB breakdown in $\|s\|$ \\
& 37: Bi-CGSTAB breakdown in $\omega$ \\
31 & Number of iterations performed \\
\hline\hline
\end{tabular*}

\caption{IPAR-structure, output parameters}
\label{table:ipar-output}
\end{table}

\subsection{DPAR Structure}
\label{sec:dpars}

For parameters of type {\em double precision} there is a structure
called {\ttfamily DPAR}. Table \ref{table:dpar} describes the
elements of this structure.

\begin{table}[h]
\begin{tabular*}{\textwidth}{lll}
\hline\hline
{\bfseries Element} & {\bfseries Description} & {\bfseries Default} \\
\hline\hline
& {\em General parameters} & \\
\hline
1 & Tolerance used by the stopping criterion & $10^{-6}$ \\
\hline\hline
\end{tabular*}
\caption{DPAR-structure}
\label{table:dpar}
\end{table}

% ------------------------------------------------------------------------

\section{Header Files}
\subsection{{\ttfamily huti\_fdefs.h} and {\ttfamily huti\_defs.h}}

There are header files in preprocessor format for both the Fortran90
and C languages. These header files include definitions for all of the
variables described in
Tables \ref{table:ipar-input}, \ref{table:ipar-output} and
\ref{table:dpar}. There are also definitions for the possible flags of
certain variables and for the default values.

The user should use the named definitions by including the header file
via {\ttfamily \#include ``huti\_defs.h''} for C defines and
{\ttfamily \#include ``huti\_fdefs.h''} for Fortran90 defines. In this
way compatibility with later versions of the library is also
guaranteed.

% ------------------------------------------------------------------------
% ------------------------------------------------------------------------

\chapter{Examples}
\label{ch:examples}

% ------------ End of the main text

% ------------ Bibliography

\begin{thebibliography}{1}

\bibitem{Gol89}
Gene H. Golub and Charles F. van Loan, {\em Matrix Computations},
second edition, The Johns Hopkins University Press, 1993.

\bibitem{Gei93}
Al Geist et al., {\em PVM 3 User's Guide and Reference Manual},
Oak Ridge National Laboratory, Oak Ridge, Tennessee, May 1993.

\bibitem{Bar93}
Richard Barrett et al., {\em Templates for the Solution of Linear
Systems: Building Blocks for Iterative Methods}, SIAM, 1993.

\bibitem{Fre91}
Roland W. Freund and No\"el Nachtigal, {\em QMR: a Quasi-Minimal
Residual Method for Non-Hermitian Linear Systems}, Numer. Math.
60, 315-339, 1991.

\bibitem{Fre93a}
Roland W. Freund, {\em An Implementation of the Look-Ahead
Lanczos Algorithm for Non-Hermitian Matrices},
SIAM J. Sci. Comput., Vol. 14, No. 1, pp. 137-158, January 1993.

\bibitem{Fre93b}
Roland W. Freund, {\em A Transpose-Free Quasi-Minimal Residual
Algorithm for Non-Hermitian Linear Systems},
SIAM J. Sci. Comput., Vol. 14, No. 2, pp. 470-482, March 1993.

\bibitem{Fre94}
Roland W. Freund and No\"el Nachtigal, {\em An Implementation of the
QMR Method Based on Coupled Two-Term Recurrences},
SIAM J. Sci. Comput., Vol. 15, No. 2, pp. 313-337, March 1994.

\bibitem{Mpi94}
{\em MPI: A Message-Passing Interface Standard}, Message
Passing Interface Forum, April 1994.

\bibitem{Gro94}
William Gropp, Ewing Lusk and Anthony Skjellum, {\em Using MPI:
Portable Parallel Programming with the Message-Passing Interface},
The MIT Press, 1994.

\bibitem{Cun95}
Rudnei Dias da Cunha and Tim Hopkins, {\em PIM 2.0, The Parallel
Iterative Methods Package for Systems of Linear Equations, User's
Guide}, ftp://unix.hensa.ac.uk/pub/misc/netlib/pim/ug20.ps.gz, 1995.

\bibitem{Buc96}
H. Martin B\"{u}cker and Manfred Sauren, {\em A Parallel Version
of the Unsymmetric Lanczos Algorithm and its Application to QMR},
Forschungszentrum J\"{u}lich, March 1996.

\bibitem{Saa96}
Yousef Saad, {\em Iterative Methods for Sparse Linear Systems},
PWS Publishing Company, 1996.

\bibitem{Kor95}
Samuel Kortas and Philippe Angot, {\em A Practical and Portable
Model of Programming for Iterative Solvers on Distributed Memory
Machines}, Parallel Computing, Vol. 22, No. 4, June 1996.

\bibitem{Jon95}
Mark T. Jones and Paul E. Plassman, {\em BlockSolve95 Users Manual:
Scalable Library Software for the Parallel Solution of Sparse
Linear Systems}, Argonne National Laboratory ANL-95/48,
December 1995.

\bibitem{Bal95}
S. Balay, W. Gropp, L. C. McInnes and B. Smith, {\em PETSc 2.0
Users Manual}, Argonne National Laboratory ANL-95/11, 1995.

\bibitem{Saa95}
Yousef Saad and Andrei V. Malevsky, {\em P-SPARSLIB: A Portable
Library of Distributed Memory Sparse Iterative Solvers},
University of Minnesota, Department of Computer Science, May 1995.

\bibitem{Kum94}
Vipin Kumar, Ananth Grama, Anshul Gupta and George Karypis,
{\em Introduction to Parallel Computing: Design and Analysis
of Algorithms}, The Benjamin/Cummings Publishing Company Inc., 1994.

\bibitem{Fre96}
Roland W. Freund and No\"el Nachtigal, {\em QMRPACK: A Package
of QMR Algorithms}, ACM Transactions on Mathematical Software,
Vol. 22, No. 1, pp. 46-77, March 1996.

\end{thebibliography}

\label{page:last}
\end{document}