
Method of Conjugate Gradients
Dragica Vasileska
Conjugate Gradient (CG) Method
  M. R. Hestenes & E. Stiefel, 1952
Biconjugate Gradient (BiCG) Method
  C. Lanczos, 1952
  D. A. H. Jacobs, 1981
  C. F. Smith et al., 1990
  R. Barrett et al., 1994
Conjugate Gradient Squared (CGS) Method (MATLAB function)
  P. Sonneveld, 1989
Generalized Minimal Residual (GMRES) Method
  Y. Saad & M. H. Schultz, 1986
  R. Barrett et al., 1994
  Y. Saad, 1996
Quasi-Minimal Residual (QMR) Method
  R. Freund & N. Nachtigal, 1990
  N. Nachtigal, 1991
  R. Barrett et al., 1994
  Y. Saad, 1996
Conjugate Gradient Method
1. Introduction, Notation and Basic Terms
2. Eigenvalues and Eigenvectors
3. The Method of Steepest Descent
4. Convergence Analysis of the Method of Steepest Descent
5. The Method of Conjugate Directions
6. The Method of Conjugate Gradients
7. Convergence Analysis of Conjugate Gradient Method
8. Complexity of the Conjugate Gradient Method
9. Preconditioning Techniques
10. Conjugate Gradient Type Algorithms for Nonsymmetric Matrices (CGS, Bi-CGSTAB Method)
1. Introduction, Notation and Basic Terms
The CG method is one of the most popular iterative methods for solving large systems of linear equations
Ax = b
which arise in many important settings, such as finite difference and finite element methods for solving partial differential equations, circuit analysis, etc.
It is suited for use with sparse matrices. If A is dense, the best choice is to factor A and solve the equation by back-substitution.
There is a fundamental underlying structure for almost all descent algorithms: (1) one starts with an initial point; (2) determines, according to a fixed rule, a direction of movement; (3) moves in that direction to a relative minimum of the objective function; (4) at the new point, a new direction is determined and the process is repeated. The difference between algorithms lies in the rule by which successive directions of movement are selected.
1. Introduction, Notation and Basic Terms (Contd)
A matrix is a rectangular array of numbers, called elements.
The transpose of an m×n matrix A is the n×m matrix A^T with elements (A^T)_ij = a_ji.
Two square n×n matrices A and B are similar if there is a nonsingular matrix S such that
B = S^-1 A S
Matrices having a single row are called row vectors; matrices having a single column are called column vectors.
Row vector: a = [a_1, a_2, ..., a_n]
Column vector: a = (a_1, a_2, ..., a_n)
The inner product of two vectors is written as
x^T y = Σ_{i=1}^{n} x_i y_i
1. Introduction, Notation and Basic Terms (Contd)
A matrix is called symmetric if a_ij = a_ji.
A matrix is positive definite if, for every nonzero vector x,
x^T A x > 0
Basic matrix identities: (AB)^T = B^T A^T and (AB)^-1 = B^-1 A^-1
A quadratic form is a scalar, quadratic function of a vector with the form
f(x) = (1/2) x^T A x - b^T x + c
The gradient of a quadratic form is
f'(x) = [∂f/∂x_1, ..., ∂f/∂x_n]^T = (1/2) A^T x + (1/2) A x - b
If A is symmetric, this reduces to f'(x) = A x - b, so setting the gradient to zero shows that the minimizer of f(x) is the solution of Ax = b.
1. Introduction, Notation and Basic Terms (Contd)
Various quadratic forms for an arbitrary 2×2 matrix (figure): positive-definite matrix, negative-definite matrix, singular positive-indefinite matrix, indefinite matrix.
Specific example of a 2×2 symmetric positive definite matrix:
1. Introduction, Notation and Basic Terms (Contd)
A = [3 2; 2 6],   b = [2; -8],   c = 0   (the corresponding solution is x = [2; -2])
Figures: graph of the quadratic form f(x), contours of the quadratic form, and gradient of the quadratic form.
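As a quick check of these formulas, a minimal NumPy sketch using the 2×2 example above (the function names are illustrative only):

import numpy as np

A = np.array([[3.0, 2.0],
              [2.0, 6.0]])   # symmetric positive definite
b = np.array([2.0, -8.0])
c = 0.0

def f(x):
    """Quadratic form f(x) = 1/2 x^T A x - b^T x + c."""
    return 0.5 * x @ A @ x - b @ x + c

def grad_f(x):
    """Gradient f'(x) = A x - b (valid for symmetric A)."""
    return A @ x - b

x_star = np.linalg.solve(A, b)      # [2, -2]
print(f(x_star), grad_f(x_star))    # minimum value of f, zero gradient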
2. Eigenvalues and Eigenvectors
For any n×n matrix B, a scalar λ and a nonzero vector v that satisfy the equation
B v = λ v
are said to be an eigenvalue and an eigenvector of B.
If the matrix is symmetric, then the following properties hold:
(a) the eigenvalues of B are real
(b) eigenvectors associated with distinct eigenvalues are orthogonal
The matrix B is positive definite (or positive semidefinite) if and only if all eigenvalues of B are positive (or nonnegative).
Why should we care about the eigenvalues? Iterative methods often depend on applying B to a vector over and over again:
(a) If |λ| < 1, then B^i v = λ^i v vanishes as i approaches infinity.
(b) If |λ| > 1, then B^i v = λ^i v will grow to infinity.
2. Eigenvalues and Eigenvectors (Contd)
Examples for |λ| < 1 (λ = -0.5) and |λ| > 1 (λ = 2) (figure).
The spectral radius of a matrix is ρ(B) = max_i |λ_i|.
Example with two eigenvalues, λ_1 = 0.7 and λ_2 = -2 (figure).
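A minimal NumPy sketch of why the eigenvalues matter when B is applied repeatedly (the matrix and vectors are illustrative only):

import numpy as np

B = np.array([[0.25, 0.0],
              [0.0,  2.0]])        # eigenvalues 0.25 and 2

v_small = np.array([1.0, 0.0])     # eigenvector with |lambda| < 1
v_large = np.array([0.0, 1.0])     # eigenvector with |lambda| > 1

for i in range(1, 6):
    v_small = B @ v_small          # shrinks by a factor 0.25 each step
    v_large = B @ v_large          # doubles each step
    print(i, np.linalg.norm(v_small), np.linalg.norm(v_large))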
2. Eigenvalues and Eigenvectors (Contd)
Example: The Jacobi Method for the solution of Ax = b
- Split the matrix A as A = D + E, where D is the diagonal part of A and E contains the off-diagonal elements.
- Instead of solving the system Ax = b, one solves the modified system x = Bx + z, where B = -D^-1 E and z = D^-1 b.
- The iterative method is x^(i+1) = B x^(i) + z, or, in terms of the error vector, e^(i+1) = B e^(i).
- For our particular example, the eigenvalues and eigenvectors of the matrix A are:
  λ_1 = 7, v_1 = [1, 2]^T;   λ_2 = 2, v_2 = [-2, 1]^T
- The eigenvalues and eigenvectors of the matrix B are:
  λ_1 = -√2/3 ≈ -0.47, v_1 = [√2, 1]^T;   λ_2 = √2/3 ≈ 0.47, v_2 = [-√2, 1]^T
Graphical representation of the convergence of the Jacobi method
2. Eigenvalues and Eigenvectors (Contd)
Figures: eigenvectors of B together with their eigenvalues; convergence of the Jacobi method, which starts at [-2, -2]^T and converges to [2, -2]^T; the error vectors e^(0), e^(1), and e^(2).
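A minimal NumPy sketch of this Jacobi iteration for the 2×2 example (the starting point [-2, -2]^T is taken from the figure above; the iteration count is an arbitrary choice):

import numpy as np

A = np.array([[3.0, 2.0],
              [2.0, 6.0]])
b = np.array([2.0, -8.0])

D = np.diag(np.diag(A))        # diagonal part of A
E = A - D                      # off-diagonal part
B = -np.linalg.solve(D, E)     # B = -D^-1 E
z = np.linalg.solve(D, b)      # z = D^-1 b

x = np.array([-2.0, -2.0])     # starting point from the figure
for i in range(25):
    x = B @ x + z              # x^(i+1) = B x^(i) + z
print(x)                       # approaches the solution [2, -2]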
3. The Method of Steepest Descent
In the method of steepest descent, one starts with an arbitrary point x^(0) and takes a series of steps x^(1), x^(2), ... until we are satisfied that we are close enough to the solution.
When taking a step, one chooses the direction in which f decreases most quickly, i.e.
-f'(x^(i)) = b - A x^(i)
Definitions:
error vector: e^(i) = x^(i) - x
residual: r^(i) = b - A x^(i)
From Ax = b, it follows that
r^(i) = -A e^(i) = -f'(x^(i))
The residual is actually the direction of steepest descent.
3. The Method of Steepest Descent (Contd)
Start with some vector x^(0). The next vector falls along the solid line:
x^(1) = x^(0) + α r^(0)
The magnitude of the step is determined with a line-search procedure that minimizes f along the line:
d/dα f(x^(1)) = f'(x^(1))^T (d x^(1)/dα) = f'(x^(1))^T r^(0) = -(r^(1))^T r^(0) = 0
(r^(1))^T r^(0) = 0
(b - A x^(1))^T r^(0) = 0
(b - A(x^(0) + α r^(0)))^T r^(0) = 0
(r^(0))^T r^(0) - α (A r^(0))^T r^(0) = 0
α = ((r^(0))^T r^(0)) / ((r^(0))^T A r^(0))
Geometrical representation of the steepest descent step (figure).
3. The Method of Steepest Descent (Contd)
The algorithm:
r^(i) = b - A x^(i)
α^(i) = ((r^(i))^T r^(i)) / ((r^(i))^T A r^(i))
x^(i+1) = x^(i) + α^(i) r^(i)   =>   e^(i+1) = e^(i) + α^(i) r^(i)
Two matrix-vector multiplications are required per iteration.
To avoid one matrix-vector multiplication, one uses the recurrence
r^(i+1) = r^(i) - α^(i) A r^(i)
The disadvantage of using this recurrence is that the residual sequence is determined without any feedback from the value of x^(i), so that round-off errors may cause x^(i) to converge to some point near x.
3. The Method of Steepest Descent (Contd)
Summary of the algorithm:
r^(i) = b - A x^(i)
α^(i) = ((r^(i))^T r^(i)) / ((r^(i))^T A r^(i))
x^(i+1) = x^(i) + α^(i) r^(i)
To avoid one matrix-vector multiplication, one uses instead
r^(i+1) = r^(i) - α^(i) A r^(i)
Efficient implementation of the method of steepest descent:
i := 0
r := b - A x
δ := r^T r
δ_0 := δ
While i < i_max and δ > ε^2 δ_0 do
    q := A r
    α := δ / (r^T q)
    x := x + α r
    If i is divisible by 50
        r := b - A x
    else
        r := r - α q
    endif
    δ := r^T r
    i := i + 1
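A minimal NumPy sketch of this efficient steepest-descent loop (the names, tolerance, and 2×2 test problem are illustrative choices, not part of the slides):

import numpy as np

def steepest_descent(A, b, x, i_max=1000, eps=1e-8):
    """Steepest descent with the residual recurrence and periodic
    recomputation of r = b - A x to limit round-off drift."""
    r = b - A @ x
    delta = r @ r
    delta0 = delta
    i = 0
    while i < i_max and delta > eps**2 * delta0:
        q = A @ r
        alpha = delta / (r @ q)
        x = x + alpha * r
        if i % 50 == 0:
            r = b - A @ x          # exact residual every 50 steps
        else:
            r = r - alpha * q      # cheap recurrence
        delta = r @ r
        i += 1
    return x, i

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
x, iters = steepest_descent(A, b, np.array([-2.0, -2.0]))
print(x, iters)                    # x approaches [2, -2]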
4. Convergence Analysis of the Method of Steepest Descent
Case 1: the error vector e^(i) is equal to an eigenvector v_j of A. Then
r^(i) = -A e^(i) = -λ_j v_j
α^(i) = ((r^(i))^T r^(i)) / ((r^(i))^T A r^(i)) = 1/λ_j
e^(i+1) = e^(i) + α^(i) r^(i) = 0
i.e. the method converges in a single step.
Case 2: the error vector e^(i) is a linear combination of the eigenvectors of A, e^(i) = Σ_j ξ_j v_j, and all eigenvalues are the same (λ_j = λ). Then
r^(i) = -A e^(i) = -Σ_j ξ_j λ_j v_j = -λ e^(i)
α^(i) = ((r^(i))^T r^(i)) / ((r^(i))^T A r^(i)) = 1/λ
e^(i+1) = e^(i) + α^(i) r^(i) = 0
so again convergence takes a single step.
4. Convergence Analysis of the Method of Steepest Descent (Contd)
General convergence of the method can be proven by calculating the energy norm ||e||_A = (e^T A e)^(1/2), for which one finds
||e^(i+1)||_A^2 = ω^2 ||e^(i)||_A^2,   ω^2 = 1 - (Σ_j ξ_j^2 λ_j^2)^2 / [(Σ_j ξ_j^2 λ_j^3)(Σ_j ξ_j^2 λ_j)]
For n = 2, one has that
ω^2 = 1 - (κ^2 + μ^2)^2 / [(κ + μ^2)(κ^3 + κ μ^2)],   ω ≤ (κ - 1)/(κ + 1)
where κ = λ_max/λ_min is the spectral condition number and μ = ξ_2/ξ_1.
4. Convergence Analysis of the Method of Steepest Descent (Contd)
The influence of the spectral condition number on the convergence (figure).
Worst-case scenario: μ = ±κ.
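A small numerical illustration of the worst-case bound ω ≤ (κ - 1)/(κ + 1) (the κ values are arbitrary examples):

# Worst-case energy-norm reduction per steepest-descent step
for kappa in (2.0, 10.0, 100.0):
    omega = (kappa - 1.0) / (kappa + 1.0)
    print(kappa, omega)   # larger kappa -> omega closer to 1 -> slower convergence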
5. The Method of Conjugate Directions
Basic idea:
1. Pick a set of orthogonal search directions d^(0), d^(1), ..., d^(n-1).
2. Take exactly one step in each search direction to line up with x.
Mathematical formulation:
1. For each step we choose a point
x^(i+1) = x^(i) + α^(i) d^(i)
2. To find α^(i), we use the fact that e^(i+1) is orthogonal to d^(i):
(d^(i))^T e^(i+1) = 0
(d^(i))^T (e^(i) + α^(i) d^(i)) = 0
α^(i) = -((d^(i))^T e^(i)) / ((d^(i))^T d^(i))
5. The Method of Conjugate Directions (Contd)
To solve the problem of not knowing e^(i), one makes the search directions A-orthogonal rather than orthogonal to each other, i.e.:
(d^(i))^T A d^(j) = 0
5. The Method of Conjugate Directions (Contd)
The new requirement is now that e^(i+1) be A-orthogonal to d^(i):
d/dα f(x^(i+1)) = 0   =>   f'(x^(i+1))^T d^(i) = -(r^(i+1))^T d^(i) = 0
(d^(i))^T A e^(i+1) = 0
(d^(i))^T A (e^(i) + α^(i) d^(i)) = 0
α^(i) = -((d^(i))^T A e^(i)) / ((d^(i))^T A d^(i)) = ((d^(i))^T r^(i)) / ((d^(i))^T A d^(i))
If the search vectors were the residuals, this formula would be identical to the method of steepest descent.
5. The Method of Conjugate Directions (Contd)
Proof that this process computes x in n steps:
- express the error term as a linear combination of the search directions,
e^(0) = Σ_{j=0}^{n-1} δ_j d^(j)
- use the fact that the search directions are A-orthogonal.
5. The Method of Conjugate Directions (Contd)
Calculation of the A-orthogonal search directions by a conjugate Gram-Schmidt process:
1. Take a set of n linearly independent vectors u_0, u_1, ..., u_{n-1}.
2. Assume that d^(0) = u_0.
3. For i > 0, take u_i and subtract from it all the components that are not A-orthogonal to the previous search directions:
d^(i) = u_i + Σ_{j=0}^{i-1} β_ij d^(j),   β_ij = -(u_i^T A d^(j)) / ((d^(j))^T A d^(j))
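A minimal NumPy sketch of this conjugate Gram-Schmidt process (the basis here is the axial unit vectors; all names are illustrative):

import numpy as np

def conjugate_directions(A, U):
    """Make the columns of U mutually A-orthogonal by conjugate Gram-Schmidt."""
    n = U.shape[1]
    D = np.zeros_like(U, dtype=float)
    for i in range(n):
        d = U[:, i].copy()
        for j in range(i):
            beta = -(U[:, i] @ A @ D[:, j]) / (D[:, j] @ A @ D[:, j])
            d += beta * D[:, j]
        D[:, i] = d
    return D

A = np.array([[3.0, 2.0], [2.0, 6.0]])
D = conjugate_directions(A, np.eye(2))   # axial unit vectors as basis
print(D[:, 0] @ A @ D[:, 1])             # ~0: the directions are A-orthogonal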
5. The Method of Conjugate Directions (Contd)
The method of Conjugate Directions using the axial unit vectors as
basis vectors
5. The Method of Conjugate Directions (Contd)
Important observations:
1. The error vector is A-orthogonal to all previous search directions.
2. The residual vector is orthogonal to all previous search directions.
3. The residual vector is also orthogonal to all previous basis vectors.
(d^(i))^T A e^(j) = (d^(i))^T r^(j) = u_i^T r^(j) = 0   for i < j
6. The Method of Conjugate Gradients
The method of Conjugate Gradients is simply the method of Conjugate Directions where the search directions are constructed by conjugation of the residuals, i.e. u_i = r^(i).
This allows us to simplify the calculation of the new search direction, because
β_ij = (1/α^(i-1)) ((r^(i))^T r^(i)) / ((d^(i-1))^T A d^(i-1)) = ((r^(i))^T r^(i)) / ((r^(i-1))^T r^(i-1))   for j = i - 1
β_ij = 0   for j < i - 1
The new search direction is determined as a linear combination of the previous search direction and the new residual:
d^(i+1) = r^(i+1) + β_(i+1) d^(i)
6. The Method of Conjugate Gradients (Contd)
Efficient implementation:
i := 0
r := b - A x
d := r
δ_new := r^T r
δ_0 := δ_new
While i < i_max and δ_new > ε^2 δ_0 do
    q := A d
    α := δ_new / (d^T q)
    x := x + α d
    If i is divisible by 50
        r := b - A x
    else
        r := r - α q
    endif
    δ_old := δ_new
    δ_new := r^T r
    β := δ_new / δ_old
    d := r + β d
    i := i + 1
The Landscape of Ax = b Solvers

                               Direct (A = LU)   Iterative (y = Ay)
  Nonsymmetric                 Pivoting LU       GMRES, BiCGSTAB, ...
  Symmetric positive definite  Cholesky          Conjugate gradient

Direct methods are more robust; iterative methods need less storage (if the matrix is sparse). The symmetric positive definite solvers are more robust; the nonsymmetric solvers are more general.
Conjugate gradient iteration
One matrix-vector multiplication per iteration
Two vector dot products per iteration
Four n-vectors of working storage

x_0 = 0, r_0 = b, d_0 = r_0
for k = 1, 2, 3, ...
    α_k = (r_{k-1}^T r_{k-1}) / (d_{k-1}^T A d_{k-1})   (step length)
    x_k = x_{k-1} + α_k d_{k-1}                          (approximate solution)
    r_k = r_{k-1} - α_k A d_{k-1}                        (residual)
    β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1})              (improvement)
    d_k = r_k + β_k d_{k-1}                              (search direction)
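A minimal NumPy sketch of this CG iteration (the stopping test and the 2×2 test problem are illustrative additions; in exact arithmetic the example converges in at most two steps):

import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Plain CG for symmetric positive definite A, following the iteration
    above: one matrix-vector product and two dot products per step."""
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)
    r = b.copy()                     # r_0 = b - A x_0 with x_0 = 0
    d = r.copy()
    delta = r @ r
    for k in range(max_iter):
        q = A @ d
        alpha = delta / (d @ q)      # step length
        x += alpha * d               # approximate solution
        r -= alpha * q               # residual
        delta_new = r @ r
        if np.sqrt(delta_new) <= tol * np.linalg.norm(b):
            break
        beta = delta_new / delta     # improvement
        d = r + beta * d             # search direction
        delta = delta_new
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(conjugate_gradient(A, b))      # [2, -2]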
7. Convergence Analysis of the CG Method
If the algorithm is performed in exact arithmetic, the exact solution is obtained in at most n steps.
When finite-precision arithmetic is used, rounding errors lead to a gradual loss of orthogonality among the residuals, and the finite termination property of the method is lost.
If the matrix A has only m distinct eigenvalues, then CG will converge in at most m iterations.
An error bound on the CG method can be obtained in terms of the A-norm; after k iterations:
||x - x_k||_A ≤ 2 [(√κ(A) - 1) / (√κ(A) + 1)]^k ||x - x_0||_A
8. Complexity of the CG Method
The dominant operations of the Steepest Descent and Conjugate Gradient methods are the matrix-vector products; both methods therefore have spatial complexity O(m), where m is the number of nonzero matrix entries.
Suppose that after i iterations one requires that ||e^(i)||_A ≤ ε ||e^(0)||_A:
(a) Steepest Descent:    i ≤ ⌈(1/2) κ ln(1/ε)⌉
(b) Conjugate Gradient:  i ≤ ⌈(1/2) √κ ln(2/ε)⌉
The time complexity of these two methods is:
(a) Steepest Descent:    O(m κ)
(b) Conjugate Gradient:  O(m √κ)
Stopping criterion: ||r^(i)|| ≤ ε ||r^(0)||
9. Preconditioning Techniques
References:
J. A. Meijerink and H. A. van der Vorst, "An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix," Mathematics of Computation, Vol. 31, No. 137, pp. 148-162 (1977).
J. A. Meijerink and H. A. van der Vorst, "Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems," Journal of Computational Physics, Vol. 44, pp. 134-155 (1981).
D. S. Kershaw, "The incomplete Cholesky-conjugate gradient method for the iterative solution of systems of linear equations," Journal of Computational Physics, Vol. 26, pp. 43-65 (1978).
H. A. van der Vorst, "High performance preconditioning," SIAM Journal on Scientific and Statistical Computing, Vol. 10, No. 6, pp. 1174-1185 (1989).
9. Preconditioning Techniques (Contd)
Preconditioning is a technique for improving the condition number of a matrix. Suppose that M is a symmetric, positive-definite matrix that approximates A, but is easier to invert. Then, rather than solving the original system
Ax = b
one solves the modified system
M^-1 A x = M^-1 b
The effectiveness of the preconditioner depends upon the condition number of M^-1 A.
Intuitively, preconditioning is an attempt to stretch the quadratic form to appear more spherical, so that the eigenvalues become closer to each other.
Derivation (symmetric preconditioning): write M = E E^T; then one solves
(E^-1 A E^-T) x̃ = E^-1 b,   with x̃ = E^T x
9. Preconditioning Techniques (Contd)
Transformed Preconditioned CG (PCG) Algorithm:
i := 0
r̃ := E^-1 b - (E^-1 A E^-T) x̃
d̃ := r̃
δ_new := r̃^T r̃
δ_0 := δ_new
While i < i_max and δ_new > ε^2 δ_0 do
    q̃ := (E^-1 A E^-T) d̃
    α := δ_new / (d̃^T q̃)
    x̃ := x̃ + α d̃
    r̃ := r̃ - α q̃   (or r̃ := E^-1 b - (E^-1 A E^-T) x̃)
    δ_old := δ_new
    δ_new := r̃^T r̃
    β := δ_new / δ_old
    d̃ := r̃ + β d̃
    i := i + 1
Untransformed PCG Algorithm (obtained with the substitutions r̃ = E^-1 r and d̃ = E^T d):
i := 0
r := b - A x
d := M^-1 r
δ_new := r^T M^-1 r
δ_0 := δ_new
While i < i_max and δ_new > ε^2 δ_0 do
    q := A d
    α := δ_new / (d^T q)
    x := x + α d
    r := r - α q   (or r := b - A x)
    δ_old := δ_new
    δ_new := r^T M^-1 r
    β := δ_new / δ_old
    d := M^-1 r + β d
    i := i + 1
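A minimal NumPy sketch of the untransformed PCG loop, using a diagonal (Jacobi) preconditioner M = diag(A) as a simple example (the preconditioner choice, names, and test problem are illustrative):

import numpy as np

def preconditioned_cg(A, b, M_inv, tol=1e-10, max_iter=1000):
    """Untransformed PCG; M_inv(r) applies M^-1 to a vector."""
    x = np.zeros_like(b)
    r = b - A @ x
    d = M_inv(r)
    delta = r @ d                        # r^T M^-1 r
    delta0 = delta
    for _ in range(max_iter):
        q = A @ d
        alpha = delta / (d @ q)
        x += alpha * d
        r -= alpha * q
        s = M_inv(r)
        delta_new = r @ s
        if delta_new <= tol**2 * delta0:
            break
        d = s + (delta_new / delta) * d  # d := M^-1 r + beta d
        delta = delta_new
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
diag_inv = 1.0 / np.diag(A)                              # M = diag(A)
print(preconditioned_cg(A, b, lambda r: diag_inv * r))   # ~[2, -2]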
9. Preconditioning Techniques (Contd)
There are several types of preconditioners that can be used:
(a) Preconditioners based on a splitting of the matrix A, i.e.
A = M - N
(b) Complete or incomplete factorization of A, e.g.
A = LL^T + E
(c) Approximation of the inverse, M ≈ A^-1
(d) Reordering of the equations and/or unknowns, e.g. Domain Decomposition
9. Preconditioning Techniques (Contd)
(a) Preconditioning based on splittings of A
Diagonal preconditioning: M = D = diag{a_11, a_22, ..., a_nn}
Tridiagonal preconditioning
Block diagonal preconditioning
Example (the 2×2 system used earlier): A = [3 2; 2 6],   b = [2; -8],   c = 0,   x = [2; -2]
9. Preconditioning Techniques (Contd)
(b) Preconditioning based on factorization of A
Cholesky factorization (for a symmetric and positive definite matrix) as a direct method:
A = LL^T,   x = L^-T (L^-1 b)
where:
L_ii = (A_ii - Σ_{k=1}^{i-1} L_ik^2)^(1/2)
L_ji = (A_ji - Σ_{k=1}^{i-1} L_jk L_ik) / L_ii,   j = i+1, ..., n
Modified Cholesky factorization: A = LDL^T
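A minimal NumPy sketch of these Cholesky formulas (dense storage, no pivoting; the 2×2 test matrix is the running example):

import numpy as np

def cholesky(A):
    """Lower-triangular L with A = L L^T, for symmetric positive definite A."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for i in range(n):
        L[i, i] = np.sqrt(A[i, i] - np.sum(L[i, :i] ** 2))
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - np.sum(L[j, :i] * L[i, :i])) / L[i, i]
    return L

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
L = cholesky(A)
print(np.allclose(L @ L.T, A))      # True
y = np.linalg.solve(L, b)           # forward substitution, L y = b
x = np.linalg.solve(L.T, y)         # back substitution, L^T x = y
print(x)                            # [2, -2]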
9. Preconditioning Techniques (Contd)
(b) Preconditioning based on factorization of A
Incomplete Cholesky factorization:
A = LL^T + E   or   A = LDL^T + E,   where L_ij = 0 if (i,j) ∈ P
The simplest choice is P = {(i,j) | a_ij = 0; i,j = 1, 2, ..., n}   =>   ICCG(0)
Modified ICCG(0): for a matrix A with diagonals a_i (main), b_i (first off-diagonal), and c_i (m-th off-diagonal), the incomplete factors keep the same sparsity pattern, with off-diagonal entries b̃_i = b_i and c̃_i = c_i and a diagonal D̃ = diag(d̃_i) computed from the recurrence
d̃_i = a_i - b_{i-1}^2 / d̃_{i-1} - c_{i-m}^2 / d̃_{i-m},   i = 1, 2, ..., n
giving the preconditioner
M = (D̃ + L) D̃^-1 (D̃ + L)^T ≈ A,   where L is the strict lower triangular part of A.
9. Preconditioning Techniques (Contd)
(c) Preconditioning based on approximation of A^-1
If ρ(J) < 1, then the inverse of I - J is
(I - J)^-1 = Σ_{k=0}^{∞} J^k = I + J + J^2 + J^3 + ...
Write the matrix A as:
A = D + L + U = D(I + D^-1(L + U))   =>   J = -D^-1(L + U)
The inverse of A can then be expressed as:
A^-1 = (D(I - J))^-1 = (I - J)^-1 D^-1 = Σ_{k=0}^{∞} (-1)^k (D^-1(L + U))^k D^-1
For k = 1, one has:
M^-1 = D^-1 - D^-1 (L + U) D^-1
For k = m, one usually uses:
M^-1 = Σ_{k=0}^{m} (-1)^k (D^-1(L + U))^k D^-1
9. Preconditioning Techniques (Contd)
(d) Domain Decomposition Preconditioning
The domain is split into Domain I and Domain II, coupled through the interface unknowns z. The system Ax = b then has the block form
A = [ A_1    0     B_1 ]        x = [ x_1 ]        b = [ d_1 ]
    [ 0      A_2   B_2 ]            [ x_2 ]            [ d_2 ]
    [ B_1^T  B_2^T Q   ]            [ z   ]            [ f   ]
The preconditioner is taken as
M = L diag(M_1, M_2, S)^-1 L^T,   where   L = [ M_1    0     0 ]
                                              [ 0      M_2   0 ]
                                              [ B_1^T  B_2^T S ]
which gives
M = [ M_1    0     B_1 ]
    [ 0      M_2   B_2 ],   S* = S + B_1^T M_1^-1 B_1 + B_2^T M_2^-1 B_2
    [ B_1^T  B_2^T S*  ]
10. Conjugate Gradient Type Algorithms for Nonsymmetric Matrices
The Bi-Conjugate Gradient (BCG) Algorithm was proposed by Lanczos in 1954.
It solves not only the original system Ax = b but also the dual linear system A^T x* = b*.
Each step of this algorithm requires a matrix-by-vector product with both A and A^T.
The search direction p_j* does not contribute to the solution directly.
BCG algorithm:
i := 0
r := b - A x,   r* := r
p := r,   p* := r*
δ_new := r*^T r
δ_0 := δ_new
While i < i_max and δ_new > ε^2 δ_0 do
    q := A p
    q* := A^T p*
    α := δ_new / (p*^T q)
    x := x + α p
    r := r - α q
    r* := r* - α q*
    δ_old := δ_new
    δ_new := r*^T r
    β := δ_new / δ_old
    p := r + β p
    p* := r* + β p*
    i := i + 1
10. Conjugate Gradient Type Algorithms for Nonsymmetric Matrices (Contd)
The Conjugate Gradient Squared (CGS) Algorithm was developed by Sonneveld in 1984.
Main idea of the algorithm: in BCG the residuals and search directions can be written as polynomials in A applied to the starting vectors,
r_j = φ_j(A) r_0,   p_j = π_j(A) r_0,   r_j* = φ_j(A^T) r_0*,   p_j* = π_j(A^T) r_0*
CGS instead produces iterates whose residual vectors are of the form
r_j = φ_j^2(A) r_0
so that no products with A^T are needed.
CGS algorithm:
i := 0
r := b - A x,   r_0* is arbitrary
p := r,   u := r
δ_new := r_0*^T r
δ_0 := δ_new
While i < i_max and δ_new > ε^2 δ_0 do
    v := A p
    α := δ_new / (r_0*^T v)
    q := u - α v
    x := x + α (u + q)
    r := r - α A (u + q)
    δ_old := δ_new
    δ_new := r_0*^T r
    β := δ_new / δ_old
    u := r + β q
    p := u + β (q + β p)
    i := i + 1
10. Conjugate Gradient Type Algorithms for Nonsymmetric Matrices (Contd)
The Bi-Conjugate Gradient Stabilized (Bi-CGSTAB) Algorithm was developed by van der Vorst in 1992.
Rather than producing iterates whose residual vectors are of the form
r_j = φ_j^2(A) r_0
it produces iterates with residual vectors of the form
r_j = ψ_j(A) φ_j(A) r_0,   p_j = ψ_j(A) π_j(A) r_0
Bi-CGSTAB algorithm:
i := 0
r := b - A x,   r_0* is arbitrary
p := r
δ_new := r_0*^T r
δ_0 := δ_new
While i < i_max and δ_new > ε^2 δ_0 do
    v := A p
    α := δ_new / (r_0*^T v)
    s := r - α v
    ω := (s^T A s) / ((A s)^T (A s))
    x := x + α p + ω s
    r := s - ω A s
    δ_old := δ_new
    δ_new := r_0*^T r
    β := (δ_new / δ_old)(α / ω)
    p := r + β (p - ω v)
    i := i + 1
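In practice these Krylov solvers are available in standard libraries; a minimal sketch using SciPy's sparse solvers (assuming SciPy is installed; the small nonsymmetric test matrix is illustrative only):

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import bicgstab, gmres

A = csr_matrix(np.array([[4.0, 1.0, 0.0],
                         [2.0, 5.0, 1.0],
                         [0.0, 3.0, 6.0]]))   # nonsymmetric example
b = np.array([1.0, 2.0, 3.0])

x1, info1 = bicgstab(A, b)    # Bi-CGSTAB; info == 0 means convergence
x2, info2 = gmres(A, b)       # GMRES
print(info1, np.allclose(A @ x1, b, atol=1e-4))
print(info2, np.allclose(A @ x2, b, atol=1e-4))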
Complexity of linear solvers
Time to solve the model problem (Poisson's equation) on a regular mesh (2D mesh: n^1/2 × n^1/2; 3D mesh: n^1/3 × n^1/3 × n^1/3):

                        2D                      3D
Sparse Cholesky:        O(n^1.5)                O(n^2)
CG, exact arithmetic:   O(n^2)                  O(n^2)
CG, no precond:         O(n^1.5)                O(n^1.33)
CG, modified IC:        O(n^1.25)               O(n^1.17)
CG, support trees:      O(n^1.20) -> O(n^1+)    O(n^1.75) -> O(n^1.31)
Multigrid:              O(n)                    O(n)