
Chapter 3

Linear Systems of Equations


The solution of linear systems of equations is an extremely important process in scientific computing. Linear systems of equations directly serve as mathematical models in many situations, while solution of linear systems of equations is an important intermediary computation in the analysis of other models, such as nonlinear systems of differential equations.
Example 3.1
Find x_1, x_2, and x_3 such that
\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 &= -1,\\
4x_1 + 5x_2 + 6x_3 &= 0,\\
7x_1 + 8x_2 + 10x_3 &= 1.
\end{aligned}
\]
This chapter deals with the analysis and approximate solution of such systems of equations with floating point arithmetic. We will study two direct methods, Gaussian elimination (the LU decomposition) and the QR decomposition, as well as iterative methods, such as the Gauss–Seidel method, for solving such systems. (Computations in direct methods finish with a finite number of operations, while iterative methods involve a limiting process, as fixed point iteration does.) We will also study the singular value decomposition, a powerful technique for obtaining information about linear systems of equations, the mathematical models that give rise to such linear systems, and the effects of errors in the data on the solution of such systems.
The process of dealing with linear systems of equations comprises the subject numerical linear algebra. Before studying the actual solution of linear systems, we introduce (or review) underlying commonly used notation and facts.
3.1 Matrices, Vectors, and Basic Properties
The coefficients of x_1, x_2, and x_3 in Example 3.1 can be written as an array of numbers
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix},
\]
which we call a matrix. The horizontal lines of numbers are the rows, while
the vertical lines are the columns. In the example, we say A is a 3 by 3
matrix, meaning that it has 3 rows and 3 columns. (If a matrix B had two
rows and 5 columns, for example, we would say B is a 2 by 5 matrix.)
In numerical linear algebra, the variables x_1, x_2, and x_3 as in Example 3.1 are typically represented in a matrix
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
\]
with 3 rows and 1 column, as is the set of right members of the equations:
\[
b = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix};
\]
x and b are called column vectors.
Often, the system of linear equations will have real coefficients, but the coefficients will sometimes be complex. If the system has n variables in the column vector x, and the variables are assumed to be real, we say that x ∈ R^n. If B is an m by n matrix whose entries are real numbers, we say B ∈ R^{m×n}. If vector x has n complex coefficients, we say x ∈ C^n, and if an m by n matrix B has complex coefficients, we say B ∈ C^{m×n}.
Systems such as in Example 3.1 can be written using matrices and vectors, with the concept of matrix multiplication. We use upper case letters to denote matrices, lower case letters without subscripts to denote vectors (which we consider to be column vectors), we denote the element in the i-th row, j-th column of a matrix A by a_{ij}, and we sometimes denote the entire matrix A by (a_{ij}). The numbers that comprise the elements of matrices and vectors are called scalars.
DEFINITION 3.1 If A = (a_{ij}), then A^T = (a_{ji}) and A^H = (\overline{a_{ji}}) denote the transpose and conjugate transpose of A, respectively.
Example 3.2
On most computers in use today, the basic quantity in matlab is a matrix whose entries are double precision floating point numbers according to the IEEE 754 standard. Matrices are marked with square brackets [ and ], with commas or spaces separating the entries in a row, and semicolons or the end of a line separating the rows. The transpose of a matrix is obtained by typing a single quotation mark (or apostrophe) after the matrix. Consider the following matlab dialog.
>> A = [1 2 3;4 5 6;7 8 10]
A =
1 2 3
4 5 6
7 8 10
>> A'
ans =
1 4 7
2 5 8
3 6 10
>>
DEFINITION 3.2 If A is an m × n matrix and B is an n × p matrix, then C = AB where
\[
c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}
\]
for i = 1, \dots, m, j = 1, \dots, p. Thus, C is an m × p matrix.
Example 3.3
Continuing the matlab dialog from Example 3.2, we have
>> B = [-1 0 1
2 3 4
3 2 1]
B =
-1 0 1
2 3 4
3 2 1
>> C = A*B
C =
12 12 12
24 27 30
39 44 49
>>
(If the reader or student is not already comfortable with matrix multiplication, we suggest confirming the above calculation by doing it with paper and pencil.)
With matrix multiplication, we can write the linear system in Example 3.1 at the beginning of this chapter as
\[
Ax = b, \quad A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad b = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}.
\]
Matrix multiplication can be easily described in terms of the dot product:
DEFINITION 3.3 Suppose we have two real vectors
\[
v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \quad \text{and} \quad w = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}.
\]
Then the dot product v · w, also written (v, w), of v and w is the matrix product
\[
v^Tw = \sum_{i=1}^{n} v_iw_i.
\]
If the vectors have complex components, the dot product is defined to be
\[
v^Hw = \sum_{i=1}^{n} \overline{v_i}\,w_i,
\]
where \overline{v_i} is the complex conjugate of v_i.
Dot products can also be defined more generally and abstractly, and are useful throughout pure and applied mathematics. However, our interest here is the fact that many computations in scientific computing can be written in terms of dot products, and most modern computers have special circuitry and software to do dot products efficiently.
Example 3.4
matlab represents the transpose of a vector V as V'. Also, if A is an n by n matrix in matlab, the i-th row of A is accessed as A(i,:), while the j-th column of A is accessed as A(:,j). Continuing Example 3.3, we have the following matlab dialog, illustrating writing the product matrix C in terms of dot products.
>> C = [ A(1,:)*B(:,1), A(1,:)*B(:,2), A(1,:)*B(:,3)
A(2,:)*B(:,1), A(2,:)*B(:,2), A(2,:)*B(:,3)
A(3,:)*B(:,1), A(3,:)*B(:,2), A(3,:)*B(:,3)]
C =
12 12 12
24 27 30
39 44 49
>>
Matrix inverses are also useful in describing linear systems of equations:
DEFINITION 3.4 Suppose A is an n by n matrix. (That is, suppose A is square.) Then A^{-1} is the inverse of A if A^{-1}A = AA^{-1} = I, where I is the n by n identity matrix, consisting of 1's on the diagonal and 0's in all off-diagonal elements. If A has an inverse, then A is said to be nonsingular or invertible.
Example 3.5
Continuing the matlab dialog from the previous examples, we have
>> Ainv = inv(A)
Ainv =
-0.66667 -1.33333 1.00000
-0.66667 3.66667 -2.00000
1.00000 -2.00000 1.00000
>> Ainv*A
ans =
1.00000 0.00000 -0.00000
0.00000 1.00000 -0.00000
-0.00000 0.00000 1.00000
>> A*Ainv
ans =
1.0000e+00 4.4409e-16 -4.4409e-16
1.1102e-16 1.0000e+00 -1.1102e-15
3.3307e-16 2.2204e-15 1.0000e+00
>> eye(3)
ans =
1 0 0
0 1 0
0 0 1
>> eye(3)-A*inv(A)
ans =
1.1102e-16 -4.4409e-16 4.4409e-16
-1.1102e-16 -8.8818e-16 1.1102e-15
-3.3307e-16 -2.2204e-15 2.3315e-15
>> eye(3)*B-B
ans =
0 0 0
0 0 0
0 0 0
>>
Above, observe that IB = B. Also observe that the computed value of I − AA^{-1} is not exactly the matrix consisting entirely of zeros, but is a matrix whose entries are small multiples of the machine epsilon for IEEE double precision floating point arithmetic.
Some matrices do not have inverses.
DEFINITION 3.5 A matrix that does not have an inverse is called a
singular matrix. A matrix that does have an inverse is said to be non-singular.
Singular matrices are analogous to the number zero when we are dealing with a single equation in a single unknown. In particular, if we have the system of equations Ax = b, it follows that x = A^{-1}b (since A^{-1}(Ax) = (A^{-1}A)x = Ix = x = A^{-1}b), just as if ax = b, then x = (1/a)b.
Example 3.6
The matrix
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\]
is singular. However, if we use matlab to try to find an inverse, we obtain:
>> A = [1 2 3;4 5 6;7 8 9]
A =
1 2 3
4 5 6
7 8 9
>> inv(A)
ans =
1.0e+016 *
-0.4504 0.9007 -0.4504
0.9007 -1.8014 0.9007
-0.4504 0.9007 -0.4504
>>
Observe that the matrix matlab gives for the inverse has large elements (on the order of the reciprocal of the machine epsilon ε_m ≈ 1.11 × 10^{-16} times the elements of A). This is due to roundoff error. This can be viewed as analogous to trying to form 1/a when a = 0, but, due to roundoff error (such as some cancellation error), a is a small number, on the order of the machine epsilon ε_m. We then have that 1/a is on the order of 1/ε_m.
The following two definitions and theorem clarify which matrices are singular and clarify the relationship between singular matrices and solution of linear systems of equations involving those matrices.
DEFINITION 3.6 Let \{v^{(i)}\}_{i=1}^{m} be m vectors. Then \{v^{(i)}\}_{i=1}^{m} is said to be linearly independent provided that
\[
\sum_{i=1}^{m} \alpha_i v^{(i)} = 0 \quad \text{implies} \quad \alpha_i = 0 \text{ for } i = 1, 2, \dots, m.
\]
Example 3.7
Let
\[
a_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}, \quad \text{and} \quad a_3 = \begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix}
\]
be the rows of the matrix A from Example 3.6 (expressed as column vectors). Then
\[
a_1 - 2a_2 + a_3 = 0,
\]
so a_1, a_2, and a_3 are linearly dependent. In particular, the third row of A is two times the second row minus the first row.
DEFINITION 3.7 The rank of a matrix A, rank(A), is the maximum
number of linearly independent rows it possesses. It can be shown that this is
the same as the maximum number of linearly independent columns. If A is an
m by n matrix and rank(A) = min{m, n}, then A is said to be of full rank.
For example, if m < n and the rows of A are linearly independent, then A is
of full rank.
The following theorem deals with rank, nonsingularity, and solutions to
systems of equations.
THEOREM 3.1
Let A be an n × n matrix (A ∈ L(C^n)). Then the following are equivalent:
1. A is nonsingular.
2. det(A) ≠ 0, where det(A) is the determinant¹ of the matrix A.
3. The linear system Ax = 0 has only the solution x = 0.
4. For any b ∈ C^n, the linear system Ax = b has a unique solution.
5. The columns (and rows) of A are linearly independent. (That is, A is of full rank, i.e., rank(A) = n.)
When the matrices for a system of equations have special properties, we can often use these properties to take short cuts in the computation to solve corresponding systems of equations, or to know that roundoff error will not accumulate when solving such systems. Symmetry and positive definiteness are important properties for these purposes.
DEFINITION 3.8 If A^T = A, then A is said to be symmetric. If A^H = A, then A is said to be Hermitian.
Example 3.8
If
\[
A = \begin{bmatrix} 1 & 2-i \\ 2+i & 3 \end{bmatrix}, \quad \text{then} \quad A^H = \begin{bmatrix} 1 & 2-i \\ 2+i & 3 \end{bmatrix} = A,
\]
so A is Hermitian.
DEFINITION 3.9 If A is an n by n matrix with real entries, if A^T = A and x^TAx > 0 for any x ∈ R^n except x = 0, then A is said to be symmetric positive definite. If A is an n by n matrix with complex entries, if A^H = A and x^HAx > 0 for x ∈ C^n, x ≠ 0, then A is said to be Hermitian positive definite. Similarly, if x^TAx ≥ 0 (for a real matrix A) or x^HAx ≥ 0 (for a complex matrix A) for every x ≠ 0, we say that A is symmetric positive semi-definite or Hermitian positive semi-definite, respectively.
¹ We will not give a formal definition of determinants here, but we will use their properties. Determinants are generally defined well in a good linear algebra course. We explain a good way of computing determinants in Section 3.2.3 on page 86. When computing the determinant of small matrices symbolically, expansion by minors is often used.
Example 3.9
If
\[
A = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix},
\]
then A^T = A, so A is symmetric. Also,
\[
x^TAx = 4x_1^2 + 2x_1x_2 + 3x_2^2 = 3x_1^2 + (x_1 + x_2)^2 + 2x_2^2 > 0 \quad \text{for } x \neq 0.
\]
Thus, A is symmetric positive definite.
Prior to studying actual methods for analyzing systems of linear equations,
we introduce the following concepts.
DEFINITION 3.10 If v = (v_1, \dots, v_n)^T is a vector and λ is a number, we define scalar multiplication w = λv by w_i = λv_i, that is, we multiply each component of v by λ. We say that we have scaled v by λ. We can similarly scale a matrix.
Example 3.10
Observe the following matlab dialog.
>> v = [1;-1;2]
v =
1
-1
2
>> lambda = 3
lambda =
3
>> lambda*v
ans =
3
-3
6
>>
DEFINITION 3.11 If A is an n × n matrix, a scalar λ and a nonzero x are an eigenvalue and eigenvector of A if Ax = λx.
DEFINITION 3.12 ρ(A) = \max_{1 \le i \le n} |λ_i|, where \{λ_i\}_{i=1}^{n} is the set of eigenvalues of A, is called the spectral radius of A.
Example 3.11
The matlab function eig computes eigenvalues and eigenvectors. Consider
the following matlab dialog.
>> A = [1,2,3
4 5 6
7 8 10]
A =
1 2 3
4 5 6
7 8 10
>> [V,Lambda] = eig(A)
V =
-0.2235 -0.8658 0.2783
-0.5039 0.0857 -0.8318
-0.8343 0.4929 0.4802
Lambda =
16.7075 0 0
0 -0.9057 0
0 0 0.1982
>> A*V(:,1) - Lambda(1,1)*V(:,1)
ans =
1.0e-014 *
0.0444
-0.1776
0.1776
>> A*V(:,2) - Lambda(2,2)*V(:,2)
ans =
1.0e-014 *
0.0777
0.1985
0.0944
>> A*V(:,3) - Lambda(3,3)*V(:,3)
ans =
1.0e-014 *
0.0666
-0.0444
0.2109
>>
Note that the eigenvectors of the matrix A are stored in the columns of V, while corresponding eigenvalues are stored in the diagonal entries of the diagonal matrix Lambda. In this case, the spectral radius is ρ(A) ≈ 16.7075.
Although we won't study computation of eigenvalues and eigenvectors until Chapter 5, we refer to the concept in this chapter.
With these facts and concepts, we can now study the actual solution of systems of equations on computers.
3.2 Gaussian Elimination
We can think of Gaussian elimination as a process of repeatedly adding a multiple of one equation to another equation to transform the system of equations into one that is easy to solve. We first focus on these elementary row operations.
DEFINITION 3.13 Consider a linear system of equations Ax = b, where A is n × n, and b, x ∈ R^n. Elementary row operations on a system of linear equations are of the following three types:
1. interchanging two equations,
2. multiplying an equation by a nonzero number,
3. adding to one equation a scalar multiple of another equation.
THEOREM 3.2
If system Bx = d is obtained from system Ax = b by a finite sequence of elementary operations, then the two systems have the same solutions.
(A proof of Theorem 3.2 can be found in elementary texts on linear algebra and can be done, for example, with Theorem 3.1 and using elementary properties of determinants.)
The idea underlying Gaussian elimination is simple:
1. Subtract multiples of the first equation from the second through the n-th equations to eliminate x_1 from the second through n-th equations.
2. Then, subtract multiples of the new second equation from the third through n-th equations to eliminate x_2 from these. After this step, the third through n-th equations contain neither x_1 nor x_2.
3. Continue this process until the resulting n-th equation contains only x_n, the resulting (n−1)-st equation contains only x_n and x_{n−1}, etc.
4. Solve the resulting n-th equation for x_n.
5. Plug the value for x_n into the resulting (n−1)-st equation, and solve that equation for x_{n−1}.
6. Continue this back-substitution process until we have solved for x_1 in the first equation.
Example 3.12
We will apply this process to the system in Example 3.1. In illustrating the process, we can write the original system and transformed systems as an augmented matrix, with a number's position in the matrix telling us to which variable (or right-hand-side) and which equation it belongs. The original system is thus written as
\[
\left[\begin{array}{ccc|c} 1 & 2 & 3 & -1 \\ 4 & 5 & 6 & 0 \\ 7 & 8 & 10 & 1 \end{array}\right].
\]
We will use ∼ to denote that two systems of equations are equivalent, and we will indicate below this symbol which multiples are subtracted: For example, R_3 ← R_3 − 2R_2 would mean that we replace the third row (i.e., the third equation) by the third equation minus two times the second equation. The Gaussian elimination process then proceeds as follows.
\[
\left[\begin{array}{ccc|c} 1 & 2 & 3 & -1 \\ 4 & 5 & 6 & 0 \\ 7 & 8 & 10 & 1 \end{array}\right]
\underset{\substack{R_2 \leftarrow R_2 - 4R_1 \\ R_3 \leftarrow R_3 - 7R_1}}{\sim}
\left[\begin{array}{ccc|c} 1 & 2 & 3 & -1 \\ 0 & -3 & -6 & 4 \\ 0 & -6 & -11 & 8 \end{array}\right]
\underset{R_3 \leftarrow R_3 - 2R_2}{\sim}
\left[\begin{array}{ccc|c} 1 & 2 & 3 & -1 \\ 0 & -3 & -6 & 4 \\ 0 & 0 & 1 & 0 \end{array}\right].
\]
The transformed third equation now reads x_3 = 0, while the transformed second equation reads −3x_2 − 6x_3 = 4. Plugging x_3 = 0 into the transformed second equation thus gives
\[
x_2 = (4 + 6x_3)/(-3) = 4/(-3) = -\tfrac{4}{3}.
\]
Similarly, plugging x_3 = 0 and x_2 = −4/3 into the transformed first equation gives
\[
x_1 = (-1 - 2x_2 - 3x_3) = (-1 - 2(-4/3)) = 5/3.
\]
The solution vector is thus
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 5/3 \\ -4/3 \\ 0 \end{bmatrix}.
\]
We check by computing the residual:
\[
Ax - b = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix} \begin{bmatrix} 5/3 \\ -4/3 \\ 0 \end{bmatrix} - \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} - \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Example 3.13
If we had used floating point arithmetic in Example 3.12, 5/3 and −4/3 would not have been exactly representable, and the residual would not have been exactly zero. In fact, a variant² of Gaussian elimination with back-substitution is programmed in matlab and accessible with the backslash (\) operator:
>> A = [1 2 3
4 5 6
7 8 10]
A =
1 2 3
4 5 6
7 8 10
>> b = [-1;0;1]
b =
-1
0
1
>> x = A\b
x =
1.6667
-1.3333
-0.0000
>> A*x-b
ans =
1.0e-015 *
0.2220
0.8882
0.8882
>>
3.2.1 The Gaussian Elimination Algorithm
Following the pattern in the examples we have presented, we can write down the process in general. The system will be written
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned}
\]
² using partial pivoting, which we will see later
Now, the transformed matrix
\[
\begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 1 \end{bmatrix}
\]
from Example 3.12 (page 80) is termed an upper triangular matrix, since it has zeros in all entries below the diagonal. The goal of Gaussian elimination is to reduce A to an upper triangular matrix through a sequence of elementary row operations as in Definition 3.13. We will call the transformed matrix before working on the r-th column A^{(r)}, with associated right-hand-side vector b^{(r)}, and we begin with A^{(1)} = A = (a^{(1)}_{ij}) and b = b^{(1)} = (b^{(1)}_1, b^{(1)}_2, \dots, b^{(1)}_n)^T, with A^{(1)}x = b^{(1)}. The process can then be described as follows.
Step 1: Assume that a^{(1)}_{11} ≠ 0. (Otherwise, the nonsingularity of A guarantees that the rows of A can be interchanged in such a way that the new a^{(1)}_{11} is nonzero.) Let
\[
m_{i1} = \frac{a^{(1)}_{i1}}{a^{(1)}_{11}}, \quad 2 \le i \le n.
\]
Now multiply the first equation of A^{(1)}x = b^{(1)} by m_{i1} and subtract the result from the i-th equation. Repeat this for each i, 2 ≤ i ≤ n. As a result, we obtain A^{(2)}x = b^{(2)}, where
\[
A^{(2)} = \begin{bmatrix}
a^{(1)}_{11} & a^{(1)}_{12} & \dots & a^{(1)}_{1n} \\
0 & a^{(2)}_{22} & \dots & a^{(2)}_{2n} \\
\vdots & \vdots & & \vdots \\
0 & a^{(2)}_{n2} & \dots & a^{(2)}_{nn}
\end{bmatrix}
\quad \text{and} \quad
b^{(2)} = \begin{bmatrix} b^{(1)}_1 \\ b^{(2)}_2 \\ \vdots \\ b^{(2)}_n \end{bmatrix}.
\]
Step 2: We consider the (n−1) × (n−1) submatrix \tilde{A}^{(2)} of A^{(2)} defined by \tilde{A}^{(2)} = (a^{(2)}_{ij}), 2 ≤ i, j ≤ n. We eliminate the first column of \tilde{A}^{(2)} in a manner identical to the procedure for A^{(1)}. The result is the system A^{(3)}x = b^{(3)}, where A^{(3)} has the form
\[
A^{(3)} = \begin{bmatrix}
a^{(1)}_{11} & a^{(1)}_{12} & \cdots & \cdots & a^{(1)}_{1n} \\
0 & a^{(2)}_{22} & \cdots & \cdots & a^{(2)}_{2n} \\
\vdots & 0 & a^{(3)}_{33} & \cdots & a^{(3)}_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & a^{(3)}_{n3} & \cdots & a^{(3)}_{nn}
\end{bmatrix}.
\]
Steps 3 to n−1: The process continues as above, where at the k-th stage we have A^{(k)}x = b^{(k)}, 1 ≤ k ≤ n−1, where
\[
A^{(k)} = \begin{bmatrix}
a^{(1)}_{11} & \cdots & \cdots & \cdots & \cdots & a^{(1)}_{1n} \\
0 & a^{(2)}_{22} & \cdots & \cdots & \cdots & a^{(2)}_{2n} \\
\vdots & \ddots & \ddots & & & \vdots \\
\vdots & & 0 & a^{(k)}_{kk} & \cdots & a^{(k)}_{kn} \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & a^{(k)}_{nk} & \cdots & a^{(k)}_{nn}
\end{bmatrix}
\quad \text{and} \quad
b^{(k)} = \begin{bmatrix}
b^{(1)}_1 \\ b^{(2)}_2 \\ \vdots \\ b^{(k-1)}_{k-1} \\ b^{(k)}_k \\ \vdots \\ b^{(k)}_n
\end{bmatrix}.
\tag{3.1}
\]
For every i, k+1 ≤ i ≤ n, the k-th equation is multiplied by
\[
m_{ik} = a^{(k)}_{ik}/a^{(k)}_{kk}
\]
and subtracted from the i-th equation. (We assume, if necessary, a row is interchanged so that a^{(k)}_{kk} ≠ 0.) After step k = n−1, the resulting system is A^{(n)}x = b^{(n)}, where A^{(n)} is upper triangular.
On a computer, this algorithm can be programmed as:
ALGORITHM 3.1
(Gaussian elimination, forward phase)
INPUT: the n by n matrix A and the n-vector b ∈ R^n.
OUTPUT: A^{(n)} and b^{(n)}.
FOR k = 1, 2, \dots, n−1
  FOR i = k+1, \dots, n
    (a) m_{ik} ← a_{ik}/a_{kk}.
    (b) FOR j = k, k+1, \dots, n
          a_{ij} ← a_{ij} − m_{ik}a_{kj}.
        END FOR
    (c) b_i ← b_i − m_{ik}b_k.
  END FOR
END FOR
END ALGORITHM 3.1.
Note: In Step (b) of Algorithm 3.1, we need only do the loop for j = k+1 to n, since we know that the resulting a_{ik}, k+1 ≤ i ≤ n, will equal 0.
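As an illustration, the forward phase can be written compactly in matlab. The following is a minimal sketch of Algorithm 3.1 (the function name ge_forward is ours, and it assumes no row interchanges are needed, i.e., a_{kk} ≠ 0 at every step):

function [A, b] = ge_forward(A, b)
% Sketch of Algorithm 3.1: forward phase of Gaussian elimination,
% no pivoting.  Overwrites A with A^(n) and b with b^(n).
n = length(b);
for k = 1:n-1
    for i = k+1:n
        m = A(i,k) / A(k,k);              % multiplier m_ik
        A(i,k:n) = A(i,k:n) - m*A(k,k:n); % subtract m_ik times row k
        b(i) = b(i) - m*b(k);
    end
end
end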
Back solving can be programmed as:
ALGORITHM 3.2
(Gaussian elimination, back solving phase)
INPUT: A^{(n)} and b^{(n)} from Algorithm 3.1.
OUTPUT: x ∈ R^n as a solution to Ax = b.
1. x_n ← b_n/a_{nn}.
2. FOR k = n−1, n−2, \dots, 1
     x_k ← \left( b_k − \sum_{j=k+1}^{n} a_{kj}x_j \right)/a_{kk}.
   END FOR
END ALGORITHM 3.2.
Note: To solve Ax = b using Gaussian elimination requires \frac{1}{3}n^3 + O(n^2) multiplications and divisions. (See Exercise 4 on page 142.)
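A corresponding matlab sketch of Algorithm 3.2 (again ours, with the hypothetical name ge_backsolve) is:

function x = ge_backsolve(A, b)
% Sketch of Algorithm 3.2: back substitution applied to the upper
% triangular system A^(n) x = b^(n) produced by the forward phase.
n = length(b);
x = zeros(n,1);
x(n) = b(n) / A(n,n);
for k = n-1:-1:1
    x(k) = (b(k) - A(k,k+1:n)*x(k+1:n)) / A(k,k);
end
end

For the system of Example 3.1, [U, c] = ge_forward(A, b) followed by x = ge_backsolve(U, c) should reproduce x ≈ (5/3, −4/3, 0)^T.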
3.2.2 The LU decomposition
We now explain how the Gaussian elimination process we have just presented can be viewed as finding a lower triangular matrix L (i.e., a matrix with zeros above the diagonal) and an upper triangular matrix U such that A = LU. Assume first that no row interchanges are performed in Gaussian elimination. Let
\[
M^{(1)} = \begin{bmatrix}
1 & 0 & \dots & 0 \\
-m_{21} & & & \\
-m_{31} & & I_{n-1} & \\
\vdots & & & \\
-m_{n1} & & &
\end{bmatrix},
\]
where I_{n−1} is the (n−1) × (n−1) identity matrix, and where m_{i1}, 2 ≤ i ≤ n, are defined in Gaussian elimination. Then A^{(2)} = M^{(1)}A^{(1)} and b^{(2)} = M^{(1)}b^{(1)}.
At the r-th stage of the Gaussian elimination process,
\[
M^{(r)} = \begin{bmatrix}
I_{r-1} & 0 & 0 & \dots & 0 \\
0 & 1 & 0 & \dots & 0 \\
0 & -m_{r+1,r} & & & \\
\vdots & \vdots & & I_{n-r} & \\
0 & -m_{n,r} & & &
\end{bmatrix}, \tag{3.2}
\]
where the row containing the entry 1 is the r-th row. Also,
\[
(M^{(r)})^{-1} = \begin{bmatrix}
I_{r-1} & 0 & 0 & \dots & 0 \\
0 & 1 & 0 & \dots & 0 \\
0 & m_{r+1,r} & & & \\
\vdots & \vdots & & I_{n-r} & \\
0 & m_{n,r} & & &
\end{bmatrix}, \tag{3.3}
\]
where m_{ir}, r+1 ≤ i ≤ n, are given in the Gaussian elimination process, and A^{(r+1)} = M^{(r)}A^{(r)} and b^{(r+1)} = M^{(r)}b^{(r)}. (Note: We are assuming here that a^{(r)}_{rr} ≠ 0 and no row interchanges are required.) Collecting the above results, we obtain A^{(n)}x = b^{(n)}, where
\[
A^{(n)} = M^{(n-1)}M^{(n-2)} \cdots M^{(1)}A^{(1)} \quad \text{and} \quad b^{(n)} = M^{(n-1)}M^{(n-2)} \cdots M^{(1)}b^{(1)}.
\]
Recalling that A^{(n)} is upper triangular and setting A^{(n)} = U, we have
\[
A = (M^{(n-1)}M^{(n-2)} \cdots M^{(1)})^{-1}U. \tag{3.4}
\]
Example 3.14
Following the Gaussian elimination process from Example 3.12, we have
\[
M^{(1)} = \begin{bmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ -7 & 0 & 1 \end{bmatrix}, \quad
M^{(2)} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix},
\]
\[
(M^{(1)})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 7 & 0 & 1 \end{bmatrix}, \quad
(M^{(2)})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix},
\]
and A = LU, with
\[
L = (M^{(1)})^{-1}(M^{(2)})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 7 & 2 & 1 \end{bmatrix}, \quad
U = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 1 \end{bmatrix}.
\]
Applying the Gaussian elimination process to b can be viewed as solving Ly = b for y, then solving Ux = y for x. Solving Ly = b involves forming M^{(1)}b, then forming M^{(2)}(M^{(1)}b), while solving Ux = y is simply the back-substitution process.
Note: The product of two lower triangular matrices is lower triangular, and the inverse of a nonsingular lower triangular matrix is lower triangular. Thus, L = (M^{(1)})^{-1}(M^{(2)})^{-1} \cdots (M^{(n-1)})^{-1} is lower triangular. Hence, A = LU, i.e., A is expressed as a product of lower and upper triangular matrices. The result is called the LU decomposition (also known as the LU factorization, triangular factorization, or triangular decomposition) of A. The final matrices L and U are given by:
\[
L = \begin{bmatrix}
1 & 0 & \cdots & \cdots & 0 \\
m_{21} & 1 & 0 & \cdots & 0 \\
m_{31} & m_{32} & 1 & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
m_{n1} & m_{n2} & \cdots & m_{n,n-1} & 1
\end{bmatrix}
\quad \text{and} \quad
U = \begin{bmatrix}
a^{(1)}_{11} & a^{(1)}_{12} & \cdots & a^{(1)}_{1n} \\
0 & a^{(2)}_{22} & \cdots & a^{(2)}_{2n} \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & a^{(n)}_{nn}
\end{bmatrix}.
\tag{3.5}
\]
This decomposition can be so formed when no row interchanges are required. Thus, the original problem Ax = b is transformed into LUx = b.
Note: Since computing the LU decomposition of A is done by Gaussian elimination, it requires O(n^3) operations. However, if L and U are already available, computing y with Ly = b and then computing x with Ux = y requires only O(n^2) operations.
Note: In some software, the multiplying factors, that is, the nonzero off-diagonal elements of L, are stored in the locations of corresponding entries of A that are made equal to zero, thus obviating the need for extra storage. Effectively, such software returns the elements of L and U in the same array that was used to store A.
3.2.3 Determinants and Inverses
Usually, the solution x of a system of equations Ax = b is desired, and the determinant det(A) is not of interest, even though one method of computing x, Cramer's rule, involves first computing determinants. (In fact, computing x with Gaussian elimination with back substitution is more efficient than using Cramer's rule, and is definitely more practical for large n.) However, occasionally the determinant of A is desired for other reasons. An efficient way of computing the determinant of a matrix is with Gaussian elimination. If A = LU, then
\[
\det(A) = \det(L)\det(U) = \prod_{j=1}^{n} a^{(j)}_{jj}.
\]
(Using expansion by minors to compute the determinant requires O(n!) multiplications.)
Similarly, even though we could in principle compute A^{-1}, then compute x = A^{-1}b, computing A^{-1} is less efficient than applying Gaussian elimination with back-substitution. However, if we need A^{-1} for some other reason, we can compute it relatively efficiently by solving the n systems Ax^{(j)} = e^{(j)}, where e^{(j)}_i = δ_{ij}, and where δ_{ij} is the Kronecker delta function defined by
\[
δ_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}
\]
If A = LU, we perform n pairs of forward and backward solves, to obtain
\[
A^{-1} = (x^{(1)}, x^{(2)}, \dots, x^{(n)}).
\]
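As a quick matlab illustration (ours; with partial pivoting the sign of the determinant must also be adjusted by det(P), compare Remark 3.3):

A = [1 2 3; 4 5 6; 7 8 10];
[L, U, P] = lu(A);              % P*A = L*U
detA = det(P) * prod(diag(U));  % det(P) = +/-1 accounts for row swaps
I = eye(3);  Ainv = zeros(3);
for j = 1:3
    y = L \ (P*I(:,j));         % forward solve with e^(j)
    Ainv(:,j) = U \ y;          % back solve gives the j-th column of A^{-1}
end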
Example 3.15
In Example 3.14, for
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix},
\]
we used Gaussian elimination to obtain A = LU, with
\[
L = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 7 & 2 & 1 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 1 \end{bmatrix}.
\]
Thus,
\[
\det(A) = u_{11}u_{22}u_{33} = (1)(-3)(1) = -3.
\]
We now compute A^{-1}: Using L and U to solve
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix} x^{(1)} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad \text{gives} \quad x^{(1)} = \begin{bmatrix} -2/3 \\ -2/3 \\ 1 \end{bmatrix},
\]
solving
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix} x^{(2)} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \quad \text{gives} \quad x^{(2)} = \begin{bmatrix} -4/3 \\ 11/3 \\ -2 \end{bmatrix},
\]
and solving
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix} x^{(3)} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \quad \text{gives} \quad x^{(3)} = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}.
\]
Thus,
\[
A^{-1} = \begin{bmatrix} -2/3 & -4/3 & 1 \\ -2/3 & 11/3 & -2 \\ 1 & -2 & 1 \end{bmatrix}, \quad AA^{-1} = I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
3.2.4 Pivoting in Gaussian Elimination
In our explanation and examples of Gaussian elimination so far, we have assumed that no row interchanges are required. In particular, we must have a_{kk} ≠ 0 in each step of Algorithm 3.1. Otherwise, we may need to do a row interchange, that is, we may need to rearrange the order of the transformed equations. We have two questions:
1. When can Gaussian elimination be performed without row interchanges?
2. If row interchanges are employed, can Gaussian elimination always be employed?
THEOREM 3.3
(Existence of an LU factorization) Assume that the n × n matrix A is nonsingular. Then A = LU if and only if all the leading principal submatrices of A are nonsingular.³ Moreover, the LU decomposition is unique, if we require that the diagonal elements of L are all equal to 1.
REMARK 3.1 Two important types of matrices that have nonsingular leading principal submatrices are symmetric positive definite and strictly diagonally dominant matrices, i.e., matrices for which
\[
|a_{ii}| > \sum_{\substack{j=1 \\ j \neq i}}^{n} |a_{ij}|, \quad \text{for } i = 1, 2, \dots, n.
\]
We now consider our second question, "If row interchanges are employed, can Gaussian elimination be performed for any nonsingular A?" Switching the rows of a matrix A can be done by multiplying A on the left by a permutation matrix:
DEFINITION 3.14 A permutation matrix P is a matrix whose columns consist of the n different vectors e_j, 1 ≤ j ≤ n, in any order.
³ The leading principal submatrices of A have the form
\[
\begin{bmatrix} a_{11} & \dots & a_{1k} \\ \vdots & & \vdots \\ a_{k1} & \dots & a_{kk} \end{bmatrix}
\quad \text{for } k = 1, 2, \dots, n.
\]
Example 3.16
\[
P = (e_1, e_3, e_4, e_2) = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{bmatrix}
\]
is a permutation matrix such that the first row of PA is the first row of A, the second row of PA is the fourth row of A, the third row of PA is the second row of A, and the fourth row of PA is the third row of A. Note that the permutation of the columns of the identity matrix in P corresponds to the permutation of the rows of A. For example,
\[
\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{bmatrix}
=
\begin{bmatrix} 4 & 5 & 6 \\ 7 & 8 & 10 \\ 1 & 2 & 3 \end{bmatrix}.
\]
Thus, by proper choice of P, any two or more rows can be interchanged.
Note: det P = ±1, since P is obtained from I by row interchanges.
Now, Gaussian elimination with row interchanges can be performed by the following matrix operations:⁴
\[
\begin{aligned}
A^{(n)} &= M^{(n-1)}P^{(n-1)}M^{(n-2)}P^{(n-2)} \cdots M^{(2)}P^{(2)}M^{(1)}P^{(1)}A,\\
b^{(n)} &= M^{(n-1)}P^{(n-1)} \cdots M^{(2)}P^{(2)}M^{(1)}P^{(1)}b.
\end{aligned}
\]
It follows that U = \tilde{L}A^{(1)}, where \tilde{L} is no longer lower triangular. However, if we perform all the row interchanges first, at once, then
\[
M^{(n-1)} \cdots M^{(1)}PAx = M^{(n-1)}M^{(n-2)} \cdots M^{(1)}Pb,
\]
or
\[
\tilde{L}PAx = \tilde{L}Pb, \quad \text{so} \quad \tilde{L}PA = U.
\]
Thus,
\[
PA = \tilde{L}^{-1}U = LU.
\]
We can state these facts as follows.
⁴ When implementing Gaussian elimination, we usually don't actually multiply full n by n matrices together, since this is not efficient. However, viewing the process as matrix multiplications has advantages when we analyze it.
THEOREM 3.4
If A is a nonsingular n × n matrix, then there is a permutation matrix P such that PA = LU, where L is lower triangular and U is upper triangular. (Note: det(PA) = ±det(A) = det(L) det(U).)
We now examine the actual operations we do to complete the Gaussian
elimination process with row interchanges (known as pivoting).
Example 3.17
Consider the system
\[
\begin{aligned}
0.0001x_1 + x_2 &= 1,\\
x_1 + x_2 &= 2.
\end{aligned}
\]
The exact solution of this system is x_1 ≈ 1.00010 and x_2 ≈ 0.99990. Let us solve the system using Gaussian elimination without row interchanges. We will assume calculations are performed using three-digit rounding decimal arithmetic. We obtain
\[
m_{21} \leftarrow \frac{a^{(1)}_{21}}{a^{(1)}_{11}} \approx 0.1 \times 10^{5},
\qquad
a^{(2)}_{22} \leftarrow a^{(1)}_{22} - m_{21}a^{(1)}_{12} \approx 0.1 \times 10^{1} - 0.1 \times 10^{5} \approx -0.100 \times 10^{5}.
\]
Also, b^{(2)} ≈ (0.1 × 10^{1}, −0.1 × 10^{5})^T, so the computed (approximate) upper triangular system is
\[
\begin{aligned}
0.1 \times 10^{-3}x_1 + 0.1 \times 10^{1}x_2 &= 0.1 \times 10^{1},\\
-0.1 \times 10^{5}x_2 &= -0.1 \times 10^{5},
\end{aligned}
\]
whose solutions are x_2 = 1 and x_1 = 0. If instead, we first interchange the equations so that a^{(1)}_{11} = 1, we find that x_1 = x_2 = 1, correct to the accuracy used.
Example 3.17 illustrates that small values of a^{(r)}_{rr} in the r-th stage lead to large values of the m_{ir}'s and may result in a loss of accuracy. Therefore, we want the pivots a^{(r)}_{rr} to be large.
Two common pivoting strategies are:
Partial pivoting: In partial pivoting, the a^{(r)}_{ir} for r ≤ i ≤ n, in the r-th column of A^{(r)}, is searched to find the element of largest absolute value, and row interchanges are made to place that element in the pivot position.
Full pivoting: In full pivoting, the pivot element is selected as the element a^{(r)}_{ij}, r ≤ i, j ≤ n, of maximum absolute value among all elements of the (n − r) × (n − r) submatrix of A^{(r)}. This strategy requires row and column interchanges.
In theory, full pivoting is required in general to assure that the process does not result in excessive roundoff error. However, partial pivoting is adequate in most cases. For some classes of matrices, no pivoting strategy is required for a stable elimination procedure. For example, no pivoting is required for a real symmetric positive definite matrix or for a strictly diagonally dominant matrix [41].
We now present a formal algorithm for Gaussian elimination with partial pivoting. In reading this algorithm, recall that
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned}
\]
ALGORITHM 3.3
(Solution of a linear system of equations with Gaussian elimination with partial pivoting and back-substitution)
INPUT: The n by n matrix A and right-hand-side vector b.
OUTPUT: An approximate solution⁵ x to Ax = b.
FOR k = 1, 2, \dots, n−1
  1. Find ℓ such that |a_{ℓk}| = \max_{k \le j \le n} |a_{jk}| (k ≤ ℓ ≤ n).
  2. Interchange row k with row ℓ:
     c_j ← a_{kj}, a_{kj} ← a_{ℓj}, a_{ℓj} ← c_j for j = 1, 2, \dots, n, and
     d ← b_k, b_k ← b_ℓ, b_ℓ ← d.
  3. FOR i = k+1, \dots, n
     (a) m_{ik} ← a_{ik}/a_{kk}.
     (b) FOR j = k, k+1, \dots, n
           a_{ij} ← a_{ij} − m_{ik}a_{kj}.
         END FOR
     (c) b_i ← b_i − m_{ik}b_k.
     END FOR
END FOR
4. Back-substitution:
   (a) x_n ← b_n/a_{nn}, and
   (b) x_k ← \left( b_k − \sum_{j=k+1}^{n} a_{kj}x_j \right)/a_{kk}, for k = n−1, n−2, \dots, 1.
END ALGORITHM 3.3.

⁵ approximate because of roundoff error
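A compact matlab sketch of Algorithm 3.3 (ours; it swaps full rows rather than keeping an index vector, and is meant only to mirror the steps above):

function x = ge_partial_pivot(A, b)
% Gaussian elimination with partial pivoting and back-substitution.
n = length(b);
for k = 1:n-1
    [~, p] = max(abs(A(k:n,k)));    % step 1: row of largest pivot candidate
    p = p + k - 1;
    A([k p],:) = A([p k],:);        % step 2: interchange rows k and p
    b([k p])   = b([p k]);
    for i = k+1:n                   % step 3: eliminate below the pivot
        m = A(i,k) / A(k,k);
        A(i,k:n) = A(i,k:n) - m*A(k,k:n);
        b(i) = b(i) - m*b(k);
    end
end
x = zeros(n,1);                     % step 4: back-substitution
x(n) = b(n) / A(n,n);
for k = n-1:-1:1
    x(k) = (b(k) - A(k,k+1:n)*x(k+1:n)) / A(k,k);
end
end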
REMARK 3.2 In Algorithm 3.3, the computations are arranged serially, that is, they are arranged so each individual addition and multiplication is done separately. However, it is efficient on modern machines, that have pipelined operations and usually also have more than one processor, to think of the operations as being done on vectors. Furthermore, we don't necessarily need to change entire rows, but just keep track of a set of indices indicating which rows are interchanged; for large systems, this saves a significant number of storage and retrieval operations. For views of the Gaussian elimination process in terms of vector operations, see [16]. For an example of software that takes account of the way machines are built, see [5].
REMARK 3.3 If U is the upper triangular matrix resulting from Gaussian elimination with partial pivoting, we have
\[
\det(A) = (-1)^K\det(U) = (-1)^Ka^{(1)}_{11}a^{(2)}_{22} \cdots a^{(n)}_{nn},
\]
where K is the number of row interchanges made.
3.2.5 Systems with a Special Structure
We now consider some special but commonly encountered kinds of matrices.
3.2.5.1 Symmetric, Positive Definite Matrices
We first characterize positive definite matrices.
THEOREM 3.5
Let A be a real symmetric n × n matrix. Then A is positive definite if and only if there exists an invertible lower triangular matrix L such that A = LL^T. Furthermore, we can choose the diagonal elements of L, ℓ_{ii}, 1 ≤ i ≤ n, to be positive numbers.
The decomposition with positive ℓ_{ii} is called the Cholesky factorization of A. It can be shown that this decomposition is unique. L can be computed using a variant of Gaussian elimination. Set ℓ_{11} = \sqrt{a_{11}} and ℓ_{j1} = a_{j1}/\sqrt{a_{11}} for 2 ≤ j ≤ n. (Note that x^TAx > 0, and the choice x = e_j implies that a_{jj} > 0.) Then, for i = 1, 2, 3, \dots, n, set
\[
ℓ_{ii} = \left( a_{ii} - \sum_{k=1}^{i-1} (ℓ_{ik})^2 \right)^{1/2},
\qquad
ℓ_{ji} = \frac{1}{ℓ_{ii}} \left( a_{ji} - \sum_{k=1}^{i-1} ℓ_{ik}ℓ_{jk} \right) \quad \text{for } i+1 \le j \le n.
\]
If A is real symmetric and L can be computed in this way, then A is positive definite. (This is an efficient way to show positive definiteness.) To solve Ax = b where A is real symmetric positive definite, L can be formed in this way, and the pair Ly = b and L^Tx = y can be solved for x, analogously to the way we use the LU decomposition to solve a system.
Note: The multiplication and division count for the Cholesky decomposition is n^3/6 + O(n^2). Thus, for large n, about 1/2 the multiplications and divisions are required compared to standard Gaussian elimination.
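These formulas translate directly into matlab; the following sketch (ours) computes L column by column, and produces a complex or infinite entry precisely when A is not symmetric positive definite:

function L = cholesky_sketch(A)
% Cholesky factorization A = L*L' by the recursion above.
n = size(A,1);
L = zeros(n);
for i = 1:n
    L(i,i) = sqrt(A(i,i) - sum(L(i,1:i-1).^2));
    for j = i+1:n
        L(j,i) = (A(j,i) - L(j,1:i-1)*L(i,1:i-1)') / L(i,i);
    end
end
end

(matlab's built-in chol(A)' produces the same lower triangular factor, up to roundoff.)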
Example 3.18
Consider solving approximately
\[
-x''(t) = \sin(\pi t), \quad x(0) = x(1) = 0.
\]
One technique of approximately solving this equation is to replace x'' in the differential equation by
\[
x''(t) \approx \frac{x(t+h) - 2x(t) + x(t-h)}{h^2}. \tag{3.6}
\]
If we subdivide the interval [0, 1] into four subintervals, then the end points of these subintervals are t_0 = 0, t_1 = 1/4, t_2 = 1/2, t_3 = 3/4, and t_4 = 1. If we require the approximate differential equation, with x'' replaced using (3.6), to be exact at t_1, t_2, and t_3, and take h = 1/4 to be the length of a subinterval, we obtain:
\[
\begin{aligned}
\text{at } t_1 = \tfrac{1}{4}: &\quad -\frac{x_2 - 2x_1 + x_0}{1/16} = \sin(\pi/4),\\
\text{at } t_2 = \tfrac{1}{2}: &\quad -\frac{x_3 - 2x_2 + x_1}{1/16} = \sin(\pi/2),\\
\text{at } t_3 = \tfrac{3}{4}: &\quad -\frac{x_4 - 2x_3 + x_2}{1/16} = \sin(3\pi/4),
\end{aligned}
\]
with t_k = k/4, k = 0, 1, 2, 3, 4. If we plug in x_0 = 0 and x_4 = 0, multiply both sides of each of these three equations by h^2 = 1/16, and write the equations in matrix form, we obtain
\[
\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \frac{1}{16}
\begin{bmatrix} \sin(\pi/4) \\ \sin(\pi/2) \\ \sin(3\pi/4) \end{bmatrix}.
\]
The matrix for this system is symmetric. There is a matlab function chol
that performs a Cholesky factorization. We use it as follows:
>> A = [2 -1 0
-1 2 -1
0 -1 2]
A =
2 -1 0
-1 2 -1
0 -1 2
>> b = (1/16)*[sin(pi/4); sin(pi/2); sin(3*pi/4)]
b =
0.0442
0.0625
0.0442
>> L = chol(A)'
L =
1.4142 0 0
-0.7071 1.2247 0
0 -0.8165 1.1547
>> L*L'-A
ans =
1.0e-015 *
0.4441 0 0
0 -0.4441 0
0 0 0
>> y = L\b
y =
0.0312
0.0691
0.0871
>> x = L'\y
x =
0.0754
0.1067
0.0754
>> A\b
ans =
0.0754
0.1067
0.0754
>>
3.2.5.2 Tridiagonal Matrices
A tridiagonal matrix is a matrix of the form
\[
A = \begin{bmatrix}
a_1 & c_1 & 0 & \cdots & \cdots & 0 \\
b_2 & a_2 & c_2 & \ddots & & \vdots \\
0 & b_3 & a_3 & c_3 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & b_{n-1} & a_{n-1} & c_{n-1} \\
0 & \cdots & \cdots & 0 & b_n & a_n
\end{bmatrix}.
\]
For example, the matrix from Example 3.18 is tridiagonal. In many cases important in applications, A can be decomposed into a product of two bidiagonal matrices, that is,
\[
A = LU = \begin{bmatrix}
\alpha_1 & 0 & \cdots & 0 \\
b_2 & \alpha_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & b_n & \alpha_n
\end{bmatrix}
\begin{bmatrix}
1 & \beta_1 & \cdots & 0 \\
\vdots & \ddots & \ddots & \vdots \\
\vdots & & \ddots & \beta_{n-1} \\
0 & \cdots & 0 & 1
\end{bmatrix}. \tag{3.7}
\]
In such cases, multiplying the matrices on the right of (3.7) together and equating the resulting matrix entries with corresponding entries of A gives the following variant of Gaussian elimination:
\[
\begin{aligned}
\alpha_1 &= a_1, & \beta_1 &= c_1/\alpha_1, & &\\
\alpha_i &= a_i - b_i\beta_{i-1}, & \beta_i &= c_i/\alpha_i & &\text{for } i = 2, \dots, n-1,\\
\alpha_n &= a_n - b_n\beta_{n-1}. & & & &
\end{aligned}
\tag{3.8}
\]
Thus, if α_i ≠ 0, 1 ≤ i ≤ n, we can compute the decomposition (3.7). Furthermore, we can compute the solution to Ax = f = (f_1, f_2, \dots, f_n)^T by successively solving Ly = f and Ux = y, i.e.,
\[
\begin{aligned}
y_1 &= f_1/\alpha_1,\\
y_i &= (f_i - b_iy_{i-1})/\alpha_i \quad \text{for } i = 2, 3, \dots, n,\\
x_n &= y_n,\\
x_j &= y_j - \beta_jx_{j+1} \quad \text{for } j = n-1, n-2, \dots, 1.
\end{aligned}
\tag{3.9}
\]
Sufficient conditions to guarantee the decomposition (3.7) are as follows.
THEOREM 3.6
Suppose the elements a_i, b_i, and c_i of A satisfy |a_1| > |c_1| > 0, |a_i| ≥ |b_i| + |c_i| and b_ic_i ≠ 0 for 2 ≤ i ≤ n−1, and suppose |a_n| > |b_n| > 0. Then A is invertible and the α_i's are nonzero. (Consequently, the factorization (3.7) is possible.)
Note: It can be verified that solution of a linear system having tridiagonal coefficient matrix using (3.8) and (3.9) requires (5n − 4) multiplications and divisions and 3(n − 1) additions and subtractions. (Recall that we need n^3/3 + O(n^2) multiplications and divisions for Gaussian elimination.) Storage requirements are also drastically reduced, to 3n locations versus n^2 for a full matrix.
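A matlab sketch of (3.8) and (3.9) (ours; it stores only the three diagonals and assumes the α_i are nonzero, as guaranteed by Theorem 3.6):

function x = tridiag_solve(a, b, c, f)
% a(1:n): diagonal, b(2:n): subdiagonal, c(1:n-1): superdiagonal,
% f(1:n): right-hand side.  Implements (3.8) and (3.9).
n = length(a);
alpha = zeros(n,1); beta = zeros(n,1); y = zeros(n,1); x = zeros(n,1);
alpha(1) = a(1);  beta(1) = c(1)/alpha(1);  y(1) = f(1)/alpha(1);
for i = 2:n
    alpha(i) = a(i) - b(i)*beta(i-1);
    if i < n, beta(i) = c(i)/alpha(i); end
    y(i) = (f(i) - b(i)*y(i-1)) / alpha(i);
end
x(n) = y(n);
for j = n-1:-1:1
    x(j) = y(j) - beta(j)*x(j+1);
end
end

For the 3-by-3 system of Example 3.18, tridiag_solve([2;2;2], [0;-1;-1], [-1;-1], (1/16)*[sin(pi/4); sin(pi/2); sin(3*pi/4)]) reproduces the solution found there, approximately (0.0754, 0.1067, 0.0754)^T.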
Example 3.19
The matrix from Example 3.18 is tridiagonal and satisfies the conditions in Theorem 3.6. This holds true if we form the linear system of equations in the same way as in Example 3.18, regardless of how small we make h and how large the resulting system is. Thus, we may solve such systems with the forward substitution and back substitution algorithms represented by (3.8) and (3.9). If we want less truncation error in the approximation to the differential equation, we need to solve a larger system (with h smaller). It is more practical to do so with (3.8) and (3.9) than with the general Gaussian elimination algorithm, since the amount of work the computer has to do is proportional to n, rather than n^3.
3.2.5.3 Block Tridiagonal Matrices
We now consider briefly block tridiagonal matrices, that is, matrices of the form
\[
A = \begin{bmatrix}
A_1 & C_1 & 0 & \cdots & \cdots & 0 \\
B_2 & A_2 & C_2 & \ddots & & \vdots \\
0 & B_3 & A_3 & C_3 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & B_{n-1} & A_{n-1} & C_{n-1} \\
0 & \cdots & \cdots & 0 & B_n & A_n
\end{bmatrix},
\]
where A_i, B_i, and C_i are m × m matrices. Analogous to the tridiagonal case, we construct a factorization of the form
\[
A = \begin{bmatrix}
\tilde{A}_1 & 0 & \cdots & 0 \\
B_2 & \tilde{A}_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & B_n & \tilde{A}_n
\end{bmatrix}
\begin{bmatrix}
I & E_1 & \cdots & 0 \\
0 & I & \ddots & \vdots \\
\vdots & \ddots & \ddots & E_{n-1} \\
0 & \cdots & 0 & I
\end{bmatrix}.
\]
Provided the \tilde{A}_i, 1 ≤ i ≤ n, are nonsingular, we can compute:
\[
\begin{aligned}
\tilde{A}_1 &= A_1,\\
E_1 &= \tilde{A}_1^{-1}C_1,\\
\tilde{A}_i &= A_i - B_iE_{i-1} \quad \text{for } 2 \le i \le n,\\
E_i &= \tilde{A}_i^{-1}C_i \quad \text{for } 2 \le i \le n-1.
\end{aligned}
\]
For efficiency, the \tilde{A}_i^{-1} are generally not computed; instead, the columns of E_i are computed by factoring \tilde{A}_i and solving a pair of triangular systems.
That is, \tilde{A}_iE_i = C_i with \tilde{A}_i = L_iU_i becomes L_iU_iE_i = C_i.
Note: The number of operations for computing a block factorization of a block tridiagonal system is proportional to nm^3. This is significantly less than the number of operations, proportional to n^3, for completing the general Gaussian elimination algorithm, for m small relative to n. In such cases, tremendous savings are achieved by taking advantage of the zero elements.
Now consider
\[
Ax = b, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},
\]
where x_i, b_i ∈ R^m. Then, with the factorization A = LU, Ax = b can be solved as follows: Ly = b, Ux = y, with
\[
\begin{aligned}
\tilde{A}_1y_1 &= b_1,\\
\tilde{A}_iy_i &= (b_i - B_iy_{i-1}) \quad \text{for } i = 2, \dots, n,\\
x_n &= y_n,\\
x_j &= y_j - E_jx_{j+1} \quad \text{for } j = n-1, \dots, 1.
\end{aligned}
\]
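A matlab sketch of this block factorization and solve (ours), with the blocks held in cell arrays, is:

function x = block_tridiag_solve(A, B, C, b)
% A{1..n}: diagonal blocks, B{2..n}: subdiagonal blocks,
% C{1..n-1}: superdiagonal blocks, b{1..n}: right-hand-side blocks.
n = numel(A);
At = cell(n,1); E = cell(n,1); y = cell(n,1); x = cell(n,1);
At{1} = A{1};
E{1} = At{1} \ C{1};
y{1} = At{1} \ b{1};
for i = 2:n
    At{i} = A{i} - B{i}*E{i-1};
    if i < n, E{i} = At{i} \ C{i}; end
    y{i} = At{i} \ (b{i} - B{i}*y{i-1});
end
x{n} = y{n};
for j = n-1:-1:1
    x{j} = y{j} - E{j}*x{j+1};
end
end

In practice each \tilde{A}_i would be factored once (for example with lu) and that factorization reused for both E_i and y_i, as described above.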
Block tridiagonal systems arise in various applications, such as in equilibrium models for diffusion processes in two and three variables, a simple prototype of which is the equation
\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y),
\]
when we approximate the partial derivatives in a manner similar to how we approximated x'' in Example 3.18. In that case, not only is the overall system block tridiagonal, but, depending on how we order the equations and variables, the individual matrices A_i, B_i, and C_i are tridiagonal, or contain mostly zeros. Taking advantage of these facts is absolutely necessary, to be able to achieve the desired accuracy in the approximation to the solutions of certain models.
3.2.5.4 Banded Matrices
A generalization of a tridiagonal matrix arising in many applications is a banded matrix. Such matrices have non-zero elements only on the diagonal and p entries above and below the diagonal. For example, p = 1 for a tridiagonal matrix. The number p is called the semi-bandwidth of the matrix.
Example 3.20
\[
\begin{bmatrix}
3 & 1 & 1 & 0 & 0 \\
1 & 3 & 1 & 1.1 & 0 \\
0.9 & 1 & 3 & 1 & 1.1 \\
0 & 1.1 & 1 & 3 & 1 \\
0 & 0 & 0.9 & 1 & 3
\end{bmatrix}
\]
is a banded matrix with semi-bandwidth equal to 2.
Provided Gaussian elimination without pivoting is applicable, banded matrices may be stored and solved analogously to tridiagonal matrices. In particular, we may store the matrix in 2p + 1 vectors, and we may use an algorithm similar to (3.8) and (3.9), based on the general Gaussian elimination algorithm (Algorithm 3.1 on page 83), but with the loop on i having an upper bound equal to min{k + p, n}, rather than n, and with the a_{i,j} replaced by appropriate references to the n by 2p + 1 matrix in which the non-zero entries are stored. It is advantageous to handle a matrix as a banded matrix when its dimension n is large relative to p.
3.2.5.5 General Sparse Matrices
Numerous applications, such as models of communications and transportation networks, give rise to matrices most of whose elements are zero, but do not have an easily usable structure such as a block or banded structure. Matrices most of whose elements are zero are called sparse matrices. Matrices that are not sparse are called dense or full. Special, more sophisticated variants of Gaussian elimination, as well as iterative methods, which we treat later in Section 3.5, may be used for sparse matrices.
Several different schemes are used to store sparse matrices. One such scheme is to store two integer vectors r and c and one floating point vector v, such that the number of entries in r, c, and v is the total number of non-zero elements in the matrix; r_i gives the row index of the i-th non-zero element, c_i gives the corresponding column index, and v_i gives the value.
Example 3.21
The matrix
\[
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
3 & 0 & 0 & 0 & 1 \\
2 & 1 & 0 & 1.1 & 0 \\
0 & 0 & 0 & 5 & 1 \\
7 & 8 & 0 & 0 & 0
\end{bmatrix}
\]
may be stored with the vectors
\[
r = \begin{bmatrix} 2 \\ 3 \\ 5 \\ 3 \\ 5 \\ 1 \\ 3 \\ 4 \\ 2 \\ 4 \end{bmatrix}, \quad
c = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 2 \\ 2 \\ 3 \\ 4 \\ 4 \\ 5 \\ 5 \end{bmatrix}, \quad \text{and} \quad
v = \begin{bmatrix} 3 \\ 2 \\ 7 \\ 1 \\ 8 \\ 1 \\ 1.1 \\ 5 \\ 1 \\ 1 \end{bmatrix}.
\]
Note that there are 25 entries in this matrix, but only 10 nonzero entries.
There is a question concerning whether or not a particular matrix should be considered to be sparse, rather than treated as dense. In particular, if the matrix has some elements that are zero, but many are not, it may be more efficient to treat the matrix as dense. This is because there is extra overhead in the algorithms used to solve the systems with matrices that are stored as sparse, and the elimination process can cause fill-in, the introduction of non-zeros in the transformed matrix into elements that were zero in the original matrix. Whether a matrix should be considered to be sparse or not depends on the application, the type of computer used to solve the system, etc. Sparse systems that have a banded or block structure are more efficiently treated with special algorithms for banded or block systems than with algorithms for general sparse matrices.
There is extensive support for sparse matrices in matlab. This is detailed in matlab's help system. One method of describing a sparse matrix in matlab is as we have done in Example 3.21.
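For instance (a small illustration, ours), the matrix of Example 3.21 can be entered in matlab directly from the triplet vectors r, c, and v:

r = [2 3 5 3 5 1 3 4 2 4]';
c = [1 1 1 2 2 3 4 4 5 5]';
v = [3 2 7 1 8 1 1.1 5 1 1]';
S = sparse(r, c, v, 5, 5);   % stores only the 10 nonzero entries
full(S)                      % recovers the dense 5-by-5 matrix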
3.3 Roundoff Error and Conditioning
On page 17, we defined the condition number of a function in terms of the ratio of the relative error in the function value to the relative error in its argument. Also, in Example 1.17 on page 16, we saw that one way of computing a quantity can lead to a large relative error in the result, while another way leads to an accurate result; that is, one algorithm can be numerically unstable while another is stable.
Similar concepts hold for solutions to systems of linear equations. For example, Example 3.17 on page 90 illustrated that Gaussian elimination without partial pivoting can be numerically unstable for a system of equations where Gaussian elimination with partial pivoting is stable. We also have a concept of condition number of a matrix, which relates the relative change of components of x to changes in the elements of the matrix A and right-hand-side vector b for the system. To understand the most commonly used type of condition number of a matrix, we introduce norms.
3.3.1 Norms
We use norms to describe errors in vectors and convergence of sequences of
vectors.
DEFINITION 3.15 A function that assigns a non-negative real number ∥v∥ to a vector v is called a norm, provided it has the following properties.
1. ∥u∥ ≥ 0.
2. ∥u∥ = 0 if and only if u = 0.
3. ∥αu∥ = |α| ∥u∥ for α ∈ R (or C if v is a complex vector).
4. ∥u + v∥ ≤ ∥u∥ + ∥v∥ (triangle inequality).
Consider V = C^n, the vector space of n-tuples of complex numbers. Note that x ∈ C^n has the form x = (x_1, x_2, \dots, x_n)^T. Also,
\[
x + y = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n)^T \quad \text{and} \quad αx = (αx_1, αx_2, \dots, αx_n)^T.
\]
Important norms on C^n are:
(a) ∥x∥_∞ = \max_{1 \le i \le n} |x_i|: the ℓ_∞ or max norm (for z = a + ib, |z| = \sqrt{a^2 + b^2} = \sqrt{\bar{z}z}),
(b) ∥x∥_1 = \sum_{i=1}^{n} |x_i|: the ℓ_1 norm,
(c) ∥x∥_2 = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2}: the ℓ_2 norm (Euclidean norm),
(d) Scaled versions of the above norms, where we define ∥v∥_a = a^T|v| (with |v| denoting the vector of the absolute values of the components of v), where a = (a_1, a_2, \dots, a_n)^T with a_i > 0 for 1 ≤ i ≤ n.
A useful property relating the Euclidean norm and the dot product is:
THEOREM 3.7
(the Cauchy–Schwarz inequality)
\[
|v \cdot w| = |v^Tw| \le \|v\|_2\,\|w\|_2.
\]
We now introduce a concept and notation for describing errors in computations involving vectors.
DEFINITION 3.16 The distance from u to v is defined as ∥u − v∥.
The following concept and associated theorem are worth keeping in mind, since they hint that, in many cases, it is not so important, from the point of view of size of the error, which norm we choose to describe the error in a vector.
DEFINITION 3.17 Two norms ∥·∥_α and ∥·∥_β are called equivalent if there exist positive constants c_1 and c_2 such that
\[
c_1\|x\|_α \le \|x\|_β \le c_2\|x\|_α.
\]
Hence, also,
\[
\frac{1}{c_2}\|x\|_β \le \|x\|_α \le \frac{1}{c_1}\|x\|_β.
\]
THEOREM 3.8
Any two norms on C^n are equivalent.
The following are the constants associated with the 1-, 2-, and ∞-norms:
\[
\begin{aligned}
\text{(a)}\quad & \|x\|_∞ \le \|x\|_2 \le \sqrt{n}\,\|x\|_∞,\\
\text{(b)}\quad & \frac{1}{\sqrt{n}}\|x\|_1 \le \|x\|_2 \le \|x\|_1,\\
\text{(c)}\quad & \frac{1}{n}\|x\|_1 \le \|x\|_∞ \le \|x\|_1.
\end{aligned}
\tag{3.10}
\]
The above relations are sharp in the sense that vectors can be found for which the inequalities are actually equations. Thus, in a sense, the 1-, 2-, and ∞-norms of vectors become less equivalent, the larger the vector space.
Example 3.22
The matlab function norm computes norms of vectors. Consider the follow-
ing dialog.
>> x = [1;1;1;1;1]
x =
1
1
1
1
1
>> norm(x,1)
ans =
5
>> norm(x,2)
ans =
2.2361
>> norm(x,inf)
ans =
1
>> n=1000;
>> for i=1:n;x(i)=1;end;
>> norm(x,1)
ans =
1000
>> norm(x,2)
ans =
31.6228
>> norm(x,inf)
ans =
1
>>
This illustrates that, for a vector all of whose entries are equal to 1, the second inequality in (3.10)(a) is an equation, the first inequality in (b) is an equation, and the first inequality in (c) is an equation.
To discuss the condition number of the matrix, we use the concept of the norm of a matrix. In the following, A and B are arbitrary square matrices and α is a complex number.
DEFINITION 3.18 A matrix norm is a real-valued function of A, denoted by ∥A∥, satisfying:
1. ∥A∥ ≥ 0.
2. ∥A∥ = 0 if and only if A = 0.
3. ∥αA∥ = |α| ∥A∥.
4. ∥A + B∥ ≤ ∥A∥ + ∥B∥.
5. ∥AB∥ ≤ ∥A∥ ∥B∥.
REMARK 3.4 In contrast to vector norms, we have an additional fifth property, referred to as a submultiplicative property, dealing with the norm of the product of two matrices.
Example 3.23
The quantity
\[
\|A\|_E = \left( \sum_{i,j=1}^{n} |a_{ij}|^2 \right)^{1/2}
\]
is called the Frobenius norm. Since the Frobenius norm is the Euclidean norm of the matrix when the matrix is viewed to be a single vector formed by concatenating its columns (or rows), the Frobenius norm is a norm. It is also possible to prove that the Frobenius norm is a matrix norm.
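In matlab, the Frobenius norm is available as norm(A,'fro'); for example (a quick check, ours):

A = [1 2 3; 4 5 6; 7 8 10];
norm(A, 'fro')            % Frobenius norm of A
norm(A(:), 2)             % same value: Euclidean norm of the stacked columns
sqrt(sum(abs(A(:)).^2))   % the defining formula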
To relate norms of matrices to errors in the solution of linear systems, we relate vector norms to matrix norms:
DEFINITION 3.19 A matrix norm ∥A∥ and a vector norm ∥x∥ are called compatible if for all vectors x and matrices A we have ∥Ax∥ ≤ ∥A∥ ∥x∥.
REMARK 3.5 A consequence of the Cauchy–Schwarz inequality is that ∥Ax∥_2 ≤ ∥A∥_E ∥x∥_2, i.e., the Euclidean norm ∥·∥_E for matrices is compatible with the ℓ_2-norm ∥·∥_2 for vectors.
In fact, every vector norm has associated with it a sharply defined compatible matrix norm:
DEFINITION 3.20 Given a vector norm ∥·∥, we define a natural or induced matrix norm associated with it as
\[
\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}. \tag{3.11}
\]
It is straightforward to show that an induced matrix norm satisfies the five properties required of a matrix norm. Also, from the definition of induced norm, an induced matrix norm is compatible with the given vector norm, that is,
\[
\|A\|\,\|x\| \ge \|Ax\| \quad \text{for all } x \in C^n. \tag{3.12}
\]
REMARK 3.6 Definition 3.20 is equivalent to
\[
\|A\| = \sup_{\|y\|=1} \|Ay\|,
\]
since
\[
\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \sup_{x \neq 0} \left\| A\frac{x}{\|x\|} \right\| = \sup_{\|y\|=1} \|Ay\|
\]
(letting y = x/∥x∥).
We now present explicit expressions for ∥A∥_∞, ∥A∥_1, and ∥A∥_2.
THEOREM 3.9
(Formulas for common induced matrix norms)
(a) ∥A∥_∞ = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}| = {maximum absolute row sum}.
(b) ∥A∥_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}| = {maximum absolute column sum}.
(c) ∥A∥_2 = \sqrt{ρ(A^HA)}, where ρ(M) is the spectral radius of the matrix M, that is, the maximum absolute value of an eigenvalue of M.
(We will study eigenvalues and eigenvectors in Chapter 5. This spectral radius plays a fundamental role in a more advanced study of matrix norms. In particular, ρ(A) ≤ ∥A∥ for any square matrix A and any matrix norm, and, for any square matrix A and any ε > 0, there is a matrix norm ∥·∥ such that ∥A∥ ≤ ρ(A) + ε.)
Note that ∥A∥_2 is not equal to the Frobenius norm.
Example 3.24
The norm function in matlab gives the induced matrix norm when its argument is a matrix. With the matrix A as in Example 3.13 (on page 81), consider the following matlab dialog (edited for brevity):
>> A
A =
1 2 3
4 5 6
7 8 10
>> x
ans = 1 1 1
>> norm(A,1)
ans = 19
>> norm(x,1)
ans = 3
>> norm(A*x,1)
ans = 46
>> norm(A,1)*norm(x,1)
ans = 57
>> norm(A,2)
ans = 17.4125
>> norm(x,2)
ans = 1.7321
>> norm(A*x,2)
ans = 29.7658
>> norm(A,2)*norm(x,2)
ans = 30.1593
>> norm(A,inf)
ans = 25
>> norm(x,inf)
ans = 1
>> norm(A*x,inf)
ans = 25
>> norm(A,inf)*norm(x,inf)
ans = 25
>>
We are now prepared to discuss condition numbers of matrices.
3.3.2 Condition Numbers
We begin with the following:
DEFINITION 3.21 If the solution x of Ax = b changes drastically when
A or b is perturbed slightly, then the system Ax = b is called ill-conditioned.
Because rounding errors are unavoidable with floating point arithmetic, much accuracy can be lost during Gaussian elimination for ill-conditioned systems. In fact, the final solution may be considerably different than the exact solution.
Example 3.25
An ill-conditioned system is
\[
Ax = \begin{bmatrix} 1 & 0.99 \\ 0.99 & 0.98 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1.99 \\ 1.97 \end{bmatrix}, \quad \text{whose exact solution is} \quad x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
However,
\[
A\tilde{x} = \begin{bmatrix} 1.989903 \\ 1.970106 \end{bmatrix} \quad \text{has solution} \quad \tilde{x} = \begin{bmatrix} 3 \\ -1.0203 \end{bmatrix}.
\]
Thus, a change of
\[
δb = \begin{bmatrix} -0.000097 \\ 0.000106 \end{bmatrix} \quad \text{produces a change} \quad δx = \begin{bmatrix} 2.0000 \\ -2.0203 \end{bmatrix}.
\]
We first study the phenomenon of ill-conditioning, then study roundoff error in Gaussian elimination. We begin with
THEOREM 3.10
Let ∥·∥ be an induced matrix norm. Let x be the solution of Ax = b with A an n × n invertible complex matrix. Let x + δx be the solution of
\[
(A + δA)(x + δx) = b + δb. \tag{3.13}
\]
Assume that
\[
\|δA\|\,\|A^{-1}\| < 1. \tag{3.14}
\]
Then
\[
\frac{\|δx\|}{\|x\|} \le κ(A)\left(1 - \|δA\|\,\|A^{-1}\|\right)^{-1}\left( \frac{\|δb\|}{\|b\|} + \frac{\|δA\|}{\|A\|} \right), \tag{3.15}
\]
where
\[
κ(A) = \|A\|\,\|A^{-1}\|
\]
is defined to be the condition number of the matrix A with respect to the norm ∥·∥. There exist perturbations δx and δb for which (3.15) holds with equality. That is, inequality (3.15) is sharp.
(We supply a proof of Theorem 3.10 in [1].)
The condition number κ(A) ≥ 1 for any induced matrix norm and any matrix A, since
\[
1 = \|I\| = \|A^{-1}A\| \le \|A^{-1}\|\,\|A\| = κ(A).
\]
Example 3.26
Consider the system of equations from Example 3.25, and the following matlab dialog.
>> A = [1 0.99
0.99 0.98]
A =
1.0000 0.9900
0.9900 0.9800
>> norm(A,1)*norm(inv(A),1)
ans =
3.9601e+004
>> b = [1.99;1.97]
b =
1.9900
1.9700
>> x = A\b
x =
1.0000
1.0000
>> btilde = [1.989903;1.980106]
btilde =
1.9899
1.9801
>> xtilde = A\btilde
xtilde =
102.0000
-101.0203
>> norm(x-xtilde,1)/norm(x,1)
ans =
101.5102
>> sol_error = norm(x-xtilde,1)/norm(x,1)
sol_error =
101.5102
>> data_error = norm(b-btilde,1)/norm(b,1)
data_error =
0.0026
>> data_error * cond(A,1)
ans =
102.0326
>> cond(A,1)
ans =
3.9601e+004
>> cond(A,2)
ans =
3.9206e+004
>> cond(A,inf)
ans =
3.9601e+004
>>
This illustrates the definition of the condition number, as well as the fact that the relative error in the norm of the solution can be estimated by the relative error in the norms of the matrix and the right-hand-side vector multiplied by the condition number of the matrix. Also, in this two-dimensional case, the condition numbers in the 1-, 2-, and ∞-norms do not differ by much. The actual errors in the computed solutions A\b and A\btilde are small relative to the displayed digits, in this case.
If δA = 0, we have
\[
\frac{\|δx\|}{\|x\|} \le κ(A)\frac{\|δb\|}{\|b\|},
\]
and if δb = 0, then
\[
\frac{\|δx\|}{\|x\|} \le \frac{κ(A)}{1 - \|δA\|\,\|A^{-1}\|}\,\frac{\|δA\|}{\|A\|}.
\]
Note: In solving systems using Gaussian elimination with partial pivoting, we can use the condition number as a rule of thumb in estimating the number of digits correct in the solution. For example, if double precision arithmetic is used, errors in storing the matrix into internal binary format and in each step of the Gaussian elimination process are on the order of 10^{-16}. If the condition number is 10^4, then we might expect 16 − 4 = 12 digits to be correct in the solution. In many cases, this is close. (For more foolproof bounds on the error, interval arithmetic techniques can sometimes be used.)
Note: For a unitary matrix U, i.e., U^HU = I, we have κ_2(U) = 1. Such a matrix is called perfectly conditioned, since κ(A) ≥ 1 for any ∥·∥ and A.
A classic example of an ill-conditioned matrix is the Hilbert matrix of order n:
\[
H_n = \begin{bmatrix}
1 & \frac{1}{2} & \frac{1}{3} & \cdots & \frac{1}{n} \\
\frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \cdots & \frac{1}{n+1} \\
\vdots & & & & \vdots \\
\frac{1}{n} & \frac{1}{n+1} & \frac{1}{n+2} & \cdots & \frac{1}{2n-1}
\end{bmatrix}.
\]
Hilbert matrices and matrices that are approximately Hilbert matrices occur in approximation of data and functions. Condition numbers for some Hilbert matrices appear in Table 3.1. The reader may verify entries in this table, using the following matlab dialog as an example.

TABLE 3.1: Condition numbers of some Hilbert matrices

  n        |    3        5         6         8         16          32          64
  κ2(Hn)   | 5×10^2   5×10^5   15×10^6   15×10^9   2.0×10^22   4.8×10^46   3.5×10^95
>> hilb(3)
ans =
1.0000 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
>> cond(hilb(3))
ans =
524.0568
>> cond(hilb(3),2)
ans =
524.0568
REMARK 3.7 Consider Ax = b. Ill-conditioning combined with rounding errors can have a disastrous effect in Gaussian elimination. Sometimes, the conditioning can be improved (κ(A) decreased) by scaling the equations. A common scaling strategy is to row equilibrate the matrix A by choosing a diagonal matrix D, such that premultiplying A by D causes \max_{1 \le j \le n} |a_{ij}| = 1 for i = 1, 2, \dots, n. Thus, DAx = Db becomes the scaled system with maximum elements in each row of DA equal to unity. (This procedure is generally recommended before Gaussian elimination with partial pivoting is employed [19]. However, there is no guarantee that equilibration with partial pivoting will not suffer greatly from effects of roundoff error.)
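In matlab, this row equilibration can be written in a few lines (a sketch, ours), for a matrix A and right-hand-side vector b:

D = diag(1 ./ max(abs(A), [], 2));  % reciprocal of each row's largest |entry|
As = D * A;                         % every row of As has maximum element 1
bs = D * b;                         % scale the right-hand side accordingly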
Example 3.27
The condition number does not give the entire story in Gaussian elimination. In particular, if we multiply an entire equation by a non-zero number, this changes the condition number of the matrix, but does not have an effect on Gaussian elimination. Consider the following matlab dialog.
>> A = [1 1
-1 1]
A =
1 1
-1 1
>> cond(A)
ans =
1.0000
>> A(1,:) = 1e16*A(1,:)
A =
1.0e+016 *
1.0000 1.0000
-0.0000 0.0000
>> cond(A)
ans =
1.0000e+016
>>
However, the strange scaling in the first row of the matrix will not cause
serious roundoff error when Gaussian elimination proceeds with floating point
arithmetic, if the right-hand-sides are scaled accordingly.
3.3.3 Roundoff Error in Gaussian Elimination
Consider the solution of Ax = b. On a computer, elements of A and b
are represented by floating point numbers. Solving this linear system on a
computer only produces an approximate solution x̂.
There are two kinds of rounding error analysis. In backward error analysis,
one shows that the computed solution x̂ is the exact solution of a perturbed
system of the form (A + F)x̂ = b. (See, for example, [30] or [42].) Then we
have
\[
Ax - A\hat{x} = F\hat{x},
\]
that is,
\[
x - \hat{x} = A^{-1}F\hat{x},
\]
from which we obtain
\[
\frac{\|x - \hat{x}\|}{\|\hat{x}\|} \le \|A^{-1}\|\,\|F\| = \kappa(A)\,\frac{\|F\|}{\|A\|}. \qquad (3.16)
\]
Thus, assuming that we have estimates for κ(A) and ‖F‖, we can use
(3.16) to estimate the error ‖x − x̂‖.
In forward error analysis, one keeps track of roundoff error at each step of
the elimination procedure. Then, x − x̂ is estimated in some norm in terms
of, for example, A, κ(A), and η = (p/2)β^(1−t) [37, 38].
The analyses are lengthy and are not given here. The results, however, are
useful to understand. Basically, it is shown that
\[
\|F\|_\infty \le c_n\, g\, \eta\, \|A\|_\infty, \qquad (3.17)
\]
where
c_n is a constant that depends on the size of the n × n matrix A,
g is a growth factor,
\[
g = \frac{\max_{i,j,k} |a_{ij}^{(k)}|}{\max_{i,j} |a_{ij}|},
\]
and η is the unit roundoff error, η = (p/2)β^(1−t).
Note: Using backward error analysis, c_n = 1.01n^3 + 5(n + 1)^2, and using
forward error analysis, c_n = (1/6)(n^3 + 15n^2 + 2n − 12).
Note: The growth factor g depends on the pivoting strategy: g ≤ 2^(n−1) for
partial pivoting (this bound cannot be improved, since g = 2^(n−1) for certain
matrices), while
\[
g \le n^{1/2}\bigl(2\cdot 3^{1/2}\cdot 4^{1/3}\cdots n^{1/(n-1)}\bigr)^{1/2}
\]
for full pivoting. (Wilkinson conjectured that this can be improved to g ≤ n.)
For example, for n = 100, g ≤ 2^99 ≈ 10^30 for partial pivoting and g ≤ 3300
for full pivoting.
Note: Thus, by (3.16) and (3.17), the relative error ‖x − x̂‖_∞/‖x̂‖_∞ depends
directly on κ_∞(A), η, n^3, and the pivoting strategy.
REMARK 3.8  The factor of 2^(n−1) discouraged numerical analysts in
the 1950s from using Gaussian elimination, and spurred study of iterative
methods for solving linear systems. However, it was found that, for most
matrices, the growth factor is much less, and Gaussian elimination with partial
pivoting is usually practical.
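The worst case g = 2^(n−1) mentioned above is attained by a well-known family of matrices. The sketch below is ours; it estimates the growth factor from the final upper triangular factor, which is adequate here because the largest intermediate entries appear in the last column at the final elimination step.

% Growth factor of Gaussian elimination with partial pivoting on a matrix
% for which g = 2^(n-1): unit diagonal, -1's below the diagonal, last column of 1's.
n = 10;
A = eye(n) - tril(ones(n), -1);     % 1 on the diagonal, -1 strictly below
A(:, n) = 1;                        % last column of ones
[L, U, P] = lu(A);                  % partial pivoting; no row interchanges occur for this matrix
g = max(abs(U(:))) / max(abs(A(:))) % growth factor estimate; equals 2^(n-1) = 512 here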
3.3.4 Interval Bounds
In many instances, it is practical to obtain rigorous bounds on the solution
x to a linear system Ax = b. The algorithm is a modification of the general
Gaussian elimination algorithm (Algorithm 3.1) and back substitution
(Algorithm 3.2), as follows.
ALGORITHM 3.4
(Interval bounds for the solution to a linear system)
INPUT: The n by n matrix A and n-vector b ∈ R^n.
OUTPUT: an interval vector x such that the exact solution to Ax = b must
be within the bounds x.
1. Use Algorithm 3.1 and Algorithm 3.2 (that is, Gaussian elimination with
   back substitution, or any other technique) and floating point arithmetic
   to compute an approximation Y to A^(−1).
2. Use interval arithmetic, with directed rounding, to compute interval enclosures
   to YA and Yb. That is,
   (a) Ã ← YA (computed with interval arithmetic),
   (b) b̃ ← Yb (computed with interval arithmetic).
3. FOR k = 1, 2, ..., n − 1 (forward phase using interval arithmetic)
     FOR i = k + 1, ..., n
       (a) m_ik ← ã_ik / ã_kk.
       (b) ã_ik ← [0, 0].
       (c) FOR j = k + 1, ..., n
             ã_ij ← ã_ij − m_ik ã_kj.
           END FOR
       (d) b̃_i ← b̃_i − m_ik b̃_k.
     END FOR
   END FOR
4. x_n ← b̃_n / ã_nn.
5. FOR k = n − 1, n − 2, ..., 1 (back substitution)
     x_k ← ( b̃_k − Σ_{j=k+1}^{n} ã_kj x_j ) / ã_kk.
   END FOR
END ALGORITHM 3.4.
Note: We can explicitly set ã_ik to zero without loss of mathematical rigor,
even though, using interval arithmetic, ã_ik − m_ik ã_kk may not be exactly [0, 0].
In fact, this operation does not even need to be done, since we need not
reference ã_ik in the back substitution process.
Note: Obtaining the rigorous bounds x in Algorithm 3.4 is more costly
than computing an approximate solution with floating point arithmetic using
Gaussian elimination with back substitution, because an approximate inverse
Y must explicitly be computed to precondition the system. However, both
computations take O(n^3) operations for general systems.
THEOREM 3.11
Define the solution set to Ãx = b̃ to be
\[
\Sigma(\tilde{\boldsymbol{A}}, \tilde{\boldsymbol{b}}) =
\bigl\{ x \;\big|\; \tilde{A}x = \tilde{b} \text{ for some } \tilde{A} \in \tilde{\boldsymbol{A}} \text{ and } \tilde{b} \in \tilde{\boldsymbol{b}} \bigr\}.
\]
If Ax* = b, then x* ∈ Σ(Ã, b̃). Furthermore, if x is the output to Algorithm 3.4,
then Σ(Ã, b̃) ⊆ x.
For facts enabling a proof of Theorem 3.11, see [29] or other references on
interval analysis.
Example 3.28
\[
\begin{pmatrix}
3.3330 & 15920. & -10.333 \\
2.2220 & 16.710 & 9.612 \\
1.5611 & 5.1791 & 1.6852
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} 15913. \\ 28.544 \\ 8.4254 \end{pmatrix}
\]
For this problem, κ_∞(A) ≈ 16000 and the exact solution is x = [1, 1, 1]^T. We
will use matlab (providing IEEE double precision floating point arithmetic)
to compute Y, and we will use the intlab interval arithmetic toolbox
for matlab (based on IEEE double precision). Rounded to 14 decimal digits
(as matlab displays it), we obtain
\[
Y \approx \begin{pmatrix}
-0.00012055643706 & -0.14988499865822 & 0.85417095741675 \\
0.00006278655296 & 0.00012125786211 & -0.00030664438576 \\
-0.00008128244868 & 0.13847464088044 & -0.19692507695527
\end{pmatrix}.
\]
Using outward rounding in both the computation and the decimal display, we
obtain
\[
\tilde{\boldsymbol{A}} \approx \begin{pmatrix}
[1.00000000000000, 1.00000000000000] & [-0.00000000000012, -0.00000000000011] & [-0.00000000000001, 0.00000000000000] \\
[0.00000000000000, 0.00000000000001] & [1.00000000000000, 1.00000000000001] & [-0.00000000000001, 0.00000000000000] \\
[0.00000000000000, 0.00000000000001] & [0.00000000000013, 0.00000000000014] & [0.99999999999999, 1.00000000000001]
\end{pmatrix},
\]
and
\[
\tilde{\boldsymbol{b}} \approx \begin{pmatrix}
[0.99999999999988, 0.99999999999989] \\
[1.00000000000000, 1.00000000000001] \\
[1.00000000000013, 1.00000000000014]
\end{pmatrix}.
\]
Completing the remainder of Algorithm 3.4 then gives
\[
x^* \in \boldsymbol{x} \approx \begin{pmatrix}
[0.99999999999999, 1.00000000000001] \\
[0.99999999999999, 1.00000000000001] \\
[0.99999999999999, 1.00000000000001]
\end{pmatrix}.
\]
The actual matlab dialog is as follows:
>> format long
>> intvalinit('DisplayInfsup')
===> Default display of intervals by infimum/supremum (e.g. [ 3.14 , 3.15 ])
>> x = interval_Gaussian_elimination(A,b)
x =
1.000000000000000
1.000000000000000
1.000000000000001
>> IA = [intval(3.3330) intval(15920.) intval(-10.333)
intval(2.2220) intval(16.710) intval(9.612)
intval(1.5611) intval(5.1791) intval(1.6852)]
intval IA =
1.0e+004 *
Columns 1 through 2
[ 0.00033330000000, 0.00033330000001] [ 1.59200000000000, 1.59200000000000]
[ 0.00022219999999, 0.00022220000000] [ 0.00167100000000, 0.00167100000001]
[ 0.00015610999999, 0.00015611000000] [ 0.00051791000000, 0.00051791000001]
Column 3
[ -0.00103330000001, -0.00103330000000]
[ 0.00096120000000, 0.00096120000001]
[ 0.00016851999999, 0.00016852000001]
>> Ib = [intval(15913.);intval(28.544);intval(8.4254)]
intval Ib =
1.0e+004 *
[ 1.59130000000000, 1.59130000000000]
[ 0.00285440000000, 0.00285440000001]
[ 0.00084253999999, 0.00084254000000]
>> YA = Y*IA
intval YA =
Columns 1 through 2
[ 0.99999999999999, 1.00000000000001] [ -0.00000000000100, -0.00000000000099]
[ -0.00000000000001, 0.00000000000001] [ 1.00000000000000, 1.00000000000001]
[ -0.00000000000001, 0.00000000000001] [ 0.00000000000013, 0.00000000000014]
Column 3
[ 0.00000000000000, 0.00000000000001]
[ -0.00000000000001, -0.00000000000000]
[ 0.99999999999999, 1.00000000000001]
>> Yb = Y*Ib
intval Yb =
[ 0.99999999999900, 0.99999999999901]
[ 1.00000000000000, 1.00000000000001]
[ 1.00000000000013, 1.00000000000014]
>> x = interval_Gaussian_elimination(A,b)
x =
1.000000000000000
1.000000000000000
1.000000000000001
Here, we need to use the intlab function intval to convert the decimal
strings representing the matrix and right-hand side vector elements to small
intervals containing the actual decimal values. This is because, even though
the original system did not have interval entries, the elements cannot all be
represented exactly as binary floating point numbers, so we must enclose
the exact values in floating point intervals to be certain that the bounds we
compute contain the actual solution. This is not necessary in computing the
floating point preconditioning matrix Y, since Y need not be an exact inverse.
The function interval_Gaussian_elimination, not a part of intlab, is as
follows:
function [x] = interval_Gaussian_elimination(A, b)
% [x] = interval_Gaussian_elimination(A, b)
% returns the result of Algorithm 3.5 in the book.
% The matrix A and vector b should be intervals,
% although they may be point intervals (i.e. of width zero).
n = length(b);
Y = inv(mid(A));
Atilde = Y*A;
btilde = Y*b;
error_occurred = 0;
for k=1:n
for i=k+1:n
m_ik = Atilde(i,k)/Atilde(k,k);
for j=k+1:n
Atilde(i,j) = Atilde(i,j) - m_ik*Atilde(k,j);
end
btilde(i) = btilde(i) -m_ik*btilde(k);
end
end
x(n) = btilde(n)/Atilde(n,n);
for k=n-1:-1:1
x(k) = btilde(k);
for j=k+1:n
x(k) = x(k) - Atilde(k,j)*x(j);
end
x(k) = x(k)/Atilde(k,k);
end
x = x';
Note: There are various ways of using interval arithmetic to obtain rigorous
bounds on the solution set to linear systems of equations. Some of these are
related mathematically to the interval Newton method introduced in Section 2.4 on
page 56, while others are related to the iterative techniques we discuss later in
this section. The effectiveness and practicality of a particular such technique
depend on the condition of the system, and whether the entries in the matrix
A and right hand side vector b are points to start, or whether there are larger
uncertainties in them (that is, whether or not these coefficients are wide or
narrow intervals). A good theoretical reference is [29] and some additional
practical detail is given in our monograph [20].
We now consider another method for computing the solution of a linear
system Ax = b. This method is particularly appropriate for various statistical
computations, such as least squares fits, when there are more equations than
unknowns.
3.4 Orthogonal Decomposition (QR Decomposition)
This method for computing the solution of Ax = b is based on orthogonal
decomposition, also known as the QR decomposition or QR factorization. In
addition to solving linear systems, the QR factorization is also useful in least
squares problems and eigenvalue computations.
We will use the following concept heavily in this section, as well as when
we study the singular value decomposition.
DEFINITION 3.22  Two vectors u and v are called orthogonal provided
the dot product u · v = 0. A set of vectors {v^(i)} is said to be orthonormal,
provided v^(i) · v^(j) = δ_ij, where δ_ij is the Kronecker delta function
\[
\delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}
\]
A matrix Q whose columns are orthonormal vectors is called an orthogonal
matrix.
In QR-decompositions, we compute an orthogonal matrix Q and an upper
triangular matrix R (also known as a right triangular matrix; this is the reason
for the notation R) such that A = QR. Advantages of the QR decomposition
include the fact that systems involving an upper triangular matrix R can be
solved by back substitution, the fact that Q is perfectly conditioned (with
condition number in the 2-norm equal to 1), and the fact that the solution to
Qy = b is y = Q^T b.
There are several ways of computing QR-decompositions. These are detailed,
for example, in our graduate-level text [1]. Here, we focus on the
properties of the decomposition and its use.
Note: The QR decomposition is not unique. Hence, different software may
come up with different QR decompositions for the same matrix.
3.4.1 Properties of Orthogonal Matrices
The following two properties, easily provable, make the QR decomposition
a numerically stable way of dealing with systems of equations.
THEOREM 3.12
Suppose Q is an orthogonal matrix. Then Q has the following properties.
1. Q^T Q = I, that is, Q^T = Q^(−1). Thus, solving the system Qy = b can be
   done with a matrix multiplication. (With the usual way of multiplying
   matrices, this is n^2 multiplications, more than with back-substitution,
   but still O(n^2). Furthermore, it can be done with n dot products,
   something that is efficient on many machines.)
2. ‖Q‖₂ = ‖Q^T‖₂ = 1.
3. Hence, κ₂(Q) = 1, where κ₂(Q) is the condition number of Q in the 2-norm.
   That is, Q is perfectly conditioned with respect to the 2-norm (and
   working with systems of equations involving Q will not lead to excessive
   roundoff error accumulation).
4. ‖Qx‖₂ = ‖x‖₂ for every x ∈ R^n. Hence ‖QA‖₂ = ‖A‖₂ for every n by
   n matrix A.
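These properties are easy to check numerically. The following sketch is ours; it builds an orthogonal Q from a QR factorization of a random matrix and verifies properties 1 through 4 up to roundoff.

% Numerical check of the properties in Theorem 3.12.
n = 5;
[Q, R] = qr(rand(n));                % Q is (numerically) orthogonal
norm(Q'*Q - eye(n))                  % property 1: essentially zero
norm(Q, 2)                           % property 2: essentially 1
cond(Q, 2)                           % property 3: essentially 1
x = rand(n, 1);
norm(Q*x, 2) / norm(x, 2)            % property 4: essentially 1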
3.4.2 Least Squares and the QR Decomposition
Overdetermined linear systems (with more equations than unknowns) occur
frequently in data fitting, in mathematical modeling and statistics. For
example, we may have data of the form {(t_i, y_i)}, i = 1, ..., m, and we wish to model
the dependence of y on t by a linear combination of n basis functions {φ_j}, j = 1, ..., n,
that is,
\[
y \approx f(t) = \sum_{i=1}^{n} x_i \varphi_i(t), \qquad (3.18)
\]
where m > n. Setting f(t_i) = y_i, 1 ≤ i ≤ m, gives the overdetermined linear
system
\[
\begin{pmatrix}
\varphi_1(t_1) & \varphi_2(t_1) & \cdots & \varphi_n(t_1) \\
\varphi_1(t_2) & \varphi_2(t_2) & \cdots & \varphi_n(t_2) \\
\vdots & & & \vdots \\
\varphi_1(t_m) & \varphi_2(t_m) & \cdots & \varphi_n(t_m)
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}, \qquad (3.19)
\]
that is,
\[
Ax = b, \quad\text{where } A \in L(\mathbb{R}^n, \mathbb{R}^m),\; a_{ij} = \varphi_j(t_i), \text{ and } b_i = y_i. \qquad (3.20)
\]
Perhaps the most common way of fitting data is with least squares, in which
we find x* such that
\[
\frac{1}{2}\|Ax^* - b\|_2^2 = \min_{x \in \mathbb{R}^n} \varphi(x), \quad\text{where } \varphi(x) = \frac{1}{2}\|Ax - b\|_2^2. \qquad (3.21)
\]
(Note that x* minimizes the 2-norm of the residual vector r(x) = Ax − b,
since the function g(u) = u^2 is increasing.)
The naive way of finding x* is to set the gradient ∇φ(x) = 0 and simplify.
Doing so gives the normal equations:
\[
A^T A x = A^T b. \qquad (3.22)
\]
(See Exercise 11 on page 143.) However, the normal equations tend to be
very ill-conditioned. For example, if m = n, κ₂(A^T A) = κ₂(A)^2. Fortunately,
the least squares solution x* may be computed with a QR decomposition. In
particular,
\[
\|Ax - b\|_2 = \|QRx - b\|_2 = \|Q^T(QRx - b)\|_2 = \|Rx - Q^T b\|_2.
\]
(Above, we used ‖Ux‖₂ = ‖x‖₂ when U is orthogonal.) However,
\[
\|Rx - Q^T b\|_2^2 = \sum_{i=1}^{n} \Bigl( \Bigl(\sum_{j=i}^{n} r_{ij}x_j\Bigr) - (Q^T b)_i \Bigr)^2
 + \sum_{i=n+1}^{m} (Q^T b)_i^2. \qquad (3.23)
\]
Observe now:
1. All m terms in the sum in (3.23) are nonnegative.
2. The first n terms can be made exactly zero.
3. The last m − n terms are constant.
Therefore,
\[
\min_{x \in \mathbb{R}^n} \|Ax - b\|_2^2 = \sum_{i=n+1}^{m} (Q^T b)_i^2,
\]
and the minimizer x* can be computed by backsolving the square triangular
system consisting of the first n rows of Rx = Q^T b.
We summarize these computations in the following algorithm.
ALGORITHM 3.5
(Least squares fits with a QR decomposition)
INPUT: the m by n matrix A, m ≥ n, and b ∈ R^m.
OUTPUT: the least squares fit x ∈ R^n such that ‖Ax − b‖₂ is minimized, as
well as the square of the residual norm ‖Ax − b‖₂².
1. Compute Q and R such that Q is an m by m orthogonal matrix, R is an
   m by n upper triangular (or right triangular) matrix, and A = QR.
2. Form y = Q^T b.
3. Solve the n by n upper triangular system R_{1:n,1:n} x = y_{1:n} using
   Algorithm 3.2 (the back-substitution algorithm). Here, R_{1:n,1:n} corresponds
   to A^(n) and y_{1:n} corresponds to b^(n).
4. Set the residual norm ‖Ax − b‖₂ to
   \[
   \sqrt{\textstyle\sum_{i=n+1}^{m} y_i^2} = \|y_{n+1:m}\|_2.
   \]
END ALGORITHM 3.5.
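A compact function implementing these four steps might look as follows. This is our sketch, not code from the text, and it assumes A has full column rank so that the triangular solve in step 3 is well defined.

function [x, resid_norm_sq] = qr_least_squares(A, b)
% Least squares fit via the QR decomposition (Algorithm 3.5).
% A is m by n with m >= n, b is an m-vector.
[m, n] = size(A);
[Q, R] = qr(A);                        % step 1: full QR decomposition
y = Q' * b;                            % step 2: y = Q^T b
x = R(1:n, 1:n) \ y(1:n);              % step 3: back substitution on the first n rows
resid_norm_sq = norm(y(n+1:m), 2)^2;   % step 4: squared residual norm
end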
Example 3.29
Consider fitting the data

  t   y
  0   1
  1   4
  2   5
  3   8

in the least squares sense with a polynomial of the form
\[
p_2(x) = x_0\varphi_0(x) + x_1\varphi_1(x) + x_2\varphi_2(x),
\]
where φ₀(x) ≡ 1, φ₁(x) ≡ x, and φ₂(x) ≡ x². The overdetermined system
(3.19) becomes
\[
\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 4 \\ 5 \\ 8 \end{pmatrix}.
\]
We use matlab to perform a QR decomposition and find the least squares
solution:
>> format short
>> clear x
>> A = [1 0 0
1 1 1
1 2 4
1 3 9]
A =
1 0 0
1 1 1
1 2 4
1 3 9
>> b = [1;4;5;8]
b =
1
4
5
8
>> [Q,R] = qr(A)
Q =
-0.5000 0.6708 0.5000 0.2236
-0.5000 0.2236 -0.5000 -0.6708
-0.5000 -0.2236 -0.5000 0.6708
-0.5000 -0.6708 0.5000 -0.2236
R =
-2.0000 -3.0000 -7.0000
0 -2.2361 -6.7082
0 0 2.0000
0 0 0
>> Qtb = Q'*b;
>> x(3) = Qtb(3)/R(3,3);
>> x(2)=(Qtb(2) - R(2,3)*x(3))/R(2,3);
>> x(1) = (Qtb(1) - R(1,2)*x(2) - R(1,3)*x(3))/R(1,1)
x =
3.4000 0.7333 0.0000
>> x=x';
>> resid = A*x - b
resid =
2.4000
0.1333
-0.1333
-2.4000
>> tt = linspace(0,3);
>> yy = x(1) + x(2)*tt + x(3)*tt.^2;
>> axis([-0.1,3.1,0.9,8.1])
>> hold
Current plot held
>> plot(A(:,2),b,'LineStyle','none','Marker','*','MarkerEdgeColor','red','Markersize',15)
>> plot(tt,yy)
>> y = Q'*b
y =
-9.0000
-4.9193
0.0000
-0.8944
>> x = R(1:n,1:n)\y(1:n)
x =
1.2000
2.2000
0.0000
>> resid_norm = norm(y(n+1:m),2)
resid_norm =
0.8944
>> norm(A*x-b,2)
ans =
0.8944
>>
>>
This dialog results in the following plot, illustrating the data points as stars
and the quadratic fit (which in this case happens to be linear) as a blue curve.
[Figure: plot of the data points (stars) and the computed fit over 0 ≤ t ≤ 3.]
Note that the fit does not approximate the first and fourth data points well.
(The portion of the dialog following the plot commands illustrates alternative
views of the computation of x and the residual norm.)
Although working with the QR decomposition is a stable process, care
should be taken when computing Q and R. We discuss actually computing Q
and R in [1].
We now turn to iterative techniques for linear systems of equations.
3.5 Iterative Methods for Solving Linear Systems
Here, we study iterative solution of linear systems
\[
Ax = b, \quad\text{i.e.}\quad \sum_{k=1}^{n} a_{jk}x_k = b_j, \quad j = 1, 2, \ldots, n. \qquad (3.24)
\]
Example 3.30
Consider Example 3.18 (on page 93), where we replaced a second derivative
in a differential equation by a difference approximation, to obtain the system
\[
\begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \frac{1}{16}
\begin{pmatrix} \sin(\pi/4) \\ \sin(\pi/2) \\ \sin(3\pi/4) \end{pmatrix}.
\]
In other words, the equations are
\[
\begin{aligned}
2x_1 - x_2 &= \tfrac{1}{16}\sin\bigl(\tfrac{\pi}{4}\bigr),\\
-x_1 + 2x_2 - x_3 &= \tfrac{1}{16}\sin\bigl(\tfrac{\pi}{2}\bigr),\\
-x_2 + 2x_3 &= \tfrac{1}{16}\sin\bigl(\tfrac{3\pi}{4}\bigr).
\end{aligned}
\]
Solving the first equation for x_1, the second equation for x_2, and the third
equation for x_3, we obtain
\[
\begin{aligned}
x_1 &= \tfrac{1}{2}\Bigl(\tfrac{1}{16}\sin\bigl(\tfrac{\pi}{4}\bigr) + x_2\Bigr),\\
x_2 &= \tfrac{1}{2}\Bigl(\tfrac{1}{16}\sin\bigl(\tfrac{\pi}{2}\bigr) + x_1 + x_3\Bigr),\\
x_3 &= \tfrac{1}{2}\Bigl(\tfrac{1}{16}\sin\bigl(\tfrac{3\pi}{4}\bigr) + x_2\Bigr),
\end{aligned}
\]
which can be written in matrix form as
\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} 0 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
+ \frac{1}{32}
\begin{pmatrix} \sin(\pi/4) \\ \sin(\pi/2) \\ \sin(3\pi/4) \end{pmatrix},
\]
that is,
\[
x = Gx + c, \qquad (3.25)
\]
with
\[
x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \quad
G = \begin{pmatrix} 0 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & 0 \end{pmatrix}, \quad\text{and}\quad
c = \frac{1}{32}\begin{pmatrix} \sin(\pi/4) \\ \sin(\pi/2) \\ \sin(3\pi/4) \end{pmatrix}.
\]
Equation (3.25) can form the basis of an iterative method:
\[
x^{(k+1)} = Gx^{(k)} + c. \qquad (3.26)
\]
Starting with x^(0) = (0, 0, 0)^T, we obtain the following in matlab:
>> x = [0,0,0]'
x =
0
0
0
>> G = [0 1/2 0
1/2 0 1/2
0 1/2 0]
G =
0 0.5000 0
0.5000 0 0.5000
0 0.5000 0
>> c = (1/32)*[sin(pi/4); sin(pi/2); sin(3*pi/4)]
c =
0.0221
0.0313
0.0221
>> x = G*x + c
x =
0.0221
0.0313
0.0221
>> x = G*x + c
x =
0.0377
0.0533
0.0377
>> x = G*x + c
x =
0.0488
0.0690
0.0488
>> x = G*x + c
x =
0.0566
0.0800
0.0566
>> x = G*x + c
x =
0.0621
0.0878
0.0621
>> x = G*x + c
x =
0.0660
0.0934
0.0660
>> x = G*x + c
x =
0.0688
0.0973
0.0688
>> x = G*x + c
x =
0.0707
0.1000
0.0707
>> x = G*x + c
x =
0.0721
0.1020
0.0721
>>
Comparing with the solution in Example 3.18, we see that the components
of x tend to the components of the solution to Ax = b as we iterate (3.26).
This is an example of an iterative method (namely, the Jacobi method) for
solving the system of equations Ax = b.
Good references for iterative solution of linear systems are [23, 30, 39, 44].
Why may we wish to solve (3.24) iteratively? Suppose that n = 10,000 or
more, which is not unreasonable for many problems. Then A has 10^8 elements,
making it difficult to store or solve (3.24) directly using, for example, Gaussian
elimination.
To discuss iterative techniques involving vectors and matrices, we use:
DEFINITION 3.23  A sequence of vectors {x_k}, k = 1, 2, ..., is said to converge
to a vector x ∈ C^n if and only if ‖x_k − x‖ → 0 as k → ∞ for some norm ‖·‖.
Definition 3.23 implies that a sequence of vectors {x_k} ⊂ R^n (or C^n)
converges to x if and only if each component converges, that is, (x_k)_i → x_i
as k → ∞ for all i.
Note: Iterates defined by (3.26) can be viewed as fixed point iterates that
under certain conditions converge to the fixed point.
DEFINITION 3.24  The iterative method defined by (3.26) is called convergent
if, for all initial values x^(0), we have x^(k) → A^(−1)b as k → ∞.
We now take a closer look at the Jacobi method, as well as the related
Gauss–Seidel method and SOR method.
3.5.1 The Jacobi Method
We can think of the Jacobi method illustrated in the above example in
matrix form as follows. Let L be the lower triangular part of the matrix A,
U the upper triangular part, and D the diagonal part.
Example 3.31
In Example 3.30,
\[
L = \begin{pmatrix} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}, \quad
U = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}, \quad\text{and}\quad
D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
\]
Then the Jacobi method may be written in matrix form as
\[
G = -D^{-1}(L + U) \equiv J. \qquad (3.27)
\]
J is called the iteration matrix for the Jacobi method. The iterative method
becomes:
\[
x^{(k+1)} = -D^{-1}(L + U)x^{(k)} + D^{-1}b, \quad k = 0, 1, 2, \ldots \qquad (3.28)
\]
Generally, one uses the following equations to solve for x^(k+1):
\[
\begin{aligned}
& x^{(0)}_i \text{ is given},\\
& x^{(k+1)}_i = \frac{1}{a_{ii}}\Bigl(b_i - \sum_{j=1}^{i-1} a_{ij}x^{(k)}_j - \sum_{j=i+1}^{n} a_{ij}x^{(k)}_j\Bigr),
\end{aligned} \qquad (3.29)
\]
for k ≥ 0 and 1 ≤ i ≤ n (where a sum is absent if its lower limit on j is larger
than its upper limit). Equations (3.29) are easily programmed.
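For example, a direct translation of (3.29) into matlab might look like the following sketch (ours); it performs a fixed number of sweeps and assumes all diagonal entries a_ii are nonzero.

function x = jacobi_iteration(A, b, x0, nsweeps)
% Jacobi method, equation (3.29): a fixed number of sweeps starting from x0.
% x0 should be a column vector.
n = length(b);
x = x0;
for k = 1:nsweeps
    xold = x;                          % Jacobi uses only values from the previous sweep
    for i = 1:n
        s = b(i) - A(i, 1:i-1)*xold(1:i-1) - A(i, i+1:n)*xold(i+1:n);
        x(i) = s / A(i, i);
    end
end
end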
3.5.2 The Gauss–Seidel Method
We now discuss the Gauss–Seidel method, or successive relaxation method.
If in the Jacobi method, we use the new values of x_j as they become available,
then
\[
\begin{aligned}
& x^{(0)}_i \text{ is given},\\
& x^{(k+1)}_i = \frac{1}{a_{ii}}\Bigl(b_i - \sum_{j=1}^{i-1} a_{ij}x^{(k+1)}_j - \sum_{j=i+1}^{n} a_{ij}x^{(k)}_j\Bigr),
\end{aligned} \qquad (3.30)
\]
for k ≥ 0 and 1 ≤ i ≤ n. (We continue to assume that a_ii ≠ 0 for i =
1, 2, ..., n.) The iterative method (3.30) is called the Gauss–Seidel method,
and can be written in matrix form with
\[
G = -(L + D)^{-1}U \equiv \mathcal{G},
\]
so
\[
x^{(k+1)} = -(L + D)^{-1}Ux^{(k)} + (L + D)^{-1}b \quad\text{for } k \ge 0. \qquad (3.31)
\]
Note: The Gauss–Seidel method only requires storage of
\[
\bigl(x^{(k+1)}_1, x^{(k+1)}_2, \ldots, x^{(k+1)}_{i-1}, x^{(k)}_i, x^{(k)}_{i+1}, \ldots, x^{(k)}_n\bigr)^T
\]
to compute x^(k+1)_i. The Jacobi method requires storage of x^(k) as well as
x^(k+1). Also, the Gauss–Seidel method generally converges faster. This gives
an advantage to the Gauss–Seidel method. However, on some machines, separate
rows of the iteration equation may be processed simultaneously in parallel,
while the Gauss–Seidel method requires the coordinates be processed
sequentially (with the equations in some specified order).
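Changing one line of the Jacobi sweep gives a Gauss–Seidel sweep, since newly computed components are used immediately. The sketch below is ours and, as before, assumes nonzero diagonal entries and a column vector x0.

function x = gauss_seidel_iteration(A, b, x0, nsweeps)
% Gauss-Seidel method, equation (3.30): new components are used as soon as
% they are available, so only one vector x needs to be stored.
n = length(b);
x = x0;                                % x0 should be a column vector
for k = 1:nsweeps
    for i = 1:n
        s = b(i) - A(i, 1:i-1)*x(1:i-1) - A(i, i+1:n)*x(i+1:n);
        x(i) = s / A(i, i);
    end
end
end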
Example 3.32
\[
\begin{pmatrix} 2 & 1 \\ -1 & 3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} 3 \\ 2 \end{pmatrix},
\quad\text{that is,}\quad
\begin{aligned}
2x_1 + x_2 &= 3,\\
-x_1 + 3x_2 &= 2.
\end{aligned}
\]
(The exact solution is x₁ = x₂ = 1.) The Jacobi and Gauss–Seidel methods
have the forms
\[
\text{Jacobi:}\quad
\begin{cases}
x^{(k+1)}_1 = \tfrac{3}{2} - \tfrac{1}{2}x^{(k)}_2\\[2pt]
x^{(k+1)}_2 = \tfrac{2}{3} + \tfrac{1}{3}x^{(k)}_1
\end{cases},
\qquad
\text{Gauss--Seidel:}\quad
\begin{cases}
x^{(k+1)}_1 = \tfrac{3}{2} - \tfrac{1}{2}x^{(k)}_2\\[2pt]
x^{(k+1)}_2 = \tfrac{2}{3} + \tfrac{1}{3}x^{(k+1)}_1
\end{cases}.
\]
The results in Table 3.2 are obtained with x^(0) = (0, 0)^T. Observe that the
Gauss–Seidel method converges roughly twice as fast as the Jacobi method.
This behavior is provable.

TABLE 3.2: Iterates of the Jacobi and Gauss–Seidel methods, for Example 3.32

  k | x₁^(k) Jacobi | x₂^(k) Jacobi | x₁^(k) GS | x₂^(k) GS
  0 | 0     | 0     | 0     | 0
  1 | 1.5   | 0.667 | 1.5   | 1.167
  2 | 1.167 | 1.167 | 0.917 | 0.972
  3 | 0.917 | 1.056 | 1.014 | 1.005
  4 | 0.972 | 0.972 | 0.998 | 0.999
  5 | 1.014 | 0.991 | 1.000 | 1.000
  6 | 1.005 | 1.005 |       |
  7 | 0.998 | 1.002 |       |
  8 | 0.999 | 0.999 |       |
  9 | 1.000 | 1.000 |       |
3.5.3 Successive Overrelaxation
We now describe Successive OverRelaxation (SOR). In the SOR method,
one computes x^(k+1)_i to be a weighted mean of x^(k)_i and the Gauss–Seidel
iterate for that element. Specifically, for ω ≠ 0 a real parameter, the SOR
method is given by
\[
\begin{aligned}
& x^{(0)}_i \text{ is given},\\
& x^{(k+1)}_i = (1 - \omega)x^{(k)}_i
  + \frac{\omega}{a_{ii}}\Bigl(b_i - \sum_{j=1}^{i-1} a_{ij}x^{(k+1)}_j - \sum_{j=i+1}^{n} a_{ij}x^{(k)}_j\Bigr),
\end{aligned} \qquad (3.32)
\]
for 1 ≤ i ≤ n and for k ≥ 0. The parameter ω is called a relaxation factor. If
ω < 1, we call ω an underrelaxation factor and if ω > 1, we call ω an overrelaxation
factor. Note that if ω = 1, the Gauss–Seidel method is obtained.
Note: For certain classes of matrices and certain ω between 1 and 2, the SOR
method converges faster than the Gauss–Seidel method.
We can write (3.32) in the matrix form:
\[
\Bigl(L + \frac{1}{\omega}D\Bigr)x^{(k+1)}
 = \Bigl[-U + \Bigl(\frac{1}{\omega} - 1\Bigr)D\Bigr]x^{(k)} + b \qquad (3.33)
\]
for k = 0, 1, 2, ..., with x^(0) given. Thus,
\[
G = (\omega L + D)^{-1}\bigl[(1 - \omega)D - \omega U\bigr] \equiv S_\omega,
\]
and
\[
x^{(k+1)} = S_\omega x^{(k)} + \Bigl(L + \frac{1}{\omega}D\Bigr)^{-1} b. \qquad (3.34)
\]
The matrix S_ω is called the SOR matrix. Note that ω = 1 gives 𝒢, the
Gauss–Seidel matrix.
A classic reference on iterative methods, and the SOR method in particular,
is [44].
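In code, SOR differs from a Gauss–Seidel sweep only in the final weighted average with the relaxation factor ω. Below is our sketch, with the same assumptions as the previous two listings.

function x = sor_iteration(A, b, x0, omega, nsweeps)
% SOR method, equation (3.32).  omega = 1 reproduces Gauss-Seidel.
n = length(b);
x = x0;                                % x0 should be a column vector
for k = 1:nsweeps
    for i = 1:n
        s = b(i) - A(i, 1:i-1)*x(1:i-1) - A(i, i+1:n)*x(i+1:n);
        x(i) = (1 - omega)*x(i) + omega * s / A(i, i);
    end
end
end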
3.5.4 Convergence of Iterative Methods
The general iteration equation (3.26) (on page 122) gives
\[
x^{(k+1)} = Gx^{(k)} + c \quad\text{and}\quad x^{(k)} = Gx^{(k-1)} + c.
\]
Subtracting these equations and using properties of vector addition and matrix-vector
multiplication gives
\[
x^{(k+1)} - x^{(k)} = G(x^{(k)} - x^{(k-1)}). \qquad (3.35)
\]
Furthermore, similar rearrangements give
\[
(I - G)(x^{(k)} - x) = x^{(k)} - x^{(k+1)} \qquad (3.36)
\]
because
\[
x = Gx + c \quad\text{and}\quad x^{(k+1)} = Gx^{(k)} + c.
\]
Combining (3.35) and (3.36) gives
\[
x^{(k)} - x = (I - G)^{-1}G(x^{(k)} - x^{(k-1)}) = (I - G)^{-1}G^2(x^{(k-1)} - x^{(k-2)}) = \cdots,
\]
and taking norms gives
\[
\begin{aligned}
\|x^{(k)} - x\| &\le \|(I - G)^{-1}G\|\,\|x^{(k)} - x^{(k-1)}\|\\
 &= \|(I - G)^{-1}G\|\,\|G(x^{(k-1)} - x^{(k-2)})\|\\
 &\le \|(I - G)^{-1}G\|\,\|G\|\,\|x^{(k-1)} - x^{(k-2)}\|\\
 &\;\;\vdots\\
 &\le \|(I - G)^{-1}\|\,\|G\|^{k}\,\|x^{(1)} - x^{(0)}\|.
\end{aligned}
\]
It is not hard to show that, for any induced matrix norm,
\[
\|(I - G)^{-1}\| \le \frac{1}{1 - \|G\|}.
\]
Therefore,
\[
\|x^{(k)} - x\| \le \frac{\|G\|^{k}}{1 - \|G\|}\,\|x^{(1)} - x^{(0)}\|. \qquad (3.37)
\]
The practical importance of this error estimate is that we can expect linear
convergence of our iterative method when ‖G‖ < 1.
Example 3.33
We revisit Example 3.30, with the following matlab dialog:
>> x = [0,0,0]'
x =
0
0
0
>> G = [0 1/2 0
1/2 0 1/2
0 1/2 0]
G =
0 0.5000 0
0.5000 0 0.5000
0 0.5000 0
>> c = (1/32)*[sin(pi/4); sin(pi/2); sin(3*pi/4)]
c =
0.0221
0.0313
0.0221
>> exact_solution = (eye(3)-G)\c
exact_solution =
0.0754
0.1067
0.0754
>> normG = norm(G)
normG =
0.7071
>> for i=1:5;
old_norm = norm(x-exact_solution);
x = G*x + c;
new_norm = norm(x-exact_solution);
ratio = new_norm/old_norm
end
ratio =
0.7071
ratio =
0.7071
ratio =
0.7071
ratio =
0.7071
ratio =
0.7071
>> x
x =
0.0660
0.0934
0.0660
>>
We thus see linear convergence with the Jacobi method, with convergence
factor ‖G‖ ≈ 0.7071, just as we discussed in Section 1.1.3 (page 7) and our
study of the fixed point method for solving a single nonlinear equation (Section 2.2,
starting on page 47).
Example 3.34
We examine the norm of the iteration matrix for the Gauss–Seidel method
for Example 3.30:
>> L = [0 0 0
-1 0 0
0 -1 0]
L =
0 0 0
-1 0 0
0 -1 0
>> U = [0 -1 0
0 0 -1
0 0 0]
U =
0 -1 0
0 0 -1
0 0 0
>> D = [2 0 0
0 2 0
0 0 2]
D =
2 0 0
0 2 0
0 0 2
>> GS = -inv(L+D)*U
GS =
0 0.5000 0
0 0.2500 0.5000
0 0.1250 0.2500
>> norm(GS)
ans =
0.6905
We see that this norm is less than the norm of the iteration matrix for
the Jacobi method, so we may expect the Gauss–Seidel method to converge
somewhat faster.
The error estimates hold if ‖·‖ is any norm. Furthermore, it is possible to
prove the following.
THEOREM 3.13
Suppose
\[
\rho(G) < 1,
\]
where ρ(G) is the spectral radius of G, that is,
\[
\rho(G) = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } G\}.
\]
Then the iterative method
\[
x^{(k+1)} = Gx^{(k)} + c
\]
converges.
In particular, the Jacobi method and Gauss–Seidel method for matrices
of the form in Example 3.30 all converge, although ‖G‖ becomes nearer to 1
(and hence, the convergence is slower), the finer we subdivide the interval [0, 1]
(and hence the larger n becomes). There is some theory relating the spectral
radius of various iteration matrices, and matrices arising from discretizations
such as in Example 3.30 have been analyzed extensively.
One criterion that is easy to check is diagonal dominance, as defined in
Remark 3.1 on page 88:
THEOREM 3.14
Suppose
\[
|a_{ii}| \ge \sum_{\substack{j=1 \\ j \ne i}}^{n} |a_{ij}|, \quad\text{for } i = 1, 2, \ldots, n,
\]
and suppose that the inequality is strict for at least one i. Then the Jacobi
method and Gauss–Seidel method for Ax = b converge.
We present a more detailed analysis in [1].
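The hypothesis of Theorem 3.14 is easy to test numerically; the short check below is ours and assumes a square matrix A is already in the workspace.

% Check weak diagonal dominance of A, with strictness in at least one row.
d = abs(diag(A));
offdiag = sum(abs(A), 2) - d;          % sum of |a_ij|, j ~= i, for each row i
weakly_dominant  = all(d >= offdiag)
strict_somewhere = any(d >  offdiag)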
3.5.5 The Interval Gauss–Seidel Method
The interval Gauss–Seidel method is an alternative method (to the interval
version of Gaussian elimination of Section 3.3.4 on page 111) for using
floating point arithmetic to obtain mathematically rigorous lower and upper
bounds to the solution to a system of linear equations. The interval Gauss–Seidel
method has several advantages, especially when there are uncertainties
in the right-hand-side vector b that are represented in the form of relatively
wide intervals [b̲_i, b̄_i], and when there are also uncertainties [a̲_ij, ā_ij] in the
coefficients of the matrix A. That is, we assume that the matrix is A ∈ IR^(n×n),
b ∈ IR^n, and we wish to find an interval vector (or box) x that bounds
\[
\Sigma(\boldsymbol{A}, \boldsymbol{b}) =
\{x \mid Ax = b \text{ for some } A \in \boldsymbol{A} \text{ and some } b \in \boldsymbol{b}\}, \qquad (3.38)
\]
where IR^(n×n) denotes the set of all n by n matrices whose entries are intervals,
IR^n denotes the set of all n-vectors whose entries are intervals, and
A ∈ A means that each element of the point matrix A is contained in the
corresponding element of the interval matrix A (and similarly for b ∈ b).
The interval Gauss–Seidel method is similar to the point Gauss–Seidel
method as defined in (3.30) on page 124, except that, for general systems,
we almost always precondition. In particular, let Ã = YA and b̃ = Yb,
where Y is a preconditioning matrix. We then have the preconditioned system
\[
Y\boldsymbol{A}x = Y\boldsymbol{b}, \quad\text{i.e.}\quad \tilde{\boldsymbol{A}}x = \tilde{\boldsymbol{b}}. \qquad (3.39)
\]
We have
THEOREM 3.15
(The solution set for the preconditioned system contains the solution set for
the original system.) Σ(A, b) ⊆ Σ(YA, Yb) = Σ(Ã, b̃).
This theorem is a fairly straightforward consequence of the subdistributivity
(Equation (1.4) on page 26) of interval arithmetic. For a proof of this and
other facts concerning interval linear systems, see, for example, [29].
Analogously to the noninterval version of Gauss–Seidel iteration (3.30), the
interval Gauss–Seidel method is given as
\[
\boldsymbol{x}^{(k+1)}_i \leftarrow
\frac{1}{\tilde{\boldsymbol{a}}_{ii}}\Bigl(\tilde{\boldsymbol{b}}_i
 - \sum_{j=1}^{i-1} \tilde{\boldsymbol{a}}_{ij}\boldsymbol{x}^{(k+1)}_j
 - \sum_{j=i+1}^{n} \tilde{\boldsymbol{a}}_{ij}\boldsymbol{x}^{(k)}_j\Bigr) \qquad (3.40)
\]
for i = 1, 2, ..., n, where a sum is interpreted to be absent if its lower index
is greater than its upper index, and with x^(0)_i given for i = 1, 2, ..., n.
REMARK 3.9  As with the interval version of Gaussian elimination (Algorithm
3.4 on page 111), a common preconditioner Y for the interval Gauss–Seidel
method is the inverse midpoint matrix Y = (m(A))^(−1), where m(A)
is the matrix whose elements are midpoints of corresponding elements of the
interval matrix A. However, when the elements of A have particularly large
widths, specially designed preconditioners (see [20, Chapter 3]) may be more
appropriate.
REMARK 3.10  Point iterative methods are often preconditioned. However,
computing an inverse of a point matrix A leads to YA ≈ I, where I is
the identity matrix, so the system will already have been solved (except for,
possibly, iterative refinement). Moreover, such point iterative methods are
usually employed for very large systems of equations, with matrices with 0
for many elements. Although the elements that are 0 need not be stored,
the inverse generally does not have 0's in any of its elements [13], so it may
be impractical to even store the inverse, let alone compute it. (Of course,
the inverse could be computed one row at a time, but this may still be
impractical for large systems.) Thus, special approximations are used for these
preconditioners; much work has appeared in the research literature on such
preconditioners. Preconditioners for the point Gauss–Seidel method, conjugate
gradient method (explained in our graduate text [1]), etc. are often viewed as
operators that increase the separation between the largest eigenvalue of A and
the remaining eigenvalues of A, rather than computing an approximate inverse.
The following theorem tells us that the interval Gauss–Seidel method can
be used to prove existence and uniqueness of a solution of a system of linear
equations.
THEOREM 3.16
Suppose (3.40) is used, starting with initial interval vector x^(0), and obtaining
interval vector x^(k) after a number of iterations. Then, if x^(k) ⊆ x^(0), for each
A ∈ A and each b ∈ b, there is an x ∈ x^(k) such that Ax = b.
The proof of Theorem 3.16 can be found in many places, such as in [20] or
[29].
Example 3.35
Consider Ax = b, where
\[
\boldsymbol{A} = \begin{pmatrix} [0.99, 1.01] & [1.99, 2.01] \\ [2.99, 3.01] & [3.99, 4.01] \end{pmatrix}, \quad
\boldsymbol{b} = \begin{pmatrix} [-1.01, -0.99] \\ [0.99, 1.01] \end{pmatrix}, \quad
\boldsymbol{x}^{(0)} = \begin{pmatrix} [-10, 10] \\ [-10, 10] \end{pmatrix}.
\]
Then (these computations were done with the aid of intlab, a matlab toolbox
available free of charge for non-commercial use),
\[
m(\boldsymbol{A}) = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad
Y = m(\boldsymbol{A})^{-1} = \begin{pmatrix} -2.0 & 1.0 \\ 1.5 & -0.5 \end{pmatrix},
\]
\[
\tilde{\boldsymbol{A}} = Y\boldsymbol{A} \subseteq
\begin{pmatrix} [0.97, 1.03] & [-0.03, 0.03] \\ [-0.02, 0.02] & [0.98, 1.02] \end{pmatrix}, \qquad
\tilde{\boldsymbol{b}} = Y\boldsymbol{b} \subseteq
\begin{pmatrix} [2.97, 3.03] \\ [-2.02, -1.98] \end{pmatrix}.
\]
We then have
\[
\boldsymbol{x}^{(1)}_1 \leftarrow \frac{1}{[0.97, 1.03]}\bigl([2.97, 3.03] - [-0.03, 0.03]\,[-10, 10]\bigr)
\subseteq [2.5922, 3.4330],
\]
\[
\boldsymbol{x}^{(1)}_2 \leftarrow \frac{1}{[0.98, 1.02]}\bigl([-2.02, -1.98] - [-0.02, 0.02]\,[2.5922, 3.4330]\bigr)
\subseteq [-2.1313, -1.8738].
\]
If we continue this process, we eventually obtain
\[
\boldsymbol{x}^{(4)} = \bigl([2.8215, 3.1895],\; [-2.1264, -1.8786]\bigr)^T,
\]
which, to four significant figures, is the same as x^(3). Thus, we have found
mathematically rigorous bounds on the set of all solutions to Ax = b such
that A ∈ A and b ∈ b.
In Example 3.35, uncertainties of 0.01 are present in each element of the
matrix and right-hand-side vector. Although the bounds produced with the
preconditioned interval Gauss–Seidel method are not guaranteed to be the
tightest possible with these uncertainties, they will be closer to the tightest
possible when the uncertainties are smaller.
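One preconditioned interval Gauss–Seidel sweep (3.40) can be written with intlab roughly as follows. This sketch is ours; it assumes intlab is installed (so that intval arithmetic and mid() are available), and it omits the intersection of each new component with the previous box that is usually added in practice.

function x = interval_gauss_seidel_sweep(A, b, x)
% One sweep of the preconditioned interval Gauss-Seidel method (3.40).
% A and b should be intval quantities (intlab); x is the current interval box.
n = length(b);
Y = inv(mid(A));            % inverse midpoint preconditioner (Remark 3.9)
Atilde = Y*A;               % interval enclosure of Y*A
btilde = Y*b;               % interval enclosure of Y*b
for i = 1:n
    s = btilde(i);
    for j = 1:n
        if j ~= i
            s = s - Atilde(i, j) * x(j);
        end
    end
    x(i) = s / Atilde(i, i);
end
end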
Convergence of the interval Gauss–Seidel method is related closely to convergence
of the point Gauss–Seidel method, through the concept of diagonal
dominance. We give a hint of this convergence theory here.
DEFINITION 3.25  If a = [a̲, ā] is an interval, then the magnitude of a
is defined to be
\[
\mathrm{mag}(\boldsymbol{a}) = \max\{|\underline{a}|, |\overline{a}|\}.
\]
Similarly, the mignitude of a is defined to be
\[
\mathrm{mig}(\boldsymbol{a}) = \min_{a \in \boldsymbol{a}} |a|.
\]
Given the matrix Ã, form the matrix H = (h_ij) such that
\[
h_{ij} = \begin{cases}
\mathrm{mag}(\tilde{\boldsymbol{a}}_{ij}) & \text{if } i \ne j,\\
\mathrm{mig}(\tilde{\boldsymbol{a}}_{ij}) & \text{if } i = j.
\end{cases}
\]
Then, basically, the interval Gauss–Seidel method will be convergent if H is
diagonally dominant.
For a careful review of convergence theory for the interval Gauss–Seidel
method and other interval methods for linear systems, see [29]. Also, see [32].
3.6 The Singular Value Decomposition
The singular value decomposition, which we will abbreviate SVD, is not
always the most efficient way of analyzing a linear system, but is extremely
flexible, and is sometimes used in signal processing (smoothing), sensitivity
analysis, statistical analysis, etc., especially if a large amount of information
about the numerical properties of the system is desired. The major libraries
for programmers (e.g. Lapack) and software systems (e.g. matlab, Mathematica)
have facilities for computing the SVD. The SVD is often used in the
same context as a QR factorization, but the component matrices in an SVD
are computed with an iterative technique related to techniques for computing
eigenvalues and eigenvectors (in Chapter 5 of this book).
The following theorem defines the SVD.
THEOREM 3.17
Let A be an m by n real matrix, but otherwise arbitrary. Then there are
orthogonal matrices U and V and an m by n matrix Σ = [σ_ij] such that
σ_ij = 0 for i ≠ j, σ_{i,i} = σ_i ≥ 0 for 1 ≤ i ≤ p = min{m, n}, and
σ₁ ≥ σ₂ ≥ ⋯ ≥ σ_p, such that
\[
A = U\Sigma V^T.
\]
For a proof and further explanation, see G. W. Stewart, Introduction to
Matrix Computations [35] or G. H. Golub and C. F. van Loan, Matrix
Computations [16]. (Gene Golub, a famous numerical analyst, a professor of
Computer Science and, for many years, department chairman, at Stanford
University, invented the efficient algorithm used today for computing the
singular value decomposition.)
Note: The SVD for a particular matrix is not necessarily unique.
Note: The SVD is defined similarly for complex matrices (that is, matrices
whose elements are complex numbers).
REMARK 3.11  A simple algorithm to find a singular-value decomposition
is: (1) find the nonzero eigenvalues of A^T A, i.e., λ_i, i = 1, 2, ..., r,
(2) find the orthogonal eigenvectors of A^T A and arrange them in an n × n matrix
V, (3) form the m × n matrix Σ with diagonal entries σ_i = √λ_i, (4)
let u_i = (1/σ_i)Av_i, i = 1, 2, ..., r, and compute u_i, i = r + 1, r + 2, ..., m using
Gram–Schmidt orthogonalization. However, a well-known efficient method
for computing the SVD is the Golub–Reinsch algorithm [36] which employs
Householder bidiagonalization and a variant of the QR method.
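The simple procedure in Remark 3.11 can be sketched in a few lines for the full-rank case. This is our illustration only (not the Golub–Reinsch algorithm); it assumes A is m by n with m >= n and full column rank, so it produces a thin SVD and the Gram–Schmidt completion of U is not needed.

% Naive thin SVD of an m by n matrix A (m >= n, full column rank), per Remark 3.11.
% Forming A'*A squares the condition number, so this is not how svd() works internally.
[V, D] = eig(A'*A);                  % eigenvectors/eigenvalues of A^T A
[lambda, idx] = sort(diag(D), 'descend');
V = V(:, idx);
sigma = sqrt(max(lambda, 0));        % singular values, largest first
U = A * V * diag(1 ./ sigma);        % u_i = (1/sigma_i) A v_i, columnwise
% Check: norm(U*diag(sigma)*V' - A) should be small.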
Example 3.36
Let
\[
A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}.
\]
Then
\[
U \approx \begin{pmatrix} -0.2298 & 0.8835 & 0.4082 \\ -0.5247 & 0.2408 & -0.8165 \\ -0.8196 & -0.4019 & 0.4082 \end{pmatrix}, \quad
\Sigma \approx \begin{pmatrix} 9.5255 & 0 \\ 0 & 0.5143 \\ 0 & 0 \end{pmatrix}, \quad\text{and}\quad
V \approx \begin{pmatrix} -0.6196 & -0.7849 \\ -0.7849 & 0.6196 \end{pmatrix}
\]
is a singular value decomposition of A. This approximate singular value decomposition
was obtained with the following matlab dialog.
>> A = [1 2;3 4;5 6]
A =
1 2
3 4
5 6
>> [U,Sigma,V] = svd(A)
U =
-0.2298 0.8835 0.4082
-0.5247 0.2408 -0.8165
-0.8196 -0.4019 0.4082
Sigma =
9.5255 0
0 0.5143
0 0
V =
-0.6196 -0.7849
-0.7849 0.6196
>> U*Sigma*V'
ans =
1.0000 2.0000
3.0000 4.0000
5.0000 6.0000
>>
Note: If A = UΣV^T represents a singular value decomposition of A, then,
for Ã = A^T, Ã = VΣ^T U^T represents a singular value decomposition for Ã.
DEFINITION 3.26  The vectors V(:, i), 1 ≤ i ≤ p, are called the right
singular vectors of A, while the corresponding U(:, i) are called the left singular
vectors of A corresponding to the singular values σ_i.
The singular values are like eigenvalues, and the singular vectors are like
eigenvectors. In fact, we have
THEOREM 3.18
Let the n by n matrix A be symmetric and positive definite. Let {λ_i}, i = 1, ..., n, be
the eigenvalues of A, ordered so that λ₁ ≥ λ₂ ≥ ⋯ ≥ λ_n, and let v_i be the
eigenvector corresponding to λ_i. Furthermore, choose the v_i so {v_i}, i = 1, ..., n, is an
orthonormal set, and form V = [v₁, ..., v_n] and Σ = diag(λ₁, ..., λ_n). Then
A = VΣV^T represents a singular value decomposition of A.
This theorem follows directly from the definition of the SVD. We also have
THEOREM 3.19
Let the n by n matrix A be invertible, and let A = UΣV^T represent a singular
value decomposition of A. Then the 2-norm condition number of A is
κ₂(A) = σ₁/σ_n.
Thus, the condition number of a matrix is obtainable directly from the
SVD, but the SVD gives us more useful information about the sensitivity of
solutions than just that single number, as we'll see shortly.
The singular value decomposition is related directly to the Moore–Penrose
pseudo-inverse. In fact, the pseudo-inverse can be defined directly in terms
of the singular value decomposition.
DEFINITION 3.27  Let A be an m by n matrix, let A = UΣV^T represent
a singular value decomposition of A, and assume r ≤ p is such that σ₁ ≥ σ₂ ≥
⋯ ≥ σ_r > 0, and σ_{r+1} = σ_{r+2} = ⋯ = σ_p = 0. Then the Moore–Penrose
pseudo-inverse of A is defined to be
\[
A^+ = V\Sigma^+ U^T,
\]
where Σ⁺ = [σ⁺_ij] is an n by m matrix such that σ⁺_ij = 0 if i ≠ j or i > r, and
σ⁺_ii = 1/σ_i if 1 ≤ i ≤ r.
Part of the power of the singular value decomposition comes from the following.
THEOREM 3.20
Suppose A is an m by n matrix and we wish to find approximate solutions to
Ax = b, where b ∈ R^m. Then,
• If Ax = b is inconsistent, then x = A⁺b represents the least squares
  solution of minimum 2-norm.
• If Ax = b is consistent (but possibly underdetermined), then x = A⁺b
  represents the solution of minimum 2-norm.
• In general, x = A⁺b represents the least squares solution to Ax = b of
  minimum 2-norm.
The proof of Theorem 3.20 is left as an exercise (on page 144).
REMARK 3.12  If m < n, one would expect the system to be underdetermined
but full rank. In that case, A⁺b gives the solution x such that ‖x‖₂
is minimum; however, if A were also inconsistent, then there would be many
least squares solutions, and A⁺b would be the least squares solution of minimum
norm. Similarly, if m > n, one would expect there to be a single least
squares solution; however, if the rank of A is r < p = n, then there would
be many such least squares solutions, and A⁺b would be the least squares
solution of minimum norm.
Example 3.37
Consider Ax = b, where
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}.
\]
Then
\[
U \approx \begin{pmatrix} -0.2148 & 0.8872 & 0.4082 \\ -0.5206 & 0.2496 & -0.8165 \\ -0.8263 & -0.3879 & 0.4082 \end{pmatrix}, \quad
\Sigma \approx \begin{pmatrix} 16.8481 & 0 & 0 \\ 0 & 1.0684 & 0 \\ 0 & 0 & 0.0000 \end{pmatrix},
\]
\[
V \approx \begin{pmatrix} -0.4797 & -0.7767 & 0.4082 \\ -0.5724 & -0.0757 & -0.8165 \\ -0.6651 & 0.6253 & 0.4082 \end{pmatrix}, \quad\text{and}\quad
\Sigma^+ \approx \begin{pmatrix} 0.0594 & 0 & 0 \\ 0 & 0.9360 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]
Since σ₃ = 0, we note that the system is not of full rank, so it could be either
inconsistent or underdetermined. We compute x ≈ [−0.9444, −0.1111, 0.7222]^T,
and we obtain ‖Ax − b‖₂ ≈ 2.5 × 10^(−15). (The computations in this example
were done using matlab, and were thus done in IEEE double precision; the
digits displayed here are the results from that computation, rounded to four
significant decimal digits with matlab's intrinsic display routines.) Thus,
Ax = b, although apparently underdetermined, is apparently consistent, and x
represents that solution of Ax = b which has minimum 2-norm.
As with other methods for computing solutions, we usually do not form the
pseudo-inverse A⁺ to compute A⁺b, but we use the following.
ALGORITHM 3.6
(Computing A⁺b)
INPUT:
(a) the m by n matrix A ∈ L(R^n, R^m),
(b) the right-hand-side vector b ∈ R^m,
(c) a tolerance ε such that a singular value σ_i is considered to be equal to
    0 if σ_i/σ₁ < ε.
OUTPUT: an approximation x to A⁺b.
1. Compute the SVD of A, that is, compute approximations to U ∈ L(R^m),
   Σ ∈ L(R^n, R^m), and V ∈ L(R^n) such that A = UΣV^T.
2. p ← min{m, n}.
3. r ← p.
4. FOR i = 1 to p
     IF σ_i/σ₁ > ε THEN
       σ⁺_i ← 1/σ_i.
     ELSE
       i. r ← i − 1.
       ii. EXIT FOR
     END IF
   END FOR
5. Compute w = (w₁, ..., w_r)^T ∈ R^r, w ← U(:, 1:r)^T b, where U(:, 1:r) ∈ R^(m×r)
   is the matrix whose columns are the first r columns of U.
6. FOR i = 1 to r: w_i ← σ⁺_i w_i.
7. x ← Σ_{i=1}^{r} w_i V(:, i).
END ALGORITHM 3.6.
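A compact matlab version of Algorithm 3.6 might look as follows; this is our sketch, with the argument tol playing the role of the tolerance ε.

function x = pseudoinverse_solve(A, b, tol)
% Approximate A^+ b by truncated SVD (Algorithm 3.6).
% Singular values with sigma_i/sigma_1 <= tol are treated as zero.
[U, S, V] = svd(A);
sigma = diag(S);
r = sum(sigma / sigma(1) > tol);     % numerical rank r (steps 2-4)
w = U(:, 1:r)' * b;                  % step 5
w = w ./ sigma(1:r);                 % step 6
x = V(:, 1:r) * w;                   % step 7
end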
REMARK 3.13  Ill-conditioning (i.e., sensitivity to roundoff error) in the
computations in Algorithm 3.6 occurs when small singular values σ_i are used.
For example, suppose σ_i/σ₁ ≈ 10^(−6), and there is an error εU(:, i) in the vector
b, that is, b̃ = b + εU(:, i) (that is, we perturb b by ε in the direction of U(:, i)).
Then, instead of A⁺b, we compute
\[
A^+(b + \epsilon U(:,i)) = A^+b + \epsilon A^+U(:,i) = A^+b + \epsilon\,\frac{1}{\sigma_i}V(:,i). \qquad (3.41)
\]
Thus, the norm of the error εU(:, i) is magnified by 1/σ_i. Now, if, in addition,
b happened to be in the direction of U(:, 1), that is, b = β₁U(:, 1), then
‖A⁺b‖₂ = ‖β₁(1/σ₁)V(:, 1)‖₂ = (1/σ₁)‖b‖₂. Thus, the relative error, in this
case, would be magnified by σ₁/σ_i.
In view of Remark 3.13, we are led to consider modifying the problem
slightly to reduce the sensitivity to roundoff error. For example, suppose that
we are data fitting, with m data points (t_i, y_i) (as in Section 3.4 on page 117),
and A is the matrix as in Equation (3.19), where m ≥ n. Then we assume
there is some error in the right-hand-side vector b. However, since {U(:, i)}
forms an orthonormal basis for R^m,
\[
b = \sum_{i=1}^{m} \beta_i U(:,i) \quad\text{for some coefficients } \{\beta_i\}_{i=1}^m.
\]
Therefore, U^T b = (β₁, ..., β_m)^T, and we see that x will be more sensitive to
changes in components of b in the direction of the U(:, i) with larger indices. If we
know that typical errors in the data are on the order of δ, then, intuitively, it
makes sense not to use components of b in which the magnification of errors
will be larger than that. That is, it makes sense in such cases to choose ε = δ
in Algorithm 3.6.
Use of ε ≠ 0 in Algorithm 3.6 can be viewed as replacing the smallest
singular values of the matrix A by 0. In the case that A ∈ L(R^n) is square and
only σ_n is replaced by zero, this amounts to replacing an ill-conditioned matrix
A by a matrix that is exactly singular. One (of many possible) theorems
dealing with this replacement process is
THEOREM 3.21
Suppose A is an n by n matrix, and suppose we replace σ_n ≠ 0 in the singular
value decomposition of A by 0, then form Ã = UΣ̃V^T, where A = UΣV^T
represents the singular value decomposition of A, and Σ̃ = diag(σ₁, ..., σ_{n−1}, 0).
Then
\[
\|A - \tilde{A}\|_2 = \min_{\substack{B \in L(\mathbb{R}^n)\\ \mathrm{rank}(B) < n}} \|A - B\|_2.
\]
Suppose now that Ã has been obtained from A by replacing the smallest
singular values of A by 0, so the nonzero singular values of Ã are σ₁ ≥ σ₂ ≥
⋯ ≥ σ_r > 0, and define x = Ã⁺b. Then, perturbations of size ‖δb‖ in b
result in perturbations of size at most (σ₁/σ_r)‖δb‖ in x. This prompts us to
define a generalization of condition number as follows.
DEFINITION 3.28  Let A be an m by n matrix with m and n arbitrary,
and assume the nonzero singular values of A are σ₁ ≥ σ₂ ≥ ⋯ ≥ σ_r > 0.
Then the generalized condition number of A is σ₁/σ_r.
Example 3.38
Consider
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{pmatrix},
\]
whose singular value decomposition is approximately
\[
U \approx \begin{pmatrix} -0.2093 & 0.9644 & 0.1617 \\ -0.5038 & 0.0353 & -0.8631 \\ -0.8380 & -0.2621 & 0.4785 \end{pmatrix}, \quad
\Sigma \approx \begin{pmatrix} 17.4125 & 0 & 0 \\ 0 & 0.8752 & 0 \\ 0 & 0 & 0.1969 \end{pmatrix}, \quad\text{and}
\]
\[
V \approx \begin{pmatrix} -0.4647 & -0.8333 & 0.2995 \\ -0.5538 & 0.0095 & -0.8326 \\ -0.6910 & 0.5528 & 0.4659 \end{pmatrix}.
\]
Suppose we want to solve the system Ax = b, where b = [1, −1, 1]^T, but that,
due to noise in the data, we do not wish to deal with any system of equations
with condition number equal to 25 or greater. How can we describe the set of
solutions, based on the best information we can obtain from the noisy data?
We first observe that κ₂(A) = σ₁/σ_n ≈ 88.4483. However, σ₁/σ₂ ≈
19.8963 < 25. We may thus form a new matrix Ã = UΣ̃V^T, where Σ̃
is obtained from Σ by replacing σ₃ by 0. This is equivalent to projecting
A onto the set of singular matrices according to Theorem 3.21. We then
use Algorithm 3.6 (applied to Ã) to determine x as x = Ã⁺b. We obtain
x ≈ (−0.6205, 0.0245, 0.4428)^T. Thus, to within the accuracy of 1/25 = 4%,
we can only determine that the solution lies along the line
\[
\begin{pmatrix} -0.6205 \\ 0.0245 \\ 0.4428 \end{pmatrix} + y_3 V_{:,3}, \quad y_3 \in \mathbb{R}.
\]
This technique is a common type of analysis in data fitting. The parameter
y₃ (or multiple parameters, in case of higher-order rank deficiency) needs to
be chosen through other information available with the application.
3.7 Applications
Consider the following difference equation model [3], which describes the
dynamics of a population divided into three stages.
\[
\begin{cases}
J(t + 1) = (1 - \gamma_1)s_1 J(t) + bB(t)\\
N(t + 1) = \gamma_1 s_1 J(t) + (1 - \gamma_2)s_2 N(t)\\
B(t + 1) = \gamma_2 s_2 N(t) + s_3 B(t)
\end{cases} \qquad (3.42)
\]
The variables J(t), N(t) and B(t) represent the number of juveniles, non-breeders,
and breeders, respectively, at time t. The parameter b > 0 is the
birth rate, while γ₁, γ₂ ∈ (0, 1) represent the fraction (in one time unit) of
juveniles that become non-breeders and non-breeders that become breeders,
respectively. Parameters s₁, s₂, s₃ ∈ (0, 1) are the survivor rates of juveniles,
non-breeders and breeders, respectively.
To analyze the model numerically, we let b = 0.6, γ₁ = 0.8, γ₂ = 0.7, s₁ =
0.7, s₂ = 0.8, s₃ = 0.9. Also notice the model can be written as
\[
\begin{pmatrix} J(t + 1) \\ N(t + 1) \\ B(t + 1) \end{pmatrix}
=
\begin{pmatrix} 0.14 & 0 & 0.6 \\ 0.56 & 0.24 & 0 \\ 0 & 0.56 & 0.9 \end{pmatrix}
\begin{pmatrix} J(t) \\ N(t) \\ B(t) \end{pmatrix}
\]
or a matrix form
\[
X(t + 1) = AX(t),
\]
where X(t) = (J(t), N(t), B(t))^T, and
\[
A = \begin{pmatrix} 0.14 & 0 & 0.6 \\ 0.56 & 0.24 & 0 \\ 0 & 0.56 & 0.9 \end{pmatrix}.
\]
Suppose we know all the eigenvectors v_i, i = 1, 2, 3, and their associated
eigenvalues λ_i, i = 1, 2, 3, of the matrix A. By the knowledge of linear algebra,
any initial vector X(0) can be expressed as a linear combination of the
eigenvectors
\[
X(0) = c_1 v_1 + c_2 v_2 + c_3 v_3,
\]
then
\[
X(1) = AX(0) = A(c_1 v_1 + c_2 v_2 + c_3 v_3)
 = c_1 A v_1 + c_2 A v_2 + c_3 A v_3
 = c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + c_3 \lambda_3 v_3.
\]
Applying the same techniques, we get
\[
X(2) = AX(1) = A(c_1\lambda_1 v_1 + c_2\lambda_2 v_2 + c_3\lambda_3 v_3)
 = c_1\lambda_1^2 v_1 + c_2\lambda_2^2 v_2 + c_3\lambda_3^2 v_3.
\]
Continuing the above will lead to the general solution of the population
dynamical model (3.42)
\[
X(t) = \sum_{i=1}^{3} c_i \lambda_i^t v_i.
\]
Now, to compute the eigenvalues and eigenvectors of A, we could simply type
the following in the matlab command window
>> A=[0.14 0 0.6; 0.56 0.24 0; 0 0.56 0.9]
A =
0.1400 0 0.6000
0.5600 0.2400 0
0 0.5600 0.9000
>> [v,lambda]=eig(A)
v =
-0.1989 + 0.5421i -0.1989 - 0.5421i 0.4959
0.6989 0.6989 0.3160
-0.3728 - 0.1977i -0.3728 + 0.1977i 0.8089
lambda =
0.0806 + 0.4344i 0 0
0 0.0806 - 0.4344i 0
0 0 1.1188
From the result, we see that the spectral radius of A is λ₃ = 1.1188 and its
corresponding eigenvector is v₃ = (0.4959, 0.3160, 0.8089)^T. Hence, X(t) =
A^t X(0) ≈ c₃(1.1188)^t v₃. This shows the population size will increase geometrically
as time increases.
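The geometric growth predicted by the dominant eigenvalue can be checked by iterating the model directly; the sketch below is ours, and the initial population vector is an arbitrary choice.

% Iterate X(t+1) = A X(t) and compare the growth rate with the spectral radius.
A = [0.14 0 0.6; 0.56 0.24 0; 0 0.56 0.9];
X = [100; 50; 20];                   % arbitrary initial population (hypothetical)
for t = 1:40
    X = A * X;
end
total_now = sum(X);
X = A * X;                           % one more time step
growth_rate = sum(X) / total_now     % approaches the spectral radius 1.1188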
3.8 Exercises
1. Let
\[
A = \begin{pmatrix} 5 & 2 \\ 4 & 7 \end{pmatrix}.
\]
Find ‖A‖₁, ‖A‖_∞, ‖A‖₂, and ρ(A). Verify that ρ(A) ≤ ‖A‖₁, ρ(A) ≤
‖A‖_∞ and ρ(A) ≤ ‖A‖₂.
2. Show that back solving for Gaussian elimination (that is, show that
completion of Algorithm 3.2) requires (n² + n)/2 multiplications and
divisions and (n² − n)/2 additions and subtractions.
3. Consider Example 3.14 (on page 85).
(a) Fill in the details of the computations. In particular, by multiplying
the matrices together, show that M₁⁻¹ and M₂⁻¹ are as stated, that
A = LU, and that L = M₁⁻¹M₂⁻¹.
(b) Solve Ax = b as mentioned in the example, by first solving Ly = b,
then solving Ux = y. (You may use matlab, but print the entire
dialog.)
4. Show that performing the forward phase of Gaussian elimination for
Ax = b (that is, completing Algorithm 3.1) requires (1/3)n³ + O(n²) multiplications
and divisions.
5. Show that the inverse of a nonsingular lower triangular matrix is lower
triangular.
6. Explain why A = LU, where L and U are as in Equation (3.5) on
page 86.
7. Verify the details in Example 3.15 by actually computing the solutions
to the three linear systems, and by multiplying A and A⁻¹. (If you use
matlab, print the details.)
8. Program the tridiagonal version of Gaussian elimination represented by
equations (3.8) and (3.9) on page 95. Use your program to approximately
solve
\[
-u'' = 1, \quad u(0) = u(1) = 0
\]
using the technique from Example 3.18 (on page 93), with h = 1/4, 1/8,
1/64, and 1/4096. Compare with the exact solution u(x) = ½ x(1 − x).
9. Store the matrices from Problem 8 in matlab's sparse matrix format,
and solve the systems from Problem 8 in matlab, using the sparse
matrix format. Compare with the results you obtained from your tridiagonal
system solver.
10. Let
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}.
\]
(a) Compute κ_∞(A) approximately.
(b) Use floating point arithmetic with β = 10 and t = 3 (3-digit decimal
arithmetic), rounding-to-nearest, and Algorithms 3.1 and 3.2
to find an approximation to the solution x to Ax = b.
(c) Execute Algorithm 3.4 by hand, using t = 3, β = 10, and outwardly
rounded interval arithmetic (and rounding-to-nearest for
computing Y).
(d) Find the exact solution to Ax = b by hand.
(e) Compare the results you have obtained.
11. Derive the normal equations (3.22) from (3.21).
12. Let
\[
A = \begin{pmatrix} 2 & 1 & 1 \\ 4 & 4 & 1 \\ 6 & 5 & 8 \end{pmatrix}.
\]
(a) Find the LU factorization of A, such that L is lower triangular and
U is unit upper triangular.
(b) Perform back solving then forward solving to find a solution x for
the system of equations Ax = b = [4 7 15]^T.
13. Find the Cholesky factorization of
\[
A = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 5 & 4 \\ 2 & 4 & 29 \end{pmatrix}.
\]
Also explain why A is positive definite.
14. Let
\[
A = \begin{pmatrix} 0.1\alpha & 0.1\alpha \\ 1.0 & 1.5 \end{pmatrix}.
\]
Determine α such that κ_∞(A), the condition number in the induced ∞-norm,
is minimized.
15. Let A be the n × n lower triangular matrix with elements
\[
a_{ij} = \begin{cases} 1 & \text{if } i = j,\\ 1 & \text{if } i = j + 1,\\ 0 & \text{otherwise.} \end{cases}
\]
Determine the condition number of A using the matrix norm ‖·‖_∞.
16. Consider the matrix system Au = b given by
\[
\begin{pmatrix}
\tfrac{1}{2} & 0 & 0 & 0\\
\tfrac{1}{4} & \tfrac{1}{2} & 0 & 0\\
\tfrac{1}{8} & \tfrac{1}{4} & \tfrac{1}{2} & 0\\
\tfrac{1}{16} & \tfrac{1}{8} & \tfrac{1}{4} & \tfrac{1}{2}
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.
\]
(a) Determine A⁻¹ by hand.
(b) Determine the infinity-norm condition number of the matrix A.
(c) Let ũ be the solution when the right-hand side vector b is perturbed
to b̃ = (1.01  0  0  0.99)^T. Estimate ‖u − ũ‖_∞, without computing
ũ.
17. Complete the computations, to check that x^(4) is as given in Example
3.35 on page 131. (You may use intlab. Also see the code
gauss_seidel_step.m available from
http://www.siam.org/books/ot110.)
18. Repeat Example 3.28, but with the interval Gauss–Seidel method, instead
of interval Gaussian elimination, starting with x^(0)_i = [−10, 10],
1 ≤ i ≤ 3. Compare the results.
19. Let A be the n × n tridiagonal matrix with
\[
a_{ij} = \begin{cases} 4 & \text{if } i = j,\\ 1 & \text{if } i = j + 1 \text{ or } i = j - 1,\\ 0 & \text{otherwise.} \end{cases}
\]
Prove that the Gauss–Seidel and Jacobi methods converge for this matrix.
20. Consider the linear system
\[
\begin{pmatrix} 3 & 2 \\ 2 & 4 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} 7 \\ 10 \end{pmatrix}.
\]
Using the starting vector x^(0) = (0, 0)^T, carry out two iterations of the
Gauss–Seidel method to solve the system.
21. Prove Theorem 3.20 on page 136. (Hint: You may need to consider
various cases. In any case, you'll probably want to use the properties of
orthogonal matrices, as in the proof of Theorem 3.19.)
22. Given U, Σ, and V as given in Example 3.37 (on page 136), compute
A⁺b by using Algorithm 3.6. How does the x that you obtain compare
with the x reported in Example 3.37?
23. Find the singular value decomposition of the matrix
\[
A = \begin{pmatrix} 1 & 2 \\ 1 & 1 \\ 1 & 3 \end{pmatrix}.
\]