Professional Documents
Culture Documents
5. Gaussian Elimination
To solve , we reduce it to an equivalent system , in which U is upper triangular. This system can be easily solved
by a process of backward substitution.
where and n is the order of the system. We reduce the system to the triangular
form by adding multiples of one equation to another equation, eliminating some unknown from the second equation.
Additional row operations are used in the modifications given later. We define the algorithm in the following:
These are used in eliminating the term form equation 2 through n. Define
Also, the first rows of A and b are left undisturbed, and the first column of , below the diagonal, is set to zero. The system
looks like
We continue to eliminate unknowns, going onto columns 2, 3, etc., and this is expressed generally as follows.
Step Let . Assume that has been constructed with eliminated at successive
(1)
Use these to remove the unknown's from equations k+1 through n. Define
(2)
The earlier rows 1 through k are left undisturbed, and zeros are introduced into column k below the diagonal element. By continuing
in this manner, after n-1 steps, we obtain i.e.
Let and . The system Ux=g is upper triangular and easy to solve by back substitution; i.e
and
Theorem: If L and U are the lower and upper triangular matrices as defined above, then
Proof: The proof is an algebraic manipulation, making use of (1) and (2) as given above. We write
For
For
Since L and U are triangular, their determinants are the product of their diagonal elements. The desired result follows easily, since
det(L)=1.
Let i be the smallest row index, , for which the maximum is attained. If then switch rows k and i in A and b, and
proceed with step k of the elimination process. All the multipliers will now satisfy
This helps in preventing the growth of elements in of greatly varying size, and thus lessens the possibility for large loss of
significance errors.
Definition: (Complete Pivoting).
Define
Switch rows of A and b and columns of A to bring to the pivot position an element giving the maximum . However, complete
pivoting is more expensive and thus partial pivoting is more often used
Example: Consider solving the system with and without pivoting:
The solution is
....................(3)
2. Solution with pivoting: To indicate the interchange of rows i and j, we will use the notation
The solution is
...............(4)
The error in (3) is from seven to sixteen times larger than it is for (4), depending upon the component of the solution being
considered. The results in (4) have one more significant digit than those in (3). This illustrates the positive effect that the use of
pivoting can have on the error for Gaussian elimination.
Scaling: It has been observed that if the elements of the coefficient matrix A vary greatly in size, then it is likely that large loss of
significance errors will be introduced and the propagation of rounding errors will be worse. To avoid this problem, we usually scale
the matrix A so that the elements vary less. This is usually done by multiplying the rows and columns by suitable constants. If we let
B denote the result of row and column scaling in A, then
where and are the diagonal matrices, with entries the scaling constants. To solve , observe that
This procedure is much the same as Gauss elimination including the possible use of pivoting and scaling. It differs in eliminating the
unknown in equation above the diagonal as well as below it. In step k of the elimination, choose the pivot element as before. Then
define
Eliminate the unknown in equation both above and below equation k. Define
The procedure will convert the augmented matrix to , so that the solution is .
Let A be a symmetric and positive definite matrix of order n. The matrix is positive definite if for
all . For such a matrix A, there is a very convenient factorization and can be carried out without any need for pivoting
or scaling. This is called Choleski factorization and that is we can find a lower triangular real matrix L such that
Construction of the matrix L: Let with for . Begin the construction of L by multiplying the first row of
Because A is positive definite, and . Next, multiply the second row of L times the first two columns of to
get
(5)
The square root in (5) of choleski's method can be avoided by using a slight modification of the factorization. That is to find a
diagonal matrix D and a lower triangular matrix , with unity on the diagonal, such that
(6)
Example: Consider the Hilbert matrix of order 3,
Tridiagonal systems
The matrix is tridiagonal, if for
Let
Consider
A is non-singular)
To solve , let and . Then
Error Analysis:
Consider solving the system , where A is a non-singular matrix of order n. Denote by and the true and computed
solutions, respectively, of . One possible measure of the error in the computed solution would be the magnitude of the
residual vector
If
then
Remark:
But this is not appropriate of error analysis as can be quite small even though is a very erroneous solution. This can be
justified in the following manner: if has some large elements, then r may be very small even if is substantially different from
the true solution. For
or
Therefore, if some elements of are large, a small component of r can still mean a large difference between and , or
conversely, may be far from but r can nevertheless still be small. In other words, an accurate solution (i.e a small difference
between and ) will always produce small residuals but small residuals do not guarantee an accurate solution.
If the system is such that contains some very large elements, then we say the matrix and, therefore, the system of
equations is ill-conditioned. The following simple example will illustrate the dangers inherent in solving ill-conditioned system.
Consider the system
(7)
(8)
which has the solution x=10, y=-2. Here a change of .00002 in and .00001 in has caused a gross change in the solution. The
inverse of the matrix of coefficients in (7) has elements whose order of magnitude is , which indicates the ill conditioning of A.
A more reasonable measure of the error is given by
It is this error we shall try to estimate here. Any bound on E will depend on the magnitude of the round off errors incurred, the order
of the matrix A, and the size of . One approach to finding such a bound would be to consider the worst possible case of round
off at each stage of the method and to derive a bound based on the accumulation of these errors. Since the round off at one stage is
quite complicated function of the round off at previous stages, such bounds are difficult to calculate. Instead our approach here will
be to estimate the perturbed system of equations whose true solution is the calculated solution . That is, the computed solution
is the true solution of some system which we write as
We can not hope to find and precisely; our object is to find bounds on their elements.
We have
(9)
In order to find a bound E, we consider the following two norms of the matrix A.
i) Euclidean Norm of A is defined as
where the notation denotes an eigen value of . Both of these norms are defined for any m x n matrix. For vectors we
where tr denotes the trace. Since the trace of a matrix is the sum of its eigen values where n is the order of
is symmetric, we have
We need the following theorem to obtain the error bound.
Theorem: Let A be a square matrix and x any vector. Then
Since the eigen value of are nonpositive and therefore the numerator is negative semi-definite. Equality holds when x is
and
where
. We have
.Thus
(10)
where we have assumed that . One can see that bound (10) depends upon bounds on and in
addition to a bound on .
A priori Error Estimate
In solving the system
(11)
by any procedure, round off errors will in general be introduced. But if the problem is well posed these errors can be kept with in
reasonable bounds. By a problem to be well posed, we mean that 'small' changes in the data lead to 'small' changes in the solution.
Suppose that the matrix A and b in (11) are perturbed by the quantities and . Then if the perturbation in the solution x of (11)
is , we have
(12)
Now an estimate of the relative change in the solution can be given in terms of the relative changes in A and b by means of the
following.
Theorem: Let A be non-singular and the perturbation be so small that
(13)
(14)
Proof: Since by (13), it follows that by corollary 2 and 3, the matrix is non singular
and further that
(15)
is not too large. The condition (13) is equivalent to . Thus, it is clear that when the condition number is not too
large, the system (11) is well conditioned. Note that we can not expect to be small compared to unity since
(16)
(17)
We have the following result.
Theorem: If , then A and C are non-singular, and
Now
and thus both det(C) and det(A) are non zero. This proves that both A and C are non singular. Now or
and
Also
and
Corollary:
If is of interest to note that with C an approximate inverse of A, we can find the perturbation so that C is the exact inverse of
. That is,
or
and
Finally, we observe that the computed inverse can also be used to estimate the error in solving a linear system. This is given in the
following.
Theorem: Let A, C and R be as given in the previous theorem. Let be an approximate solution to Ax=b and define .
Then
Proof.
or
and
Iterative methods
We restrict to linear iteration which has the form
(18)
(19)
or
(20)
(21)
Equation(18) is then just a generalization of (21). We require that the true solution x of (19) be a fixed point of (18). We must then
have
for all
. Since we have
or
(22)
or
(23)
This is called the condition of consistency for and . In view of (22), we can write (18) in the form
(24)
, (using (23) )
where
(25)
Therefore, a necessary and sufficient condition for the convergence of the sequence to x for arbitrary is that
for all
We also have
Therefore, is also a sufficient condition for the convergence of the iteration (18). For stationary iterative process still
another sufficient condition for convergence is
(26)
The eigen values of and the powers of the eigen values of B. Therefore, the condition (26) is equivalent to requiring
that all eigen values of B lie with in the unit circle. In fact, this is a necessary and sufficient condition for the convergence of
stationary iteration.
Where D is a diagonal matrix with and L and U are, respectively, lower and upper triangular matrices with zeros on the
diagonal. Then the system
can be written as
(27)
(28)
where is an initial approximation. This is known as Jacobi iteration or the method of simultaneous displacements.
If , we can write (27) in the component form as
(29)
.
.
If we let
where
We now describe the algorithm for the jacobi iteration (method of simultaneous displacement).
Algorithm: Let the system be expressed in the form
where B is an n x n matrix as given above and c is a given column vector. To find an approximate solution.
1. Choose an arbitrary initial approximation vector . If no better choice is available, choose
for
or
and write
or
where
and
where is an initial approximation. The algorithm for Gauss-seidel method is given below:
Algorithm:
(30)
where B is n x n matrix with elements . The Jacobi and Gauss-seidel methods then consist of choosing an arbitrary initial vector
(31)
(32)
where is the error in the initial approximation . We must now show that regardless of what the initial error is,
Theorem: The iteration defined by (31) will converge for any choice of the initial vector if and only if the eigenvalues of the
matrix B are all less than one in magnitude.
Remark: T he rate of convergence depends upon the magnitude of the largest eigen value of B. The smaller this eigen value, the
faster the rate of convergence.
This theorem is not, however, very helpful in practice because we will generally not know the eigen values of B.
We shall describe sufficient condition for convergence in the following Theorem, which is more practicable.
Theorem: In the jacobi iteration, let the elements of B satisfy the column sum inequalities
(33)
Then the Jacobi iteration method will converge for any choice of the initial approximation .
Proof: The error equation (32) can be explicitly written as
Taking absolute values and applying the triangle inequality, we have
Since this is a finite sum of positive terms, it can vanish only if each term in the sum vanishes separately. We have thus shown that
as
We can also prove the theorem if the elements of B satisfy the row sum inequalities
(34)
If the condition (34) is satisfied, convergence of Jacobi iteration will take place for any choice of the initial approximation . The
conditions (33) and (34) expressed in terms of the elements of A become
(35)
(36)
Matrices satisfying condition (36) are said to be strictly diagonally dominant. The conditions given in the theorem are sufficient for
convergence but not necessary. These conditions are also sufficient in the convergence for Gauss-seidel method.
Exercises