APPENDIX A
Matrix Algebra

As far as possible we follow the convention of using boldface, lowercase type for vectors and boldface, uppercase type for matrices. The sequence of topics in this appendix attempts to mirror the order in which the topics appear in the main text, to facilitate cross-reference between them.

A.1 VECTORS

A vector is an ordered sequence of elements arranged in a row or column. In this book the elements are generally real numbers or symbols representing real numbers. As an example,
$$a = \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix}$$
is a 3-element column vector, and $b = [-2 \;\; 0]$ is a 2-element row vector. The order of a vector is the number of elements in the vector. Changing the sequence of elements in a vector produces a different vector; thus, permuting the elements in $a$ would yield six different vectors. In general we will interpret the vector symbol as a column vector. Column vectors can be transformed into row vectors, and vice versa, by the operation of transposition. We will denote the operation by a prime, although some authors use a T superscript. Thus,
$$a' = [5 \;\; 1 \;\; 3] \qquad \text{and} \qquad b' = \begin{bmatrix} -2 \\ 0 \end{bmatrix}$$
Clearly, repetition of the operation will restore the original vector, so that $(a')' = a$.

A.1.1 Multiplication by a Scalar

Multiplication of a vector by a scalar simply means that each element in the vector is multiplied by the scalar. For example, $2b = [-4 \;\; 0]$.

A.1.2 Addition and Subtraction

In this operation corresponding elements of the vectors are added or subtracted. This can only be done for vectors that (i) are all column vectors or all row vectors and (ii) are all of the same order. Clearly, one cannot add a row vector to a column vector, nor add a 3-element column vector to a 6-element column vector. For $a' = [a_1 \; a_2 \; \cdots \; a_n]$ and $b' = [b_1 \; b_2 \; \cdots \; b_n]$,
$$a \pm b = \begin{bmatrix} a_1 \pm b_1 \\ a_2 \pm b_2 \\ \vdots \\ a_n \pm b_n \end{bmatrix} \tag{A.1}$$

A.1.3 Linear Combinations

Combining the two operations of scalar multiplication and vector addition expresses one vector as a linear combination of other vectors. For instance,
$$b = \lambda_1 a_1 + \lambda_2 a_2 + \cdots + \lambda_k a_k = \sum_{i=1}^{k} \lambda_i a_i \tag{A.2}$$
defines the vector $b$ as a linear combination of the $a_i$ vectors with scalar weights $\lambda_i$.

A.1.4 Some Geometry

Vectors may be given a geometric as well as an algebraic interpretation. Consider a 2-element vector $a' = [2 \;\; 1]$. This may be pictured as a directed line segment, as shown in Fig. A.1. The arrow denoting the segment starts at the origin and ends at the point with coordinates (2, 1). The vector $a$ may also be indicated by the point at which the arrow terminates.

[Figure A.1: the vectors a and b and their sum c = a + b, plotted against axes labeled "1st element" and "2nd element".]

If we have another 2-element vector $b' = [1 \;\; 3]$, the geometry of vector addition is as follows. Start with $a$ and then place the $b$ vector at the terminal point of the $a$ vector. This takes us to the point P in Fig. A.1. This point defines a vector $c$ as the sum of the vectors $a$ and $b$, and it is obviously also reached by starting with the $b$ vector and placing the $a$ vector at its terminal point. The process is referred to as completing the parallelogram, or as the parallelogram law for the addition of vectors. The coordinates of P are (3, 4), and
$$c = a + b = \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$$
so there is an exact correspondence between the geometric and algebraic treatments.

Now consider scalar multiplication of a vector. For example,
$$2a = 2\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$$
gives a vector in exactly the same direction as $a$, but twice as long. Similarly,
$$-3a = \begin{bmatrix} -6 \\ -3 \end{bmatrix}$$
gives a vector three times as long as $a$, but going in the opposite direction. The three vectors are shown in Fig. A.2. All three terminal points lie on a single line through the origin, that line being uniquely defined by the vector $a$. In general,
$$\lambda a' = [\lambda a_1 \;\; \lambda a_2 \;\; \cdots \;\; \lambda a_n] \tag{A.3}$$

[Figure A.2: the vectors a, 2a, and −3a, all lying on one line through the origin; axes labeled "1st element" and "2nd element".]

It is clear from the parallelogram rule that any 2-element vector can be expressed as a unique linear combination of the $a$ and $b$ vectors in the preceding numerical example.
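These elementary operations are easy to verify numerically. The following NumPy sketch is not part of the original text; it simply re-creates the two-element vectors used in the geometric discussion above and checks the parallelogram addition, the scalar multiples, and a linear combination as in Eq. (A.2).

```python
import numpy as np

a = np.array([2.0, 1.0])          # a' = [2 1]
b = np.array([1.0, 3.0])          # b' = [1 3]

print(a + b)                      # [3. 4.]  -> the point P in Fig. A.1
print(2 * a)                      # [4. 2.]  -> same direction as a, twice as long
print(-3 * a)                     # [-6. -3.] -> three times as long, opposite direction

# a linear combination, Eq. (A.2), with arbitrary weights 2 and -1
print(2 * a - 1 * b)              # [3. -1.]
```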
A.1.5 Vector Multiplication

The scalar, dot, or inner product of two vectors is defined as
$$a'b = [a_1 \; a_2 \; \cdots \; a_n]\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{i=1}^{n} a_i b_i = b'a \tag{A.4}$$
The operation is only defined for vectors of the same order. Corresponding elements are multiplied together and summed to give the product, which is a scalar. A special case of Eq. (A.4) is the product of a vector by itself, which gives
$$a'a = \sum_{i=1}^{n} a_i^2$$
In the 2-element case this quantity is $(a_1^2 + a_2^2)$, which, by Pythagoras' Theorem, is the squared length of the vector $a$. The length of a vector is denoted by $\|a\|$. Extending through three and higher dimensions gives, in general, the length of a vector as
$$\|a\| = \sqrt{a'a} \tag{A.5}$$
where the positive square root is always taken. The outer product of two $n$-element column vectors is $ab'$, which is an $n \times n$ matrix, each element being the product of an element from $a$ and an element from $b$.

A.1.6 Equality of Vectors

If two vectors of the same order are equal, they are equal element by element. The difference of the two vectors then gives the zero vector, in which every element is zero.
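A short NumPy sketch (added for illustration; the 3-element vector $b$ below is arbitrary, while $a$ reuses the example at the start of the appendix) of the inner product (A.4), the length (A.5), and the outer product:

```python
import numpy as np

a = np.array([5.0, 1.0, 3.0])
b = np.array([-2.0, 0.0, 1.0])

print(a @ b)                       # inner product a'b = -7.0, a scalar
print(np.sqrt(a @ a))              # length ||a|| = sqrt(35), Eq. (A.5)
print(np.allclose(a @ b, b @ a))   # a'b = b'a, as in Eq. (A.4)
print(np.outer(a, b))              # outer product ab', a 3 x 3 matrix
```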
A.2 MATRICES

A matrix is a rectangular array of elements. The order of a matrix is given by the number of rows and the number of columns. In stating the order, the number of rows is given first and the number of columns second. Thus, a matrix $A$ of order $m \times n$ appears as
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
Clearly a column vector is a special case of a matrix, namely, a matrix of order $m \times 1$, and a row vector is a matrix of order $1 \times n$. The $m \times n$ matrix may be regarded as an ordered collection of $m$-element column vectors or as an ordered collection of $n$-element row vectors. Multiplication of a matrix by a scalar means that each element in the matrix is multiplied by the scalar. The addition of two matrices of the same order, as with vectors, is achieved by adding corresponding elements.

The transpose of $A$ is denoted by $A'$. The first row of $A$ becomes the first column of the transpose, the second row of $A$ becomes the second column of the transpose, and so on. The definition might equally well have been stated in terms of the first column of $A$ becoming the first row of $A'$, and so on. A symmetric matrix satisfies
$$A = A' \tag{A.6}$$
that is, $a_{ij} = a_{ji}$ for all $i$ and $j$, where $a_{ij}$ is the element in $A$ at the intersection of the $i$th row and $j$th column. This property can only hold for square matrices $(m = n)$, since otherwise $A$ and $A'$ are not even of the same order. An example of a symmetric matrix is
$$\begin{bmatrix} 1 & -1 & 4 \\ -1 & 0 & 3 \\ 4 & 3 & 2 \end{bmatrix}$$
From the definition of a transpose it follows that repetition of the operation returns the original matrix, $(A')' = A$.

A.2.1 Matrix Multiplication

Matrix multiplication is achieved by repeated applications of vector multiplication. If $A$ is of order $m \times n$ and $B$ is of order $n \times p$, then a matrix $C = AB$ of order $m \times p$ can be found. The typical element $c_{ij}$ of $C$ is the inner product of the $i$th row of $A$ and the $j$th column of $B$; that is,
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \qquad i = 1, \ldots, m; \; j = 1, \ldots, p \tag{A.7}$$
These inner products only exist if $A$ has the same number of columns as $B$ has rows. Thus the order in which matrices enter the product is of vital importance. When $p \neq m$ the inner products of rows of $B$ and columns of $A$ do not exist, and $BA$ is not defined. The following example illustrates a case where both product matrices exist $(p = m)$:
$$AB = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 4 \end{bmatrix}\begin{bmatrix} 1 & 6 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1(1)+2(0)+3(1) & 1(6)+2(1)+3(1) \\ 2(1)+0(0)+4(1) & 2(6)+0(1)+4(1) \end{bmatrix} = \begin{bmatrix} 4 & 11 \\ 6 & 16 \end{bmatrix}$$
$$BA = \begin{bmatrix} 1 & 6 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 4 \end{bmatrix} = \begin{bmatrix} 1(1)+6(2) & 1(2)+6(0) & 1(3)+6(4) \\ 0(1)+1(2) & 0(2)+1(0) & 0(3)+1(4) \\ 1(1)+1(2) & 1(2)+1(0) & 1(3)+1(4) \end{bmatrix} = \begin{bmatrix} 13 & 2 & 27 \\ 2 & 0 & 4 \\ 3 & 2 & 7 \end{bmatrix}$$

A.2.2 The Transpose of a Product

Let the product $AB$ be denoted by
$$C = AB = [\,a_i' b_j\,]$$
where $a_i'$ indicates the $i$th row of $A$ and $b_j$ the $j$th column of $B$. Thus $c_{ij} = a_i' b_j$ denotes the $ij$th element in $C$. Transposition of $C$ means that the $ij$th element in $C$ becomes the $ji$th element in $C'$. Denoting this element by $c'_{ji}$ gives
$$c'_{ji} = c_{ij} = a_i' b_j$$
Referring to the definition of vector multiplication in Eq. (A.4), we see that $a_i' b_j = b_j' a_i$. Thus $c'_{ji} = b_j' a_i$, the inner product of the $j$th row of $B'$ and the $i$th column of $A'$, and so, from the definition of matrix multiplication,
$$C' = (AB)' = B'A' \tag{A.8}$$
The transpose of a product is the product of the transposes in reverse order. This rule extends directly to any number of conformable matrices. Thus,
$$(ABC)' = C'B'A' \tag{A.9}$$
The associative law of addition holds for matrices; that is,
$$(A + B) + C = A + (B + C) \tag{A.10}$$
This result is obvious since matrix addition merely involves adding corresponding elements, and it does not matter in what order the additions are performed. We state, without proof, the associative law of multiplication, which is
$$(AB)C = A(BC) \tag{A.11}$$

A.2.3 Some Important Square Matrices

The unit or identity matrix of order $n \times n$ is
$$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$
with ones down the principal diagonal and zeros everywhere else. This matrix plays a role similar to that of unity in scalar algebra. Premultiplying an $n$-vector $y$ by $I$ leaves the vector unchanged, that is, $Iy = y$. Transposing this last result gives $y'I = y'$; that is, postmultiplying a row vector by $I$ leaves the row vector unchanged. For a matrix $A$ of order $m \times n$ it follows that
$$I_m A = A I_n = A$$
Pre- or postmultiplication by $I$ leaves the matrix unchanged. There is usually no need to indicate the order of the identity matrix explicitly, as it will be obvious from the context. The identity matrix may be entered or suppressed at will in matrix multiplication. For instance, with $M = I - P$ one may write $My = Iy - Py = y - Py$.

A diagonal matrix is like the identity matrix in that all off-diagonal terms are zero, but now the terms on the principal diagonal are scalar elements, of which at least one is nonzero. The diagonal matrix may be written
$$\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$
A special case of a diagonal matrix occurs when all the $\lambda$'s are equal. This is termed a scalar matrix and may be written $\lambda I$. A scalar multiplier may be placed before or after the matrix it multiplies.

Another important square matrix is an idempotent matrix. If $A$ is idempotent, then
$$A^2 = A^3 = \cdots = A$$
that is, multiplying $A$ by itself, however many times, simply reproduces the original matrix. An example of a symmetric idempotent matrix is
$$A = \frac{1}{6}\begin{bmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{bmatrix}$$
as may be verified by multiplication.

A very useful transformation matrix is
$$A = I - \frac{1}{n} i i'$$
where $i$ is a column vector of $n$ ones. The product $ii'$ is a matrix of order $n \times n$, in which every element is one. Given a column vector $y$ of $n$ observations on a variable $Y$,
$$\frac{1}{n} i'y = \frac{1}{n}(y_1 + y_2 + \cdots + y_n) = \bar{y}$$
and so
$$Ay = y - \frac{1}{n} i i' y = \begin{bmatrix} y_1 - \bar{y} \\ y_2 - \bar{y} \\ \vdots \\ y_n - \bar{y} \end{bmatrix}$$
The matrix thus transforms a data series into deviations from the arithmetic mean. If the data series has zero mean, it is unaffected by the transformation. Finally we note that $Ai = 0$.
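The deviation-from-the-mean transformation $A = I - (1/n)ii'$ and its idempotency can be checked directly. The NumPy sketch below is illustrative only: the data vector y is arbitrary, and the last line checks the 3 × 3 idempotent example given above.

```python
import numpy as np

n = 5
y = np.array([3.0, 7.0, 2.0, 9.0, 4.0])

i = np.ones((n, 1))
A = np.eye(n) - (1.0 / n) * (i @ i.T)       # A = I - (1/n) i i'

print(np.allclose(A @ y, y - y.mean()))     # True: Ay gives deviations from the mean
print(np.allclose(A @ A, A))                # True: A is idempotent
print(A @ i)                                # Ai = 0 (a column of zeros)

B = (1.0 / 6) * np.array([[1, -2, 1], [-2, 4, -2], [1, -2, 1]])
print(np.allclose(B @ B, B))                # True: the 3 x 3 idempotent example above
```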
Another important matrix, though not necessarily square, is the null matrix $0$, whose every element is zero. Obvious relations are
$$A + 0 = A \qquad \text{and} \qquad A0 = 0$$
An important scalar associated with a square matrix is its trace, which is the sum of the elements on the principal diagonal,
$$\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}$$
If the matrix $A$ is of order $m \times n$ and $B$ is of order $n \times m$, then $AB$ and $BA$ are both square matrices, and
$$\operatorname{tr}(AB) = \operatorname{tr}(BA)$$
Repeated application gives
$$\operatorname{tr}(ABC) = \operatorname{tr}(CAB) = \operatorname{tr}(BCA)$$
provided the products exist as square matrices.

A.2.4 Partitioned Matrices

A matrix may be partitioned into a set of submatrices by indicating subgroups of rows and/or columns. For example, a $3 \times 4$ matrix may be partitioned after its second row and third column to give
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \tag{A.14}$$
where $A_{11}$ is of order $2 \times 3$, $A_{12}$ is $2 \times 1$, $A_{21}$ is $1 \times 3$, and $A_{22}$ is $1 \times 1$. Dashed lines drawn in the array indicate the partitioning, yielding the four submatrices defined in Eq. (A.14). The previous rules for the addition and multiplication of matrices apply directly to partitioned matrices, provided the submatrices are all of appropriate dimensions. For instance, if $A$ and $B$ are both written in partitioned form as
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$
then
$$A + B = \begin{bmatrix} A_{11} + B_{11} & A_{12} + B_{12} \\ A_{21} + B_{21} & A_{22} + B_{22} \end{bmatrix}$$
provided $A$ and $B$ are of the same overall order (dimension) and each pair $A_{ij}$, $B_{ij}$ is of the same order. As an example of the multiplication of partitioned matrices,
$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ A_{31} & A_{32} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \\ A_{31}B_{11} + A_{32}B_{21} & A_{31}B_{12} + A_{32}B_{22} \end{bmatrix}$$
For the multiplication to be possible and for these equations to hold, the number of columns in $A$ must equal the number of rows in $B$, and the same partitioning must be applied to the columns of $A$ as to the rows of $B$.

A.2.5 Matrix Differentiation

As seen in Chapter 3, OLS requires the determination of a vector $b$ to minimize the residual sum of squares,
$$e'e = y'y - 2b'X'y + b'X'Xb$$
The first term on the right-hand side does not involve $b$, the second term is linear in $b$ since $X'y$ is a $k$-element column vector of known numbers, and the third term is a symmetric quadratic form in $b$. To differentiate a linear function, write it as
$$f(b) = a'b = a_1 b_1 + a_2 b_2 + \cdots + a_k b_k = b'a$$
where the $a$'s are given constants. We may partially differentiate $f(b)$ with respect to each of the $b_i$. The resultant partial derivatives are arranged as a column vector,
$$\frac{\partial (a'b)}{\partial b} = \frac{\partial (b'a)}{\partial b} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_k \end{bmatrix} = a \tag{A.15}$$
These derivatives might equally well have been arranged as a row vector. The important requirement is consistency of treatment, so that vectors and matrices of derivatives are of appropriate order for further manipulation. For the linear term in $e'e$ it follows directly that
$$\frac{\partial (2b'X'y)}{\partial b} = 2X'y$$
which is a $k$-element vector.

The general quadratic form in $b$ may be written $f(b) = b'Ab$, where the matrix $A$ of known constants may be taken as symmetric. As a simple illustration consider
$$f(b) = [b_1 \; b_2 \; b_3]\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = a_{11}b_1^2 + a_{22}b_2^2 + a_{33}b_3^2 + 2a_{12}b_1 b_2 + 2a_{13}b_1 b_3 + 2a_{23}b_2 b_3$$
The vector of partial derivatives is then
$$\frac{\partial f(b)}{\partial b} = \begin{bmatrix} 2(a_{11}b_1 + a_{12}b_2 + a_{13}b_3) \\ 2(a_{12}b_1 + a_{22}b_2 + a_{23}b_3) \\ 2(a_{13}b_1 + a_{23}b_2 + a_{33}b_3) \end{bmatrix} = 2Ab$$
This result obviously holds for a symmetric quadratic form of any order; that is,
$$\frac{\partial (b'Ab)}{\partial b} = 2Ab \tag{A.16}$$
for symmetric $A$. Applying this result to the OLS case gives
$$\frac{\partial (b'X'Xb)}{\partial b} = 2X'Xb$$
which is a $k$-element column vector.
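The derivative rules (A.15) and (A.16) can be confirmed with a finite-difference check. In the sketch below (not from the text) the vectors a, b and the symmetric matrix A are random, and the analytic gradients a and 2Ab are compared with numerical ones.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 4
a = rng.normal(size=k)
b = rng.normal(size=k)
M = rng.normal(size=(k, k))
A = (M + M.T) / 2                      # a symmetric matrix of constants

def num_grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# d(a'b)/db = a, Eq. (A.15)
print(np.allclose(num_grad(lambda x: a @ x, b), a))
# d(b'Ab)/db = 2Ab for symmetric A, Eq. (A.16)
print(np.allclose(num_grad(lambda x: x @ A @ x, b), 2 * A @ b, atol=1e-5))
```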
A.2.6 Solution of Equations

The OLS coefficient vector, $b$, is the solution of $(X'X)b = X'y$. We need to establish the conditions under which a unique solution vector exists. Consider the set of equations
$$Ab = c \tag{A.17}$$
where $A$ is a square, but not necessarily symmetric, matrix of order $k \times k$, and $b$ and $c$ are $k$-element column vectors. The elements of $A$ and $c$ are known, and $b$ is to be determined. The simplest case occurs when $k = 2$. The equations may then be written
$$b_1 a_1 + b_2 a_2 = c$$
where the $a_i$ ($i = 1, 2$) are the 2-element column vectors in $A$. If the type of situation illustrated in Fig. A.1 exists, it follows directly that there is a unique linear combination of the $a_i$ that gives the $c$ vector. However, if the situation pictured in Fig. A.2 obtains, where one column vector is simply a scalar multiple of the other, say, $a_2 = \lambda a_1$, then any linear combination of the two vectors can only produce another multiple of $a_1$. Should the $c$ vector also lie on the ray through $a_1$, Eq. (A.17) will have an infinity of solutions, whereas if $c$ lies elsewhere there will be no solution. The difference between (i) a unique solution and (ii) no solution or an infinity of solutions is that in the first case the column vectors are linearly independent, and in the second case they are linearly dependent. If the only solution to $\lambda_1 a_1 + \lambda_2 a_2 = 0$ is $\lambda_1 = \lambda_2 = 0$, the vectors are said to be linearly independent; otherwise they are linearly dependent.

The extension of this definition to higher dimensions is as follows. If the only solution to
$$\lambda_1 a_1 + \lambda_2 a_2 + \cdots + \lambda_k a_k = 0$$
is $\lambda_1 = \lambda_2 = \cdots = \lambda_k = 0$, the $k$-element $a_i$ vectors are linearly independent. Any $k$-element vector can then be expressed as a unique linear combination of these vectors, and so Eq. (A.17) has a unique solution vector, $b$. This set of linearly independent vectors serves as a basis for the $k$-dimensional space.

The determinant of a square matrix $A$ of order $n$, written $|A|$, may be evaluated by a cofactor expansion, where the cofactor $C_{ij}$ of the element $a_{ij}$ is $(-1)^{i+j}$ times the determinant of the submatrix obtained by deleting the $i$th row and $j$th column of $A$. Expanding in terms of the first row,
$$|A| = a_{11}C_{11} + a_{12}C_{12} + \cdots + a_{1n}C_{1n} \tag{A.33}$$
Alternatively, the expansion in terms of the $i$th row of $A$ is
$$|A| = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in} \qquad i = 1, 2, \ldots, n \tag{A.34}$$
or, in terms of the $j$th column of $A$,
$$|A| = a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj} \qquad j = 1, 2, \ldots, n \tag{A.35}$$
The cofactors are now the signed minors of matrices of order $n - 1$, and the inverse matrix is
$$A^{-1} = \frac{1}{|A|}\begin{bmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{bmatrix} \tag{A.36}$$
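The cofactor expansion (A.34) and the adjugate formula for the inverse (A.36) can be coded directly for a small matrix. The sketch below is illustrative only (the 3 × 3 matrix is arbitrary and assumed nonsingular) and compares the results with NumPy's built-in routines.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
n = A.shape[0]

def cofactor(A, i, j):
    """Signed minor C_ij: delete row i and column j, take the determinant."""
    minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

# expansion by the first row, Eq. (A.34) with i = 1
detA = sum(A[0, j] * cofactor(A, 0, j) for j in range(n))
print(np.isclose(detA, np.linalg.det(A)))          # True

# inverse via the transposed matrix of cofactors, Eq. (A.36)
C = np.array([[cofactor(A, i, j) for j in range(n)] for i in range(n)])
print(np.allclose(C.T / detA, np.linalg.inv(A)))   # True
```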
A.2.9 Some Properties of Determinants

(i) If a matrix $B$ is formed from $A$ by adding a multiple of one row (or column) to another row (or column), the determinant is unchanged. Suppose the $i$th row of $B$ is defined as the sum of the $i$th row of $A$ and a multiple of the $j$th row of $A$; that is,
$$b_{ik} = a_{ik} + \lambda a_{jk} \qquad k = 1, 2, \ldots, n$$
Expanding in terms of the $i$th row,
$$|B| = \sum_{k}(a_{ik} + \lambda a_{jk})C_{ik} = \sum_{k} a_{ik}C_{ik} = |A|$$
where the cofactors of the $i$th row are obviously the same for each matrix. The result then follows since expansions in terms of alien cofactors vanish.

(ii) If the rows (columns) of $A$ are linearly dependent, $|A| = 0$, and if they are linearly independent, $|A| \neq 0$. If row $i$ can be expressed as a linear combination of certain other rows, the rows of $A$ are linearly dependent. Subtracting that linear combination from row $i$ is simply a repeated application of Property (i), and so will leave the determinant unchanged. However, it produces a matrix with a row of zeros. Since each term in the determinantal expansion contains one element from any specific row, the determinant is zero. If the rows (columns) of $A$ are linearly independent, there is no way to produce a zero row (column), and so $|A| \neq 0$. Thus, nonsingular matrices have nonzero determinants and singular matrices have zero determinants.

(iii) The determinant of a triangular matrix is equal to the product of the diagonal elements. A lower triangular matrix has zeros everywhere above the diagonal, as in
$$A = \begin{bmatrix} a_{11} & 0 & 0 & \cdots & 0 \\ a_{21} & a_{22} & 0 & \cdots & 0 \\ a_{31} & a_{32} & a_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix}$$
Expanding in terms of the elements in the first row gives the determinant as the product of $a_{11}$ and the determinant of the matrix of order $n - 1$ obtained by deleting the first row and column of $A$. Proceeding in this fashion gives
$$|A| = a_{11}a_{22}a_{33}\cdots a_{nn}$$
An upper triangular matrix has zeros everywhere below the diagonal, and the same result obviously holds. Two special cases of this property follow directly: the determinant of a diagonal matrix is the product of the diagonal elements, and the determinant of the unit (identity) matrix is one.

(iv) Multiplying any row (column) of a matrix by a constant multiplies the determinant by the same constant. Multiplying a matrix of order $n$ by a constant multiplies the determinant by that constant raised to the $n$th power. This result follows from the determinantal expansion, where each term is the product of $n$ elements, one and only one from each row and column of the matrix.

(v) The determinant of the product of two square matrices is the product of the determinants,
$$|AB| = |A|\,|B|$$
A useful corollary is
$$|A^{-1}| = \frac{1}{|A|}$$
since $|A||A^{-1}| = |AA^{-1}| = |I| = 1$.

A.2.10 Properties of Inverse Matrices

We now state, mostly without proof, some of the main properties of inverse matrices.

(i) The inverse of the inverse reproduces the original matrix,
$$(A^{-1})^{-1} = A$$
From the definition of an inverse, $(A^{-1})(A^{-1})^{-1} = I$. Premultiplying by $A$ gives the result.

(ii) The inverse of the transpose equals the transpose of the inverse,
$$(A')^{-1} = (A^{-1})'$$
Transposing $AA^{-1} = I$ gives $(A^{-1})'A' = I$. Postmultiplication by $(A')^{-1}$ yields the result.

(iii) The inverse of an upper (lower) triangular matrix is also an upper (lower) triangular matrix. We illustrate this result for a lower triangular $3 \times 3$ matrix,
$$A = \begin{bmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
By inspection one can see that three cofactors are zero, namely,
$$C_{21} = -\begin{vmatrix} 0 & 0 \\ a_{32} & a_{33} \end{vmatrix} = 0 \qquad C_{31} = \begin{vmatrix} 0 & 0 \\ a_{22} & 0 \end{vmatrix} = 0 \qquad C_{32} = -\begin{vmatrix} a_{11} & 0 \\ a_{31} & 0 \end{vmatrix} = 0$$
Thus, from Eq. (A.36),
$$A^{-1} = \frac{1}{|A|}\begin{bmatrix} C_{11} & 0 & 0 \\ C_{12} & C_{22} & 0 \\ C_{13} & C_{23} & C_{33} \end{bmatrix}$$
which is lower triangular.

(iv) The inverse of a partitioned matrix may also be expressed in partitioned form. If
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$
where $A_{11}$ and $A_{22}$ are square nonsingular matrices, then
$$A^{-1} = \begin{bmatrix} B_{11} & -B_{11}A_{12}A_{22}^{-1} \\ -A_{22}^{-1}A_{21}B_{11} & A_{22}^{-1} + A_{22}^{-1}A_{21}B_{11}A_{12}A_{22}^{-1} \end{bmatrix} \tag{A.37}$$
where $B_{11} = (A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1}$, or, alternatively,
$$A^{-1} = \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1}A_{12}B_{22}A_{21}A_{11}^{-1} & -A_{11}^{-1}A_{12}B_{22} \\ -B_{22}A_{21}A_{11}^{-1} & B_{22} \end{bmatrix} \tag{A.38}$$
where $B_{22} = (A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}$. The correctness of these results may be checked by multiplying out. These formulas are frequently used. The first form, Eq. (A.37), is the simpler for expressions involving the first row of the inverse. Conversely, the second form, Eq. (A.38), is more convenient for expressions involving the second row.

A very important special case of these results occurs when a data matrix is partitioned as $X = [X_1 \;\; X_2]$. Then
$$X'X = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}$$
Substitution in the preceding formulae gives
$$B_{11} = [X_1'X_1 - X_1'X_2(X_2'X_2)^{-1}X_2'X_1]^{-1} = (X_1'M_2X_1)^{-1} \tag{A.39}$$
with
$$M_2 = I - X_2(X_2'X_2)^{-1}X_2' \tag{A.40}$$
A similar substitution, or simply interchanging the 1, 2 subscripts, gives
$$B_{22} = (X_2'M_1X_2)^{-1} \tag{A.41}$$
with
$$M_1 = I - X_1(X_1'X_1)^{-1}X_1' \tag{A.42}$$
The $M_i$ are symmetric idempotent matrices. Premultiplication of any vector by $M_i$ gives the residuals from the regression of that vector on $X_i$. Thus $M_2X_1$ gives the matrix of residuals when each of the variables in $X_1$ is regressed on $X_2$, and so forth. The OLS equations for $y$ on $X$ in partitioned form are
$$\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}^{-1}\begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix}$$
Taking the first row, we have
$$b_1 = B_{11}X_1'y - B_{11}X_1'X_2(X_2'X_2)^{-1}X_2'y = (X_1'M_2X_1)^{-1}X_1'M_2y \tag{A.43}$$
Similarly,
$$b_2 = (X_2'M_1X_2)^{-1}X_2'M_1y \tag{A.44}$$
These results provide an alternative look at OLS regression. Regressing $y$ and $X_1$ on $X_2$ yields a vector of residuals, $M_2y$, and a matrix of residuals, $M_2X_1$. Regressing the former on the latter gives the $b_1$ coefficient vector in Eq. (A.43). There is a similar interpretation for the $b_2$ vector in Eq. (A.44).
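Equations (A.43) and (A.44), the partitioned form of the OLS coefficients, can be checked numerically. The sketch below uses simulated data (the sample sizes and coefficient values are arbitrary) and compares the subvector of the full OLS solution with $(X_1'M_2X_1)^{-1}X_1'M_2y$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k1, k2 = 200, 2, 3
X1 = rng.normal(size=(n, k1))
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, k2 - 1))])
y = X1 @ np.array([1.5, -0.5]) + X2 @ rng.normal(size=k2) + rng.normal(size=n)

X = np.column_stack([X1, X2])
b = np.linalg.solve(X.T @ X, X.T @ y)                     # full OLS: (X'X)^{-1}X'y

M2 = np.eye(n) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)    # M2 = I - X2(X2'X2)^{-1}X2'
b1 = np.linalg.solve(X1.T @ M2 @ X1, X1.T @ M2 @ y)       # Eq. (A.43)

print(np.allclose(b[:k1], b1))                            # True: same coefficients on X1
print(np.allclose(M2 @ M2, M2))                           # True: M2 is idempotent
```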
A.2.11 More on Rank and the Solution of Equations

Consider the homogeneous equations
$$Ab = 0 \tag{A.45}$$
The elements of $A$ are known constants, and $b$ is an unknown solution vector. Clearly, if $A$ is square and nonsingular, the only solution is the zero vector, $b = A^{-1}0 = 0$. A nonzero solution requires $A$ to be singular. As an illustration, consider
$$a_{11}b_1 + a_{12}b_2 = 0$$
$$a_{21}b_1 + a_{22}b_2 = 0$$
These equations give $b_1/b_2 = -a_{12}/a_{11} = -a_{22}/a_{21}$. For a nonzero solution we must therefore have
$$a_{11}a_{22} - a_{12}a_{21} = 0$$
that is, the determinant of $A$ must be zero. $A$ is then a singular matrix with rank of one. One row (column) is a multiple of the other row (column). The solution vector is a ray through the origin.

Now consider a rectangular system,
$$a_{11}b_1 + a_{12}b_2 + a_{13}b_3 = 0$$
$$a_{21}b_1 + a_{22}b_2 + a_{23}b_3 = 0$$
The rank of $A$ is at most two. If it is two, then $A$ will have at least two linearly independent columns. Suppose that the first two columns are linearly independent. The equations then solve for $b_1$ and $b_2$ in terms of $b_3$, say, $b_1 = \lambda_1 b_3$, $b_2 = \lambda_2 b_3$. The solution vector may be written
$$b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ 1 \end{bmatrix} b_3$$
The scalar $b_3$ is arbitrary. Thus all solution vectors lie on a ray through the origin. If the rank of $A$ is one, then one row must be a multiple of the other. The equations then solve for, say, $b_1$ as a linear function of $b_2$ and $b_3$, which are arbitrary. Writing this as $b_1 = \lambda_2 b_2 + \lambda_3 b_3$, the solution vector is
$$b = \begin{bmatrix} \lambda_2 \\ 1 \\ 0 \end{bmatrix} b_2 + \begin{bmatrix} \lambda_3 \\ 0 \\ 1 \end{bmatrix} b_3$$
All solution vectors thus lie in a two-dimensional subspace of three-dimensional space.

The set of solutions to Eq. (A.45) constitutes a vector space called the nullspace of $A$. The dimension of this nullspace (the number of linearly independent vectors spanning the subspace) is called the nullity. All three examples satisfy the equation
$$\text{Number of columns in } A = \text{rank of } A + \text{nullity} \tag{A.46}$$
This equation holds generally. Let $A$ be of order $m \times n$ with rank $r$. Thus there is at least one set of $r$ linearly independent rows and at least one set of $r$ linearly independent columns. If necessary, rows and columns may be interchanged so that the first $r$ rows and the first $r$ columns are linearly independent. Partition $A$ by the first $r$ rows and columns as in
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$
where $A_{11}$ is a square nonsingular matrix of order $r$, and $A_{12}$ is of order $r \times (n - r)$. Dropping the last $m - r$ rows from Eq. (A.45) leaves
$$\begin{bmatrix} A_{11} & A_{12} \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = 0 \tag{A.47}$$
where $b_1$ contains $r$ elements and $b_2$ the remaining $n - r$ elements. This is a set of $r$ linearly independent equations in the $n$ unknowns. Solving for $b_1$ gives
$$b_1 = -A_{11}^{-1}A_{12}b_2 \tag{A.48}$$
The $b_2$ subvector is arbitrary or "free" in the sense that its $n - r$ elements can be specified at will. For any such specification the subvector $b_1$ is determined by Eq. (A.48). The general solution vector to Eq. (A.47) is thus
$$b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} -A_{11}^{-1}A_{12} \\ I_{n-r} \end{bmatrix} b_2 \tag{A.49}$$
But any solution to Eq. (A.47) is also a solution to Eq. (A.45), since the rows discarded from Eq. (A.45) to reach Eq. (A.47) can all be expressed as linear combinations of the $r$ independent rows in Eq. (A.47). Hence any solution that holds for the included rows also holds for the discarded rows. Thus Eq. (A.49) defines the general solution vector for Eq. (A.45). If $b$ is a solution, then clearly $\lambda b$ is also a solution for arbitrary $\lambda$. If $b_1$ and $b_2$ are distinct solutions, then $\lambda_1 b_1 + \lambda_2 b_2$ is also a solution. Thus the solutions to Eq. (A.45) constitute a vector space, the nullspace of $A$. The dimension of the nullspace is determined from Eq. (A.49). The $n - r$ columns of the matrix in Eq. (A.49) are linearly independent, since the columns of the submatrix $I_{n-r}$ are necessarily independent. Thus the dimension of the nullspace is $n - r$, which proves the relation in Eq. (A.46). One important application of this result occurs in the discussion of the identification of simultaneous equation models, namely, that if the rank of $A$ is one less than the number of columns, the solution space is simply a ray through the origin.

Result (A.46) also yields simple proofs of some important theorems on the ranks of various matrices. In Chapter 3 we saw that a crucial matrix in the determination of the OLS vector is $X'X$, where $X$ is the $n \times k$ data matrix. It is assumed that $n > k$. Suppose $\rho(X) = r$. The nullspace of $X$ then has dimension $k - r$. If $m$ denotes any vector in this nullspace,
$$Xm = 0$$
Premultiplying by $X'$ gives
$$X'Xm = 0$$
Thus $m$ also lies in the nullspace of $X'X$. Next let $s$ be any vector in the nullspace of $X'X$, so that
$$X'Xs = 0$$
Premultiplying by $s'$ gives $s'X'Xs = 0$. Thus $Xs$ is a vector with zero length and so must be the null vector; that is,
$$Xs = 0$$
Thus $s$ lies in the nullspace of $X$. Consequently, $X$ and $X'X$ have the same nullspace and hence the same nullity, $(k - r)$. They also have the same number of columns $(k)$, and so by Eq. (A.46) have the same rank, $r = k - (k - r)$. Thus
$$\rho(X) = \rho(X'X) \tag{A.50}$$
When $X$ has linearly independent columns, its rank is $k$. Then $X'X$ has rank $k$ and so is nonsingular with inverse $(X'X)^{-1}$, guaranteeing the uniqueness of the OLS vector.

Transposing a matrix does not change its rank; that is, $\rho(X) = \rho(X')$. Applying Eq. (A.50) gives $\rho(XX') = \rho(X')$. The general result is then
$$\rho(X) = \rho(X'X) = \rho(XX') \tag{A.51}$$
Notice that $XX'$ is a square matrix of order $n$ $(> k)$, so that even if $X$ has full column rank, $XX'$ is still singular.

Another important theorem on rank may be stated as follows. If $A$ is a matrix of order $m \times n$ with rank $r$, and $P$ and $Q$ are square nonsingular matrices of order $m$ and $n$, respectively, then
$$\rho(PA) = \rho(AQ) = \rho(PAQ) = \rho(A) \tag{A.52}$$
that is, premultiplication and/or postmultiplication of $A$ by nonsingular matrices yields a matrix with the same rank as $A$. This result may be established by the same methods as used for Eq. (A.51). Finally, we state without proof a theorem for the general case of the multiplication of one rectangular matrix by another conformable rectangular matrix. Let $A$ be $m \times n$ and $B$ be $n \times s$. Then
$$\rho(AB) \leq \min[\rho(A), \rho(B)] \tag{A.53}$$
that is, the rank of the product is less than or equal to the smaller of the ranks of the constituent matrices. Again, a similar method of proof applies as in the previous two theorems.
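The rank results (A.50)–(A.53) are easy to illustrate with NumPy's matrix_rank. In the sketch below the data matrix is simulated, with one column deliberately made a linear combination of two others so that X does not have full column rank.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 20, 4
X = rng.normal(size=(n, k))
X[:, 3] = 2 * X[:, 0] - X[:, 1]            # introduce linear dependence: rank is now 3

r = np.linalg.matrix_rank
print(r(X), r(X.T @ X), r(X @ X.T))        # 3 3 3  -> Eq. (A.51)

P = rng.normal(size=(n, n))                # nonsingular with probability one
Q = rng.normal(size=(k, k))
print(r(P @ X), r(X @ Q), r(P @ X @ Q))    # all 3 -> Eq. (A.52)

B = rng.normal(size=(k, 6))
print(r(X @ B) <= min(r(X), r(B)))         # True -> Eq. (A.53)
```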
A.2.12 Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors occur in the solution of a special set of equations. Consider the set of first-order difference equations that appears in the discussion of VARs in Chapter 9, namely,
$$x_t = A x_{t-1} \tag{A.54}$$
where $x_t$ is a $k \times 1$ vector of observations on a set of $x$ variables at time $t$, and $A$ is a $k \times k$ matrix of known numbers. By analogy with the treatment of the univariate case in Chapter 7 we postulate a solution vector for the multivariate case as
$$x_t = \lambda^t c \tag{A.55}$$
where $\lambda$ is an unknown scalar and $c$ is an unknown $k \times 1$ vector. If Eq. (A.55) is to be a solution for Eq. (A.54), substitution in Eq. (A.54) should give equality of the two sides. Making the substitution and dividing through by $\lambda^{t-1}$ gives
$$\lambda c = Ac \qquad \text{or} \qquad (A - \lambda I)c = 0 \tag{A.56}$$
The $c$ vector thus lies in the nullspace of the matrix $A - \lambda I$. If this matrix is nonsingular, the only solution to Eq. (A.56) is the trivial $c = 0$. A nontrivial solution requires the matrix to be singular or, in other words, to have a zero determinant, which gives
$$|A - \lambda I| = 0 \tag{A.57}$$
This condition gives the characteristic equation of the matrix $A$. It is a polynomial of degree $k$ in the unknown $\lambda$, which can be solved for the $k$ roots. These $\lambda$'s are the eigenvalues of $A$. They are also known as latent roots or characteristic roots. Each $\lambda_i$ may be substituted back in Eq. (A.56) and the corresponding $c$ vector obtained. The $c$ vectors are known as the eigenvectors of $A$. They are also known as latent vectors or characteristic vectors. Assembling all $k$ solutions produces the matrix equation
$$A[c_1 \; c_2 \; \cdots \; c_k] = [c_1 \; c_2 \; \cdots \; c_k]\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_k \end{bmatrix}$$
which is written more compactly as
$$AC = C\Lambda \tag{A.58}$$
where $C$ is the square matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues. If we assume for the moment that $C$ is nonsingular, it follows that
$$C^{-1}AC = \Lambda \tag{A.59}$$
and the matrix of eigenvectors serves to diagonalize the $A$ matrix.

EXAMPLE. As a simple illustration consider
$$A = \begin{bmatrix} 1.3 & -0.1 \\ 0.8 & 0.4 \end{bmatrix}$$
The characteristic equation (A.57) is then
$$\begin{vmatrix} 1.3 - \lambda & -0.1 \\ 0.8 & 0.4 - \lambda \end{vmatrix} = (1.3 - \lambda)(0.4 - \lambda) + 0.08 = \lambda^2 - 1.7\lambda + 0.6 = (\lambda - 1.2)(\lambda - 0.5) = 0$$
The eigenvalues are $\lambda_1 = 1.2$ and $\lambda_2 = 0.5$. Substituting the first eigenvalue in Eq. (A.56) gives
$$\begin{bmatrix} 0.1 & -0.1 \\ 0.8 & -0.8 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
so that $c_1 = c_2$. An eigenvector is determined only up to an arbitrary scale factor; normalizing by setting the first element to one gives $c_1' = [1 \;\; 1]$. Substituting the second eigenvalue gives
$$\begin{bmatrix} 0.8 & -0.1 \\ 0.8 & -0.1 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
so that $c_2 = 8c_1$, and the second eigenvector may be written $c_2' = [1 \;\; 8]$. Then
$$C = \begin{bmatrix} 1 & 1 \\ 1 & 8 \end{bmatrix} \qquad \text{and} \qquad C^{-1} = \frac{1}{7}\begin{bmatrix} 8 & -1 \\ -1 & 1 \end{bmatrix}$$
and it is easy to check that
$$C^{-1}AC = \frac{1}{7}\begin{bmatrix} 8 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1.3 & -0.1 \\ 0.8 & 0.4 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 8 \end{bmatrix} = \begin{bmatrix} 1.2 & 0 \\ 0 & 0.5 \end{bmatrix}$$
which illustrates the diagonalization of $A$. The reader should check that any other arbitrary normalization of the eigenvectors leaves the eigenvalues unchanged.
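The worked example can be reproduced with np.linalg.eig. The sketch checks the eigenvalues 1.2 and 0.5 and the diagonalization $C^{-1}AC = \Lambda$; note that NumPy returns eigenvectors normalized to unit length rather than with first element one, but, as remarked above, the normalization is arbitrary.

```python
import numpy as np

A = np.array([[1.3, -0.1],
              [0.8,  0.4]])

lam, C = np.linalg.eig(A)
print(lam)                              # [1.2 0.5] (possibly in a different order)
print(np.linalg.inv(C) @ A @ C)         # diag(1.2, 0.5) up to rounding

# rescale each eigenvector so its first element is one, as in the text
print(C / C[0, :])                      # columns proportional to [1, 1] and [1, 8]
```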
A.2.13 Properties of Eigenvalues and Eigenvectors

In the properties to follow, $A$ is a $k \times k$ matrix of real elements and rank $k$, $\Lambda$ is a diagonal matrix of $k$ eigenvalues, not necessarily all distinct, and $C$ is a $k \times s$ $(s \leq k)$ matrix whose columns are the eigenvectors of $A$. Some properties apply generally to any real square matrix. Others depend on whether the matrix is symmetric or not. For such results we use (a) to refer to the nonsymmetric case and (b) to refer to the symmetric case. Some results are stated without proof. For others an outline of a proof is provided.

1(a). The eigenvalues of a nonsymmetric matrix may be real or complex.

1(b). The eigenvalues of a symmetric matrix are all real.

As an illustration, the nonsymmetric matrix $A$ below has characteristic equation $\lambda^2 + 1 = 0$, giving $\lambda = \pm i$, where $i = \sqrt{-1}$, whereas the symmetric matrix $B$ has eigenvalues $\pm\sqrt{5}$:
$$A = \begin{bmatrix} 1 & -2 \\ 1 & -1 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & -2 \\ -2 & -1 \end{bmatrix}$$

2(a). If all $k$ eigenvalues are distinct, $C$ will have $k$ linearly independent columns and so, as just shown,
$$C^{-1}AC = \Lambda \qquad \text{or} \qquad A = C\Lambda C^{-1} \tag{A.60}$$
The method of proof may be sketched for the $k = 2$ case. Assume the contrary result, that is, that the two eigenvectors are linearly dependent, so that one may write
$$b_1 c_1 + b_2 c_2 = 0$$
for some scalars $b_1$ and $b_2$, of which at least one is nonzero. Premultiply this linear combination by $A$ to obtain
$$b_1 A c_1 + b_2 A c_2 = (b_1\lambda_1)c_1 + (b_2\lambda_2)c_2 = 0$$
Multiplying the linear combination by $\lambda_1$ gives
$$(b_1\lambda_1)c_1 + (b_2\lambda_1)c_2 = 0$$
Subtracting from the previous equation, we find
$$(\lambda_2 - \lambda_1)b_2 c_2 = 0$$
The eigenvalues are different by assumption, and $c_2$, being an eigenvector, is not the null vector. Thus, $b_2 = 0$. Similarly, it may be shown that $b_1 = 0$, and so a contradiction is forced. Thus distinct eigenvalues generate linearly independent eigenvectors.
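Properties 1(a) and 1(b) can be illustrated numerically. The sketch below uses the two 2 × 2 matrices as reconstructed above (they satisfy the stated characteristic equations, though the exact entries are an assumption) together with a random symmetric matrix.

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [1.0, -1.0]])        # nonsymmetric: characteristic equation lambda^2 + 1 = 0
B = np.array([[1.0, -2.0],
              [-2.0, -1.0]])       # symmetric: eigenvalues +/- sqrt(5)

print(np.linalg.eig(A)[0])         # complex pair [0.+1.j, 0.-1.j]
print(np.linalg.eigvalsh(B))       # [-2.2360679...  2.2360679...]

S = np.random.default_rng(3).normal(size=(4, 4))
S = S + S.T                        # any real symmetric matrix
print(np.allclose(np.linalg.eigvals(S).imag, 0))   # True: its eigenvalues are all real
```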
2(b). The proof in 2(a) did not involve the symmetry of $A$ or the lack of it. Thus the diagonalization in Eq. (A.60) applies equally well to symmetric matrices. However, when $A$ is symmetric, the eigenvectors are not just linearly independent; they are also pairwise orthogonal. Consider the first two eigenvectors, as in
$$Ac_1 = \lambda_1 c_1 \qquad \text{and} \qquad Ac_2 = \lambda_2 c_2$$
Premultiplying the first equation by $c_2'$ and the second by $c_1'$ gives
$$c_2'Ac_1 = \lambda_1 c_2'c_1 \qquad \text{and} \qquad c_1'Ac_2 = \lambda_2 c_1'c_2$$
Transposing the second equation gives $c_2'A'c_1 = \lambda_2 c_2'c_1$. Thus, provided $A$ is symmetric,
$$\lambda_1 c_2'c_1 = \lambda_2 c_2'c_1$$
Since the eigenvalues are distinct, $c_2'c_1 = 0$. This result holds for any pair of eigenvectors, and so they are pairwise orthogonal when $A$ is symmetric. It is also customary in this case to normalize the eigenvectors to have unit length, $\|c_i\| = 1$, for $i = 1, 2, \ldots, k$. Let $Q$ denote the matrix whose columns are these normalized orthogonal eigenvectors. Then
$$Q'Q = I \tag{A.61}$$
From the definition and uniqueness of the inverse matrix it follows that
$$Q' = Q^{-1} \tag{A.62}$$
The matrix $Q$ is called an orthogonal matrix, that is, a matrix such that its inverse is simply its transpose. It follows directly from Eq. (A.62) that
$$QQ' = I \tag{A.63}$$
that is, although $Q$ was constructed as a matrix with orthogonal columns, its row vectors are also orthogonal. An orthogonal matrix is thus defined by
$$Q'Q = QQ' = I \tag{A.64}$$
For symmetric matrices the diagonalization may be written
$$Q'AQ = \Lambda \qquad \text{or} \qquad A = Q\Lambda Q' \tag{A.65}$$

3(a). When the eigenvalues are not all distinct, there are usually fewer than $k$ linearly independent eigenvectors.

As an example, consider
$$A = \begin{bmatrix} 3 & -2 \\ 0.5 & 1 \end{bmatrix}$$
The eigenvalues are $\lambda_1 = \lambda_2 = 2$, that is, a single root with multiplicity two. Substituting $\lambda = 2$ in $(A - \lambda I)c = 0$ gives
$$\begin{bmatrix} 1 & -2 \\ 0.5 & -1 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
This yields just a single eigenvector, $c_1' = [2 \;\; 1]$. The diagonalization in Eq. (A.60) is then impossible. However, it is possible to get close to diagonalization in the form of a Jordan matrix. In this example the Jordan matrix is
$$J = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}$$
It is seen to be upper triangular with the (repeated) eigenvalue displayed on the principal diagonal and the number 1 above the principal diagonal. There exists a nonsingular matrix $P$ such that
$$P^{-1}AP = J \qquad \text{or} \qquad A = PJP^{-1} \tag{A.66}$$
To find the $P$ matrix in this example, rewrite Eq. (A.66) as $AP = PJ$; that is,
$$A[p_1 \;\; p_2] = [p_1 \;\; p_2]\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}$$
Thus,
$$Ap_1 = 2p_1 \qquad \text{and} \qquad Ap_2 = p_1 + 2p_2$$
The first equation shows that $p_1$ is the eigenvector $c_1$, which has already been obtained. Substituting $\lambda = 2$ and $p_1' = [2 \;\; 1]$ gives $p_2' = [4 \;\; 1]$. The $P$ matrix is then
$$P = \begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix}$$
where each column has been normalized by setting the second element at 1. Some arithmetic shows that these matrices satisfy Eq. (A.66).

In the general case where $A$ has $s$ $(\leq k)$ independent eigenvectors, the Jordan matrix is block diagonal,
$$J = \begin{bmatrix} J_1 & & \\ & J_2 & \\ & & \ddots \\ & & & J_s \end{bmatrix}$$
Each block relates to a single eigenvalue and the associated eigenvector. If an eigenvalue has multiplicity $m$, the corresponding block has that eigenvalue repeated $m$ times on the principal diagonal and a series of 1s on the diagonal above the principal diagonal. All other elements are zero. If the eigenvalue is distinct, the block reduces to a scalar showing the eigenvalue. For example, if $k = 4$ and there are just two eigenvalues, one with multiplicity three, the Jordan matrix is
$$J = \begin{bmatrix} \lambda_1 & 1 & 0 & 0 \\ 0 & \lambda_1 & 1 & 0 \\ 0 & 0 & \lambda_1 & 0 \\ 0 & 0 & 0 & \lambda_2 \end{bmatrix}$$
When one or more roots are repeated, a nonsingular matrix $P$ can always be found to satisfy Eq. (A.66). The equations in (A.66) are perfectly general and not just applicable to the $k = 2$ case.
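Both the Jordan-form example in 3(a) and the orthogonal diagonalization (A.65) can be verified numerically. In the sketch below, A, P, and J are the matrices worked out above (the entries of A are reconstructed here from the stated eigenvalue, eigenvector, and p2), and the symmetric matrix S is arbitrary.

```python
import numpy as np

# the repeated-eigenvalue example: P^{-1} A P = J, Eq. (A.66)
A = np.array([[3.0, -2.0],
              [0.5,  1.0]])
P = np.array([[2.0, 4.0],
              [1.0, 1.0]])
J = np.array([[2.0, 1.0],
              [0.0, 2.0]])
print(np.allclose(np.linalg.inv(P) @ A @ P, J))     # True

# orthogonal diagonalization of a symmetric matrix, Eq. (A.65)
rng = np.random.default_rng(4)
S = rng.normal(size=(4, 4)); S = (S + S.T) / 2
lam, Q = np.linalg.eigh(S)              # eigh returns orthonormal eigenvectors
print(np.allclose(Q.T @ Q, np.eye(4)))              # Q'Q = I, Eq. (A.61)
print(np.allclose(Q.T @ S @ Q, np.diag(lam)))       # Q'SQ = Lambda
```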
If $\Lambda$ is a diagonal matrix of order $k$ containing all the eigenvalues, including repeats, one may easily see that
$$\operatorname{tr}(\Lambda) = \operatorname{tr}(J) \qquad \text{and} \qquad |\Lambda| = |J| \tag{A.67}$$

3(b). When $A$ is symmetric, the same result, Eq. (A.65), holds for repeated eigenvalues as for distinct eigenvalues.

The reason is that a root with multiplicity $m$ has $m$ orthogonal vectors associated with it (for a proof see G. Hadley, Linear Algebra, Addison-Wesley, 1961). As an illustration consider the matrix
$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$
The characteristic equation is $(1 - \lambda)^2(2 - \lambda) = 0$, with eigenvalues $\lambda_1 = \lambda_2 = 1$ and $\lambda_3 = 2$. For $\lambda_3$, $(A - \lambda_3 I)c = 0$ gives
$$\begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}c = 0$$
The first two elements in $c_3$ are thus zero, and the third element is arbitrary. So the eigenvector is any nonzero multiple of $c_3' = [0 \;\; 0 \;\; 1]$. The multiple eigenvalue gives
$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}c = 0$$
The third element in $c$ must be zero, but the other two elements are arbitrary. Denoting the arbitrary scalars by $b_1$ and $b_2$, we may write
$$c = b_1\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + b_2\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$
The eigenvalue with multiplicity 2 thus yields two orthogonal eigenvectors, $e_1$ and $e_2$. It is also seen that all three eigenvectors are mutually orthogonal. The diagonalization in Eq. (A.65) thus holds for all real symmetric matrices, whether or not the eigenvalues are distinct.

4. The sum of all the eigenvalues is equal to the trace of $A$.

From Eq. (A.59) we may write
$$\operatorname{tr}(\Lambda) = \operatorname{tr}(C^{-1}AC) = \operatorname{tr}(ACC^{-1}) = \operatorname{tr}(A) \tag{A.68}$$
The same method of proof works directly for symmetric matrices in Eq. (A.65), and also for Eq. (A.66), since we saw in Eq. (A.67) that $\operatorname{tr}(J) = \operatorname{tr}(\Lambda)$.

5. The product of the eigenvalues is equal to the determinant of $A$.

From property (v) of determinants,
$$|\Lambda| = |C^{-1}AC| = |C^{-1}||A||C| = |A| \tag{A.69}$$
The same method of proof works for the other two cases in Eqs. (A.65) and (A.66), noting that $|\Lambda| = |J|$ as shown in Eq. (A.67).

6. The rank of $A$ is equal to the number of nonzero eigenvalues.

It was established in Eq. (A.52) that premultiplication and/or postmultiplication of a matrix by nonsingular matrices leaves the rank of the matrix unchanged. Thus, in the first two diagonalizations, Eqs. (A.59) and (A.65),
$$\rho(A) = \rho(\Lambda) \tag{A.70}$$
The rank of $\Lambda$ is the order of the largest nonvanishing determinant that can be formed from its diagonal elements. This is simply equal to the number of nonvanishing eigenvalues. It also follows that the rank of $J$ is equal to the rank of $\Lambda$, and so the result holds for all three cases.

7. The eigenvalues of $I - A$ are the complements $(1 - \lambda)$ of the eigenvalues of $A$, but the eigenvectors of the two matrices are the same.

An eigenvalue and associated eigenvector for $A$ are given by $Ac = \lambda c$. Subtracting each side from $c$ gives
$$(I - A)c = (1 - \lambda)c \tag{A.71}$$
which establishes the result.

8. The eigenvalues of $A^2$ are the squares of the eigenvalues of $A$, but the eigenvectors of both matrices are the same.

Premultiplying $Ac = \lambda c$ by $A$ gives
$$A^2c = \lambda Ac = \lambda^2 c \tag{A.72}$$
which establishes the result.
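Properties 4 through 8 can be checked on a small matrix. The sketch below uses an arbitrary symmetric, nonsingular matrix so that all eigenvalues are real.

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.normal(size=(4, 4))
A = B @ B.T + np.eye(4)               # an arbitrary symmetric, nonsingular matrix
lam = np.linalg.eigvalsh(A)           # its (real) eigenvalues

print(np.isclose(lam.sum(), np.trace(A)))                        # property 4, Eq. (A.68)
print(np.isclose(lam.prod(), np.linalg.det(A)))                  # property 5, Eq. (A.69)
print(np.linalg.matrix_rank(A) == np.sum(~np.isclose(lam, 0)))   # property 6
print(np.allclose(np.sort(np.linalg.eigvalsh(np.eye(4) - A)),
                  np.sort(1 - lam)))                             # property 7, Eq. (A.71)
print(np.allclose(np.sort(np.linalg.eigvalsh(A @ A)),
                  np.sort(lam ** 2)))                            # property 8, Eq. (A.72)
```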
9. The eigenvalues of $A^{-1}$ are the reciprocals of the eigenvalues of $A$, but the eigenvectors of both matrices are the same.

From $Ac = \lambda c$ we may write $c = \lambda A^{-1}c$, so that
$$A^{-1}c = \frac{1}{\lambda}c$$
which establishes the result.

10. The eigenvalues of an idempotent matrix are either zero or one.

From Eq. (A.72), $A^2c = \lambda^2 c$. If $A$ is idempotent, $A^2c = Ac = \lambda c$. Thus
$$(\lambda^2 - \lambda)c = 0$$
and since any eigenvector $c$ is not the null vector,
$$\lambda = 0 \qquad \text{or} \qquad \lambda = 1$$

11. The rank of an idempotent matrix is equal to its trace.
$$\rho(A) = \rho(\Lambda) = \text{number of nonzero eigenvalues} = \sum_i \lambda_i = \operatorname{tr}(\Lambda) = \operatorname{tr}(A)$$
The first step comes from Eq. (A.70), the second from property 6, the third from property 10, and the last from Eq. (A.68).

A.2.14 Quadratic Forms and Positive Definite Matrices

A simple example of a quadratic form was given in the treatment of vector differentiation earlier in this appendix. In this section $A$ denotes a real symmetric matrix of order $k \times k$. A quadratic form is defined as
$$q = b'Ab$$
where $q$ is a scalar and $b$ is a nonnull $k \times 1$ vector. The quadratic form and matrix are said to be positive definite if $q$ is strictly positive for any nonzero $b$. The form and matrix are positive semidefinite if $q$ is nonnegative. There is an intimate link between the nature of the quadratic form and the eigenvalues of $A$.

1. A necessary and sufficient condition for the real symmetric matrix $A$ to be positive definite is that all the eigenvalues of $A$ be positive.

To prove the necessary condition, assume $b'Ab > 0$. For any eigenvalue and corresponding eigenvector, $Ac = \lambda c$. Premultiplying by $c'$ gives
$$c'Ac = \lambda c'c = \lambda$$
since the eigenvectors can be given unit length. Positive definiteness thus implies positive eigenvalues. To prove sufficiency, assume all eigenvalues to be positive. From Eq. (A.65),
$$A = C\Lambda C'$$
where $C$ is an orthogonal matrix of eigenvectors. For any nonnull vector $b$,
$$b'Ab = b'C\Lambda C'b = d'\Lambda d = \sum_i \lambda_i d_i^2$$
where $d = C'b$. Because $C$ is nonsingular, the $d$ vector is nonnull. Thus $b'Ab > 0$, which proves the result.

2. If $A$ is symmetric and positive definite, a nonsingular matrix $P$ can be found such that $A = PP'$.

When all eigenvalues are positive, $\Lambda$ may be factored into
$$\Lambda = \Lambda^{1/2}\Lambda^{1/2}$$
where $\Lambda^{1/2} = \operatorname{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_k})$. Then
$$A = C\Lambda C' = (C\Lambda^{1/2})(\Lambda^{1/2}C')$$
which gives the result with $P = C\Lambda^{1/2}$.

3. If $A$ is positive definite and $B$ is $s \times k$ with $\rho(B) = k$, then $B'AB$ is positive definite.

For any nonnull vector $d$,
$$d'(B'AB)d = (Bd)'A(Bd) > 0$$
The vector $Bd$ is a linear combination of the columns of $B$ and cannot be null, since the columns of $B$ are linearly independent. Setting $A = I$ gives $B'B$ as a positive definite matrix. In least-squares analysis the data matrix $X$ is conventionally of order $n \times k$ with rank $k$. Thus $X'X$, the matrix of sums of squares and cross products, is positive definite. Dividing by $n$ gives the sample variance-covariance matrix, which is thus positive definite. This result also holds for population or theoretical variance-covariance matrices provided there is no linear dependence between the variables.
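Two of the results above — that $X'X$ is positive definite when $X$ has full column rank, and that the rank of an idempotent matrix equals its trace — can be illustrated with the residual-maker matrix $M = I - X(X'X)^{-1}X'$, a symmetric idempotent matrix familiar from Chapter 3. The data below are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 30, 4
X = rng.normal(size=(n, k))                      # full column rank with probability one
XtX = X.T @ X

print(np.all(np.linalg.eigvalsh(XtX) > 0))       # all eigenvalues positive: X'X is positive definite
b = rng.normal(size=k)                           # any nonnull vector
print(b @ XtX @ b > 0)                           # the quadratic form is strictly positive

M = np.eye(n) - X @ np.linalg.solve(XtX, X.T)    # M = I - X(X'X)^{-1}X'
print(np.allclose(M @ M, M))                     # idempotent
lam = np.sort(np.linalg.eigvalsh(M))
print(np.allclose(lam[:k], 0) and np.allclose(lam[k:], 1))   # eigenvalues are 0 or 1
print(np.linalg.matrix_rank(M), round(np.trace(M)))          # rank = trace = n - k = 26
```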
