No Layman Left Behind

Linear Algebra: What matrices actually are

July 10, 2011 in Algebra, Linear Algebra, Mathematics
Most high school students in the United States learn about matrices and matrix multiplication, but they often are not taught why matrix multiplication works the way it does. Adding matrices is easy: you just add the corresponding entries. However, matrix multiplication does not work this way, and for someone who doesn't understand the theory behind matrices, this way of multiplying matrices may seem extremely contrived and strange. To truly understand matrices, we view them as representations of part of a bigger picture. Matrices represent functions between spaces, called vector spaces, and not just any functions either, but linear functions. This is in fact why linear algebra focuses on matrices. The two fundamental facts about matrices are that every matrix represents some linear function, and every linear function is represented by a matrix. Therefore, there is in fact a one-to-one correspondence between matrices and linear functions. We'll show that multiplying matrices corresponds to composing the functions that they represent. Along the way, we'll examine what matrices are good for and why linear algebra sprang up in the first place.
Most likely, if you've taken algebra in high school, you've seen something like the following:

Your high school algebra teacher probably told you this thing was a "matrix." You then learned how to do things with matrices. For example, you can add two matrices, and the operation is fairly intuitive:

You can also subtract matrices, which works similarly. You can multiply a matrix by a number:

Then, when you were taught how to multiply matrices, everything seemed wrong:

That is, to find the entry in the i-th row, j-th column of the product, you look at the i-th row of the first matrix and the j-th column of the second matrix, you multiply together their corresponding numbers, and then you add up the results to get the entry in that position. In the above example, the 1st row, 2nd column entry comes from multiplying the 1st row of the first matrix against the 2nd column of the second matrix, entry by entry, and adding. Moreover, this implies that matrix multiplication isn't even commutative! If we switch the order of multiplication above, we get
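The row-times-column rule is easy to check numerically. A minimal sketch in Python with NumPy, using illustrative matrices (the post's originals appeared as images and are not reproduced here):

```python
import numpy as np

# Illustrative stand-ins for the post's matrices (the originals were images).
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Entry (i, j) of the product is the dot product of row i of the first
# matrix with column j of the second.
def matmul_entry(X, Y, i, j):
    return sum(X[i, k] * Y[k, j] for k in range(X.shape[1]))

product = A @ B
assert product[0, 1] == matmul_entry(A, B, 0, 1)  # row 0 of A against column 1 of B

# Switching the order gives a different answer: multiplication is not commutative.
print(A @ B)
print(B @ A)
```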

How come matrix multiplication doesn't work like addition and subtraction? And if multiplication works this way, how the heck does division work? The goal of this post is to answer these questions.

To understand why matrix multiplication works this way, it's necessary to understand what matrices actually are. But before we get to that, let's briefly take a look at why we care about matrices in the first place. The most basic application of matrices is solving systems of linear equations. A linear equation is one in which all the variables appear by themselves with no powers; they don't get multiplied with each other or themselves, and no funny functions either. An example of a system of linear equations is

2x + y = 3
4x + 3y = 7

The solution to this system is x = 1, y = 1. Such equations seem simple, but


they easily arise in life. For example, let's say I have two friends Alice and Bob who went shopping for candy. Alice bought 2 chocolate bars and 1 bag of skittles and spent $3, whereas Bob bought 4 chocolate bars and 3 bags of skittles and spent $7. If we want to figure out how much chocolate bars and bags of skittles cost, we can let x be the price of a chocolate bar and y be the price of a bag of skittles, and these variables would satisfy the above system of linear equations. Therefore we can deduce that a chocolate bar costs $1 and so does a bag of skittles. This system was particularly easy to solve because one can guess and check the solution, but in general, with n variables and n equations instead of 2, it's much harder. That's where matrices come in! Note that, by matrix multiplication, the above system of linear equations can be rewritten as

[2 1] [x]   [3]
[4 3] [y] = [7]

If only we could find a matrix A^(-1), which is the inverse of the coefficient matrix A above, so that if we multiplied both sides of the equation (on the left) by A^(-1) we'd get

[x]            [3]
[y] = A^(-1) * [7]
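In Python with NumPy, the candy-price system above can be solved in exactly this way. A minimal sketch (forming the inverse mirrors the discussion; np.linalg.solve is what one would use in practice):

```python
import numpy as np

# Coefficient matrix and right-hand side of the candy-price system:
#   2x +  y = 3
#   4x + 3y = 7
A = np.array([[2.0, 1.0],
              [4.0, 3.0]])
b = np.array([3.0, 7.0])

# Multiplying both sides (on the left) by the inverse recovers the prices.
x = np.linalg.inv(A) @ b
print(x)  # both prices come out to 1

# Numerically preferred: solve the system directly instead of inverting.
assert np.allclose(np.linalg.solve(A, b), x)
```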
The applications of matrices reach far beyond this simple problem, but for now we'll use this as our motivation. Let's get back to understanding what matrices are. To understand matrices, we have to know what vectors are. A vector space is a set with a specific structure, and a vector is simply an element of the vector space. For now, for technical simplicity, we'll stick with vector spaces over the real numbers, also known as real vector spaces. A real vector space is basically what you think of when you think of space. The number line is a 1-dimensional real vector space, the x-y plane is a 2-dimensional real vector space, 3-dimensional space is a 3-dimensional real vector space, and so on. If you learned about vectors in school, then you are probably familiar with thinking about them as arrows which you can add together, multiply by a real number, and so on, but multiplying vectors together works differently. Does this sound familiar? It should. That's how matrices work, and it's no coincidence.

The most important fact about vector spaces is that they always have a basis. A basis of a vector space is a set of vectors such that any vector in the space can be written as a linear combination of those basis vectors. If v1, ..., vn are your basis vectors, then a1*v1 + ... + an*vn is a linear combination if a1, ..., an are real numbers. A concrete example is the following: a basis for the x-y plane is the vectors (1, 0) and (0, 1). Any vector is of the form (a, b), which can be written as

(a, b) = a*(1, 0) + b*(0, 1)

so we indeed have a basis! This is not the only possible basis. In fact, the vectors in our basis don't even have to be perpendicular! For example, the vectors (1, 0) and (1, 1) form a basis since we can write

(a, b) = (a - b)*(1, 0) + b*(1, 1)
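Finding the coefficients of a vector in a given basis amounts to solving a small linear system. A sketch in Python with NumPy, using the non-perpendicular basis (1, 0), (1, 1) as an illustration:

```python
import numpy as np

# Columns are the basis vectors (1, 0) and (1, 1); solving B @ c = v
# finds the coefficients of v in that (non-perpendicular) basis.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
v = np.array([3.0, 2.0])        # an arbitrary vector (a, b) = (3, 2)

coeffs = np.linalg.solve(B, v)  # expect (a - b, b) = (1, 2)
print(coeffs)

# Sanity check: the linear combination reconstructs v.
assert np.allclose(coeffs[0] * B[:, 0] + coeffs[1] * B[:, 1], v)
```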
Now, a linear transformation is simply a function between two vector spaces that happens to be linear. Being linear is an extremely nice property. A function f is linear if the following two properties hold:

f(x + y) = f(x) + f(y)
f(c*x) = c*f(x)

For example, the function f(x) = x + 1 defined on the real line is not linear, since f(x + y) = x + y + 1 whereas f(x) + f(y) = x + y + 2. Now, we connect together all the ideas we've talked about so far: matrices, bases, and linear transformations. The connection is that matrices are representations of linear transformations, and you can figure out how to write the matrix down by seeing how it acts on a basis. To
understand the first statement, we need to see why the second is true. The idea is that any vector is a linear combination of basis vectors, so you only need to know how the linear transformation affects each basis vector. This is because, since the function is linear, if we have an arbitrary vector v which can be written as a linear combination a1*v1 + ... + an*vn, then

f(v) = f(a1*v1 + ... + an*vn) = a1*f(v1) + ... + an*f(vn)

Notice that the value of f(v) is completely determined by the values f(v1), ..., f(vn), and so that's all the information we need to completely
define the linear transformation. Where does the matrix come in? Well, once we choose a basis for both the domain and the target of the linear transformation, the columns of the matrix will represent the images of the basis vectors under the function. For example, suppose we have a linear transformation f which maps R^3 to R^2, meaning it takes in 3-dimensional vectors and spits out 2-dimensional vectors. Right now f is just some abstract function which we have no way of writing down on paper. Let's pick a basis for both our domain (3-space) and our target (2-space, or the plane). A nice choice would be v1, v2, v3 for the former and w1, w2 for the latter. All we need to know is how f affects v1, v2, v3, and the basis for the target is for writing down the values concretely. The matrix for our function f will be a 2-by-3 matrix, where the 3 columns are indexed by v1, v2, v3 and the 2 rows are indexed by w1, w2. All we need to write down are the values f(v1), f(v2), f(v3). For concreteness, let's say

f(v1) = w1,  f(v2) = w1 + w2,  f(v3) = 2*w2

Then the corresponding matrix will be

[1 1 0]
[0 1 2]
The reason why this works is that matrix multiplication was designed so that if you multiply a matrix by the vector with all zeroes except a 1 in the i-th entry, then the result is just the i-th column of the matrix. You can check this for yourself. So we know that the matrix works correctly when applied to (multiplied with) basis vectors. But matrices also satisfy the same properties as linear transformations, namely A(x + y) = Ax + Ay and A(c*x) = c*Ax, where x, y are vectors and c is a real number. Therefore the matrix works for all vectors, so it's the correct representation of f. Note that if we had chosen different vectors for the basis vectors, the matrix would look different. Therefore, matrices are not natural in the sense that they depend on what bases we choose.
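The "columns are images of basis vectors" recipe can be sketched in NumPy. The concrete values below are assumptions for illustration (the post's own numbers were in an image): f(v1) = w1, f(v2) = w1 + w2, f(v3) = 2*w2, with the standard bases of R^3 and R^2:

```python
import numpy as np

# Matrix whose columns are the assumed images of the basis vectors:
# f(v1) = (1, 0), f(v2) = (1, 1), f(v3) = (0, 2).
F = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0]])

# Multiplying by a standard basis vector picks out the matching column.
e2 = np.array([0.0, 1.0, 0.0])
assert np.allclose(F @ e2, F[:, 1])

# Linearity means these columns determine f everywhere:
x = np.array([2.0, -1.0, 3.0])
by_columns = 2.0 * F[:, 0] - 1.0 * F[:, 1] + 3.0 * F[:, 2]
assert np.allclose(F @ x, by_columns)
```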
Now, finally, to answer the question posed at the beginning: why does matrix multiplication work the way it does? Let's take a look at the two matrices we had in the beginning, A and B. We know that these correspond to linear functions on the plane; let's call them f and g, respectively. Multiplying matrices corresponds to composing their functions. Therefore, doing (AB)v is the same as doing f(g(v)) for any vector v. To determine what the matrix AB should look like, we can see how it affects the basis vectors w1 and w2. The columns of B tell us g(w1) and g(w2) as combinations of w1 and w2, and applying f to those combinations (using linearity) gives the columns of the product: the first column of AB is f(g(w1)) and the second column is f(g(w2)). Indeed, this agrees with the answer we got in the beginning by matrix multiplication! Although this is not at all a rigorous proof, since it's just an example, it captures the idea of the reason matrix multiplication is the way it is.
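The claim that multiplying matrices composes their functions is easy to verify numerically. B below is the matrix for g recoverable from the comment thread (columns (1, 1) and (2, 0)); A is an illustrative stand-in, since the original first matrix was an image:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 1.0]])   # illustrative stand-in for f's matrix
B = np.array([[1.0, 2.0],
              [1.0, 0.0]])   # g's matrix, from the comments

f = lambda v: A @ v
g = lambda v: B @ v

# Composing the functions agrees with multiplying the matrices,
# on the basis vectors and hence (by linearity) everywhere.
for v in (np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([3.0, -2.0])):
    assert np.allclose(f(g(v)), (A @ B) @ v)
```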
Now that we understand how and why matrix multiplication works the way it does, how does matrix division work? You are probably familiar with functional inverses. The inverse of a function f is a function f^(-1) such that f(f^(-1)(x)) = x and f^(-1)(f(x)) = x for all x. Since multiplication of matrices corresponds to composition of functions, it only makes sense that the multiplicative inverse of a matrix is the compositional inverse of the corresponding function. That's why not all matrices have multiplicative inverses: some functions don't have compositional inverses! For example, the linear function mapping R^2 to R defined by f(x, y) = x + y has no inverse, since many vectors get mapped to the same value (what would f^(-1)(0) be? (0, 0)? (1, -1)?). This corresponds to the fact that the 1×2 matrix [1 1] has no multiplicative inverse. So dividing by a matrix A is just multiplication by A^(-1), if it exists. There are algorithms for computing inverses of matrices, but we'll save that for another post.
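A short NumPy sketch of both cases: an invertible matrix, where dividing means multiplying by the inverse, and a singular one with no inverse. The singular example here is a square stand-in (np.linalg.inv requires a square matrix, so the post's 1×2 example is ruled out for a separate reason):

```python
import numpy as np

# Invertible case: the candy-price coefficient matrix from earlier.
A = np.array([[2.0, 1.0],
              [4.0, 3.0]])
A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))   # composing f with f^(-1) gives the identity

# Singular case: a matrix that collapses distinct vectors to the same
# output has no inverse, and NumPy reports it.
S = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # sends both (1, -1) and (0, 0) to (0, 0)
try:
    np.linalg.inv(S)
except np.linalg.LinAlgError:
    print("singular matrix has no inverse")
```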

30 comments
David Miles (May 5, 2012 at 8:44 am):
Wonderful post, thank you. This is almost exactly what I was looking for. Now, I have to try to translate aspects of this for high school students. I wonder if the complexity of this is part of the reason that matrices have been removed from the IB DP's new mathematics curriculum.

gary (November 28, 2012 at 1:51 am):
Great post! This helps me a lot in understanding my upper division courses of linear algebra!

Charles Peezy (March 21, 2013 at 4:26 pm):
You have rows and columns confused. Rows are horizontal, columns are vertical.

Alan Guo (March 21, 2013 at 4:58 pm):
Yes, rows are horizontal, columns are vertical. Where in the article do I make a mistake?

Metro Man (May 6, 2013 at 11:59 am):
Amazing article! Thanks!
Addae (July 5, 2013 at 11:55 am):
I am a little confused: how does f(g(w1)) = f(w1 + w2)? And for the second column, why do you do f(g(w2)) = f(2*w1)?

Alan Guo (July 6, 2013 at 7:20 am):
Recall that g is defined to be the function represented by the matrix B, whose first column is (1 1) and second column is (2 0) in the basis w1 and w2. The first column tells us what g(w1) is and the second column tells us what g(w2) is. In particular, it tells us g(w1) = 1*w1 + 1*w2 and g(w2) = 2*w1 + 0*w2.

Addae (July 6, 2013 at 5:50 pm):
Thanks, I sort of saw that. But my real question is: where do you plug in w1 and w2? Am I missing something basic? Sorry for the inconvenience. Thanks for answering, and so quickly as well.
Alan Guo (July 6, 2013 at 8:20 pm):
Hm, I'm not sure I completely understand your question. But I'll say some stuff, and if you're still confused, let me know.
We know, with respect to the basis w1 and w2, we have g(w1) = w1 + w2 and g(w2) = 2*w1. In other words, whenever we see g(w1), we can replace that with w1 + w2, since they're equal, and similarly we can replace g(w2) with 2*w1. Therefore, f(g(w1)) = f(w1 + w2), since we just substitute w1 + w2 for g(w1) inside f, and similarly f(g(w2)) = f(2*w1).

Addae (July 6, 2013 at 10:46 pm):
Alright, I know that g(w1) = w1 + w2. So with a regular function g(x) = 5x + 3, when you write g(w1) you get g(w1) = 5*w1 + 3. Where do you actually plug in w1 into g if g(x) = w1 + w2? Let's say w1 was (2, 0) instead of (1, 0); how would that change g(w1)? Everything else makes sense, I'm just getting lost in the details is all.
Alan Guo (July 7, 2013 at 6:12 am):
Ah, I think I understand your question now. w1 and w2 are not variables; they are actual specific vectors in the plane that I've chosen. So when I say g(w1) = w1 + w2 and g(w2) = 2*w1, what I mean is, I've chosen some basis w1, w2 for the domain (and range). Every x can be written as a*w1 + b*w2, and so by linearity of g, we have

g(x) = g(a*w1 + b*w2) = a*g(w1) + b*g(w2) = a*(w1 + w2) + b*(2*w1) = (a + 2b)*w1 + a*w2.

If we choose to represent w1 = (1, 0) and w2 = (0, 1), what that's saying is

g(a, b) = (a + 2b, a)

which can also be read off the matrix B.

Now, suppose we choose a different basis v1 = (2, 0) = 2*w1 and v2 = (0, 1) = w2. Then, with respect to this new basis v1, v2, we have

g(v1) = g(2*w1) = 2*g(w1) = 2*w1 + 2*w2 = v1 + 2*v2
g(v2) = g(w2) = 2*w1 = v1

and so the new matrix B' for g with respect to this new basis would have first column (1 2) and second column (1 0).

Addae (July 10, 2013 at 6:47 pm):
Took me a while to process... Okay, so I understand g(a, b) = (a + 2b, a); w1 is a basis vector and w2 is another basis vector. Then when you put those basis vectors into the g(a, b) equation you get your solutions, and they are clearly seen from the B matrix. So the method of g(x) with x being a variable is cleared up. By this method I get it so much more, thank you.
When you go into matrix notation I get lost:

w1 = (1, 0)
w2 = (0, 1)
g represents
[1 2]
[1 0]

So when you say g(w1), are you calling upon the vector (1, 1), or the first column that is made by the linear combination 1*w1 + 1*w2? Is w1 similar to the notation of v1 that you used earlier?
If I understand, in algebraic terms that means that g(x) can be expressed as a linear combination of basis vectors w1, w2, where they are (1, 0) and (0, 1) (where x is a vector):
g(x) = 1*w1 + 2*w2
so then what happens when you put in w1?
g(w1) = ?
Every x can be written as a*w1 + b*w2.
Took me a while to form my questions; this seems very abstract. Thank you for helping.
Alan Guo (July 10, 2013 at 8:32 pm):
Matrix notation only has meaning when you specify a basis. For example, when I write a matrix A as

[a b]
[c d]

what that really means is I've fixed a basis v1, v2 for the domain V and a basis w1, w2 for the codomain W, and the matrix A represents the linear function f defined by

f(v1) = a*w1 + c*w2
f(v2) = b*w1 + d*w2

This uniquely specifies how f behaves on the entire domain V, since every vector v in V can be written uniquely as x*v1 + y*v2 for some scalars x, y. So you can think of v as a variable, which is really parametrized by the two variables x, y. Then, by linearity,

f(v) = f(x*v1 + y*v2)
     = x*f(v1) + y*f(v2)
     = (ax + by)*w1 + (cx + dy)*w2

which is the same as when you multiply the column vector (x, y) by the matrix A:

[a b] [x]   [ax + by]
[c d] [y] = [cx + dy]

Note that the column vector (x, y) on the left hand side is written in the (v1, v2) basis, so it represents the vector x*v1 + y*v2, whereas the column vector (ax + by, cx + dy) on the right hand side is written in the (w1, w2) basis, so it represents the vector (ax + by)*w1 + (cx + dy)*w2.
In my examples, I conveniently chose the same basis w1, w2 for both the domain and the codomain.
So anyway, to answer your specific question: when I say g(w1), what I mean is, w1 is a vector which, in the basis w1, w2, is written as 1*w1 + 0*w2, denoted by the column vector (1, 0), and g(w1) means applying g to the vector (1, 0), so multiply the matrix B by (1, 0), which will give you (1, 1), so g(w1) = 1*w1 + 1*w2.
Alex (August 29, 2015 at 11:51 am):
I had the same question as Addae (I think). The way I would put it: it *seems* weird that g(w1) = w1 + w2, because "normally" when you define a function g(x), the RHS involves only the variable x, e.g. g(x) = 2*x. However, for something like g(x) = 2*x + 5*y, one might react, "Wait, where does y come from? How do you get any sort of y from x?" (Is that what you mean, Addae?)
However, if I understand you correctly, Alan, I think g(w1) has a bit different meaning. It's more like: when I apply the function g to the basis vector w1, what new vector do I get, as a linear combination of the basis vectors of the vector space... NOT necessarily from just w1. Does that clear it up?

Jeremy Hansbrough (October 14, 2013 at 9:46 pm):
Hi,
If you have a linear transformation that's one-to-one and onto, then the basis vectors span the space and send every vector in the domain to a unique vector in the codomain. The codomain is the same as the range... Meaning that ker(T) = {0}, and that Im(T) = V, where V is the domain...
So if a set of vectors doesn't span its domain, then the kernel spans a dimension that is sent to 0 by definition. How does this relate to matrix multiplication?
So if you have a matrix where the vectors making it up are linearly dependent, such as:

[ 1 -1 -1]
[-1  2  3]
[-2  1  0]

all three vectors only span a two-space, because one can be expressed in terms of the others. Is there a way to argue that a linear transformation isn't one-to-one simply because of the geometry of spanning? How does this relate to matrix multiplication?
Alan Guo (October 16, 2013 at 8:15 pm):
Yes, the kernel of the matrix is intimately related to the geometry of the vectors making up the matrix. In particular, any nonzero linear combination of the columns of the matrix which yields zero (a.k.a. a linear dependence relation) is a member of the kernel of the matrix. For instance, in your example matrix, if a, b, c are the column vectors of your matrix, then we see that a + 2b - c = 0, so the column vector (1, 2, -1) is in your kernel. In fact, multiplying the column vector (x, y, z) by the matrix exactly gives you the vector x*a + y*b + z*c, so the kernel is nontrivial if and only if the columns are linearly dependent.
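The dependence relation in this reply can be checked numerically; a small NumPy sketch of the example matrix:

```python
import numpy as np

# With columns a, b, c of the example matrix, a + 2b - c = 0,
# so the vector (1, 2, -1) lies in the kernel.
M = np.array([[ 1.0, -1.0, -1.0],
              [-1.0,  2.0,  3.0],
              [-2.0,  1.0,  0.0]])
a, b, c = M[:, 0], M[:, 1], M[:, 2]

assert np.allclose(a + 2*b - c, 0)
assert np.allclose(M @ np.array([1.0, 2.0, -1.0]), 0)

# A nontrivial kernel is equivalent to a zero determinant:
assert np.isclose(np.linalg.det(M), 0)
```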

Addae (October 16, 2013 at 2:30 pm):
I finally get it entirely. Thanks a lot. I took a course over the summer to help me out. I started doing some work with matrices and some work with quaternions. Then I decided I would give this a go again, and it is surprisingly simple now. What you were doing was expressing a column of g as a linear combination of the basis vectors. After seeing what that does to the basis vectors, you put that answer through f and see how it affects it relative to the basis vectors. Because they all share the same basis vectors, this approach works. What were to happen if the basis vectors are not the same for both matrices? I'm guessing you would use the basis vectors of the first matrix g, and see how f transforms them.
Just a quick question: I was wondering if you have done much in quaternion algebra and if I could message you sometime about it. If so, could you e-mail me? Don't want to flood your comment section any more than I have. Thanks a lot for clearing things up, and for spending the time to explain the concept to me back then.

Alan Guo (October 16, 2013 at 8:17 pm):
It's great to hear that these things are clear now! No, I haven't done any work in quaternion algebras.

Juxhino (September 12, 2014 at 6:52 am):
Really good and straightforward article. Thank you!

Gideon (August 29, 2015 at 9:49 am):
Hi, great post. One question: aren't there multiple matrix representations for a given linear function? Doesn't this mean that it's a one-to-many relationship, not one-to-one? Thanks again for writing this!

menomnon (August 29, 2015 at 12:42 pm):
I don't see either simultaneous equations or Gaussian elimination mentioned?
Peter Varga (September 13, 2015 at 12:29 am):
Thank you for the excellent explanation. Another way to prove this point is geometric algebra: draw a few arrows and the apt student would see how the functions and vectors in a space correspond.
This article gave me the inspiration to solve a problem I was stuck on. Thank you again.
rohantmp (October 12, 2015 at 9:01 am):
Wow, this helped so much. I don't see why anyone would teach matrices without explaining this. Nowhere else in any ~"Intro to Matrices" sort of thing have I found anything nearly like this. Thank you so much!

Arkadeep Mukhopadhyay (March 5, 2016 at 4:16 am):
Very intuitive and helpful. I cordially invite you to visit the blog Antarctica Daily.
hehe (May 24, 2016 at 6:55 pm):
thanks

LEAVE A REPLY

Enter your comment here...

« Intro Post: Alan Combinatorics: Pow er of generating functions »

Search

Subscribe to feed. Create a free w ebsite or blog at W ordPress.com.


Ben Eastaugh and Chris Sternal-Johnson.

Follow

F O L L O W
L A Y M A N
B E H I N D ”
converted by W eb2PDFConvert.com

You might also like