
MatLab Tutorial

Draft

Anthony S. Maida

October 8, 2001; revised September 20, 2004; March 23, 2006

Contents
1 Introduction
  1.1 Is MatLab appropriate for your problem?
  1.2 Interpreted language
    1.2.1 Command-line shell
    1.2.2 Loading scripts from files

2 Vectorizing code
  2.1 Matrices
    2.1.1 Statement terminators and output suppression
    2.1.2 Some matrix operators
    2.1.3 Flexible matrix access
    2.1.4 Loading data via a script
  2.2 Vector operations
    2.2.1 Examining the workspace
  2.3 Applying functions to matrix elements
    2.3.1 Vectorizing a feedforward network for one epoch
  2.4 Defining functions

3 Plotting and visualization
  3.1 Simple plotting
    3.1.1 Application to neural networks
  3.2 3D Plots
  3.3 Surface plots
  3.4 Other plotting commands

A Appendix: Matrix notation
  A.1 Matrix multiplication
  A.2 Definition of transpose

1 Introduction
MATLAB stands for matrix laboratory, and the name is a trademark of The MathWorks
Incorporated. The purpose of this document is to give you enough background so that you
can effectively use the MATLAB built-in help system to solve your programming problems.

1.1 Is MatLab appropriate for your problem?


MATLAB is effective in small applications where the appropriate data representations are
vectors and matrices. Vectors and matrices are the appropriate data structures for nearly all
non-trivial problems in scientific computing. MATLAB is especially useful for interactively
plotting functions and visualizing scientific data. MATLAB is commercial software.[1]

1.2 Interpreted language


MATLAB is an interpreted language; hence, its programs are called scripts. To achieve
efficiency, MATLAB allows you to vectorize your code by using matrices and their
associated operators. The interpretation overhead for a vectorized instruction is then small
compared to the amount of computation that the instruction does.

1.2.1 Command-line shell


The user has a choice of using MATLAB interactively via a command-line shell
(read-eval-print loop) or by loading prewritten scripts from files. The language is designed (e.g.,
variables need not be declared) so that commands evaluated in the read-eval-print loop will
also work when loaded from a file. This facilitates rapid prototyping and debugging because
you can test your command before putting it into a file. A simple command-line interaction
session is shown below.

>> 2 + 2
ans = 4

The command-line prompt is >>. The system reads the expression 2 + 2, evaluates it,
stores the result in a default variable ans, and prints the value of this variable. The variable ans
always holds the result of the most recent computation that was not explicitly assigned to a
variable. For instance, if we continue the session as shown below, the value of ans doesn’t
change.

>> x = 2 + 3
x = 5
>> ans
ans = 4

[1] If you cannot afford MATLAB, then GNU Octave (www.octave.org) is an alternative for interactive
matrix computations and gnuplot (www.gnuplot.org) is an alternative for plotting and visualizing
data.

At this point the workspace has two variables, x and ans. The development environment
has a workspace editor which allows you to examine the variables in the workspace and how
much space they consume. You can clear the workspace to its original state by typing the
command clear.
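The workspace can also be examined from the command line. The sketch below is a minimal illustration, assuming only the standard commands who, whos, clear, and exist; the names savedAns and xExists are made up for the example.

```matlab
x = 2 + 3;                      % creates the variable x
2 + 2;                          % unassigned result is stored in ans
savedAns = ans;                 % savedAns is 4
who                             % prints the names of the workspace variables
whos                            % prints names plus size, bytes, and class
clear x                         % removes only x; a bare "clear" removes everything
xExists = exist('x', 'var');    % 0 means x is no longer a variable
```

Clearing a single variable by name is often enough while experimenting; the bare clear command is the heavier tool described in the text.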

1.2.2 Loading scripts from files

You can use the MATLAB built-in editor to write scripts, save them to files, and then have
them loaded and evaluated as if you typed the commands directly into the command line.
Also, while in the command-line shell, you can type the name of a file containing a MATLAB
script (without its extension) to achieve the same effect. Text files containing MATLAB code are called m-files and
have the file extension “.m”.
As you develop a script, you will follow a cycle of editing and loading. When you
reload your script, the workspace from the previous cycle will still be in effect unless you
clear it. You probably want to clear it to avoid subtle reinitialization errors. The best way
to do this is to put the clear command as the first line of the script, so that the script runs
in a clean workspace.
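As a small sketch of this habit, the hypothetical script below (the file name demo_script.m is invented for illustration) starts by clearing the workspace, so that every reload begins from the same state.

```matlab
% File: demo_script.m (hypothetical name)
clear;                   % start from an empty workspace on every reload
w = [1 2 3; 4 5 6];      % semicolon: output suppressed
p = [1; 0; 0];
n = w * p                % no semicolon, so n is printed when the script runs
```

Typing demo_script at the prompt would then evaluate these lines in order.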

2 Vectorizing code
Interpreted MATLAB scripts achieve efficiency in speed of execution by using large instructions
to reduce decoding overhead and by using space to decrease time. For a given problem
size, MATLAB scripts use a lot of memory. Thus MATLAB trades memory for speed.

2.1 Matrices
The primary data structure in MATLAB is a matrix. The appendix in this tutorial has more
information about matrices. You can create and initialize a matrix by typing the matrix
values in an assignment statement. The example below creates a 2 × 3 matrix of
double-precision floating-point values and then sets the variable w to reference the matrix.[2] Notice
that the variable w was not previously declared. The rows of the matrix are terminated with
semicolons. All the commands below create the same matrix.

>> w = [1 2 3; 4 5 6]
>> w = [1 2 3; 4 5 6;]
>> w = [1, 2, 3; 4, 5, 6]
>> w = [1, 2, 3; 4, 5, 6;]

MATLAB will respond by echoing the value of w. That is, it will print w and the matrix
of values. In this example, terminating a line with a semicolon character will suppress the
output.[3]

[2] Numbers in MATLAB are double-precision floating-point by default.

2.1.1 Statement terminators and output suppression


A newline terminates a statement unless you are typing in a matrix. Three consecutive
periods are the command continuation operator, which signals that you are typing a command
which extends across more than one line. For matrices, the open-square-bracket character
signals that you are entering a matrix. The process of entering a matrix can extend over
more than one line and is terminated with the close-square-bracket character. If you are
entering assignment statements into a file, then normally you would terminate them with a
semicolon (to suppress output) unless you want the value of some variable to be printed as
the file is evaluated.[4]
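The rules above can be tried out with a short sketch like the following. The variable names total, m, and s are made up for illustration.

```matlab
% A long statement continued onto a second line with three periods
total = 1 + 2 + 3 + ...
        4 + 5;

% A matrix entered over several lines; a newline inside the brackets ends a row
m = [1 2 3
     4 5 6];

% Trailing semicolons suppress output; omit one to echo a value
s = size(m)
```

Only the last statement prints anything, because it is the only one without a trailing semicolon.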

2.1.2 Some matrix operators


The mathematical notation for the transpose of a matrix w is $w^T$. In MATLAB, the apostrophe
symbol is used instead. For instance, w' will yield the transpose of w. The dimensions of
w and w' are compatible, so the matrices can be multiplied, as shown below.
>> w = [1 2 3; 4 5 6];
>> w*w'
ans =
    14    32
    32    77

Of course, you can also add two compatible matrices using the “+” operator. There are
other more subtle operators. Suppose that you want to square each element in a matrix.
There is an operator, “.*”, called array multiply, that multiplies corresponding
elements of two m × n matrices to yield a new m × n matrix. In the example below, we
use the operator to square the elements of w.
>> w = [1 2 3; 4 5 6];
>> w.*w
ans =
     1     4     9
    16    25    36
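To make the difference between array multiply and matrix multiply concrete, here is a small sketch on a square matrix, where both operations are defined. The element-wise power operator “.^” is not mentioned in the text; it is shown only as a related standard operator.

```matlab
a = [1 2; 3 4];
elementwise = a .* a;    % [1 4; 9 16]: each element squared
matrixProd  = a * a;     % [7 10; 15 22]: rows times columns
squared     = a .^ 2;    % element-wise power, same result as a .* a
```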

2.1.3 Flexible matrix access


You can access an element of the matrix w in the previous example by using an expression
of the form w(i,j), where i indicates the row and j indicates the column. MATLAB is
very flexible in allowing you to access information in matrices. For the matrix w, you
can treat it as a six-element vector by referencing it using the expression “w(:)”. Type this
into the command shell to see what happens.
You also have full access to rows and columns in the matrix. For instance, the expression
“w(2,:)” gives you the second row of the matrix and the expression “w(:,2)”
gives you the second column. Type these into the command shell to see what happens. You
can delete the second column of w by assigning the empty matrix to “w(:,2)”:
>> w(:,2) = []
w =
     1     3
     4     6

[3] The function of the semicolon operator is context dependent. In a matrix, it terminates a row. At the end of
a line, it suppresses output. The statement terminator is the newline character. The line continuation operator
is three consecutive periods (...).

[4] The semicolon is a statement terminator in the programming languages C, C++, and Java but is an output
suppression operator in MATLAB.

In most of the examples that follow, I will leave out the command-line prompt.
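For readers who prefer to see the results of the access expressions from this section written out, the sketch below collects them. The variable names on the left are made up for illustration.

```matlab
w = [1 2 3; 4 5 6];
asVector  = w(:);      % column-major order: [1; 4; 2; 5; 3; 6]
secondRow = w(2, :);   % [4 5 6]
secondCol = w(:, 2);   % [2; 5]
element   = w(2, 3);   % 6
w(:, 2) = [];          % w is now [1 3; 4 6]
```

Note that w(:) walks down the columns first, which is how MATLAB stores matrices.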

2.1.4 Loading data via a script


It is often very useful to generate data in a traditional language such as C++ or Java and
then visualize it using MATLAB. There is a very easy way to do this by writing the output
data in the form of a MATLAB script. Afterwards, simply executing the script will load the
data into MATLAB. The example below illustrates this.
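// Prints the current state as a MATLAB assignment to slice cyc of a 3-D array.
// Assumes data is a static two-dimensional numeric array field of the enclosing class.
static void printData(int cyc) {
    System.out.println("data(:, :, " + cyc + ") = [");
    for (int r = 0; r < data.length; r++) {
        for (int c = 0; c < (data[r].length - 1); c++)
            System.out.print(data[r][c] + " ");
        System.out.println(data[r][data[r].length - 1] + ";");
    }
    System.out.println("];");
}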
In the above Java method, suppose the variable data is a 6 × 12 array of zeros and
ones, representing, say, the current state of a cellular automaton, such as Conway’s
game of life. Thus the array gets a set of new values on each cycle of the game. Let’s
assume the method printData is executed on each cycle in order to print the current
state of the game. Possible output for the first cycle is shown below.
data(:, :, 1) = [
1 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 1 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 1 0 0 0;
0 0 0 0 0 0 1 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0;
];

data(:, :, 2) = [
...

Notice that the Java program embedded the output within a MATLAB script where the MATLAB
variable data is a three-dimensional array. The third dimension holds the cycle number
and the first two dimensions hold the state of the two-dimensional array of cellular
automata for that cycle. Suppose several cycles of this output are sent to a file named data.m.
Then from within MATLAB this data can be loaded simply by typing the one-line command
data, which is an instruction to evaluate the script named data.m. Once loaded into
MATLAB, any number of tools can be used to manipulate, examine, and visualize the data.
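As a rough sketch of what can be done once such a script has run, the example below uses two tiny made-up 2 × 3 frames in place of the real data.m file, then inspects the resulting three-dimensional array. The names dims, frame2, and liveCounts are invented for illustration.

```matlab
% Stand-in for running the generated script data.m: two 2-by-3 frames
data(:, :, 1) = [
1 0 0;
0 0 1;
];
data(:, :, 2) = [
0 1 0;
0 0 0;
];

dims       = size(data);       % [2 3 2]: rows, columns, cycles
frame2     = data(:, :, 2);    % the 2-by-3 state for cycle 2
% Count the live cells in each cycle by summing over rows, then columns
liveCounts = squeeze(sum(sum(data, 1), 2))';   % [2 1]
```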

2.2 Vector operations
Let us interpret the matrix w as the weight matrix for the first layer of a two-unit neural
network with three input features to each unit. The matrix has two rows and each row codes
the three weight values for one unit. Then if we have an input pattern vector p = [1, 0, 0]
(represented as a column vector[5]), we can compute the net input for this layer with one
matrix multiplication, as shown below in the last line.

p = [1; 0; 0];
w = [1 2 3; 4 5 6];
n = w*p;

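Each component $n_i$ of n represents the net input for one neuron in this layer. Each $n_i$
is equivalent to the sum below, where row i of w holds the weights of unit i and N is the
number of input features (here N = 3).

$$n_i = \sum_{j=1}^{N} w_{i,j} \, p_j$$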

If we were to compute these sums using for-loops, the interpreter would have to decode
and execute six different assignment statements. In this example, the operation of matrix
multiplication vectorizes a double-nested for-loop, so that only one statement is needed to
calculate it.
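The equivalence between the nested loops and the single matrix multiplication can be checked with a short sketch. The names nLoop and nVec are made up for illustration.

```matlab
p = [1; 0; 0];
w = [1 2 3; 4 5 6];

% Loop version: one multiply-accumulate per weight (six in total)
nLoop = zeros(2, 1);
for i = 1:2
    for j = 1:3
        nLoop(i) = nLoop(i) + w(i, j) * p(j);
    end
end

% Vectorized version: one statement for the interpreter to decode
nVec = w * p;
```

Both versions produce the column vector [1; 4] for these inputs.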

2.2.1 Examining the workspace

MATLAB has excellent run-time debugging facilities. In an interpreted language, you get
run-time errors that would be caught at compile time in a compiled language, so excellent
run-time debugging is a necessity in an interpreted language.
The current workspace has three variables (p, w, and n) which reference matrices. In the MATLAB
development environment, you have direct access to inspect and edit the objects in this
workspace. Depending on your platform, the workspace inspector is probably under the
window menu. When you access this inspector, you will see a list of variable names and
the amount of space their associated objects consume. If you click on one of these names,
you will be able to inspect the array contents using a spreadsheet-like interface. You can
even change the values of entries in a matrix.
If you are debugging a neural network program, you can watch the weights evolve using
this editor.

2.3 Applying functions to matrix elements


To complete the feedforward network computation, let us apply the sigmoidal activation
function to each element of the net input array in the previous example.

[5] Vectors are a different kind of object than matrices. When we use vectors and matrices at the same time,
we need to find a way to treat them uniformly. The convention is to treat an n-component vector as an n × 1
matrix. So, a vector with three components is treated as a 3 × 1 matrix. Notice that these matrices have n rows
and one column. For this reason, they are called column vectors.

p = [1; 0; 0]
w = [1 2 3; 4 5 6]
n = w * p
a = 1 ./ (1 + exp(-4*n))

This example is the same as the previous one except that one more line was added. Let us
explain what that line does, working outward from the innermost expression, n.
First, we multiplied the matrix n by the scalar −4. Next we applied the exponential
function, exp, to the matrix. This has the effect of applying the function to each element
of the matrix. Notice that whereas matrices are signaled by square brackets, function
application is signaled by parentheses. The next step was to add one to the matrix of results.
Notice that adding the scalar 1 to a matrix has the effect of adding one to each element of the
matrix. How does this happen? Conceptually, MATLAB converts the scalar 1 to a matrix of ones whose
dimensions match the argument on the other side of the operator (in this case +). After
this, MATLAB applies the matrix operation of addition. The symbol “./” stands for array
divide and is the division equivalent of “.*”. The 1 in the numerator again gets converted
to a matrix of ones whose dimensions match those on the other side of the operator. Then
the matrix of reciprocals is computed.
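To see the scalar expansion described above spelled out, the sketch below writes the matrices of ones by hand and compares the result with the compact form. The names onesLike and aExplicit are made up, and the explicit version is only a conceptual illustration, not a claim about how MATLAB implements the operation internally.

```matlab
n = [1; 4];
a = 1 ./ (1 + exp(-4 * n));

% The same computation with the scalar expansion written out by hand
onesLike  = ones(size(n));                          % matrix of ones shaped like n
aExplicit = onesLike ./ (onesLike + exp(-4 * n));
```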

2.3.1 Vectorizing a feedforward network for one epoch


The earlier example vectorized the presentation of one pattern to the network. If the number
of training patterns is small, then we can vectorize the presentation of all the training
patterns to the network. For this example, assume that we are training a network to learn a
two-input boolean concept such as XOR (the desired outputs below are those of XOR).
We are looking at the first layer of the network.
The first layer consists of two units, each with two inputs.
inputs = [0 0; 0 1; 1 0; 1 1]'; % transposed
desiredOuts = [ 0 1 1 0];
onesVec = [1 1 1 1];
wts = [ .1, -.2; .1, -.1];
biasWts = [ .1; .1];
netInputs = (wts * inputs) + (biasWts * onesVec);
outputs = 1 ./ (1 + exp(-4*netInputs));

Notice that wts is a 2 × 2 matrix and that inputs is a 2 × 4 matrix. Multiplying wts
with inputs yields a 2 × 4 matrix. The matrix biasWts is 2 × 1 and onesVec is 1 × 4.
Multiplying these together yields a 2 × 4 matrix. The matrix netInputs is therefore 2 × 4
and should be interpreted as follows. Each column of the matrix codes the net-input values
of the two units for one input pattern (and each column of outputs codes the corresponding
output values). Since there are four input patterns, there are four
columns.
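As a check on this interpretation, the sketch below recomputes the net inputs one pattern (column) at a time and compares them with the vectorized result. The name netLoop is made up for illustration.

```matlab
inputs    = [0 0; 0 1; 1 0; 1 1]';    % 2-by-4, one column per pattern
onesVec   = [1 1 1 1];
wts       = [.1, -.2; .1, -.1];
biasWts   = [.1; .1];
netInputs = (wts * inputs) + (biasWts * onesVec);

% Same result, computed one pattern at a time
netLoop = zeros(2, 4);
for k = 1:4
    netLoop(:, k) = wts * inputs(:, k) + biasWts;
end
```

The product biasWts * onesVec simply copies the bias column once per pattern, which is why the loop can add biasWts directly.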

2.4 Defining functions


MATLAB uses the call-by-value parameter passing style. Functions can have side effects
if the variables they use are declared global, under the same name, both in the function
body and outside the function.

When you define a function in MATLAB, it should work with vectorized code. The
purpose of this section is to show how to define functions that work with vectorized code.
Let’s start with a simple example. We shall write a function to compute the logistic
sigmoid function, as defined below.

$$\mathrm{logsig}(x) = \frac{1}{1 + e^{-x}}$$

MATLAB (without add-on toolboxes) does not have a built-in logistic sigmoid function. Here is how to implement it.

function f = logsig(x)
f = 1 ./ (1 + exp(-x));

Normally, this function is placed in a file named logsig.m. Notice that the function does
not have the return statement characteristic of C, C++, or Java. The function returns
when it reaches the end of its body. The return value is the value of the variable f, which
was named as the output variable in the function declaration. It is also customary not to indent the body of the
function. The function is also designed to work either with scalars or with arrays. That is,
you should be able to issue the function invocation logsig(1.5) to apply the function to
1.5, or the invocation logsig([1 2]) to apply the function to each element of the matrix
[1 2].
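To try the same formula without creating a file, an anonymous function handle can stand in for the m-file. This is only a sketch: the name logsigSketch is made up, and it is not the function file described above.

```matlab
% Anonymous-function stand-in for the logsig m-file
logsigSketch = @(x) 1 ./ (1 + exp(-x));

atZero  = logsigSketch(0);           % 0.5
onArray = logsigSketch([-1 0 1]);    % 1-by-3 result, one value per element
```

Because the body uses “./” rather than “/”, the same handle works for scalars and arrays alike.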
The next example implements a piecewise-constant (step) function and is a bit more tricky to
vectorize. MATLAB (without add-on toolboxes) does not have a built-in symmetric hard-limit function hardlims,
which is defined below.

$$\mathrm{hardlims}(x) = \begin{cases} -1 & x < 0 \\ +1 & x \ge 0 \end{cases}$$

MATLAB does have the built-in function sign, which returns −1, 0, or 1. An example of its
use is given below.

>> sign([-2 0 2])
ans = -1 0 1

This function is similar to hardlims, with one difference. The sign function returns 0 when
its argument is zero, whereas the hardlims function returns 1 when its argument is zero.
Here is how to define the hardlims function. It needs to be put in its own file called
hardlims.m. In this example, I have included a comment line between the function
declaration and the function body. The comment line begins with the % symbol.

function f = hardlims(x)
% 1 if x >= 0, -1 otherwise
f = 2 * (x >= 0) - 1;

For this function to work on array arguments, the scalar 0 must be compared against every
element of x. In the expression (x >= 0), the system treats the scalar 0 as an array of zeros
whose dimensions are the same as x before the relational operator is applied. The comparison
yields an array of ones (true) and zeros (false), which the rest of the statement maps to +1 and −1.
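The intermediate values in the hardlims body can be inspected step by step, as in the sketch below. The names mask, f, and s are made up for illustration.

```matlab
x    = [-2 0 2];
mask = (x >= 0);       % logical array [0 1 1]
f    = 2 * mask - 1;   % [-1 1 1]: maps 0 to -1 and 1 to +1
s    = sign(x);        % [-1 0 1]: differs from f only where x is 0
```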

3 Plotting and visualization
The language has convenient and powerful visualization facilities. You can generate data
within MATLAB, or from an external program as was illustrated in Section 2.1.4.

3.1 Simple plotting


The plot command plots two-dimensional graphs. First, let’s create a vector y with 101
elements and then plot it.

>> y = 0:.1:10;
>> plot(y)

The first line creates a vector whose values range from 0 through 10 in increments of 0.1.
More accurately, y is a 1 × 101 matrix. The second line plots this vector as a function of an
implicit x ranging from 1 through 101 in increments of 1. That is, the y-values are plotted
as a function of their array indices, which start at 1.
The plot command operates on vectors and plots a y against an x. This was implicit
in the previous example and is made explicit below.

>> y = 0:.1:10;
>> [rows, cols] = size(y);
>> x = rows:cols;
>> plot(x,y)

In the above, size returns the dimensions of the matrix y, and the componentwise assign-
ment statement gives rows the value 1 and cols the value 101. The next line creates a
vector x with a default increment of 1. Compare this with the creation of y with an explicit
increment of 0.1. Finally, the plot command explicitly plots y as a function of x.
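The two forms of the colon operator used above can be compared in a short sketch. The names nPoints and step are made up, and numel is used here as a shorthand for the size-based computation in the text.

```matlab
y = 0:.1:10;             % start:increment:stop, explicit increment of 0.1
x = 1:numel(y);          % start:stop, default increment of 1
nPoints = numel(y);      % 101
step    = y(2) - y(1);   % 0.1
% plot(x, y) would draw the same graph as plot(y)
```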

3.1.1 Application to neural networks


It is very easy to plot the sum-of-squared error (SSE) as a function of training epoch, as
shown in the example below. In this example, we train the network for 1000 epochs unless
the SSE drops below 0.02. In that case, we break from the for-loop.[6]

for epoch=1:1000
    . . .
    SSE(epoch) = sum(error .* error);
    if((SSE(epoch)<.02)) break; end
end
plot(SSE);

[6] This example illustrates the syntax of for statements and if statements. Notice that both statements
terminate with an end. Also, the for statement allows a break. Finally, the scope of the iteration variable
epoch continues beyond the end of the for statement.

The variable error is assumed to hold a vector of error values for the n training patterns
in one epoch of training. If we square those error values and add them up, then we have
the SSE for that particular training epoch. We save these values in the dynamically growing
array SSE.[7] In MATLAB, when storing a value in an array, if you use an array index that is
larger than the number of elements in the array, the array grows so that it is large enough to
handle the index. In Java, you would get an array index out-of-bounds exception. In C or
C++, your program behavior would be undefined. Once it is computed, plotting the SSE
takes a single command. Of course, we should put labels on the graph axes and give it a
title, as shown below.
plot(SSE);
xlabel('Epoch');
ylabel('SSE');
title('SS error for backprop');
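The array growth described in this section can be observed directly with a small sketch. The values stored are made up, and the preallocated array sseFixed is an optional alternative that is not part of the original example.

```matlab
clear SSE                    % make sure no stale array is left over
SSE(3) = 0.5;                % SSE did not exist: it becomes [0 0 0.5]
SSE(5) = 0.1;                % grows again, to [0 0 0.5 0 0.1]
grownLength = numel(SSE);    % 5

% Optional alternative: preallocate when the maximum length is known
maxEpochs = 1000;
sseFixed  = zeros(1, maxEpochs);
```

Elements that were never assigned are filled with zeros, which is worth remembering when plotting a partially filled array.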

If you want to print the value of a variable in a title, then use the more complex variant
below.
title(['SS error for ', num2str(epoch), ' epochs of bp']);

In this variant, the square brackets concatenate three character strings into a single string,
which is passed to title. Notice that the value of the variable
epoch is converted to a string by num2str.
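The string construction can be tried on its own, without a figure. In the sketch below the value of epoch is made up, and sprintf is shown as an alternative standard way to build the same string.

```matlab
epoch  = 250;
label1 = ['SS error for ', num2str(epoch), ' epochs of bp'];   % concatenation
label2 = sprintf('SS error for %d epochs of bp', epoch);       % format string
```

Either label could then be passed to title.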

3.2 3D Plots
The command plot3 allows you to plot data in three dimensions. The script below loads
the data illustrated in Section 2.1.4 and then plots it in three dimensions using the plot3
command.
data;
hold on
for cyc=1:50,
    [x,y] = find(data(:,:,cyc));
    z=zeros(length(y),1)+cyc;
    plot3(z,x,y,'.')
    axis([1 50 1 6 1 12])
end;

In the above, we assume that 50 frames or cycles of data have been generated. The 3D plot
is generated in a loop of 50 iterations where each iteration plots one frame of data on the
graph using the command plot3. The hold on command says to superimpose the data
from successive plots, rather than to erase the graph for each new plot. For a given cycle
of the life simulation, the command find obtains the row and column indices of the non-zero
elements of the two-dimensional matrix data(:, :, cyc), which the script treats as x and y
coordinates. The list of x coordinates
goes into the x vector, and similarly for the y coordinates. Since x and y are vectors of the
same length, we also need a z vector of the same length to give to the plot3 command. To
do this, we create a vector of zeros whose length matches y. Then we add the scalar cyc to
this vector. In MATLAB, this adds the scalar to each element of the vector, yielding a vector
whose length is the same as y and whose components are all equal to cyc. Next, we issue
the plot3 command. We plot the z vector on the x axis of plot3 because we want
the progress of time to be depicted on the x axis of the plot. Finally, we use the axis
command to say that we want the x, y, and z axes to range from 1 to 50,
1 to 6, and 1 to 12, respectively. These values match the dimensions of the plotted data.

[7] If you are in a debugging cycle and you reload this file, then you should include a clear statement at the
beginning of the file. You need to erase the old SSE array from the system.
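As a small illustration of the find step used in the plot3 script, the sketch below applies it to a made-up 2 × 3 frame and builds the matching z vector. The frame contents and the value of cyc are hypothetical.

```matlab
frame = [1 0 0; 0 0 1];          % hypothetical 2-by-3 frame with two live cells
[x, y] = find(frame);            % x = [1; 2] (rows), y = [1; 3] (columns)
cyc = 7;
z = zeros(length(y), 1) + cyc;   % [7; 7], one entry per live cell
```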

3.3 Surface plots


You can use a surface plot to plot the values in a two-dimensional matrix. Let’s apply this to
a neural network that has one output unit but has been trained on four input patterns. If we
look at that unit for one epoch of training, it will have four values, one for each of
the input patterns. If we store these values in a matrix across all epochs of training, then we
can plot the output values as a function of pattern and training epoch. The example below
illustrates how to do this. The matrix history is incrementally updated to hold the output
values of the unit for the current training epoch.
for epoch=1:1000
    . . .
    history(:,epoch)=outputsL2(:);
    SSE(epoch) = sum(error .* error);
    if((SSE(epoch)<.02)) break; end
end
plot(SSE);
figure;
surf(history);
view([45,45]);

The command surf(history) creates a surface plot of the history matrix. This
surface plot is very useful because it shows you how the output units change their response
to the input patterns as a function of training. It vividly displays the network’s change in
behavior as a result of the learning process. The command figure tells MATLAB to plot
the results in a new figure rather than overwrite the results in the SSE figure. The command
view([45,45]) sets the viewing angle. You can play with these parameters to get a
good viewing angle. Of course, you will want to annotate the plot as illustrated below.
surf(history);
view([45,45]);
xlabel('Epoch');
ylabel('Pattern');
zlabel('Activation');
title('Activation to patterns as a function of training');
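To make the layout of the history matrix concrete, the sketch below fills it for three epochs with made-up output values. The numbers and the preallocation step are assumptions for illustration only; the original loop grows the matrix dynamically.

```matlab
nPatterns = 4;
nEpochs   = 3;
history   = zeros(nPatterns, nEpochs);       % rows: patterns, columns: epochs
for epoch = 1:nEpochs
    outputsL2 = epoch * [0.1 0.2 0.3 0.4];   % made-up outputs for this epoch
    history(:, epoch) = outputsL2(:);        % store as one column
end
histSize = size(history);                    % [4 3]
% surf(history) would now plot epoch along x and pattern along y
```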

3.4 Other plotting commands


In the earlier examples, MATLAB chose bounds for the x and y axes automatically. You can
specify these explicitly with the axis command.
You can put several graphs within the same figure using the subplot command.
You can plot error bars using the errorbar command in conjunction with hold.
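A minimal sketch combining these three commands is given below. The data and the error magnitudes are made up, and the particular layout (two stacked panels) is just one possible arrangement.

```matlab
x   = 0:1:10;
y   = sin(x);
err = 0.1 * ones(size(x));    % made-up error magnitudes

figure;
subplot(2, 1, 1);             % top panel of a 2-by-1 grid
plot(x, y);
axis([0 10 -1.5 1.5]);        % [xmin xmax ymin ymax]

subplot(2, 1, 2);             % bottom panel
plot(x, y);
hold on                       % keep the line while adding error bars
errorbar(x, y, err);
hold off
```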

A Appendix: Matrix notation
A 2 by 3 matrix has two rows and three columns. An m × n matrix has m rows and n
columns. With these conventions, the elements of an m × n matrix, A, would be written as
shown below.

$$A \equiv \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix}$$

If m equals n, then we have a square matrix. A square matrix has the same number of rows
and columns. Further, a square matrix has a diagonal. This is the set of matrix locations
$a_{i,i}$. A very important square matrix is the identity matrix, I, which has ones along the
diagonal and zeros everywhere else, as shown below.

$$I \equiv \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

A.1 Matrix multiplication


This convention of describing matrix dimensions eases the problem of keeping track of
whether two matrices may be multiplied together to produce a new matrix and what the
resulting matrix dimensions will be. A matrix A can be multiplied with a matrix B if the
number of columns in A is equal to the number of rows in B. Suppose that A is an m × n
matrix and B is an n × o matrix. Then A can be multiplied with B, written AB. The resulting
product matrix will have dimensions m × o. Matrix B cannot be multiplied with A unless
o is equal to m. Although scalar multiplication is commutative, matrix multiplication is not
in general commutative. Matrix multiplication is associative and distributes over matrix addition.
We can easily define how to compute the elements of the product matrix AB. Element
ij of matrix AB is computed by taking the inner product of row i in matrix A and column
j in matrix B. The inner product is not defined if row i does not have the same number of
elements as column j. This is why the number of columns of A must equal the number of
rows of B. The identity matrix has special properties with respect to matrix multiplication.
Multiplying a square matrix, A, with the identity matrix or multiplying the identity matrix
with A both yield the original matrix A. That is,

$$AI = IA = A \tag{1}$$
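These rules can be checked numerically with a short sketch. The matrices below are made up for illustration, and eye is the standard function for building an identity matrix.

```matlab
A  = [1 2 3; 4 5 6];         % 2-by-3
B  = [1 0; 0 1; 1 1];        % 3-by-2
AB = A * B;                  % 2-by-2: [4 5; 10 11]
BA = B * A;                  % 3-by-3, so AB and BA differ even in size

% Element (1,2) of AB is the inner product of row 1 of A and column 2 of B
elem12 = A(1, :) * B(:, 2);  % 5

S = [1 2; 3 4];
I = eye(2);                  % 2-by-2 identity matrix
identityHolds = isequal(S * I, S) && isequal(I * S, S);
```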

A.2 Definition of transpose


The notation $A^T$ refers to a matrix generated from matrix A by reversing the order of the
subscripts of each of the elements. If A is an m × n matrix, then $A^T$ (read “A transpose”)
is an n × m matrix. When deriving results, there are a number of useful facts about the
transpose of a matrix.

1. The transpose of the identity matrix is itself.

$$I = I^T$$

2. The transpose of the transpose of a matrix is the original matrix.

$$A = (A^T)^T$$

3. The transpose of the product of two matrices is the transpose of the second matrix
multiplied with the transpose of the first matrix.

$$(AB)^T = B^T A^T$$
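The three facts above can be spot-checked in MATLAB on small made-up matrices. A numerical check on one example is of course not a proof; the names fact1, fact2, and fact3 are invented for illustration.

```matlab
A = [1 2 3; 4 5 6];                     % 2-by-3
B = [1 0; 0 1; 1 1];                    % 3-by-2
I = eye(3);

fact1 = isequal(I, I');                 % identity equals its transpose
fact2 = isequal(A, (A')');              % transposing twice restores A
fact3 = isequal((A * B)', B' * A');     % transpose of a product reverses the order
```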
