
Tutorial for Monte Carlo Simulation with Excel

Copyright by Moshe Cohen, 2008


bani@jce.ac.il
Table of Contents

Introduction
Part I.   Macros of Excel
          1. Settings of Excel 2003 and earlier versions
          2. Settings of Excel 2007
          3. Loading macros to Visual Basic Editor
Part II.  Deciding probability weights
          1. Estimation of the distribution from a sample
          2. Simple considerations for distribution choice
          3. More detailed information from the histogram
          4. Probability density functions of distributions
          5. Selecting a distribution without a sample
Part III. Simple static simulation
          1. Running the simulation
          2. Appreciating the results
Part IV.  More examples
          1. Portfolio analysis
          2. Project planning
          3. Travel time estimation
          4. Reliability of systems
Index

Introduction
Many decisions must be made involving unpredictable figures [1], such as a share price six
months from now. Other examples are sales forecasts, the time needed to perform a
completely new task, future selling prices, inflation and exchange rates.
A general strategy that may indicate which decision is most advisable is to assume several
possible realizations of the figures (called scenarios), calculate a numerical outcome
from each scenario and weigh the outcomes by the probability of the respective
realizations. Nowadays this procedure is computerized and it is possible to check
hundreds or thousands of scenarios, thereby increasing the credibility of such an
analysis. The procedure is called static or Monte-Carlo simulation (or just simulation in
the rest of this tutorial). Excel can carry out static simulations with the aid of suitable
macros.
This tutorial expects some familiarity with Excel but none with statistics or probability. It
illustrates static simulations and introduces the minimal theory underlying the probabilities of
realizations. The tools for this are the macros available at
http://my.jce.ac.il/bani/StaticSimulation/Sources.
The macros are free in the sense of the GNU Lesser General Public License (LGPL), which
means, roughly, that they can be freely used for any purpose, including commercial
ones; please read the LGPL for the exact terms. The macros do essentially the
same job as the commercial programs ExpertFit and Crystal Ball, which sell for
hundreds of dollars.
Part II explains how to attach probability weights with the aid of the macro
ChooseDistribution, and Part III then shows how to use the Simulate macro that performs the
simulation run. If you have not done so yet, please install the macros and set up
Excel so that you can rerun all the examples given in this tutorial (explained in Part I).
Part IV shows an assortment of simple example problems.
An introductory example used in Parts II and III is the problem of an American call
option. The decision is whether it is worthwhile to pay a given sum now (called the
premium) for the right to buy a share on an agreed future date (called the expiration date) at a
fixed price (called the strike price) agreed today. Obviously, this depends on the price
of the share at the expiration date. The outcome of the simulation is supposed to aid the
decision maker. The outcome is based on the difference between the expense and the gross
profit, where the expense is the sum of the premium and the strike price and the gross profit
is the share's market value at the expiration date. It is worthwhile to pay the premium now only
if profit is more probable than loss.
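As a rough sketch of this outcome (the cell layout here is only an assumption, and the template
actually built in Part III uses a percentage-rise indicator instead), the difference could be
computed on a worksheet as
=B3-(B1+B2)
where B1 holds the premium, B2 the strike price and B3 the share's market value at the
expiration date; a positive result means a profit in that scenario.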

[1] More exactly, stochastic variables.

PART I. Macros of Excel


There are 22 files at http://my.jce.ac.il/bani/StaticSimulation and its subdirectory (besides
this file); they are listed in Table 1.
No.  Source file                          File type
1    BoundsFrm.frm                        Source files
2    chooseFrm.frm
3    FitFrm.frm
4    mainForm.frm
5    htpFrm.frm
6    progressForm.frm
7    ChooseDistribution.bas
8    FitModule.bas
9    Simulate.bas
10   Utility.bas
11   BoundsFrm.frx                        Files containing information for the above *.frm files;
12   chooseFrm.frx                        each one must be in the same directory as the
13   FitFrm.frx                           corresponding *.frm file
14   mainForm.frx
15   htpFrm.frx
16   progressForm.frx
17   Empty.xls                            Empty Excel file containing the above macros
18   Samples.xls                          Excel file with all the sample problems of this tutorial
19   GNULesserGeneralPblicLicense.txt     The licenses that cover the Visual Basic macros
20   GNUlicense.txt                       and forms
21   DrawDistributionSetup.exe            Program drawing probability density functions
22   MonteCralo.zip                       Zip file containing all of the above

Table 1. Files for Monte Carlo simulation


You may wish to download just the zip file (#22) and open its contents on your computer.
File #21 is not absolutely necessary.
If you are not interested in the source files themselves, you may download only the Excel
files. Please save the original Empty.xls, keep it intact and work only on its copies;
Empty.xls contains only the macros.
Excel treats macros as potentially malicious, and therefore many Excel installations are
routinely instructed to ignore them. The first step is thus to make sure that Excel is ready
to run macros. For the 2007 version one also has to make sure that the macros are saved
in the workbook. Some default settings in Excel must be altered for running, saving and
examining macros; how to change them is described in the next sections.

1. Settings of Excel 2003 and earlier versions


The security level is set as follows. Go to the security options by clicking the sub-menu
Security, reached via Tools > Macro (see Fig 1.). The appropriate setting
is shown in Fig. 2. If the Macros button is clicked (see Fig 2.) a list of available
macros is displayed. Examination of the macros' source code is possible via Tools >
Macro > Visual Basic Editor (also in Fig 2.; about the Editor see Section 3 further on).
Check whether the Tools menu contains the item Data Analysis (as can be seen in Fig. 1).
If not, you have to install the Data Analysis add-in by selecting Tools > Add-Ins,
which pops up a list (Fig. 3) where Analysis ToolPak is to be selected.

Figure 1. Macro menu in Excel 2003

Figure 2. Excel 2003 security options

2. Settings of Excel 2007


Clicking the Office button (in the upper left corner) opens a drop-down menu
where the Excel Options button (located at the bottom of the menu) is to be clicked.
Here at least two, and possibly up to four, settings need to be changed, concerning:
- the format of the saved workbook, under Save,
- the security of the macros, under Trust Center,
- the Visual Basic Editor (if you want to observe or alter the macros), under Popular, and
- making histograms (if you wish so), under Add-Ins.
Figure 3. Choice of Add-In for Statistical Data Analysis


These changes are necessary because the defaults of Excel 2007 are not to save macros, not to
run them, not to let users use Visual Basic and not to include any add-ins. Please note that
changed settings become effective only after closing and reopening Excel.
The format that saves macros is shown in Fig 4.

Figure 4. Setting that saves macros


The security should be set as shown in Fig. 5 in the Trust Center window. The Trust
Center window pops up when the Trust Center Settings button is pressed (located at
the bottom right of the Trust Center option).
Macros can be run by clicking the Macros button in the View tab of the ribbon (Fig.
6). It opens a list of all available macros.

Examination of the macro source code is possible in Excel 2007 only if the
Developer tab is placed on the ribbon. This is also not a default setting and it must be
requested explicitly. This is done in the Popular part of the Excel options; the setting is
shown in Fig 7. After this operation the Developer tab may be opened, where clicking the
Visual Basic button pops up the Visual Basic Editor (see Fig. 8; about the Editor see
Section 3).

Figure 5. Setting for running macros

Figure 6. The Macros button

Figure 7. Requesting the Developer tab

Check whether the Data ribbon contains the Data Analysis button in the Analysis
group, as in Fig. 9. If not, you have to request it by entering Excel Options (Fig. 6),
selecting Add-Ins and clicking the Go button at the bottom. A list of add-ins
pops up where Analysis ToolPak is to be selected (Fig. 3).

Figure 8. The Developer tab

Figure 9. The Data ribbon having the Data Analysis button


3. Loading macros to Visual Basic Editor
You may download the source files and load them yourself into any Excel file that is
supposed to run simulations. For adding the files to an Excel workbook, you have to open
the Visual Basic Editor; how to open it was described above for each version.
You can import all the source files through the File sub-menu of the editor as shown in
Fig. 10: click Import File, which opens a window for selecting the source file. The
import applies to the active workbook, so if the Excel program has no open workbook,
importing is disabled.
Even if you have no desire to look at the source, it may be useful for checking it
for malicious content such as writing to your disk without your permission [2]. For
checking purposes you can read the source files with a simple text editor such as
Notepad.exe, where an Excel macro cannot do any damage.

[2] This is a good way to avoid malicious code (virus, Trojan horse, spyware etc.) in a macro. Of
course, it is possible only when the source is available.

Figure 10. The File menu of VBA editor

PART II. Deciding probability weights


It is shown in probability theory that weighing the outcomes with the probabilities of the
realizations is equivalent to choosing realizations with high probability more often
than those with low probability. More exactly, realizations in simulation are chosen at
random and in proportion to their probability. By doing so, the estimate of the expected
outcome may be calculated as a simple average of the scenarios' outcomes.
Excel provides functions that ensure both random choice (the function RAND()) and
proportionality to probability. The pertinent Excel functions are listed in Table 2; how to
choose among them, what their parameters mean and what values are appropriate for them
are the topics of this part.
Statistical Distribution   Excel Function
Beta                       =BETAINV(RAND(), alpha, beta, lower bound, upper bound)
Exponential                =-LN(RAND())*mean
Gamma                      =GAMMAINV(RAND(), shape parameter, scale parameter)
Log-normal                 =LOGINV(RAND(), mean of ln(X), standard deviation of ln(X))
Normal                     =NORMINV(RAND(), mean, standard deviation)

Table 2. Excel functions suitable for getting random realizations
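Each of these formulas feeds the uniformly distributed RAND() into the inverse of the cumulative
distribution function, so every recalculation of the worksheet produces a fresh realization. As a
minimal illustration (the parameter values are arbitrary), a cell containing
=NORMINV(RAND(), 100, 15)
returns a normally distributed value with mean 100 and standard deviation 15, and pressing F9
(recalculate) draws a new one.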


The Excel functions correspond to what are termed in probability theory statistical
distributions [3] (or just distributions), listed in the left column. The macro [4] ChooseDistribution
helps to decide which function with which parameters to choose for a given problem. The
way to start a macro is:
- in Excel 2003 and earlier versions: Tools > Macro > Macros (see Fig 1.),
- in Excel 2007: click the Macros button on the View tab (see Fig 6.).
This lists in a pop-up window all the available [5] macros of the workbook. Select
ChooseDistribution and click the Run button [6]. It pops up the window shown in Fig 11.

Figure 11. Opening window of macro ChooseDistribution


[3] For more on distributions consult http://en.wikipedia.org/wiki/Statistical_distributions
[4] Macro #7 of Table 1.
[5] You cannot see the macro ChooseDistribution if it has not been installed; see how to install it in Part I.
[6] You cannot run the macro if the security/trust level is set too high. See the topic in Part I.

Recall from the Introduction that for the American call option problem you want to make
an educated guess about the share price at the expiration date. You may estimate
the price based on historical data (if available); in statistical parlance this is called
estimation based on a sample. The sample is the historical data and its individual items
are referred to as observations. Alternatively, you can guess the share price without a
previous sample. These are the two options presented in Fig. 11. We shall discuss the first
possibility in Section 1 and the second in Section 5; in the intermediate sections
important techniques and concepts are introduced.
1. Estimation of the distribution from a sample
Suppose that the expiration date is six months from now and it is possible to find
historical data about the prices of 20 shares that behave similarly to the one under
consideration. So we can find the price increase or decrease (in percent) from their
prices six months ago to their prices today. We assume, for the sake of illustration, that a
similar change will occur to our share in the future. The increases are displayed in Table
3, where a negative number means that the price went down.
 8.03   -2.58    1.59    9.17    1.61    7.30   15.66    0.59    9.00   17.98
 1.11    7.66    7.14   14.82    8.95   10.58    7.11    5.62   10.86    2.62

Table 3. Increases in prices of 20 shares in percents


Pressing the OK button (while fit a sample is selected) pops up another window (shown
in Fig. 14) that contains a warning about the required position of the sample. One cannot go
on until the sample is arranged in a single column starting at cell A1, on a sheet that
contains only the sample, as shown in Fig 12.
After the sample is arranged properly on the active worksheet (the one where the user can
readily write), fitting the sample is performed simply by clicking the Fit
button (see Fig. 14). This rearranges and fills the active sheet (see Fig. 13) and displays the
result as shown in Fig. 14.
It selects the (shifted) exponential distribution as the only distribution (out of the five
distributions mentioned in range B13:F13) that fits the sample.
Note how the active sheet looks after fitting: the sample is pushed down 13 rows, and its
average and standard deviation are given in range B2:B3. The original sample, now starting
in cell A14, is sorted. This is necessary for calculating the so-called Kolmogorov-Smirnov
(K-S) statistic [7], which is the maximal absolute difference between the sample distribution
and an assumed theoretical distribution with the parameters given in rows 6-12. The statistic
is shown for the five distributions in range B13:F13. Distributions with a reasonable fit must
be below the 95% critical value [8]. The smaller this statistic, the closer the sample is to a
proposed distribution. The preferred distribution in the case under consideration is the
exponential, shifted by the offset in row 6. Thus a realization of the share price increase
(in percent) six months from now is given by
=-4.165 - LN(RAND())*15.436

[7] For theory see http://en.wikipedia.org/wiki/Kolmogorov_Smirnov.
[8] Critical values may be obtained from http://www.york.ac.uk/depts/maths/tables/kolmogorovsmirnov.ps,
but bear in mind that in our case we must use the two-sided test.
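For readers who want to see what is behind this number, here is a minimal sketch of how the K-S
statistic of the fitted shifted exponential could be reproduced by hand (the cell layout is an
assumption and the macro's own computation may differ in details). With the sorted sample in
A14:A33, two helper columns compare the theoretical and empirical distribution functions:
B14: =1-EXP(-(A14+4.165)/15.436)                              theoretical CDF of the shifted exponential at A14
C14: =MAX(ABS((ROW()-13)/20-B14), ABS((ROW()-14)/20-B14))     discrepancy against the empirical CDF just after and just before A14
Copy B14:C14 down to row 33; the K-S statistic is then =MAX(C14:C33).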

Figure 12. Organizing the sample on an Excel sheet

Figure 13. The worksheet after fitting.


Figure 14. The fitting window


Similarly, in the beta distributed case the appropriate realization would be given by the
corresponding BETAINV formula of Table 2 with the parameters shown in rows 6-12.
Rows 14 and downwards hold the values of the assumed theoretical distributions for the
arguments in column A of the same rows. The column of the log-normal distribution is empty
because that distribution is defined only for samples with strictly positive observations.
2. Simple considerations for distribution choice
The fact that the best fit was found to be exponential may upset many financial
mathematicians, whose first reaction, before consulting your sample, would be that the
log-normal distribution [9] should be the best choice. Such a reaction highlights the principle
that the distribution carries implicit assumptions about the sample that may not be deducible
from the sample itself but are known to the user. There are simple questions [10] that any user
may consider before choosing a distribution; they may direct the choice to a theoretically
reasonable distribution. They are:
- Is there a value below which no realization is possible? For example, time durations
  cannot be negative, so in that case the value below which no realization is possible is 0.
  Such a value is called a lower bound.
- Is there a value above which no realization is possible? If the value is some percentage
  of a given total, then the value above which no realization is possible is 100. Such a
  value is called an upper bound.
- Is there a value that is more probable than the others? Such a value is called the mode.
- Are deviations of the realizations above the mode and below the mode equiprobable?
  If so, the distribution is termed symmetric. Deviations from a nominal weight or size due
  to slight technical variations in production processes [11] are often assumed to be
  symmetric; the nominal values serve as the modes.

The non-existence of a lower bound means that any negative value is possible; the
non-existence of an upper bound means that values may be arbitrarily large, although large
deviations above the mode are extremely rare.

[9] The log-normal distribution is, however, valid not for the percent change but for the ratio between the
present price and the price six months ago. The point was raised here for methodological reasons, to open
the discussion of theoretical considerations.
[10] The financial mathematicians' choice of the log-normal distribution is the result of much more
sophisticated considerations, beyond the scope of this tutorial.
The answers to three of the questions above (the two bounds and symmetry) for the five
distributions are shown in Table 4. The alpha and beta in the table are the parameters of the
beta distribution as in Table 2.
Distribution   Is there a lower bound?   Is there an upper bound?   Symmetric?
Beta           Yes                       Yes                        Only if alpha = beta
Exponential    Yes                       No                         No
Gamma          Yes                       No                         No
Log-normal     Yes                       No                         No
Normal         No                        No                         Yes

Table 4. Some basic features of distributions.


As a rule, the exponential and log-normal distributions need not be candidates, since the
gamma distribution can replace them: the exponential distribution is a special case of the
gamma distribution, and a gamma distribution may be indistinguishably close to a log-normal
distribution [12]. Assuming the existence of an upper and a lower bound in our case, the beta
distribution looks more plausible than the others.
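To illustrate the first of these replacements with the functions of Table 2: a gamma distribution
with shape parameter 1 is exactly an exponential distribution, so the two formulas below (using,
purely for illustration, the mean of 15.436 fitted earlier) draw from the same distribution:
=-LN(RAND())*15.436
=GAMMAINV(RAND(), 1, 15.436)
A shift, if needed, is simply added in front, as in Section 1.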
3. More detailed information from the histogram
The histogram is a graphical summary of the data that Excel can provide. It shows how
many cases fall into given disjoint ranges, called bins. The best way to appreciate it is to
show a histogram for our case with accompanying clarifications. The procedure for
obtaining it is as follows.
1. Define the boundaries of the bins. This may be done as seen in column C of Fig. 15.
   They are six increasing numbers, the first of which is less than the minimum of
   the data (see Fig. 12) and the last of which is larger than the maximum. The number of
   boundaries is the number of bins plus 1. The recommended number of bins is
   given in Fig. 16.
2. Request the window listing Excel's statistical procedures from the menu [13] Tools >
   Data Analysis (in Excel 2003 or earlier) or by clicking the Data Analysis button on the
   ribbon of the Data tab (in Excel 2007). This opens a list of the available statistical
   methods; Histogram should be selected.
3. The selection opens the window shown in Fig. 15, where the location of the data
   (A1:A20, supposing the situation of Fig. 12) is entered. The output range should
   be an empty cell with enough empty cells below and to the right of it so that the output
   can be written there. The output is shown in Fig 17 in the range E1:F8: column E has
   the upper boundaries of the bins and column F the number of cases falling in each bin.
   Thus, there are five numbers in the bin whose boundaries are -3 and 2. (An alternative
   that uses Excel's FREQUENCY function instead of the add-in is sketched after Fig. 18.)

[11] This holds only if the process is under control. Indeed, a large part of statistical quality control is based
on analyses of possible asymmetries.
[12] See the comparison in Section 4 further on.
[13] In some installations Data Analysis does not appear under the Tools menu. See Part I for how to make
it appear.

Figure 15. Generating histogram

Figure 16. Proposed number of bins


Figure 17. Drawing the histogram


4. Select the range F3:F7 (the number of cases in the bins). Now call the chart wizard
   of Excel by clicking its toolbar button (in Excel 2003 or earlier) or through the
   Column chart button on the Insert tab (in Excel 2007). In Excel 2003 it opens a window
   as shown in Fig. 17; in Excel 2007 the appropriate column chart type is to be selected.
   This supplies the histogram that is displayed (for the data in A1:A20) in Fig. 18.

Figure 18. Histogram of the data from Fig. 12
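As mentioned in step 3, the bin counts can also be obtained without the Analysis ToolPak by using
Excel's built-in FREQUENCY function. A minimal sketch, assuming the data are in A1:A20 and the six
bin boundaries are in C1:C6: select a range of seven cells, say D1:D7, type
=FREQUENCY(A1:A20, C1:C6)
and confirm with Ctrl+Shift+Enter (it is an array formula). D1:D6 then hold the counts of
observations up to and including each boundary and D7 counts anything above the last boundary.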


The histogram yields a visually much richer description of the data than the rudimentary
information used in Table 4.
The mathematical abstraction [14] of the histogram is the so-called probability density
function, abbreviated pdf. A reasonable distribution may be one whose pdf resembles the
histogram. For this reason it is necessary to briefly review the pdfs of the distributions.

[14] If the number of bins tends to infinity and at the same time the width of the bins tends to 0, then the
histogram tends to the pdf.


4. Probability density functions of distributions

The program DrawDistribution.exe, aided by InvF.dll (they must be in the same folder),
shows the pdf of various distributions. A few remarks are in order:
- the lower bound of all the distributions having lower bounds is set to 0,
- the upper bound of the beta distribution is set to 1,
- the order of the parameters is not always the same as in Excel (as displayed in Table 2).
Figures 19 through 30 are the pdfs of several distributions, shown to illustrate the influence
of the parameters on the pdf. They were produced by the program DrawDistribution. The
horizontal axis shows the range of values containing most realizations.
The general shape of the normal distribution (Figs 19 through 21) is the same for all
parameters. The standard deviations of the pdfs shown in Figs 19 and 20 are both equal to 1
while the values of the means are different. Note that the corresponding ranges have the same
width: 6.576 - 1.424 = 2.576 - (-2.576). The effect of changing the value of the mean is that
each point is translated by the difference between the two means (4 in the case of Figs 19
and 20).

Figure 19. Normal pdf with mean=0, standard deviation=1

Figure 20. Normal pdf with mean=4, standard deviation=1

Figs 19 and 21 differ only in the standard deviation. Both are centered at 0 (the mode
is 0), but the corresponding range in Fig 21 is twice that of Fig 19. Hence the standard
deviation stretches the horizontal axis. Such a stretching parameter, which does not
influence the shape or location, is called a scale parameter. A parameter that changes the
shape is called a shape parameter. All normal distributions have the same overall shape.
Another distribution whose shape is unchanged is the exponential distribution (with offset
0) shown in Fig. 22. Its mode is always 0 and its mean is its scale parameter.
The gamma distribution is another one that has a scale parameter. Its shape depends on
the shape parameter; see Figs 23 and 24. The larger the shape parameter, the farther the
mode is from the lower bound (0) and the more symmetric the pdf is.


Figure 21. Normal pdf with mean=0, standard deviation=2

Figure 22. Exponential pdf with mean=1

Figure 23. Gamma pdf with scale parameter=1 and shape parameter=3

Figure 24. Gamma pdf with scale parameter=1 and shape parameter=6

The pdf of the log-normal distribution is shown in Figs 25 and 26. Its shape is a
complicated function of the parameters. Log-normal distributions are similar to gamma
distributions.

Figure 25. Log-normal pdf with ln-mean parameter=0 and ln-standard deviation parameter=1

Figure 26. Log-normal pdf with ln-mean parameter=0 and ln-standard deviation parameter=0.5

As indicated in Table 4, the beta pdf is symmetric if alpha = beta. This is
corroborated by Figs 27 and 28. It may also be observed that the larger the parameters,
the more concentrated the pdf is around the mode. If alpha is smaller than beta
then the mode tends towards the lower bound; if alpha is larger than beta then the
mode tends towards the upper bound, as demonstrated by Figs 29 and 30. The tendency is
stronger when the imbalance between the two parameters is larger.

Figure 27. Beta pdf with alpha parameter=3 and beta parameter=3

Figure 28. Beta pdf with alpha parameter=5 and beta parameter=5

Figure 29. Beta pdf with alpha parameter=2 and beta parameter=5

Figure 30. Beta pdf with alpha parameter=5 and beta parameter=2
Finally, note that the beta pdf may be very similar to the normal pdf (compare Figs 21 and
28). The normal distribution may be closely approximated by a beta distribution with
alpha = beta = 4.3 and bounds equal to the mean plus or minus 3 times the standard deviation
(minus for the lower and plus for the upper bound). The beta approximation is preferable if
the normal distribution could lead to negative values with non-negligible probability.
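As a numerical sketch of this approximation (the mean of 100 and standard deviation of 15 are
arbitrary illustrative values), a normally distributed realization such as =NORMINV(RAND(),100,15)
could be replaced by
=BETAINV(RAND(), 4.3, 4.3, 100-3*15, 100+3*15)
that is, =BETAINV(RAND(),4.3,4.3,55,145), which always stays between 55 and 145 and therefore can
never turn negative.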
The beta pdf may also be similar to the gamma pdf (compare Figs 23 and 29). Moreover, the beta
distribution can also be applied to the case where any value between the bounds is equally
probable:
=BETAINV(RAND(), 1, 1, lower bound, upper bound)

5. Selecting a distribution without a sample

It was demonstrated in the previous section that the beta pdf is the most versatile and can
imitate the other distributions. Therefore, when the second option is selected in the pop-up
window of Fig. 11, the macro suggests the beta distribution and pops up a window where three
characteristic values of the beta distribution can be set (Fig 31): the two bounds and the
mode as the most typical value.

Figure 31. Input window when no sample is available


Consulting the histogram (Figs 17 and 18), one can take the bounds to be -3 and 22.
There are actually two modes (with frequencies 5 and 10), but the bigger one
(between 7 and 12) is presumably dominant. The estimate of the bigger mode is the average of
that bin's boundaries: (7+12)/2 = 9.5. To compensate for the smaller mode, the lower
boundary (7) rather than 9.5 is used as input to the window of Fig. 31. The result is shown in
Fig 32. The calculation adds a new sheet, calculates [15] the pdf in column D and draws the
pdf.
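The realization formula that corresponds to this output, and that is used again in Part III, is
=BETAINV(RAND(), 2.6, 3.4, -3, 22)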
The pdf may be made more or less concentrated around the mode by selecting the narrower or
wider option and clicking the Modify button; this shows visually what happens to the pdf.
Alternatively, different parameters may be proposed directly: selecting the with different
parameters option and clicking the Modify button recalculates and redraws the pdf with the
new parameters.

[15] This is an approximation to the pdf, or more exactly to the pdf multiplied by a constant; the constant
has no influence on the shape of the pdf, which is satisfactory for the purpose.


Figure 32. Output resulted from the input displayed in Fig 31


PART III. Simple static simulation


In the problem of the American call (see the Introduction) assume that:
- the percentage rise of the market price of the share at the expiration date, relative to its
  present market price, is beta distributed with the parameters shown in Fig 31 and
  discussed before that figure,
- a profit from purchasing the option materializes only if the market price rises by more
  than 7%, and
- out of 1000 scenarios, if there are more scenarios with profit than with loss, then
  purchasing the option is considered worthwhile.
1. Running the simulation
Before trying out all the scenarios, one scenario must be built on an Excel sheet that is used
as a template for the others. This may be done for our problem as shown in Fig 33.

Figure 33. Template scenario for the American call problem


The values of cells B2 and B3 were calculated by formulas. The formula in cell B2 is
=BETAINV(RAND(),2.6,3.4,-3,22), the same as suggested in Fig 32 for the random
variable. (If you try out this problem, cell B2 may contain another value [16].) It is a price
increase that is randomly selected from all the possible increases under this distribution, thus
an unpredictable figure, as referred to in the Introduction. The formula in cell B3 is
=IF(B1<B2,1,0), which yields 1 if the price increase in B2 is greater than the 7% in B1 and 0
otherwise [17]. Cell B3 is the numerical outcome of the simulation that directs the decision.
The outcome must be numeric for the macro Simulate; the answers True or False are not
accepted.
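For reference, the template of Fig 33 amounts to a layout like the following (the labels in
column A are only an assumption about the figure; the values and formulas in column B are the
ones quoted above):
A1: Required rise (%)          B1: 7
A2: Simulated rise (%)         B2: =BETAINV(RAND(),2.6,3.4,-3,22)
A3: Profit? (1=yes, 0=no)      B3: =IF(B1<B2,1,0)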
Trying 1000 scenarios is performed with the aid of the macro [18] Simulate. The way to
start a macro is:
- in Excel 2003 and earlier versions: Tools > Macro > Macros (see Fig 1.),
- in Excel 2007: click the Macros button on the View tab (see Fig 6.).
This lists in a pop-up window all the available [19] macros of the workbook. Select
Simulate and click the Run button. It pops up the window shown in Fig 34.
The active cell (the cell with the wide black frame) must be on the outcome cell, B3 in the
example. There are four options for output. If Summary is selected, the requested number of
scenarios is performed and only the average and standard deviation of the outcomes are
computed. Besides this information, it is possible to obtain the following if the
corresponding options are checked:
- the outcomes of all the scenarios,
- a histogram of the outcomes,
- a fit of a distribution to the outcomes.

[16] The reason is that the choice is random. The truth is that it only looks so; it is what is called
pseudo-random. This issue is, however, beyond the scope of this tutorial.
[17] Such functions are widely used in modern probability theory and are called indicator functions.
[18] Macro #9 of Table 1.
[19] You cannot see the macro Simulate if it has not been installed; see how to install it in Part I.

Figure 34. Simulation input form


After clicking the Start button, 1000 scenarios (or whatever number is written in the
simulation input form) are chosen at random and a new sheet is inserted where all the
requested output is written. The sheet is shown in Fig 35.

Figure 35. Results of simulation

The result is 0.545 (so 545 scenarios out of 1000 showed profit and 455 showed loss); the
profitable scenarios outweigh those with losses. (Again, if you try out this problem, range
B2:B3 may contain slightly different values due to randomness.)


2. Appreciating the results

If scenarios are chosen at random, it is possible that we happened to get mostly
exceptionally good ones or exceptionally bad ones. The probability that this happens may be
sufficiently small, especially when many scenarios are tried. But how many is "many"?
Are 1000 scenarios sufficient, or too many? What is a reasonable number?
These are the questions dealt with here.
The key quantities are the standard deviation estimate (0.49822) and the number of scenarios
tried (1000). For illustrative purposes the same problem was rerun for different numbers of
scenarios; see Table 5.
Number of scenarios   Average   Standard deviation   t statistic   Lower 95% confidence limit   Upper 95% confidence limit
9                     0.56      0.53                 2.31          0.15                         0.96
25                    0.36      0.49                 2.06          0.16                         0.56
100                   0.47      0.50                 1.98          0.37                         0.57
900                   0.55      0.50                 1.96          0.52                         0.59
1600                  0.57      0.50                 1.96          0.55                         0.59
4900                  0.55      0.50                 1.96          0.54                         0.57
Infinity              0.55      0.50                 1.96          0.55                         0.55

Table 5. Simulation of the American call option for different scenarios


The last row of the table was not obtained by simulation but rather by a theoretical
calculation that is possible in this simple case. For the other rows the Number of scenarios,
Average and Standard deviation were obtained by actual simulations; the last three
columns are computed from the first three. To appreciate them, the concepts of the t statistic
and confidence limits must be clarified.
Confidence limits in statistics are the bounds of the range that includes the true value of
the outcome with a given probability. Thus, for 25 scenarios there is a 95% chance that the
true value [20] is between 0.16 and 0.56. Note that the true value is known to be 0.55, the value
of the average in the last row. The true value is indeed between the confidence limits in the
other cases as well. This should happen in 95% of the cases on average, thus in 19
out of 20 cases. The limits are easily computed by formula (1):

Average ± t * (Standard deviation) / SQRT(Number of scenarios)      (1)

where t is the t statistic from the table; the minus sign is used for the lower and the plus
sign for the upper confidence limit. For example, the lower limit for 25 scenarios is computed as
=0.36 - 2.06*0.49/SQRT(25)

[20] Its official name in statistical theory is the expected value.


The t statistic is given by the Excel function TINV(0.05, (Number of scenarios)-1).
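As a worked illustration with the output of Fig 35 (average 0.545, standard deviation 0.49822,
1000 scenarios), the 95% confidence limits of formula (1) can be computed directly on the sheet:
=0.545 - TINV(0.05,999)*0.49822/SQRT(1000)
=0.545 + TINV(0.05,999)*0.49822/SQRT(1000)
which give approximately 0.514 and 0.576, so the true value lies between these limits with 95%
confidence.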


The width between the confidence limits shrinks as the number of scenarios increases, but the
reduction is not uniform. For the present problem the width around the true value is depicted
in Fig 36. For very few observations almost any average outcome is possible; for sample sizes
larger than about 100 the reduction of the width becomes very slow. The width becomes 0 only
at infinity, but of course it is not reasonable to demand an infinite number of scenarios;
some realistic width must be tolerated.

Figure 36. Confidence limits vs. sample size


Now we are in a position to answer the question of what a reasonable number of replications
(scenarios, or the sample size) is for a problem: it is one for which the confidence limits,
as given by (1), are narrow enough. The following procedure is therefore recommended for
choosing the appropriate sample size, N:
1) Decide the maximal acceptable width, Δ, between the 95% confidence limits.
2) For N1 simulation runs get the average and the standard deviation. N1 may be in
   the range 50-200, or any number that seems intuitively reasonable.
3) Calculate the confidence limits given by (1). If the upper limit is not higher than
   the lower limit plus Δ, then no further runs are necessary, N1 is a sufficient sample
   size and the procedure may stop. Otherwise perform also the following steps.
4) Calculate the required number of simulations [21]: N = 16*((Standard deviation)/Δ)^2
   (a numerical illustration follows below).
5) Run N - N1 more simulations and combine the results with the runs made in step 2).

[21] The formula results from approximating the value of the t statistic in (1) by 2, equating the difference
between the upper and lower confidence limits as given by (1) to Δ, and solving the resulting equation for
the number of scenarios, N.
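As a numerical illustration of step 4 (the numbers are illustrative, not taken from an actual run):
if a pilot of N1 = 100 scenarios gives a standard deviation of about 0.5 and a total width of
Δ = 0.02 is required, then
N = 16*(0.5/0.02)^2 = 16*625 = 10000
so about 9,900 further scenarios would have to be run and combined with the pilot runs.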


PART IV. More examples

We shall illustrate the possible uses of simulation with four sample problems. They are kept
very small so as to be simple and readily understandable; however, the size of the problems
is limited only by Excel itself, the computer and the user's imagination. Also, the derivation
of the outcome, called the model in the sequel, may look complicated because the
derivation uses simple arguments, without reference to special techniques [22]. With such
techniques, however, models may be constructed routinely, or even by computer
code, if so desired. The interested reader may turn to the author (bani@jce.ac.il).
1. Portfolio analysis
Assume a portfolio consists of three assets: Real Property, State Bonds and Shares. Assume
that the expected return (as %) and its variability for each asset are as given in Table 6.
Assets          Expected return %   Standard deviation of return %
Real Property   6                   1.5
State Bonds     1                   0.1
Shares          8                   3

Table 6. The data of the assets in the portfolio


If we have 100 units in Real Property, 50 units in State Bonds and 50 units in Shares, what
are the expected return and its standard deviation?
We assume that the returns are normally distributed [23]. A sheet can then be constructed as
shown in Fig. 37. Column C (Expected return per share) has the random returns (see the
formulas in Table 7 and one random result in Fig 37) and column D has their contributions
to a unit of the portfolio; the divisor 200 in column D is the total number of units,
100+50+50. Their sum (in cell D6) is the return of a unit portfolio, the outcome of
the model. It is the subject of the simulation (see Fig 37.).

Assets          Expected return per share   Contribution to unit portfolio
Real Property   =NORMINV(RAND(),6,1.5)      =C3*B3/200
State Bonds     =NORMINV(RAND(),1,0.1)      =C4*B4/200
Shares          =NORMINV(RAND(),8,3)        =C5*B5/200
Sum                                         =SUM(D3:D5)

Table 7. The formulas for the portfolio problem

[22] Such techniques are mostly in the domain of Operations Research.
[23] The assumption also includes independence. Actually, returns of assets are interdependent: in a bear
market the returns of most assets go down, in a bull market they usually go up; if they were independent,
the going up or down would happen independently. If the covariance matrix (the statistical measure of the
correlations) can be estimated, we can still perform the simulation, but that is beyond the scope of this
tutorial. Interested readers may turn to the author (bani@jce.ac.il).


Figure 37. The model of portfolio and its simulation setting


Fig 37. also displays the simulation settings. As a result, the true average is between
5.2% and 5.4%, but any value approximately between 2.5% and 8% is possible.
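As a sanity check, the theoretical expected return of a unit of this portfolio can be computed
directly on the sheet:
=(100*6 + 50*1 + 50*8)/200
which returns 5.25%, in agreement with the simulated confidence limits quoted above.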
2. Project planning [24]
Consider the planning of a project for some device, from conception to a ready
prototype. The assumption is that the whole project may be dissected into well-defined
tasks, called activities, for which time durations can be estimated. The problem is to
estimate the time duration of the whole project, thus to translate uncertainty about
activity durations into uncertainty about the project duration.
Some parts of several activities can be undertaken simultaneously, which can shorten the
time of the whole project. The estimate of the project duration is calculated assuming the
maximal possible parallel execution of activities. There are some activities, however, that
cannot start before others are finished: for example, it is impossible to put up the roof before
some walls are built. The activity of building the walls is called a predecessor of the
activity of building the roof. The foundations are a predecessor of the roof, but not an
immediate predecessor. Predecessors lengthen the project duration, so they are relevant for
estimating it.
Assume that the activities, their immediate predecessors and their time estimates are given
in Table 8. A further concept is the milestone, the earliest point of time (counted
from the beginning of the project) at which a specific activity may start, or the earliest point
of time at which the whole project is finished. Note that
- the milestone of activities with no predecessors is 0 (by the way that time is counted and
  because maximal parallelism is desired),
- the milestone of any other activity is the time at which each of its predecessors is
  finished,
- the milestone of the whole project is the outcome of the model.
In this project there are four milestones, listed in Fig 38. Note that the milestone of
Functional design and the milestone of Documentation are the same, as both may start as soon
as Basic specification is done.

Activity                        Immediate predecessors               Minimum time   Maximum time
Basic specification             none                                 2              4
Market research                 none                                 1              3
Functional design               Basic specification                  2              3
Documentation                   Basic specification                  4              5
Detailed design and prototype   Market research, Functional design   5              8

Table 8. Data of the activities comprising the project

[24] The problem dealt with in this section is called stochastic PERT in the project management literature.

Figure 38. The model of project duration and its simulation setting

Fig 38. also shows the simulation settings. The earliest time of milestone #1 is 0 and the
others are based on the formulas presented in Table 9 (a sketch of such formulas is given
after the table). For example, the earliest time of milestone #4 (the end of the project) is
when both Documentation (whose starting time is in cell B3) and Detailed design and prototype
(whose starting time is in cell B4) are done. Note that the completion of the other activities
is implicit in cells B3 and B4. As only bounds are available in Table 8, any value between
the bounds is taken with equal probability, which leads to formulas of the form
=BETAINV(RAND(),1,1,minimum,maximum) (see the end of Section 4 of Part II).
Milestone #   Earliest time
2
3
4

Table 9. The formulas for the project duration problem
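The formulas themselves appear in Fig 38; as a sketch of what they could look like (assuming,
purely for illustration, that milestones #1 to #4 sit in cells B2:B5 and that each activity
duration is drawn inline with the uniform BETAINV form mentioned above), one possibility is:
B2: 0                                                               milestone #1, start of the project
B3: =B2+BETAINV(RAND(),1,1,2,4)                                     after Basic specification
B4: =MAX(B2+BETAINV(RAND(),1,1,1,3), B3+BETAINV(RAND(),1,1,2,3))    after Market research and Functional design
B5: =MAX(B3+BETAINV(RAND(),1,1,4,5), B4+BETAINV(RAND(),1,1,5,8))    after Documentation and Detailed design and prototype; the outcome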


As a result, the true average is between 11.9 and 12 but any value approximately between
9.5 and 14.5 is possible.
3. Travel time estimation [25]
Consider the prediction of travel time under varying traffic conditions. Assume that for
each section of all the possible routes from origin to destination it is possible to give the
distribution of the time needed to traverse the section. The problem is illustrated by the
map of Fig 39, drawn as a graph. The origin is node A, the destination is node D, node C is an
intersection and the arrows are the sections. The minimal and maximal traverse times of the
sections are listed in Table 10.

Figure 39. Map of the travel time problem

Section   Minimum time   Maximum time
A->B      2              4
A->C      3              9
B->C      2              4
C->D      3              7

Table 10. Data for the travel time problem

Define the immediate predecessor nodes of a given node as the set of nodes from which there
is a section to it. A has no predecessors, nodes B and D each have a single predecessor,
while node C has two predecessor nodes: A and B.
The Excel sheet (Fig 40) calculates the shortest time from node A to each node. The
shortest time for node A is clearly 0. For the other nodes the shortest time is the minimum,
over the set of all immediate predecessor nodes, of the sum of the shortest time to the
predecessor and the traverse time from that predecessor to the node under consideration. The
actual formulas for this problem are given in Table 11 (a sketch of such formulas is given
after the table).
As a result, the true average is between 10.1 and 10.3, but any value approximately
between 6.5 and 13.5 is possible.

[25] The problem dealt with in this section is called the stochastic shortest route in the network algorithms
literature.

Figure 40. The model of travel time and its simulation setting

Town   Earliest time

Table 11. The formulas for the travel time problem
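As with Table 9, the exact formulas live in Fig 40; a sketch of what they could look like,
assuming (for illustration only) that the shortest times to nodes A, B, C and D sit in cells
B2:B5 and that traverse times are drawn with the uniform BETAINV form:
B2: 0                                                               node A, the origin
B3: =B2+BETAINV(RAND(),1,1,2,4)                                     node B, via A->B
B4: =MIN(B2+BETAINV(RAND(),1,1,3,9), B3+BETAINV(RAND(),1,1,2,4))    node C, the faster of A->C and A->B->C
B5: =B4+BETAINV(RAND(),1,1,3,7)                                     node D, via C->D; the outcome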


4. Reliability of systems
Consider a system consisting of two batteries and a camera. The system can work with a
single battery, but it cannot work if either the camera or both batteries have broken down. The
lifetimes of the three components are random, but the maximum and the most typical
lifetime of each are given in Table 12; the lower bound of the lifetimes is 0. The problem is
to find the lifetime of the whole system.
Clearly, the system's lifetime is the minimum of the camera's lifetime and the larger of the
two batteries' lifetimes.


Component   Maximum lifetime (hours)   Most typical lifetime (hours)
Camera      1300                       400
Battery A   900                        280
Battery B   900                        280

Table 12. The lifetimes of the system's components


Reasonable distributions for the lifetimes are found by the macro ChooseDistribution
without a sample [26], rounding the parameters to a single decimal place [27]. The sheet in
Fig. 41 holds the model; the formulas in column B are in Table 13 (a sketch of such formulas
is given after the table).

Figure 41. The model of reliability and its simulation setting


System   Lifetime

Table 13. The formulas for the reliability problem
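A sketch of what such formulas could look like, assuming the three component lifetimes are drawn
in B2:B4 (the alpha and beta values below are hypothetical placeholders, not the rounded
parameters actually produced by ChooseDistribution):
B2: =BETAINV(RAND(), 2.4, 4.2, 0, 1300)     Camera lifetime (hypothetical parameters)
B3: =BETAINV(RAND(), 2.4, 4.1, 0, 900)      Battery A lifetime (hypothetical parameters)
B4: =BETAINV(RAND(), 2.4, 4.1, 0, 900)      Battery B lifetime (hypothetical parameters)
B5: =MIN(B2, MAX(B3, B4))                   System lifetime: the camera and at least one battery must still work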


As a result, the true average is between 325 and 343 but any value approximately
between 80 and 640 is possible.
[26] See Section 5 of Part II.
[27] In lifetime analysis it is common to use not the beta distribution offered by ChooseDistribution but
rather the Weibull distribution, which has a theoretical justification that is beyond the scope of this tutorial.


Index

activity
American call
beta distribution
bin
bound: lower
bound: upper
call
ChooseDistribution macro
confidence limits
covariance matrix
Data Analysis package in Excel
distribution
distribution: symmetric
estimation
exponential distribution
gamma distribution
histogram
immediate predecessor
Kolmogorov-Smirnov test
K-S test
log-normal distribution
lower bound
macro
macro: ChooseDistribution
macro: Simulate
milestone
mode
model
Monte-Carlo
Monte-Carlo simulation
normal distribution
numerical outcome
observations
one-sided test
option
outcome
pdf
PERT
predecessor
probability density function
sample
scale parameter
scenario
security
shape parameter
shortest route
Simulate macro
static simulation
statistical distribution
statistical distribution: beta
statistical distribution: exponential
statistical distribution: gamma
statistical distribution: log-normal
statistical distribution: normal
statistical distribution: symmetric
statistical distribution: Student's t
statistical distribution: Weibull
statistical independence
statistical test: Kolmogorov-Smirnov
statistical test: K-S
statistical test: one-sided
statistical test: two-sided
stochastic
symmetric distribution
t distribution
trust
two-sided test
upper bound
Visual Basic Editor
Weibull distribution
