Illustration of using Excel to find maximum likelihood estimates
This sheet illustrates how to use Excel to find the maximum likelihood estimate in a Poisson model with equal-sized samples.
This is a simple Poisson model where we are interested in estimating the value of lambda
that maximizes the likelihood. The data consist of up to 5 counts, as noted below.
We will use the POISSON(x, mu, cumulative) function of Excel to return the probability of
each individual value. The cumulative argument is set to FALSE (or 0) to return the
individual probability, or set to TRUE (or 1) to return the cumulative probability, i.e. from 0 to x.
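For readers working outside Excel, the behaviour described above can be sketched in Python. The function name `poisson` is a hypothetical stand-in written for this illustration, not part of Excel or of any library; it only mirrors the two modes of the cumulative argument.

```python
import math

def poisson(x, mu, cumulative=False):
    """Hypothetical stand-in for Excel's POISSON(x, mu, cumulative)."""
    def pmf(k):
        # Probability of observing exactly k events when the mean is mu
        return math.exp(-mu) * mu ** k / math.factorial(k)
    if cumulative:
        # Cumulative probability: sum of the individual probabilities from 0 to x
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

print(poisson(4, 9.667))        # individual probability P(X = 4)
print(poisson(4, 9.667, True))  # cumulative probability P(X <= 4)
```

With cumulative left at FALSE, the first call should reproduce the individual contribution shown for the count of 4 in the sheet (about 0.023).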
Steps in constructing this spreadsheet:
(a) Create a section with the raw data (see light green below).
In this case, up to 5 values are accepted. If the sample size is <5, leave the unused cells blank.
(b) Create a cell that will hold the value of lambda (see blue below).
(c) CAUTION: compute the probability of each of the green values given the
value for lambda in blue. This is the most crucial step. Be sure
that each cell references the data and lambda value correctly.
I used the built-in POISSON function of Excel.
(d) Find the log(likelihood) for each data value. This is simply
the natural log of the values in (c).
(e) Find the total log-likelihood using the SUM function of Excel.
This is the value to be maximized (see red cell below).
You can maximize this value by manually changing the blue cell.
Alternatively, use Tools -> Solver from the drop-down menu
to maximize this cell by changing the blue cell.
(f) Create the individual contributions to the observed information matrix.
This is the negative of the second derivative of the log of the probability function with respect to lambda.
You will likely have to code this by hand. CAUTION: be sure
to refer to the correct lambda and data cells.
(g) Find the total observed information matrix.
(h) Invert the observed information matrix.
(i) Find the sqrt(variance covariance matrix) from the previous step.
Cell formula for step (c):
=IF(ISNUMBER(D49),POISSON(D$49,$C$52,0), )
Computes the probability if the data cell is not blank.
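The steps above can also be sketched outside Excel. The following is a minimal Python illustration, not the spreadsheet itself: the helper names `log_likelihood` and `maximize` are hypothetical, and a simple golden-section search is swapped in for the Excel Solver on the assumption that the log-likelihood has a single peak in the search interval.

```python
import math

counts = [4, 10, 15]  # raw data from the sheet (blank cells are simply omitted)

def log_likelihood(lam, data):
    """Steps (c)-(e): total of log P(X = x | lam) over the data."""
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in data)

def maximize(f, lo, hi, tol=1e-9):
    """Golden-section search for the maximizer of a single-peaked f on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            # The peak lies in [a, d]
            b, d = d, c
            c = b - g * (b - a)
        else:
            # The peak lies in [c, b]
            a, c = c, d
            d = a + g * (b - a)
    return (a + b) / 2

# Stand-in for Solver: change lambda (the "blue cell") to maximize the total
lam_hat = maximize(lambda lam: log_likelihood(lam, counts), 0.01, 50.0)

information = sum(x / lam_hat ** 2 for x in counts)  # steps (f) and (g)
variance = 1 / information                           # step (h)
se = math.sqrt(variance)                             # step (i)

print(lam_hat, log_likelihood(lam_hat, counts), information, variance, se)
```

The search interval 0.01 to 50 is an arbitrary choice for this data set; any interval containing the sample mean would do.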
Raw data                                4       10      15                      Totals
Lambda                                  9.667
Indiv contribution to likelihood        0.023   0.124   0.029   0.000   0.000
Indiv contribution to log-likelihood    -3.77   -2.08   -3.54   0       0       -9.390
Indiv contribution to obs information   0.043   0.107   0.161   0       0       0.310
Variance of estimate (inverse of obs information)   3.22222
SE(of estimator) = sqrt(variance)                   1.80
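Cell formulas used in the sheet:
=IF(D52>0,LN(D52),0)   Computes the log() of the probability.
=SUM(D55:H55)          Finds the total log(L).
=D$43/$C$46^2          Individual contribution to the observed information (data value divided by lambda squared).
=SUM(D57:H57)          Finds the total information.
=1/J57                 Inverse of the information.
=SQRT(J61)             Finally, the SE.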
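The hand-coded formula =D$43/$C$46^2 can be checked by differentiating the log of the Poisson probability twice with respect to lambda. The short derivation below is a sketch of that check; x denotes a single data value.

```latex
$$\log P(X = x \mid \lambda) = x \log\lambda - \lambda - \log x!$$
$$\frac{\partial}{\partial\lambda} \log P(X = x \mid \lambda) = \frac{x}{\lambda} - 1$$
$$-\frac{\partial^{2}}{\partial\lambda^{2}} \log P(X = x \mid \lambda) = \frac{x}{\lambda^{2}}$$
```

For the count of 4 at lambda = 9.667, this gives 4 / 9.667^2, or about 0.043, which agrees with the value in the sheet.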
Illustration of using Excel to find maximum likelihood estimates with different sized samples
This sheet illustrates how to use Excel to find the maximum likelihood estimate in a Poisson model with DIFFERENT sized samples.
This is a simple Poisson model where we are interested in estimating the value of lambda
that maximizes the likelihood. The data consist of up to 5 counts and the areas where the counts were obtained, as noted below.
We will use the POISSON(x, mu, cumulative) function of Excel to return the probability of
each individual value. The cumulative argument is set to FALSE (or 0) to return the
individual probability, or set to TRUE (or 1) to return the cumulative probability, i.e. from 0 to x.
Steps in constructing this spreadsheet:
(a) Create a section with the raw data (see light green below).
(b) Create a cell that will hold the value of lambda (see blue below).
(c) CAUTION: compute the probability of each of the green values given the
value for lambda in blue. This is the most crucial step. Be sure
that each cell references the data and lambda value correctly.
I used the built-in POISSON function of Excel. Note the use of the area adjustment.
(d) Find the log(likelihood) for each data value. This is simply
the natural log of the values in (c).
(e) Find the total log-likelihood using the SUM function of Excel.
This is the value to be maximized (see red cell below).
You can maximize this value by manually changing the blue cell.
Alternatively, use Tools -> Solver from the drop-down menu
to maximize this cell by changing the blue cell.
(f) Create the individual contributions to the observed information matrix.
This is the negative of the second derivative of the log of the probability function with respect to lambda.
You will likely have to code this by hand. CAUTION: be sure
to refer to the correct lambda and data cells.
(g) Find the total observed information matrix.
(h) Invert the observed information matrix.
(i) Find the sqrt(variance covariance matrix) from the previous step.
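A minimal Python sketch of the area adjustment follows. It assumes the adjustment means that each count has mean lambda times its area, which is the usual reading but is not spelled out in the sheet. The closed-form estimate (total count divided by total area) is swapped in for the Solver step; it comes from setting the derivative of the log-likelihood to zero under that assumption.

```python
import math

counts = [4, 10, 15]  # counts from the sheet
areas = [1, 1, 1]     # area in which each count was obtained

def log_likelihood(lam, counts, areas):
    """Total log P(X = x | mean = lam * area) over all observations."""
    total = 0.0
    for x, a in zip(counts, areas):
        mu = lam * a  # area adjustment (assumed form)
        total += x * math.log(mu) - mu - math.lgamma(x + 1)
    return total

# Closed form under the assumed model: total count divided by total area
lam_hat = sum(counts) / sum(areas)

# Negative second derivative of each log-probability with respect to lambda
information = sum(x / lam_hat ** 2 for x in counts)
variance = 1 / information
se = math.sqrt(variance)

print(lam_hat, log_likelihood(lam_hat, counts, areas), information, variance, se)
```

With all areas equal to 1, as in the sheet, the results coincide with the equal-sized example.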
Raw data
Counts                                  4       10      15                      Totals
Area                                    1       1       1
Lambda                                  9.667
Indiv contribution to likelihood        0.023   0.124   0.029   0.000   0.000
Indiv contribution to log-likelihood    -3.77   -2.08   -3.54   0       0       -9.390
Indiv contribution to obs information   0.043   0.107   0.161   0       0       0.310
Variance of estimate (inverse of obs information)   3.22222
SE(of estimator) = sqrt(variance)                   1.80
What is the probability of getting pregnant in a month?
Example of Maximum Likelihood Estimation with censored data.
For many couples, it is a joyful moment when they decide to try to
have children. However, not every couple becomes pregnant
on the first attempt to conceive a child, and it may take many months
before the woman becomes pregnant.
Fertility scientists are interested in estimating the probability of becoming
pregnant in a given month.
A sample of couples is enrolled in a study and each
couple records the number of months prior to becoming pregnant. Here are
the raw data:
2; 6; 5; 0; 0; 4; 0; 3; 10+
where the value of 2 indicates that the couple became pregnant in the
3rd month (i.e. there were two months where the pregnancy did not occur
PRIOR to becoming pregnant in the 3rd month). The value 10+ indicates
that it took longer than 10 months to get pregnant but the exact time is
unknown.
A common probability distribution to model this type of data is the geometric
distribution with parameter p representing the probability of becoming
pregnant in any month.
Let Y be the number of months prior to becoming pregnant. Then
P(Y = y | p) = (1-p)^y × p
i.e. there are y failures to get pregnant followed by a success.
For censored data recorded as y+, all that is known is that there were at least y months
without a pregnancy, and the probability of this is:
P(Y >= y | p) = (1-p)^y
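The two kinds of likelihood contribution can be sketched in Python as follows. This is an illustration under the assumption that a value recorded as 10+ contributes (1-p)^10; the closed-form estimate used in place of Solver (number of observed pregnancies divided by that number plus the total months without pregnancy) follows from setting the derivative of the log-likelihood to zero. The information terms are the negative second derivatives of each log-contribution, coded by hand.

```python
import math

observed = [2, 6, 5, 0, 0, 4, 0, 3]  # months without pregnancy, then a pregnancy
censored = [10]                      # the 10+ couple: at least 10 months, no pregnancy seen

def log_likelihood(p, observed, censored):
    """Total log-likelihood of the geometric model with censored values."""
    ll = sum(y * math.log(1 - p) + math.log(p) for y in observed)  # (1-p)^y * p
    ll += sum(y * math.log(1 - p) for y in censored)               # (1-p)^y
    return ll

# Closed form swapped in for Solver: pregnancies / (pregnancies + failed months)
p_hat = len(observed) / (len(observed) + sum(observed) + sum(censored))

# Negative second derivatives of the log-contributions
info_observed = [1 / p_hat ** 2 + y / (1 - p_hat) ** 2 for y in observed]
info_censored = [y / (1 - p_hat) ** 2 for y in censored]
information = sum(info_observed) + sum(info_censored)
variance = 1 / information
se = math.sqrt(variance)

print(p_hat, log_likelihood(p_hat, observed, censored), information, variance, se)
```

Note that the censored couple adds to the denominator of the estimate but not to the numerator, which is how the partial information in 10+ is used.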
Data                                    2       6       5       0       0       4       0       3       10+     Total
p                                       0.21
Indiv contribution to likelihood        0.131   0.051   0.065   0.211   0.211   0.082   0.211   0.104   0.094
Indiv contribution to log-likelihood    -2.03   -2.98   -2.74   -1.56   -1.56   -2.50   -1.56   -2.27   -2.36   -19.56
Indiv contribution to obs information   25.77   32.19   30.58   22.56   22.56   28.98   22.56   27.38   16.04   228.63
Variance of estimate (inverse of obs information)   0.004
SE(of estimator) = sqrt(variance)                   0.066
You may be curious to know that the 'accepted' value for the probability of becoming pregnant
in a month when attempting to become pregnant is about 25%. This
value was obtained using methods very similar to what was shown above.
For information on the current "state of the art" for these types of studies, see:
Scheike, T.H. and Keiding, N. (2006)
Design and analysis of time-to-pregnancy
Statistical Methods in Medical Research, 15, 127-140.
http://dx.doi.org/10.1191/0962280206sm435oa
Zero-truncated Poisson distribution
Example of an MLE without a closed-form solution.
The Poisson distribution is a popular distribution used to model smallish counts. In some cases,
only positive counts can be observed, i.e. you can't observe a 0 value. This is known as a zero-truncated Poisson distribution.
For example, the number of occupants in a car on a freeway can be closely modeled by a zero-truncated Poisson distribution
because you can't observe cars on the road with 0 occupants!
The probability distribution for the zero-truncated Poisson distribution is:
$$P(Y=y) = \frac{\lambda^{y} e^{-\lambda}}{y! \, (1-e^{-\lambda})}, \quad y = 1, 2, \ldots$$
Log-likelihood
$$\log L(\lambda) = \sum_{i=1}^{n} \left[ y_i \log\lambda - \lambda - \log y_i! \right] - n \log\left(1-e^{-\lambda}\right)$$
MLE is a solution to
$$\frac{\sum_{i=1}^{n} y_i}{\lambda} - n - \frac{n e^{-\lambda}}{1-e^{-\lambda}} = 0$$
The following data were observed for the number of occupants in cars on a freeway.
Data             2        3        2        3        1        1        2        Totals
Lambda           1.59362426
Probability      0.324    0.172    0.324    0.172    0.406    0.406    0.324    0.000    Likelihood
log              -1.128   -1.760   -1.128   -1.760   -0.900   -0.900   -1.128   -8.704   log-likelihood
9.3535E-12
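Because there is no closed-form solution, the estimate has to be found numerically. The sketch below is one way to do this in Python and is not the method used in the sheet: it applies a plain bisection search to the derivative of the log-likelihood, derived here from the zero-truncated Poisson probability function. The helper names `log_likelihood`, `score` and `bisect` are hypothetical.

```python
import math

occupants = [2, 3, 2, 3, 1, 1, 2]  # observed number of occupants per car

def log_likelihood(lam, data):
    """Total log-likelihood of the zero-truncated Poisson model."""
    n = len(data)
    poisson_part = sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in data)
    return poisson_part - n * math.log(1 - math.exp(-lam))

def score(lam, data):
    """Derivative of the log-likelihood with respect to lambda."""
    n = len(data)
    return sum(data) / lam - n - n * math.exp(-lam) / (1 - math.exp(-lam))

def bisect(f, lo, hi, tol=1e-12):
    """Bisection root finder; assumes f changes sign exactly once on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root is in the upper half
        else:
            hi = mid               # root is in the lower half
    return (lo + hi) / 2

lam_hat = bisect(lambda lam: score(lam, occupants), 0.01, 20.0)
print(lam_hat, log_likelihood(lam_hat, occupants), score(lam_hat, occupants))
```

The bracketing interval 0.01 to 20 is an arbitrary choice that works for this data set; the derivative is positive at the lower end and negative at the upper end.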
Illustration of simple Cormack-Jolly-Seber experiment
Parameter
phi 0.76018747
p 0.45300781
History   Count   Probability   log-likelihood
100       501     0.51243401    -334.96025
101       140     0.14319513    -272.09659
110       250     0.22577957    -372.04902
111       109     0.11859129    -232.39587
Total     1000    1             -1211.5017
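The probabilities in the table can be reproduced with a short Python sketch. It rests on assumptions that the sheet does not state but that the numbers are consistent with: three sampling occasions, all animals released at occasion 1, and a single survival probability phi and capture probability p for every interval and occasion. The function name `history_probabilities` is hypothetical.

```python
import math

phi, p = 0.76018747, 0.45300781  # parameter values shown in the sheet
counts = {"100": 501, "101": 140, "110": 250, "111": 109}

def history_probabilities(phi, p):
    """Probability of each capture history after release at occasion 1 (assumed model)."""
    seen = phi * p          # survive the interval and get captured
    missed = phi * (1 - p)  # survive the interval but not captured
    return {
        "111": seen * seen,
        "110": seen * (1 - seen),         # never seen again after occasion 2
        "101": missed * seen,
        "100": 1 - seen - missed * seen,  # never seen again after release
    }

probs = history_probabilities(phi, p)

# Each history contributes count * log(probability) to the log-likelihood
contributions = {h: n * math.log(probs[h]) for h, n in counts.items()}
total_log_likelihood = sum(contributions.values())

for h in sorted(counts):
    print(h, counts[h], probs[h], contributions[h])
print("Total", sum(counts.values()), sum(probs.values()), total_log_likelihood)
```

In a spreadsheet, the total would be the cell handed to Solver, with the phi and p cells as the values to change.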