
Master in Finance (UC3M)

FINANCIAL STATISTICS
Laboratory 3

In this session we are going to study the importance of the bootstrap in financial
econometrics. We are going to learn how to apply this technique to both independent
and dependent data. With very simple examples we will see how powerful it is, not
only for parameter estimation but also for estimating forecast densities.
The contents for this session are:
Bootstrap for independent data
o Example: bootstrapping the sample mean of an unknown population
o Example: bootstrapping the correlation coefficient
Bootstrap for dependent data
o Example: bootstrapping the parameters of an AR(1) process


Bootstrap
The bootstrap is a very general resampling method introduced by Efron (1979) that can
be used to approximate either the distribution of statistics of interest or their
standard deviation. Originally it was designed for independent data, but it was soon
applied to dependent data such as financial time series. In particular, the bootstrap has
been useful in financial econometrics for estimating the distribution of parameter
estimates, predicting the distribution of future returns or volatilities, etc.





The basic idea of the bootstrap in the case of independent data can be represented
graphically as below (taken from Efron, 1993).

[Figure: Schematic of the bootstrap for independent data]
$s(\cdot)$ is the statistic of interest and $s(x^{*b})$ is its value computed from a bootstrap
sample. The bootstrap samples are obtained by resampling with replacement from the
original data.
At the end we have a collection of $B$ bootstrap samples, $x^{*b}$ for $b = 1, 2, \dots, B$, and for each
bootstrap sample we compute $s(x^{*b})$, so that we can approximate the distribution of $s(\cdot)$
or its standard error:
$$\widehat{se}_B(s) = \sqrt{\frac{\sum_{b=1}^{B}\left(s(x^{*b}) - \bar{s}^{*}\right)^2}{B-1}},
\qquad \bar{s}^{*} = \frac{1}{B}\sum_{b=1}^{B} s(x^{*b}).$$
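
A minimal MATLAB sketch of this scheme is given below. The data vector and the statistic are
just placeholders (here the sample mean), and unidrnd requires the Statistics Toolbox:

%%%Minimal sketch of the generic bootstrap (placeholder data and statistic)
x = [22, 15, 20, 3, 18, 8, 26, 28];   %%%any data vector
s = @(z) mean(z);                     %%%the statistic of interest s(.)
B = 500;                              %%%number of bootstrap samples
n = length(x);
s_boot = zeros(B,1);
for b = 1:B
    idx = unidrnd(n, 1, n);           %%%resample indexes with replacement
    s_boot(b) = s(x(idx));            %%%statistic on the bootstrap sample
end
se_boot = sqrt(sum((s_boot - mean(s_boot)).^2)/(B-1))  %%%bootstrap standard error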





Example 1: mean of an unknown population
We have a sample of size n = 8 from a population of interest:
x = (22, 15, 20, 3, 18, 8, 26, 28).
We compute the sample mean:
$$\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t = 17.5$$
and a measure of dispersion of our estimator, which is given by
$$\widehat{se}(\bar{x}) = \sqrt{\frac{\hat{\sigma}^2}{n}} = \sqrt{9.21} \approx 3.04$$
where
$$\hat{\sigma}^2 = \frac{\sum_{t=1}^{n}(x_t - \bar{x})^2}{n-1} = 73.71.$$
We also know the asymptotic result:
$$\bar{x} \approx N\!\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{as } n \to \infty.$$
But in this case the population is unknown and the sample size is small, so the CLT
may give a poor approximation of the sampling distribution of the estimator.
An alternative is to use the bootstrap to approximate the small-sample distribution of $\bar{x}$.
Note the importance of having a good estimate of the sampling distribution of the
estimator if we need to compute confidence intervals for, as in this case, the sample
mean. For small sample sizes the CLT can give misleading results.
%%%Example 1: mean of unknown population
x=[22, 15, 20, 3, 18, 8, 26, 28];
hist(x)
title('Histogram of x')

n=length(x) %%%sample size
x_mean=mean(x)
x_var=var(x)
x_mean_var=x_var/n



%%%Asymptotic result: x_mean ~ N(x_mean, s2/n)
xx=linspace(x_mean-4*sqrt(x_mean_var), x_mean+4*sqrt(x_mean_var), 100);
fx_mean_asym=normpdf(xx, x_mean, sqrt(x_mean_var));




[Figure: Histogram of x]

The distribution behind these data does not look normal.

B=500; %%number of bootstrap samples
index=unidrnd(n, n, B); %%%generates an n-by-B matrix of indexes
%%%The indexes are independent random draws from a discrete uniform
%%%distribution taking integer values from 1 to n (the sample size), i.e.
%%%each observation in the sample has the same probability (1/n) of being
%%%selected.
%%%Next we construct the bootstrap samples from x by using the matrix index
bootstrap_samples=zeros(n, B);

for j=1:B
for i=1:n
bootstrap_samples(i,j)=x(index(i,j));
end


end
%%%we compute the mean for each of the samples
mean_x_boot=mean(bootstrap_samples)';
[fx_boot, x_boot]=ksdensity(mean_x_boot);

plot(x_boot, fx_boot, 'r')
hold on
plot(xx, fx_mean_asym,'k')
hold off
title('Asymptotic and bootstrap density for xmean')
legend('Bootstrap', 'Asymptotic')


"s the sample size goes to infinity the bootstrap distribution will converge to the
asymptotic one given by the -.T.
In this case* confidence intervals computed using the asymptotic distribution and the
bootstrap one will be pretty similar. 1owever* when having data with small sample
sizes* it is recommended to use the bootstrap distribution.
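
A quick sketch of this comparison, reusing the variables x_mean, x_mean_var and
mean_x_boot computed above (the 95% level is illustrative):

%%%Sketch: asymptotic vs bootstrap 95% intervals for the mean
ci_asym = [x_mean - 1.96*sqrt(x_mean_var), x_mean + 1.96*sqrt(x_mean_var)]
ci_boot = prctile(mean_x_boot, [2.5, 97.5])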




[Figure: Asymptotic and bootstrap densities for the sample mean]


Example 2: Bootstrapping the correlation coefficient
Suppose we have a sample of size n = 100 consisting of two variables:
$$X = \begin{pmatrix} 0.25 & 0.14 \\ 0.19 & 0.85 \\ \vdots & \vdots \\ 3.16 & 2.01 \end{pmatrix}.$$
We are interested in the distribution of $\hat{\rho}$ (the correlation coefficient).



The estimates of $\mu$ and $\Sigma$ are
$$\bar{x} = (0.06,\; 0.09) \quad \text{and} \quad
\hat{\Sigma} = \begin{pmatrix} 6.25 & 2.14 \\ 2.14 & 1.09 \end{pmatrix}.$$
Therefore:
$$\hat{\rho} = \frac{\widehat{cov}(x_1, x_2)}{\hat{\sigma}_1 \hat{\sigma}_2}
= \frac{2.14}{\sqrt{6.25}\,\sqrt{1.09}} = 0.82.$$

How can we obtain the sampling distribution of $\hat{\rho}$? We might rely on asymptotic results,
which are based on strong assumptions. However, an alternative is to use a bootstrap
procedure.
[Figure: Scatter plot (X_1, X_2)]


%%%Example 2: Bootstrapping the correlation coefficient

X=mvnrnd([0 0],[6 2;2 1],100);

scatter(X(:,1), X(:,2))
title('Scatter plot (X_1, X_2)')
xlabel('X_1')
ylabel('X_2')

n=length(X) %%%sample size
X_mean=mean(X)
X_cov=cov(X)
X_corr=corr(X)


B=500; %%number of bootstrap samples
index=unidrnd(n, n, B); %%%generates an n-by-B matrix of indexes
%%%The indexes are independent draws from a discrete uniform distribution
%%%on the integers 1 to n (the sample size).

%%%Next we construct the bootstrap samples from X by using the matrix index
X_boot=zeros(n, 2, B);

for b=1:B
for i=1:n
X_boot(i,:,b)=X(index(i,b), :);
end
end

%%%Let's plot four bootstrap samples to see graphically whether they
%%%reproduce the linear relationship
subplot(2,2,1);
scatter(X_boot(:,1,400), X_boot(:,2,400), 'rx')
title('Bootstrap sample 400')


subplot(2,2,2); scatter(X_boot(:,1,311), X_boot(:,2,311), 'bs')
title('Bootstrap sample 311')
subplot(2,2,3); scatter(X_boot(:,1,234), X_boot(:,2,234), 'k')
title('Bootstrap sample 234')
subplot(2,2,4); scatter(X_boot(:,1,156), X_boot(:,2,156), 'g')
title('Bootstrap sample 156')



corr_boot=zeros(2,2,B); %%%2x2xB array of zeros
%%%Next we compute the sample correlation matrix for each of the
%%%bootstrap samples

for b=1:B
corr_boot(:,:,b)=corr(X_boot(:,:,b));
end

rho_boot(:,1)=corr_boot(1,2,:);
%%%We compute the mean and variance obtained from the bootstrap
%%%distribution
mean_rho_boot=mean(rho_boot);
var_rho_boot=var(rho_boot);
[Figure: Scatter plots of bootstrap samples 400, 311, 234 and 156]



%%%Finally we plot the bootstrap distribution of rho
[f_rho, x_rhor]=ksdensity(rho_boot);
subplot(2,1,1)
hist(rho_boot, 30)
title('Histogram of \rho')
subplot(2,1,2)
plot(x_rhor, f_rho, 'k--')
title('Bootstrap smoothed density estimate of \rho')

%%%This distribution can be used to compute a confidence interval for rho
ci_rho=prctile(rho_boot, [2.5, 97.5])
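
As an aside, if the Statistics Toolbox is available, the same bootstrap distribution of the
correlation coefficient can also be obtained with the built-in bootstrp function instead of
the explicit loops above (a sketch, not part of the original lab code):

%%%Alternative sketch using the built-in bootstrp function
rho_boot2 = bootstrp(B, @(Z) corr(Z(:,1), Z(:,2)), X);  %%%resamples rows of X
ci_rho2 = prctile(rho_boot2, [2.5, 97.5])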

The bootstrap approximations of the unknown $E(\hat{\rho})$ and $Var(\hat{\rho})$ are:
$$\hat{E}^{*}(\hat{\rho}) = \bar{\rho}^{*} = \frac{1}{B}\sum_{b=1}^{B}\hat{\rho}^{*}_{b} = 0.8178$$
$$\widehat{Var}^{*}(\hat{\rho}) = \frac{1}{B}\sum_{b=1}^{B}\left(\hat{\rho}^{*}_{b} - \bar{\rho}^{*}\right)^{2} = 0.0012$$
And a confidence interval for $\rho$ can be constructed as:
$$CI(\rho) = [L, U] = \left[q^{*}_{\alpha/2},\; q^{*}_{1-\alpha/2}\right] = [0.7415,\; 0.8740]$$
where $q^{*}(\cdot)$ denotes a quantile of the distribution approximated by $\hat{\rho}^{*}_{b}$, for $b = 1, 2, \dots, B$.











[Figure: Histogram and bootstrap smoothed density estimate of rho]


Bootstrap for dependent data
Dependent data
When dealing with dependent data there are different approaches to applying the
bootstrap. One of the most widely used is the model-based bootstrap. The idea is to first fit a
model to the series in order to capture the dependence. Once we have it, we resample
the residuals to construct bootstrap samples of the process.
Example AR(1)
Suppose we have a time series $y_t$, for $t = 1, 2, \dots, T$, which follows an AR(1):
$$y_t = c + \phi\, y_{t-1} + a_t$$
where $a_t \sim WN(0, \sigma_a^2)$.


We are usually interested in the parameters of the model: $c$ and $\phi$. In practice we do
not know the true $c$ and $\phi$, so we use the series to estimate them. If the
sample size is large enough, asymptotic results may be used to approximate the
distributions of $\hat{c}$ and $\hat{\phi}$, so that different tests can be performed. However, if
the sample size is small, the asymptotic results may give misleading conclusions about $c$
and $\phi$. In this case the use of a bootstrap procedure is advisable.
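
Note that ARMA_SIMU used in the code below is a course-specific helper function. If it is
not available, an AR(1) series can be simulated directly; a minimal sketch with illustrative
values c = 1, phi = 0.8 and standard normal innovations (these are assumptions, not
necessarily the values used by ARMA_SIMU):

%%%Sketch: simulate y_t = c + phi*y_{t-1} + a_t directly (illustrative values)
T = 250; c = 1; phi = 0.8;
a = randn(T,1);               %%%white noise a_t ~ N(0,1)
y = zeros(T,1);
y(1) = c/(1-phi);             %%%start at the marginal mean
for t = 2:T
    y(t) = c + phi*y(t-1) + a(t);
end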
[Figure: y series]




%%%Example: dependent data

y=ARMA_SIMU(250,0.8,[],1,1);

plot(y)
title('y series')

%%%We estimate by Least Squares (LS)
%%%For an autoregressive model this is equivalent to (conditional) maximum likelihood (ML)
%%%We use the function regress

% help regress

%%%Estimation by OLS
[B,Binterv,Residuals]=regress(y(2:end), [ones(length(y(2:end)), 1), y(1:end-1)]);
%%%Estimated parameters
c_hat=B(1)
phi_hat=B(2)
%%%Residuals of the regression
a_hat=Residuals;
marginal_mean=c_hat/(1-phi_hat);









To get an approximation of the distribution of $\hat{c}$ and $\hat{\phi}$ we need to generate bootstrap
replicates (samples) of the series. To do that, we proceed as follows:
1. First we estimate the AR(1) for the series $y_t$ and obtain $\hat{c}$ and $\hat{\phi}$.
The residuals of the model are calculated by
$$\hat{a}_t = y_t - (\hat{c} + \hat{\phi}\, y_{t-1})$$
2. We take a random draw of size $T$, with replacement, from the residuals of the model:
$\hat{a}^{*}_t$, for $t = 1, 2, \dots, T$.
3. We construct a bootstrap sample by using the AR(1) model with the estimated
parameters and the resampled residuals:
$$y^{*}_t = \hat{c} + \hat{\phi}\, y^{*}_{t-1} + \hat{a}^{*}_t$$
Note: as the initial condition we may use the first observation or the marginal mean.
4. We use the bootstrap replicate of the process (bootstrap sample) to estimate the
parameters, $\hat{c}^{*}$ and $\hat{\phi}^{*}$.
We repeat these steps $B$ times, so that at the end we have $\hat{c}^{*}_b$ and $\hat{\phi}^{*}_b$, for $b = 1, 2, \dots, B$,
which can be used to approximate the distribution of the parameters $\hat{c}$ and $\hat{\phi}$.














%%%Matrix of indexes
B=1000;
index=unidrnd(length(y)-1, length(y), B); %%%generates a matrix of
%%%dimension length(y)-by-B containing indexes

%%%Bootstrap replicates
y_boot=zeros(length(y), B); %each column will be a bootstrap replicate
%%%We set the first value of each bootstrap replicate to the marginal mean
y_boot(1,:)=marginal_mean;

for b=1:B
for t=2:length(y)
y_boot(t,b)=c_hat+phi_hat*y_boot(t-1,b)+a_hat(index(t,b));
end
end

plot(y_boot) %%%all the bootstrap replications
plot(y_boot(:, 100:110)) %%%only replications 100 to 110

%%%Now we estimate the bootstrap parameters
[Figure: Ten bootstrap replications of the process]


%%%For each bootstrap sample we will have one set of parameter estimates
c_boot=zeros(B,1);
phi_boot=zeros(B,1);

for b=1:B
[B_boot,Binterv_boot,Residuals_boot]=regress(y_boot(2:end, b), [ones(length(y_boot(2:end, b)), 1), y_boot(1:end-1, b)]);

c_boot(b)=B_boot(1);
phi_boot(b)=B_boot(2);
end

%%%We can compute the mean and the variance of c_boot and phi_boot
mean_c=mean(c_boot)
var_c=var(c_boot)
subplot(2,1,1)
hist(c_boot, 30)
title('Bootstrap distribution of c')

mean_phi=mean(phi_boot)
var_phi=var(phi_boot)
subplot(2,1,2)
hist(phi_boot, 30)
title('Bootstrap distribution of \phi')

%%%These distributions can be used to construct confidence intervals for
%%%the parameters using the quantiles

ci_c=prctile(c_boot, [2.5, 97.5])
ci_phi=prctile(phi_boot, [2.5, 97.5])

"s before* we may construct confidence intervals for the parameters of interest. 0or
instance* a u% confidence interval for is given by
CI() = |I, u] = jq

[
o
2
, q

[1
o
2
[ =


Master in Finance (Uc3m)


16
| u.6u91, u.8796]
and for c by
CI() = |I, u] = jq

[
o
2
, q

[1
o
2
[ =
| u.6S86, 2.1996]
when o% = 1 9S%.


[Figure: Bootstrap distributions of c and phi]
