FINANCIAL STATISTICS
Laboratory 3
In this session we study the role of the bootstrap in financial econometrics.
We learn how to apply this technique to both independent and dependent data. Simple examples show that it is useful not only for parameter estimation but also for estimating forecast densities.
The contents for this session are:
Bootstrap for independent data
o Example: bootstrapping the sample mean of an unknown population
o Example: bootstrapping the correlation coefficient
Bootstrap for dependent data
o Example: bootstrapping the parameters of an AR(1) process
Bootstrap
The bootstrap is a very general resampling method introduced by Efron (1979) that can be used to approximate either the distribution of a statistic of interest or its standard deviation. Originally, it was designed for independent data, but it was soon applied to dependent data such as financial time series. In particular, the bootstrap has been useful in financial econometrics for estimating the distribution of parameters, predicting the distribution of future returns or volatilities, etc.
Master in Finance (Uc3m)
The basic idea of the bootstrap in the case of independent data can be represented graphically as below (taken from Efron, 199?).
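$s(\cdot)$ is the statistic of interest and $s(x^{*b})$ is its value on the $b$-th bootstrap sample $x^{*b}$. The bootstrap estimate of the standard error is
$$\widehat{se}_B(s) = \sqrt{\frac{\sum_{b=1}^{B}\left(s(x^{*b}) - \bar{s}(x^{*})\right)^2}{B-1}}, \qquad \bar{s}(x^{*}) = \frac{1}{B}\sum_{b=1}^{B} s(x^{*b}).$$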
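The standard error formula above can be sketched in a few lines of MATLAB. This is only an illustration: the sample is the one used in Example 1, while the choice of the median as the statistic, the seed, the value of B and the names `s_boot`, `s_bar` and `se_B` are assumptions made here. It uses `randi` from base MATLAB, so no toolbox is needed.

```matlab
% Minimal sketch: bootstrap standard error of a generic statistic s(.).
% The statistic (median), seed and B are illustrative assumptions.
rng(0);                              % fix the seed for reproducibility
x = [22, 15, 20, 3, 18, 8, 26, 28];  % sample used in Example 1
s = @median;                         % statistic of interest s(.)
B = 1000;                            % number of bootstrap samples
n = numel(x);

s_boot = zeros(B, 1);                % will hold s(x^{*b}), b = 1,...,B
for b = 1:B
    idx = randi(n, 1, n);            % n indices drawn with replacement from 1..n
    s_boot(b) = s(x(idx));           % statistic on the b-th bootstrap sample
end

s_bar = mean(s_boot);                              % average of the replicates
se_B  = sqrt(sum((s_boot - s_bar).^2) / (B - 1));  % bootstrap standard error
fprintf('Bootstrap SE of the median: %.3f\n', se_B);
```

Replacing `@median` by any other function handle that maps a vector to a scalar gives the bootstrap standard error of that statistic.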
Example 1: mean of unknown population
We have a sample of size $n = 8$ from a population of interest: $x = (22, 15, 20, 3, 18, 8, 26, 28)$.
We compute the sample mean:
$$\bar{x} = \frac{\sum_{t=1}^{n} x_t}{n} = 17.5$$
and a measure of dispersion of our estimator, which is given by
$$\widehat{se}(\bar{x}) = \sqrt{\frac{\hat{\sigma}^2}{n}} = \sqrt{9.21} \approx 3.04$$
where $\hat{\sigma}^2 = \frac{\sum_{t=1}^{n}(x_t - \bar{x})^2}{n-1} = 73.71$.
We also know the asymptotic result:
$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
approximately, as $n$ gets bigger and bigger ($n \to \infty$).
But in this case the population is unknown and the sample size is small, so the CLT may give a poor approximation of the sampling distribution of the estimator.
An alternative is to use the bootstrap to approximate the small-sample distribution of $\bar{x}$.
Note the importance of having a good estimate of the sampling distribution of the estimator if we need to compute confidence intervals, as in this case, for the population mean. For small sample sizes the CLT can give misleading results.
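As a point of reference, the CLT-based interval for this sample can be computed directly. The sketch below assumes a 95% level and uses the rounded normal quantile 1.96; the names `x_bar`, `se` and `ci_asym` are illustrative.

```matlab
% Sketch: asymptotic (CLT-based) 95% confidence interval for the mean.
x = [22, 15, 20, 3, 18, 8, 26, 28];  % sample from Example 1
n = numel(x);
x_bar = mean(x);                     % sample mean, 17.5
se    = std(x) / sqrt(n);            % sqrt(73.71/8), about 3.04
z     = 1.96;                        % rounded 97.5% quantile of N(0,1)
ci_asym = [x_bar - z*se, x_bar + z*se];
fprintf('Asymptotic 95%% CI: [%.2f, %.2f]\n', ci_asym(1), ci_asym(2));
```

With these numbers the interval is roughly [11.55, 23.45]. It is symmetric around the sample mean by construction, which is exactly the feature that may be inappropriate when n is as small as 8.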
Example 1: mean of unknown population
%%%Example 1: mean of unknown population
x=[22, 15, 20, 3, 18, 8, 26, 28];
hist(x)
title('Histogram of x')
n=length(x) %%%sample size
x_mean=mean(x)
x_var=var(x)
x_mean_var=x_var/n
%%%Asymptotic result: x_mean ~ N(mu, sigma^2/n), estimated by N(x_mean, s^2/n)
xx=linspace(x_mean-4*sqrt(x_mean_var), ...
    x_mean+4*sqrt(x_mean_var), 100);
fx_mean_asym=normpdf(xx,x_mean, sqrt(x_mean_var));
The distribution function behind this data does not look normal.
B=500; %%% number of bootstrap samples
%%% index is an n-by-B matrix of indices. Each entry is an independent
%%% draw from a discrete uniform distribution on 1,...,n (the sample
%%% size), i.e. each observation in the sample has the same probability
%%% (1/n) of being selected.
index=unidrnd(n, n, B);
%%% Next we construct the bootstrap samples from x by using the matrix
%%% index: column j holds the j-th bootstrap sample.
bootstrap_samples=zeros(n, B);
for j=1:B
    for i=1:n
        bootstrap_samples(i,j)=x(index(i,j));
    end
end
%%%we compute the mean for each of the samples
mean_x_boot=mean(bootstrap_samples)';
[fx_boot, x_boot]=ksdensity(mean_x_boot);
plot(x_boot, fx_boot, 'r')
hold on
plot(xx, fx_mean_asym,'k')
hold off
title('Asymptotic and bootstrap density for xmean')
legend('Bootstrap', 'Asymptotic')
As the sample size goes to infinity, the bootstrap distribution converges to the asymptotic one given by the CLT.
In that case, confidence intervals computed using the asymptotic distribution and the bootstrap one will be very similar. However, with small samples, it is recommended to use the bootstrap distribution.
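To make the comparison concrete, the following sketch computes a bootstrap percentile interval and the asymptotic interval for the same sample. It is a hedged illustration: the seed, B = 2000, the 95% level and the simple sorted-replicate quantile rule are assumptions made here, and `randi` replaces `unidrnd` so that only base MATLAB is needed.

```matlab
% Sketch: bootstrap percentile interval versus asymptotic interval for the mean.
rng(1);                                  % seed chosen arbitrarily
x = [22, 15, 20, 3, 18, 8, 26, 28];      % sample from Example 1
n = numel(x);
B = 2000;                                % number of bootstrap samples (assumed)

index = randi(n, n, B);                  % n-by-B indices drawn with replacement
x_boot = x(index);                       % n-by-B, each column is a bootstrap sample
mean_boot = mean(x_boot, 1)';            % B-by-1 vector of bootstrap means

% Empirical quantiles read off the sorted replicates (no interpolation)
alpha = 0.05;
sorted_means = sort(mean_boot);
ci_boot = [sorted_means(round(alpha/2 * B)), ...
           sorted_means(round((1 - alpha/2) * B))];

% Asymptotic interval based on the CLT, for comparison
ci_asym = mean(x) + [-1, 1] * 1.96 * std(x) / sqrt(n);

fprintf('Bootstrap  95%% CI: [%.2f, %.2f]\n', ci_boot(1), ci_boot(2));
fprintf('Asymptotic 95%% CI: [%.2f, %.2f]\n', ci_asym(1), ci_asym(2));
```

Indexing `x` with the whole matrix `index` builds all bootstrap samples at once and is equivalent to the double loop used in the listing above. `prctile` interpolates between order statistics, so its endpoints may differ slightly from the ones obtained here.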
[Figure: Asymptotic and bootstrap density for xmean. Curves: Bootstrap, Asymptotic.]
Example 2: Bootstrapping the correlation coefficient
Suppose we have a data set of size $n = ?0$ which consists of two variables:
$$X = \begin{pmatrix} 0.2? & 0.14 \\ 0.19 & 0.8? \\ \vdots & \vdots \\ ?.16 & 2.01 \end{pmatrix}.$$
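We are interested in the distribution of $\hat{\rho}$ (the sample correlation coefficient).
The estimates of $\mu$ and $\Sigma$ are
$$\bar{x} = (0.06, 0.09) \quad \text{and} \quad \hat{\Sigma} = \begin{pmatrix} 6.2? & 2.14 \\ 2.14 & 1.09 \end{pmatrix}.$$
Therefore:
$$\hat{\rho} = \frac{\widehat{cov}(x_1, x_2)}{\hat{\sigma}_1 \hat{\sigma}_2} = \frac{2.14}{\sqrt{6.2?}\,\sqrt{1.09}} = 0.82$$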
How can we obtain the sampling distribution of $\hat{\rho}$? We might rely on asymptotic results, which are based on strong assumptions. An alternative is to use a bootstrap procedure.
[Figure: Scatter plot (X_1, X_2). Horizontal axis: X_1; vertical axis: X_2.]
%%%Example 2: Bootstrapping the correlation coefficient
X=mvnrnd([0 0],[6 2;2 1],100);
scatter(X(:,1), X(:,2))
title('Scatter plot (X_1, X_2)')
xlabel('X_1')
ylabel('X_2')
n=size(X,1) %%%sample size (number of rows of X)
X_mean=mean(X)
X_cov=cov(X)
X_corr=corr(X)
B=500; %%number of bootstrap samples
index=unidrnd(n, n, B); %%% n-by-B matrix of indices
%%% Each index is an independent draw from a discrete uniform
%%% distribution on the integers from 1 to n (the number of rows of X).
%%% Next we construct the bootstrap samples from X by using the matrix
%%% index. Whole rows are resampled, so each pair (X_1, X_2) stays together.
X_boot=zeros(n, 2, B);
for b=1:B
for i=1:n
X_boot(i,:,b)=X(index(i,b), :);
end
end
%%%let's plot four bootstrap samples in order to see graphically if they
%%%reproduce the linear relationship
subplot(2,2,1);
scatter(X_boot(:,1,400), X_boot(:,2,400), 'rx')
title('Bootstrap sample 400')
subplot(2,2,2);
scatter(X_boot(:,1,311), X_boot(:,2,311), 'bs')
title('Bootstrap sample 311')
subplot(2,2,3);
scatter(X_boot(:,1,234), X_boot(:,2,234), 'k')
title('Bootstrap sample 234')
subplot(2,2,4);
scatter(X_boot(:,1,156), X_boot(:,2,156), 'g')
title('Bootstrap sample 156')
corr_boot=zeros(2,2,B); %%%zero 2x2xB array
%%% next we compute the sample correlation matrix for each of the
%%% bootstrap samples
for b=1:B
corr_boot(:,:,b)=corr(X_boot(:,:, b));
end
rho_boot=squeeze(corr_boot(1,2,:)); %%% B-by-1 vector of bootstrap correlations
%%%we compute the mean and variance of the bootstrap distribution
mean_rho_boot=mean(rho_boot);
var_rho_boot=var(rho_boot);
[Figure: scatter plots of four bootstrap samples. Panels: Bootstrap sample 400, Bootstrap sample 311, Bootstrap sample 234, Bootstrap sample 156.]
%%%finally we plot the bootstrap distribution of rho
[f_rho, x_rhor]=ksdensity(rho_boot);
subplot(2,1,1)
hist(rho_boot, 30)
title('Histogram of \rho')
subplot(2,1,2)
plot(x_rhor, f_rho, 'k--')
title('Bootstrap smoothed density estimate of \rho')
%%%this distribution can be used to compute a confidence interval for rho
ci_rho=prctile(rho_boot, [2.5, 97.5])
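For readers without the Statistics Toolbox functions used above (`mvnrnd`, `unidrnd`, `corr`, `prctile`), the same pairs bootstrap can be sketched with base MATLAB only. The simulated data, the seed, B = 1000 and the names `rho_hat`, `E_rho` and `Var_rho` are assumptions for illustration; the covariance matrix mirrors the one passed to `mvnrnd` above.

```matlab
% Sketch: pairs bootstrap of the correlation coefficient with base MATLAB.
rng(2);                                % seed chosen arbitrarily
n = 100;                               % sample size, as in the listing above
Sigma = [6 2; 2 1];                    % covariance used to simulate the data
X = randn(n, 2) * chol(Sigma);         % rows are draws from N(0, Sigma)

R = corrcoef(X);
rho_hat = R(1, 2);                     % sample correlation of the original data

B = 1000;                              % number of bootstrap samples (assumed)
rho_boot = zeros(B, 1);
for b = 1:B
    idx = randi(n, n, 1);              % resample row indices with replacement
    Rb = corrcoef(X(idx, :));          % whole rows are kept together
    rho_boot(b) = Rb(1, 2);
end

E_rho   = sum(rho_boot) / B;                   % bootstrap mean of rho
Var_rho = sum((rho_boot - E_rho).^2) / B;      % bootstrap variance (divisor B)
fprintf('rho_hat = %.4f, E = %.4f, Var = %.4f\n', rho_hat, E_rho, Var_rho);
```

The key design choice is that rows of `X` are resampled jointly; resampling each column separately would destroy the dependence between the two variables. Note that `Var_rho` divides by B, whereas MATLAB's `var` divides by B-1 by default.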
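The bootstrap approximations of the unknown $E(\hat{\rho})$ and $Var(\hat{\rho})$ are:
$$\hat{E}(\hat{\rho}) = \bar{\rho}^{*} = \frac{\sum_{b=1}^{B} \hat{\rho}^{*b}}{B} = 0.8178$$
$$\widehat{Var}(\hat{\rho}) = \frac{\sum_{b=1}^{B} \left(\hat{\rho}^{*b} - \bar{\rho}^{*}\right)^2}{B} = 0.0012$$
And a confidence interval for $\rho$ can be constructed as:
$$CI(\rho) = [l, u] = \left[q_{[\alpha/2]},\, q_{[1-\alpha/2]}\right] = [0.741?, 0.8740]$$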
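where $q_{[\alpha/2]}$ and $q_{[1-\alpha/2]}$ are the $\alpha/2$ and $1-\alpha/2$ quantiles of the bootstrap distribution of $\hat{\rho}$.
[...]
$$y_t^{*} = \hat{c} + \hat{\phi}\, y_{t-1}^{*} + \hat{a}_t^{*}$$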
Note: as the initial condition we may use the first observation or the marginal mean.
Finally, use the bootstrap replicate of the process (the bootstrap sample) to estimate the parameters, $\hat{c}^{*}$ and $\hat{\phi}^{*}$.
We repeat these steps $B$ times, so that at the end we have $\{\hat{c}^{*b}, \hat{\phi}^{*b},\ b = 1, 2, \dots, B\}$, which can be used to approximate the distribution of the estimators of $c$ and $\phi$.
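The whole procedure can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the handout's own listing: the series is simulated with hypothetical values c = 1, phi = 0.7 and T = 200, the parameters are estimated by OLS through the backslash operator instead of `regress`, and the first observation is used as the initial condition.

```matlab
% Sketch: residual bootstrap for an AR(1) process y_t = c + phi*y_{t-1} + a_t.
rng(3);                                  % seed chosen arbitrarily
T = 200; c = 1; phi = 0.7;               % illustrative (assumed) values
y = zeros(T, 1);
y(1) = c / (1 - phi);                    % start at the marginal mean
for t = 2:T
    y(t) = c + phi * y(t-1) + randn;     % Gaussian innovations (assumption)
end

% OLS estimates and residuals from the original series
Z = [ones(T-1, 1), y(1:end-1)];
beta_hat = Z \ y(2:end);
c_hat = beta_hat(1);
phi_hat = beta_hat(2);
a_hat = y(2:end) - Z * beta_hat;         % T-1 residuals, mean zero by OLS

B = 500;                                 % number of bootstrap replicates
c_boot = zeros(B, 1);
phi_boot = zeros(B, 1);
for b = 1:B
    a_star = a_hat(randi(T-1, T, 1));    % residuals resampled with replacement
    y_star = zeros(T, 1);
    y_star(1) = y(1);                    % initial condition: first observation
    for t = 2:T
        y_star(t) = c_hat + phi_hat * y_star(t-1) + a_star(t);
    end
    Zb = [ones(T-1, 1), y_star(1:end-1)];
    bb = Zb \ y_star(2:end);             % re-estimate on the bootstrap replicate
    c_boot(b) = bb(1);
    phi_boot(b) = bb(2);
end
fprintf('phi_hat = %.3f, bootstrap mean of phi = %.3f\n', phi_hat, mean(phi_boot));
```

The residuals, not the observations, are resampled: drawing the `y` values directly would break the time dependence that the AR(1) recursion is meant to reproduce.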
%%%matrix of indexes
B=1000;
index=unidrnd(length(y)-1, length(y), B); %%% length(y)-by-B matrix of indices taking values from 1 to length(y)-1
%%%bootstrap replicates
y_boot=zeros(length(y), B); %%% each column will be a bootstrap replicate
%%%we set the first value of each bootstrap replicate to the marginal mean
y_boot(1,:)=marginal_mean;
for b=1:B
for t=2:length(y)
        y_boot(t,b)=c_hat+phi_hat*y_boot(t-1,b)+a_hat(index(t,b));
end
end
plot(y_boot) %%%all the bootstrap replications
plot(y_boot(:, 101:110)) %%%only 10 replications
%%%Now we estimate the bootstrap parameters
[Figure: Ten bootstrap replications of the process.]
%%%For each bootstrap sample we obtain one estimate of each parameter
c_boot=zeros(B,1);
phi_boot=zeros(B,1);
for b=1:B
    [B_boot,Binterv_boot,Residuals_boot]=regress(y_boot(2:end, b), ...
        [ones(length(y_boot(2:end, b)), 1), y_boot(1:end-1, b)]);
c_boot(b)=B_boot(1);
phi_boot(b)=B_boot(2);
end
%%%We can compute the mean and the variance of c_boot and phi_boot
mean_c=mean(c_boot)
var_c=var(c_boot)
subplot(2,1,1)
hist(c_boot, 30)
title('Bootstrap distribution of c')
mean_phi=mean(phi_boot)
var_phi=var(phi_boot)
subplot(2,1,2)
hist(phi_boot, 30)
title('Bootstrap distribution of \phi')
%%%These distributions can be used to construct confidence intervals for the
%%%parameters with the quantiles
ci_c=prctile(c_boot, [2.5, 97.5])
ci_phi=prctile(phi_boot, [2.5, 97.5])
As before, we may construct confidence intervals for the parameters of interest. For instance, a $100(1-\alpha)\%$ confidence interval for $\phi$ is given by
$$CI(\phi) = [l, u] = \left[q_{[\alpha/2]},\, q_{[1-\alpha/2]}\right] = [0.6091, 0.8796]$$
and for $c$ by
$$CI(c) = [l, u] = \left[q_{[\alpha/2]},\, q_{[1-\alpha/2]}\right] = [0.6?86, 2.1996]$$
when $\alpha = 1 - 95\% = 5\%$.
[Figure: histograms of the bootstrap replicates. Top panel: Bootstrap distribution of c. Bottom panel: Bootstrap distribution of \phi.]