
On fitting models for Danish fire data-1981.

Reported to: Professor Jose Garrido

April 30, 2007

Presented and reported by:


I. Mendoza
&
S. Mozumder

Reported as the partial fulfilment of the requirements for the course MAST726/4

1 Introduction
We have a portfolio of dwellings. In the year 1981 we received 793 claims; for each claim the claim amount
and the floor space (in square meters) of the dwelling are given. The underlying policies carry no deductible.
Given this data set, our task is to obtain a feasible model for the portfolio. While searching for
models we obtained some good fits that are not feasible from several points of view, and we
decided to include only the PP-plots for those "good fit but not feasible" models. We describe
only one of these non-feasible models in detail, and finally we discuss two models that are feasible
but not the best fit.
We first used regression analysis to study the independence of the two components of the data
set, namely the space data and the claim data.
When estimating the parameters of the different models we used the optimization utilities included in
standard R packages. We should mention that, despite our best efforts to find the best
model, some models could not be tried at all because the optimization failed while estimating
their parameters.
We also obtained the sensitivity curve, plotted likelihood regions for some
parameters, and finally produced box plots to compare the two feasible models. For each model we
plotted the probability density function (PDF) over the histogram, the cumulative distribution
function (CDF), the limited expected value (LEV), and the mean residual life (MRL). We superimposed the empirical
and theoretical graphs to judge the goodness of fit. The PP-plots we produced
support, in particular, our conclusion that to obtain a feasible model we had to sacrifice the best-fitting
model, which has infinite mean for the original claims.
For the goodness-of-fit tests of the different models we ran our own routines in R. We
carried out the Kolmogorov-Smirnov test for individual data, the chi-square goodness-of-fit test for
grouped data, the Anderson-Darling test for heavy-tailed models, and the Akaike information
criterion for model selection. We also wrote code to compute Cramér-von Mises distances for both
individual and grouped data with weights; for the weights we use the mean of each class. All the
code is included in the Appendix.
Regarding the plots of the likelihood regions, we fix one parameter to obtain the plot in two
dimensions; at the same time, we produce 3D plots of the likelihood region to get an idea of the
optimal region containing the parameters.
Our last effort is devoted to the calculation of the premium, to which we devote a complete
section.

2 Empirical tools for data analysis


In this section we simply define the empirical versions of the different tools that we will use in later
sections to compare with their corresponding theoretical versions. These tools are so common that
we content ourselves with simply defining them, without elaboration.

2.1 Tools for individual data


For a random sample $X_1, X_2, \ldots, X_n$ the empirical CDF is defined as:

$$F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I_{[X_i \le x]} \qquad (1)$$

for $x > 0$, where $I$ is the indicator function. By definition $F_n$ is a step function that jumps by
$\frac{1}{n}$ at each observation. Since it is a discrete function, the corresponding empirical probability
function is defined as:

$$f_n(x) = \frac{1}{n}\sum_{i=1}^{n} I_{[X_i = x]} \qquad (2)$$

The $k$-th sample moment is defined as:

$$\hat{\mu}'_k = \int_0^{\infty} x^k\, dF_n(x) = \frac{1}{n}\sum_{i=1}^{n} X_i^k \qquad (3)$$

The empirical limited expected value is defined as:

$$\hat{E}(X \wedge d) = \int_0^{d} x\, dF_n(x) + d[1 - F_n(d)] = \frac{1}{n}\Big[\sum_{X_i < d} X_i + d \sum_{X_i \ge d} 1\Big] \qquad (4)$$

Setting $k = 1$ in (3) and letting $d \to \infty$ in (4) gives

$$\lim_{d \to \infty} \hat{E}(X \wedge d) = \hat{E}(X).$$

The empirical MRL is then defined as:

$$\hat{e}(d) = \frac{\hat{E}(X) - \hat{E}(X \wedge d)}{1 - F_n(d)} = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i - \frac{1}{n}\big[\sum_{X_i < d} X_i + d\sum_{X_i \ge d} 1\big]}{1 - \frac{1}{n}\sum_{i=1}^{n} I_{[X_i \le d]}} \qquad (5)$$

where all the functions on the right-hand side are defined earlier.

2.2 Tools for grouped data


The tools we defined in the previous section need to be modified for grouped data. Let

$$n_j = \sum_{i=1}^{n} I_{[c_{j-1} < X_i \le c_j]} \qquad (6)$$

be the frequency in the interval $(c_{j-1}, c_j]$ for $j = 1, \ldots, r$. At the boundary points $c_j$ the empirical
CDF is defined as (possibly with $c_r = \infty$):

$$F_n(c_j) = \frac{1}{n}\sum_{i=1}^{j} n_i, \qquad F_n(c_0) = 0 \qquad (7)$$

where $n = \sum_{j=1}^{r} n_j$.
P
A smooth version of $F_n$ is obtained (so that a derivative exists) by simply
connecting the jump points by straight lines:

$$\tilde{F}_n(x) = \begin{cases} 0 & \text{if } x \le c_0 \\ \dfrac{(c_j - x)F_n(c_{j-1}) + (x - c_{j-1})F_n(c_j)}{c_j - c_{j-1}} & \text{if } c_{j-1} < x \le c_j \\ 1 & \text{if } x > c_r \end{cases} \qquad (8)$$

Clearly $\tilde{F}_n(x)$ is not defined for $x > c_{r-1}$ if $c_r = \infty$ and $n_r \neq 0$.
The derivative of the above function, where it exists, is known as the histogram and is defined as:

$$\tilde{f}_n(x) = \begin{cases} 0 & \text{if } x \le c_0 \\ \dfrac{F_n(c_j) - F_n(c_{j-1})}{c_j - c_{j-1}} = \dfrac{n_j}{n(c_j - c_{j-1})} & \text{if } c_{j-1} < x \le c_j \\ 0 & \text{if } x > c_r \end{cases} \qquad (9)$$

Hence

$$\tilde{f}_n(x) = \sum_{i=1}^{n} \frac{I_{[c_{j-1} < X_i \le c_j]}}{n(c_j - c_{j-1})} \qquad \text{if } c_{j-1} < x \le c_j.$$

The $k$-th sample moment for grouped data can be obtained from the empirical CDF as:

$$\hat{\mu}'_k = \int_{c_0}^{c_r} x^k\, dF_n(x) = \sum_{j=1}^{r} \frac{n_j}{n}\, c_j^k \qquad (10)$$

and using the ogive version we get:

$$\tilde{\mu}'_k = \int_{c_0}^{c_r} x^k\, d\tilde{F}_n(x) = \sum_{j=1}^{r} \int_{c_{j-1}}^{c_j} x^k \tilde{f}_n(x)\, dx = \sum_{j=1}^{r} \frac{n_j\,(c_j^{k+1} - c_{j-1}^{k+1})}{n(k+1)(c_j - c_{j-1})} \qquad (11)$$

The limited expected value, for $c_{j-1} < d \le c_j$, can be expressed as:

$$\hat{E}(X \wedge d) = \int_0^{d} x\, dF_n(x) + d[1 - F_n(d)] = \sum_{i=1}^{j-1} \frac{c_i n_i}{n} + d \sum_{i=j}^{r} \frac{n_i}{n} \qquad (12)$$

In terms of the ogive, for $c_{j-1} < d \le c_j$, it becomes:

$$\tilde{E}(X \wedge d) = \int_0^{d} x\, d\tilde{F}_n(x) + d[1 - \tilde{F}_n(d)] \qquad (13)$$
$$= \sum_{i=1}^{j-1} \int_{c_{i-1}}^{c_i} x \tilde{f}_n(x)\, dx + \int_{c_{j-1}}^{d} x \tilde{f}_n(x)\, dx + \int_{d}^{c_j} d\, \tilde{f}_n(x)\, dx + \sum_{i=j+1}^{r} \int_{c_{i-1}}^{c_i} d\, \tilde{f}_n(x)\, dx$$
$$= \sum_{i=1}^{j-1} \frac{n_i (c_i + c_{i-1})}{2n} + \frac{n_j (2 d c_j - c_{j-1}^2 - d^2)}{2n(c_j - c_{j-1})} + \sum_{i=j+1}^{r} \frac{n_i\, d}{n}.$$

The empirical MRL is then defined as:

$$\hat{e}(d) = \frac{\hat{E}(X) - \hat{E}(X \wedge d)}{1 - F_n(d)} \qquad (14)$$

and

$$\tilde{e}(d) = \frac{\tilde{E}(X) - \tilde{E}(X \wedge d)}{1 - \tilde{F}_n(d)} \qquad (15)$$

for the discontinuous and continuous versions of the CDF respectively, where all other terms on the
right-hand side are defined earlier.
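A minimal R sketch of these grouped-data versions follows (not part of the original analysis); the class boundaries cj and counts nj below are hypothetical, not the classes used for the Danish data:

# Ogive, histogram and grouped moments/LEV, equations (7)-(12) (illustrative sketch)
cj <- c(0, 1, 2, 5, 10)                      # hypothetical class boundaries c0 < ... < cr
nj <- c(40, 30, 20, 10)                      # hypothetical class counts
n  <- sum(nj)

Fn.grouped <- c(0, cumsum(nj) / n)           # empirical CDF at the boundaries, equation (7)
ogive      <- approxfun(cj, Fn.grouped)      # linear interpolation, equation (8)
histo      <- nj / (n * diff(cj))            # histogram heights, equation (9)

k <- 1
muk.cdf   <- sum(nj / n * cj[-1]^k)                                      # equation (10)
muk.ogive <- sum(nj * (cj[-1]^(k+1) - cj[-length(cj)]^(k+1)) /
                 (n * (k + 1) * diff(cj)))                               # equation (11)

lev.grouped <- function(d) {                 # equation (12); assumes c_1 < d <= c_r
  j <- findInterval(d, cj, left.open = TRUE) # class index j with c_{j-1} < d <= c_j
  sum(cj[2:j] * nj[1:(j-1)]) / n + d * sum(nj[j:length(nj)]) / n
}
c(mean.cdf = muk.cdf, mean.ogive = muk.ogive, ogive.at.3 = ogive(3), lev.at.3 = lev.grouped(3))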

3 Analysis of outlier and independence


At first glance we observe that the values of the losses (the response variable) are very dispersed. This
is evident from Figure (2), which contains the plot of space versus losses.
According to the graph in Figure (1), it is clear that the loss 3,408,712.19 is an
outlier in the data. We can check this by computing the sensitivity curve using
this value as the outlier. The sensitivity curve (SC) is given by:

$$SC_n(x; x_1, \ldots, x_{n-1}, T_n) = n[T_n(x_1, \ldots, x_{n-1}, x) - T_{n-1}(x_1, \ldots, x_{n-1})].$$

With $T_n(x_1, \ldots, x_n) = \bar{x}_n$ we obtain

$$SC_n(x; x_1, \ldots, x_{n-1}, \bar{x}_n) = x - \bar{x}_{n-1}.$$

Using x = 3,408,712.19 we obtain the sensitivity curve shown in Figure (3).


The plot of the sensitivity curve with the same outlier, but with the median as the estimator,
appears in Figure (4).
We can see that the range of the sensitivity curve is much smaller for the median than for the
mean. That means the value x, considered as an outlier, strongly affects the estimation when the mean is used.
Even when we do not consider the outlier, the data are very dispersed, as shown in
Figure (2). That is, in addition to containing an outlier, the data are by nature very dispersed.
Possible reasons are the heterogeneous dwelling standards of people in Denmark and the
varying extent of the damage caused by fire: because the quality of dwellings varies, the insured
amounts vary, and different sorts of damage produce different claim sizes.
In addition, we ran a normal regression analysis to check that losses and spaces are uncorrelated,
which means that the estimation of losses does not depend on space. For the t-test (the
test usually used in regression analysis to check whether there is a relation between the
underlying variables) we obtained a p-value of 0.4873 when including the outlier and
0.4447 when excluding it.
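A minimal sketch of how such a check can be run in R follows, assuming the vectors losses and space are loaded as in the Appendix; the exact regression call behind the reported p-values is not shown there, so this is only one possible implementation:

# t-test for the slope in the regression of losses on floor space
fit <- lm(losses ~ space)
summary(fit)$coefficients["space", "Pr(>|t|)"]       # p-value for the slope; a large value
                                                     # suggests losses do not depend on space
# the same fit can be rerun after dropping the largest loss, to exclude the outlier
fit.noout <- lm(losses[-which.max(losses)] ~ space[-which.max(losses)])
summary(fit.noout)$coefficients["space", "Pr(>|t|)"]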
Since our data are very dispersed, we decided to apply a transformation to the response variable
in order to reduce the scale of the data and hence analyze it better. The idea is to choose a
monotone function that preserves the structure of the original data, and we use the logarithmic
transformation for this job.
In the following sections we analyze the log of the data, henceforth log(data), to find
a distribution that fits the empirical one well. That is, we fit the original data with a
transformed distribution.
Figure 1: Dispersion with outlier

3.1 Transformation of variables


The following theorem ensures that finding a distribution for log(data) is equivalent to finding
a distribution for the original data:

Theorem 3.1 Let X be a continuous random variable with PDF $f_X(x)$, and let $Y = g(X)$ be a
monotone transformation of X such that Y is a continuous random variable. Then the PDF of Y is given
by:

$$f_Y(y) = \frac{f_X(x)}{|g'(x)|}, \qquad x = g^{-1}(y).$$
In particular, if f is the density fitted to Y = log(X), then the implied density for the original variable X is

$$f^o(x) = \frac{f(\log x)}{x}.$$
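As a small illustration (not part of the original analysis), this relation can be checked numerically in R; the normal density for log(X) and its parameter values below are only assumptions for the example:

# If f is the density fitted to Y = log(X), the implied density of X is f(log x)/x
f.log  <- function(y) dnorm(y, mean = 2.09, sd = 0.19)    # assumed density for log(X)
f.orig <- function(x) f.log(log(x)) / x                   # implied density for X
integrate(f.orig, lower = 0, upper = Inf)$value           # should be (numerically) 1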

4 Initial Model selection


4.1 Descriptive statistics
As discussed earlier, we apply the logarithm to our dispersed data. We computed summary
statistics for the original and the transformed data (which give a first idea of the
data), with and without the outlier; they appear in Figure (5).
From the table of summary statistics we see that the distribution of the losses is heavily skewed
to the right, most evidently for the original data with the outlier included. Right skewness is
still clearly visible on the log scale. The effect of the outlier can be seen in the large differences in
the mean and standard deviation of the original data.
Figure 2: Dispersion without outlier

4.2 Empirical analysis for initial model selection


Initial model selection is important because an erroneous initial choice may exclude the best-fitting
model from the start. It is, however, more of an art than a science.
For the initial model selection we plot the empirical PDF, CDF, LEV and MRL for log(data) in
Figure (6). The corresponding graphs for the original losses appear in Figure (7).
With these empirical graphs at hand, we try to fit some well-known right-skewed
distributions, e.g. Gumbel, Burr-3, generalized Pareto, inverse transformed gamma, log-normal,
and inverse Gaussian, and by trial and error we chose the relatively better fits. We also take
into account the theoretical mean $E(X) = \int_0^{\infty} x\, dF(x)$ for each fitted F, and compare
it with the empirical mean $\hat{\mu} = \int_0^{\infty} x\, dF_n(x) = \frac{1}{n}\sum_{i=1}^{n} X_i$; we chose the models for
which the difference between the empirical and theoretical means is smallest. This is the basic
concern of premium calculation.
After the initial selection we decided to describe only the log-normal and inverse Gaussian feasible
models. Since all other models are not feasible on the grounds just mentioned, we describe only
one of them, namely the Gumbel model. Although some of them fit really well, we call these models
the garbage models.

5 Description of feasible models


When we say “feasible model” we mean that the theoretical mean for the original claims is finite.
Our first feasible model is the log-normal. This model fits better than
the inverse Gaussian model presented in the next section, but because of the consistency between the
empirical and theoretical means we declared the inverse Gaussian our best model: a relevant
and consistent theoretical mean is vital for the premium calculation.

Figure 3: Sensitivity curve with the estimator mean

5.1 Log-normal fit for log(data)


5.1.1 Theoretical functions
The probability density function of a log-normal distribution with parameters µ and σ is given
by:

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log x - \mu}{\sigma}\right)^2} \qquad (16)$$

The corresponding CDF is given by:

$$F(x) = \int_0^x \frac{1}{y\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log y - \mu}{\sigma}\right)^2}\, dy \qquad (17)$$

To get a closed form we use the transformation $z = \frac{\log y - \mu}{\sigma} \sim N(0,1)$, so that $dz = \frac{dy}{\sigma y}$. Thus

$$F(x) = \int_{-\infty}^{\frac{\log x - \mu}{\sigma}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\, dz = \Phi\!\left(\frac{\log x - \mu}{\sigma}\right) \qquad (18)$$

The theoretical LEV for the log-normal model is given by:

$$E(X \wedge d) = \int_0^d x \cdot \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log x - \mu}{\sigma}\right)^2}\, dx + d\left[1 - \Phi\!\left(\frac{\log d - \mu}{\sigma}\right)\right] \qquad (19)$$
Figure 4: Sensitivity curve with the estimator median

To see how this takes a closed form, use the transformation $z = \frac{\log x - \mu}{\sigma} \sim N(0,1)$; then
$x = e^{\mu + z\sigma}$, which implies $dx = \sigma e^{\mu + z\sigma}\, dz$. Then (19) becomes:

$$E(X \wedge d) = e^{\mu + \frac{1}{2}\sigma^2} \int_{-\infty}^{\frac{\log d - \mu}{\sigma}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(z-\sigma)^2}\, dz + d\left[1 - \Phi\!\left(\frac{\log d - \mu}{\sigma}\right)\right]$$
$$= e^{\mu + \frac{1}{2}\sigma^2}\, P\!\left\{N(\sigma,1) < \frac{\log d - \mu}{\sigma}\right\} + d\left[1 - \Phi\!\left(\frac{\log d - \mu}{\sigma}\right)\right]$$
$$= e^{\mu + \frac{1}{2}\sigma^2}\, P\!\left\{N(0,1) < \frac{\log d - \mu}{\sigma} - \sigma\right\} + d\left[1 - \Phi\!\left(\frac{\log d - \mu}{\sigma}\right)\right]$$
$$= e^{\mu + \frac{1}{2}\sigma^2}\, \Phi\!\left(\frac{\log d - \mu - \sigma^2}{\sigma}\right) + d\left[1 - \Phi\!\left(\frac{\log d - \mu}{\sigma}\right)\right] \qquad (20)$$

With the LEV given by (20), the MRL function can be expressed in closed form as:

$$e(d) = \frac{E(X) - E(X \wedge d)}{1 - F(d)} = \frac{e^{\mu + \frac{1}{2}\sigma^2} - \left\{e^{\mu + \frac{1}{2}\sigma^2}\, \Phi\!\left(\frac{\log d - \mu - \sigma^2}{\sigma}\right) + d\left[1 - \Phi\!\left(\frac{\log d - \mu}{\sigma}\right)\right]\right\}}{1 - \Phi\!\left(\frac{\log d - \mu}{\sigma}\right)} \qquad (21)$$

Here we mention that $E(X)$ is easily obtained from (20) by letting $d \to \infty$:

$$E(X) = e^{\mu + \frac{1}{2}\sigma^2}$$
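As an illustration (not in the original report), the closed form (20) can be checked against direct numerical integration of (19) in R; the parameter values below are purely illustrative:

# Log-normal LEV: closed form (20) versus numerical integration of (19)
mu <- 2.09; sigma <- 0.19; d <- 10                              # illustrative values
lev.closed <- exp(mu + sigma^2/2) * pnorm((log(d) - mu - sigma^2)/sigma) +
              d * (1 - pnorm((log(d) - mu)/sigma))
lev.num <- integrate(function(x) x * dlnorm(x, mu, sigma), 0, d)$value +
           d * (1 - plnorm(d, mu, sigma))
c(closed = lev.closed, numerical = lev.num)                     # should agree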

Figure 5: Summary statistics with and without outlier

5.1.2 Parameter estimation


We estimated the parameters of the log-normal distribution by maximum likelihood. The log-likelihood
function (LLF) is given, for $\mu \in \mathbb{R}$ and $\sigma > 0$, by:

$$l(\mu, \sigma^2) = -\frac{n}{2}(\ln \sigma^2 + \ln 2\pi) - \sum_{i=1}^{n} \ln x_i - \frac{1}{2}\sum_{i=1}^{n} \left(\frac{\ln x_i - \mu}{\sigma}\right)^2$$

The corresponding score functions are:

$$S_1(\mu, \sigma^2) = \frac{\partial l(\mu, \sigma^2)}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n} (\ln x_i - \mu)$$

$$S_2(\mu, \sigma^2) = \frac{\partial l(\mu, \sigma^2)}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n} (\ln x_i - \mu)^2$$

Solving these equations we obtain the maximum likelihood estimators (MLEs):

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} \ln x_i \qquad (22)$$

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (\ln x_i - \hat{\mu})^2 \qquad (23)$$

We used a package included in R to estimate these parameters (see the source code in the Appendix).
Since the package needs initial values, we used the method of moments (MOM) estimates as
starting values. The method of moments estimators are given by:

$$\hat{\sigma}^2_{mom} = \ln\!\left(\frac{\sum_{i=1}^{n} x_i^2}{n}\right) - 2\ln\!\left(\frac{\sum_{i=1}^{n} x_i}{n}\right), \qquad \hat{\mu}_{mom} = \ln\!\left(\frac{\sum_{i=1}^{n} x_i}{n}\right) - \frac{1}{2}\hat{\sigma}^2_{mom}$$

Finally, the MLEs we obtained are:

$$\hat{\mu} = 2.0859586 \quad \text{and} \quad \hat{\sigma} = 0.1874917$$

Figure 6: Empirical model tools for log(data)
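Because (22) and (23) are available in closed form, the optimizer output can be cross-checked directly; a minimal sketch, assuming the fitted data (the log-claims) are stored in the vector y as in the Appendix:

# Closed-form log-normal MLEs as a sanity check on the numerical optimizer
mu.hat     <- mean(log(y))                    # equation (22)
sigma2.hat <- mean((log(y) - mu.hat)^2)       # equation (23)
c(mu.hat = mu.hat, sigma.hat = sqrt(sigma2.hat))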


Regarding the effect of the outlier on our estimates, we obtained almost the same parameter values
with and without the outlier, probably because of the log scale. The estimates from the original
claims would likely show a larger difference with and without the outlier, but except for one model
(Burr-4, not reported here, which has infinite mean) we were not able to estimate the parameters
for the original claims because the optimization package failed.
Regarding the robustness of the estimates, we see from (22) and (23) that $\hat{\mu}$ and $\hat{\sigma}^2$ go to infinity
when one of the $x_i$'s goes to infinity. That means $\hat{\mu}$ and $\hat{\sigma}^2$ are sensitive to outliers, although the
sensitivity is smaller than it would be for parameters estimated from the original claims.
Figure 7: Empirical model tools for the original data

However, there is a quantitative way to check the robustness of the estimated parameters. One
well-known method is the bootstrap: we take a sample of fixed size k < 793 from the 793 claims
an arbitrarily large number of times and estimate the parameters separately for each sample. If
the mean of those estimates is very close to the original estimates, and the standard deviation of
the estimates is small enough that the original estimates fall inside a 95% confidence interval, we
can say the parameters are robust.
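One possible implementation of this check is sketched below (not part of the original analysis); it resamples with replacement, uses the closed-form MLEs (22) and (23) for speed, and the choices of B and k are illustrative:

# Bootstrap check of the robustness of the log-normal MLEs (illustrative sketch)
set.seed(1)
B <- 1000; k <- 500                                   # illustrative choices, k < 793
boot.est <- t(replicate(B, {
  ys <- sample(y, k, replace = TRUE)                  # resample k claims
  m  <- mean(log(ys))
  c(mu = m, sigma2 = mean((log(ys) - m)^2))           # closed-form MLEs (22)-(23)
}))
colMeans(boot.est)                                    # compare with the original estimates
apply(boot.est, 2, sd)                                # bootstrap standard errors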
We should also mention that we incorporate Huber estimates (see the Appendix for the code)
into the method of moments estimates supplied as initial values for the MLE estimation, which we
believe has a robustifying effect.
Another way to assess the precision of the estimates is the variance-covariance matrix of the
log-normal MLEs. To estimate it we used the second derivatives of the log-likelihood function:

$$\frac{\partial^2 l(\mu, \sigma^2)}{\partial \mu^2} = -\frac{n}{\sigma^2}, \qquad
\frac{\partial^2 l(\mu, \sigma^2)}{\partial \sigma^2} = \frac{n}{\sigma^2} - \frac{3}{\sigma^4}\sum_{i=1}^{n} (\ln x_i - \mu)^2, \qquad
\frac{\partial^2 l(\mu, \sigma^2)}{\partial \sigma\,\partial \mu} = -\frac{2}{\sigma^3}\sum_{i=1}^{n} (\ln x_i - \mu)$$

So the covariance matrix is estimated from these second derivatives, using

$$\widehat{Var}(\hat{\mu}) = \frac{-1}{\partial^2 l/\partial \mu^2} = \frac{\sigma^2}{n}, \qquad
\widehat{Var}(\hat{\sigma}) = \frac{-1}{\partial^2 l/\partial \sigma^2}, \qquad
\widehat{Cov}(\hat{\mu}, \hat{\sigma}) = -\frac{2}{\sigma^3}\sum_{i=1}^{n}(\ln x_i - \mu),$$

which, evaluated at the MLEs, gives

$$\begin{pmatrix} 0.0000443293 & 0 \\ 0 & 0.00002216616 \end{pmatrix}$$

Hence we can compute approximate 95% confidence intervals for the parameters, taking approximately
1.96 standard deviations on each side of each estimate:

$$\mu: 2.0859586 \pm 1.96\,(0.0000443293)^{1/2} \Rightarrow \mu \in (2.072909,\, 2.099008)$$

$$\sigma: 0.1874917 \pm 1.96\,(0.00002216616)^{1/2} \Rightarrow \sigma \in (0.1782638,\, 0.1967196)$$

Our estimates fall well inside these 95% confidence intervals.
We now use the log-relative likelihood function to obtain the likelihood regions. The log-relative
likelihood function is defined by:

$$r(\mu, \sigma^2) = l(\mu, \sigma^2) - l(\hat{\mu}, \hat{\sigma}^2)$$

Evaluating the log-likelihood at our estimates of µ and σ² gives $l(\hat{\mu}, \hat{\sigma}^2) = -1451.867$, so that

$$r(\mu, \sigma^2) = -\frac{n}{2}(\ln \sigma^2 + \ln 2\pi) - \sum_{i=1}^{n} \ln x_i - \frac{1}{2}\sum_{i=1}^{n}\left(\frac{\ln x_i - \mu}{\sigma}\right)^2 - (-1451.867)$$

The set of values of µ and σ² such that $r(\mu, \sigma^2) > \ln(p)$ is called a $100p\%$ likelihood region (LR).
In order to show the LR for different values of p (say 10% and 50%), we plot the log-relative
likelihood in two dimensions (fixing one parameter) and in three dimensions without fixing any
parameter.

We generated a range of values of µ for fixed σ to find the plausible region for µ. In Figure (8)
we can see the region from which the most likely value of µ can be chosen when σ is fixed.

We also generated values of µ and σ jointly to find the plausible region around the optimal parameter
values. In Figure (9) we can see the surface from which the most likely values of µ and σ can be chosen.

5.1.3 Comparison by graphs


In this section we plot the graphs of this model together with the corresponding empirical
ones, using the parameters estimated in the previous section. The graphs of the PDF, CDF, LEV, and
MRL appear in Figure (10).
We end this section with the PP-plot, which gives an idea of the fit (we remind the reader that it is
not the best fit, but a feasible one), in Figure (11).
Figure 8: Likelihood region of the feasible models

5.2 Log(log normal) fit for the original data


In this section we present the model tools for the original data. In the previous section we fitted a
log-normal to log(data); the implied model for the original data is therefore the log(log-normal).

5.2.1 Theoretical functions


The PDF for this model is given by:

$$f^o(x) = \frac{1}{x \log(x)\, \sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log(\log(x)) - \mu}{\sigma}\right)^2} \qquad (24)$$

where $x \in (1, \infty)$.


The CDF of this model can be obtained as:

$$F^o(x) = \int_1^x \frac{1}{y \log(y)\, \sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log(\log(y)) - \mu}{\sigma}\right)^2}\, dy \qquad (25)$$

The theoretical LEV for this model, with the density given by (24), can be obtained as:

$$E^o(X \wedge d) = \int_1^d x\, \frac{1}{x \log(x)\, \sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log(\log(x)) - \mu}{\sigma}\right)^2}\, dx + d[1 - F^o(d)] \qquad (26)$$
Figure 9: 3D Likelihood region of log normal model

The theoretical MRL for this model can be computed as:

$$e^o(d) = \frac{E^o(X) - E^o(X \wedge d)}{1 - F^o(d)} \qquad (27)$$

where $E^o(X) = \lim_{d \to \infty} E^o(X \wedge d)$ and the functions on the right-hand side are given above.

5.2.2 Parameter estimation


Since MLEs are invariant under one-to-one transformations, the estimated parameters for this model
remain the same as in the previous section: µ̂ = 2.0859586 and σ̂ = 0.1874917.

5.2.3 Comparison by graphs


In this section we produce the graphs of this model for the original losses, using the parameters
estimated above. The graphs of the PDF, CDF, LEV, and MRL appear in Figure (12).

5.3 Inverse Gaussian fit for log(data)

Figure 10: Log normal model graphs

5.3.1 Theoretical functions


The PDF of the inverse Gaussian distribution is given by:

$$f(x) = \left(\frac{\theta}{2\pi x^3}\right)^{1/2} e^{-\frac{\theta}{2x}\left(\frac{x-\mu}{\mu}\right)^2} \qquad (28)$$

The corresponding CDF for this model can be obtained as:

$$F(x) = \int_0^x \left(\frac{\theta}{2\pi y^3}\right)^{1/2} e^{-\frac{\theta}{2y}\left(\frac{y-\mu}{\mu}\right)^2}\, dy \qquad (29)$$

The LEV for the inverse Gaussian model can be expressed as:

$$E(X \wedge d) = \int_0^d x \left(\frac{\theta}{2\pi x^3}\right)^{1/2} e^{-\frac{\theta}{2x}\left(\frac{x-\mu}{\mu}\right)^2}\, dx + d[1 - F(d)] \qquad (30)$$

Finally, the MRL for the inverse Gaussian distribution is:

$$e(d) = \frac{E(X) - E(X \wedge d)}{1 - F(d)} \qquad (31)$$

where the functions on the right-hand side are defined above.
Figure 11: PP-plot of the feasible models

5.3.2 Parameter estimation


As before, we estimated the parameters of the inverse Gaussian distribution by maximum likelihood.
The log-likelihood function is given, for $\mu \in \mathbb{R}$ and $\theta > 0$, by:

$$l(\mu, \theta) = \frac{n}{2}(\ln \theta - \ln 2\pi) - \frac{3}{2}\sum_{i=1}^{n} \ln x_i - \frac{\theta}{2\mu^2}\sum_{i=1}^{n} \frac{(x_i - \mu)^2}{x_i}$$

The corresponding score functions are:

$$S_1(\mu, \theta) = \frac{\partial l(\mu, \theta)}{\partial \mu} = \frac{\theta}{\mu^3}\sum_{i=1}^{n} (x_i - \mu)$$

$$S_2(\mu, \theta) = \frac{\partial l(\mu, \theta)}{\partial \theta} = \frac{n}{2\theta} - \frac{1}{2\mu^2}\sum_{i=1}^{n} \frac{(x_i - \mu)^2}{x_i}$$

Solving these equations we get the maximum likelihood estimators (MLEs):

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (32)$$

$$\hat{\theta} = \frac{n\hat{\mu}^2}{\sum_{i=1}^{n} \frac{(x_i - \hat{\mu})^2}{x_i}} \qquad (33)$$
Figure 12: Log(lognormal) model graphs

We used a package included in R to estimate these parameters (see the source code in the Appendix).
Because R needs initial values, we used the method of moments estimates as starting values,
which are given by:

$$\hat{\mu}_{mom} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad \hat{\theta}_{mom} = \frac{\hat{\mu}_{mom}^3}{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \hat{\mu}_{mom}^2}$$

Finally, the MLEs we obtained are:

$$\hat{\mu} = 8.19756 \quad \text{and} \quad \hat{\theta} = 229.18479$$
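Again, (32) and (33) give closed forms that can be used to cross-check the optimizer; a minimal sketch, assuming the fitted data are in the vector y as in the Appendix:

# Closed-form inverse Gaussian MLEs as a sanity check on the numerical optimizer
mu.hat    <- mean(y)                                          # equation (32)
theta.hat <- length(y) * mu.hat^2 / sum((y - mu.hat)^2 / y)   # equation (33)
c(mu.hat = mu.hat, theta.hat = theta.hat)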


Regarding the robustness of the estimates, we see from (32) and (33) that $\hat{\mu}$ and $\hat{\theta}$ go to infinity
when one of the $x_i$'s goes to infinity. That means $\hat{\mu}$ and $\hat{\theta}$ are sensitive to outliers, although the
sensitivity is smaller than it would be for parameters estimated from the original claims.
For a quantitative analysis of robustness we can apply the bootstrap method discussed for the
log-normal model. We should also mention that we incorporate Huber estimates (see the Appendix
for the code) into the method of moments estimates supplied as initial values for the MLE
estimation, which we believe has a robustifying effect.
We now consider the variance-covariance matrix of the inverse Gaussian MLEs. To estimate it we
used the second derivatives of the log-likelihood function:

$$\frac{\partial^2 l(\mu, \theta)}{\partial \mu^2} = -\frac{3\theta \sum_{i=1}^{n} x_i}{\mu^4} + \frac{2n\theta}{\mu^3}$$

$$\frac{\partial^2 l(\mu, \theta)}{\partial \theta^2} = -\frac{n}{2\theta^2}$$

$$\frac{\partial^2 l(\mu, \theta)}{\partial \theta\, \partial \mu} = \frac{\sum_{i=1}^{n} x_i - n\mu}{\mu^3}$$

So the covariance matrix is computed in the same way as for the log-normal model, which, evaluated
at the MLEs, gives:

$$\begin{pmatrix} 0.003028186 & -0.0001363386 \\ -0.0001363386 & 132.7429 \end{pmatrix}$$

Hence we can compute approximate 95% confidence intervals around the parameters, taking
approximately 1.96 standard deviations on each side of each estimate:

$$\mu: 8.197649 \pm 1.96\,(0.003028186)^{1/2} \Rightarrow \mu \in (8.089792,\, 8.305506)$$

$$\theta: 229.417857 \pm 1.96\,(132.7429)^{1/2} \Rightarrow \theta \in (206.8359,\, 251.9998)$$

Our estimates fall well inside these 95% confidence intervals.
As before, the log-relative likelihood function is defined as:

$$r(\mu, \theta) = l(\mu, \theta) - l(\hat{\mu}, \hat{\theta})$$

Evaluating the log-likelihood at the estimates of µ and θ gives $l(\hat{\mu}, \hat{\theta}) = -1451.670$, so that

$$r(\mu, \theta) = \frac{n}{2}(\ln \theta - \ln 2\pi) - \frac{3}{2}\sum_{i=1}^{n} \ln x_i - \frac{\theta}{2\mu^2}\sum_{i=1}^{n} \frac{(x_i - \mu)^2}{x_i} - (-1451.670)$$

The set of values of µ and θ such that $r(\mu, \theta) > \ln(p)$ is called a $100p\%$ likelihood region (LR).
In order to show the LR for different values of p (say 10% and 50%), we plot the log-relative
likelihood in two dimensions (fixing one parameter) and in three dimensions without fixing any
parameter.

We generated a range of values of µ with θ fixed to find the plausible region for µ. In Figure (8)
we can see the region from which the most likely value of µ can be chosen when θ is fixed.

We also generated values of µ and θ jointly to find the plausible region around the optimal
parameter values. In Figure (13) we can see the surface from which the most likely values of µ and θ
can be chosen.
Figure 13: 3D likelihood region for Inverse gaussian model

5.3.3 Comparison by Graphs


In this section we produce the graphs of the inverse Gaussian model for the transformed data,
using the parameters estimated in the previous section. The graphs of the PDF, CDF, LEV, and MRL
appear in Figure (14), and the PP-plot for this model appears in Figure (11).

5.4 Log(inverse gaussian) model for the original data


5.4.1 Theoretical functions
The PDF of the log(inverse Gaussian) fit is given by:

$$f^o(x) = \frac{1}{x}\left(\frac{\theta}{2\pi (\log x)^3}\right)^{1/2} e^{-\frac{\theta}{2\log x}\left(\frac{\log x - \mu}{\mu}\right)^2} \qquad (34)$$

The corresponding CDF is obtained as:

$$F^o(x) = \int_1^x \frac{1}{y}\left(\frac{\theta}{2\pi (\log y)^3}\right)^{1/2} e^{-\frac{\theta}{2\log y}\left(\frac{\log y - \mu}{\mu}\right)^2}\, dy \qquad (35)$$

The LEV for this model can be obtained as:

$$E^o(X \wedge d) = \int_1^d x\, \frac{1}{x}\left(\frac{\theta}{2\pi (\log x)^3}\right)^{1/2} e^{-\frac{\theta}{2\log x}\left(\frac{\log x - \mu}{\mu}\right)^2}\, dx + d[1 - F^o(d)] \qquad (36)$$
Figure 14: Inverse gaussian model graphs

The MRL of this model is given by:

$$e^o(d) = \frac{E^o(X) - E^o(X \wedge d)}{1 - F^o(d)} \qquad (37)$$

5.4.2 Parameter estimation


By the invariance of MLEs, the parameter estimates are µ̂ = 8.19756 and θ̂ = 229.18479, as obtained in the
previous section.

5.4.3 Comparison by graphs


In this section we produce the graphs of the log(inverse Gaussian) model for the original losses,
using the parameters estimated above. The graphs of the PDF, CDF, LEV, and MRL appear in
Figure (15), and the PP-plot for this model appears in Figure (11).

6 A non-feasible good fit model


In this section we present a model that fits nicely, both for log(data) and for the original data, but
which is ultimately of no use because its mean is infinite for the original claims.
Figure 15: log(inverse gaussian) model graphs

6.1 Gumbel fit for log(data)


6.1.1 Theoretical functions
The PDF of the Gumbel distribution is defined as:

$$f(x) = \frac{1}{\beta}\, e^{\frac{\alpha - x}{\beta}}\, e^{-e^{\frac{\alpha - x}{\beta}}}$$

where β > 0.
The corresponding CDF is defined as:

$$F(x) = e^{-e^{\frac{\alpha - x}{\beta}}}$$

Then the theoretical LEV can be expressed as:

$$E(X \wedge d) = \int_0^d x\, \frac{1}{\beta}\, e^{\frac{\alpha - x}{\beta}}\, e^{-e^{\frac{\alpha - x}{\beta}}}\, dx + d[1 - F(d)]$$

from which it can be shown that $E(X) = \alpha + \gamma\beta$, where γ is the Euler-Mascheroni constant.
Finally, the theoretical MRL can be written as:

$$e(d) = \frac{\alpha + \gamma\beta - \int_0^d x\, \frac{1}{\beta}\, e^{\frac{\alpha - x}{\beta}}\, e^{-e^{\frac{\alpha - x}{\beta}}}\, dx - d[1 - F(d)]}{1 - e^{-e^{\frac{\alpha - d}{\beta}}}}$$
6.1.2 Parameter estimation
As usual, we estimate the parameters of the Gumbel distribution by maximum likelihood.
The log-likelihood function is given, for $\alpha \in \mathbb{R}$ and $\beta > 0$, by:

$$l(\alpha, \beta) = -n\ln\beta + \frac{1}{\beta}\sum_{i=1}^{n}(\alpha - x_i) - \sum_{i=1}^{n}\exp\!\left(\frac{\alpha - x_i}{\beta}\right)$$

The corresponding score functions are:

$$S_1(\alpha, \beta) = \frac{\partial l(\alpha,\beta)}{\partial \alpha} = \frac{n}{\beta} - \frac{1}{\beta}\sum_{i=1}^{n}\exp\!\left(\frac{\alpha - x_i}{\beta}\right)$$

$$S_2(\alpha, \beta) = \frac{\partial l(\alpha,\beta)}{\partial \beta} = -\frac{n}{\beta} + \frac{1}{\beta^2}\sum_{i=1}^{n}(x_i - \alpha) - \frac{1}{\beta^2}\sum_{i=1}^{n}(x_i - \alpha)\exp\!\left(\frac{\alpha - x_i}{\beta}\right)$$

We have to solve the following equations to obtain the maximum likelihood estimators:

$$\sum_{i=1}^{n}\exp\!\left(\frac{\alpha - x_i}{\beta}\right) - n = 0$$

$$-n\beta + \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} x_i \exp\!\left(\frac{\alpha - x_i}{\beta}\right) = 0$$

Since we could not find an explicit form for the estimators, we used the package built into R to
estimate them numerically, supplying the method of moments estimates as initial values. From our
experience, supplying the method of moments estimates as initial values sometimes leads to absurd
estimates in R, but fortunately in this case the MLEs returned by R with these starting values are
consistent. The theoretical first and second moments are:

$$E[X] = \alpha + \gamma\beta$$

$$V[X] = E[X^2] - E^2[X] = \frac{1}{6}(\pi\beta)^2$$

So we matched the empirical moments with the theoretical ones to get the initial values needed
for the MLEs:

$$\hat{\alpha} = 7.464233 \quad \text{and} \quad \hat{\beta} = 1.294363$$
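Alternatively, the two score equations can be reduced to a single equation in β and solved numerically; the sketch below (not part of the original analysis) does this with uniroot, assuming the log-claims are in the vector y and using an illustrative search interval:

# Profile the Gumbel score equations down to one equation in beta and solve it
g <- function(beta) {
  w <- exp(-(y - min(y)) / beta)              # shifted weights for numerical stability
  mean(y) - sum(y * w) / sum(w) - beta        # = 0 at the MLE of beta
}
beta.hat  <- uniroot(g, c(0.1, 10))$root                       # illustrative search interval
alpha.hat <- -beta.hat * log(mean(exp(-y / beta.hat)))         # from the first score equation
c(alpha.hat = alpha.hat, beta.hat = beta.hat)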

6.1.3 Comparison by graphs


In this section we show the four basic graphs of this model, namely the PDF, CDF, LEV, and MRL,
using the parameters estimated in the previous section. The graphs appear in Figure (16). We
end this section with the PP-plot of this model, together with the other models that did not
work. In Figure (17) we produce the PP-plots of the so-called “garbage models” for log(data), and
in Figure (18) the PP-plots of the corresponding models for the original data. As
expected, the PP-plots are exactly the same for the original and the transformed data for each model.
Figure 16: Gumbel fit for log(data)

6.2 Log(Gumbel) fit for the original data


6.2.1 Theoretical functions
The PDF of this distribution is given by:

$$f^o(x) = \frac{1}{x\beta}\, e^{\frac{\alpha - \log x}{\beta}}\, e^{-e^{\frac{\alpha - \log x}{\beta}}}$$

where β > 0 and x > 1.
The corresponding CDF is given by:

$$F^o(x) = \int_1^x \frac{1}{y\beta}\, e^{\frac{\alpha - \log y}{\beta}}\, e^{-e^{\frac{\alpha - \log y}{\beta}}}\, dy$$

The theoretical LEV for this model can be expressed as:

$$E^o(X \wedge d) = \int_1^d x\, \frac{1}{x\beta}\, e^{\frac{\alpha - \log x}{\beta}}\, e^{-e^{\frac{\alpha - \log x}{\beta}}}\, dx + d[1 - F^o(d)]$$

And, finally, the theoretical MRL can be expressed as:

$$e^o(d) = \frac{E^o(X) - E^o(X \wedge d)}{1 - F^o(d)}$$

where the functions on the right-hand side are defined earlier.
Figure 17: PP plots of non-feasible models for log(data)

6.2.2 Parameter estimation


The estimated parameter values are the invariant MLEs obtained in the previous section.

6.2.3 Comparison by graphs


In this subsection we plot the four basic graphs for the original claims, using the parameters
estimated in the previous subsection. The graphs appear in Figure (19); the PP-plot of this model
for the original claims was already shown in Figure (18).

7 Goodness of fit and model selection


In the previous sections we produced all the graphs giving an idea of the goodness of our fits, especially
the PP-plots of both feasible and non-feasible models shown earlier.
Here we use quantitative tests to assess the goodness of fit. We now define the test
statistics used in this section. In the Appendix we also include code for some other goodness-of-fit
tests that are not reported here.
The Kolmogorov-Smirnov test statistic is defined as:

$$K.S. = \max_x \left| F_n(x) - F(x, \hat{\theta}) \right| \qquad (38)$$

where $F_n(x)$ is the empirical CDF and $F(x, \hat{\theta})$ is the model CDF.
Figure 18: PP plots of non-feasible models for original data

The Anderson-Darling test statistic is defined as:

$$A^2 = -nF(u) + n\sum_{j=0}^{k} (1 - F_n(y_j))^2 \{\log(1 - F(y_j)) - \log(1 - F(y_{j+1}))\} + n\sum_{j=1}^{k} F_n(y_j)^2 \{\log(F(y_{j+1})) - \log(F(y_j))\} \qquad (39)$$

where the $y_j$'s are class boundaries with $t = y_0 < y_1 < y_2 < \cdots < y_k < y_{k+1} = u$.
The Cramér-von Mises distance for individual data is defined as:

$$C_n(\hat{\theta}) = \frac{1}{n}\sum_{i=1}^{n} \left(F_n(x_i) - F(x_i, \hat{\theta})\right)^2 \qquad (40)$$

The Akaike value is defined as:

$$AIC = 2K - 2\,\text{loglikelihood} \qquad (41)$$

where K is the number of estimated parameters.

All the above test statistics and selection criteria are applied to some of our non-feasible models
and to the two feasible models. The results are given in tabular form in Figure (20). Since both
feasible models pass the tests, we have to choose the better of the two. It might have been preferable
to compare the models with further empirical statistics, but because of computational difficulties
most of the theoretical statistics are
Figure 19: log(Gumbel) fit for the original data

not available for comparison. However, comparing the theoretical and empirical means (see Figure (21)),
we can clearly opt for the inverse Gaussian model, and we are all the more inclined to do so because a
consistent mean is the vital concern of premium calculation. The box plots (Figure (22)) also
strongly support the choice of the inverse Gaussian model. First, we can see from the box plots that
the claims treated as outliers (the dots outside the whiskers) are very few for the inverse Gaussian
model compared to the log-normal or even the Gumbel. That means the inverse Gaussian model can
accommodate more large claims with a lower premium (because of its smaller mean) than the log-normal.
Also, the length of the box, which is just the IQR (inter-quartile range), is a good basis of comparison
between models. From this point of view too, the inverse Gaussian quartiles match the empirical
quartiles better than the log-normal quartiles do (the two ends of the box being the 1st and 3rd
quartiles and the line inside the box the median). So we believe that our model selection is strongly
justified.

8 Simulating rate and getting premium


We simulated the number of claims arriving in a year 10,000 times (see the source code in the
Appendix) and then used the mean of those simulated rates in our premium calculation. The
premium P is calculated according to:

$$P = (1 + \theta)\,\hat{\lambda}\, E(X) \qquad (42)$$
Figure 20: Comparison of different models

where θ is the security loading, which typically varies among 1%, 2% and 5%. The calculated
premiums with these security loadings appear in Figure (23).
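A minimal sketch of how (42) can be evaluated in R; the annual claim rate and the model mean below are placeholder assumptions for illustration, not the simulated values behind Figure (23):

# Premium P = (1 + theta) * lambda * E(X), equation (42) (illustrative values only)
lambda.hat <- 793                    # assumed expected number of claims per year
EX         <- 8.2                    # assumed model mean of a single claim
for (loading in c(0.01, 0.02, 0.05)) {
  cat("loading", loading, "premium", (1 + loading) * lambda.hat * EX, "\n")
}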
Though we did not include it in our premium calculation, we also tried another way to simulate
the claim arrival times. Since our data do not record the arrival times of the claims, we simulated
them; with those simulated arrival times one could apply the Poisson model to estimate the
parameter λ of the associated Poisson process.
The idea we applied for simulating claim arrival times is as follows. We simulate 1000 times. For
each simulation we generate exponential inter-arrival times with a fixed rate, so the arrival process
can be treated as Poisson with the same rate. The rates, which vary from one simulation to another,
are generated randomly from U(1.17, 3.17). We chose this interval because we originally observed
793 claims in one year, which gives a rate of 793/365 = 2.17 claims per day, and we deviated one
unit on each side. Finally, for each of the 793 claims, we take the mean of the 1000 simulated arrival
times as its arrival time.
We also came up with some seemingly useful code. We wrote a routine that, for a sorted vector of
claims, returns the number of repetitions of each claim, and another that computes “n.claimsday”,
for n = 1, 2, ..., which tells us how many days, out of 365, we received n claims. See the source
code in the Appendix; a typical output appears in Figure (24).
Figure 21: Final model selection

9 Scope, limitations and conclusion


The model we chose is especially suitable for premium calculation. We accommodate the influence
of the outlier in the parameter estimation; this increases the premium, which goes in favor of
the company, with the price being evenly paid by all policy holders. Alternatively, one can
estimate the parameters disregarding one or a few outliers, and the premium obtained with the
new estimates will be lower than that obtained previously. That goes in favor of the policy holders
but throws the company into a riskier position if such outlying claims occur. A compromise
solution to this problem might be to segregate the large and small claims, estimate the parameters
separately, and then project more consistent premiums for each group.
However, in situations where the premium is not that important, the log(log-normal) model may
turn out to be more suitable than the log(inverse Gaussian). Furthermore, if a finite mean is not a
requirement, then any of the log(Gumbel), log(inverse transformed gamma) or log(generalized
Pareto) models may serve well.
Regarding limitations, we should mention that beyond the models reported or discussed there are
other models we could not try because the optimization failed when using the R package for
estimating their parameters. A more advanced optimization package might help at that point and
might lead to a more suitable model, possibly even one with finite mean.
Figure 22: Comparison by box plots

10 Appendix
This section is devoted to the source code. Here we put all the R code we used for parameter
estimation, various numerical computations, graphical output, etc.

fires.x=matrix(scan("C:/Ivan/Master/Winter Term 2007/MAST726 - Loss


Distributions/Project/Data/Danish_Fires_2.txt"),byrow=T,ncol=2)
space=fires.x[,1]
losses=fires.x[,2]
lnloss=log(losses)
origfunc=losses
func=lnloss
y=func
classes=c(0,4.27,5.25,6.23,7.21,8.19,9.17,10.2,11.1,12.1,13.1,15.1)
hbreak=classes
summary(func)
cvar=function(x){sd(x)/mean(x)}
cvar(func)
skew<-function(x){mean((x-mean(x))^3)/sd(x)^3}
skew(func)
kurt<-function(x){mean((x-mean(x))^4)/sd(x)^4}

Figure 23: Premiums with different security loadings

Figure 24: A typical output for Poisson model

kurt(func)
quantile(func, c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9))
func.mu=mean(func)
func.std=sd(func)
func.mu
func.std

10.1 Codes for empirical tools


Empirical CDF:
func.s<-sort(func)
ecdf<-function(x)      # empirical CDF; note this masks base R's ecdf()
{a<-x
for(i in 1:length(x))
{a[i]<-sum(x<=x[i])/length(x)}
return(a)}
plot(func.s, func.fn, type="s",
xlab="ln(Losses)", ylab="F_n", main="Empirical CDF",col="red")
Empirical MRL:
emrl<-function(x){a<-x
for(i in 1:length(x))
{(if(x[i] < max(x)) (a[i] <- sum(x[x-x[i] > 0] - x[i])
/(length(x))/(1-func.fn[i])) else a[i] <- 0)}
return(a)}
func.en<-emrl(func.s)
plot(func.s, func.en, type="l",
xlab="ln(Losses)", ylab="Limited Expected Value", main="Empirical
MRL of the Individual Ln(Losses)",col="red")

Empirical LEV:

func.lev<-mean(func.s)-func.en*(1-func.fn)
plot(func.s,
func.lev,type="l", xlab="ln(Losses)", ylab="Limited Expected Value",
main="Empirical LEV of the Individual Ln(Losses)",col="red")

Histogram:

hist(func, prob=T, breaks=hbreak, xlim=c(4,14.5),xlab="Ln(losses)",


main="Histogram ln(Losses)",col="gray")
n=length(func)


10.2 Selective codes to get the model tools for log(data)


Here we put selected code for one of the non-feasible models and the two feasible models. The
code for the other models is exactly the same; only the respective inputs need to change.

10.2.1 Gumbel tools


Libraries used:

func<-lnloss
library(stats4)
library (MASS)
library(lattice)

To obtain MLE:

#MLE
# method of moments starting values (matching E[X]=alpha+gamma*beta, V[X]=(pi*beta)^2/6)
beta.mom<-sqrt(6)*sd(y)/pi
alpha.mom<-mean(y)-0.577215665*beta.mom
ll<-function(alpha,beta){n*log(beta)-sum((alpha-y)/beta)+sum(exp((alpha-y)/beta))}
est<-mle(minuslog=ll,start=list(alpha=alpha.mom,beta=beta.mom))
summary(est)
alpha_gumbel=7.464233
beta_gumbel=1.294363
alpha=7.464233
beta=1.294363

#PDF
pdfgumbel=function(x,alpha,beta){(beta^(-1))*exp((alpha-x)/beta)*exp(-exp((alpha-x)/beta))}
pdf7=pdfgumbel(func,alpha,beta)
hist(func, prob=T, breaks=hbreak,
col="gray",xlim=c(4,14.5),xlab="ln(losses)", main="Histogram
ln(losses) vs Gumbel Distribution")
lines(func,pdf7,type="o",xpd=T,col="red")

#CDF
cdfgumbel=function(x,alpha,beta) {exp(-exp((alpha-x)/beta))}
alpha=alpha_gumbel
beta=beta_gumbel
cumgumbel=cdfgumbel(func,alpha,beta)
plot(func.s, func.fn, type="s",
xlab="ln(losses)", ylab="F_n", main="Empirical vs Theoretical Gumbel
CDF",col="blue")
lines(func.s, cumgumbel, type="o",col="red")

#LEV
n=length(func)
alpha=alpha_gumbel
beta=beta_gumbel
int=rep(0,793)
levgumbel=rep(0,n)
for(i in 1:n)
{
integrand=
function(z)
{z*((1/beta)*exp((alpha-z)/beta))*exp(-exp((alpha-z)/beta))}
int=integrate(integrand, lower = 0, upper = func[i])
levgumbel[i]=int$va+(func.s[i]*(1-cumgumbel[i]))
}
plot(func.s,
func.lev, type="l", xlab="ln(Losses)", ylab="Limited Expected
Value", main="Empirical-Gumbel LEV of the Ln(Losses)",col="blue")
lines(func.s,levgumbel,type="o",xpd=T,col="red")

#MRL
euler=0.577215665

mean.gumbel=alpha+(euler*beta)
mrl3=(mean.gumbel-levgumbel)/(1-cumgumbel)
plot(func.s, func.en,
type="l", xlab="ln(Losses)", ylab="Limited Expected Value",
main="Empirical-Gumbel MRL of the Ln(Losses)",col="blue")
lines(func.s,mrl3,type="o",xpd=T,col="red")

# P-P Plot
qqplot(cumgumbel,func.fn,xlab="CDF Theoretical",
ylab="CDF Empirical",main="P-P Plot Gumbel vs
ln(Losses)",col="blue")
abline(0,1)
# Q-Q Plot and simulating Gumbel
u=runif(793)
simgumbel=alpha-(beta*log(-log(u)))
qgumbel=quantile(simgumbel, probs = func.fn, na.rm = FALSE,names =
TRUE, type = 7)
qfunc=quantile(func, probs = func.fn, na.rm =
FALSE,names = TRUE, type = 7)
plot(qgumbel,qfunc,xlab="Theoretical
Quantiles",ylab="Sample
Quantiles",xlim=c(min(func),max(func)),col="red",main="Q-Q Plot
Gumbel vs ln(Losses)")
abline(0,1)

10.2.2 Lognormal tools


In the parameter estimation here we have something to mention. Instead of using the raw first
moment (fm) and second moment (sm), we use Huber's robust estimates of the mean and standard
deviation, which are then used in the method of moments estimates and hence as starting values for
the MLEs. Although, because of the log scale, the difference is not very noticeable, we believe it has
a robustifying effect, and on the original large-scale data the difference would be quite observable.
The useful code is listed below:
# MLE
robmean=huber(y,k=1.5,tol = 1e-06)$mu
robstd=huber(y,k=1.5,tol
= 1e-06)$s
#fm=mean(y)
#sm=sum(y^2)/n
fm=robmean
sm=robstd^2+robmean^2
sigma.mom=sqrt(log(sm)-2*log(fm))
mu.mom=log(fm)-0.5*sigma.mom^2
ll<-function(mu,sigma){n*log(sigma)+(n/2)*log(2*pi)+sum(log(y))
+0.5*sum(((log(y)-mu)/sigma)^2)}
est<-mle(minuslog=ll,start=list(mu=mu.mom,sigma=sigma.mom))
summary(est)

# Not Robust MEAN
#mu=2.0859586
#sigma=0.1874896
#Robust MEAN
mu=2.0859586
sigma= 0.1874917

# PDF
pdflognorm=function(x,mu,sigma){(1/(x*sigma*sqrt(2*pi)))*exp(-(((log(x)-mu)/sigma)^2)/2)}
denlognorm=pdflognorm(func,mu,sigma)
hist(func,prob=T,breaks=hbreak,
col="gray",xlim=c(4,15),xlab="ln(Losses)", main="Histogram
ln(Losses) vs LogNormal Distribution")
lines(func,denlognorm,type="o",xpd=T,col="red")

# CDF
cdflognorm=function(x,mu,sigma) {pnorm((log(x)-mu)/sigma,0,1)}
cumlognorm=cdflognorm(func,mu,sigma)
plot(func.s, func.fn, type="s",
xlab="ln(Losses)", ylab="F_n", main="Empirical vs Theoretical
LogNormal CDF",col="blue")
lines(func.s,cumlognorm,type="o",col="red")

# LEV
int=rep(0,793)
levlognorm=rep(0,n)
for(i in 1:n)
{
integrand=function(z){z*((1/(z*sigma*sqrt(2*pi)))*exp(-(((log(z)-mu)/sigma)^2)/2))}
int=integrate(integrand, lower = 0, upper = y[i])
levlognorm[i]=int$va+(y[i]*(1-cumlognorm[i]))
}
plot(func.s,func.lev, type="l", xlab="ln(Losses)", ylab="Limited
Expected Value", main="Empirical-LogNormal LEV of the
ln(Losses)",col="blue")
lines(func.s, levlognorm,type="o",col="red")

# MRL
mean.lognorm=exp(mu+((sigma^2)/2))
mrllognorm=(mean.lognorm-levlognorm)/(1-cumlognorm)
plot(func.s,func.en, type="l", xlab="ln(Losses)", ylab="Limited
Expected Value", main="Empirical-Lognormal MRL of the
Ln(Losses)",col="blue")
lines(func.s,mrllognorm,type="o",xpd=T,col="red")

# P-P Plot
plot(plnorm(func.s,mu,sigma),func.fn,xlab="CDF Theoretical",
ylab="CDF Empirical",main="P-P Plot LogNormal vs
ln(Losses)",col="blue")
abline(0,1)
# Q-Q Plot
plot(qlnorm(func.fn,mu,sigma),func.s,xlab="Theoretical
Quantiles",ylab="Sample
Quantiles",xlim=c(min(func),max(func)),col="red",main="Q-Q Plot
LogNormal vs ln(Losses)")
abline(0,1)

# Simulating LogNormal random numbers


u=runif(n)
v=runif(n)
simstdnormal=sqrt(-2*log(u))*cos(2*pi*v)   # Box-Muller standard normal
simnormal=mu+sigma*simstdnormal
simlognorm=exp(simnormal)

# Likelihood Region (2D - Fixing Sigma)


ll<-function(mu){-(n*log(sigma)+(n/2)*log(2*pi)+sum(log(y))+
0.5*sum(((log(y)-mu)/(sigma))^2))}
loglike=ll(mu)
rpar=function(x){ll(x)-loglike}
mu=seq(2,2.15,length=n)
rpar=rep(0,n)
for(i in 1:n) { rpar[i]=ll(mu[i])-loglike }
prob1=rep(log(0.10),n)
prob2=rep(log(0.50),n)
plot(mu,rpar,type="l",col="black",ylim=c(-50,0),xlab="mu",ylab="r(mu)",main="Log-relative
Likelihood Function for LogNormal (Given sigma)")
lines(mu,prob1,col="red")
lines(mu,prob2,col="blue")

# Likelihood Region (3D)


ll<-function(mu,sigma){-(n*log(sigma)+(n/2)*log(2*pi)+sum(log(y))+
0.5*sum(((log(y)-mu)/(sigma))^2))}
mu_ini=2.0859586
sigma_ini=0.1874917
loglike=ll(mu_ini,sigma_ini)
rpar=function(x,y){ll(x,y)-loglike}
nrep=100
mu=seq(1.9,2.25,length=nrep)
sigma=seq(0.15,0.21,length=nrep)
llr_m=matrix(rep(0,nrep*nrep),ncol=nrep)
for(i in 1:nrep)
{
for(j in 1:nrep)
{
llr_m[i,j]=rpar(mu[i],sigma[j])
}
}
wireframe(llr_m, shade = TRUE,list(arrows = TRUE),more=TRUE,
aspect = c(50/97, 0.4),drape=TRUE,colorkey = FALSE,screen =
list(z=40, x=-80),
light.source =
c(10,0,10),xlab="Mu",ylab="Sigma",zlab="Loglikelihood
F.",main="Loglikelihood Function LogNormal")

# Covariance Matrix
var.mu=-1/(-n/sigma^2)
var.sigma=-1/((n/sigma^2)-((3/sigma^4)*sum((log(y)-mu)^2)))
cov.mu.sigma=(-2/sigma^3)*(sum(log(y)-mu))
sqrt(var.mu)
sqrt(var.sigma)
cov.mu.sigma
rho.mu.sigma=cov.mu.sigma/(sqrt(var.mu)*sqrt(var.sigma))
rho.mu.sigma
# Confidence Interval (95%)
lim.inf.mu=mu-(1.96*sqrt(var.mu))
lim.sup.mu=mu+(1.96*sqrt(var.mu))
lim.inf.sigma=sigma-(1.96*sqrt(var.sigma))
lim.sup.sigma=sigma+(1.96*sqrt(var.sigma))
lim.inf.mu
lim.sup.mu
lim.inf.sigma
lim.sup.sigma

10.2.3 Inverse Gaussian Tools


As in the log-normal model, here we also incorporate the Huber estimates in the method of moments
estimates and hence in the MLEs.

#mu.mom=mean(y)
#theta.mom=(n*(mean(y)^3))/(sum(y^2)-(n*(mean(y)^2)))
robmean=huber(y,k=1.5,tol = 1e-06)$mu
robstd=huber(y,k=1.5,tol=1e-06)$s
mu.mom=robmean
theta.mom=robmean^3/robstd^2
# MLE
library(stats4)
ll<-function(mu,theta){-((n*log(theta)/2)-(n*log(2*pi)/2)-(1.5*sum(log(y)))
-(theta*sum((((y-mu)/mu)^2)/(2*y))))}
est<-mle(minuslog=ll,start=list(mu=mu.mom,theta=theta.mom))
summary(est)
mu=8.197649
theta=229.417857

# PDF
pdfinvnorm=function(x,mu,theta)
{((theta/(2*pi*x^3))^(1/2))*exp(-theta*((x-mu)^2)/(2*x*mu^2))}
deninvnorm=pdfinvnorm(y,mu,theta)
hist(func,prob=T,breaks=hbreak,col="gray",xlim=c(4,15),xlab="ln(Losses)",
main="Histogram ln(Losses) vs Inverse Gaussian Distribution")
lines(func,deninvnorm,type="o",xpd=T,col="red")

#CDF
int=rep(0,n)

cdfinvnorm=rep(0,n)
for(i in 1:n)
{
integrand=function(x) {pdfinvnorm(x,mu,theta)}
int=integrate(integrand, lower = 0, upper = y[i])
cdfinvnorm[i]=int$va
}
plot(func.s,func.fn,type="l",xlab="ln(Losses)", ylab="F_n",
main="Empirical vs Theoretical Inverse Gaussian CDF",col="blue")
lines(func.s,cdfinvnorm,type="o",col="red")

# LEV
int=rep(0,793)
levinvnorm=rep(0,n)
for(i in 1:n)
{
integrand=function(z) {z*pdfinvnorm(z,mu,theta)}
int=integrate(integrand,lower= 0, upper = y[i])
levinvnorm[i]=int$va+(y[i]*(1-cdfinvnorm[i]))
}
plot(func.s,func.lev, type="l", xlab="ln(Losses)", ylab="Limited
Expected Value", main="Empirical-Inverse Gaussian LEV of the
ln(Losses)",col="blue")
lines(func.s, levinvnorm,type="o",col="red")

# MRL
mean.invnorm=mu
mrlinvnorm=(mean.invnorm-levinvnorm)/(1-cdfinvnorm)
plot(func.s,func.en, type="l", xlab="ln(Losses)", ylab="Limited
Expected Value", main="Empirical-Inverse Gaussian MRL of the
Ln(Losses)",col="blue")
lines(func.s,mrlinvnorm,type="o",xpd=T,col="red")

# P-P Plot
qqplot(cdfinvnorm,func.fn,xlab="CDF Theoretical",
ylab="CDF Empirical",main="P-P Plot Inverse Gaussian vs
ln(Losses)",col="blue")
abline(0,1)

Here we simulate from the inverse Gaussian distribution. We refer to (simig).

# simulating inverse gaussian


rinvgauss<-function(n,mu=stop("no shape arg"), lambda =1)
{
if(any(mu<=0)) stop("mu must be positive")
if(any(lambda<=0)) stop("lambda must be positive")
if(length(n)>1) n <- length(n)
if(length(mu)>1 && length(mu)!=n) mu <- rep(mu,length=n)

if(length(lambda)>1 && length(lambda)!=n) lambda <-
rep(lambda,length=n)
y2 <- rchisq(n,1)
u <- runif(n)
r1 <- mu/(2*lambda) * (2*lambda + mu*y2 - sqrt(4*lambda*mu*y2 +
mu^2*y2^2))
r2 <- mu^2/r1
ifelse(u < mu/(mu+r1), r1, r2)
}
siminvgaussian=rinvgauss(n, mu, theta)

# Likelihood Region (2D - Fixing theta)


theta=229.417857
ll<-function(mu){((n*log(theta)/2)-(n*log(2*pi)/2)-(1.5*sum(log(y)))
-(theta*sum((((y-mu)/mu)^2)/(2*y))))}
mu=8.197649
loglike=ll(mu)
rpar=function(x) {ll(x)-loglike}
mu=seq(8,8.5,length=n)
rpar=rep(0,n)
for(i in 1:n){rpar[i]=ll(mu[i])-loglike }
prob1=rep(log(0.10),n)
prob2=rep(log(0.50),n)
plot(mu,rpar,type="l",col="black",ylim=c(-20,0),xlab="mu",ylab="r(mu)",main="Log-relative
Likelihood Function for InvGaussian (Given theta)")
lines(mu,prob1,col="red")
lines(mu,prob2,col="blue")

# Likelihood Region (3D)


ll<-function(mu,theta){((n*log(theta)/2)-(n*log(2*pi)/2)-(1.5*sum(log(y)))
-(theta*sum((((y-mu)/mu)^2)/(2*y))))}
mu_ini=8.197649
theta_ini=229.417857
loglike=ll(mu_ini,theta_ini)
rpar=function(x,y) {ll(x,y)-loglike}
nrep=100
mu=seq(8,8.5,length=nrep)
theta=seq(200,250,length=nrep)
llr_m=matrix(rep(0,nrep*nrep),ncol=nrep)
for(i in 1:nrep)
{
for(j in 1:nrep)
{
llr_m[i,j]=rpar(mu[i],theta[j])
}
}
wireframe(llr_m, shade = TRUE,list(arrows = TRUE),more=TRUE,
aspect = c(60/97, 0.4),drape=TRUE,colorkey = FALSE,screen =
list(z=40, x=-70),
light.source =
c(10,0,10),xlab="Mu",ylab="Theta",zlab="Loglikelihood
F.",main="Loglikelihood Function InvGaussian")
#mu.1=rep(0,nrep)
plane=matrix(rep(0,nrep*nrep),ncol=nrep)
p.l=log(0.5)
k=((-1/theta)*(p.l-n*log(theta)+n*log(2*pi)+3*sum(log(y))))
a.eq=k-sum(1/y)
b.eq=2*n
c.eq=n*mean(y)
root1=(-b.eq-(sqrt(b.eq^2-(4*a.eq*c.eq))))/(2*a.eq)
root2=(-b.eq+(sqrt(b.eq^2-(4*a.eq*c.eq))))/(2*a.eq)
for(i in 1:nrep)
{
for(j in 1:nrep)
{
plane[i,j]=rpar(root1[i],theta[j])
}
}
win.graph()
wireframe(llr_m, shade = TRUE,list(arrows =
TRUE),more=TRUE,
aspect = c(60/97, 0.4),drape=TRUE,colorkey = FALSE,screen =
list(z=40, x=-70),
light.source =
c(10,0,10),xlab="Mu",ylab="Theta",zlab="Loglikelihood
F.",main="Loglikelihood Function InvGaussian")

wireframe(plane, shade = FALSE,list(arrows = TRUE),more=TRUE,


aspect = c(60/97, 0.4),drape=TRUE,colorkey = FALSE,screen =
list(z=40, x=-70),
light.source =
c(10,0,10),xlab="Mu",ylab="Theta",zlab="Loglikelihood
F.",main="Loglikelihood Function InvGaussian")

10.3 Selective codes to get the model tools for original data
In this section we provide mostly similar code to the previous section, but for the original data.

10.3.1 Log(lognormal) tools


Histogram and classes

losses=fires.x[,2]
# Transformation:
origfunc=losses
y=origfunc
# Classes used in this analysis
classes=c(0, 229, 337, 727, 1068,
1569, 2304, 3383, 4968, 7295, 10712, 15730, 23098, 33917, 49805,
73134, 107391, 157693, 231559, 340023, 688498, 3470555)
hbreak=classes
hist(origfunc, prob=T, breaks=classes ,
xlim=c(0,40000), xlab="Losses",main="Histogram - Losses")

# Some summary statistics:


summary(y)
cvar=function(x){sd(x)/mean(x)}
cvar(y)
skew<-function(x){mean((x-mean(x))^3)/sd(x)^3}
skew(y)
kurt<-function(x){mean((x-mean(x))^4)/sd(x)^4}
kurt(y)
quantile(y,c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9))
sd(y)

# Sensitivity Curve (ordinary mean)


y=origfunc
n=793
t=rep(0,n)
for(i in 3:n) { t[i]=y[793]-mean(y[c(1:(i-1))]) }
plot(y[c(3:792)],t[c(3:792)])
# Sensitivity Curve (robust mean)
y=origfunc
n=793
t=rep(0,n)
for (i in 3:n)
{t[i]=y[793]-median(y[c(1:(i-1))]) }
plot(y[c(3:792)],t[c(3:792)])

We do not report the code for the empirical PDF, CDF, LEV and MRL of the original data, as it is
exactly the same as for log(data) with the respective changes made to the input.

# Parameters from LogData


mu=2.0859586
sigma= 0.1874917
pdflognorm=function(x,mu,sigma){(1/(x*sigma*sqrt(2*pi)))*exp(-(((log(x)-mu)/sigma)^2)/2)}
# It is a density!!!
integrand=function(x){(1/x)*pdflognorm(log(x),mu,sigma)}
int=integrate(integrand, lower =1, upper = Inf)

# PDF
pdflnlognormal=function(x,mu,sigma){(1/x)*pdflognorm(log(x),mu,sigma)}
denlnlognormal=pdflnlognormal(origfunc,mu,sigma)
hist(origfunc,prob=T, breaks=classes , xlim=c(0,40000),
xlab="Losses",main="Histogram - Empirical vs Theoretical
LnLogNormal",col="gray")
lines(origfunc,denlnlognormal,type="p",xpd=T,col="red")

#CDF
int=rep(0,793)
cdflnlognormal=rep(0,n)

for(i in 1:n)
{
integrand= function(x){ pdflnlognormal(x,mu,sigma)}
int=integrate(integrand, lower = 1, upper = origfunc.s[i])
cdflnlognormal[i]=int$va
}
plot(origfunc.s, origfunc.fn, type="s",
xlab="Losses", ylab="F_n",main="Empirical vs Theoretical lnlognormal
CDF",col="blue")
lines(origfunc.s,cdflnlognormal,type="p",col="red")

# LEV
int=rep(0,n)
levlnlognormal=rep(0,n)
for(i in 1:n)
{
integrand= function(x) {x*pdflnlognormal(x,mu,sigma)}
int=integrate(integrand, lower = 1, upper = origfunc[i])
levlnlognormal[i]=int$va+(origfunc.s[i]*(1-cdflnlognormal[i]))
}
plot(origfunc.s, origfunc.lev, type="l", xlab="Losses",
ylab="Limited Expected Value",
ylim=c(0,30000),xlim=c(0,600000),main="Empirical-lnlognormal LEV of
Losses",col="blue")
lines(origfunc.s,levlnlognormal,type="p",xpd=T,col="red")
# Computing the mean:
integrand= function(x)
{x*pdflnlognormal(x,mu,sigma)}
int=integrate(integrand,lower=1,upper= Inf)

# MRL
mean_lnlognormal=int$va
mrllnlognormal=(mean_lnlognormal-levlnlognormal)/(1-cdflnlognormal)
plot(origfunc.s, origfunc.en, type="l", xlab="Losses",
xlim=c(0,origfunc[n-1]),ylim=c(0,2500000),ylab="Limited Expected
Value",main="Empirical-lnlognormal MRL of Losses",col="blue")
lines(origfunc.s,mrllnlognormal,type="p",xpd=T,col="red")

# P-P Plot
qqplot(cdflnlognormal,origfunc.fn,xlab="CDF Theoretical",
ylab="CDF Empirical",main="P-P Plot Log-LogNormal vs
Losses",col="blue")
abline(0,1)

# Simulating Log-LogNormal random numbers


u=runif(n)
v=runif(n)

simstdnormal=sqrt(-2*log(u))*cos(2*pi*v)
simnormal=mu+sigma*simstdnormal
simlognorm=exp(simnormal)
simloglognorm=exp(exp(simnormal))
qloglognorm=quantile(simloglognorm, probs = origfunc.fn, na.rm =
FALSE,names = TRUE, type = 7)
qfunc=quantile(origfunc,probs=origfunc.fn, na.rm = FALSE,names =
TRUE, type = 7)
plot(qloglognorm,origfunc,xlab="TheoreticalQuantiles",ylab="Sample
Quantiles",xlim=c(min(origfunc),100000),ylim=c(min(origfunc),100000),col="red",main="Q-Q
Plot Log-LogNormal vs Losses")
abline(0,1)

10.3.2 Log(inverse gaussian) tools


mu=8.19756
theta=229.18479
pdfinvnorm=function(x,mu,theta){((theta/(2*pi*x^3))^(1/2))*exp(-theta*((x-mu)^2)/(2*x*mu^2))}
# PDF
y=origfunc
pdflninvnorm=function(x,mu,theta){(1/x)*pdfinvnorm(log(x),mu,theta)}
denlninvnorm=pdflninvnorm(y,mu,theta)
hist(y, prob=T, breaks=classes
, xlim=c(0,40000), xlab="Losses",main="Histogram - Empirical vs
Theoretical Ln(Inverse Gaussian)",col="gray")
lines(y,denlninvnorm,type="p",xpd=T,col="red")

# CDF
int=rep(0,n)
cdflninvnorm=rep(0,n)
for(i in 1:n)
{
integrand=function(x) {pdflninvnorm(x,mu,theta)}
int=integrate(integrand,lower = 1, upper = y[i])
cdflninvnorm[i]=int$va
}
plot(origfunc,origfunc.fn, type="l",
xlab="Losses", ylab="F_n",main="Empirical vs Theoretical Ln(Inverse
Gaussian) CDF",col="blue")
lines(origfunc.s,cdflninvnorm,type="p",col="red")

# LEV
int=rep(0,n)
levlninvnorm=rep(0,n)
for(i in 1:n)
{
integrand=function(x) {x*pdflninvnorm(x,mu,theta)}
int=integrate(integrand,lower = 1, upper = origfunc[i])

levlninvnorm[i]=int$va+(origfunc.s[i]*(1-cdflninvnorm[i]))
}
plot(origfunc.s, origfunc.lev, type="l", xlab="Losses",
ylab="Limited Expected Value",
ylim=c(0,30000),xlim=c(0,y[n-1]),main="Empirical-Ln(Inverse
Gaussian) LEV of Losses",col="blue")
lines(origfunc.s,levlninvnorm,type="p",xpd=T,col="red")
# Computing the mean:
integrand= function(x) {x*pdflninvnorm(x,mu,theta)}
int=integrate(integrand, lower = 1, upper = Inf)

# MRL
mean_lninvnorm=int$va
mrllninvnorm=(mean_lninvnorm-levlninvnorm)/(1-cdflninvnorm)
plot(origfunc.s, origfunc.en, type="l", xlab="Losses",
xlim=c(0,y[n-2]),ylim=c(0,origfunc.en[n-1]),ylab="Limited Expected
Value",main="Empirical-Ln(Inverse Gaussian) MRL of
Losses",col="blue")
lines(origfunc.s,mrllninvnorm,type="p",xpd=T,col="red")
# P-P Plot
qqplot(cdflninvnorm,origfunc.fn,xlab="CDF Theoretical", ylab="CDF
Empirical",main="P-P Plot Log-InvGaussian vs Losses",col="blue")
abline(0,1)

10.4 Codes for Goodness of fit tests


Kolmogorov-Smirnov test:
Here “mcdf” stands for “model CDF”, which, for each model, is defined earlier.

mu=2.08
sigma= 0.18
mcdf=function(x) {cdflognorm(x,mu,sigma)}
#Kolmogorov smirnov test:
ks.n<-rep(0,n)
ks.n[1]<-abs(ecdf(y)[1]-mcdf(y[1]))
for(i in 2:n)
{ks.n[i]<-max(abs(ecdf(y)[i-1]-mcdf(y[i])),abs(ecdf(y)[i]-mcdf(y[i])))}
ks.statistic<-max(ks.n)
# Critical value at 10% significance level
ks.cri.val<-(1.36/sqrt(n))
ks.statistic
ks.cri.val

For the inverse Gaussian model the code is the same; only the parameter values and “mcdf” differ.
Anderson-Darling test:
Here we report the code for the inverse Gaussian model; for the other models it is the same, with
the respective inputs changed.
mu=8.197649
theta=229.417857
z=vector()
integrand= function(x)
{pdfinvnorm(x,mu,theta)}
mcdf=function(z) {integrate(integrand,
lower = 0, upper = z)}
fsum=0
for(i in 1:(n-1)) {
fsum=(((1-ecdf(y)[i])^(2))*(log(1-mcdf(y[i])$va)-log(1-mcdf(y[i+1])$va)))+fsum
}
fsum
ssum=0
for(i in 2:(n-1)) {
ssum=(((ecdf(y)[i])^(2))*(log(mcdf(y[i+1])$va)-log(mcdf(y[i])$va)))+ssum
}
ssum
adtest=(-n*mcdf(y[n])$va)+n*fsum+n*ssum
adcrit=1.933
adtest
adcrit

The code for the log-normal is the same; only the parameters and “mcdf” need to change.
Cramér-von Mises (individual data) test:

mu=8.197649
theta=229.417857
z=vector()
integrand= function(x)
{pdfinvnorm(x,mu,theta)}
mcdf=function(z) {integrate(integrand,
lower = 0, upper = z)}
crv.test=0
for(i in 1:n) {
crv.test=(1/n)*sum((ecdf(y)[i]-mcdf(y[i])$va)^2)+crv.test }
crv.test

Cramer Von Misses(grouped data) test:


Though we did not report it for any of our models, here is a useful piece of code, shown for the
Gumbel model. It can be applied to any other model with the respective changes made to the
parameters and “mcdf”. Note that it uses the class boundaries p, k and ybreaks defined in the
chi-square code below.

par=2
alpha=7.464233
beta=1.294363
mcdf=function(x)
{cdfgumbel(x,alpha,beta)}
freq<-rep(0,k)
total<-rep(0,k)
for(i in 2:k) {
for(j in 1:n)
{
(if(y[j]<=p[i])(total[i]<-total[i]+y[j]))
}
}
classsum<-rep(0,k-1)
for(mx in 1:(k-1)) {
classsum[mx]<-total[mx+1]-total[mx] }
classfreq<-rep(0,k-1)
for(nx
in 1:(k-1)) { classfreq[nx]<-table(ybreaks)[nx] }
classmean=classsum/classfreq
cvm=0
for(i in 2:k) {
cvm=(classmean[i-1]*((((cumsum(classfreq)[i-1])/n)-mcdf(p[i]))^2))+cvm
}
cvm
Chi-square test:
Here is another piece of code which we did not use but which may be helpful. We set it up for the
Gumbel model, but it can easily be modified for other models by changing the inputs.
par=2
alpha=7.464233
beta=1.294363
mcdf=function(x)
{cdfgumbel(x,alpha,beta)}
# Computing expected frequencies:
n<-length(y)
ybreaks=cut(y,breaks=classes)
p=classes
k=length(p)
f.ex<-rep(0,k-1)
for(i in 1:(k-1))
{f.ex[i]<-(mcdf(p[i+1])-mcdf(p[i]))*n}
# Getting observed frequencies & computing chi-square:
f.ob=rep(0,k-1)
for(i in 1:(k-1))
{f.ob[i]=table(ybreaks)[[i]]}
# Empirical statistic: chi-square
chi.sq<-sum((f.ob-f.ex)^2/f.ex)
chi.sq
# Getting degrees of freedom
df=(k-1)-par-1
# Getting p-value
p.value=1-pchisq(chi.sq,df)
p.value
t_value=qchisq(0.95,df)

e_value=chi.sq
t_value
e_value

10.4.1 Code for estimating λ from simulated claim count:


Here is the code which we used in our premium calculation.

n.sim<-10000
t.period<-365
rate<-rep(0,n.sim)
lambda<-rep(0,n.sim)
claimnumbers<-rep(0,t.period)
for (i in 1:n.sim){
lambda[i]<-runif(1,2,2.34)
claimnumbers<-rpois(t.period,lambda[i])
rate[i]<-sum(claimnumbers)/t.period }
rate.estimate<-mean(rate)
rate.estimate
num.claims=rpois(365,rate.estimate)
sum(num.claims)

10.4.2 Simulating arrival times


Here is the code for simulating arrival times. We did not use it, but we report a typical output
from it, to which the Poisson model can be applied to estimate λ. The output appears in Figure (24),
from which an estimate of λ can be obtained.
Inside the code there are some other useful routines. Their outputs are:
(*) “f” gives the frequency of each arrival time in the vector of final arrival times (y = f.ar.time).
(*) “cardinality.y” gives how many distinct days, out of 365, had claims.
(*) “n.claimsday” gives how many days, out of 365, had exactly n claims, and finally
(*) “no.claimsday” gives how many days, out of 365, had no claims.

n<-793
n.sim<-50
mat<-matrix(c(rep(0,n.sim*n)),nrow=n.sim,ncol=n,byrow=TRUE)
lambda<-rep(0,n.sim)
for(j in 1:n.sim){ i.ar.time<-rep(0,n)
ar.time2<-rep(0,n)
ar.time<-rep(0,n)
lambda[j]<-runif(1,1.17,3.17)
i.ar.time<-rexp(n,lambda[j])
ar.time2[1]<-i.ar.time[1]
for(i in
1:(n-1)){ar.time2[i+1]<-ar.time2[i]+i.ar.time[i+1]}
c<-365/max(ar.time2)
ar.time<-c*ar.time2
mat[j,]<-ar.time}

f.ar.time<-rep(0,n)
for(k in
1:n){f.ar.time[k]<-ceiling(sum(mat[,k])/n.sim)}
## Frequency counter (my best efforts succeeded):
y<-f.ar.time
f<-rep(0,length(y))
for(i in 1:length(y)){
t1<-sum(y<=y[i])
t2<-sum(y>=y[i])
if(t1+t2!=length(y))(f[i]<-(t1+t2-length(y)))}
distinct.total=1
for(i in 1:(length(y)-1)){
temp1<-y[i]
temp2<-y[i+1]
if(temp1!=temp2
)(distinct.total<-distinct.total+1) }
max(f)
cardinality.y<-distinct.total
cardinality.y
cl.count<-rep(0,max(f))
j<-1
repeat{
## repeat can help nicely in place of repeated use of "for" loops
if(j>max(f)){break}
i<-1
repeat{
if(i>n){break}
if(f[i]==j){cl.count[j]<-cl.count[j]+1}
i<-(i+1) }
j<-(j+1) }
n.claimsday<-rep(0,max(f))
for(i in
1:max(f)){n.claimsday[i]<-((cl.count[i])/i)}
no.claimsday<-(365-cardinality.y)
cardinality.y
cl.count
sum(cl.count)
n.claimsday
no.claimsday
sum(n.claimsday)+no.claimsday
