$$
E[y_i] = \mu_i, \qquad \eta_i = g_1(\mu_i) = f(x_i), \qquad \mathrm{var}[y_i] = \frac{\phi\, V(\mu_i)}{\omega_i} \tag{1}
$$
where $g_1(\cdot)$ is the link function; $V(\mu_i)$, the variance function, is a function of the mean specific to the distribution family; $\phi$ is a constant that can be estimated from the data; and $\omega_i$ is a prior weight of the i-th observation, which can be set equal to one for all cases (see [1] and [17] for details). However, the standard GLM framework leads to restrictive modeling of the variance of $y_i$, since it depends on $\mu_i$ only as expressed within the variance function.
Focusing now on claims reserving, we can consider a generic loss development triangle with dimension (I, J), where rows (i = 1, . . . , I) represent the claims accident years and columns (j = 1, . . . , J) describe development years for payments. It needs to be emphasized that the number of columns may differ from the number of rows, for example due to a tail in payments development. We now define $P_{ij}$ as the incremental paid amount for accident year i and development year j.
Renshaw and Verrall [20] cast the chain ladder method into the GLM framework with an over-dispersed Poisson model for incremental payments, as Equation 2 shows:
$$
\begin{aligned}
E[P_{ij}] &= m_{ij} \\
\mathrm{var}[P_{ij}] &= \phi\, m_{ij} \\
m_{ij} &= x_i\, y_j \\
\ln(m_{ij}) &= \eta_{ij} = c + \alpha_i + \beta_j
\end{aligned} \tag{2}
$$
This approach is based on an over-dispersed Poisson (ODP) framework, since it assumes the incremental claims $P_{ij}$ to be distributed as independent over-dispersed Poisson random variables, with mean and variance defined by the previous relations. Here, $x_i$ represents the expected ultimate claims and $y_j$ represents the proportion of ultimate claims to emerge in each development year (with the constraint $\sum_{j=1}^{J} y_j = 1$). The over-dispersion is introduced by the parameter $\phi$, which is a priori unknown and estimated from the data. The allowance for over-dispersion does not affect the estimation of the parameters, but it does increase their standard errors.
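As a minimal sketch of the structure in Equation 2, the snippet below builds the means $m_{ij} = x_i y_j$ and the ODP variances, then rewrites the same means in log-linear form. All numbers are hypothetical illustration values, and taking the first accident and development year as reference levels is an assumption of this sketch.

```python
import math

# Hypothetical illustration values (not taken from the paper).
x = [1000.0, 1200.0, 900.0]   # expected ultimate claims by accident year
y = [0.5, 0.3, 0.2]           # development proportions, constrained to sum to 1
phi = 2.5                     # assumed over-dispersion parameter

# Mean and variance of each incremental payment under the ODP model.
m = [[xi * yj for yj in y] for xi in x]
var = [[phi * mij for mij in row] for row in m]

# Equivalent log-linear form ln(m_ij) = c + alpha_i + beta_j, with the first
# accident year and the first development year as reference levels.
c = math.log(x[0] * y[0])
alpha = [math.log(xi / x[0]) for xi in x]
beta = [math.log(yj / y[0]) for yj in y]
eta = [[c + a + b for b in beta] for a in alpha]
```

Because the proportions sum to one, each row of `m` adds up to the corresponding ultimate `x[i]`, and `exp(eta[i][j])` reproduces `m[i][j]`.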
A flexible framework, within which the previous model can be regarded as a special case, is reported in Equation 3 (see [10]). The first two items in Equation 3 place claims reserving within the GAM framework. The choice of the distribution of the dependent variable is driven by the parameter $\rho$: the Normal, Poisson, Gamma and Inverse Gaussian distributions are specified by setting $\rho$ equal to 0, 1, 2 or 3, respectively. The predictor is linked to the expected value of the response by means of the logarithmic link function. The last item of Equation 3 defines the central estimate as a function of accident and development years. This relation extends the last term in Equation 2 by introducing two optional terms (the offset, $u_{ij}$, and the inflation, $\delta t$) and by taking into account the effect of the accident year i and the development year j by means of smoothing splines. It is worth noting that the accident and development years are treated as factors (as Equation 2 shows) when both $\theta_j$ and $\theta_i$ are set equal to zero.
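$$
\begin{aligned}
E[P_{ij}] &= m_{ij} \\
\mathrm{var}[P_{ij}] &= \phi\, m_{ij}^{\rho} \\
\ln(m_{ij}) &= \eta_{ij} = u_{ij} + \delta t + c + s_{\theta_i}(i) + s_{\theta_j}(j) + s_{\theta_i}(\ln(j))
\end{aligned} \tag{3}
$$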
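To illustrate how the power parameter selects the distribution family in Equation 3, the following sketch evaluates the variance $\phi\, m^{\rho}$ for the four values of $\rho$ listed in the text. The mean and scale values are hypothetical.

```python
def variance(m, phi, rho):
    """Variance phi * m**rho of an incremental payment with mean m."""
    return phi * m ** rho

# Power parameter -> distribution family, as listed in the text.
families = {0: "Normal", 1: "Poisson", 2: "Gamma", 3: "Inverse Gaussian"}

m, phi = 200.0, 1.5  # hypothetical mean and scale parameter
table = {name: variance(m, phi, rho) for rho, name in families.items()}
```

With these inputs the variance grows from a constant (Normal) to a cubic function of the mean (Inverse Gaussian), which is why the choice of $\rho$ matters most for the largest cells of the triangle.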
Within this framework, the authors derive the prediction error of a single incremental payment $P_{ij}$ as the square root of the following expression:
$$
E\left[(P_{ij} - \hat{P}_{ij})^2\right] \approx \phi\, m_{ij}^{\rho} + m_{ij}^{2}\, \mathrm{Var}(\eta_{ij}). \tag{4}
$$
The estimation variance (the second component in Equation 4) is usually computed directly by the software implementing the model, so that the prediction error can be evaluated.
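A small numerical sketch of Equation 4 may help separate the two components. The function name and all input values are hypothetical; in practice the variance of the linear predictor would come from the fitted model.

```python
import math

def prediction_error(m, var_eta, phi, rho):
    """Root mean square error of prediction for one cell (Equation 4)."""
    process_variance = phi * m ** rho        # first component
    estimation_variance = m ** 2 * var_eta   # second component
    return math.sqrt(process_variance + estimation_variance)

# Hypothetical cell: mean 100, Var(eta) = 0.04, ODP-type variance (rho = 1).
rmse = prediction_error(m=100.0, var_eta=0.04, phi=3.0, rho=1)
```

Here the process variance is 300 and the estimation variance is 400, so the prediction error is the square root of 700.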
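Let $D = \{P_{ij} : i + j > I + 1,\ 1 < j < J\}$ denote the missing part of the triangle. The mean square error of prediction (see [19]) of the claims reserve $R = \sum_{i=1}^{I} R_i$ can be derived as:
$$
E\left[(R - \hat{R})^2\right] \approx \sum_{i,j \in D} \phi\, m_{ij}^{\rho} + \sum_{i,j \in D} m_{ij}^{2}\, \mathrm{Var}(\eta_{ij}) + 2 \sum_{\substack{i_1, j_1 \in D \\ i_2, j_2 \in D \\ i_1, j_1 \neq i_2, j_2}} m_{i_1 j_1}\, m_{i_2 j_2}\, \mathrm{Cov}(\eta_{i_1 j_1}, \eta_{i_2 j_2}). \tag{5}
$$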
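The three terms of Equation 5 can be sketched as follows. This is an illustration under stated assumptions: the future cells are flattened into a list, the covariance matrix of the linear predictors is supplied by the caller, and the factor 2 is read as applying to each unordered pair of distinct cells counted once. All numbers are hypothetical.

```python
def reserve_msep(m, cov_eta, phi, rho):
    """Mean square error of prediction of the total reserve (Equation 5).

    m       -- fitted means of the future cells, flattened into a list
    cov_eta -- covariance matrix of the linear predictors of those cells
    """
    n = len(m)
    process = sum(phi * mi ** rho for mi in m)
    estimation = sum(m[a] ** 2 * cov_eta[a][a] for a in range(n))
    # Each unordered pair of distinct cells is counted once, then doubled.
    covariance = 2.0 * sum(
        m[a] * m[b] * cov_eta[a][b]
        for a in range(n) for b in range(a + 1, n)
    )
    return process + estimation + covariance

# Hypothetical two-cell lower triangle.
m = [100.0, 50.0]
cov_eta = [[0.04, 0.01], [0.01, 0.09]]
msep = reserve_msep(m, cov_eta, phi=2.0, rho=1)
```

Under this reading, the second and third terms together equal the quadratic form of the mean vector with the covariance matrix of the linear predictors.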
An alternative approach to estimating the prediction error is based on the bootstrap, where the scaled Pearson residuals are commonly used (for further details see [8] and [11]). However, bootstrap analysis accounts only for the estimation variance. The process variance contribution is then computed through a closed formula [8] or through an additional step based on simulating payments from the process distribution [7]. Under the ODP framework, the process error can be obtained by simulating the forecast value from an over-dispersed Poisson with mean $\tilde{P}_{ij}$ and variance $\phi \tilde{P}_{ij}$, where $\tilde{P}_{ij}$ is the incremental value in the lower part of the triangle derived by the bootstrap procedure and $\phi$ is obtained as the ratio between the sum of squared Pearson residuals and the degrees of freedom. England [7] makes several suggestions to achieve this goal.
Finally, recent approaches are based on a non-constant scale parameter $\phi_j$ (see [11]), the use of generalized linear mixed models (see [2]) and a reparameterized version of the original GLM (see [4]).
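The scale estimate and the process-error step described above can be sketched as follows. Two points are assumptions of this sketch rather than the paper's procedure: the observed and fitted values are hypothetical, and a Gamma distribution with matching mean and variance is swapped in for the over-dispersed Poisson draw because the Python standard library has no Poisson sampler.

```python
import random

def pearson_scale(observed, fitted, n_params):
    """Sum of squared Pearson residuals over degrees of freedom (V(m) = m)."""
    residuals = [(p - m) / m ** 0.5 for p, m in zip(observed, fitted)]
    dof = len(observed) - n_params
    return sum(r * r for r in residuals) / dof

def simulate_payment(mean, phi, rng):
    """Draw a payment with mean `mean` and variance phi * mean.

    Gamma stand-in for the ODP: shape = mean / phi, scale = phi.
    """
    return rng.gammavariate(mean / phi, phi)

# Hypothetical observed and fitted incremental payments.
observed = [110.0, 95.0, 60.0, 42.0, 31.0]
fitted = [100.0, 100.0, 64.0, 36.0, 25.0]
phi = pearson_scale(observed, fitted, n_params=2)

rng = random.Random(1)
draw = simulate_payment(50.0, phi, rng)
```

The squared Pearson residuals here sum to 3.94 over 3 degrees of freedom, and each simulated payment is positive by construction of the Gamma stand-in.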
3 GAMLSS for stochastic loss reserving
GAMLSS is a general class of univariate regression models where the exponential family assumption is relaxed and replaced by a general distribution family. The systematic part of the model allows all the parameters of the conditional distribution of the response variable $y_i$ to be modelled as parametric or non-parametric functions of explanatory variables within the framework.
Let $\theta^T = (\theta_1, \theta_2, \ldots, \theta_p)$ be the p parameters of a probability density function $f(y_i \mid \theta)$ modelled using an additive model. $\theta_i^T = (\theta_{i,1}, \theta_{i,2}, \ldots, \theta_{i,p})$ is a vector of p parameters related to explanatory variables, where the first two parameters $\theta_{i,1}$ and $\theta_{i,2}$ are usually characterized as location $\mu_i$ and scale $\sigma_i$. The remaining parameters, if any, are characterized as shape parameters. For many families of population distributions, a maximum of two shape parameters $\nu_i$ and $\tau_i$ suffices (i.e. p = 4).
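Under this condition, we can derive the following model:
$$
\begin{aligned}
g_1(\mu) &= \eta_1 = X_1 \beta_1 \\
g_2(\sigma) &= \eta_2 = X_2 \beta_2 \\
g_3(\nu) &= \eta_3 = X_3 \beta_3 \\
g_4(\tau) &= \eta_4 = X_4 \beta_4
\end{aligned} \tag{6}
$$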
where we consider only the parametric part (see [21] for an extension of the model), and where $\beta_k^T = (\beta_{1,k}, \beta_{2,k}, \ldots, \beta_{J_k,k})$ and $X_k$ is a known design matrix of size $n \times J_k$.
In particular, Equation 6 implies that the moments of the response variable in each cell can be directly expressed as a function of covariates after a convenient parametrization.
Considering now the claims reserving framework, we can identify the incremental payments $P_{i,j}$ as response variables and derive the following structure:
$$
\begin{aligned}
E[P_{i,j}] &= g_1^{-1}(\eta_{1,i,j}) \\
\mathrm{var}[P_{i,j}] &= g_2^{-1}(\eta_{2,i,j})
\end{aligned} \tag{7}
$$
In this regard, the GAMLSS R package [23] supports more than 60 distributions, non-linear and non-parametric relationships (e.g. cubic splines and non-parametric smoothers), random effect modeling, etc. Paper [22] deals extensively with the estimation and selection of GAMLSS models. The selection process consists of comparing many different competing models that vary by different combinations of the distribution of the response variable, the set of link functions for the distribution parameters and the set of predictor terms. Within the GAMLSS package, a full set of diagnostic tools is available for checking model assumptions. In particular, the Generalized Akaike Information Criterion (GAIC) can be used to compare alternative models, and the normalized randomized quantile residuals (see Dunn and Smyth, [6]) can be used to check the adequacy of the model, for example regarding the distribution of the response variable. These residuals are given by $r_{i,j} = \Phi^{-1}(u_{i,j})$, where $\Phi^{-1}$ is the inverse cumulative distribution function of a standard normal distribution and $u_{i,j} = F(P_{i,j} \mid \hat{\theta}_{i,j})$ is derived from the assumed cumulative distribution for the cell (i, j).
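The normalized quantile residual for a continuous response can be sketched in a few lines. The example uses a Weibull distribution because its cumulative distribution function has a closed form; the scale/shape parameterization and the numeric values are assumptions of this sketch and need not match the parameterization used by the GAMLSS package.

```python
import math
from statistics import NormalDist

def weibull_cdf(y, scale, shape):
    """Weibull CDF, F(y) = 1 - exp(-(y / scale) ** shape)."""
    return 1.0 - math.exp(-((y / scale) ** shape))

def quantile_residual(y, scale, shape):
    """Normalized quantile residual r = Phi^{-1}(F(y | theta))."""
    u = weibull_cdf(y, scale, shape)
    return NormalDist().inv_cdf(u)

# Hypothetical fitted parameters for one cell of the triangle.
scale, shape = 1000.0, 1.5
median = scale * math.log(2.0) ** (1.0 / shape)

r_median = quantile_residual(median, scale, shape)  # observation at the fitted median
r_high = quantile_residual(2500.0, scale, shape)    # observation far in the right tail
```

An observation at the fitted median maps to a residual of approximately zero, while an observation in the right tail maps to a positive residual. If the model is adequate, the residuals over all cells should look like a standard normal sample.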
The procedure to assess the stochastic distribution of the loss reserve proposed in the literature for GLM models can therefore be adapted to GAMLSS as follows:
1. define the GAMLSS model underlying the claims development triangle, M;
2. compute the residuals $r_{i,j} = \Phi^{-1}[F(P_{i,j} \mid \hat{\theta}_{i,j})]$;
3. generate N upper triangles of residuals $r^{k}_{i,j}$, with k = 1, . . . , N, by sampling with replacement;
4. derive N upper triangles of pseudo-incremental payments from the gamlss model by the inverse relation $P^{k}_{i,j} = F^{-1}[\Phi(r^{k}_{i,j}) \mid \hat{\theta}_{i,j}]$;
5. refit the gamlss model, M;
6. for each cell of the lower part of the triangle, simulate from the process distribution whose parameters are fitted from M;
7. sum the simulated payments in the lower triangle by origin year and overall to compute the origin year and total reserve estimates, respectively.
This approach yields the full distribution of the claims reserve and allows the assessment of both the process and the estimation error.
Finally, normality of the residuals needs to be verified in order to apply the methodology. In the following numerical results the Shapiro-Wilk test has been used.
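Steps 2 to 4 of the procedure (residuals, resampling, back-transformation) can be sketched as below. This is not the authors' implementation: a Weibull with a common shape and hypothetical cell-specific scales stands in for the fitted GAMLSS distribution, and the refitting and process-simulation steps are omitted.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def weibull_cdf(y, scale, shape):
    return 1.0 - math.exp(-((y / scale) ** shape))

def weibull_quantile(u, scale, shape):
    """Inverse of weibull_cdf for 0 < u < 1."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)

def pseudo_payments(payments, scales, shape, rng):
    """One bootstrap replicate of the upper triangle (flattened to a list)."""
    # Step 2: normalized quantile residuals of the observed payments.
    residuals = [STD_NORMAL.inv_cdf(weibull_cdf(p, s, shape))
                 for p, s in zip(payments, scales)]
    # Step 3: resample the residuals with replacement.
    resampled = [rng.choice(residuals) for _ in residuals]
    # Step 4: map each resampled residual back through the cell's distribution.
    return [weibull_quantile(STD_NORMAL.cdf(r), s, shape)
            for r, s in zip(resampled, scales)]

# Hypothetical observed payments and fitted scales for five cells.
payments = [120.0, 80.0, 45.0, 150.0, 60.0]
scales = [110.0, 90.0, 50.0, 130.0, 70.0]
pseudo = pseudo_payments(payments, scales, shape=1.5, rng=random.Random(2024))
```

Because the residuals are on a common standard normal scale, a residual drawn from one cell can be assigned to any other cell before the inverse transformation, which is what makes the resampling in step 3 meaningful.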
4 Numerical results
As in the example of [9], the Taylor-Ashe triangle [25] (TA), available in the ChainLadder package (see the GenIns triangle in [13]), has been used. The TA triangle, a 10x10 incremental payments triangle, has been used to assess both the best estimate (BE) and the distribution of the claims reserve, in order to compare the proposed GAMLSS approach with the classic ODP methodology.
Table 1 shows key figures derived from two classical approaches based on the Chain-Ladder method. In particular, the BE and the coefficient of variation (CV) derived by using the Mack formula and the GLM ODP model are compared. Furthermore, the comparison is extended to a GLM based on a Gamma distribution. Finally, the 99.5% quantile is obtained under a LogNormal assumption for the Mack formula and by applying a classical bootstrapping methodology (10,000 simulations) for the GLM approaches. The GLM has been applied here by following the England and Verrall [11] approach, without any judgement in the claims reserve evaluation.
Model        BE          CV      99.5% quantile
Mack         18680856    0.13    25919050
ODP GLM      18680856    0.16    27912402
Gamma GLM    18085805    0.15    27710864
Table 1: Classical stochastic reserving methods results on Taylor-Ashe triangle
BE and CV are reported here only for comparison with the GAMLSS results, and the values are equivalent to those derived in [8] and [11].
Then, several GAMLSS models were fitted to the same triangle and compared by the GAIC index. A wide range of conditional distributions, well beyond the classical exponential family, was tested. As expected, the GAIC derived by using a GAMLSS with a Gamma distribution is almost equal to the AIC (roughly 1500) derived by a GLM based on the same distribution.
Table 2 shows the GAIC fit index and the related degrees of freedom of the various GAMLSS models applied to the triangle. Table 3 confirms that the BE assessments obtained using GAMLSS are similar to the ones that would be derived following a straight GLM approach (see Table 1).
Model                      df    GAIC
Weibull                    20    1495.04
NegativeBinomial TypeII    20    1495.25
NegativeBinomial           20    1500.77
Gamma                      20    1500.77
Gumbel                     20    1515.18
InverseGaussian            20    1515.69
Exponential                19    1599.88
Table 2: GAMLSS regression fits on Taylor-Ashe triangle
The greatest advantage of GAMLSS in reserving applications is that more than one distribution parameter can be explicitly modeled as a function of covariates.
For the TA triangle, the dispersion parameter of incremental payments has been assumed to be a function of either origin or development year in order to ensure a better fit to the data.
Distribution    BE
WEI             19939326
GAMMA           18085822
IG              17364127
NB              18085841
GU              23467287
NB2             18995459
EXP             18085822
Table 3: Best estimates using GAMLSS models
When the dispersion varies by development year and the Gamma distribution is considered, the model can be described by the following equation:
$$
\begin{aligned}
E[P_{i,j}] &= \exp(c + \alpha_i + \beta_j) \\
\mathrm{var}[P_{i,j}] &= \exp(d + e_j)
\end{aligned} \tag{8}
$$
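As a worked illustration of the log-link structure, the sketch below evaluates fitted values from a few of the coefficients reported in the Appendix log. Two points are assumptions of this sketch: the first level of each factor is taken as the reference level with coefficient zero, and the variance is computed under the gamlss GA parameterization, in which sigma acts as a coefficient of variation, so the variance is $\sigma^2 \mu^2$ rather than sigma itself.

```python
import math

# Log-link coefficients copied from the Appendix log (subset only).
MU_INTERCEPT = 12.412282
MU_ORIGIN = {1: 0.0, 2: 0.425016, 3: 0.416764}   # level 1 assumed as reference
MU_DEV = {1: 0.0, 2: 0.913387}
SIGMA_INTERCEPT = -1.54747
SIGMA_DEV = {1: 0.0, 2: 0.09349}

def fitted_mean(origin, dev):
    """mu_ij = exp(c + alpha_i + beta_j)."""
    return math.exp(MU_INTERCEPT + MU_ORIGIN[origin] + MU_DEV[dev])

def fitted_sigma(dev):
    """sigma_j = exp(d + e_j); depends on development year only."""
    return math.exp(SIGMA_INTERCEPT + SIGMA_DEV[dev])

def fitted_variance(origin, dev):
    """Assumes the GA parameterization: Var = sigma^2 * mu^2."""
    return (fitted_sigma(dev) * fitted_mean(origin, dev)) ** 2

mean_22 = fitted_mean(2, 2)
sigma_2 = fitted_sigma(2)
```

The ratio between the fitted means of two origin years in the same development year depends only on the difference of their origin coefficients, which is the multiplicative structure inherited from the chain ladder model.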
The analysis showed that, among the models with a varying dispersion parameter, the best fit is obtained when incremental payments follow a Gamma distribution. In particular, Table 4 shows the GAIC values determined by assuming the dispersion parameter to vary by development year or by accident year. The figures show that assuming the dispersion to vary by development year significantly improves the GAIC fit, also with respect to the models shown in Table 3.
Figure 1 displays the diagnostic plots produced by the GAMLSS R package for the model where dispersion varies by development year. The residuals are well behaved: no systematic trend with respect to fitted value or position appears, and the shape of the normalized quantile residuals is well approximated by a Normal distribution, as the plots in the lower section show.
Model                  GAIC       BE
origin, factor         1380.79    20387778
development, factor    1239.41    20277356
Table 4: GAIC and best estimates using GAMLSS models with different models for the dispersion parameter
*******************************************************************
Summary of the Quantile Residuals
mean = -0.0004790768
variance = 0.8516852
coef. of skewness = 0.1062681
coef. of kurtosis = 2.586916
Filliben correlation coefficient = 0.9900941
*******************************************************************
[Figure 1 panels: quantile residuals against fitted values; quantile residuals against index; density estimate of the quantile residuals; normal Q-Q plot (sample quantiles against theoretical quantiles)]
Figure 1: GAMLSS model diagnostic plot for final model
Statistic         Value
Mean              20250642.78
CV                0.09
Skewness          0.36
99.5% quantile    25807705.98
Table 5: Main characteristics of Claims Reserve using GAMLSS models
Table 5 shows key figures of the claims reserve distribution derived by the GAMLSS model where the conditional distribution is a Gamma and the dispersion parameter varies as a function of development year. Lower variability and greater skewness than in the classical models emerge. Furthermore, the coefficient of variation is roughly equal to the value derived on the same triangle by England and Verrall [11], where an over-dispersed Poisson model with non-constant scale parameters was used. Finally, Figures 2 and 3 show the simulated distributions (10,000 simulations) obtained by combining the bootstrap with an additional step to incorporate the process error, for the classical Gamma model and for the GAMLSS Gamma model with dispersion parameter varying by development year, respectively.
[Histogram "Claims Reserve (Gamma GLM)": x-axis Reserve (1.0e+07 to 3.0e+07), y-axis Frequency]
Figure 2: Claim reserve distribution obtained by a Gamma GLM
[Histogram "Claims Reserve (Gamma GAMLSS)": x-axis Reserve (1.0e+07 to 3.0e+07), y-axis Frequency]
Figure 3: Claim reserve distribution obtained by a GAMLSS model with a Gamma distribution
5 Conclusions
This paper proposes a methodology to assess the claims reserve distribution based on a GAMLSS framework. The approach appears more flexible than the classical GLM since it describes the variance effect as a function of the available covariates, either accident or development year. Furthermore, the GAMLSS methodology allows a wide range of response variable distributions to be used, not restricted to the exponential family. Similarly, it permits flexibility in specifying the regression relationship, for example allowing the use of non-parametric smoothers.
A numerical exemplification was performed on the well-known Taylor-Ashe triangle. The best-fitting model for this triangle assumes a Gamma distribution for incremental payments, where the dispersion parameter varies as a function of development year.
While it is difficult to draw any final conclusions from the single triangle analyzed in this paper, it is interesting to note the improvement in the GAIC fit when the variance is explicitly described. Further areas of research lie in extending to GAMLSS the analytical expression of the estimation variance as derived for GLM in [20]. Finally, it is noteworthy that, despite the flexibility of GAMLSS, this methodology could prove overparameterized when few data (small triangles) are considered.
Appendix
A Model logs
The selected GAMLSS model used to compute the reserve distribution is reported in the R log shown below.
*******************************************************************
Family: c("GA", "Gamma")
Call:
gamlss(formula = value ~ factor(origin) + factor(dev), sigma.formula = ~factor(dev),
family = GA, data = GenInsDF, control = con)
Fitting method: RS()
-------------------------------------------------------------------
Mu link function: log
Mu Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 12.412282 7.331e-02 1.693e+02 7.802e-54
factor(origin)2 0.425016 1.588e-07 2.677e+06 5.507e-205
factor(origin)3 0.416764 1.588e-07 2.625e+06 1.116e-204
factor(origin)4 0.240049 1.588e-07 1.512e+06 4.707e-196
factor(origin)5 0.485359 1.588e-07 3.057e+06 4.626e-207
factor(origin)6 0.327941 1.588e-07 2.065e+06 6.235e-201
factor(origin)7 0.616862 1.588e-07 3.885e+06 8.255e-211
factor(origin)8 0.860389 1.588e-07 5.419e+06 5.182e-216
factor(origin)9 0.449620 1.669e-01 2.695e+00 1.064e-02
factor(origin)10 0.336155 2.251e-01 1.494e+00 1.440e-01
factor(dev)2 0.913387 1.053e-01 8.671e+00 2.426e-10
factor(dev)3 0.909820 7.331e-02 1.241e+01 1.442e-14
factor(dev)4 1.022382 1.409e-01 7.258e+00 1.510e-08
factor(dev)5 0.464730 1.530e-01 3.037e+00 4.424e-03
factor(dev)6 0.163759 2.541e-01 6.445e-01 5.234e-01
factor(dev)7 -0.002649 2.046e-01 -1.295e-02 9.897e-01
factor(dev)8 -0.390936 1.000e-01 -3.908e+00 3.937e-04
factor(dev)9 0.027096 1.021e-01 2.655e-01 7.922e-01
factor(dev)10 -1.285784 7.331e-02 -1.754e+01 3.220e-19
-------------------------------------------------------------------
Sigma link function: log
Sigma Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.54747 0.2219 -6.9725 1.119e-08
factor(dev)2 0.09349 0.3222 0.2902 7.730e-01
factor(dev)3 -14.45481 0.3319 -43.5454 1.966e-38
factor(dev)4 0.40252 0.3440 1.1700 2.482e-01
factor(dev)5 0.43574 0.3601 1.2099 2.326e-01
factor(dev)6 0.93870 0.3747 2.5051 1.593e-02
factor(dev)7 0.58500 0.4105 1.4252 1.610e-01
factor(dev)8 -0.59069 0.4638 -1.2735 2.094e-01
factor(dev)9 -0.75070 0.5463 -1.3742 1.762e-01
factor(dev)10 -14.68125 0.7337 -20.0091 4.969e-24
-------------------------------------------------------------------
No. of observations in the fit: 55
Degrees of Freedom for the fit: 29
Residual Deg. of Freedom: 26
at cycle: 10
Global Deviance: 1181.415
AIC: 1239.415
SBC: 1297.628
*******************************************************************
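The information criteria at the end of the log can be reproduced from the global deviance and the degrees of freedom. The sketch below assumes the usual GAIC definition, global deviance plus a penalty k per degree of freedom, with k = 2 for the AIC and k = ln(n) for the SBC; the input figures are copied from the log above.

```python
import math

GLOBAL_DEVIANCE = 1181.415  # "Global Deviance" in the log
DF_FIT = 29                 # "Degrees of Freedom for the fit"
N_OBS = 55                  # "No. of observations in the fit"

def gaic(deviance, df, k):
    """Generalized AIC: deviance plus penalty k per degree of freedom."""
    return deviance + k * df

aic = gaic(GLOBAL_DEVIANCE, DF_FIT, k=2.0)
sbc = gaic(GLOBAL_DEVIANCE, DF_FIT, k=math.log(N_OBS))
```

Both values agree with the AIC and SBC lines of the log to the precision reported there, which also confirms that the 29 degrees of freedom count the 19 mean coefficients together with the 10 dispersion coefficients.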
References
[1] D. Anderson, S. Feldblum, C. Modlin, D. Schirmacher, E. Schirmacher, and N. Thandi. A Practitioner's Guide to Generalized Linear Models, 2007.
[2] K. Antonio and B. Beirlant. Issues in claims reserving and credibility: a semiparametric approach with mixed models. Journal of Risk and Insurance, 75:643-676, 2008.
[3] G. Barnett and B. Zehnwirth. Best estimates for reserves. In Proceedings of the Casualty Actuarial Society, pages 245-321, 2000.
[4] S. Bjorkwall, O. Hossjer, E. Ohlsson, and R.J. Verrall. A generalized linear model with smoothing effects for claims reserving. Insurance: Mathematics and Economics, 49:27-37, 2011.
[5] European Commission. QIS5 Technical Specifications, 2010.
[6] P.K. Dunn and G.K. Smyth. Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5:236-244, 1996.
[7] P.D. England. Addendum to analytic and bootstrap estimates of prediction errors in claim reserving. Insurance: Mathematics and Economics, 31:461-466, 2002.
[8] P.D. England and R.J. Verrall. Standard error of prediction in claim reserving: a comparison of methods. In Institute of Actuaries, pages 1-20, 1998.
[9] P.D. England and R.J. Verrall. Analytic and bootstrap estimates of prediction errors in claims reserving. Insurance: Mathematics and Economics, 25:281-293, 1999.
[10] P.D. England and R.J. Verrall. A flexible framework for stochastic claims reserving. In Proceedings of the Casualty Actuarial Society, pages 1-38, 2001.
[11] P.D. England and R.J. Verrall. Predictive distributions of outstanding liabilities in general insurance. Annals of Actuarial Science, 1:221-270, 2007.
[12] S. Feldblum. NAIC property/casualty insurance company risk-based capital requirements. In Casualty Actuarial Society E-Forum, 1996.
[13] M. Gesmann and Y. Zhang. ChainLadder: Mack, Bootstrap, Munich and Multivariate-chain-ladder Methods, 2011. R package version 0.1.5-1.
[14] G.Z. Heller, D.M. Stasinopoulos, R.A. Rigby, and P. De Jong. Mean and dispersion modelling for policy claim costs. Scandinavian Actuarial Journal, 4:281-292, 2007.
[15] E. Kremer. IBNR-claims and the two-way model of ANOVA. Scandinavian Actuarial Journal, 1(2):47-52, 1982.
[16] T. Mack. Distribution-free calculation of the standard error of chain ladder reserve estimates. Astin Bulletin, 23(2):213-225, 1993.
[17] P. McCullagh and J.A. Nelder. Generalized Linear Models, 1989.
[18] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013.
[19] A.E. Renshaw. Claims reserving by joint modelling. Insurance: Mathematics and Economics, 17(3):239-240, 1996.
[20] R.A. Renshaw and R.J. Verrall. A stochastic model underlying the chain-ladder technique. British Actuarial Journal, 4:903-923, 1998.
[21] R.A. Rigby and D.M. Stasinopoulos. Generalized additive models for location, scale and shape (with discussion). Applied Statistics, 54:507-554, 2005.
[22] R.A. Rigby and D.M. Stasinopoulos. A flexible regression approach using GAMLSS in R, 2009.
[23] R.A. Rigby and D.M. Stasinopoulos. Generalised Additive Models for Location Scale and Shape, 2013. R package version 4.2-5.
[24] G.A. Spedicato. Solvency II premium risk modeling under the direct compensation CARD system. PhD thesis, La Sapienza, Università di Roma, 2011.
[25] G.C. Taylor and F.R. Ashe. Second moments of estimates of outstanding claims. Journal of Econometrics, 23:37-61, 1983.
[26] G.G. Venter. Mortality trend models. Casualty Actuarial Society E-Forum, 2:1-30, 2011.
[27] R.J. Verrall. Claims reserving and generalised additive models. Insurance: Mathematics and Economics, 19:31-43, 1996.
[28] T.S. Wright. A stochastic method for claims reserving in general insurance. Journal of the Institute of Actuaries, 117:677-731, 1990.