Credibility Theory Features of Actuar
Christophe Dutang
Université Paris Dauphine
Vincent Goulet
Université Laval
Xavier Milhaud
Université Claude Bernard Lyon 1
Mathieu Pigeon
Université du Québec à Montréal
1 Introduction
Credibility models are actuarial tools to distribute premiums fairly among a
heterogeneous group of policyholders (henceforth called entities). More gen-
erally, they can be seen as prediction methods applicable in any setting where
repeated measures are made for subjects with different risk levels.
The credibility theory features of actuar consist of matrix hachemeister
containing the famous data set of Hachemeister (1975) and function cm to fit
hierarchical (including Bühlmann, Bühlmann-Straub), regression and linear
Bayes credibility models. Furthermore, function rcomphierarc can simulate
portfolios of data satisfying the assumptions of the aforementioned credibility
models; see the "simulation" vignette for details.
> data(hachemeister)
> hachemeister
state ratio.1 ratio.2 ratio.3 ratio.4 ratio.5
[1,] 1 1738 1642 1794 2051 2079
[2,] 2 1364 1408 1597 1444 1342
[3,] 3 1759 1685 1479 1763 1674
[4,] 4 1223 1146 1010 1257 1426
[5,] 5 1456 1499 1609 1741 1482
ratio.6 ratio.7 ratio.8 ratio.9 ratio.10 ratio.11
[1,] 2234 2032 2035 2115 2262 2267
[2,] 1675 1470 1448 1464 1831 1612
[3,] 2103 1502 1622 1828 2155 2233
[4,] 1532 1953 1123 1343 1243 1762
[5,] 1572 1606 1735 1607 1573 1613
ratio.12 weight.1 weight.2 weight.3 weight.4
[1,] 2517 7861 9251 8706 8575
[2,] 1471 1622 1742 1523 1515
[3,] 2059 1147 1357 1329 1204
[4,] 1306 407 396 348 341
[5,] 1690 2902 3172 3046 3068
weight.5 weight.6 weight.7 weight.8 weight.9
[1,] 7917 8263 9456 8003 7365
[2,] 1622 1602 1964 1515 1527
[3,] 998 1077 1277 1218 896
[4,] 315 328 352 331 287
[5,] 2693 2910 3275 2697 2663
weight.10 weight.11 weight.12
[1,] 7832 7849 9077
[2,] 1748 1654 1861
[3,] 1003 1108 1121
[4,] 384 321 342
[5,] 3017 3242 3425
(Bühlmann and Gisler, 2005, Section 8.4); linear Bayes models. The modular
design of cm makes it easy to add new models if desired.
This section concentrates on usage of cm for hierarchical models.
There are some variations in the formulas of the hierarchical model in the
literature. We compute the credibility premiums as given in Bühlmann and
Jewell (1987) or Bühlmann and Gisler (2005), supporting three types of esti-
mators of the between variance structure parameters: the unbiased estimators
of Bühlmann and Gisler (2005) (the default), the slightly different version of
Ohlsson (2005) and the iterative pseudo-estimators as found in Goovaerts and
Hoogstad (1987) or Goulet (1998).
Consider an insurance portfolio where entities are classified into cohorts. In
our terminology, this is a two-level hierarchical classification structure. The
observations are claim amounts $S_{ijt}$, where index $i = 1, \dots, I$ identifies the
cohort, index $j = 1, \dots, J_i$ identifies the entity within the cohort and index
$t = 1, \dots, n_{ij}$ identifies the period (usually a year). To each data point corresponds
a weight (or volume) $w_{ijt}$. Then, the best linear prediction for the next
period outcome of an entity based on ratios $X_{ijt} = S_{ijt}/w_{ijt}$ is
The estimator of $s^2$ is
$$
\hat{s}^2 = \frac{1}{\sum_{i=1}^I \sum_{j=1}^{J_i} (n_{ij} - 1)} \sum_{i=1}^I \sum_{j=1}^{J_i} \sum_{t=1}^{n_{ij}} w_{ijt} (X_{ijt} - X_{ijw})^2. \tag{2}
$$
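As a rough illustration of estimator (2), the following base R sketch computes the weighted within-entity variance for a small made-up portfolio. The function name `s2_hat` and the toy data are assumptions introduced here for illustration; they are not part of actuar.

```r
# Sketch of estimator (2): pooled, weighted within-entity variance.
# 'x' and 'w' are matrices of ratios and weights, one row per entity
# and one column per period; NA marks a missing period.
s2_hat <- function(x, w) {
  keep <- rowSums(!is.na(x)) > 0          # drop entities with no experience
  x <- x[keep, , drop = FALSE]
  w <- w[keep, , drop = FALSE]
  w[is.na(x)] <- NA                       # ignore weights of missing ratios
  xw <- rowSums(w * x, na.rm = TRUE) / rowSums(w, na.rm = TRUE)  # X_ijw
  n <- rowSums(!is.na(x))                 # n_ij, periods observed per entity
  sum(w * (x - xw)^2, na.rm = TRUE) / sum(n - 1)
}

# Hypothetical toy data: two entities, three periods, unit weights.
x <- rbind(c(1, 2, 3), c(2, 4, 6))
w <- matrix(1, nrow = 2, ncol = 3)
s2 <- s2_hat(x, w)
```

With unit weights and the same number of periods for every entity, the estimator reduces to the average of the ordinary sample variances of the rows, which gives a quick sanity check.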
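The three types of estimators for the variance components $a$ and $b$ are the
following. First, let
$$
A_i = \sum_{j=1}^{J_i} w_{ij\Sigma} (X_{ijw} - X_{iww})^2 - (J_i - 1) s^2, \qquad
c_i = w_{i\Sigma\Sigma} - \sum_{j=1}^{J_i} \frac{w_{ij\Sigma}^2}{w_{i\Sigma\Sigma}},
$$
$$
B = \sum_{i=1}^{I} z_{i\Sigma} (X_{izw} - \bar{X}_{zzw})^2 - (I - 1) a, \qquad
d = z_{\Sigma\Sigma} - \sum_{i=1}^{I} \frac{z_{i\Sigma}^2}{z_{\Sigma\Sigma}},
$$
with
$$
\bar{X}_{zzw} = \sum_{i=1}^{I} \frac{z_{i\Sigma}}{z_{\Sigma\Sigma}} X_{izw}. \tag{3}
$$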
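Then the Bühlmann and Gisler (2005) estimators are
$$
\hat{a} = \frac{1}{I} \sum_{i=1}^{I} \max\left(\frac{A_i}{c_i}, 0\right) \tag{4}
$$
$$
\hat{b} = \max\left(\frac{B}{d}, 0\right), \tag{5}
$$
the Ohlsson (2005) estimators are
$$
\hat{a}' = \frac{\sum_{i=1}^{I} A_i}{\sum_{i=1}^{I} c_i} \tag{6}
$$
$$
\hat{b}' = \frac{B}{d} \tag{7}
$$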
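The difference between the truncated average in (4) and the pooled ratio in (6) can be seen in a short base R sketch. The values of $A_i$ and $c_i$ below are made up for illustration and do not come from any data set.

```r
# Hypothetical values of A_i and c_i for three cohorts (illustration only).
A <- c(4, -1, 6)
cc <- c(2, 1, 3)   # named 'cc' to avoid confusion with base::c

# Estimator (4): average of the cohort ratios, each truncated at zero.
a_truncated <- mean(pmax(A / cc, 0))

# Estimator (6): ratio of the sums, with no truncation.
a_pooled <- sum(A) / sum(cc)
```

A negative $A_i$ is set to zero before averaging in (4), whereas in (6) it still offsets the positive terms in the numerator.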
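and the iterative (pseudo-)estimators are
$$
\tilde{a} = \frac{1}{\sum_{i=1}^{I} (J_i - 1)} \sum_{i=1}^{I} \sum_{j=1}^{J_i} z_{ij} (X_{ijw} - X_{izw})^2 \tag{8}
$$
$$
\tilde{b} = \frac{1}{I - 1} \sum_{i=1}^{I} z_i (X_{izw} - X_{zzw})^2, \tag{9}
$$
where
$$
X_{zzw} = \sum_{i=1}^{I} \frac{z_i}{z_\Sigma} X_{izw}. \tag{10}
$$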
Note the difference between the two weighted averages (3) and (10). See Bel-
hadj et al. (2009) for further discussion on this topic.
Finally, the estimator of the collective mean $m$ is $\hat{m} = X_{zzw}$.
The credibility modeling function cm assumes that data is available in the
format most practical applications would use, namely a rectangular array (ma-
trix or data frame) with entity observations in the rows and with one or more
classification index columns (numeric or character). One will recognize the
output format of rcomphierarc and its summary methods.
Then, function cm works much the same as lm. It takes as arguments: a
formula of the form ~ terms describing the hierarchical interactions in a data
set; the data set containing the variables referenced in the formula; the names
of the columns where the ratios and the weights are to be found in the data
set. The data set should contain at least two nodes in each level and more than
one period of experience for at least one entity. Missing values are represented
by NAs. There can be entities with no experience (complete lines of NAs).
In order to give an easily reproducible example, we group states 1 and 3 of
the Hachemeister data set into one cohort and states 2, 4 and 5 into another.
This shows that data does not have to be sorted by level. The fitted model
below uses the iterative estimators of the variance components.
> X <- cbind(cohort = c(1, 2, 1, 2, 2), hachemeister)
> fit <- cm(~cohort + cohort:state, data = X,
+ ratios = ratio.1:ratio.12,
+ weights = weight.1:weight.12,
+ method = "iterative")
> fit
Call:
cm(formula = ~cohort + cohort:state, data = X, ratios = ratio.1:ratio.12,
weights = weight.1:weight.12, method = "iterative")
$state
[1] 2048 1524 1875 1497 1585
One can also obtain a nicely formatted view of the most important results
with a call to summary.
> summary(fit)
Call:
cm(formula = ~cohort + cohort:state, data = X, ratios = ratio.1:ratio.12,
weights = weight.1:weight.12, method = "iterative")
Structure Parameters Estimators
Detailed premiums
Level: cohort
cohort Indiv. mean Weight Cred. factor Cred. premium
1 1967 1.407 0.9196 1949
2 1528 1.596 0.9284 1543
Level: state
cohort state Indiv. mean Weight Cred. factor
1 1 2061 100155 0.8874
2 2 1511 19895 0.6103
1 3 1806 13735 0.5195
2 4 1353 4152 0.2463
2 5 1600 36110 0.7398
Cred. premium
2048
1524
1875
1497
1585
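The Indiv. mean and Weight columns for State 4 can be reproduced by hand from the data printed earlier, and the state premiums can be approximately recovered from the rounded summary values. This base R sketch assumes the usual recursive credibility form, premium = z × individual mean + (1 − z) × cohort premium; since the inputs are rounded, the results agree with the printed output only to within about one unit.

```r
# State 4 ratios and weights, copied from the printed hachemeister matrix.
ratio4 <- c(1223, 1146, 1010, 1257, 1426, 1532,
            1953, 1123, 1343, 1243, 1762, 1306)
weight4 <- c(407, 396, 348, 341, 315, 328,
             352, 331, 287, 384, 321, 342)

total_weight4 <- sum(weight4)                          # "Weight" column
indiv_mean4 <- sum(weight4 * ratio4) / total_weight4   # "Indiv. mean" column

# Rounded values read off the summary output above.
cohort <- c(1, 2, 1, 2, 2)
indiv_mean <- c(2061, 1511, 1806, 1353, 1600)
cred_factor <- c(0.8874, 0.6103, 0.5195, 0.2463, 0.7398)
cohort_premium <- c(1949, 1543)

# Assumed recursive form: shrink each state mean toward its cohort premium.
state_premium <- cred_factor * indiv_mean +
  (1 - cred_factor) * cohort_premium[cohort]
```

States with a small credibility factor, such as State 4, end up much closer to their cohort premium than to their own weighted mean.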
The methods of predict and summary can both report for a subset of the
levels by means of an argument levels.
> summary(fit, levels = "cohort")
Call:
cm(formula = ~cohort + cohort:state, data = X, ratios = ratio.1:ratio.12,
weights = weight.1:weight.12, method = "iterative")
Detailed premiums
cohort Indiv. mean Weight Cred. factor Cred. premium
1 1967 1.407 0.9196 1949
2 1528 1.596 0.9284 1543
> predict(fit, levels = "cohort")
$cohort
[1] 1949 1543
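At the cohort level, the Weight column appears to be the sum of the credibility factors of the states in each cohort, and the individual mean the corresponding credibility-weighted average of the state means. The base R sketch below checks this against the rounded values printed by summary(fit); it is an observation about this output, not a statement about the internals of cm.

```r
# Rounded state-level values read off the summary(fit) output.
cohort <- c(1, 2, 1, 2, 2)
indiv_mean <- c(2061, 1511, 1806, 1353, 1600)
cred_factor <- c(0.8874, 0.6103, 0.5195, 0.2463, 0.7398)

# Sum of the state credibility factors within each cohort.
cohort_weight <- as.vector(tapply(cred_factor, cohort, sum))

# Credibility-weighted average of the state means within each cohort.
cohort_mean <- as.vector(tapply(cred_factor * indiv_mean, cohort, sum)) /
  cohort_weight
```

Up to rounding, cohort_weight matches the printed weights 1.407 and 1.596, and cohort_mean matches the printed individual means 1967 and 1528.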
Structure Parameters Estimators
$$
X_{it} = \beta_0 + \beta_1 t + \varepsilon_t, \qquad t = 1, \dots, 12
$$
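For a single state, the individual regression line of this model can be sketched with a weighted least squares fit in base R. The example uses the State 4 ratios and weights from the hachemeister matrix; treating the weights as regression weights is an assumption made here, and this is only the individual fit, not the credibility-adjusted one computed by cm.

```r
# State 4 ratios and weights, copied from the printed hachemeister matrix.
ratio4 <- c(1223, 1146, 1010, 1257, 1426, 1532,
            1953, 1123, 1343, 1243, 1762, 1306)
weight4 <- c(407, 396, 348, 341, 315, 328,
             352, 331, 287, 384, 321, 342)
time <- 1:12

# Weighted least squares fit of X_t = beta0 + beta1 * t + error.
fit4 <- lm(ratio4 ~ time, weights = weight4)
coef(fit4)

# Individual (not credibility) forecast for period 13.
pred13 <- predict(fit4, newdata = data.frame(time = 13))
```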
[Figure: plot with legend entries collective, individual and credibility; horizontal axis ticks from 2 to 12, vertical axis from 1000 to 2000.]
+ ratios = ratio.1:ratio.12,
+ weights = weight.1:weight.12)
> summary(fit2, newdata = data.frame(time = 13))
Call:
cm(formula = ~state, data = hachemeister, ratios = ratio.1:ratio.12,
weights = weight.1:weight.12, regformula = ~time, regdata = data.frame(time = 1:12),
adj.intercept = TRUE)
Detailed premiums
1651
2071
1597
1698
Figure 2 shows the beneficial effect of the intercept adjustment on the
premium of State 4.
[Figure 2: plot with legend entries collective, individual and credibility; horizontal axis ticks from 2 to 12, vertical axis from 1000 to 2000.]
$$
B_{n+1} = E[\mu(\Theta) | X_1, \dots, X_n].
$$
It is then well known (Bühlmann and Gisler, 2005; Klugman et al., 2012) that
for some combinations of distributions, the Bayesian premium is linear and
can be written as a credibility premium
$$
B_{n+1} = z \bar{X} + (1 - z) m,
$$
members of the univariate exponential family for the distribution of X |Θ = θ
and their natural conjugate for the distribution of Θ:
• X |Θ = θ ∼ Poisson(θ ), Θ ∼ Gamma(α, λ);
• X |Θ = θ ∼ Exponential(θ ), Θ ∼ Gamma(α, λ);
• X |Θ = θ ∼ Normal(θ, σ₂²), Θ ∼ Normal(µ, σ₁²);
• X |Θ = θ ∼ Bernoulli(θ ), Θ ∼ Beta( a, b);
• X |Θ = θ ∼ Geometric(θ ), Θ ∼ Beta( a, b);
and the convolutions
• X |Θ = θ ∼ Gamma(τ, θ ), Θ ∼ Gamma(α, λ);
• X |Θ = θ ∼ Binomial(ν, θ ), Θ ∼ Beta( a, b);
• X |Θ = θ ∼ Negative Binomial(r, θ ) and Θ ∼ Beta( a, b).
Appendix A provides the complete formulas for the above combinations of
distributions.
In addition, Bühlmann and Gisler (2005, section 2.6) show that if X |Θ =
θ ∼ Single Parameter Pareto(θ, x0 ) and Θ ∼ Gamma(α, λ), then the Bayesian
estimator of parameter θ — not of the risk premium! — is
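$$
\hat{\Theta} = \eta \hat{\theta}^{MLE} + (1 - \eta) \frac{\alpha}{\lambda},
$$
where
$$
\hat{\theta}^{MLE} = \frac{n}{\sum_{i=1}^{n} \ln(X_i / x_0)}
$$
is the maximum likelihood estimator of $\theta$ and
$$
\eta = \frac{\sum_{i=1}^{n} \ln(X_i / x_0)}{\lambda + \sum_{i=1}^{n} \ln(X_i / x_0)}
$$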
is a weight not restricted to (0, 1). (See the "distributions" package vignette
for details on the Single Parameter Pareto distribution.)
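The Pareto/Gamma estimator above is simple to evaluate by hand. The following base R sketch uses a made-up sample and made-up prior parameters (x0, x, alpha and lambda are all hypothetical values) and compares the weighted form with the closed form obtained by simplifying it algebraically.

```r
# Hypothetical sample from a Single Parameter Pareto with threshold x0,
# and hypothetical Gamma(alpha, lambda) prior parameters.
x0 <- 10
x <- c(12, 15, 11, 30, 18)
alpha <- 3
lambda <- 2

n <- length(x)
S <- sum(log(x / x0))          # sum of the log ratios ln(X_i / x0)

theta_mle <- n / S             # maximum likelihood estimator of theta
eta <- S / (lambda + S)        # weight given to the MLE
theta_hat <- eta * theta_mle + (1 - eta) * alpha / lambda

# Closed form obtained by simplifying the expression above.
theta_direct <- (alpha + n) / (lambda + S)
```

Both expressions give the same value, since eta * theta_mle = n / (lambda + S) and (1 - eta) * alpha / lambda = alpha / (lambda + S).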
When argument formula is "bayes", function cm computes pure Bayesian
premiums — or estimator in the Pareto/Gamma case — for the combinations
of distributions above. We identify which by means of argument likelihood
that must be one of "poisson", "exponential", "gamma", "normal", "bernoulli",
"binomial", "geometric", "negative binomial" or "pareto". The parameters
of the distribution of X |Θ = θ, if any, and those of the distribution of Θ are
specified using the argument names (and default values) of dgamma, dnorm,
dbeta, dbinom, dnbinom or dpareto1, as appropriate.
Consider the case where
X |Θ = θ ∼ Poisson(θ )
Θ ∼ Gamma(α, λ).
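The posterior distribution of $\Theta$ is
$$
\Theta | X_1, \dots, X_n \sim \text{Gamma}\left(\alpha + \sum_{t=1}^{n} X_t, \lambda + n\right).
$$
Therefore, the Bayesian premium is
$$
\begin{aligned}
B_{n+1} &= E[\mu(\Theta) | X_1, \dots, X_n] \\
&= E[\Theta | X_1, \dots, X_n] \\
&= \frac{\alpha + \sum_{t=1}^{n} X_t}{\lambda + n} \\
&= \frac{n}{n + \lambda} \bar{X} + \frac{\lambda}{n + \lambda} \frac{\alpha}{\lambda} \\
&= z \bar{X} + (1 - z) m,
\end{aligned}
$$
with $z = n/(n + \lambda)$ and $m = \alpha/\lambda$.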
Collective premium: 1
Collective premium: 1
Detailed premiums
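The Poisson/gamma case can be checked numerically in base R without calling cm. The claim counts and the prior parameters below are hypothetical; alpha = lambda = 3 is chosen only so that the collective premium equals 1.

```r
# Hypothetical claim counts and prior parameters (not taken from the source).
x <- c(0, 2, 1, 0, 3)
alpha <- 3    # shape of the gamma prior
lambda <- 3   # rate of the gamma prior

n <- length(x)
m <- alpha / lambda            # collective premium
z <- n / (n + lambda)          # credibility factor

# Bayesian premium as the posterior mean of Theta ...
B_posterior <- (alpha + sum(x)) / (lambda + n)
# ... and in credibility form.
B_credibility <- z * mean(x) + (1 - z) * m
```

The two premiums coincide, which is the linearity property that makes the pure Bayesian premium a credibility premium in this case.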
Risk premium
$$ \mu(\theta) = \theta $$
Collective premium
$$ m = \frac{a}{a + b} $$
Bayesian premium
$$ B_{n+1} = \frac{a + \sum_{t=1}^{n} X_t}{a + b + n} $$
Credibility factor
$$ z = \frac{n}{n + a + b} $$
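Risk premium
$$ \mu(\theta) = \nu\theta $$
Collective premium
$$ m = \frac{\nu a}{a + b} $$
Bayesian premium
$$ B_{n+1} = \frac{\nu (a + \sum_{t=1}^{n} X_t)}{a + b + n\nu} $$
Credibility factor
$$ z = \frac{n}{n + (a + b)/\nu} $$
$$ \tilde{a} = a + n, \qquad \tilde{b} = b + \sum_{t=1}^{n} x_t $$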
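Risk premium
$$ \mu(\theta) = \frac{1 - \theta}{\theta} $$
Collective premium
$$ m = \frac{b}{a - 1} $$
Bayesian premium
$$ B_{n+1} = \frac{b + \sum_{t=1}^{n} X_t}{a + n - 1} $$
Credibility factor
$$ z = \frac{n}{n + a - 1} $$
$$ \tilde{a} = a + nr, \qquad \tilde{b} = b + \sum_{t=1}^{n} x_t $$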
Risk premium
$$ \mu(\theta) = \frac{r (1 - \theta)}{\theta} $$
Collective premium
$$ m = \frac{r b}{a - 1} $$
Bayesian premium
$$ B_{n+1} = \frac{r (b + \sum_{t=1}^{n} X_t)}{a + nr - 1} $$
Credibility factor
$$ z = \frac{n}{n + (a - 1)/r} $$
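Risk premium
$$ \mu(\theta) = \theta $$
Collective premium
$$ m = \frac{\alpha}{\lambda} $$
Bayesian premium
$$ B_{n+1} = \frac{\alpha + \sum_{t=1}^{n} X_t}{\lambda + n} $$
Credibility factor
$$ z = \frac{n}{n + \lambda} $$
$$ \tilde{\alpha} = \alpha + n, \qquad \tilde{\lambda} = \lambda + \sum_{t=1}^{n} x_t $$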
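Risk premium
$$ \mu(\theta) = \frac{1}{\theta} $$
Collective premium
$$ m = \frac{\lambda}{\alpha - 1} $$
Bayesian premium
$$ B_{n+1} = \frac{\lambda + \sum_{t=1}^{n} X_t}{\alpha + n - 1} $$
Credibility factor
$$ z = \frac{n}{n + \alpha - 1} $$
$$ \tilde{\alpha} = \alpha + n\tau, \qquad \tilde{\lambda} = \lambda + \sum_{t=1}^{n} x_t $$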
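Risk premium
$$ \mu(\theta) = \frac{\tau}{\theta} $$
Collective premium
$$ m = \frac{\tau\lambda}{\alpha - 1} $$
Bayesian premium
$$ B_{n+1} = \frac{\tau (\lambda + \sum_{t=1}^{n} X_t)}{\alpha + n\tau - 1} $$
Credibility factor
$$ z = \frac{n}{n + (\alpha - 1)/\tau} $$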
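Risk premium
$$ \mu(\theta) = \theta $$
Collective premium
$$ m = \mu $$
Bayesian premium
$$ B_{n+1} = \frac{\sigma_1^2 \sum_{t=1}^{n} X_t + \sigma_2^2 \mu}{n \sigma_1^2 + \sigma_2^2} $$
Credibility factor
$$ z = \frac{n}{n + \sigma_2^2 / \sigma_1^2} $$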
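As a consistency check on the formulas of this appendix, the base R sketch below evaluates the Bayesian premium, credibility factor and collective premium of each case and verifies that B = z X̄ + (1 − z) m holds. All parameter values and observations are hypothetical, and the variable names (nu, r, tau, s1sq, s2sq and so on) are chosen for illustration; they are not the argument names used by cm.

```r
x <- c(1, 0, 2, 1, 3)          # hypothetical observations
n <- length(x); S <- sum(x); xbar <- mean(x)
# The identity is purely algebraic, so the same x is reused for every case,
# even where some values fall outside a distribution's support.

# Hypothetical parameter values, for illustration only.
a <- 4; b <- 3; nu <- 5; r <- 2
alpha <- 4; lambda <- 3; tau <- 2
mu <- 1; s1sq <- 2; s2sq <- 5   # sigma_1^2 and sigma_2^2

formulas <- list(
  bernoulli   = c(B = (a + S) / (a + b + n),
                  z = n / (n + a + b), m = a / (a + b)),
  binomial    = c(B = nu * (a + S) / (a + b + n * nu),
                  z = n / (n + (a + b) / nu), m = nu * a / (a + b)),
  geometric   = c(B = (b + S) / (a + n - 1),
                  z = n / (n + a - 1), m = b / (a - 1)),
  negbinomial = c(B = r * (b + S) / (a + n * r - 1),
                  z = n / (n + (a - 1) / r), m = r * b / (a - 1)),
  poisson     = c(B = (alpha + S) / (lambda + n),
                  z = n / (n + lambda), m = alpha / lambda),
  exponential = c(B = (lambda + S) / (alpha + n - 1),
                  z = n / (n + alpha - 1), m = lambda / (alpha - 1)),
  gamma       = c(B = tau * (lambda + S) / (alpha + n * tau - 1),
                  z = n / (n + (alpha - 1) / tau), m = tau * lambda / (alpha - 1)),
  normal      = c(B = (s1sq * S + s2sq * mu) / (n * s1sq + s2sq),
                  z = n / (n + s2sq / s1sq), m = mu)
)

# TRUE for each case where the Bayesian premium equals the credibility form.
is_linear <- sapply(formulas, function(f) {
  isTRUE(all.equal(unname(f["B"]),
                   unname(f["z"] * xbar + (1 - f["z"]) * f["m"])))
})
```

Every entry of is_linear should be TRUE, which confirms that each Bayesian premium listed here is an exact credibility premium.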
References
H. Belhadj, V. Goulet, and T. Ouellet. On parameter estimation in hierarchical
credibility. ASTIN Bulletin, 39(2), 2009.
H. Bühlmann. Experience rating and credibility. ASTIN Bulletin, 5:157–165,
1969.
H. Bühlmann and A. Gisler. Credibility in the regression case revisited. ASTIN
Bulletin, 27:83–98, 1997.
H. Bühlmann and A. Gisler. A course in credibility theory and its applications.
Springer, 2005. ISBN 3-5402575-3-5.
H. Bühlmann and W. S. Jewell. Hierarchical credibility revisited. Bulletin of
the Swiss Association of Actuaries, 87:35–54, 1987.