D Important Probability Distributions

Development of stochastic models is facilitated by identifying a few
probability distributions that seem to correspond to a variety of
data-generating processes, and then studying the properties of these
distributions. In the following tables, I list some of the more useful
distributions, both discrete distributions and continuous ones. The names
listed are the most common names, although some distributions go by different
names, especially for specific values of the parameters. In the first column,
following the name of the distribution, the parameter space is specified.
There are two very special continuous distributions, for which I use special
symbols: the uniform over the interval $[a, b]$, designated $\mathrm{U}(a, b)$,
and the normal (or Gaussian), denoted by $\mathrm{N}(\mu, \sigma^2)$. Notice
that the second parameter in the notation for the normal is the variance.
Sometimes, such as in the functions in R, the second parameter of the normal
distribution is the standard deviation instead of the variance. A normal
distribution with $\mu = 0$ and $\sigma^2 = 1$ is called the standard normal.
I also often use the notation $\phi(x)$ for the PDF of a standard normal and
$\Phi(x)$ for the CDF of a standard normal, and these are generalized in the
obvious way as $\phi(x \mid \mu, \sigma^2)$ and $\Phi(x \mid \mu, \sigma^2)$.
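The variance versus standard deviation distinction is easy to trip over in
software. As a minimal sketch in Python (chosen here only for illustration; the
text itself refers to R), the standard library class `statistics.NormalDist`
takes the standard deviation as its second argument, so the distribution
written as $\mathrm{N}(1, 4)$ in the notation above is constructed with
`sigma = 2`. The numerical values of `mu` and `variance` are arbitrary
assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu, variance = 1.0, 4.0      # N(mu, sigma^2) in the notation of the text

# NormalDist is parameterized by the standard deviation, not the variance.
dist = NormalDist(mu=mu, sigma=sqrt(variance))

standard = NormalDist()      # mu = 0, sigma = 1: the standard normal
phi_0 = standard.pdf(0.0)    # phi(0) = 1 / sqrt(2 * pi)
Phi_0 = standard.cdf(0.0)    # Phi(0) = 1/2 by symmetry
```

Passing `variance` directly as `sigma` would silently give a distribution with
variance 16, which is the kind of mistake the remark about notation guards
against.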
Except for the uniform and the normal, I designate distributions by a name
followed by symbols for the parameters, for example, binomial($\pi$, $n$) or
gamma($\alpha$, $\beta$). Some families of distributions are subfamilies of
larger families. For example, the usual gamma family of distributions is the
two-parameter subfamily of the three-parameter gamma.
There are other general families of probability distributions that are defined
in terms of a differential equation or of a form for the CDF. These include
the Pearson, Johnson, Burr, and Tukey's lambda distributions.
Most of the common distributions fall naturally into one of two classes.
They have either a countable support with positive probability at each point
in the support, or a continuous (dense, uncountable) support with zero prob-
ability for any subset with zero Lebesgue measure. The distributions listed in
the following tables are divided into these two natural classes.
Elements of Computational Statistics, Second Edition © 2013 James E. Gentle
There are situations for which these two distinct classes are not appropriate.
For many such situations, however, a mixture distribution provides an
appropriate model. We can express a PDF of a mixture distribution as
$$p_M(y) = \sum_{j=1}^m \omega_j \, p_j(y \mid \theta_j),$$
where the $m$ distributions with PDFs $p_j$ can be either discrete or
continuous.
A simple example is a probability model for the amount of rainfall in a given
period, say a day. It is likely that a nonzero probability should be associated
with zero rainfall, but with no other amount of rainfall. In the model above,
$m$ is 2, $\omega_1$ is the probability of no rain, $p_1$ is a degenerate PDF
with a value of 1 at 0, $\omega_2 = 1 - \omega_1$, and $p_2$ is some continuous
PDF over $\mathrm{IR}_+$, possibly similar to a distribution in the exponential
family.
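The rainfall model can be sketched as a two-component sampler. The code below
is only an illustration under stated assumptions: the continuous component is
taken to be an exponential distribution, and the function name
`sample_rainfall`, the weight 0.7, and the wet-day mean 5.0 are all
hypothetical choices, not values from the text.

```python
import random

def sample_rainfall(omega1, mean_wet, rng):
    """Draw one day's rainfall from a two-component mixture.

    omega1   -- probability of no rain (weight of the point mass at 0)
    mean_wet -- mean of the exponential used for wet days (an assumption)
    """
    if rng.random() < omega1:
        return 0.0                            # degenerate component p_1
    return rng.expovariate(1.0 / mean_wet)    # continuous component p_2

rng = random.Random(42)
omega1, mean_wet = 0.7, 5.0                   # hypothetical values
draws = [sample_rainfall(omega1, mean_wet, rng) for _ in range(100_000)]

frac_dry = sum(d == 0.0 for d in draws) / len(draws)
mean_rain = sum(draws) / len(draws)           # near (1 - omega1) * mean_wet
```

The fraction of exactly-zero draws estimates the weight of the degenerate
component, which neither a purely discrete nor a purely continuous model could
represent.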
A mixture family that is useful in robustness studies is the
$\epsilon$-mixture distribution family, which is characterized by a given
family with CDF $P$ that is referred to as the reference distribution, together
with a point $x_c$ and a weight $\epsilon$. The CDF of an $\epsilon$-mixture
distribution family is
$$P_{x_c,\epsilon}(x) = (1 - \epsilon) P(x) + \epsilon \, \mathrm{I}_{[x_c, \infty[}(x),$$
where $0 \le \epsilon \le 1$.
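The mixture CDF above can be evaluated directly. The following is a minimal
sketch, assuming a standard normal reference distribution; the function name
`eps_mixture_cdf` and the values of the contamination point and weight are
hypothetical.

```python
from statistics import NormalDist

def eps_mixture_cdf(x, x_c, eps, ref_cdf):
    """Evaluate (1 - eps) * P(x) + eps * I_[x_c, inf[(x)."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    indicator = 1.0 if x >= x_c else 0.0
    return (1.0 - eps) * ref_cdf(x) + eps * indicator

P = NormalDist().cdf    # reference CDF: standard normal (an assumption)

below = eps_mixture_cdf(2.9, x_c=3.0, eps=0.05, ref_cdf=P)
at_xc = eps_mixture_cdf(3.0, x_c=3.0, eps=0.05, ref_cdf=P)
no_contamination = eps_mixture_cdf(0.0, x_c=3.0, eps=0.0, ref_cdf=P)
```

The CDF jumps by the weight at the contamination point, which is how the
family places a point mass there while leaving the shape of the reference
distribution otherwise unchanged.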
Another example of a mixture distribution is a binomial with constant
parameter $n$, but with a nonconstant parameter $\pi$. In many applications, if
an identical binomial distribution is assumed (that is, a constant $\pi$), it
is often the case that "over-dispersion" will be observed; that is, the sample
variance exceeds what would be expected given an estimate of some other
parameter that determines the population variance. This situation can occur in
a model, such as the binomial, in which a single parameter determines both the
first and second moments. The mixture model above in which each $p_j$ is a
binomial PDF with parameters $n$ and $\pi_j$ may be a better model.
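Of course, we can extend this kind of mixing even further. Instead of
$\omega_j p_j(y \mid \theta_j)$ with $\omega_j \ge 0$ and
$\sum_{j=1}^m \omega_j = 1$, we can take $\omega(\theta) p(y \mid \theta)$ with
$\omega(\theta) \ge 0$ and $\int \omega(\theta) \, \mathrm{d}\theta = 1$, from
which we recognize that $\omega(\theta)$ is a PDF and $\theta$ can be
considered to be the realization of a random variable.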
Extending the example of the mixture of binomial distributions, we may choose
some reasonable PDF $\omega(\pi)$. An obvious choice is a beta PDF. This yields
the beta-binomial distribution, with PDF
$$p_{X,\Pi}(x, \pi) = \binom{n}{x} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} \, \pi^{x+\alpha-1} (1 - \pi)^{n-x+\beta-1} \, \mathrm{I}_{\{0,1,\ldots,n\} \times ]0,1[}(x, \pi).$$
This is a standard distribution but I did not include it in the tables below.
This distribution may be useful in situations in which a binomial model is
appropriate, but the probability parameter is changing more-or-less
continuously.
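The two-stage structure of the beta-binomial, and the over-dispersion it
produces relative to a binomial with constant probability, can be seen in a
small simulation. This is a sketch under assumed parameter values
($n = 20$, $\alpha = 2$, $\beta = 3$); the function name
`sample_beta_binomial` is hypothetical.

```python
import random

def sample_beta_binomial(n, alpha, beta, rng):
    """Draw pi from a beta(alpha, beta) PDF, then x from binomial(pi, n)."""
    pi = rng.betavariate(alpha, beta)
    return sum(rng.random() < pi for _ in range(n))

rng = random.Random(0)
n, alpha, beta = 20, 2.0, 3.0                 # hypothetical values
xs = [sample_beta_binomial(n, alpha, beta, rng) for _ in range(50_000)]

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

pi_bar = alpha / (alpha + beta)               # mean of the beta mixing PDF
binomial_var = n * pi_bar * (1.0 - pi_bar)    # variance if pi were constant
```

With these illustrative values the constant-probability binomial variance is
4.8, while the simulated variance comes out several times larger even though
the means agree, which is the over-dispersion described earlier.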
We recognize a basic property of any mixture distribution: It is a joint
distribution factored as a marginal (prior) for a random variable, which is often
not observable, and a conditional distribution for another random variable,
which is usually the observable variable of interest.
In Bayesian analyses, the first two assumptions (a prior distribution for the
parameters and a conditional distribution for the observable) lead immediately
to a mixture distribution. The beta-binomial above arises in a canonical
example of Bayesian analysis.
Some distributions are recognized because of their use as conjugate priors
and their relationship to sampling distributions. These include the inverted
chi-square and the inverted Wishart.
General References
Evans et al. (2000) give general descriptions of 40 probability distributions.
Balakrishnan and Nevzorov (2003) provide an overview of the important
characteristics that distinguish different distributions and then describe the
important characteristics of many common distributions. Leemis and McQueston
(2008) present an interesting compact graph of the relationships among a
large number of probability distributions.
Currently, the most readily accessible summary of common probability
distributions is Wikipedia: http://wikipedia.org/ Search under the name
of the distribution.
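Table D.1. Discrete Distributions (PDFs are with respect to counting measure)

| Distribution; parameter space | Property | Expression |
|---|---|---|
| discrete uniform | PDF | $\frac{1}{m}$, $y = a_1, \ldots, a_m$ |
| $a_1, \ldots, a_m \in \mathrm{IR}$ | mean | $\sum a_i/m$ |
| | variance | $\sum (a_i - \bar{a})^2/m$, where $\bar{a} = \sum a_i/m$ |
| Bernoulli | PDF | $\pi^y (1-\pi)^{1-y}$, $y = 0, 1$ |
| $\pi \in\, ]0,1[$ | mean | $\pi$ |
| | variance | $\pi(1-\pi)$ |
| binomial ($n$ Bernoullis) | PDF | $\binom{n}{y} \pi^y (1-\pi)^{n-y}$, $y = 0, 1, \ldots, n$ |
| $n = 1, 2, \ldots$; $\pi \in\, ]0,1[$ | CF | $(1 - \pi + \pi e^{\mathrm{i}t})^n$ |
| | mean | $n\pi$ |
| | variance | $n\pi(1-\pi)$ |
| geometric | PDF | $\pi(1-\pi)^y$, $y = 0, 1, 2, \ldots$ |
| $\pi \in\, ]0,1[$ | mean | $(1-\pi)/\pi$ |
| | variance | $(1-\pi)/\pi^2$ |
| negative binomial ($n$ geometrics) | PDF | $\binom{y+n-1}{n-1} \pi^n (1-\pi)^y$, $y = 0, 1, 2, \ldots$ |
| $n = 1, 2, \ldots$; $\pi \in\, ]0,1[$ | CF | $\left(\frac{\pi}{1 - (1-\pi)e^{\mathrm{i}t}}\right)^n$ |
| | mean | $n(1-\pi)/\pi$ |
| | variance | $n(1-\pi)/\pi^2$ |
| multinomial | PDF | $\frac{n!}{\prod y_i!} \prod_{i=1}^d \pi_i^{y_i}$, $y_i = 0, 1, \ldots, n$, $\sum y_i = n$ |
| $n = 1, 2, \ldots$, | CF | $\left(\sum_{i=1}^d \pi_i e^{\mathrm{i}t_i}\right)^n$ |
| for $i = 1, \ldots, d$, $\pi_i \in\, ]0,1[$, $\sum \pi_i = 1$ | means | $n\pi_i$ |
| | variances | $n\pi_i(1-\pi_i)$ |
| | covariances | $-n\pi_i\pi_j$ |
| hypergeometric | PDF | $\binom{M}{y}\binom{N-M}{n-y} \Big/ \binom{N}{n}$, $y = \max(0, n-N+M), \ldots, \min(n, M)$ |
| $N = 2, 3, \ldots$; | mean | $nM/N$ |
| $M = 1, \ldots, N$; $n = 1, \ldots, N$ | variance | $(nM/N)(1 - M/N)(N-n)/(N-1)$ |
| Poisson | PDF | $\theta^y e^{-\theta}/y!$, $y = 0, 1, 2, \ldots$ |
| $\theta \in \mathrm{IR}_+$ | CF | $e^{\theta(e^{\mathrm{i}t} - 1)}$ |
| | mean | $\theta$ |
| | variance | $\theta$ |
| power series | PDF | $\frac{h_y}{c(\theta)} \theta^y$, $y = 0, 1, 2, \ldots$ |
| $\theta \in \mathrm{IR}_+$ | CF | $\sum_y h_y (\theta e^{\mathrm{i}t})^y / c(\theta)$ |
| $\{h_y\}$ positive constants | mean | $\theta \frac{\mathrm{d}}{\mathrm{d}\theta} \log(c(\theta))$ |
| $c(\theta) = \sum_y h_y \theta^y$ | variance | $\theta \frac{\mathrm{d}}{\mathrm{d}\theta} \log(c(\theta)) + \theta^2 \frac{\mathrm{d}^2}{\mathrm{d}\theta^2} \log(c(\theta))$ |
| logarithmic | PDF | $-\frac{\pi^y}{y \log(1-\pi)}$, $y = 1, 2, 3, \ldots$ |
| $\pi \in\, ]0,1[$ | mean | $-\pi/((1-\pi)\log(1-\pi))$ |
| | variance | $-\pi(\pi + \log(1-\pi))/((1-\pi)^2 (\log(1-\pi))^2)$ |
| Benford's | PDF | $\log_b(y+1) - \log_b(y)$, $y = 1, \ldots, b-1$ |
| $b$ integer $\ge 3$ | mean | $b - 1 - \log_b((b-1)!)$ |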
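Closed-form entries such as those in Table D.1 can be checked by direct
summation over the support. The sketch below does this for Benford's
distribution and for the geometric distribution; the helper names, the base
$b = 10$, the probability 0.3, and the truncation point 2000 are all
illustrative assumptions.

```python
from math import factorial, log

def benford_pdf(y, b):
    """Benford's PDF from Table D.1: log_b(y + 1) - log_b(y)."""
    return log(y + 1, b) - log(y, b)

def geometric_pdf(y, pi):
    """Geometric PDF from Table D.1: pi * (1 - pi)**y for y = 0, 1, 2, ..."""
    return pi * (1.0 - pi) ** y

b = 10
benford_total = sum(benford_pdf(y, b) for y in range(1, b))
benford_mean = sum(y * benford_pdf(y, b) for y in range(1, b))
benford_closed = b - 1 - log(factorial(b - 1), b)     # table entry

pi = 0.3
ys = range(2000)   # truncation point; the neglected tail mass is negligible
geometric_mean = sum(y * geometric_pdf(y, pi) for y in ys)
geometric_var = sum((y - geometric_mean) ** 2 * geometric_pdf(y, pi) for y in ys)
```

Note that the geometric distribution in the table has support starting at 0
(it counts failures before the first success), so its mean is
$(1-\pi)/\pi$ rather than $1/\pi$.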
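Table D.2. The Normal Distributions

| Distribution; parameter space | Property | Expression |
|---|---|---|
| normal; $\mathrm{N}(\mu, \sigma^2)$ | PDF | $\phi(y \mid \mu, \sigma^2) \overset{\mathrm{def}}{=} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(y-\mu)^2/2\sigma^2}$ |
| $\mu \in \mathrm{IR}$; $\sigma \in \mathrm{IR}_+$ | CF | $e^{\mathrm{i}\mu t - \sigma^2 t^2/2}$ |
| | mean | $\mu$ |
| | variance | $\sigma^2$ |
| multivariate normal; $\mathrm{N}_d(\mu, \Sigma)$ | PDF | $\frac{1}{(2\pi)^{d/2} \lvert\Sigma\rvert^{1/2}} e^{-(y-\mu)^{\mathrm{T}} \Sigma^{-1} (y-\mu)/2}$ |
| $\mu \in \mathrm{IR}^d$; $\Sigma \succ 0 \in \mathrm{IR}^{d \times d}$ | CF | $e^{\mathrm{i}\mu^{\mathrm{T}} t - t^{\mathrm{T}} \Sigma t/2}$ |
| | mean | $\mu$ |
| | covariance | $\Sigma$ |
| matrix normal | PDF | $\frac{1}{(2\pi)^{nm/2} \lvert\Psi\rvert^{n/2} \lvert\Sigma\rvert^{m/2}} e^{-\mathrm{tr}(\Psi^{-1}(Y-M)^{\mathrm{T}} \Sigma^{-1} (Y-M))/2}$ |
| $M \in \mathrm{IR}^{n \times m}$, $\Psi \succ 0 \in \mathrm{IR}^{m \times m}$, | mean | $M$ |
| $\Sigma \succ 0 \in \mathrm{IR}^{n \times n}$ | covariance | $\Psi \otimes \Sigma$ |
| complex multivariate normal | PDF | $\frac{1}{(2\pi)^{d/2} \lvert\Sigma\rvert^{1/2}} e^{-(z-\mu)^{\mathrm{H}} \Sigma^{-1} (z-\mu)/2}$ |
| $\mu \in \mathrm{I\!\!C}^d$, $\Sigma \succ 0 \in \mathrm{I\!\!C}^{d \times d}$ | mean | $\mu$ |
| | covariance | $\Sigma$ |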
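Table D.3. Sampling Distributions from the Normal Distribution

| Distribution; parameter space | Property | Expression |
|---|---|---|
| chi-squared; $\chi^2_\nu$ | PDF | $\frac{1}{\Gamma(\nu/2) 2^{\nu/2}} y^{\nu/2-1} e^{-y/2} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\nu \in \mathrm{IR}_+$ | mean | $\nu$ |
| if $\nu \in \mathrm{ZZ}_+$, | variance | $2\nu$ |
| t | PDF | $\frac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)\sqrt{\nu\pi}} (1 + y^2/\nu)^{-(\nu+1)/2}$ |
| $\nu \in \mathrm{IR}_+$ | mean | $0$ |
| | variance | $\nu/(\nu-2)$, for $\nu > 2$ |
| F | PDF | $\frac{\nu_1^{\nu_1/2} \nu_2^{\nu_2/2} \, \Gamma((\nu_1+\nu_2)/2) \, y^{\nu_1/2-1}}{\Gamma(\nu_1/2)\Gamma(\nu_2/2) (\nu_2 + \nu_1 y)^{(\nu_1+\nu_2)/2}} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\nu_1, \nu_2 \in \mathrm{IR}_+$ | mean | $\nu_2/(\nu_2-2)$, for $\nu_2 > 2$ |
| | variance | $2\nu_2^2(\nu_1+\nu_2-2)/(\nu_1(\nu_2-2)^2(\nu_2-4))$, for $\nu_2 > 4$ |
| Wishart | PDF | $\frac{\lvert W\rvert^{(\nu-d-1)/2}}{2^{\nu d/2} \lvert\Sigma\rvert^{\nu/2} \Gamma_d(\nu/2)} \exp\left(-\mathrm{trace}(\Sigma^{-1}W)/2\right) \mathrm{I}_{\{M \,\mid\, M \succ 0 \in \mathrm{IR}^{d \times d}\}}(W)$ |
| $d = 1, 2, \ldots$; | mean | $\nu\Sigma$ |
| $\nu > d-1 \in \mathrm{IR}$; | covariance | $\mathrm{Cov}(W_{ij}, W_{kl}) = \nu(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk})$, where $\Sigma = (\sigma_{ij})$ |
| $\Sigma \succ 0 \in \mathrm{IR}^{d \times d}$ | | |
| noncentral chi-squared | PDF | $\frac{e^{-\lambda/2}}{2^{\nu/2}} y^{\nu/2-1} e^{-y/2} \sum_{k=0}^{\infty} \frac{(\lambda/2)^k}{k!} \frac{1}{\Gamma(\nu/2+k) 2^k} y^k \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\nu, \lambda \in \mathrm{IR}_+$ | mean | $\nu + \lambda$ |
| | variance | $2(\nu + 2\lambda)$ |
| noncentral t | PDF | $\frac{\nu^{\nu/2} e^{-\delta^2/2}}{\Gamma(\nu/2)\pi^{1/2}} (\nu + y^2)^{-(\nu+1)/2} \sum_{k=0}^{\infty} \Gamma\left(\frac{\nu+k+1}{2}\right) \frac{(\delta y)^k}{k!} \left(\frac{2}{\nu+y^2}\right)^{k/2}$ |
| $\nu \in \mathrm{IR}_+$, $\delta \in \mathrm{IR}$ | mean | $\frac{\delta(\nu/2)^{1/2}\Gamma((\nu-1)/2)}{\Gamma(\nu/2)}$, for $\nu > 1$ |
| | variance | $\frac{\nu}{\nu-2}(1+\delta^2) - \frac{\nu}{2}\delta^2\left(\frac{\Gamma((\nu-1)/2)}{\Gamma(\nu/2)}\right)^2$, for $\nu > 2$ |
| noncentral F | PDF | $\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} e^{-\lambda/2} y^{\nu_1/2-1} \left(\frac{\nu_2}{\nu_2+\nu_1 y}\right)^{\nu_1/2+\nu_2/2} \sum_{k=0}^{\infty} \frac{(\lambda/2)^k \, \Gamma(\nu_2/2+\nu_1/2+k)}{\Gamma(\nu_2/2)\Gamma(\nu_1/2+k)k!} \left(\frac{\nu_1}{\nu_2}\right)^k y^k \left(\frac{\nu_2}{\nu_2+\nu_1 y}\right)^k \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\nu_1, \nu_2, \lambda \in \mathrm{IR}_+$ | mean | $\nu_2(\nu_1+\lambda)/(\nu_1(\nu_2-2))$, for $\nu_2 > 2$ |
| | variance | $2\left(\frac{\nu_2}{\nu_1}\right)^2 \frac{(\nu_1+\lambda)^2 + (\nu_1+2\lambda)(\nu_2-2)}{(\nu_2-2)^2(\nu_2-4)}$, for $\nu_2 > 4$ |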
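The sense in which these are sampling distributions from the normal can be
illustrated for the chi-squared: for a positive integer number of degrees of
freedom, a sum of that many squared independent standard normal variables has
a chi-squared distribution. The sketch below simulates this and compares the
sample moments with the table entries; the function name `chi_squared_draw`,
the choice of 5 degrees of freedom, and the sample size are assumptions.

```python
import random

def chi_squared_draw(nu, rng):
    """Sum of nu squared independent standard normal draws (integer nu)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))

rng = random.Random(1)
nu = 5                                        # hypothetical degrees of freedom
draws = [chi_squared_draw(nu, rng) for _ in range(100_000)]

mean = sum(draws) / len(draws)                                # table: nu
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)  # table: 2 * nu
```

The construction only makes sense for integer degrees of freedom, whereas the
PDF in the table is defined for any positive real value.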
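Table D.4. Distributions Useful as Priors for the Normal Parameters

| Distribution; parameter space | Property | Expression |
|---|---|---|
| inverted gamma | PDF | $\frac{1}{\Gamma(\alpha)\beta^\alpha} \left(\frac{1}{y}\right)^{\alpha+1} e^{-1/\beta y} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\alpha, \beta \in \mathrm{IR}_+$ | mean | $1/(\beta(\alpha-1))$ for $\alpha > 1$ |
| | variance | $1/(\beta^2(\alpha-1)^2(\alpha-2))$ for $\alpha > 2$ |
| inverted chi-squared | PDF | $\frac{1}{\Gamma(\nu/2) 2^{\nu/2}} \left(\frac{1}{y}\right)^{\nu/2+1} e^{-1/2y} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\nu \in \mathrm{IR}_+$ | mean | $1/(\nu-2)$ for $\nu > 2$ |
| | variance | $2/((\nu-2)^2(\nu-4))$ for $\nu > 4$ |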
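The inverted gamma in Table D.4 is the distribution of the reciprocal of a
gamma random variable with the same parameters, where the second gamma
parameter acts as a scale. As a rough check under that reading, the sketch
below uses Python's `random.gammavariate`, whose second argument is also a
scale parameter, and compares the sample mean of the reciprocals with the
table entry. The parameter values are arbitrary assumptions.

```python
import random

rng = random.Random(7)
alpha, beta = 4.0, 0.5                        # hypothetical values, alpha > 2

# gammavariate(alpha, beta) has density proportional to
# y**(alpha - 1) * exp(-y / beta), so beta is a scale parameter.
draws = [1.0 / rng.gammavariate(alpha, beta) for _ in range(100_000)]

mean = sum(draws) / len(draws)
mean_closed = 1.0 / (beta * (alpha - 1.0))    # table entry for the mean
```

Only the mean is compared here because sample variances of inverted
distributions converge slowly when higher moments do not exist.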
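Table D.5. Distributions Derived from the Univariate Normal

| Distribution; parameter space | Property | Expression |
|---|---|---|
| lognormal | PDF | $\frac{1}{\sqrt{2\pi}\,\sigma} y^{-1} e^{-(\log(y)-\mu)^2/2\sigma^2} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\mu \in \mathrm{IR}$; $\sigma \in \mathrm{IR}_+$ | mean | $e^{\mu+\sigma^2/2}$ |
| | variance | $e^{2\mu+\sigma^2}(e^{\sigma^2} - 1)$ |
| inverse Gaussian | PDF | $\sqrt{\frac{\lambda}{2\pi y^3}} \, e^{-\lambda(y-\mu)^2/2\mu^2 y} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\mu, \lambda \in \mathrm{IR}_+$ | mean | $\mu$ |
| | variance | $\mu^3/\lambda$ |
| skew normal | PDF | $\frac{1}{\pi\sigma} e^{-(y-\mu)^2/2\sigma^2} \int_{-\infty}^{\lambda(y-\mu)/\sigma} e^{-t^2/2} \, \mathrm{d}t$ |
| $\mu, \lambda \in \mathrm{IR}$; $\sigma \in \mathrm{IR}_+$ | mean | $\mu + \sigma\lambda\sqrt{\frac{2}{\pi(1+\lambda^2)}}$ |
| | variance | $\sigma^2\left(1 - \frac{2\lambda^2}{\pi(1+\lambda^2)}\right)$ |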
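The lognormal entries can be checked by exponentiating normal draws, since a
lognormal variable is the exponential of a normal one. This is a minimal
simulation sketch; the parameter values $\mu = 0$ and $\sigma = 0.5$ and the
sample size are assumptions.

```python
import random
from math import exp

rng = random.Random(3)
mu, sigma = 0.0, 0.5                          # hypothetical values
draws = [exp(rng.gauss(mu, sigma)) for _ in range(200_000)]

mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)

mean_closed = exp(mu + sigma ** 2 / 2.0)                          # table entry
var_closed = exp(2.0 * mu + sigma ** 2) * (exp(sigma ** 2) - 1.0)  # table entry
```

Here `rng.gauss` takes the standard deviation as its second argument, so
`sigma` rather than its square is passed.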
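Table D.6. Other Continuous Distributions (PDFs are with respect to Lebesgue measure)

| Distribution; parameter space | Property | Expression |
|---|---|---|
| beta | PDF | $\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} y^{\alpha-1}(1-y)^{\beta-1} \, \mathrm{I}_{[0,1]}(y)$ |
| $\alpha, \beta \in \mathrm{IR}_+$ | mean | $\alpha/(\alpha+\beta)$ |
| | variance | $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$ |
| Dirichlet | PDF | $\frac{\Gamma(\sum_{i=1}^{d+1}\alpha_i)}{\prod_{i=1}^{d+1}\Gamma(\alpha_i)} \prod_{i=1}^{d} y_i^{\alpha_i-1} \left(1 - \sum_{i=1}^{d} y_i\right)^{\alpha_{d+1}-1} \mathrm{I}_{[0,1]^d}(y)$ |
| $\alpha \in \mathrm{IR}_+^{d+1}$ | mean | $\alpha/\lVert\alpha\rVert_1$ ($\alpha_{d+1}/\lVert\alpha\rVert_1$ is the mean of $Y_{d+1}$.) |
| | variance | $\frac{\alpha(\lVert\alpha\rVert_1 - \alpha)}{\lVert\alpha\rVert_1^2(\lVert\alpha\rVert_1 + 1)}$ |
| uniform; $\mathrm{U}(\theta_1, \theta_2)$ | PDF | $\frac{1}{\theta_2-\theta_1} \, \mathrm{I}_{[\theta_1,\theta_2]}(y)$ |
| $\theta_1 < \theta_2 \in \mathrm{IR}$ | mean | $(\theta_2+\theta_1)/2$ |
| | variance | $(\theta_2^2 - 2\theta_1\theta_2 + \theta_1^2)/12$ |
| Cauchy | PDF | $\frac{1}{\pi\beta\left(1 + \left(\frac{y-\gamma}{\beta}\right)^2\right)}$ |
| $\gamma \in \mathrm{IR}$; $\beta \in \mathrm{IR}_+$ | mean | does not exist |
| | variance | does not exist |
| logistic | PDF | $\frac{e^{-(y-\mu)/\beta}}{\beta(1 + e^{-(y-\mu)/\beta})^2}$ |
| $\mu \in \mathrm{IR}$; $\beta \in \mathrm{IR}_+$ | mean | $\mu$ |
| | variance | $\beta^2\pi^2/3$ |
| Pareto | PDF | $\frac{\alpha\gamma^\alpha}{y^{\alpha+1}} \, \mathrm{I}_{[\gamma,\infty[}(y)$ |
| $\alpha, \gamma \in \mathrm{IR}_+$ | mean | $\alpha\gamma/(\alpha-1)$ for $\alpha > 1$ |
| | variance | $\alpha\gamma^2/((\alpha-1)^2(\alpha-2))$ for $\alpha > 2$ |
| power function | PDF | $\frac{\alpha}{\beta}(y/\beta)^{\alpha-1} \, \mathrm{I}_{[0,\beta[}(y)$ |
| $\alpha, \beta \in \mathrm{IR}_+$ | mean | $\alpha\beta/(\alpha+1)$ |
| | variance | $\alpha\beta^2/((\alpha+2)(\alpha+1)^2)$ |
| von Mises | PDF | $\frac{1}{2\pi I_0(\kappa)} e^{\kappa\cos(y-\mu)} \, \mathrm{I}_{[\mu-\pi,\mu+\pi]}(y)$ |
| $\mu \in \mathrm{IR}$; $\kappa \in \mathrm{IR}_+$ | mean | $\mu$ |
| | variance | $1 - (I_1(\kappa)/I_0(\kappa))^2$ |
| gamma | PDF | $\frac{1}{\Gamma(\alpha)\beta^\alpha} y^{\alpha-1} e^{-y/\beta} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\alpha, \beta \in \mathrm{IR}_+$ | mean | $\alpha\beta$ |
| | variance | $\alpha\beta^2$ |
| three-parameter gamma | PDF | $\frac{1}{\Gamma(\alpha)\beta^\alpha} (y-\gamma)^{\alpha-1} e^{-(y-\gamma)/\beta} \, \mathrm{I}_{]\gamma,\infty[}(y)$ |
| $\alpha, \beta \in \mathrm{IR}_+$; $\gamma \in \mathrm{IR}$ | mean | $\alpha\beta + \gamma$ |
| | variance | $\alpha\beta^2$ |
| exponential | PDF | $\theta^{-1} e^{-y/\theta} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\theta \in \mathrm{IR}_+$ | mean | $\theta$ |
| | variance | $\theta^2$ |
| double exponential | PDF | $\frac{1}{2\theta} e^{-\lvert y-\mu\rvert/\theta}$ |
| $\mu \in \mathrm{IR}$; $\theta \in \mathrm{IR}_+$ | mean | $\mu$ |
| (folded exponential) | variance | $2\theta^2$ |
| Weibull | PDF | $\frac{\alpha}{\beta} y^{\alpha-1} e^{-y^\alpha/\beta} \, \mathrm{I}_{\mathrm{IR}_+}(y)$ |
| $\alpha, \beta \in \mathrm{IR}_+$ | mean | $\beta^{1/\alpha}\Gamma(\alpha^{-1}+1)$ |
| | variance | $\beta^{2/\alpha}\left(\Gamma(2\alpha^{-1}+1) - (\Gamma(\alpha^{-1}+1))^2\right)$ |
| extreme value (Type I) | PDF | $\frac{1}{\beta} e^{-(y-\alpha)/\beta} \exp\left(-e^{-(y-\alpha)/\beta}\right)$ |
| $\alpha \in \mathrm{IR}$; $\beta \in \mathrm{IR}_+$ | mean | $\alpha - \beta\Gamma'(1)$ |
| | variance | $\beta^2\pi^2/6$ |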
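Parameterizations of the same family differ between references and software,
so it can be worth checking a table entry against simulation before relying on
it. The sketch below samples from the Weibull distribution as parameterized in
Table D.6 by inverting its CDF, $1 - e^{-y^\alpha/\beta}$, and compares the
sample mean with the table entry. The function name `weibull_draw` and the
parameter values are assumptions.

```python
import random
from math import gamma, log

def weibull_draw(alpha, beta, rng):
    """Inverse-CDF draw for the PDF (alpha/beta) y^(alpha-1) exp(-y^alpha/beta)."""
    u = rng.random()                          # uniform on [0, 1)
    return (-beta * log(1.0 - u)) ** (1.0 / alpha)

rng = random.Random(11)
alpha, beta = 2.0, 3.0                        # hypothetical values
draws = [weibull_draw(alpha, beta, rng) for _ in range(100_000)]

mean = sum(draws) / len(draws)
mean_closed = beta ** (1.0 / alpha) * gamma(1.0 / alpha + 1.0)   # table entry
var_closed = beta ** (2.0 / alpha) * (
    gamma(2.0 / alpha + 1.0) - gamma(1.0 / alpha + 1.0) ** 2     # table entry
)
```

In this parameterization the second parameter divides the power of the
variable rather than the variable itself, so it is not the scale parameter
used by many software libraries.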
