
Derivation of some frequency distributions using the principle of maximum entropy (POME)


Vijay P. Singh
Department of Civil Engineering

A. K. Rajagopal
Department of Physics and Astronomy

and Kulwant Singh


Department of Civil Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
The principle of maximum entropy (POME) was employed to develop a procedure for derivation
of a number of frequency distributions used in hydrology. The procedure required specification of
constraints and maximization of entropy, and is thus a solution of the classical optimization
problem. The POME led to a unique technique for parameter estimation. For six selected river
gaging stations, parameters of the gamma distribution, the log-Pearson type III distribution and
extreme value type I distribution fitted to annual maximum discharges, were evaluated by this
technique and compared with those obtained by using the methods of moments and maximum
likelihood estimation. The concept of entropy, used as a measure of uncertainty associated with a
specified distribution, facilitated this comparison.

INTRODUCTION
There exists a multitude of frequency distributions for
hydrologic analyses. For example, exponential and
Weibull distributions are often used for frequency
analysis of depth, intensity, duration and number of
rainfall events [Eagleson, 1972; Rao and Chenchagya,
1974, 1975; Richardson, 1982]; gamma distribution for
rainfall-runoff modelling [Nash, 1957; Dooge, 1973;
Singh, 1982a] as well as for flood analysis [Phien and
Jivajirajah, 1984; Yevjevich and Obeysekera, 1984];
extreme-value (EV) type I distribution and its logarithmic
version for flood frequency analysis [Gumbel, 1958;
Todorovic, 1982; Lettenmeier and Burges, 1981; Singh,
1982b]; Pearson type (PT) 3 distribution and its
logarithmic version [Matalas and Wallis, 1977; Bobee
and Robitaille, 1977; Bucket and Oliver, 1977; Kite, 1977;
Rao, 1980, 1983], as well as lognormal distribution
[Kalinske, 1946; Chow, 1954; Sangal and Biswas, 1970;
Burges et al., 1975; Kite, 1977; Kottegoda, 1977]; etc.
Some of the distributions (e.g., gamma, EV1) have been
derived in standard statistical textbooks, but the
approach of derivation has varied from one distribution
to another except for the Pearsonian family. It is therefore
not clear if there is a unified approach which can be
employed to derive any desired distribution. Such an
approach may have several advantages: (1) It may aid in
understanding the distribution by knowing the type of
information needed for its derivation. (2) It may offer an
alternative method of estimating the parameters
Accepted January 1986. Discussion closes August 1986.

0309-1708/86/020091-16 $2.00
© 1986 Computational Mechanics Publications

contained in the distribution which can then be compared


with other methods of parameter estimation such as
methods of moments and maximum likelihood
estimation. (3) It may show connections between different
distributions. (4) It may lead to an alternative way of
assessing goodness of fit of a distribution to experimental
data [Jaynes, 1979]. This study attempts to develop such a
unified approach employing the principle of maximum
entropy abbreviated as POME.
The frequency distributions employed in hydrology
range from one-parameter to five-parameter distributions. The problem of finding which distribution best fits a given set of experimental data of a hydrologic variable has been considered extensively in the literature.
There can be more than one distribution providing
reasonably good fit. Classical statistical tests of goodness
of fit are not sufficiently powerful to discriminate among
reasonable choices of distributions. Frequently, the choice
is made almost arbitrarily. Perhaps the best known
example of an arbitrary choice is the recommendation of
the Work Group on Flow Frequency Methods of Water
Resources Council reported by Benson (1968): 'The log
Pearson type (LPT) 3 distribution has been selected as the
base method, with provisions for departures from the base
method where justified.' The LPT 3 distribution has since
been accepted for flood frequency analysis by most federal
agencies in the United States. The POME may provide an
answer to this question. The underlying argument is as
follows. A data set represents a hydrologic system whose
information content is constant and hopefully finite. The
fitted distribution attempts to mimic some of the system

Adv. Water Resources, 1986, Volume 9, June 91

Derivation of some frequency distributions: V. P. Singh et al.

characteristics or predict a portion of the information


content (or complementarily, lack of it). The distribution
whose information content more closely matches that of
the prototype system should be chosen. Implicitly
assumed here, of course, is that the sample (or data set) is
sufficiently large to reflect the information content of the
system. However, the question of how large is large
warrants further research and is beyond the scope of this
study. It is plausible to extract from the sample maximum
information about the system by using POME.
Likewise, there exists a multitude of methods for
estimating parameters of a frequency distribution. The
methods of moments and maximum likelihood estimation
are perhaps the best known methods. The successful
application of a frequency distribution depends,
doubtless, on the accuracy with which its parameters can
be estimated. None of the existing methods have proved
to be uniformly superior. A number of criteria are used to
evaluate and compare these methods. These, however, do
not convey an unequivocally perceptible impression of
the quality of performance of the methods. The POME
appears to provide an alternative, and hopefully superior,
criterion for evaluation and comparison of these methods.
A related problem in choosing a frequency distribution
is the following: Suppose we have a prior knowledge of
some characteristics of the frequency distribution or may
wish to impose certain constraints on it. For example,
certain moments (or expectations) or even bounds on
these values are known. The problem then is one of
choosing a distribution that in some sense is the best
estimate of the population distribution based on these
known characteristics. There can, in general, be a large
(even an infinite) number of distributions which may
satisfy these constraints. The POME provides a uniquely
correct answer to this question.
One of the reasons frequently cited for preference of the
LPT 3 distribution to a two-parameter EV1 distribution
is that it has more parameters and hence greater fitting
ability. However, what is not clear is how much indeed is
the gain and if this gain is justified in the wake of added
complexity in its parameter estimation. This issue and
those raised above are discussed in this study.

A SHORT HISTORICAL PERSPECTIVE


The entropy of a system was first defined by Boltzmann
(1872) as a measure of our degree of ignorance as to its
true state. In a series of landmark contributions, Shannon
(1948a, 1948b) developed a mathematical theory of
entropy and applied it in the field of communications.
Nearly a decade later, Jaynes (1957a, 1957b, 1961, 1982)
formulated the principle of maximum entropy and applied
it in thermodynamics. The works of Shannon and Jaynes
uncovered a new area of research and provided real
impetus to applications of entropy, and especially POME,
to various areas of science and technology. These
encompass communications, economics, thermodynamics,
psychology, hydrology, statistical mechanics,
reservoir engineering, turbulence, structural reliability,
and landscape evolution, to name but a few. An excellent
exposition on various aspects of this principle and its
applications is contained in Shannon and Weaver (1949),
Jaynes (1961), Tribus (1969), Levine and Tribus (1978)
and Rosenkrantz (1983).
Leopold and Langbein (1962) were perhaps the first to


have applied the concept of entropy in geomorphology
and landscape evolution. It was employed to define the
distribution of energy in a river system. By hypothesizing
a uniform distribution of energy in the river system, the
most probable state of the system was determined. This
yielded equations for the longitudinal profiles of the river,
which were then verified by field data. Recently, Davy and
Davies (1979) examined the thermodynamic basis of the
concept of entropy as applied to fluvial geomorphology.
They concluded that the use of entropy in the analysis of
stream behaviour and sediment transport was of dubious
validity.
Sonuga (1972, 1976) used POME successfully in
frequency analysis and rainfall-runoff relationship. His
work showed its strengths and limitations in the context
of hydrologic modelling, especially in data-scarce areas.
Jowitt (1979) discussed the properties and problems
associated with this concept in parameter estimation of
the extreme value type I distribution. It was shown that
the parameters estimated in this manner were superior to
those estimated by the method of moments.
Another milestone in the application of this concept to
hydrology was achieved by Amorocho and Espildora
(1973). They derived an objective criterion based on
marginal entropy, conditional entropy and transinformation to assess uncertainty of the Stanford Watershed
Model (Crawford and Linsley, 1966) in simulating
streamflow from a basin in California for which historical
records were available. Their results clearly showed the
value and limitations of the concept of entropy in
assessing model performance.
Clearly, studies dealing with the concept of entropy in
hydrology have been relatively few. Nevertheless, their
findings are promising and justify further research. It is
these studies that provided motivation for our work. Our
results espouse the findings of earlier workers.
THE SHANNON ENTROPY FUNCTIONAL (SEF)

Consider a probability density function (pdf) f(x) associated with a dimensionless random variable x. The dimensionless random variable may be constructed by dividing the observed quantities by their mean value, e.g., annual flood maxima divided by the mean annual flood. As usual, f(x) is a positive function for every x in some interval (a, b) and is normalized to unity,

∫_a^b f(x) dx = 1    (1)

We often make a change of variable based on physical or mathematical considerations as

x = W(z)

where W is a monotonic function of z. Under such a transformation, quite generally we have the mapping

x: (a, b) → z: (l, u)

where a = W(l) and b = W(u). Thus, l and u stand for the lower and upper limits in the z-variable. Then

f(x) dx = f(x = W(z)) (dW/dz) dz = g(z) dz

in which

g(z) = f(x = W(z)) dW/dz

Here g(z) is again a pdf, but in the z-variable, and has the positivity as well as normalization properties,

∫_l^u g(z) dz = 1

Often f(x) is not known beforehand, although some of its properties (or constraints) may be known, e.g., moments, lower and upper bounds, etc. These constraints and the condition in (1) are generally insufficient to define f(x) uniquely, but may delineate a set of feasible distributions. Each of these distributions contains a certain amount of uncertainty, which can be expressed by employing the concept of entropy.

Entropy was first mathematically expressed by Shannon (1948a, 1948b). It has since been called the Shannon entropy functional, SEF in short, denoted as I[f] or I[x]; it is a numerical measure of the uncertainty associated with f(x) in describing the random variable x, and is defined as

I[f] = I[x] = -k ∫_a^b f(x) ln[f(x)/m(x)] dx    (2)

where k > 0 is an arbitrary constant or scale factor depending upon the choice of measurement units, and m(x) is an invariant measure function guaranteeing the invariance of I[f] under any allowable change of variable; it provides an origin of measurement for I[f]. The term k can be absorbed into the base of the logarithm, and m(x) may be taken as unity, so that (2) can be written as

I[f] = -∫_a^b f(x) ln f(x) dx    (3)

We may think of I[f] as the mean value of -ln f(x). Actually, -I measures the strength, +I the weakness. The SEF allows choosing that f(x) which minimizes the uncertainty subject to specified constraints. Note that f(x) is conditioned on the constraints used for its derivation. Verdugo Lazo and Rathie (1978) have given the SEF for a number of probability distributions.

The SEF with the transformed function g(z) is written accordingly as

I[g] = -∫_l^u g(z) ln g(z) dz

It can be shown that

I[f] = I[g] + ∫_l^u g(z) ln(dx/dz) dz = I[g] + ∫_a^b f(x) ln(dx/dz) dx

In practice we usually have a discrete set of data points x_i, i = 1, 2, ..., N, instead of a continuous variable x. Therefore, the discrete analog of (3) can be expressed as

I[f] = -Σ_{i=1}^N f_i ln f_i;    Σ_{i=1}^N f_i = 1    (4)

in which f_i denotes the probability of occurrence of x_i, and N is the sample size. Here 0 ≤ f_i ≤ 1 for all i. The passage from the continuous to the discrete and vice versa is subtle, because the f_i in (4) are probabilities whereas f(x) in (3) is a probability density. The use of m(x) as in (2) facilitates the understanding of these transformations to some extent. Except for mentioning this point, we shall not discuss this aspect in this paper. Mostly we will use the form in (3) in formal analysis, but in actual numerical work the discrete version in (4) is employed. For a clear discussion of continuous random variables, their transformations, and probability distributions, one may refer to Rohatgi (1976).

Shannon (1948a, 1948b) showed that I is unique and the only functional that satisfies the following properties: (1) It is a function of the probabilities f_1, f_2, ..., f_N. (2) It follows an additive law, i.e., I[xy] = I[x] + I[y]. (3) It monotonically increases with the number of outcomes when the f_i are all equal. (4) It is consistent and continuous.
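The discrete form (4) and the properties listed above can be checked directly in a few lines of code. The sketch below is illustrative only; the function name and the example probabilities are our own, and natural logarithms are used (entropy in nats).

```python
import math

def shannon_entropy(p):
    """Discrete SEF, equation (4): I[f] = -sum_i f_i ln f_i (in nats)."""
    assert abs(sum(p) - 1.0) < 1e-9, "probabilities must sum to one"
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Property (3): entropy grows with the number of equally likely outcomes.
assert shannon_entropy([0.5, 0.5]) < shannon_entropy([0.25] * 4)

# Property (2), additivity: for independent x and y, I[xy] = I[x] + I[y].
px = [0.3, 0.7]
py = [0.2, 0.5, 0.3]
pxy = [a * b for a in px for b in py]
assert abs(shannon_entropy(pxy) - shannon_entropy(px) - shannon_entropy(py)) < 1e-12
```

For the uniform case f_i = 1/N the function returns ln N, the maximum value of (4).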

THE PRINCIPLE OF MAXIMUM ENTROPY (POME)

The POME, formulated by Jaynes (1961, 1982), states that 'the minimally prejudiced assignment of probabilities is that which maximizes the entropy subject to the given information.' Mathematically, it can be stated as follows: Given m linearly independent constraints C_i in the form

C_i = ∫_a^b y_i(x) f(x) dx,    i = 1, 2, ..., m    (5)

where the y_i(x) are some functions whose averages over f(x) are specified, the maximum of I subject to the conditions (5) is given by the distribution

f(x) = exp[-a_0 - Σ_{i=1}^m a_i y_i(x)]    (6)

where the a_i, i = 0, 1, ..., m, are the Lagrange multipliers, which can be determined from (5) and (6) along with the normalization condition in (1). This can be done as follows.

According to POME, we maximize (3) subject to (5); that is,

δ(-I) = ∫_a^b [1 + ln f(x)] δf(x) dx    (7)

I can be maximized by the method of Lagrange multipliers. This introduces parameters (a_0 - 1), a_1, a_2, ..., a_m, which are chosen such that variations in a functional of f(x),

F(f) = -∫_a^b f(x) [ln f(x) + (a_0 - 1) + Σ_{i=1}^m a_i y_i(x)] dx

vanish when

δF(f) = -∫_a^b [ln f(x) + 1 + (a_0 - 1) + Σ_{i=1}^m a_i y_i(x)] δf(x) dx = 0

This produces

f(x) = exp[-a_0 - Σ_{i=1}^m a_i y_i(x)]

which is the same as (6). The value of I for such f(x) as given by (6) is

I_m[f] = -∫_a^b f(x) ln f(x) dx = a_0 + Σ_{i=1}^m a_i C_i    (8)

The subscript m attached to I is to emphasize the number of constraints used. This, however, raises an important question: How does I change with the changing number of constraints? To address this question, let us suppose that g(x) is some other pdf such that ∫ g(x) dx = 1 and is found by imposing n constraints (n > m) which include the previous m constraints in (5). Then

I_n[g] ≤ I_m[f]  for  n ≥ m    (9)

where

I_n[g] = -∫_a^b g(x) ln g(x) dx    (10)

and

I_m[f] - I_n[g] = ∫_a^b g(x) ln[g(x)/f(x)] dx ≥ 0    (11)

In order to prove these statements, consider

I[g|f] = ∫_a^b g(x) ln[g(x)/f(x)] dx    (12)

Because of Jensen's inequality,

ln x ≥ 1 - 1/x    (13)

we have, upon using the normalization of f(x) and g(x),

I[g|f] ≥ 0

From (10), this relation may be written as

-∫_a^b g(x) ln g(x) dx ≤ -∫_a^b g(x) ln f(x) dx

Inserting (6) for f(x) in the right side of this inequality and using the definitions (8) and (10), we get (9). To obtain (11) we note that -∫_a^b g(x) ln f(x) dx = -∫_a^b f(x) ln f(x) dx in this problem, because the first m constraints are the same; hence

I[g|f] = I_m[f] - I_n[g]    (14a)

and we obtain (11). The significance of this result lies in the fact that an increase in the number of constraints leads to less uncertainty as to the information concerning the system. Since (14a) defines the gain in information, or reduction in uncertainty, due to an increased number of constraints, an average rate of gain in information I_r can be defined as

I_r = (I_m[f] - I_n[g]) / (n - m)    (14b)

DERIVATION OF FREQUENCY DISTRIBUTIONS

The general procedure for deriving a frequency distribution involves the following steps: (1) Define the available information. (2) Define the given information in terms of constraints. (3) Maximize the entropy subject to the given constraints. (4) Modify, if necessary, the resulting probability distribution by using Bayes' theorem when additional information becomes available.

More specifically, let the available information be given by (5). The POME specifies f(x) by (6). Then inserting (6) in (3) yields

I[f] = a_0 + Σ_{i=1}^m a_i C_i    (15)

In addition, the potential function or the zeroth Lagrange multiplier a_0 is obtained by inserting (6) in (1),

∫_a^b exp[-a_0 - Σ_{i=1}^m a_i y_i(x)] dx = 1

resulting in

a_0 = ln ∫_a^b exp[-Σ_{i=1}^m a_i y_i(x)] dx    (16)

The Lagrange multipliers are related to the given information (or constraints) by

∂a_0/∂a_i = -C_i    (17)

It can also be shown that

∂²a_0/∂a_i² = var[y_i(x)],    ∂²a_0/(∂a_i ∂a_j) = cov[y_i(x), y_j(x)],  i ≠ j    (18)

With the Lagrange multipliers estimated from (17)-(18), the frequency distribution given by (6) is uniquely defined. It is implicit that the distribution parameters are uniquely related to the Lagrange multipliers. Clearly, this procedure states that a frequency distribution is uniquely defined by specification of constraints and application of POME.
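As a concrete illustration of solving (17) for the multipliers, consider a single constraint C_1 = E[x] on (0, ∞), whose maximum-entropy solution is the exponential density with a_1 = 1/C_1. The sketch below recovers a_1 numerically by bisection on the constraint equation; the quadrature routine, truncation limit and function names are our own illustrative choices, not part of the paper's procedure.

```python
import math

def trapz(f, a, b, n=4000):
    """Plain trapezoidal quadrature on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def constraint_mean(a1, upper=60.0):
    """E[x] under the maximum-entropy density f(x) proportional to exp(-a1*x)
    on (0, infinity), truncated at `upper` for the numerical integration.
    By (17), this equals -da0/da1."""
    z = trapz(lambda x: math.exp(-a1 * x), 0.0, upper)       # exp(a0)
    m = trapz(lambda x: x * math.exp(-a1 * x), 0.0, upper)
    return m / z

def solve_a1(c1, lo=0.01, hi=100.0):
    """Bisection: E[x] decreases monotonically in a1, so find a1 with
    constraint_mean(a1) = c1."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if constraint_mean(mid) > c1:
            lo = mid          # mean too large -> a1 must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

a1 = solve_a1(2.0)            # analytic answer for C1 = 2 is a1 = 1/C1 = 0.5
```

For richer constraint sets, the same idea generalizes to Newton iterations on the full system ∂a_0/∂a_i = -C_i, with the second derivatives in (18) supplying the Jacobian.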
Quite often we anticipate a certain structure of the pdf, say in the form [this is normalized according to (1)]

f(x) = A x^k exp[-Σ_{i=1}^m a_i y_i(x)]    (19)

where the y_i(x) are known functions and k may not be known explicitly, but the form x^k is a guess. Then we may apply POME as follows. We explicitly construct the expression for I[f] in the form

I[f] = -ln A - k E[ln x] + Σ_{i=1}^m a_i E[y_i(x)]    (20)

where E denotes expectation. We may seek to maximize I subject to the constraints E[ln x], E[y_i(x)], which can be evaluated numerically by means of experimental data. In this fashion we arrive at an estimate of the pdf which is least biased with respect to the specified constraints and of the surmised form based upon our intuition. This provides a method of deducing the constraints, given a 'form' for the pdf.

This procedure can be applied to derive any probability distribution for which appropriate constraints can be found. The hydrologic import of the constraints for every distribution, except a few, is not clear at this point. The procedure needs modification, however, if the distribution is expressed in inverse form, as is, for example, the Wakeby distribution. Table 1 summarizes a number of distributions used in hydrology which were derived following the above procedure. In addition, we illustrate the procedure by deriving the gamma distribution using POME. The pdf of a gamma-distributed random variable x can be written as

f(x) = (1/(a Γ(b))) (x/a)^(b-1) exp(-x/a)    (21)

where a > 0 and b > 0 are parameters. Taking the log to base e of (21),

ln f(x) = -ln(a Γ(b)) - (b - 1) ln a + (b - 1) ln x - x/a    (22)

Multiplying (22) by -f(x) and integrating between 0 and ∞,

-∫_0^∞ f(x) ln f(x) dx = [ln(a Γ(b)) + (b - 1) ln a] ∫_0^∞ f(x) dx - (b - 1) ∫_0^∞ ln x f(x) dx + (1/a) ∫_0^∞ x f(x) dx    (23)

On comparing (23) with (15), the constraints can be expressed as

∫_0^∞ f(x) dx = 1    (24)

∫_0^∞ x f(x) dx = E[x] = x̄    (25)

∫_0^∞ ln x f(x) dx = E[ln x]    (26)

Note that if the gamma distribution were to represent the instantaneous unit hydrograph of a watershed, then (25) would define the average lag time and (26) the average of the log of lag times. The least-biased density function consistent with (24)-(26) can be obtained by invoking POME; that is, using (6),

f(x) = exp[-a_0 - a_1 x - a_2 ln x]    (27)

Inserting (27) in (24) yields

exp(a_0) = ∫_0^∞ exp[-a_1 x - a_2 ln x] dx = a_1^(a_2 - 1) Γ(1 - a_2)    (28)

Thus, the zeroth Lagrange multiplier is given as

a_0 = (a_2 - 1) ln a_1 + ln Γ(1 - a_2)    (29)

Substituting in (27),

f(x) = exp[(1 - a_2) ln a_1 - ln Γ(1 - a_2) - a_1 x - a_2 ln x] = (a_1^(1 - a_2) / Γ(1 - a_2)) x^(-a_2) exp(-a_1 x)    (30)

(30) reduces to (21), rewritten as

f(x) = (1/(a Γ(b))) (x/a)^(b-1) exp(-x/a)    (31)

if

a_2 = 1 - b    (33a)

a_1 = 1/a    (33b)

However, the Lagrange multipliers a_1 and a_2 remain yet to be determined in terms of (25)-(26). Rewriting (28) as a_0 = ln ∫_0^∞ exp[-a_1 x - a_2 ln x] dx and then differentiating with respect to a_1 and a_2 respectively,

∂a_0/∂a_1 = -∫_0^∞ x exp[-a_0 - a_1 x - a_2 ln x] dx = -∫_0^∞ x f(x) dx = -E[x] = -x̄    (34)

∂a_0/∂a_2 = -∫_0^∞ ln x exp[-a_0 - a_1 x - a_2 ln x] dx = -∫_0^∞ ln x f(x) dx = -E[ln x]    (35)


Table 1. Frequency distributions used in hydrology as derived by POME: for each distribution, the defining constraints, the Lagrange multipliers and the resulting parameter-estimation equations. [The mathematical content of this multi-page table could not be recovered from the scanned source.]


Also from (29),
~ao

( a z - 1)

Oax

a1

(36)

?
~ a - l n al + ~ F ( 1 - a 2 )
~a 2

Let k = 1 - a 2 ,

,--

:--,

#k/~a2 = - 1. Therefore,

Oao

~, + ~

~'z7

ca 2

~k

8a---;=lnal +ff~F(k) ?az=lnal-~k(k)

(37)

in which ~(.) is the digamma function defined as


d[ln F(k)]/dk.
Equating (34) to (36),

II

a 2 -- 1 =.~
a 1
+
or
+

k
1

(38)

In al = qJ(k) - E[ln x]

(39)

--

X,

k =

a 2 -

al

Equating (35) to (37),

W
t,.,.,a

Inserting (38) in (39) to eliminate al,


O ( k ) - l n k = E [ l n x] - l n J

(40)

(38) and (40) determine a I and a2 in terms of J and E[ln x].


This completes the derivation of the gamma distribution
using POME.
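Equations (38) and (40) give a practical recipe: estimate x̄ and E[ln x] from the sample, solve (40) for the shape k = b, and recover the scale from a = x̄/b via (38) and (33a)-(33b). A minimal sketch follows; the digamma approximation by a central difference on math.lgamma and the log-space bisection are our own implementation choices, not part of the paper.

```python
import math
import random

def digamma(k, h=1e-5):
    """psi(k) = d ln Gamma(k)/dk, by a central difference on math.lgamma."""
    return (math.lgamma(k + h) - math.lgamma(k - h)) / (2.0 * h)

def gamma_pome(sample):
    """POME estimates (a, b) of the gamma distribution (21):
    solve psi(k) - ln k = E[ln x] - ln xbar for k = b (equation (40)),
    then a = xbar / b (equations (38), (33a)-(33b))."""
    n = len(sample)
    xbar = sum(sample) / n
    rhs = sum(math.log(x) for x in sample) / n - math.log(xbar)  # <= 0 by Jensen
    lo, hi = 1e-3, 1e6          # psi(k) - ln k increases monotonically toward 0
    for _ in range(200):
        k = math.sqrt(lo * hi)  # bisect in log space: k may span many decades
        if digamma(k) - math.log(k) < rhs:
            lo = k
        else:
            hi = k
    b = math.sqrt(lo * hi)
    return xbar / b, b          # scale a, shape b

# Synthetic check on a sample drawn from a gamma law with shape 3, scale 2
random.seed(0)
data = [random.gammavariate(3.0, 2.0) for _ in range(20000)]
a_hat, b_hat = gamma_pome(data)
```

With a large sample the recovered shape and scale land close to the generating values, as the derivation predicts.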

PARAMETER ESTIMATION

The discussion on derivation of frequency distributions indicates that the Lagrange multipliers are related to the constraints on the one hand and to the distribution parameters on the other. These two sets of relations are used to eliminate the Lagrange multipliers and develop, in turn, equations for estimating parameters in terms of constraints. For example, consider the gamma distribution. The Lagrange multipliers a_1 and a_2 are related to the constraints E[x] and E[ln x] by (38) and (39), and independently to the distribution parameters a and b by (33a) and (33b). Finally, (38) and (40) relate the parameters to the specified constraints. Thus, POME leads to a unique method of parameter estimation. Table 1 provides equations, derived by employing POME, for estimating parameters from constraints for a number of distributions frequently used in hydrology.

It may be useful to briefly compare this POME method of parameter estimation with the method of moments (MOM) and the method of maximum likelihood estimation (MLE), two of the most frequently used methods. To contrast the POME method with the MLE method, we consider the case of a general pdf f(x; θ), where θ represents a family of parameters a_i, i = 1, 2, ..., M. In the MLE method we construct the likelihood function L for a sample of size N as



L = Π_{i=1}^N f(x_i; θ)    (41)

and maximize either L or ln L. Taking the log of (41),

ln L = Σ_{i=1}^N ln f(x_i; θ)    (42)

By differentiating ln L with respect to each of the parameters a_i separately and equating to zero, we generate as many equations as the number of parameters. We solve these equations to obtain the parameter estimates. If, however, we multiply (42) by -(1/N), then

-(1/N) ln L = -(1/N) Σ_{i=1}^N ln f(x_i; θ)    (43)

Recall that

I[f] = -Σ_{i=1}^N f(x_i; θ) ln f(x_i; θ);    Σ_{i=1}^N f(x_i; θ) = 1    (44)

On comparing (43) with (44) it is seen that

I[f] = -(1/N) ln L    (45)

provided ln f(x_i; θ) is uniformly weighted over the entire sample. The POME method involves population expectations, whereas the MLE method involves sample averages. If the population is replaced by a sample, then the two methods would yield the same parameter estimates. To fully appreciate the significance of (45), we consider the case of an exponential distribution,

f(x) = α exp(-α x)    (46)

Then

I[f] = -∫_0^∞ α exp(-α x) ln[α exp(-α x)] dx = -ln α + α E[x]    (47)

By maximizing I[f] with respect to α,

α = 1/E[x]    (48)

On the other hand,

ln L = N ln α - α Σ_{i=1}^N x_i

By maximizing ln L, we obtain

α = N / Σ_{i=1}^N x_i    (49)

The difference between the two estimates of α given by (48) and (49) is that the POME method uses the expectation of x, or the population mean, whereas the MLE method uses the average of x, or the sample mean.

This result can be extended to a very general case of f(x) as in (19), rewritten as

f(x) = A x^k exp[-Σ_{i=1}^m a_i y_i(x)]    (19)

The SEF of this function is

I[f] = -ln A - k E[ln x] + Σ_{i=1}^m a_i E[y_i(x)]    (50)

On the other hand,

ln L = Σ_{j=1}^N ln{A x_j^k exp[-Σ_{i=1}^m a_i y_i(x_j)]} = N ln A + k Σ_{j=1}^N ln x_j - Σ_{i=1}^m a_i Σ_{j=1}^N y_i(x_j)

Multiplying by -(1/N) throughout,

-(1/N) ln L = -ln A - k (1/N) Σ_{j=1}^N ln x_j + Σ_{i=1}^m a_i (1/N) Σ_{j=1}^N y_i(x_j)    (51)

(50) is the same as (51) if the E[·] terms are replaced by the corresponding sample averages.

To compare the POME method with MOM is not straightforward and requires further research. The MOM is not variational in character, whereas the POME method is. If the constraints in (5) are moments, then the parameter estimates by the two methods would be the same. This is, for example, true in the case of the exponential and normal distributions, as seen from Table 1. If the constraints are other than ordinary moments, which is true of most distributions, then it is not known whether the two methods would yield identical or different parameter estimates, and what conditions, if any, would lead to differences in the parameter estimates.

In practice we usually employ a discrete set of data points, x_i, i = 1, 2, ..., N, to determine the constraints (or moments), whose representativeness and accuracy depend upon the sample size. To emphasize the dependence of I on N, we write (4) as

I_N[f] = -Σ_{i=1}^N f(x_i; a) ln f(x_i; a),  with  Σ_{i=1}^N f(x_i; a) = 1

Using the inequality

f(x) - f²(x) ≤ -f(x) ln f(x) ≤ 1 - f(x)

we obtain

1 - Σ_{i=1}^N f²(x_i; a) ≤ I_N[f] ≤ N - 1

If, however, f_i = 1/N (uniform distribution), then

0 ≤ I_N[f] ≤ ln N
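The coincidence of the exponential estimates (48) and (49), once the population mean is replaced by the sample mean, and the ln N bound for the uniform case can both be verified directly. The simulated data below are our own illustrative choice.

```python
import math
import random

random.seed(42)
sample = [random.expovariate(0.25) for _ in range(10000)]
n = len(sample)

alpha_mle = n / sum(sample)            # maximizes ln L = N ln(alpha) - alpha*sum(x_i), eq. (49)
alpha_pome = 1.0 / (sum(sample) / n)   # alpha = 1/E[x], eq. (48), with E[x] -> sample mean
assert abs(alpha_mle - alpha_pome) < 1e-12   # identical up to floating point

# The upper bound for the uniform case f_i = 1/N: I_N[f] = ln N exactly.
i_uniform = -sum((1.0 / n) * math.log(1.0 / n) for _ in range(n))
assert abs(i_uniform - math.log(n)) < 1e-9
```

With constraints richer than the ordinary moments, as the text notes, such an exact agreement between POME and MLE is no longer guaranteed.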


Table 2. Some pertinent statistical characteristics of annual maximum discharge series for six selected river gaging stations

River Gaging Station                         Record    Area     Mean      St. dev.   Skew         Kurtosis
                                             (years)   (km²)    x̄         S_x        coeff. C_s   coeff. C_k
Comite River near Olive Branch, Louisiana      38      1 405     238.2     174.5      0.70         2.52
Comite River near Comite, Louisiana            38      1 896     315.7     166.8      0.54         2.77
Amite River near Amite, Louisiana              34      4 092     745.1     539.5      0.71         3.03
St. John River at Nine Mile Bridge, Maine      32      1 890     699.0     223.7      0.41         3.01
St. John River at Dickey, Maine                36      5 089    1 449.7    517.7      0.35         2.55
Allagash River near Allagash, Maine            51      1 659     438.8     159.8      0.71         3.30

Table 3. Parameters of the gamma distribution fitted to annual maximum discharge series by MOM, MLE and POME methods

                                             MOM               MLE               POME
River Gaging Station                         a        b        a        b        a        b
Comite River near Olive Branch, Louisiana    127.85   1.86     131.82   1.81     131.82   1.81
Comite River near Comite, Louisiana           88.07   3.59      95.15   3.32      95.15   3.32
Amite River near Amite, Louisiana            390.62   1.91     445.72   1.67     445.72   1.67
St. John River at Nine Mile Bridge, Maine     71.61   9.76      70.98   9.85      70.98   9.85
St. John River at Dickey, Maine              184.91   7.84     187.62   7.73     187.62   7.73
Allagash River near Allagash, Maine           58.18   7.54      55.97   7.84      55.97   7.84

Table 4. Parameters of the log-Pearson type (LPT) III distribution fitted to annual maximum discharge series by MOM, MLE and POME methods

                                             MOM                    MLE                    POME
River Gaging Station                         a       b      c       a       b      c       a       b      c
Comite River near Olive Branch, Louisiana    0.063   171.0  -5.6    0.062   173.5  -5.6    0.063   171.0  -5.6
Comite River near Comite, Louisiana          0.223     7.4   3.9    0.114    32.0   1.9    0.093    42.3   1.6
Amite River near Amite, Louisiana            0.288     9.8   3.5    0.197    23.5   1.7    0.156    33.6   1.1
St. John River at Nine Mile Bridge, Maine    0.062    29.1   4.7    0.067    26.8   4.7    0.062    29.1   4.7
St. John River at Dickey, Maine              0.101    14.1   5.8    0.071    31.1   5.0    0.062    37.6   4.9
Allagash River near Allagash, Maine          0.051    52.6   3.3    0.053    50.3   3.3    0.051    52.6   3.3

Table 5. Parameters of the extreme value type I (EV1) distribution fitted to annual maximum discharge series by MOM, MLE and POME methods

                                             MOM                 MLE                 POME
River Gaging Station                         a         b         a         b         a         b
Comite River near Olive Branch, Louisiana    0.0074     160.0    0.0078     159.0    0.0075     161.0
Comite River near Comite, Louisiana          0.0077     241.0    0.0074     238.0    0.0074     238.0
Amite River near Amite, Louisiana            0.0024     502.0    0.0024     498.0    0.0024     502.0
St. John River at Nine Mile Bridge, Maine    0.0057     598.0    0.0053     593.0    0.0054     591.0
St. John River at Dickey, Maine              0.0025   1 220.0    0.0023   1 200.0    0.0023   1 200.0
Allagash River near Allagash, Maine          0.0080     367.0    0.0078     365.0    0.0078     365.0



ENTROPY AS A CRITERION FOR GOODNESS OF FIT

It is plausible to employ entropy to evaluate goodness of fit and, consequently, to delineate the best parameter estimates of a fitted distribution. This can be accomplished as follows. For a given sample, compute the entropy and call it the observed entropy. Then fit an appropriate distribution by the different methods (MOM, MLE, POME, etc.) and calculate the entropy for each of these methods; call it the computed entropy. The method providing the computed entropy closest to the observed entropy is deemed the best method.

Fig. 1. Frequency curve using annual maximum discharge series for the Comite River near Olive Branch, Louisiana (discharge plotted against reduced variate and probability of nonexceedance; observations, the MLE/POME curve and the MOM curve)

Fig. 2. Frequency curve using annual maximum discharge series for the Comite River near Comite, Louisiana (same axes and legend as Fig. 1)

APPLICATION TO ANNUAL MAXIMUM DISCHARGE SERIES

Data on annual maximum discharge series for six selected river gaging stations were used for fitting the two-parameter gamma distribution, the log-Pearson type III distribution and the extreme value type I distribution. It should be emphasized that these distributions are used only to illustrate the parameter estimation capability of POME; any distribution could be chosen. Some pertinent characteristics of the discharge series are given in Table 2. The gaging stations were selected on the basis of homogeneity, completeness, independence and length of record; each station had more than 30 years of record. The objective was to evaluate and compare the fitting of each of the three distributions to the discharge series by the method of moments, maximum likelihood estimation and the principle of maximum entropy.
The parameters a and b of the gamma distribution obtained by the three methods for each discharge series are given in Table 3. Likewise, the parameters a, b and c of the log-Pearson type III distribution are given in Table 4, and the parameters a and b of the extreme value type I distribution in Table 5. The parameter estimates by the MLE and POME methods are identical for the gamma distribution, and neither differs greatly from those by MOM. This is further illustrated by Figs 1 and 2, which compare frequency curves generated by these methods for two sample gaging stations. The SEF was computed with the parameters estimated by each method for each discharge series, as given in Table 6. Also shown in the table are the differences between computed and observed SEF values. For four data sets the SEF difference is smaller for POME than for MOM, and for two data sets it is larger. This implies that, relative to MOM, POME is the better parameter estimation method for four data sets and the worse for two; on balance, POME was found to be superior to MOM for the data used in this study.
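To make the method-of-moments side of this comparison concrete, here is a minimal Python sketch for the two-parameter gamma distribution. It is illustrative only: the discharge values are made up, and the paper's a and b are assumed here to be the scale and shape, so that the mean equals a*b.

```python
def gamma_mom(x):
    """Method-of-moments estimates for the two-parameter gamma
    distribution. Returns (a, b), assumed to be (scale, shape),
    so that mean = a * b and variance = a**2 * b."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    b = mean ** 2 / var   # shape
    a = var / mean        # scale
    return a, b

# Hypothetical annual maximum discharges (m3/s) -- not from the paper:
flows = [212.0, 340.0, 180.0, 295.0, 410.0, 260.0, 330.0, 225.0, 385.0, 270.0]
a, b = gamma_mom(flows)
print(round(a, 2), round(b, 2))
```

The estimates simply invert the first two moment equations of the gamma family; MLE and POME require solving an implicit equation in the shape parameter instead.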
The three methods also yield comparable parameter estimates for the extreme value type I distribution; the differences in parameter estimates, shown in Table 5, are only marginal. The same holds for the log-Pearson type III distribution, as shown in Table 4. The SEF was computed for these two distributions with the parameters estimated by each method for each discharge series, as given in Tables 7 and 8. Also shown in these tables are the differences between computed and observed SEF values. In the case of the log-Pearson type III distribution, the SEF difference for POME is less than or equal to that for MOM, and also to that for MLE, except for one data set. The same is true in the case of the extreme value type I distribution. This suggests that POME is a better parameter estimation method than either MOM or MLE. Thus, it can be concluded that POME offers a promising alternative for parameter estimation. Additional research, however, is required to better understand its strengths and weaknesses.
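The selection rule of the entropy-as-goodness-of-fit criterion reduces to picking the method whose computed entropy lies closest to the observed one. A minimal Python sketch, using the SEF values from Table 6 for the gamma fit at the Comite River near Olive Branch station:

```python
def best_method(observed, computed):
    """Return the method whose computed entropy (SEF) lies closest to
    the observed (sample) entropy; ties go to the first method listed."""
    return min(computed, key=lambda m: abs(computed[m] - observed))

# SEF values for the gamma distribution at the Comite River near
# Olive Branch station (Table 6):
observed = 3.592
computed = {"MOM": 3.162, "POME": 3.166, "MLE": 3.166}
print(best_method(observed, computed))  # -> POME (difference 0.426, tied with MLE)
```

Since the gamma parameter estimates by MLE and POME are identical for these data, their SEF differences tie, and both beat MOM's difference of 0.430.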


Table 6. Values of SEF for the gamma distribution

                                             SEF of                                  SEF Difference
                                             sample   MOM     POME    MLE       MOM       POME      MLE
River Gaging Station (1)                     (2)      (3a)    (3b)    (3c)      (2)-(3a)  (2)-(3b)  (2)-(3c)
Comite River near Olive Branch, Louisiana    3.592    3.162   3.166   3.166     0.430     0.426     0.426
Comite River near Comite, Louisiana          3.664    3.343   3.349   3.349     0.321     0.315     0.315
Amite River near Amite, Louisiana            3.397    2.946   2.977   2.977     0.451     0.420     0.420
St. John River at Nine Mile Bridge, Maine    3.412    3.144   3.143   3.143     0.268     0.269     0.269
St. John River at Dickey, Maine              3.611    3.325   3.326   3.326     0.286     0.285     0.285
Allagash River near Allagash, Maine          3.781    3.544   3.540   3.540     0.237     0.241     0.241

Table 7. Values of SEF for the log-Pearson type III distribution

                                             SEF of                                  SEF Difference
                                             sample   MOM     POME    MLE       MOM       POME      MLE
River Gaging Station (1)                     (2)      (3a)    (3b)    (3c)      (2)-(3a)  (2)-(3b)  (2)-(3c)
Comite River near Olive Branch, Louisiana    3.592    3.155   3.153   3.155     0.437     0.439     0.437
Comite River near Comite, Louisiana          3.664    3.219   3.283   3.283     0.445     0.381     0.381
Amite River near Amite, Louisiana            3.397    2.883   2.935   2.926     0.514     0.462     0.471
St. John River at Nine Mile Bridge, Maine    3.412    3.123   3.123   3.123     0.289     0.289     0.289
St. John River at Dickey, Maine              3.611    3.320   3.320   3.323     0.291     0.291     0.288
Allagash River near Allagash, Maine          3.781    3.541   3.540   3.541     0.240     0.241     0.240

Table 8. Values of SEF for the extreme value type I distribution

                                             SEF of                                  SEF Difference
                                             sample   MOM     POME    MLE       MOM       POME      MLE
River Gaging Station (1)                     (2)      (3a)    (3b)    (3c)      (2)-(3a)  (2)-(3b)  (2)-(3c)
Comite River near Olive Branch, Louisiana    3.592    3.105   3.095   3.103     0.487     0.497     0.489
Comite River near Comite, Louisiana          3.664    3.340   3.348   3.348     0.324     0.316     0.316
Amite River near Amite, Louisiana            3.397    2.919   2.920   2.919     0.478     0.477     0.478
St. John River at Nine Mile Bridge, Maine    3.412    3.122   3.139   3.135     0.290     0.273     0.277
St. John River at Dickey, Maine              3.611    3.313   3.329   3.329     0.298     0.282     0.282
Allagash River near Allagash, Maine          3.781    3.535   3.542   3.542     0.246     0.239     0.239

CONCLUSIONS

The following conclusions are drawn from this study: (1) POME can be used to derive any distribution provided appropriate constraints are specified. (2) It provides a unified framework for deriving a number of frequency distributions used in hydrology. (3) It offers an alternative method for parameter estimation. (4) There exists a unique relationship between the MLE and POME methods. (5) For data on annual maximum discharge series of six selected rivers, the gamma distribution parameters yielded by POME were identical to those by MLE and comparable with those of MOM.

ACKNOWLEDGEMENTS

This study was supported in part by funds provided by the Geological Survey, US Department of the Interior, through the Louisiana Water Resources Research Institute under the project, Assessment of Uncertainty in Hydrologic Models for Flood Frequency Analysis. The authors are grateful to Mr Kishore Arora, who assisted in computational aspects.



REFERENCES

Amorocho, J. and Espildora, B. Entropy in the assessment of uncertainty in hydrologic systems and models, Water Resources Research 1973, 9(6), 1511-1522
Benson, M. A. Uniform flood-frequency estimating methods for federal agencies, Water Resources Research 1968, 4(5), 891-908
Bobee, B. and Robitaille, R. The use of the Pearson type 3 and log Pearson type 3 distributions revisited, Water Resources Research 1977, 13(2), 427-443
Bucket, J. and Oliver, F. R. Fitting the Pearson type 3 distribution in practice, Water Resources Research 1977, 13(5), 851-852
Burges, S. J., Lettenmaier, D. P. and Bates, C. L. Properties of the three-parameter log normal probability distribution, Water Resources Research 1975, 11(2), 229-235
Chow, V. T. The log-probability law and its engineering applications, Proceedings of the American Society of Civil Engineers 1954, 80(536), 1-25
Crawford, N. H. and Linsley, R. K. Digital simulation in hydrology: Stanford watershed model IV, Technical Report No. 39, Department of Civil Engineering, Stanford University, Palo Alto, California, 1966
Davy, B. W. and Davies, T. R. H. Entropy concepts in fluvial geomorphology: a reevaluation, Water Resources Research 1979, 15(1), 103-106
Dooge, J. C. I. Linear theory of hydrologic systems, Technical Bulletin No. 1468, 327 pp., Agricultural Research Service, US Department of Agriculture, Washington, DC, 1973
Eagleson, P. S. Dynamics of flood frequency, Water Resources Research 1972, 8(4), 878-898
Gumbel, E. J. Statistics of Extremes, 375 pp., Columbia University Press, New York, 1958
Jaynes, E. T. Information theory and statistical mechanics, I, Physical Review 1957a, 106, 620-630
Jaynes, E. T. Information theory and statistical mechanics, II, Physical Review 1957b, 108, 171-190
Jaynes, E. T. Probability Theory in Science and Engineering, McGraw-Hill Book Co., New York, 1961
Jaynes, E. T. Concentration of distributions at entropy maxima, paper presented at the 19th NBER-NSF Seminar on Bayesian Statistics, Montreal, Canada, 1979
Jaynes, E. T. On the rationale of maximum-entropy methods, Proceedings of the IEEE 1982, 70(9), 939-952
Jowitt, P. W. The extreme-value type-1 distribution and the principle of maximum entropy, Journal of Hydrology 1979, 42, 23-38
Kalinske, A. A. On the logarithmic-probability law, Transactions, American Geophysical Union 1946, 27, 709-711
Kite, G. W. Frequency and Risk Analysis in Hydrology, Water Resources Publications, Littleton, Colorado, 1977
Kottegoda, N. T. Stochastic Water Resources Technology, John Wiley and Sons, New York, 1980, 208-263
Leopold, L. B. and Langbein, W. B. The concept of entropy in landscape evolution, US Geological Survey Professional Paper 500-A, 20 pp., 1962
Lettenmaier, D. P. and Burges, S. J. Gumbel's extreme value I distribution: a new look, Journal of the Hydraulics Division, Proceedings of the American Society of Civil Engineers 1982, 108(HY4), 502-514
Levine, R. D. and Tribus, M. The Maximum Entropy Formalism, The MIT Press, Cambridge, Massachusetts, 1978
Matalas, N. C. and Wallis, J. R. Eureka! It fits a Pearson type 3 distribution, Water Resources Research 1973, 9(2), 281-289
Nash, J. E. The form of the instantaneous unit hydrograph, International Association of Scientific Hydrology Publication 1957, (42), 114-118
Phien, H. N. and Jivajirajah, T. The transformed gamma distribution for annual streamflow frequency analysis, Proceedings, Fourth Congress of IAHR-APD on Water Resources Management and Development, Chiang Mai, Thailand, 1984, II, 1151-1166
Rao, A. R. and Chenchagya, B. T. Probabilistic analysis and simulation of the short-term increment rainfall process, Technical Report No. 55, Indiana Water Resources Research Institute, Purdue University, West Lafayette, Indiana, 1974
Rao, A. R. and Chenchagya, B. T. Comparative analysis of short time increment urban precipitation characteristics, Proceedings, National Symposium on Precipitation Analysis for Hydrologic Modeling, 1975, 90-98
Rao, D. V. Log Pearson type 3 distribution: method of mixed moments, Journal of the Hydraulics Division, Proceedings of the American Society of Civil Engineers 1980, 106(HY6), 999-1019
Rao, D. V. Estimating log Pearson parameters by mixed moments, Journal of the Hydraulics Division, Proceedings of the American Society of Civil Engineers 1983, 109(HY9), 1118-1131
Richardson, C. W. A comparison of three distributions for the generation of daily rainfall amounts, in Statistical Analysis of Rainfall and Runoff (Ed. V. P. Singh), Water Resources Publications, Littleton, Colorado, 1982
Rohatgi, V. K. An Introduction to Probability Theory and Mathematical Statistics, John Wiley and Sons, New York, 1976
Rosenkrantz, R. D., ed. E. T. Jaynes: Papers on Probability, Statistics and Statistical Physics, D. Reidel Publishing Company, Boston, Massachusetts, 434 pp., 1983
Sangal, B. P. and Biswas, A. K. The 3-parameter lognormal distribution and its application in hydrology, Water Resources Research 1970, 6, 505-515
Shannon, C. E. A mathematical theory of communication, I and II, Bell System Technical Journal 1948a, 27, 379-423
Shannon, C. E. A mathematical theory of communication, III and IV, Bell System Technical Journal 1948b, 27, 623-656
Shannon, C. E. and Weaver, W. The Mathematical Theory of Communication, University of Illinois Press, Urbana, Illinois, 117 pp., 1949
Singh, V. P., ed. Rainfall-Runoff Relationship, Water Resources Publications, Littleton, Colorado, 581 pp., 1982a
Singh, V. P., ed. Statistical Analysis of Rainfall and Runoff, Water Resources Publications, Littleton, Colorado, 700 pp., 1982b
Sonuga, J. O. Principle of maximum entropy in hydrologic frequency analysis, Journal of Hydrology 1972, 17, 177-191
Sonuga, J. O. Entropy principle applied to the rainfall-runoff process, Journal of Hydrology 1976, 30, 81-94
Todorovic, P. Stochastic modeling of floods, in Statistical Analysis of Rainfall and Runoff (Ed. V. P. Singh), Water Resources Publications, Littleton, Colorado, 1982, 597-636
Tribus, M. Rational Descriptions, Decisions and Designs, Pergamon Press, New York, 1969
Verdugo Lazo, A. C. Z. and Rathie, P. N. On the entropy of continuous probability distributions, IEEE Transactions on Information Theory 1978, IT-24(1), 120-122
Yevjevich, V. and Obeysekera, J. T. B. Estimation of skewness of hydrologic variables, Water Resources Research 1984, 20(7), 935-943
