You are on page 1of 212

ADVANCED

THEORY

OF

STATISTICS

1. Distribution Theory
2. Estimation

3. Tests of Hypotheses

"

Lectures delivered by D. G. Chapman


(Stat 611-612)

1958-1959
North Carolina state College
Notes recorded by R. C.. Taeuber

Institute of Statistics
Mimeo Series No. 241
September, 1959

'J
"

- ii TABLE OF CONTENTS

I. PRELIl1INARIES
Concepts of set theory
Probability measure
Distribution function (Th. 1)
Density funct ion (def. 8)

Random variable (def. 9)


Conditional distribution
Functions of random variables
Transformations
Rieman - Stieljes Integrals
0

o


0

"

PROPERTIES OF UNIVARIATE DISTRIBUTIONS -

III.

1
3
5
8
10
10
12
12
15

CHARACTERISTIC FUNCTIONS


0
0
0
0
0

0
0
0
0

Standard distributions

Expectation and moments
Tchebycheff 's Inequality (Th. e + corollary)
Characteristic functions
Cumulant generating functions (def. 17)
Inversion theorem for characteristic functions
Central Limit Theorem (Tho 14)

Laplace Transform "


Fourier Transform "

Derived distributions (from Normal) "
0

20

" 21
.. 24
e 25
28
0

I)

(Tho 12)

C)

30

33

36

36
36

CONVERGENCE
Convergence of distributions 0 0 0 0 0 e 0 39
Convergence in probability (def. 18) " 41
Weak law of large numbers (Th. 16)
41
A Convergence Theorem (Th~ 18) 0 0 0 0 0 44
Convergence almost everywhere (def. 20)
e
49
Strong law of large numbers (Th. 20)
51
Other types of convergence 0 0 0 0 " 0 0 53
0

IV.

"

"

POINT ESTIMATION
Preliminaries

Ways to formulate the problem of obtaining best estimates
Restricting the class of estimates
optimum properties in the large
Dealing only with large sampla (asymptotic) properties
Methods of Estimation
I'1ethod of moments e e
"
Method of least squares 0

Gauss-Markov Theorem (Th. 21) ,

"
Maximum Likelihood Estimates


Single parameter estimation (Th. 22)
Multiparameter estimation (Tho 23) 0 0 0
0

III

I)

ill

55
56
58
60


It

61
61
,.

61
64
65
70

-t

iii Unbiased Estimation


Information Theorem (Cramer-Rao) (Th. 24) 0
Multiparameter extension of Cramer-Rao (Tho 25)
Summary of usages and properties of m.l.e. 0 0
Sufficient statistics (def. 31)
Improving unbiased estimates (Th. 26) 0 0 0
Complete sufficient statistics (Tho 27)
2 Non-parametric estimation (m,vou.e.)
X -estimation 0 0 0 e .. 0 0
Maximum likelihood estimation 0 0 0 0 0
"
Min:unum
X2
2
Modified minimum X
Transformed nun:unum X2
Summary 0 0 0 0
0

..
..

Examples ".. .
Minimax estimation
c
Wolfowitzts minimum distance estimation
0
0

DO'

V.

~EST1NG

..

..

"

fi

..

..

..

76

80

85

86
88

92

95

98
102
102
103
103
104
10,
108
110

OF HYPOTHESES -- DISTRIBUTION FREE TESTS

Basic concepts
113
Distribution free tests
U5
One sample tests
o. 117
Probability transformation
~ 121
Tests for one sample problems
121
Convergence of sample distribution function (Tho 31)
124
Approaches to combining probabilities o. 129
Two sample tests 0 0 41 0 0 0 133
Pitman I s theorem on asymptotic relative efficiency (Tho 32) 140
k-sample tests 0 0 .. 0 0 0 0 ,. 0 ,. 146
0

....

.'

....

..

VI.

{PARAMETRIC TESTS -- POWER


Optimum tests 0 0 0 0 0 0
Probability ratio test (Neyman-Pearson) (Th. 33) . . . . . .
U,M.P. U. test (Th. 35) 0 " 0 0 0
Invariant tests
Distribution of non-central t (~)
e
Tests for variances
0
0
.
Summary of normal tests .. " ..
Maximum likelihood ratio tests e 0 0 e 0
General Linear Hypothesis 0.... 41 0 ..
Power of analysis of variance test
e
Multiple Comparisons 0 .. 0 0 0 0 0
Tukey's procedure " e .. ()
2 Scheffels test for all linear contrasts (Tho 38)
X -tests 0 0 .. 0 0 .. 0
Power of a simple X2-test e 0 0 0 " 0 ~
One sided x2.test

0
o Q o "
0

41

....

'"

..

..

..

41

148

149

156

161

163
168
169
173

175
182

186
186

187

191
191
192

- iv x 2.test of oomposite hypotheses 0 0


Asymptotic power of the x2.test (Th. 39)
Other approaches to testing hypotheses 0 0 o 0

VIII.'

I{(SCELLANEOUS~

193
0

194
197

CONFlDENCE REGIONS

Correspondence of confidence regions and tests (Tho 40) 199


Confidence interval for the ratio of two means .. .. 0 200
List of Theorems
List of Definitions
List of Problems

202

.,

204
206

CHAPl'ER I ..... kELIMINARIE8

I:

Preliminaries:

Set~

Def",

A collection of points in

1/ 8J.

lie

(Euclidian k-dimensional space) -. S

+ 8 2 is the set of points in either or both sets.

Sl + 8 2 is the set of points in both

(Sr e
in S:t

it' 81 is contained in 8

points in 5 2 bllt not

Exercise 1/ Show that 8

1 '" 8 2

~S2

c.l

8:l.

and 52'

8 ) then 8 2 .. ~ is the set of


2

82 + ~

I:

8 251

There is also the obvious extension of these definitions of set addition and multiplication to 3 or more sets.
CD

~ 18n

II;l

the set of points in at least one of the 8n

&:

the set of points common to all the Sn

CD

-nn'J 18n
s*

is defined as the complement of 8 and is the same as ~ -

Proof':

denote

Let

X. 8

(~ + 5 2 )* means that x is not a member of either ~ or 8 >


2

iGeo x

!l1is

s.

J 81J

therefore x e

an element of n

82

'*

~ ~ 1

e 8 2*

since x is common to both

Sr*

8 2*~ x

B ~

* 82*

I;

To complete the proof

=- ~

x e 81* 8 2*

* and x

:l 6

~ x_~

e 8 2*

andxJS2

. 7

x /.

(f\

::::::::r

(81 + S2)*'

+ 82 )

Exercise

2/

Show that 52 - 51 - S2~*0

Exercise

1/

In R define 81 .. {x,y: x 2 + y2 ~
2

1}

i.e~ the set of points x,y subject to the restriction x 2 + 1'2 , 1

~ 'xf~o8, t1"~ 08}

82

{x,;r

{x,y:

z O}

Represent graphically ~ + 8 2,
Def. 2/ If ~ c. 82 c 8

S:J.* 8 2"

S3S251~ 81 8 2 - SlS;"

(an exploding family)

co

We

def'ine~

lim Sn'" ~ Sn
n-l

n~(X)

CD

We

define~

lim Sn

n~tO

fr.B

n .1 n

Such sequences ot sets are called monotone sets ..


Exercise
(aj

hi
{x,;r: txt

8how that the closed interval in R


~ loll
2
as an inf1hite product of a set ot open intervals:

Ans~
(b)

8n [x,p1'g

Ixl

<:

1 +~,

lyl ~l

may be represented

1Y/< 1 + ~

Show that the open interval in R2 {x,y~ JxJ.( 1"


as an infinite sum of closed intervals~ .

11'1 <1

can be represented

-.1-

Probability is generally thought of in terms of sets, which is why we study sets.


Def ~

3/ Borel

Sets -

the family of sets which can be obtained from t.he family of


intervals in lie by a finite or enumerable sequenoe of operation
of set addition, multiplication, or complementation are called
Borel Sets.

The word multiplication oould be deleted, since multiplication can be performed by complementation, eog4l:

(.tIL * + S
-~

Def.

4/

*)*

(s*)*

S. S
-J. 2

~(S) is an additive set function i f

11 for

each Borel 8et A(S) is a real number, and


2/ it ~,\I 8 2, 0 0 are a sequence of disjoint sets

Examples: -

area is a set function

- in!l:L

C?'

~(S).

A(S). !f(X) dx

2Y

f:

dx

~ef3

5/

peS) is a probability measure on ~ i t

1/ P is an addi tive set function


2/ P is non-negative

3/ P(I1c) '" 1

fA will denote the empty set which contains no points,


Ex!!

C;;

P() .. 0

Exo

6/

i t ~ CS2 then P(S:!) ~ P(S2)

Lemma 2/ P (~ .;. 8 2 +

Problem 1:

Frovc lcr.una 2.

0)

~ P(~) + P(S2) +

1 oeo

til

R *;
k

9J

1\

t:t

R
k

Proof:

case I Define:

~ C 8 C 8,

~.

Sf

st
3

82 - ~

S3 - 52

etc.
CD

These sets Stare disjoint; &lso


n

n-l

CD

Sn

n-l n

CD

P ( 11m Sn)
n~CD

(~

8n )
-1

00

(~ S~)

n-l

the additive properly of P

P (~) + P (S2-51 ) + P (8 - 8 2 )
3

P(~,>

..P (8 ) + P(S2)
1
- P(S2) + P(S3)
etc.
P (Sn) after n steps

lim P(Sn)
n~

Case 2:

~::>

8 2 ,:) 8

Problem 2/ Prove lelllJl8 3 for case 2.


Defg

6/ Associated with any probabUity measure

rex)

peS) on

defined by
F(x.)

pc

00 I

x).

F(x) is called a distribution function -

dolo

Ii. there

is a point function

-sTheorem 1/ Any distribution function, F(x), has the following properties:


1. It is a monotone, non-decreasing sequence.
2e F (. CD) 0; F (+ CX +1
3. F(x) is continuous on the right.
Proof:

1/ For Xl < x 2 we

The interval (.
From exercise

have to show that

00" ~)

6 we

PC:;) ~ F(X2 )

C "'he interval (.

have that

P(I:t)

cx),

x2 )

~ P(I )

Therefore F(~) ~ F(x2 )


2/a/ It we define Gn as the interval (-

G a no+CD
lim G
n

cx)"

n)

n:d.,2 p 3,uuo

(the empty set)

~~ .F~n) l~cl(Gn) P(n~~ Gn )

From lemma 3

p(G) P() 0
b/ Follows in a simUar tashion by defining Gn
3/

= (-

CD" n)

Pick a point a ~ for this point we want to show

lim F(x)

= F(a),

JC.7a
x>a

Consider a nested sequence


Then

tn~O,

sn> 0

lig14(CX)a + en) =- F(a) is the property to be shown.

If we define ~ (-

lim Hn H.

n?CD

IX)

a + t n)

interval (-

n:l,2,,3,uuo
ex),

a)

lim P(~) P(lim \) P(H)


Therefore

l~~ a + en) F(a)

lemma 3

.6.
Problem 3/ Show that
F(a) .. lif ~(f) + P {Cal}

Where (a
is a.

x<a

Or in familiar terms F(a) F{x - 0) + Pr(x

is the set whose only point

aD

a)
Where F(x .. 0) is the
limit from the left #

Theorem 2/ To any point funotion F{x) satisfying properties 1,,2, and 3 of theorem
1, there oorresponds a probability measure peS) defined for all Borel
Sets suoh that for any interval (- ex>, x)
p(- co ~ x) F(x) ,

Proof omitted Theorem

3/ A distribution

see Cramer p.

53

referring to p. 22 II

funotion F(x) has at most a oountable number of dis-

oontinuitieso

Proof~ Let vnbe the number of points of disoontinuity with a jump,.


then v

n whroh is what we have to show(1

Suppo,se the oontrary holds, i.e 0

:> n

1#

Then if we let Sn be the set of such disoontinuities, we have


1 P{R:t)

P(Sn' ">

(n) ~ 1

whioh is a oontradiotion,
ex>

ex>

Therefore s the total number of discontinuities. ~ V <:. ~ n


n 1 n
1

co

where

is the sum of the integers whioh is a oountable sum_

Notationg

[ ]

square braokets -- means the end point is not inoluded in the interval -ioe cr , an open interval"

round brackets -- mean the end points are inoluded in the interval, ioe."
a olosed interval.

(a, bJ is the interval a

to

b, inoluding a

but not b.

In Rk to each probability measure peS) there corresponds a unique distributic.

function

7
The interval is the set of points in

1\
i .. 1,

Theorem

4:

F{~ .. x ' ~

&:1

2~ ~

0,

~) has the following properties:

10

It is cant inuous from the right in each variable

2.

F{- co, x 2' It . , x k ) F{~, co,


F{+co, +00'800, + 0 ) - 1

.3.

~ F{al' a2~ ", ~) ~ 0

3:1

0,

xk )

II:

..

..

see p. 79 Cramer

i.eo, the P measure of any interval is non-negative"


Conversely if F(X ~ x ' " "
l
2
P-measure defined by

0'

xk ) has these properties.ll then there is a unique

That is, I is the interval [ . co, -

00,

0,

~).

0 0; ",

('1.'

x2 '

0 0' ~).

&2 + h2~

-----~-------6---~(~ +

h.t,

4>

h 2)

In R
2

P(I) .. F(~ + ~,

~ F(~, a 2)

2 + h2 ) - F(~, a 2 + h2 ) - F(a1 + ~I a 2 ) + F(~I a 2 )

-8-

,,-P(Sl" ~
-F(sl +

-I-

h:t,

+F(~.; a 2.9

h2" &.3 + h3 ) F(~

1-

b:Ls

~i &.3 + h.3)

8,2 .. h2 , 8.,3)

3 + ~) + F(~D

8.

8.

2 + h2~ a.3) <c. F(~ .. ~) Q2' a.3)

...F(al , $2" a )
3
CII

~ F{~I

B- 2D

a,)

TIle proof of theorem 4 is by analogy with the linear case (theorem 1),
F(xl' +

Q)"

~ + 00) p(

CD" xl)

.. the marginal distribution of xl


(similarly for other dimensions)
Def.

8:

It F is continuous and differentiable in all variable s" then

is the density function of xl' x "


2

.!J

~,

Exercise 7~
F(x, y) .. 0

..

if

x ~0

ex + y)

for

.. 1 for x::>l,p '1) 1


Can this be a distribution function in R2?
How can the definitions be completed?

or

'1 ~ 0

O<x

~l

O<y

~l

- 9 -

Solution -- consider the marginal distribution of x


x +1
Fl (x) .. F(x" + (0) ~ F(x, 1 ) a=2
1)
F (0) .. (0
l

But in fact

ai

F(O, 0) .. 0
F(O, y) 0 for all y
Fl (0)

Therefore there is a contradictiono


function"

F cannot be a proper distribution

If F(x, y) is a proper distribution function, then the two marginal


distributions

F2 (7)
Proble!'l ~:

lit

F (+

00;

y)

If we define f(x, y) .. x + l'

must be proper and in this case


they break dowo
0 ~x ~1

o
.. 0

~l

elsewhere

Find F(x, y); F (x); and F (y)


2
l
Show that F(x" y) satisfies the properties of a distribution function..

.10 Delo

9:

Random Variable

We assume we have exper:lments which yield a vector valued set of observations

1.

la

(Xl' X2' II 0 <>1 Xlt ) with the properties:


For each Borel set S in ~ there is a probability measure peS) which is
13

the probability that the whole vector.! lalls in the set in S


(P(S) is non-negativeJ additive, and P(~)

then the combined vector

(.!ll

.!2J

()

1).

Cramer pe 1,2-4
axioms 1 and 2

Il'}on) is also a random variable

--

Conditional Distribution
( .~ Y)
- -

are random variables in Rk _:; R. .--~ -1(2

~2'

Let S and T be sets in "


!!et. 10:

It'

p( X belongs to 80" then we detine conditional probability


p( YeT

I xes).

P(Y C T.y XeS)


p( XeS)

We show that P( Y c:. T , XeS ) does satisfy the requirements of a probability


measure
1.. It is non-negative since p( YeT, XC.S ) is non-negative.
2... It is additive since p( Y c: T/1 xes) is additive in ~ ,.
2

P( Y C Tl J xes ) + P( yeT 2
,3-

P(Y c~,!) X CS)

p(X cS)

P(X <;.S)
If p(Y C T)

P(x

Co

1 X c: 5

CI

P(l CS)

> Owe could also define

SlY C T)

PCI c. SA yeT)
-P(YeT;

) P{ Y C Tl + T2 1 xes )

-11In familiar terminology what we are saying is that

Pr(A

B)

~tA)B)

or

Pr(A$ B) .. Pr(A' B) Pr(B).,

If we have the corresponding distribution functions

F(x, y); ~ (x) ... F(x, +

CD

>;

and F2 (Y) F(+

00,

y) then:

notation:

..- Oapital ktin lettet'~ used tor ranflom variables in general.


-- Small latin letters used for observations or specific values of the
random variables 0
Pr(X ~x) .. F(x)

Def. 11 -- extension:
In the case of n random variables,

Note':

X $
2

..

I)

these are independent if

Three variables may be pairwise independent, but may not be (mutually)


il'ldependent -- see the example on po 162 of Cramer..

If density functions exist, then

Note:

1.J.. ~

~,

X $
2

.,

independent means that

The fact that the density functions factor does not necessa.rily mean
independence since dependence may be brought in through the limitso
y
eog. X and Yare distributed uniformly on OAB

lex, y) 2
O~x~l
O~y~x

- 12 -

Var1ables~

Functions of Random

g is a Boral measurable function if for each Borel set S~ there is a set


such that

Sf

xeS' ~ 8(:)88
is also a Borel set o

U LLU,--St

Sf

A notat ion sometime s used is that S i


S under the mapping go

Sf

g-1 (S) -- that S i is the inverse image of

Now consider Y .. g(X) where X is a random variable.

.=

Pr(yc S) = Pr(x cS i)

p(S')

Therefore, arty' Borel measureable function" Y .. g(X) of a random variable, X" is


itself a random variable.
Pr(y

S) p [g-1(5)J

This extends rea.dily to k dimensionse

Transforrnations:
We have X and Y whioh are random variables with distribution function F(x" y) and
a density function I(x, y)q

Where ~l and 2 are 1 to 1 with';'" ~; are continuous" differentiable;


and the Jacobian of the transformation is non-vanishing."

dX

'Ox

d~

'OY

~Y

1:1

a.;.

o~

We then have the inverse functions

X ..

The density function f(a, b)


f ['t'l (a, b);

III

'f1 (.;."
'f2 (..<,

~)
~)

of the random variables ..<,

2(a, b)]

IJ

is

.. J3,-

Solution:

Consider alsoW

III

X .. Y

Consider the joint distribution ot (Z, V)

z X+

(0, 1)

f-------44!~$ 1)

t(x"Y) 1

V-x-y
o Z+W
C02

Ja

X Z -2 W III Y

III

(:..,,0)

22
1

l'lI

...

2'-2
Density of

IJI

Z" Will f(x, y)

-1

The limits of Z and W are dependent

Z X+ Y
WeaX..,Y
III

It Z .. z, then lIT takes on values from (0 .. z) thru 0 to z ... 0, so that

for Z
II
Q

III

Z ~

f(z"W) ..

with limits ..z.s; W'$ +z


z~l

Since we started with only Z, and "a.rtificially" added toT to get a solution" we must
now get the marginal distribution of Z (this being what we desire)o

z
z
F(z)

) F(t)dt

- 14
If X

'> 1, then

(z - 1) - 1

F(z)

takes on values from

to 1 - (z - 1)

(2-z) dz

2-z 2 .

2 . ~!L.

1
1
(2-Zi. +2'.. T2
~

z2
F(z) =T

1:1

f(z)

1 ..

o
(2_z)2
2

1:1

or from (a-2)to (2-z)

~z ~

(2-!l:.

-~

1 ~

.. z

2-z

1~z~2

~z ~l

See p.

24, ~ 6

in Cramer

F(z)
density

DoFo

2'

z
1

Joint density of

Z, W
fez, w}

2
1

1:1-

1
Oiz~l

... z-s.w'S.z
1

-"2

z - 2

~w ~

2 .. z

If the transformation is not 1 to 1 (that is J .. 0) then the usual devi. e to avoid


the difficulties that may arise is to divide ~ into regions in each of which
the transformation is 1 to 1, and then work separately in eaoh regiono

ioeo" consider in

R:J.

2
Y .. X

We should consider 2 separate


cases:
x.<o
Cramer-

p. 167

X~ 0
Riemann-Stie~jes

Integral:

0.,

Let F(x) be a d.f with at most a finite number of discontinuities in (a,9b) and
let g(x) be a continuous function, then we can define

g(x) dF(x)

Cramer
PCll 71-74

.s follows:

Divide (a , b) into

~l <.

-s;

sub-intervals xl' x 2"

but as

n ~ 00" !J. ----? 0

They can be shown to have a common limit.

0'

1i.

of length ~ !J.

S is increasing, S is deoreasing

So the common limit is called the R-S integral,

b(

+00

Also define

g(x) dF(x) lim

) g(x) dF(x)

a-7 -00 a

b-; +00

"'00

provided the :j.imit exists


and in general

b(
)

rf..x )

dF(x) bas all the usual properties of the fam:lliar Riemann


integral 0

If F(x) has density f(x) which is continuous except at a finite number of points,
,
F(x ) - F(x _l ) -.t(JI') (Xi - x _l )
x i _l < x
xi
i
i
i

t~n

<'

-!', (x ) ~ (x)

j' .. ~

[int g(x)]

~li

g(x) dF(x)

lim

l
D

f (x i) ]

fjx

g(x) .!(x) dx

ordinary Riemann integral.

JC:l.$ x 2,

If F(x) has only jumps at

.,

nand elsewhere

is oonstant

If g(x) is oontinuous then this limit (the R-S Integral) exists o Also, if g(x)
has at most a finite number of discontinuities and so does F(x) and they don1t
ooincide, then the R..S integral existso

Def. 12:

X is a disorete random variable if there exists a countable set of


points, ~, x 2' C I I , x n with Pr(X .. Xi) .. Pi and ~ (Pt) .. 1
[elsewhere F(x) is oonstant, ioe e FY(x)

=oJ.

~I

-I--..-------"~~ -

--'

...--- '

.----,r--__

g(x)

for such a discrete random variable, the R-S integral reduoes to a sum:

,J
b

g(x) dF(x)'" lim


n.....,co

~ g(x~ f) [F(X~) - F(X~"l)]

.. 17 r

rr

Where the xi are points of division of (a, b) and xi


point in the i th interval

is an intermediate

summed over the set of points xi


in (a, b) - .. the points where
there is some probability"
Det ~ 13: X is a continuous random variable if F(x) is continuous and has a
derivative f(x) continuous except at a oountable number of points e
F(X ) - F(X _ )
i
i l

aa

t(X~ t) [ xi
x _

i l

.i X _

i 1

(the theorem of the mean)

~ %~ r ~ Xi

g(x) dF(x)

.
J

Va can extend this definition to k-dimensions readily by writing:

g(~1 ~~6j ~) d . ~

x1 ,

HtJ,

For a del

II

Xt<:

F(x11 , x k )

of '\ see po B

.. 18 If F(~, x ' , ~) is continuous and the densitj f(X l , x '


2
2

It

.,

exists and is continuous}} then


b-

jrg('i'
a

0'

~) ~o.o ~ F('i'

~l ~
0

g('i"

~) f('i'

~)
0,

and this extends easily to the k-diU!ensional case,

So

In

l1.

0,

.,

~) dxl' ~

ak

d F(x) F(x) - F(- ClO) F(x)

-00

ji
b

dF(x). F(b) - F(a)

a.
+CD If we let b ~ .,. 00,

&.-7- 00

dF(x) 1

+00

d'i

~F('i'

0,

~)

-00

Consider k 2, and the marginal distributions


+00

Fl(xl ) F('i' + m).

)
-CD

lim

dFx (x1Si x 2 )

a1-CD
b -7+CD

that we have:

xk )

.. 19 ..

a.., ..co

b-7 +00

Fl (~, + m ) .. Fl (~" .00)

This also extends readily to

R.k

with xl held f:ixed.


If the density function exists 3 then this reduces to a k..l integral

+00

+00

0' ~) ~2'

0'

..m

- 00

Problem

f(>i' x 2, ' ,

6~

if Xl'

show that

U$

-2

ln Xi

have independent, uniform on (0.. l)distributions,

has a X2 distribution with 2n d.f.

and indicate the statistical application of thiso


see~

Snedecor ch~ 9
Fisher ...- about pG100
Anderson + BancrOft -- last chapter of section 1

From problem $b

Z .. -2 lnX Y

-2 ln X + ... 2 In Y

or is the sum of 2 X2 with 2 d.f" each


Referenoes on

integI'als~

-... Cramer pp. 39-40


-

Sokolnikoft -

Advanced Calculus -

ch.

.. 20 ..

Chapter II
Properties ot Univariate Distributionsl Characteristic Functions
Standard Distributionsl
A"

Trivial or point mass (discrete)


Pr

B0

[X ... a]

x.(a
x~a

Uniform (contimous)

F(x) = 0

x <0

F(x) ... x
1

rex)

C.

F(x) ... 0
F(x) 1

~x ~

>1

Binomial (discrete)
Pr

[x

sa

k]

.(:)pk(l

~ p)n-k

o ~p

] (:)pk (1 _ p),,"k [<l-P) + pJ n


D

1n

kaO

~ ( : )pk

(1 _ p)n-k

;; 1

is an identity in PI n

k=0

Do

Poisson (disorete)
'\

ct

00

e- A

'\ k

(X)

~ e->.. ~ .. e ->.. ~
kaO
E.

k~

k:

k~O k~

=1,

2,

.o~, 00

A >0

->..>..
.. e

e 1

Negative binomial (discrete)


Pr

[x . . kJ

r
. . ( r+k-l)
r-l
p

(1.. p)

k ...

l~

2, ..

oe,

00

O<::p<l
r is an integer
Example.

Draw trom an urn with proportion p of red ballS, with replacement, until
we get r reds out e The random variable in this situation is the number of
non-reds drawn in the processJ to have k black balls means that in the

first r + k - 1 trials, we got r .. 1 reds and k blaoks, and then on


the last trial got a red, the probability at this is

This is what is referred to as inverse sampling in that the number of


dofectives is speo:!1'ied rathez' than speoifying the sample size whioh
is then sorutinized for the mmber of defeotives.

F.

Normal distribution (oontinuous)


(x_ij)2

rex)

==

J'2ia

cl-0:>

Problem 71

<:

X <0:>

Prove that

t:~l

pr (1 - p)k 1

i"e., is an identity in r, p
Hints
Def. 14:

is in the Mmc ... express (a+b)-rl in an infinii.te series.

It' X is a random variable with distribution F(x) and if'

..

g(x) dF(x)

exists, then we define the expectation of g(X) as

g(x) dF(x)

this being the R .. S integral,


if X is oontinuous

E [g(X) ]

Cl

-co
if X is disorete
Problem 8,

Given

F(x)

==

E [g(X)

1 +

'2'

rex)

dx

-OD

v=O

g(xv ) Pv

x <.:0

1/2
Q

g(x)

)(

ffn

S .
o

x == 0

t 2/2

dt

>0

(This is a oensored normal distribution -- i.e., all the negative


values are conoentrated at the origin)

.. 22 Find,
Def .. 153

E(X)
exists it is defined to be the kth central moment and

If E [X - E(X)]k

"'k0

is denoted

If E(X)k exists it is defined as the kth moment about the origin" and is

denoted
Exercise 9g

Find E(X) for each of the standard distributions/1

53

Theorem

"\0

E [g(X) + hey)] .. E [g(X)] + E [hey)]

Proofs

Let F(x, y) be the joint d.f Cl of X, Y and F and F2 the marginal dof
l

E [g(X) + hey)]
III

Theorem 61

II

If X, Yare independent random variables, then

Proof.
Corollary.

[g(X)+ h(y) ]dx/(X,y) =jJg(X)dx/(X,y) 1f(y)dx/(X,y)

g(x)dl'l (x) + h(y)dy F2(y) .. E [g(X)] + E [hey>]

E [g(X) hey)] ..

E[g(X~ E~(y)J

See Cramer, po 1730

If X and Y are irdependent random variables, then

Var(X+ Y) .. Var(X) + Var(Y)


Mornentsa

'\

E(X) .. mean
"'-2 .. E(X ... lJ.) 2 .. variance
=:

~ .. E(X _ ~)3

""4 ..

E(X ..,

~)4,

etc.
for the normal distribution -- N(O, 1)
..x2/2

t(X! 0, 1)"

.J 2n

_1
{2i"
00

_..k
E(X--)

1
.---..-

all odd moments (k odd)

=0

(>

by a

"s~etry"

argument,

23
E(X2)

E(X 4)

III

dx = 1

{2n
3

or in general
E(X2n )

from integration by parts

2
2n -x /2

III

{2n

dx = 1 :3

5 ..

z: (2n - 1)

which can be shown by induction.


Theorem 7a Let.,(o III 1"
F(x), i.e o ,

~" ~,

u.,

be the moments of a distribution function

ilk

CD

then if' for some r

:>O~ ~ -

converges absolutely then F(x) is the


k=O
k~
only distribution with these moments~

Proof~

See Cramer, p. 176.

Example I

N(O, 1)

~k+l

=0

co

since odd terms drop out.

k=O
co

III

~ _~(_r2"",,)_k_ _
k=O 2k- 1 (k-l)! .2k

2
{r )k

~ ~(:;)k
k.

_r 2/2
III

::I

==

exponential series

- 24 converges absolutely for all r, therefore the only distribution


with these moments is the d.f. with the density
,

"

f(x)

1lI..!.-

V2it

-x /2

ilts., the normal

Problem 91

Find the moments of the uniform distribution and show that this is the
only d.f. with these moments.

Theorem 8a

(Tchebycheff's)

If g(X) is a non-negative function of X then for every K> 0

Pr [g(X)

~KJ ~

E g(X)]
K

00

Prooh E [g(X)]

g(x) dF(x)

Let S be the set of values of X where g(X)


Pr [g(X)

't KJ ? !.(X)
S g

dF(x)

.f

dF(x)

III

'.- Pr

Corollary II

since the smallest value of g(X) in S is K


KPr

[g(X)~ K]

E G(X) ]
K

The above (th. 8) converts readily into the more familiar form

Proof,
g(X)

~(X)~KJ !

~K

III

(See p. 182 in Cramer) setting


~)

(X .,

Pr [ (X - lJ.)

(k 0)

J~ TI2

'" k

taking the square root of the left hand side

- 25 -

X
n

II

munber of successes in n independent trials with a probability of


success in each trial I I p

O'2x = np(l-p)

Proof &

From corollary to theorem 8


Pr

[I :.r - pi ~ k

Choose s

=5

0"]

(,

or k

kJ p(l-p)
n

;l
1
II -

-J8

= e or

n = p(l-~l
8 e

Hence 1 n is chosen this large~ from the corollary to theorem 8 the stated
probability inequality follows o
Note I

Theorem 9 could be rewritten

n=~
8 s

Characteristic Functions 3
Def o

168

Characteristic functions

X(t)

=]

eixt dF(x)
or since e ixt = cos xt + i sin xt

CD

(X)

X(t) =

-00

cos xt dF(x) + i

-r~ sin xt dF(x)

- 26 -

the successive derivatives of (t) when evaluated at t = 0 yield the moments of


F(x) except for a factor of a power or tti".
d

(t)

dt

J ix e uro dF(x)

= v (t) =

m
-(I)

CD

' (0)

= i

x e O dF(x) i ~

-00

in general

the moment generating function operates in the same manner, except it does not include
the factor tti" $ and is therefore not as general in application,
MaF

= M(t) = E

(e R )

00

= .~

ext dF(x)

if' this integral exists

-00

and operates by evaluating successive derivatives with respect to t at t 0


Lemmas

i t E(Xk ) exists, then k (t) exists and is contimous.

also true.
Examples I
1.

Trivial distribution,

Pr

[X = a]
(t)

CD(
III

-00

20

Binomial.

1
xt

eJ.

The converse ~

- 27 co

3.

(t)

Poissona

k
eitk a-A _A

'"
~

III

kl

k=O

411

Normal (O~ 1) S

(t)

(1)( itx _x2/2 dx


1
)
e
e
-

1
III -

.J2n -00

1
=-

{2n

Fn

.~... e-

2
x

2' 2itx

dx

x2 ... 2itx + ( it l 2 + (it)2

e-

dx

setting y = x - it*

(Note the term in curly brackets is the integral of a normal density and equals one.)

The validity of this complex transformation has to be justif'ied o

If X is N(O, 1)

(t)

III

e-t

/2

2t _t 2/2
= - reI'
.)1

.),

(t)

= -e

(0)

=-

_t 2/2

2 _t 2/2
+ te"

22
E (X)
=i "

=1

See Problem 11.

.. 28 ..

Problem 10 s Prove

Problem 11.1

Show that

without using the transformation used in class


Hint. e itx oos tx + i sin tx
Theorem 108
AX +B (t). e

iBt

X (At)

where X is a random variable with a dof. F(x) and a characteristic function


(t)
y .. AX + B where A, B are constants

Proof:

i f X is N(O, 1) then Y a X + jJ. is N(jJ.,

Pyet)

I:

eitjJ. x(a t)

= e

Def 17 i
<)

i)

it

jJ. e

;:;e 1tjJ. - .

The cumulant generat ing function, K{t )" is defined to be i

K( t) :Ln ~(t)

Example. For Y which is N(jJ.,


2 2
t
K(t) itjJ. -

T-

i)

- 29 Note I

For further discussion of cumulants see Cramer or Kendall e

Note,

Originally cumulants = semi..invariants (British school name ~


Scandanavian school name) -- however, semi-invariants have been
extended so that now aumulants are a special case.

Theorem 11:

If

J1.P

X2J

0 UJl

Xn are independent

ran~om

variables, then

tr~

~f;.,~~.~ =.,.--1. ~AO~J'~. ~~}

.~x.-""
.i=l'
. J.

Ai

(the c.f e of a sum = the product of the individual c ,f' )


Proof,

=e

n
n
therefore we could say that Y is N(] j.(.i' ]
1

ai)

To justify this last step we need to show the converse of F(X),"""? X (t) ioe"
X(t)

~F(X)o

Therefore, we need the followirg lemma am theorem,

sin ht dt

- 1 b<.O
III

t
Proo.t'~

that

J(..(, ~)

~f3

III

a:>.

III

(-

cJ

0
+1

e-...(U

e-..al

cos

h '=0

b>O

~u

du

sm r U
u

du

'.,('>0
/

Note I

Differentiation under the


integral can be justified,

- 30 ..

J.

III

J.~+'P2

see tables or integrate by parts tWice

f~ dp =2~2
Let

djl

= arc

then J(.<., 0)

{3""'70

tan

~ +c:

e'<'u ~ 0 du ... 0

arc tan 0 + C = 0
o + C ... 0

Let J. ~O, aId put {3 ... h, then


lim J(J., h) -= arc tan:
0<.'--7 0

h>O

-T
n

... "' 2
Theorem 12&

h<O

It F(x) is contimous at

F(a + h) ... F(a - h)


If

1rfil

depending on h + or -

00

\-7 00n

(t) dt<", then rex) ='

Note~

Recall that (t)

1:..

lim

a - h, a + h, then

i.t

-1

....ita

SJ.ntht e

(t) dt

.1 .

-itxIt (t) dt

itx

(I)

lex) dx

~el~

16-1

Combining this with the above theorem means that given (t) or lex)
we can determine the other.

Prooll
Define

1
J=n

;T

sin ht e

-ita
wet) dt

-T
T
1
=n-T{

sin ht

-ita
Ell

CJ

/tx d

F(X)] dt

-31Interchanging integrals (reversing the order of integration) which can be justified


this becomes

or)

.l.
n

...ita

tee

...

J[J

The

T( sin ht

-00

1
a_

No1>e!

'-

s~

ht

S~ sin term is an

;it(x ..

Notes

_~_(
J

,r.b

dt

a)

[:OS t,(x - a)+

d F(x)

i sin t(x..a)] dt

_~

old ,function "".

The !~ cos term is an even function


- _;

itx

dF(x)

0. Ja2j

S;i~ h! cos t(x.-a) dt dF(x)

2 cos A sin B = sin (AotB) ... sin (A - B)


J

a}

1 fj

~ sin t(x-a.hl

sin t(x-a+h)

dt d F(x)

now take the limit as T 7(0


using the lemma just proved, with that h = (x-a+h) here
x=a+h

---11---------

For sin t(x-a+h),-_Xi_-_a_+h_<O...;..._I___-~_-_+_h)O~-

For sin t(X-a.-h)----__


For x in each region

both

= ... 2'n

x-a-h<..O

:1st part

n
'2"

I.. .

Xi_-_a_-....;h>O
both

lit

I
J 2nd
I

Whole integral (in


brackets)

part""

I
I

1t

.. 32 i,e o , in the region a .. h

rf
T

sin t(x

~a

~x~a

+ h) dt

+ h

T( 1

) t sin t(x - a .. h) dt ..
o

.. 0

elsewhere

a+h
1
Ja-

There

alh

a-h

CD

o.

dF(x) + ..

1l

dF(x) +

a~

0'1 dF(x)

.. F(a + h) .. F(a - h)
Proof of the second statement in the theorems
F(a +
h) F(a h) .

2h

ex>

sint&
ht e -ita rI.(t)dt
'P

-00

taking the limit of both sides as h -70

= 2l n

tea)

<X>J lim
_
h-.,o

sin ht e...
ht

ita

(t) dt

"-~=1
therefore
f(a) ...

-J

'2i'

-ita
e

(t) dt

Problem 12iLet Xi (i .. 1, 2,

eo,

a-l
f(x) ... a x

n) have a density function given by


O~xfl

a>O
n

Find the density of Y

-rr

i=1

X
i

X.~ are independent

(may need the re sult of Cramer p. 126)


Problem 13r

Define a factorial moment

B(X[r]) = E [X(X-l)
Define

F (t)...

.1
ex>

(x-rotl)]

(1 + t)X dF(x) as the factorial moment generating function.

- 33Find
F*(t) for the binomia1 and use this to get the factorial moments.

-It I

Problem 142 If (t) 1:1 e


find the density of f(x) corresponding to o Find the
distribution of the mean of n independent variables with this dof ~
Theorem 13: A necessary and wfficient comition that a sequenoe of distribution
fUnotions Fn tend to ., at every point of oontiwity of F is that n" the oharaoteristic function oorresponding to Fn' tend to a funotion, (t) which is continuous at
t 1:1 0 ~r tends to a funotion (t) which is itself a oharaoteristic function] It
Proof:

Omitted -

see Cramer po 96-98

This theorem says, if we have Fl

F2 F
3
(11. 2 3

ouu
fl

we go from the Fi to the i - .. observe that the i tend to a limit ~or exampl,
the normal approximation to the binomia~ -- observe that the limit is itself
a charaoteristio funotion [of the normaJJ .... then go to F by previously
disoussed teohniques.

Theorem

l4~

(Central Limit Theorem)

~" X u o u are a sequenoe of independently and identically distributed


3
random variables with a distribution function F(x) with finite first ilnd second

If'

X:t."

moments

(say mean

IoL

and variance ~2] then3

1-- the distribution of Yn =

vn- (*

~ Xi

IJ.)

tends" as n
normal distribution with mean 0 and variance (]2

2- for any interval (a" b)

1 is asymptotically N(O, i)

3-- the sequence [Y n


Proof:

Denote the o.f o of X. as(t)

to get the c.f. of Yn =

(t)
n

rn

~ (X...
~

IJ.)

~ (Xi""'

rn-

IJ.)

~oo"

to the

- 34 ...
it' we expand

(t)

where

(t)

in a Taylor series, we have

i2~ t 2/2

&:

1 + i ." t +

&:

1 + i lJ,t ... (lJ,2 + ci) t 2/2 + R(t)

~ ~O

." = loLl

+ remainder

= I0Io2+

as t?O

1\n (t) n ~n e -:I. fn t

In

&:

rn lJ, t

:: i IoL

Note: In (1 + x)

rn

x2

= Xae T

+ n In

(i>]

$6(*)

t + n In [ 1 +

ifi..

al :::". ) ]
lrn

... (1+2 + (j2 )t 2 +

2n

x 3 x4
'1"
.. "4

x
= x -"7

r6

+ In

+ R

.~......,

where
setting
ln

0{

~n

X!'I

vn

.rn

(t) = ..

1.

i/

i IJ. -

as x ~O

2
2
2 t
t
.. (IJ. + a ) ~ + R (~) we get
t:n
'In
2
2
2
t
~t + n i lJ. .., (~ + (] ) 2n + R (-)

[trn

(i j.L.i)

rn

3
~

It]

An + R ~t)

+nR(~ +V~~

n~CD

In

r (t) =- a 2
n

it2
"

lim
n-too

rA

'Yn

(t)

&:

e-

-r

.,.

Y2

denote this by

2 t2

lim

nJ/to.
n
~

Tn

Rill

since lim

n-)ro

\-;:::F/"" = 0
./ U

lim

R~ t

n~

Q)

~) l

&:

t
0
since as n~, ~~
which is the characteristic function of the normal
distribution N(O, (]2)

Note 8

See Cramer p

214""l215.

tI

The theorem says, for any given s there is some sufficiently large nee, a, b) such
that

b( -t2/202

a.
btl

j
dt <::::s

directly -- use Sterling's approximation


by characteristic functions

see Feller ch. VII

Problem 15 ..- says that the standardiZed Poisson random variable is asymptotically
normally distributed N(O, 1)
. ..,
.,
Cauchy Distribution - see problem
differentiable at the origin

14 --- has no mean or variance since e

lim
b~CD

- \t

is not

.!.
n

a"-7CD

=oo-co

therefore E(x) does not exist for a Cauchy random variable, even though the distribution is symmetric about the origin"
Theorem 15~

(Liapounoff)

If ~I X2,
2
0i and with

e&o

X are independent random variables with means

P~ = E

I Xi - i 1
8

3 4.. CD and lim

~p3

(~

VI -']00

then
Y

"O'i

Proof l

an

""

~X. -~J,t..
:L
)
:L

=c

;r:L

is asymptotically N(O, 1)

1/2.

See Cramer p tl 216-217 ..

."
3/2

and variances

- 36-

ft.Problem 16a Define M*(t)

III

E(Xt ) as the Mellin Transform


III

k xk- l

III

...

If X has density f(x)


then M*(t)

III

k~t

It X has dens! ty f{x)

then

Wet)

0 ~X~

&~

k2. ~"'l J,n x

O<x~l

Use this to find the density of Y


density f(x)
k x k-l

III

Xf2

where the Xi are independent and have

III

State a theorem necessary to validate this approach.


Laplace Transform:
if f(x)

III

x<O

-- extensive uses in differential equations


..,. extensive tables of the L.T o in the literature -- tables for passing from
the transform to the fuction and vice versa ...- see Doetscha Tables of the
Laplace Transform"
Notes

Replacing (s) by (-t) gives the m(Jgof .. , or by (-it) gives the c.f 0

Fourier Transform -. the mathematical name for the charact.eristic function


III

[eitx ]

a;

\Ie have previously noted that if X, Y are NID with means jJ.. and variances
221.

then Z a X + Y is normal N(~ + ~I a + a2 )c


1
There is a converse to this "addition theorem for normal variatesll to the effect
that:
if Z

=:

normal

X + Y where X, Y are independent and Z is normal$ then X, Y are both


._. see Cramer po 213.

Derived Distributions (from the Normal and othersh

name

density
1

n/2 ..
2

r (~)

n.

;~

...x/2

.Cof.
(1 ..

a
n

2it)~

2n

- 37 ..
n

~....... Xn are

.- it'

NID(O, 1),

x2t

is y.,2

= X; + X~

_. it has the additive property X;+n


Garruna

aA

-ax

"'-1
x

rCA)

.. XZ1) is a special case of the gamma

- = Pearson's
student's t
(n)

-_ t

Type III distribution with starting point at the origin"

~(~)

rm

r(~)

=XJ n
y

--

2 nil

(1+ !..) 7'"


n

X~

where

n-2

Ln ;;.2)

( n",>l)

Yare independent, X is N(O,? 1), Y is

X~

-- t with 1 d.r \) is a Cauchy distribution

r{~)

F(m, n)

r<;) r()

i 1

n/2

( ~)

---

X m+n
(1+ !!! x) T
n
.,

2
2n (m + n ... 2)

n
n'"

m(n_2)2 (n-4)
~

X>O
-

if X, Y are independent, X is

Fisher's z is defined by F

= e 2z

2
Xm,

Y J.s Xn, then F =

see Cramer p.

243~

Beta

13(p, q)

r (p+q) p...l
q-l
- - x
(l-x)
rp,rg-,

__ putting P

= ~o

= ~,

we get 13

.~.

p+q

~ F

= _n_ _
l+!!F
n

-- see Kendall for the relation to the Binomialo

Jil!!'YTr1

38 Gamma function,
rep)

rep)

III

n-l
x

(p-l) r(p-l)

i f p is a positive integer

r(p+l)

lim r~h)
p

n>O

dx

III

rep)

= (p-l) !
Stirling IS Formula

=1

h Fixed

rep)

P~
Problem 178

2.

X and Y are WID (0,

III

arc tan Y/X

Problem 18g
Cramer po 319

No. 10.

(j

).

Find the marginal distribution of

-39Chapter III
Convergence
Convergence of Distributions:
a. Central Limit Theorem (theorem 14)
b. Poisson distribution - (problem 15) -

Proof: 1. By c of'. is straightforward 2. Direct


>"+b~
P

(a, b)

~>...,y.
K!

= >..+a{X e

i f X is a Poisson r.v .., then

see sol. to problem 15, or p. 250.


where P(a" b) is the above
probability statement.
x= If->..

LetK= >"+VAx

-rr

X+6x aK + l - h

iA

ax
b

..

(a,b) . x=a

.-A AA +

xl?:.

111-

rr

(h + x~)!

using stirlingl s Formula:

nl

n -JIl ' .. 1/2


n e (2nn)

where e{n)--7 0
as
n --7 co

III

1 + Q(n)

-402

Notes: limit

= e-x

X 700 (1+

limit ln

lfi3:1t
Xvr -

:::J

1..-1 00 (1+ .:.)1..

{i

where 9 (X) -7 0

as 1..700

henoe the lim


Pea b) = lim
X 700 '

~:tG1:l18ml sum ..l.. ~

r
b

Note:

.l..

f2;a

2n

e"'x

h2~

e..

/2

dx

For a similar proof for the binomial oase, see J. Neymam First
. Course in statistics, Chapter 4.

o. If' X has the X2 distribution, then

:_n

~
V 2n

is A ~1(0,1) as

n-7

00

Proof: See Cramer, p. 250.


d.

If X has the Student's distribution with n d.f .., then

X-o

J~~

is A NCO,l) as n 700

Proof: Deferred for now) a proof working with densities is given by


Cramer, p. 250.
e.

If X has the Beta distribution with parameters p, q, then the standardized


variate is A N(O,l) as p -j 00, q -700, and p/q remains finite
Note:

X has mean...P....
p+q

Proof:

Omitted

1
1 + qJp

-41Problem 12:
a.
b.

X is a negative binomial r.v. (r,p)

Find the limiting d.f'. of' X as p~l; q-70; rq-7'A. (f'inite).


state and prove a theorem that shows that under certain conditions a
linear function of' X is AN{O,l).

Problem 20:

X,Y have a joint density f'(x,y)

Define E

[seX) (Y] ..

where f(x y)

g(x) f(xly)dx

-co

::I

Show that E[g(X) ]

f'(XJY>!f'l(y),

f'l(y) is the marginal density of y.

DEE [g(x>Iy]

Y X
Use this to find the unconditional distribution of' X if
Y is Poisson (A.).

xlY is B(Y, p) while

Convergence in Probabilit;y
Def. 18:

A sequence of' r.v., X:t' X2, .. ., J!h is said to converge'in probability


to a r ov. 'X i f given s, 8 there exists a number N( eo, 8) such that
f'or n ';7 N(e J 6)

Pr[1 Xn - X )?S J<8


which is written ~

-p

As a special case of' this we. have


Nsuoh that for n

-t tt

indicates converging
in probability.

J!h

p> c i f given

t,

8 there exists

>N

Pr[\~

cA>~e]< ~

Further, in this notation, theorem 9 may be written:


1 J!h is a binomial r.v with parameter n, then

p-1

Ii
Theorem 16:

If' ~,X2' ... 'J!h are a sequence of independent random variables with
means l-Li and variances
n

cri then i f

Lcr~
1 1

---""2- ~O
n

then
Yn

~ ~ (~ ~ - j

-42Proof:

Bit' the corollary to the Tchebyohe.fi'" Theorem (thm.8)


n

~ ~Xi - I'i~ 7 K "Yn1"' ~

Pr [I
given

~, ~

choose
"

Now

oi:'n

Put K

IOl

'.

iii

v'"f
V2

{6

n
)'

~ ~< ~
iii

o~ ~

-7

Gl)

and take n suffioiently large so that

(~ f' o~
n

)1/2 <.. s

with this choice of n


n

Pr[1

as n

~ ~

(Xi -

'1.~ ? ~ J< ~

If X:I.,X ,X , ~ a U ~ have the same distribution, then


2 3
2
2
andOy = .2..~o
asn--7oo
n

ai

2
o
=n-

so that in this particular case, X - P ~IJ.


Theorem 17: (Khintchine) (WGSk Law of Large Numbers)
If J1.,X2,

mean IJ., then

"X are independently and identioally distributed r


n

~ IJ..

Proof: (See Cramer, p. 253...4)

xCt)

In

IT t
E(e ~"')

i(~ IXi)t

l(t)

E(e

. (t)

==

n In (~)

where

~ -7 0

Lx(i)] n
rL

= n In 1 + i IJ.! + R]
n

as t

-1

.v. with

-43-

In

.
(t) .. -1IJ.t + n In(l + i lJ.! + R)
1-1.l.
n
Note:

In(l+x)
where

-:UJ.t .. n ~i

III

R~X) -7

(t) -) 1

as x

-j 0

RU(!)

...ntn.--to

so

henoe

IJ.* + R{i> + Rll]

Rn(1)
= in t whioh as n-f (I)

x + R(x)

as n

-t 00

!-lJ.

but i f (t) = 1

F{X) = 0

x <.0

x~o

III

or, in other words, the limiting distribution of X- lJ. is the


trivial d. whioh takes the value 0 with probability 1.
0

Questiom

X:J.'

X2, X ,
3

Do

II

lJ.

~3

Do

2
2
~ 2

If

1>&0'

Xn

." .

lJ.n ~lJ.

GO"

,in ---762

i.e ., does ~7 I

imply lJ. n }lJ. '1

Not neoessarily)as shown by the following example.


Examples
Let X be defined as follows:
n
2
In = n
with probability

with probability

-1
n

1
1- ....

-7

-44,"~ X

E \'" X
l n

-j 0

:I

0(1 - 1) + n2 (1)
n
n

as n -)00

:I

r.~1-t 00

Problem 21:

Let X be the r.v. defined as the length of the first run in a


series of binomial trials w~th Pr [ success J : I p. Find the distri
bution of X" E [XJ ' and ~(X).

Problem 22:

Let ~" 1 2 , '''''!n be independently unit.ormly distribu.ted (on 0,1


L~t
m1n OS.. " 42' ..." ~). Find the d.f. of I" ELY] , and
er(Y).

r=

Find the as~totic d.f. of nI and of I ... E(Y)


tTy
Is Y ... E(Y)
A N(O"l) as n - j oo~
0'1

Theorem 18:

u"x.a

Let X.,,~,
be a sequence of random variables with distributi~
functions Fl(x)" F2(x), "0' Fn(x) ~F(x). Let 11" 1 2, ""Yn be
a sequence of random variables tending in probability to c.
Define: Un

:I

~ + In;

Vn ~~;

and Wn

:I

~/Yn

a) the d.f. of Un ---7 F(x-c); and if c 70


b) the d.f. of Vn~ F(x/c)
c) the d.f. of Wn~ F(xc)
Proof:

All three parts are similar -

see Oramer p.

2S4...S for proof of

the third part.


Proof of the first statement (a):
Assume x ... c is a point of continuity of F. Lett be small so
that x ... c t is an interval of continuity.
Let B:t : I the set of points such that.
Xn +1<'
n- x;
8

:I

/1n

-cl<::.-

the set of points such that.

xn + Yn

.:::.

Xj

In .. c \ 7

+ 82 CI the set of points such that


Xn +1n_
<:"x

:I

..

..45-

IYn .... 0 \'7 e J~ Pr [I Yn .. 0)7 eJ whioh ten~s to 0 as


n -700" the~efore we can choose nl so that n >nl implies P(8 2 ) c::.. i
P(S2) Pre Xn + Yn L.-~.

In ~:
-1
F@x ....

0 ..

e -Soyn::- 0 +

0 .. 6) ....

P(~ + ~ L. x)

~ p[ V

&,

thus
.

f<F{X" c .. e)

= P(~ = x ..

~c - ~)] = Fn2 [

:=I

[xn ~ x .. (c + el J :::-

Yn)

x - c ..

~.o(x -

c + s) -

where n is chosen so that when n;>n Fn(x) .... F(x)(i in the vicinity of c.
2
2
Therefore" in Sl

i< P(Xn +Yn x) L F(x .. c + ~) +.i


oan be chosen so that F(x ... c +
F(x ..
i for n
F(x ..

0 ...

e) ""

6) -

NO~ing
.... 8 ~ ....

i?hat ,Pr[Un ~ x

i .. i ~ F( x ..

p( 8 ) +p( 8 ) ..

~ F(x -

+ e) +

0 -

6}<

J= ~[Xn + Yn~xJ = P{Sl) + P(S2)'

6) ..

~(x

c)

i + 0 - F( x ., c)

= Pr [Xn ~ .xl

i +- i . . F(x -

0)

~i

:"" F( x ...
~

whioh makes use of F(x

c + e)

F{x .., c)

II(

F(x

0 .... 6)

F{x .... 0)

III

This then reduces to ....

~ ~-

hence J Pr[Xn~xJ .... F(x ...

Pr

[xn~xJ

c)J~~

L.

.. F{x ..

~ ~ 6

i
i
o)~- ~

for n7max(n l ,n 2 )

which is what we set out to prove in the first plaoe

...

.,max

(~"n2)e

we can write:

-46-

:'heorem 19: If X
n
g(X )
n

ProBlem 23i

>0

and i f g(x) is a funotion oontinuous at x

0,

P-} g( 0)

Prove Theorem 190

(work with faot that g{x) is oontinuous)

ExamJ2le of Theorem 18:


Show t n is A N{O,l)
where

XJ.,~,

000

are independent with mean

sn =
whioh is in the Wn form

In is A N( 0,1) by the oentral limit theorem.


Henoe, 1.t' we show that
sn
y=p-71
n
(]
then the statement that t is A N( 0,1) follows from theorem l8{ e)

~(\ _ 1)2
{n _ 1)(]2

::I

n
n -1

=-

~, val' (]

-47-

by KhintchineJs theorem (No. 17) this sample mean tends to (]2

p ) (]2

therefore, the first term -

i.el)'

(Xi ... J.L)2

) 1

hence, it remains to show that

(x - J.L)2

2
(]

':j

but we know that ) X ...


theorem (No. 17).

J.L

(I ... t.l.)2 _

Therefore.
,

-7

J.

:;J

pI

as n -7

ex>

2
and sn

by Khintchine t s

2' ~
p

(]

(]

s
and thus by another application of Theorem 19 -;
Note:

p ., 1

The t n in this case is Student as distribution if the x are


independently and normally distributed -- however, this is
for general tn.

Re Theorem 19, see: Mann and Wald; Annals of Math. Stat., 1943;
itOn stochastic Order Relationships" -- for extending ordinary
limit properties to probability limit properties.
Mise .. remarks#

On the Taylor series remainder term as used in the


proofs of theorems 14, 17, and on p. 37 -see
CratAer p. 122.

If f(x) is continuous and a derivative exists, than we can write


f(x) = f(a) + (x

GO

a) ff La +

Q (x -

a)]

~ 9 ':=-1

=:

f(a.) + (x .. a)Lf'(a) + ,fl(a ... G(x - a) ) ... f~(a)J

10

f(a) ... (x .. a)fU(a) + R

where R

= (x -

a) [fl(a + "(x .. a - f' (a)]

-48R

fa [a + Q(x ... a)] .. fl{a)

IS

ft
then i f

[a + G{x ..

is continuous" as x --7 a

f~

lim

a)] ... fl{a) ----70

III

ioeo,P R converges to 0 faster than x - a

x...-.)a x-a
If there exists an A such that fA g(x) dF (x)

Remark:

0)

..
for n
)

1, 2~.3, 0lJ u then i f Fn--p"7~ F

g(X)dFn(X) - - ;

...co
Question,

r::;

and ,

<s
g{x)dFn(x)

Ref .. Cramer, P4 74

g(;)dF(x)

...( 1 ) '

Under what conditions does E(t n ) 0 for all n


or does E( t n) --j 0 as n ----1. co'l
(tn is defined as in the example illustrating theorem 18)

Counter-exampJ,elt ~

1
2
;-; e- x /2

Define Pn:=...L
~ 1-n
In is normal (0,1) except on the interval. (1 - ~ , 1)
and

III

1 with probability Pn '

X:L'

Then with probability (Pn)n


in which case t n

.l0-

IJ.)

= ,.,.
~

therefore E(t n ) is not defined.


Problem.: 24~

X is .Poisson All then Y III

is asymptotically

(X

X".)

X~l) as ".-7

CD

X2,

"0'

III

<s

-LeConvergence Almost Everywhere:


Def. 19:

A sequence of r.v.,

X:t'

--t X a.e.

X2, ...

i f given

8,

6 there

exists as N such that

Ref ..: Feller, Chapter 9.


~:

Convergence almost everywhere is sometimes called "strong convergence"


Convergence in probability is sometimes called 'lrseak convergence".

Exam:raleg . (of a case when Xn

but Xn~ c a.e.)

p ) c

with probability 1 .. ~

-,=:
1 with probability
n
the XI s are independent.

(1)

To show Xn

p -j 0

.. !n

for any 8<1

can be made arbitrarily small by increasing n.

(2)

~-/-10 ae ..
P.r

D~ "'"

0 )

<

00
III

IT Pr[~L.E]

nmN
Note:

l - x <:::.. e-x

= N,

N + 1, N + 2, ...
CD

= IT

(1 -

N;j)

j=O
0

~x

=1

which series is divergent

-50therefore

[I Xn - 0 I < e

Pr

therei'ore ~

= N,

N+l, N+2, ]

=0

0 a.e.

Problem 25:
If

then

=0 with probability

=1

--j 0

with probability

1 . . ,1~
1

7'

a.e.

Problem 26:
= cos ta

x(t)

1. What is the d.f. of X'1


2. Is 'f A.. N. as n -7 CD (suitably normalized) '1
3. To waht does X converge as a--j 0 '1
Prohlem 27:
X.~ and Y.~ are independent and identically distribu'ted random
.

variables with means

IJ.,

1/

and variances

vi ' a~.

Find and

prove the asymptotic d. of X Y (suitably normalized).


0

Kolmogorov ine9Wit,V

Let
2
and variances O'i"

IJ.i.

Then
Pr

I~

n n

[~~

be a sequence- of r

oVa

Xi

~ fl \ . q

(tai

)1/2

Exampls3

J1.

X2

Pr

[J

Xi

\-<

\ JS. + X2 \

<

Y2K ] '/

1 -

with means

Theorera 208
oo.

are independent random variables with means l1


i
00

2
and variances ai' then, i f. L\

l-7

l1n - 'iin

2/,2 converges
ail.

0 a

or we say that X obeys the strong


n

where jj;

= 1n I

J.!!! !

large numbers

l1i

Proof of Theorem 20:


.

Let A. be the event that for some n in the interval 2 j ...1 4.n '" 2j
J

(Xn .., I1n /;> s

(violating the definition of convergence a ..e.)

-52-

Pr

[J

t<l"- ~)

17

nJ

~ ~ L<x" - jJ.nl
I I n - ~JJ
e
'7---(I <7~
I
j 1

2 -

Pr

j l
2 ...

{X

L.

inserting the lower bounc


on n

PI'

)1/2

<7i )1/2

from the Kolmogorov ine'qUslity:


PI'

x..

-J.'"

h1

L.. If-

<71

IL(~ ~ ~t

(JS. + X2 ) .. (!1 + ~2

<::... If;

t:,;
~

(2 + 2)1/2
<71 . 0'2

..:..-_--..;. <:

{~ <7i)1/2

and inserting the


letting k == 2 - e
bound
of n in the
2)1/2
( <7i
summation

upp~

-53Interohanging the order of summation

~ ~: 2~j

L.. 4.
-2

1=1

2j >i

Note that

since it is a geometttio series.


1

1-':'2"""
2

2j 7i

16
38

III

Now this sum is finite by hypothesis -

henoe

I l
Pr

'X --1
Proof:

fJ.

a.e.

is immediate sinoe

\' O"i

L ":"2"

,,2

III

I!2

Aj ]

which converges, i.e.

other TYpes of Convergenc::e:


1. Convergence in the mean
1.i .m.
Note:

X
n

1.i om.

III

III

i f lim E [X

n-ja>

- X

J 2 ---7 0

limit in the mean

Implies convergence in probability but not convergence .a.e ..


Ref:

Cramer -

Annals of Math. stat. 1947.

~ a>

-542_

Law of the iterated logarithm


Ref.: Feller, Chapter 8

3.

st. Petersburg paradox


X = 2n with probability

Note:

Ln

~n

=1

e.g. Toss a ooin until heads comes up .- oount the total number of
tosses required ( II! n) .... bank pays 2n to the player.,

E(x)

III

for a fair game, the ' entry fee should be equal to the expeoted
gain, therefore, this game presents a problem in the

determinat~

of the fifair" entry fee.


Ref.~

Feller, p. 235"'7 -. he shows

~ ~n~:in ~ l>~}~
1

tha.t is, the game 'lbecomes .fair It if the.. entry' fee 'is .n. 1n n.

... 55CRAPI'ER "N

ESTIMATION (Point):
Ref:

E. L~ Lehman, "Notes on the Theory of Estimation U, U. of Cal. Bookstore


Cramer -- chc 32-3

F(x, 9) -- a family of distributions


X, 9 may be vector valued, in which case they will be denoted
I

!,

8 ~

9 s

n -- parameter space

Example~ 1;

if Xl' X ,

~ ., Xn are NID(~,

i)

then

consists of all possible ~

and all positive (l

2: F(x) Fs

7'

the family of all continuous distributions

is then the space of all continuous dof 0

-- this is the non-parametric case


- might wish to estimate Er(x).

1
co

x dF(x)

"'00

provided that we add the restriction that E.F(X)


Estimate of g(Q) is some function of
close to

from R to
k

-f.L which

<: co

in some sense comes

g(Q)

Or in the "general decision theory" point of view (Wald)


an estimate of g(Q) is a decision function d(X) and we have associated with
each decision function a loss function W [
whenever de!) ... g(g)

d(!>, Q] with

W .. 0

The choice of loss function is arbitrary, but we frequently choose

Poet;)

20~

Risk Function is defined as


R(d,

~)

... E

W[ d(!),

]1 . .

co

S
-00

W [ d(!),

dF(~ ~>

- 56 -

d~(X)

... f

R(d*,~)

E(X _ ~)2

101

~
n

A t'best" estimate might be defined as one which minimizes R(d, Q) with respect to

d uniformly in

Q.

R(d, Q) 15 R(d*, Q) for all Q with d* any other estimator ..


consider the estimate d(x) of g(g) defined as d(x)

..

g(Q)
o

Hence amiformly (supposedly) minimum risk estimate can be found only if there
exists a d(x) such that R [d(x), Q
0

J..

An example would be similar to asking which is better for estimating time -- a

stopped clock which is right twice a day, or one that runs five minutes slowo
Since a uniformly best estimate is Virtually impossible to find, we want to
consider alternative
WAYS to formulate the problem of obtaining best estimates~.
I.

by restrioting the class of estimates


l~
unbiased estimates

Det. 2l~ d(!.> is unbiased i f E [d(!.}

J...

g(Q)

d(X) is a minimum variance unbiased estimate (m~vOuQe.) if


E
~

t de!) -

g(Q)

J2

is minimized over unbiased estimates d

invariant estimates
Let heX) be the transformation of a real line into itself which induces
a transformation h on parameterspaceo If d [heX}] "" li[d(X)] then
d(X) is invariant under this transformation.
Example~

family of d.f c with E(X)

< 00

Problem is to estimate E(X)


hex) ... ax + b

Ii [E(X)]

An estimate d(X) of

~ is invariant if d [h(X) ]

d(aX + b)

Q.

a E(X) + b
101

Fi [d(Xi}

a d(X} + b

Therefore X is an invariant estimate of ~ under this


transformation.
Note that d(X) .. ~o is not invarianto

- 57 -

3. Bast linear unbiased estimates (b.l ~u.a. )

Dei'. 22: Estimates of g(Q) which are unbiased, linear in the X. and which among
- - such estimates have minimum variance are b .l.u.e.
J.

Problem

2~:

Xl' X2, 0 " OJ Xn are independent random variables with mean IJ. and
and variance (12

I is the

show that
Problem 29~

Xl' X2,

W [d(X),

0,

QJ

bolou~e. of IJ..

are NID (IJ., (1 )

co

b [d(I) - IJ. ]

III

[d(I) - IJ.] if d(X) ~ IJ.

a"

d(X)"
(note:

X find R(X,

b.

d(I)"

X+a

[note~

i f d(X) :> IJ.

(1)
the answer depends on the loss function constants only,
not IJ.)
j..L,

-- show how to determine


is minimized

such that R(d,IJ.,(1)

the answer involves (z) which is defined as

-co
Comment on this

problem~

An orthogonal transformation

.- is a rotation or reflection
-- l. .., ~ where A is orthogonal

.... I J 1= 1
.~

...-....wyi = -"ix i

...... For

II

+ a.l_x :

~ a~ ... 1; ~a'jakj
...
j J.

.Lunj=lJ.J

In general if d(~) is a function of T(X .. X )


l
2
\

R(~,

Q) .. E[W(d, Q)J

f,.. f
/

V [T(X), Q ] dF(!)

et

0'

In)

- 58Making the transformation

y ... T(x)
y ...

n-1 functions independent of the first one

Then R [d(T). 9}

f'l

[T(X). 9 ]

~(Yl!

..ro.o ) elF(Y2 Y),

000.

Y</

~arginal distribution ~conditional distribution


of Y1

~
g

II.

r'l
S

III

T(x)

of Y2' Q"11 Yn
given Y1 1

[Y1 , 9 ]elF(Yl)

WCT(X), 9] elF [T(X)]

Optimum Properties in the Large


1c Bayes Estimates

Defo 23:

If

o.r

has a known "a priori" distribution H(Q) then the Bayes estimate
g(G) is that d(G) which minimizes

)R(d, 9) dH(9) with respect to d(x)

ft
Example:

X is B(n, p) and p is uniform on (0, 1)


Let W [d(X), GJ
R(d, 9) =

&(1) .. p] 2 and minimize this with respect to


Id(X) - pJ

x ... O~

( : ) pX(l _ p)n-x

Average risk ... risk function averaged with respect to p


1

SR(d, Q) dp

- 59 -

=~

2
[d (x) - 2pd(x) +

x=o

i]

xJ(:~xli pX(l,

Note:

(a{)b
) p 1 - P dp

- p)n-x dp

a b~
(a+b+lj1

o
Using this evaluation on each part separately we have

x 0

[2
nl
xl (n-x) 1
2d{x)nJ
{x+l)J(n-x)J
d (x)xJ(n-x)J (n+lp - (x)J(n-x}l - (n+2H
+

1:1

~
a

~ r

. 1
2{) _
n+l ~ 0 ~d x

1
n+l " 0
x

nl
xHn-x}J

2
[d (X)
2d x) x+l
0
n+l - n+l n+2

~(x){x+l)
(n+2)

nrr: '

X+11
d(x) ... n+2

(x+2)J(n-x)J ]
-(n+3JJ

(x+l)(x+2)
]
m+l){n+2Hn+3)

{~+1)~X+2~ _(~
'\ 2]
n+2 J

+ (!:!:!)2 +
n+2
\n+2) n+3

x+l {n+l...x
}]
+ n+2 \.Tn+2)(n+3)

This is certainly minimized with respect to d(x) if each term in the first summation
is zero -- ioe 0 if
d(X)
.
Problem 30:

if d(X) ..

1:1

X;l
n+2

?f. find R{d.ll p) as a function of p and also

io R(d~

average R...

p) dp

if P is uniformly distributed on
R(d,? p)

os

E [d(X) ... pJ 2

(o~

1)

- 60 2.

Minimax Estimate

Def~

24:

d(X) is a minimax estimate if d(X) minimizes sUPeR(d, Q) in comparison


to al\Y other estimate d*(X)

(9)

i;Je., we get the min (with respect to d) of the max (with respect
to e) of R(d, 9)
-- or we take the inf (d) sup (e) of R(d, 9)
dl(x) is minimax estimate since it

l---"L------~
_ _- -

R('1. 1 Q}

'--------

.au!~ Q)

has a minimum maximum point

3. . Constant Risk Estimates


Def 0

25:

A constant risk estimate is one for which R(d, Q) is constant with


respect to 9

Problem 31:

II
Der.

Find a constant risk estimate among linear estimates or p if X is B(n,p)

-::J~l. de~~.,,",o~y_.w~p_~..la~~@...!~.!!!E.!e (a.~.~~':~~1~..Eertie~.5)! estimate s

26: Consistent Estimates

~-

d (X) is consistent
n -

it:

d (X)..
) 9
n p

(it does not necessarily follow that E(dn ) ~Q


Problem 32:

If' E(d ) -79


n

and J(d

HO

or that d~drl70)

then d (X) is consistent"


n

(these are the sufficient conditions for consistency)


Best Asymptotically Normal Estimates (B"A"N"Eo) .- d(X) is a B.AeN.
estimate:if:
dn-E(dn '
is A., N(O, 1)

atdnJ -

2.
lim
n-?co

if

--

d: is any other Ao No estimate, then

<:. 1

.61METHODS OF ESTIMATION
-.
A ......
~

~thods

ofdi>ments

(Ko Pear$on)

.. 91' 92,u", 9 .... equate the .first


k

sample moments to

moments (expressed as functions of 91 " 9 <1"" Qk).


2'
91' 92, <I <I . , Qk and these are the moment estimates.
example:

Xl" X2 '

then

3'
.It.

'$

population

Solve these equations for

~ are NID{l.', 02t

first population moment .. ~


second population moment ... 0 2
n
~.
2
..:J (Xi"!)
~2
<I 1
(Un" being the divisor used
, ";'--n--- .. 0
by Ko Pearson)

note: This method yields poor estimates in many cases -- has very few optimum
._ .properties
.
B -- M9thod of Least Squares (Gauss -- M:l.rkov)
let X ... {Xl' X2~

all ....
A ...

<I

a nl

0"

i~e.~

s~n

A is of rank

a ns

s
..

j .. 1

aijQ j

also
Def. 28:

both X and Q are column vectors

i ..

l~

2,

<I

ioe o , the covariances

is a least squares estimate of

if

OJ

III

* minimizes

(X - A9)' (X - AQ) .. ~ (X. - aJ.'191i-l


J.
Theorem 21:

00

e -

a is 98 )2

(Gauss -- Markov)

With the given conditions on Xl' X2' Q$ Xn the least squares estimate
~* is a best linear unbiased estimate {belouve.} of ~ ~

- 62
Proof Theorem 21:

ref:

we first show that

it' we write

g* ..

= C.1

c1

= (!. _ AC-1

A'

1
AC.. A'

1949, po 458

BiometrikaJ

At X

A!. + l

with respect to y,

= (! ..

Placket:
Lehman

where

1:1

AA

=C

since it ia
symetrio

then we are trying to minimize

i.

A ~) t (!.

(! ..

AC

A ~)

AC..1 A'!. .. AZ)

! .. Al)' (! ..
!.)' (! -

1
(1" AC.. A'o!)' Al
AC-1 A'!) + (Al) , (Al)

'!r -

1 A

(! ..

- (Az) t

now the croas-product terma equal zero

-! Az + !

eClg4

since

'-1 ' ,

) A Al

A(C

(Cool)

;:I

1:1

..

Cool

Al. +

(C..1 )tA 'A C-1 C

! Al. ..

= A'A

=I

similarly the seoond cross-produot also .. 0


hence

is minimized with respect to

happen i f

Az ..

or writing

it' (Al,) , (Al,) is minimized which will

sinoe A has maxilrrwn rank

this will happen only if' y=O

= ~ (X ... a 9 - 0" .... a .9.


U 1
iJ J
i .. 1
1

-I)

(I

a. 9)2

0"

1S S

.!

formally minimizing
in the usual fashion by differentiating with respeot to
the 9
j
n
=
-2
~
a .. (Xi .. a
9 .. () .. .. a ..9 .. 0
a. 9 ) .. 0
'1r
11 1
d~j
i
1 1J
1J j
16 S

'dl

(t

;:I

1, 2,

j ..

0, a

solving these equations


n

~.
=1

a i .x.
J J

(~

i. 1

or A X (A A) Q

aijan

91 + '-'

(~

i 1

aij&is) 9 s

. 6.3 ...
To show that

so defined

(ioe.,

' ) is B.LoUoE e
'"' C-1 A!
B~m!

consider a linear estimate

B AQ '"' Q

thus BA '"' I

note that o-2BB ' is the covariance matrix of B! -- the elements in the
diagonal are the variances of the estimates ~ ...., we thus wish to minimize these diagonal elE;1ments (by proper choice of B) subject to the
restriction that BA" I

(B .. C1A') (B ~ a1A'), '"' Bit. B{A')'(C-1 )' _ C1A'B' + C1A'(Ai)'(C l )


using the relationships that BA '"' I
(B_O..1A') (:B..O...1A ')'
thus

BB 8

..

(0...1 ) I

..

0-1 ... C-l

III

0-1

'"' BB' C-l

BB' '"' C1 + (B CnlA') (B ... C1A')'


or if B C...1 A'

minimization will occur if (B .. O-lA I) '"' 0


hence

0 '... 0 or (C.. l ) t

\ AOf A '"' 0

* are

example:

if

(~'"' C...1A '!)

B~LoUoEo

JS.,

X2J

.~

common variance

Xn are uncorrelated with E(X i ) '"' ~ and

ci

then the least squares estimate of ~ is f


1

let A

.,

here

1 (we have a lxn matrix of 1 s)

o '"' A'A '"'

0...1= ~

'"' 1 ~ x... f
n 1=1

= 1"

2,

0,

t i are known constants


find b o l" u. e. of

col., ~

and assume

~ ti
1

using theorem 21

lOl

].

64Aiken extended this result in 19.34 to the case where the X. are correlated and
we mow the correlation matrix V (up to an arbitrary mul tlplier) -b.lou"eo are also least squares estimates which are obtained by minimizing

(! _ A f~) f

V I

(! _ A t~)
ref:

c.

Plackett:

Biometrika, 1949,

p&45~

Maximum Likelihood Estimates (ref. Cramer oh .3.3)

Defe 29:

Likelihood function
L

III

f(JS., X2 "

I)

0,

III

p(Xl, X2 '

I)

."

In' ~) if the I's are continuous


Xn'

Q) where the pts are discrete probabilities


if the Xt S are discrete

i f the Xi are continuous, independent,and identically distributed

L .-n-t(X., Q}

1.1'

or ln L

II:

= 0

i ..

1, 2,

Q ,

~ ln
1

(X., Q)
1

Regularity conditions needed in the maximum likelihood derivations (ref: Cramer


500-504)
1.

2.

The

are continuous, independent, and identically distributed.


~ will assume first ttat ~ 1s a scalar..
~ ln (X ' Q)
.
i
is a function of X. and hence is a random variable.

dQ

we assume that

"lln (X
';)gk

.3.n is an interval and

4.

dk In t(Xi ,
09k

Q)

<

i , Q)

exists for

= 1,

2, .3

(the true value of Q) is an interior point

Fk(X i ) which is integrable over (-

00,

00)

- 65 E [F (X )].(M for all


3 i

50 E

ta;nQfi~2

i.e., it is bounded

...-

J(O ln~f~", Q)~(,,1'

O<k2< 00

Q) dxi k2

If f , n satisfy the regularity conditions, and i f L, or

Theorem 22:
"'-Ma'

In L, has

a unique maximum, then


the maximum likelihood estimate ~ is the solution of the equation

1-

,In L
0

09
1\

2- 9 is consistent

3-

rn (~ . . 9o )

Proof:

1-

is asymptotically normal with mean 0

'9t~ L is continuo'lls with a continuous derivative,

f
n

L(a

2..- to show that

[o~l~i

since
In

and varianca

In f(X1'

has a maximum,

\1~ L.. 0

at this

if

max~

is consistent

summing each term on both sides of the equality, dividing by n, and doing
some substituting
1
-

aIn L
e- 9

1
...::i.
n i=l

s:: -

d In f i
~Q

~ [~.21n f i

+ - ...::i
Q=9
n i=l
o

Q2

9=9

( 9-9)
0

z
(Q-Q )2
+ - ~ F (x )
0
n i-l 3 i
2

the term in the third derivative is replaced by the term in


regularity condition 4a -- and multiplied by the factor z(O$z-il)
to restobe the equality
this equation can be

!..n

a In L
~Q

written~

(9_9 )2

B + B (9 _ 9 ) +
0

BI)
....

20

-----~-----------------~~--------------~---

-66note that we have:

f(x i Q) dxi 1

..00

from which we can Show:

also" differentiating a second time we have:

- 67 ..

-----~--------------~~----~----------------

Thus, by Khintchine's theorem (noe 17)


B

Let S

= the

~O

B2

2
.. ..k

> E[F3 (xi >] <M

set of points where

I B0

IL.. ,}

We can f'ind an

~~

Bl

n j given

Pr[S] >1 - s

0.11 such that

8j

In S the right hand side (rohos.) of' the Taylor series expansion is to be
considered"
Consider: Q = Q + 6
o

rohas~ ... Bo + Blo

~ B2 62

~ F} (1 + M) ,., ~ k 2 8
k2

So that" 1
Considering:

Q ... 9

<

the l"ah~Sc

2{1+M)

~ 0

2
8 +

6" 9 0 + 6) and

in the interval. {at the root)a

.. M5

f'orthe same

Hence, in S, which occurs with probability

__

~ 0 "

- 8

'? . .

%'Oot in the interval (9 0

letting z 1

= -(M + 1)6

2 +

k 26

<~(l+M)

> (1 - 8

~ a"'al~

L .. 0 has a

ln L has a maximum

- 68.1\
Q

is then the maximum likelihood estimate and the solution of the equation

! ~lnQL
n

~..,

whioh yields

($

... B + B
l
0

Q )
0

+ !2 z B (Q
2

Bo

... _ _

co

_Q

)2 .. 0

_
.

B +2" B2 (Q - go)
1

Multiplying both sides by

k..r;;

We lmow that:

1
]
V [ Bo J ... ~ E

therefore

[nB o

is

[d d

ln

AoNo(O, 1)

Bl-~p->~ ...k2

B <M (i>e()~ it is bounded)


2

9-Q---0
o
p

Thus, by the relationships we have jUst stated, and by use of theorem 18

O~

2"

1
ln

III

E[

d~

f.i

----------------~----~-------~------------Example~

f(x)'" a a-ax
Find the moloe o of a, and its asymptotic distributiono
n .
L .. an a-a

Xi

.. 69 n
In L n In(a)....

xi

1
..........
..

i-l
n

.~~

therefore

We can easily verify that E(x)

= -1a ,
c>

E(x)

=-1a

'V

or a .. -

by the method of momentso

is A N(O~ 1).

or

--------------~-----~-----~----------~-----

Problem 346
-

2 ..a 2x
f()
x a e

find the m"loe& of a

x ,>0
and its as;ymptotic distribution.

-- verify that the same result could be obtained by a Taylor Series


expansion.

Problem 35~

--

Xl is Poisson ~
find the

~lc:>eo ot

(i = 1.1' 2, "

.J

n)

and its asymptotic distribution.

- 10 Problem 36:

X is uniform on the interval

- ... Find the mol.e", of a

(0" a) ..

[not by differentiating]

_... Correct it for bias and find the asymptotic distribution of the
unbiased estimate (~!, e
-- Another unbiased estimate is

* .. 2 ..x

Compare the actual variances of 1\,;


a, a*
Note:

a * is a moment estimate -- for another comparison of the method of moments


with the method of maximum likelihood" see Cramer p., 505"

NoteZ

The meloeo is invariant under single valued functional transformations i"eo,p


1\
the mol.e o of g(9) is gee).

Remark~

If dn(X) is

afn)

A N(j.L"

g[dj is oontinuous with continuous first

and

and second derivatives and the first derivative


then

F" [g(dn ) -

g(-IJ.).J

is A" N(O,

[g'(~)J2

.;, 0 at x=1J.
(12).

From 'Which we can get

Hence by use of theorem 18

"---~)

rn (g(d~
(j

... g(lJ')] is A N(O.9 1) "


g (j.L)

------------------------------------------Multiparameter case:
Theor~~~

Under generalized regularity conditions of theorem 22" if


~ "" (91 , Q 2 ,9

"

"

~!I 9n )" then for sufficiently large

and with

- 71
probability 1 - t, the
of the equations
0

III

and furthero

mol.e~

9 , Q2" ,
1

~o

~s

13

9s

."

V"" ... n

~, S

has asymptotically a joint normal

distribution with means Ql' Q2"


matrix V-l

where

is given as the solution

i 1, 2,

~o

"0

of

and variance-covariance

13

13

l
ri
In.1.\
Vlni
d
( dQsQl
s 2)
E

9 (0

(:lln f \
o

E (d2
;:

This is the so-called information matrix used in


multiparameter estimatIOn o
Sketch of part of the proof:

oa

d::l :

L
i ~ ~l~iL toa

+ second and higher degree terms

(Note:

ln L terms can be replaced by

In f i

CI

i 1" 2, , s

n" 111 f i termso)

From theorem

22:

lnL...

~
1

ln f(x i ,!)

E( ~~) E(~ P~l:jii1) =0


a

2
. E(ad ~
Q

Ii) . E(dl~ f2:)2


"l7

We also need a set of covariance terms:


2

ln
_ E f:.d ln f i ) .. E [-( d f i .) (.

C-d Qj

Qk

~Qj

~dln fi)J
Qk

which follows in a manner similar to the derivation of the variance


expression in theorem 220

- 72 ..
Now we oan write:

------:lJo)
p

n
b ok
J

!.n ~
1

II!

The maximum likelihood equations can be written (ignoring the second degree terms
in the expansions):
~ ~

A ~ 9 )
.. B (g

in matrix form or completely

written out as:

1)

(i.e., .. each element b jk replaced by E Lb


jk

B -p....--~> E(BJ

For la~ge n~

[ J -1

B E(B)
-B
o

or:

p ) 1

(~ -

B [E(B)]-l [E(B)]

13

($ - gO\.
[E(B)

-~(B)

J -1 Bo

gO)

13

(E(B)](Q -

(P)

"

V and is non-singular.

CI

S' ..

Variance-covariance matrix of

gO =

E(B) -1 [ var-.cov
V

Example of the information matrix used in the multiparameter estimation:


0

f Xj afj b
(

...

'fCE'
a

-ax b-l
x

X;>O

a 0

ln L .. n b ln a ~ n lnreb) .. a

Xi .., (b ... 1)

defining ~ ""

b'70

~ lnx
. n
i

- 73 ~ln

Lb."

....
n . . a a ." ..
a x 0

1.
dln L.
n~b

ln a -

r/i/f~t

+.~.o

From the first equation a:;


x
Substituting this in the second equation we get:
In.2 - F2(b) +

Xr. 0

ln (' (b) )

defining

lnb-F2(b).lnX-~"

Note:

Pearson has compiled tables for F2 (b) which he called the


di-gamma function
.
..

Thus we have:
.....

'd2 "In

da 2
~21n L
b

oaa

'?J2 In L

~b2

1
a

7
n

V -

IvI-

-1a

F (b)
2

[+ ~ F~ (b) -~ ]

note:

F2 (b) is called the tri-gamma.

The asymptotic variances or covarianoes are thus:


t
)
F2(b

1\

of

of

a 1\

a,P

JVI

J1.!.
Iv/

2
F2'(b)
a
n [bF ~ (b) - 1

----rOt- - - n[bF 2(b) - lJ

... 74

...;n

b);

(~

means

Vn (~ .. a) thus have a joint normal asymptotic distribution with


and variance-covariance matrix:
2
a F~(b)
a
b

F~(b)

:l
b

a
i

b F (b) .. 1
2
Exercise:

f'(x;~,il (] ) ..

j2"";'-e(] >1

.... Find the moloe" of

and

j.l.

(]2

(x - f,/:)2
2
20-

and find the information matrixo


n

..., The
....,.

mol~e

of

a and

~ (xi" i)2

1
are x" - -n - - -

~2

~3 '6-2

are independent:J therefore the covariance terms in the information


matrix are == 0 0

.. oo.(x <00
Note~

This is the so--called Laplace distributiono


It is an example of finite theory ."" the m.l.e" is not a minimumvariance estimate for any finite sample size.

a) Find the moloeQ of

j.l.

(based on n

independent observations).

Does it satisfy the conditions of theorem 22?


b) Is
Problem

38:

an unbiased estimate of
X is Poisson ~

A"

&

Y is Poisson ~ j.l.
Find the m~l<!le(l of

~? find a2
x

why not?

j.l.

We have a single observation"


0

and also their information matrix~

.. 75 ..
Check that the same results are obtained by expanding
about

I-t-I henoe this result is asymptotic as ~

Xi

Problem 39: a

ti'
~

in a Taylor series

ClO

I" Yare independent

is

a">O

b70

j 1,2, ... ,n

Find the m.loeo

matrix.,

ot a$ b and also the asymptotio variance-covariance

- 76 ..
D.

UNBIASED ESl'IMATIONe

Theorem 24:

Information Theorem (Cramer-Rao)

If d(X) is a regular estimate of Q and E [ d(Jt)

possible bias faotor) then

I:

9 + b (Q) (where b(Q) is a

for all n
where

-- the equality holds if and only i f d(X) is a linear funotion of

)~ngf

Regularity oonditions for the Information Theorems


1&

is a soalar in open interval.

JS.,

~,

ou.t

n are independently and identically distributed with density

f(X.? e) 0
2 '"

t~

exists for almost all x (the set Where; ~


on e).
(Noteg

Problem 36 where f(x)

I:

does not exist must not depend

~ (i.e o uniform on (0, a) ) would be an

exception to this oondition) 0

'3.

40

f(x, 9) dx can be differenUatedunder the integral sign with respect to 9.

d(x) f(x" e) dx can be differentiated under the integral sign with respect
to 9&

5. 0<k2~
Proof!

CD

From the proof of theorem 22 we remember that

Differentiut:i,ng both sides of this

~quation,

we get

- 77 -

wh 1.0ch ca n al so b e wrl.Ott en E

k(!) 4dl~
L

The correlation coefficient of S<,~)

(E[S(~)
-

Note~

O"d(~)

d<l) ] )2
2
O"s (:1)

(S<!>l

E [d<!:)]

I:

1 + br

Oln f(X)

dQ-

and d(X)

term vanishes since E CSl.!)

] =0

or

now

therefore
[1 + bR(e)] 2
'D in f (1)]2
n E -'09

Since r = 1 if and only if the random variables are linearly related it follows
that the equality in this result holds if and only if d(~) is a linear f!J.nction of
S(.!)
!2rC'Jrlplea

JS.,

X2 > ~I);"

2.
2
Xn are NID(&J" 0" )$ 0" known.

Find the C.,R lower bound for the variance of unbiased estimates of
jJ.)2

2: (\ _

L=

1.
(2 n 0")n/2

In L = K ""

2 )v

jJ.1)

- 78 -

.~

Hence any regular unbiased estimate d(~) of ~ must satisfy a~


2

but

ax

*2

Example (Discrete):

"minimum variance unbiased estimate (mov.u.e,,).

x is a

hence

Binomial

x is B(n, p) -- find the C-R lower bound for the uJiased estimates at p.
L

n )

=( x

In L

pX (1 - p)

K + x 1n P + (n - x) In (1 '" p)

:or

din L. ~

op

.. ?;}

n-x

In L
p2

E.:!

-:-

I-p

x
=~
P

;:

['d J= ~

Hence

In L
} p2

i >- R\l-p)

~:

d~

but

(!)

YI

'\

\-f )

n-X
(1_p)2

.. E

Y- '..L~~)

- - 0()

n(l-~

(l-p) .

n .
p(l-p)

= n k2

= E(l-p)

Recall that the C-R bound is achieved if' and only if de!) is a linear function

"a'dlQ.f --

of

if

is not a linear function of

d~~:f

then ~ will not achieve

the C-R 10W'er bound.


Problem
Example:

40:

Find the C-R lower bound tor the unbiased estimates of the Poisson
parametsr A. based on independent observations X:t" ~, ..., Xne

Negative Binomial (Ref. Lehman ChI 2" P. 2-21, 22)

L = Pr(x)

In

L = In p

= pqx
+

x In (1 - p)

1
x
=--p
I-p

i.e., sample until a single success occurs

.. 79

E(x)
+

p(l-p)

= I-p
p
= n k

p (l-p)

thus
To find the M.L.E" of P8
1
x
=- - - o r
p
I-p

1\

p .., l+X

- .. it oan be shown that this estimate of p is biased,

E [d(X~
or

= d(O)p
do +

+ d(l)pq + d(2)pq2 +

~q

+ d2q2 +

.00

c~o

+ d(n)pqn +

+ dnqn +

ooeo

fOO

5 1

If this power series is to be an identity in q we can equate the coefficients on the


lett and on the right
do

=1

=0

The unbiased estimate is thus

~. == 0

p* = 1

if x.., 0

=0

i f x~l

ioe .., the decision as to the status of the whole lot is based on the first
observation"
This is unbiased, but o~o.o
(this example is used to show Why we don't always want an unbiased estimate)

i *' = pq~p2

(the C-R lower bound) -

thus the C-R lower bound can't be met

p'

.' since this estimate is the only unbiased estimate by the uniqueness of power
series o

- 80To find an unbiased estimator we can try to solve the equation

.J
co

de!) f(~" Q)

dx

==

for the continuous case.

or
for the discrete case.
These equations" however, are in general rather messy and difficult to handle"
Problem 41. f(x)

k xk- 1

1.

Find the C..R lower bound.

Is it attained by the M.LoE o?

exte~sion

Multiparameter
We have

I:

fey 91 , 92"

'"

of the Information (C..R) Theorem


9k ) ..

Assume the same regularity conditions as for Theorem 24 in all the Gts.
Denotes

>tl

""ooe

~k

..
..

A=

"

"
~l

. / \ is non-singular

..

"""

as in theorems 22, 24
Let d(!.) be an unbiased estimate of
BO

~(~)

that

"J.

f{y 91' 92 , "., 9k )

By differentiation with respect to 9


1

E [d

81]

=1,'

By differentiation with respect to 9


Eld8 j

==

O~'

= l\ ~

- 81 Theorem 2$,

Under the regularity conditions stated

where

1. = (1"

O~

000"

0)

k components

or

~2

~k

fle.

0
0

'1ca

ad:!.

"u 0..

~k

f)
(>

"
0

e..

E [d:!. - 91 constants to be determined.


Proof:

Observe that

a i 8i

J2~ 0

where the a's are arbitrary

Which then becomes

k
ad:!.2

2 a1 "" ~ ~ a. a j A.
. 1 J=
. 1 J.
J.J
J.=

Call the right hand side of this inequality

r(J<,!)

We wish to maximize<:V(a) with respect to a to get the best possible statement about
the bound of the var'iaii'ce..
k
(1) (!)

_ _ _ _T

~
.L--.J;l.:=~l

= 28- -

a J.. AiJ.. -

i"

a.a j X..
J.
J.J
'---'---=-------

'_

- 82 -

d~~CP
or

=2

2~>-.u

- 2 ~
.. 2
J=

a j "\A' j
~

j=l

or

A..

~J

=1

~ a.

j=l

Aij

=0

Hence the equations in ! to maJdmize fJ(~) are


k

a'
j=l j

A:t. =1
J

~ a~ ~ .

j=l

where a

= 0

= 2, 3~ OQO~

is a maximizing value of a j:';.,.

How, multiply the i th equation by a? and add all the equations.


J.

The left hand side of the sum so obtaimd is


k

i=J.

a?
J.

~ a~ A. J~

j=l

J.J

=~~

;max

Therefore

(~)

aO

0 A.

i~ j. i a j

The right hand side is a~"


Hence (ZJ

= 2 a~ - a~ = a~

''ij

- 83 -

or~

as stated in the

O"~

v
I1

theorem~

>'22

C$$

c-

"

Pel

"

\:2

D1

o_1t

.1:.

An
0

III

E [SiSj

8.

1.

Examplei

00.

d 1n L
=-e"

A:Lk
~

Aij

--

7'1':k

"

<>

Remembers

~k

-ax b-l
ab
:rex) = - 8
x
reb)

x>O

-- the generalized Gamma. or Pearson's Type III


... a, b are the unknmm parameters
Find the lower bound for the variance of the unbiased estimate of a",
In L

=n

In a - n 1n
nb
III -

denote

82

= .~lb L. = n

Au=
A_
."2 2

r(b)

2
E [ 81

r (b) - ax

+ n(b...1) 1 n x

..,
-

In a .. n

fo [

1ll r (b )]

J.. F2 (b)

-1 = -

= E [822 J

:I

'd 2

J.n L

In L

nb

da"2-- "";~
2

d b 2-

= n F~ (b)

+ n 1n x

- 84 -

-1

F~{b)
a F~(b)
2

=--~;;...---

n [b F~ (b) - 1

- ~)
1

Note:

See the

e~'l"lJ.ple

for Theorem 23.

~I We started the proof with E [cL

J.

- 91

- ~ a.
i=l ).

_2

s. ]
).

~ o.

The strict inequality will hold unless


oonstant"
Therefore;

[d:J. - 91

~
1

a i 8i ]

If the multiparameter C...R lower bound is attained,


k

function of

'i.

is essentially

is a linear

~ a. 8.
i=l ). J.

and, as before, the M.. L\OE. (corrected for bias :if necessary) attains the
lower bound (IF there is any estimate that attains that lowe!' bound).
Problem 42:

P:r

X:t"

QUI

[x = x ] =

Xn are negative binomial


1'

X -

l' ..

1\

) pr qX

p +q

=1

Find the C-R lower bounds for the unbiased estimates of p

1.

If

l'

is known

2.

If

l'

is unknown

85
Problem 42 (b):

Show that it I
with density

I A.

is Poisson ( A ) and A. is distributed


b
f( A ) ..!.... e- a >. Ab- l

f1b)

then X is negative binomial.

Identify functions of a, b with p,r.

~-------~--~------~~--~----~--------~-------

Summary on usages re m.l,e.:


We have the following criteria for estimates:

1. Efficiency (Cramer) -- attains the CaR lower bOWld.


2"

(Asymptotic) Efficiency (Fisher) ...- (n) (variance) ~ ~ in the limit.


k

3. Minimum variance unbiased estimates.

4.

Best aSYmptotic normal (b.a.n.) -- among A.N. estimates, no other has a


smaller asymptotic variance,

Properties of m.l.e"

10

If there is an efficient (Cramer) estimate (i.e., it the estimate is a

linear function of
2.

Jo

d~nJ=.

) then the m.l.e. is efficient (Cramer).

If the m.l.e. is efficient (Cramer) and unbiased it is JIlc,v.u.e.


m.l,e. mayor may not be m.v.u.e.

other

Under the ~neral regularity conditions, the m.l.e. is asymptotically


effie ient (Fisher) 0

4& Among the class of unbiased estimates, then the m.l.e. are b.atln., otherwise
(ioe., if the m.l.e. are biased) we can not say that the m.l.eo are necessar
b.a.n.
ref: Le Cam; Univ. of Calif. Publications in Statistics;
Vol. I, No,) 11, 19.$3
-- The class of efficient (Cramer) estimates is contained in the class of m.vou.e. -which is the real reason we are interested in attaining the CaR lower bound.
-- 1, 2 are finite results, i"e. for any no
- 3, 4 are finite (asymptotic) results -- for large

n only.

.. 86 -

Eo

MilV"u.eo "".. COMPLETE SUFFICIENT STATISTICS:


."" Let X be a random variable with density f(!., ~), ~
~-

eno

T(X) is a statistic, ioeo, a function of the random variable (~)o

"".. Assume that the conditional dofa of


De1' 0 .31~

X" given T. t, exists",

T(I) is a sut'1'icient statistic for


T

1:11

if the distribution of
is (mathematically) independent of
that iS i f in
Q

1'(!.t 2,) g(T,9


h is independent of

.)

h(xlT)

~v

llt~!!:

NO e 1 -- suppose Xl' X2~


f(x, ~)

..

1. n
(.[2n)

CJ

Xn are N(~, 1)

~(r.i ... ~)2


e'" . ; 2 -

is thus sufficient for


.- distribution of

~::>

~x is N(x,9 1)

No.2 ..... Poisson:

~Xi) ~

Tfii ~

"'------v---------""
heX IT)

X given

- 87 -

~Xi

~ Xi'

is Poisson (nA); given

IJ!1ltinomia1 and T

=~ xi

n i

the individual observations are

Ao

is sufficient for

NO(J> :3 -- Normal

Here sufficient statistics for

IJ., ,,2

x;

are

~ (xi _ i)2
1

Problem 4:3~

Find sufficient statistics (if any) for the parameters of the following
distributions:

a--

a . e-ax xb-1

f(x) =

Gamma

x>o

((b)

b--

power

f(x)

=k

C"

Beta

f (x )

III

d..-

Cauchy

f(x)

x k- 1

pta,1

O~x~l

xa ...1

b)

=; (

O~~l

2)

1 +(x .. lJ.)

e-

nego binomial

p(x) ...

f--

normal

mean:

.,(, +

variance;

(1.. x)b-1

+ r 1'" 1) pr qX

r-

~ t

mown

,,2

--------------------------~----------------

~~:

for any real number


E

[X - 0)2 ~

ETfEx(XLT ) _

~2

..
- 88 Proof'~

E [x21 t]

where t is any particular value of the


statistic Til

[E(X t)]2

E [XJ ... ET[EX(Xt T)]


E [X 2].

~ F~[E(XI T)J2

E.r[EX (x2fT)]

we get

replacing X by (X - c)

tx -

cJ2 ~ ET[Ex(X

Proof:

coo

c f T~2

ET[Ex (X/T)

co

cJ2

Since T is suffioient then the distribution of d(~) \T


of Q and hence

is independent

t(T) ... Ex~(!)IT1 is independent of ~e


1-

2-

E[~(T)]" ETfEx EtC!) IT])

In the result of the lemma put de!)


then it follows immediately that

O'~

..

E[d(!) -

= g(~)

... E@(!)]

g(~IT2 ~

III

ET['t'(T) -

g(~)

.. c

g(~)J2

~ 0'2

/' 'r
" : .. Of

.. 89 Examples:
1....

Binomial

Xi is B(l, p)

1:1

1, 2,

.,

x ) pI(l _ p)n-X
n

OJ

where X

xi
1
(since f(x) is ordered, the coefficient
term is omitted)

X is a sufficient statistic for p.

JS.

E(X ) ... p so that


l

'C1

xl

is an unbiased estimate of

pel ... p)

eo

.... by the theorem


(X)
smaller varianceQ
Xl

in n trials we have

PI{Xr
Pr[Xl
Pr[Xl

1:1

success I X]

= olx
I:

llx]

X successes
II!

X/n

-1 .. -X

"" n

therefore

44:

...

Negative binomial:
show that

10

2B
find E[Xli
Example No, 3:

will be unbiased and have

1 if SUCCEUSS on trial 1
0 if failure on trial 1

II

II

2Problem

p:>

xl"

-Xn
lmown

X is sufficient for p.
(1 = Xl) is an unbiased estimate of p,
nO"iiegXi == 1 if faUure on trial i
0 otherwise

..

(0, s)

Uniform distribution on
Z ... max(Xl ,
E[X i ] .. ~

'f (Z)

II

x2'

01

so that

E[2Xil ZJ

Xn)
2X

II

~Xi

is sufficient for

So

is an unbiased estimate of 9.

will be unbiased and have smaller variance Gl

or>

proof~

90 -

If Xi is one of the observations less than

uniform on (O~

Z then

J 2( ~ ) ... Zc

If Xi is the maximum (1. e ~, ... Z) then


E [2xil

zJ

I:

...

is

Z),

then E [2x1IXi .c Z

Thus:

Xi

J 2Z ..

E[ 2xi Z

~l Z + ~ 2Z (1 - ~) Z + ~ Z
(1 + 1) z ... ( IL,+ 1 ) Z
n
n

4- Given that X is a Poisson random variable,?


observation) ..

T'" ~
Xi is sufficient for
~

we know that

A and

.J.

P (T) ... e-nX

(alSO X is mcl~ee for

(~;"j

(see example No

~]

e- A

PrLx 0]

..2

so that

is an unbiased estimate of

Define the estimate of

.. 0

if X.J.' :> 0

then nO"

~Y.
1

1.

e-'A.

2 on po 86 )

A and e-X is m~loeo of e- A ')

What is an unbiased estimate of e -X ??

E[

e-'A. ..

91III

III

.\j..J

To show that I (T)

III

1 -

n1 )T

r.

1 Ln(
Ii

III

T 0

1 )T]
Ii

is unbiased:

CD

E ['reT)]

1 ..

(1 - l)T
-n

( 1 _ ~ )T e -n>..
n

(n>..)T
T :

.~ en>.. [(1 *Hn>..)I:: -e-n>.. e n>..->..


T :
. Problem

45:

III

e ->..

Find the variance of


Compare it wi. th the CoR" lower bound for unbiased estimates of e->'"

--~------------~---------------------------

Sufficient statistics are not unique:


Example 1:

Poisson

f(x) ... (

~xi I ~f

is sufficient; but

{2n

!n = X

is equally goodo

)n

are sufficient for

""~

however

~Xi

gives no more information about I.L than does f

is not necessary and

92 -

Example

4: Same as No o 3, but
then

T...

~ (Xi

this set gives t.ha same inforM


mation (need ~o combine them
to estimate a )

is known"

lJ.

"" lJ.)2

is sufficient for

cr2 ~

-~-------~~~-~~----~-~~~-~--~-------------

Remark:

(aimed at problem 43.l1 but holds in general)


The i'llol:le~

is .~function of the sufficient statistic(s)

since L .. g(t, ~) h(b !)


ln L ln g + ln h
III

since it is
independent of

The solution of this equation (which gives the Molee.) obviously


depends only on T"
D!f. 32:

A sufficient statistic :is called complete if


implies

, Theorem 27:

Proof:

-=

h(T)

O~

=9 denotes identically

equal in

If:T is a complete sufficient statistic and 4(T) is an unbiased"


estimate of g(Q) then d(T) is an essentially unique minimum
variance unbiased estimate (mOv0uQ eG)e
Let

dl (T)

be any other unbiased estimate of

E [d(T}]
Q

II

g(Q)

gee); then we know that

., 93 Eg[t\(T)]

g(Q)

thus by subtraction Ee[d(T) - dl (T)

J..

0 for all

Q"

The completeness of T implies that d(T) - dl(T) 0


d(T) ~ ~(T).

or that
Further -- let d i Qf}
that E[diQi T)]..

(see defo 32)

be any other unbiaS~d estimatorG


reT)

holding only if

o~ ~ O~t

is unbiased and

We mow from theorem 26


with the equality

is a funotion of To

dv

'f (T)

But, by the first part of the proof

III

d(T)~

Hence, if we start with an unbiased estimator not a function of TJ we can improve


it. If we start with an unbiased estimator that is a function of T, it is d(Th
This is the contention of the theorem
Q

Example No"

1:

X is

Binomial

B(n, p)

For completeness of X; which has previously been shown to be a sufficient


statistic$ we need
n

x .. 0

heX) ( ~ ) px (l...p)n..,x

~a
p

======~~> heX) .. 0

The left hand side is a polynomial in p of degree

n"

For this to be identically zero in p implies that all coefficients are zero
which means h(X)" 0 for every X" 0" l~ 2, ~ 0 ~~ n,
therefore

heX)

hence
Example No.

2~

pta

and X is a complete sufficient statistic"

~ is m.vouoe e

Poisson
T

III

~Xi

is sufficient

and is Poisson (nX)o

For completeness, we must have


co

~
1;1

h(T) e.. nX (n;i


0

=0
1.

implying h(T) .. 0

on the integers

- 94 CD

or

h(T) (nA)T

T 0

TJ

1\

Such a power series identically zero means each ooefficient "" 0


so that T is complete ..... therefore

3~

Example No o

Normal

Xl' X2 ",

<)

.,

T
n

[h(!}

N(~, 1)0

Xn are

N(lL'~)

e~ ~]e-nX1L eli

~s

CI

IL 0

is a bilateral Laplace transform

By the theory of Laplace transforms (ref~ Widder Ws book) if the Laplace


transform
0 identically, then the function = 0 or
nX 2
heX) e"" -r
0
which implies that h(X)'" 0,

=-

III

therefore
If

Xl' X2' ~

complete for
Remark~

(J.J

X is completeo
Xn are

0",

2
N(lL, a)

then by a similar argument

(..X, s)
2

are

a2 c

lmy estimate which is unbiased and a function of

g( lLJI aC L

of

In particular, if'

2
2
g(lL, a ) ... lJ.2 + a 2 "" E[X ] ,

!n ~X2i

is an unbiased estimate of

~ ~X~

2
-2
.. (noel)s + nX
n

J.

= 0,

A,9

is m.v.uoe. of

is sufficient and is

-fin- .- '!f S

or h(T)

1J.2+ a 2

1 ~ 2
so ii A::d Xi is me V 0 u. e.

of

2
2
IJ. + a

- 95 !:oblem 46;

Find m"v"upe .. of Ap where Pr[x (hpl "" p"


and Xl" X21

."

are

Y1 " Y2,

~p

In are

Xn are NIO(~I a )

N(~.t

a)

~(<; a2 )

,
Sufficient statistios ar-e

2
(X$ Y', sx'

s;>
~

(i~e","

we have four suffioient statistios for three pal~ametersip


thus this suffioient stitistic is not oompleteCl
for oompleteness~ the suffioient statistic vector must have the same
number of oomponents as the parameter vector.
for an example of what oan happen by foroing the oriteria of' unbiasedness
on an estimate p see Lehmann p~ 3~13, 14@
Non-parametric: Estimation (m"vouoeo)
Let
The

1S.,

X2 'p

prob~e~

().t X be independent random variables with oontinuous dof. F(x).


n
is to estimate some funotion g(F), e.g"
9

00

g(F)

IS

)X dF(x)

a:

E[X]

provided E(X]

<.

co

-co

g(F)

i o e v5 we want to estimate the density function

F(x)

. g(F) "" F(a)

g(F)

g(F)

1:1

F(b)

""
~

Pr[x ~a1

F(a)

(Xl" Xn) such that F(Xn ) - F(Xl ) ~ 1 (two-sided toleranoe limit problem)

co<.

., 96 Theorem 28g

If

d(Xl ; X2.l1

E&qp J =
Proofg

()

0,

g(F)

JS."

In) is symmetrio in
then de!)

Suffioient statistic is T

is mov.u.eo

= (~Xi" ~xi"

X2"
of

'I

0$

<l

<)

0$

Xn and

g(F)"

~X~)e

Consider the n equations:

..

(I

These equations have e.t most

nJ

solutions 0

Assume (as is true with. probability 1) that all the I's are distinot -'" :1:f.'
x_, X2S 0 0 "I XIt is a solution, so is ar.:y permutation of the XiS"
-j,
There are nJ
solutionsQ

permutations of the Xts se these give a oomplete set of

Since the sufficient statistic may be regarded as a set of observations"


order disregarded" any function of ~ is s,ymetric in
X21 0 e I , Xn~

Xl,

To show completeness, we must show


) geT) dF(T) F

==)~>

g(T) .. 0 "

Consider the sub..family of densities

,
i

... 97

where:

p(tl' t 2J

t ) is polynomial in the t's obtained


n
from the Jacobian -- it does not involve the Q's and is
non-negative.
0

.,

..~ Qjt

CD

Hence, if

g(!.) e

jUll

p(~)

-CD

by the same type of theory as :in the normal case (uniqueness of the bilateral
Laplace transform) this implies
or g(l)'" 0
Example 1:

EstiD:ate E(X).

Xi,

X2, 0 0 .~ In'
further, it is unbiased so that by theorem 28 it is mj)v.u.e.

e
Example 2:

is symetric function of

Estimate

F(x).

We have, from the sample I

l..f------~-----~--:---===--:.:::::,,=-,..-.i;'-'~derlying
population dof.
sample d
-<.-:",~\~

01

.\1:::;> ..<'~"

The observations here are ordered


in size but not by observation
order (chronological).

. ?
j'

Fn(x)

~ (:I xi' ~x)

III

!.n .~1 ~(x)


1-

where

~1-' (x)

III

if

Xi(X

]..

n
F (x)
n

= .n i=l
~

Wi (x)

Ii

indicates symmetry in the xts

It then remains to show that it is unbiased$ i.ee, for each x: E[Fn (x)
~.

J . . F(x)

For each fixed x the number of x 'Ex is a binomial random variable with
parameters [n, F(x) J-- and henc~, since the sample frequency is an unbiased
estimate of the binomial parameter, we have the required unbiasedness.

Remark~ Fn(x)

) F(x)
p

for all x

[prOOf given later]

Ii

.
98

Problem

47: Find the mov o u e4\ of


Q

Pr[X '! a]

(a known)
with dof" F(x)o

~xi

liS

is sufficient for

(12,

among functions of the form aS~

(l

a)

find the minimwn risk estimate of

b)

is there a oonstant risk estimator of

of the form as + b?

(12

~----~---~----~------~~_._-----------------

Fit x2.estimation
.- most. generally applied to multinomial
~- us~ associated with Karl Pearson~
ref ~

Ferguson.... Annals of Matho Stat 0

Multinomial

situations~

_.

58

Dec 0

Distribution~

Given ..., a series of trials

with~

possible outcomes of each independent trial


outcome~

probabilities :tor a given


and

n experiments result

in:

v1 v 2

with the restrictions that

~Pi

I;>

vk

~v
... n
,
1.

Characteristic"function~

~
T

vl' v.2' c , v k

(t1 , t 2,

.,

t k ) .. ~

(I1.

v1

:l

n
v 2,

t
e

e 1 +

u.

0,I

0".

~~

(Ple)

Ge.

(p e k,
k

- 99 -

(t 11 0,

vl' v 2 OU$ v k

.00'

0) "" (Pie 1 P2 +

II

[l1.et l + (1 -

GO.

+ Pk)n

I1.~ n

i.e., any one variable in a multinomial situat:.ion is


is binomial B(n, Pi)~
M>ments~

Asymptotic
a)

b)

distributions~

""

vi-nPi
--~ nP (l-P )

is AN(O" 1) as n --.o.J>)' CD

k
k
2
~ 2
~ (vi'",np.)
~z. = A ~. has a

i=l].

i-l

nPi

.. ')
x""-distributionwith k-l d.f o as
.

(see Cramer for the transformation from


k...l independent variables)

v l' v "
2

~ ~

parameters

a,

n-7(Q,

k dependent to

v _ have a. -limiting multi...Poisson distribution with


k 1

h:t.t

~~

I)

0" \:...1 ("i'

v2" c o o , v k-l are independent


in the limit)"

------------------------------------------Example:

(Homogeneity)
We have 3 sets of multinomial trials with possible outcomes Ep E2" E )
3
with probabilities 91' 9 2,91 - 91 - Q2 for each set of trials.

- 100 Therefore we have the following outcomes


v

v l .3

n..t

22

v 2.3

v.3l

v
32

v3 .3

vol

v. 2

v.3

12

21

In L = In[n.. 1
V ll p v12
-~
12

These give the two


(1)
(2)

n.. V31.3 p v 2l p v 22
.~
21
22

P v 23
23

equations~

91 + vol Q2

(vol + v c )

v l2 Q1 + ( v $2

v 03)

-I-

Q2

vol

v 02

which when added together yield


(3)

N Q1 + N Q2

v"l

+ v 82 = N -

v 0.3;.;

Multiplying (3) by N,,2 we get


(4)

v Q2

Ql

V 02 Q

v 2( v 01+ va 2)
N

p v31
31

p v 32
32

P v337 + 1n K
33

- 101 -

It can also be found that

-----~----------------------------------~-

!!oblem 49:

( Independence)

Consider one sequence of N trials that result in one of the

where

i=l

Given:--s

p.

1 ;

IlS

j=l

J.

1:3

1 "

general case

serie~

p ..

't"j

and also their variances and covariances.

Pi" 'l;'j

of n

trials, each trial resulting in one of the events El1o .. ,E

_The probability of Ej
k
j=l

events

Find the lll"l"el} of

x2.estimation;

rc

=1

occuring on any trial in the

i th series is p..
J.J J

J.J

-the random variable in the problem is the number of occurrences of each


event:
k
o

-the PiJ'

G,

v ik'

are functions of

, Jl:ll

vij

= ni

(continuous with continuous first and second


derivatives) 0

_expectation is with respect to

I'jlf

... ioe." E[V

ij ]

= ni Pij

the @ti" subscript is a convenience, we could consider the


big trial"

-. the use of

s trials as one

.. 102 -

The following methods of estimating .i. are asymptotically equivalent, ieee:l


(i) they are consistent
(ii) they are asymptotically normal
(iii) they have equal asymptotic '\rariances
lE>

maximum likelihood
2
m:iriimum X

20

3" modified minimum x

transformed minimum X2

4$
1(1

Maximum Likelihood Estimation:

Maximize

k
~

sUbject to ~ Pi. (~) = 1


j=l
J

for each i,9 with respect to

Q;

or actually solve the equations:

~.~ ~ ~,~ d Pti =


i

p~
:'.J -

'9

" -

(provided su.itable regularity conditions are imposed as given in


theorems 22, 23)~

(vi - niPi)
,J

ni Pij

is asymptotically distributed as X2 with s(k-l) dof p

The method is to minimize this expression with respect to


the equations~
.. 2

This set of equations could be expressed as:

or to solye

therefore the second term


R the third tel"ll1

Hence, equation

r2] is the

eql~ls

>0

as

0
N ~oo

same as equation

[1] except

for R and thus it is seen

that under suitatle regularit,y condiGions the solutions of equation [2] tend in

probability to the solutions of eqtlation [11.

Replaoe Pij in the denominator of the regular minimum ,,2


estimated value, and then minimi3e with respeot olio ~:
2
Pij
i
XI~i1' .. ~ ~ (V;ij: n )
i j
ij
subject to

Pij 1

for

equation with its

i 1, 2, , s.

Comparing the two ,,2.,. methods

X ... Xm

III

...
R.
are

2
) 0 faster than does X --....aS~l!l.pi.;otioal1y equivalent.
p

therefore methods

(2)

and

(3)

The equations to be solled in this oase are:

~~

40

Transformed

ni

(V";j.:'ni~ij)
'''.,....,'1

1I'~i.'n''!!LL

Recall that 1

In(Xn..~)

-b;-

is a.symptotioally normal~ then

and the asymptotic variance of

g(X )
n

is

cf[

S(X )-g(~)

Cng'(~)

g t (IJ.>] 2

is ll.N.

104 Before writing the necessary equation, write

vii
qij .~

thus the qij

Thus the modified minimum

x2

X; ~ ~
i

are random variableso

equation can be written:


2
~i (qiJ - Pi
.
qij

32..

The equivalent estimation procedure is thus to minimize

~~
i

D1

[g(q~)

g(P~W

Qij Lg.(qij>]

(the numberator should be Pij [gf(Pij}J2 but the difference


occasioned by the modification
>0)
for gts which are continuous with continuous first and second derivatives" and
with g' (Pij ) bounded away from zero~
This method will be useful i f the transformation; gl!ij{~~

is simple.

The equations to be solved are thus;

Summary:
.

ni [g(Qij) ..

qij [gl(qij)]

Method (1) [mol"e"

Method (2)

g~;ij) ]

J... is most useful if the

Pij
products, i.e., Pij Pi'Cj

2
[minimum x ] -- most useful it Pij

can be expressed as
as in problem

49~

can be expressed as a

sum of probabilities, i.e. Pij .. .,(i + Pjo

J-

2
M9thod (3) [modified X
Method (4)

same as (2).

[tranSformed x2] -- may be most useful in some special cases..

\) t ,1----1

t V'\\.
01-) / V\,

\) '2'

- 10.5 -

Examples of minimum - "

prooedures:

1. Linear Trend:

-vobs.

p.)

ni 2

total

prob~

Pup-A

12

P12- l - p + A

v2l

P2l P

22

P22 1 - P

Il

31

31

., P + A

v\,

v
32

P32

III

1 .. P - A

Problem is to estimate both p and Ao


2
Using the modified minimum_x
procedure:

l [vu""':J.(P-d)j2 [V~l + V~2] + ("2l-n2"Y[~ + "~2] + [".3l-n3(P+d) 2 [vi + V~

-t ~~2.

~ [vll"~(P-A)J +
where

a 2 [v 2l-n2P] + a3

rV31.. n3(P+A) ]

a i ni'...L +..L)
Vil

i2

These tw equations yield the following two equations which can be solved simultaneously for p and A ~

Note:

remember that this procedure is asymptotically equivalent to the m.l.e o - therefore the asymptotic variance-covariance matrix can be found by evaluatin~

!.2 ).2y 2:

~#

! ~2y2

2U'

l_d_

2' ~ at the expectations since - ~ is the exponent

of the asymptotic normal distributiono

- 106 -1
a..
. J. = n1 ( v

l)
+
v12
ll

Recall

E[vlIJ

Replacing the random variables in

= ~ (p-A)

by their expectations yields

1
1
I
--p--). ( p-A + l ..p+l ).. {p..A) (i.. p+A)'
I
p 7 p(l..p)

Similarly~

...

thus the inverse of the Ao V-COV matrix is

-----------------------------------~-----~-

2e

Logistic (Bio-assay) problem (Bergson)


... applying greater dosages produces greater kill (or reaction)
s

= general

xl'

~$

ni

=2

xs

dosages

0,

ns

receive dosages

09

va

die (or react)

o~

nI" n 2"

vI,\l v 2'

:>

e-(-<+~xi)

= l+e.(-<+~xi)

1 .. Pi

Vi

q.J. = -n

=- proportion dying or reacting

107

2
Applying the transformed- X method to estimate

Putting in the values of

g, g',

.,/."

and using the transformation

we have

,,2 ~ n.
i=l

---------~-----------~----------~---------

Problem

50~

Consider rc sequenoes of
where in trial (ij )

n trials whioh 'lrS.y result in E or E

i
j

Set up equations to estimate

"/"i"

10

maximum likelihood..

20

minimum modified X2

~j

by

based on observations v ij ' n - vij

1,,2, 00 ."r
1,2,.06,0

- 108 Problem as stated produce.s r+c equations in r+c unknowns" but their matrix
is singular -- i.e., the equations are not independent by virtue of the fact that
the swn of the I' equations for the .,(i equals the swn of the c equations for
the ~jfl
Therefore, reduce the number of sequences to
and add the restriction that

P22

CI

4i

P12 -+ P2l - Pll

ioe o " reparameterize and find the equations necessary to solve for the
new parameters<ll

Given

51~

Problem
s

sequences of n i

(let the number of Eis

trials may result in E or E where in sequence

observed be denoted by vi)o

Set up the equations to estimate


l(t

maximum likelihood"

20

2
minimum lOOdified X "

A by

and find the asymptotic variance of


(noteg
G~

1.

Ferguson discusses this problem in his Dec.

-Minimax estimation:
Recall that

d(!) is a minimax estimate of


sup R~,
Q

Q]

R[d, Q] E[d(!> -

58

g<.~) if

is minimized by dqf)

g(~r

[see def e 24 on p.

article in the Annals)

[d(,y - g(,2.>r f(!.)

54]

- 109 Methods to try to find minimax estimates:


1.

Cramer-Rao

cl;:"

[l+b'(Q

d ;;;..-

nE[d ln

E[d{~)

g(~)J2

x ~1

J:.:l+ b'~lL

nE

~(!) -

III

<{ +
Rld,9] >-

,/

f1+b (9)1
nEI~ In

-an92f(X~J

[. d

E[d{!)] + E[de!)] -

Lb(9)}2

+ b 2 (Q)

k(Q)

f!&12
J

dQ

If we have an est:iJnate
d(X) which actually has the risk
k(9)
estimator with risk less than k(Q) leads to a contradiction.

Example:

~"X21 ~

0,

J2

g{~)

Xn are N(~,

then any other

J)

2
and we know R(x). ....2n

Lehmann shows that

which implies that b(Q)


Therefore X is a minimax estimator of
2..

~0

This method based on the following theoJ:em which will be stated without proof G

~eorem 29:

!.f ~ Bayes estimator [deft 23] has constant risk [defo 25] then
1.t 1.S minimax/}
d(~)

is constant risk if E [de;) - g(9)]2

de!)

is Bayes if d{!.) minimizes


where

ref:

Lehmann -- section

Example:

4,

is constant.

[d(!.) ..

gee) ]

G is the Ita priori" distribution of

f(x, Q) dG(9)

Q.

p. 19-21

Binomial -- X is binomial

B(n p p)e

Recall that we found that

d(X)

rn

!. +

1+.Jn n

1
2(1+Jl}

is a constant risk estimator of p [see problem 31" p.

54].

- liOhas a prior Beta distribution

If'

.f'n"

f(p)
then d(X)

I:

K P2"

12. 1

(1 _ p) 2

is Bayes;

hence tJ?,e minimax estimate of the binomial parameter p

rn_!+
l+Jii n

1
2(1+$)

is

Graphically we have the following risk functions:

1
rn-- - -- - - - -.---.-,.

.It.::;

o ;!.

-+-

Problem 52:

Compare

( d

Ho

Efd (X)_p,2
l - 'j

Rd' (p)

and Rd (p)
m

for

I:

:e(l-p)
n

n =25..

X
- ; d = minimax estimate)
n
m

Wolfowitz is Minimum Distance Estimation


Xl' X2,

.~

Xn are observations from the distribution F(x,

Q).

We are also given some measure of distance between 2 distributions

e.go:

(F, G) =

sup

1F(x)

ro (F,

- G(X) \

.co<x<oo

Ii

(F. G)

(fr(X) -

Also we have that the sample d.fo Jl

G(x>]2

<lex) ; G(x)

F (x) -= noo of.l's<x

(see p. 97).,

G) e

.. III -

Q'

is a minimum distance
minimized by choosing 9

Note:

es~mate of

= 90

if

f. fF(x,
t

Q), F (x)] is
n

We want the whole sample dof. to agree 'With the whole theoretical
distribution ... not just the means or the variances agreeingo
~

In particular, i f we use the "sup-distance" .then

which yields

suP' F(x" Q)- Fn(x)

min

9 -CD <x <CD


/V

Remar!.:

Proof:

"..-

n) 9

differs from 9 by

15

Then for some

and for some

jJ

Suppo se ,)for all sufficiently large


more than 8"

sup

is that estimate of

eo

is a consistent estimate of

I F(x, ~)

.. F(x" 90 >J

> 15

We want to show that


sup

"'CD<X <00

Let x'

F (x) - F(x,
n

sup l Fn(x). F(x,Q>


o >} L..-oo.(x.
<. 00

I"

be the x such that


,.J

F(x', Q) - F(x'a Qo > = 15


Q )

therefore F(x',
thus

< :315

'Q) '" F~ (XSj

sup F (x I, Q) - Fn (x W>J
1

15

>j
)

but this contradicts the fact that


therefore

is consistent.

",

15

sup

minimizes

Fn(x) .. F(x"

o)

sup Fn (x) .. F(x,

Q)}

Example: Find the minimum distance estimate for

IJ. given Xl' X2, 1


are
3

N(IJ., 1.

~ = 1

p
Fn(X)

_::::;:::....,,==....--

2/3
1/3
.\

o
Note:

The n:sup-distance" will obviously occur at one of the jump points on F

(x)~

n.

To find the minimum distance estimate of IJ. an itterative procedure was used$! whicJh
started by guessing at IJ. and then finding F(x, IJ.} at each X.o
J.

204 zi=xi-IJ.
P(zi)

767

lilt

-1.4

"",0.4

1,,6

..

,,081

0345

0945

e325 (067 - w345)

zi

...1.,3

-0 0 3

10 7

f(zi)

. . 097

0382

~955

~288

2 e 33

P(zi)

..371

&952

299

20 29

f(zi) ;:

0386

.956

284

2,,3

Other values of
therefore
~~em

e.

(\33

X
3
1 00

Fn(x)

X2

Xl

IJ. were
IV

tried~

IJ. ... 2,,29..

(x

but the min sup-distance

..

(.(,1- 3~ b)

284,

=2,,33)

53: Take 4 observations from a table of normal deviates (add an arbitrary


factor if desired) and find the minimum distance estimate of

IJ.p

113 Q.hapter
I~

V~

TESTING OF HYPOTHESES ... DISTRIBurroN FREE TESJfS


.~

Basic concepts..
Xl' X '
2

Der Q

~, X

have distribution function F(x) ... continuous


-- absolute continuous
[density f(x)]

The hypothesis

H specifies that Fe 7o(some family of dQf.)

Alternative

specifies that Fe

33:

A test is a function
iQe.~

if f(!) ... 1

7- f{

J(~) taking on values between 0 and 1


reject

H with probability

... k

II

tI

II

II

it

Illl

(note that this considers a test as a function instead of the usual


consideration of regions)
Def e

34:

.,( if E fI(!) ]

A test is of size

fr'(!lJ

rf(~) dF(~)

for Fe

? ~ .,(

"'00

(this says that the probability of rejecting

~~:
~...12~

Power of a test is

r~1

E[1'(!)]

Fe

1",. ~

and is denoted

is a consistent sequence of tests if for

~96 (F)

Def e ~:

for

H when it is true

as

Fe

~f (F)

7- (

n ~ (I)

Index of a sequence of tests~


N(

Cfn~

1: co<!)~)

is the least integer n

~ co<)

such that, given that

.. 114 ..

l-~

+-----/

I--

--Io-;-

7~

-- 1: 10)
Def. 38;

Asymptotic Relative Efficiency (A~RoE.)

Let

p(

Let

fn and 9ii

be some measure of the distance of ;/;rom

10

/\A.-'

be two sequences of consistent tests

then the AoR"E e of

lim

N(

--

~ to

~ is defined to be

~'1," 7~ ""'3 ~)

---~-----

p-';'O N(

fn9 7~ .<.~ 13)

provided the lindt exists

(0' mown)
alternative~

Consider two tests of

IJ.

H~

10

t he mean te st ~

2"

the median test:

reject

H if

>0
... ~-z
X)'_..-

(J

Vn

[use the large sample distribution of the median, i"e-the sample median is asymptotically normal with mean
... the population median and variance = Ra2/ 2n

if test:
a)

rejeot

H if the ssmple median

'7 "1--<~

find the index for each test for the alternative

~!

- 11S b)

find the

A~RoEo"

ieee
lim N~f',
IJ.' -7 0 N(

II g

r;

,*U 2

.,(,9

IJ. r;

co<."

@)
(3)

Distribution Free Tests


refs~

Siegel -- Non-parametric Statistics


Fraser -. Non-parametric Methods .in Statistics
Savage -- Bibliography in the Dec. 19S3 JoAcSeAo

Ao

Quantile and M:ldian Tests:


Let the solution of the equation F('" )

I:

If the solution is not unique, define

Given the problem to test


note~

Test:

Ho : \

I:

the number of

against the alternative

H~

<
-

A0

(q ~p) at level

co<.

JC]." x 2"

I:

pJ

~: Aq

= "'0

/)

"

oa X

X is B(n" p)

for the tl<ro sided test


reject

~::!!lple:

"'0

.. infx(l(x)

A.

the altemative could be stated ApI


-L A.0 but this statement precludes
.
consideration of the power of the test since the power in this case wouJ
be undetermined by virtue of the unknown behavior of F(x).

under

Problem
-.

..

P define the quantile

using the normal approximation

. /) Jx~ npl- ~ >

H if.

tJ np(l

- p)

;;

zl..<./
.., 2

SS: Compute the power of the test (using the normal approximation) as a
function of q for n"'" 100 and p == 005
for paired comparisons" to test if the median is equal to zero set up the
series of observations dl , d 2"

0"

dn,

di

= Xi-Y i

- 116 This test that the median is zero is often referred to as the sign test, sine(
in this case is merely the number of d with a negative signa
i

pete 40:

the sample quantUe~ is defined as the solution of , n (Ap )

>-p"

=P

Theorem 30:

\>

has density function

where
Proof:

ref:
Pr

l.L -= [np]

which denotes the greatest integer in

. np

Cramer po ,368

[Xp ~x J

Pr [l.L + 1 or more observations {x]

III

( j ) [F(X)] j

[1 - F(X)]n- ..

j-l.L+l

li'(~)

To get the density hex) [assuming that F(x) has a density f{X)]
we differentiate the summation, getting the following two summations
[trom each part of the produot thatoocurs in each term summed]
n

hex)

-=. ~ ( j ) j ~(x)]j-l
J=l.L+l

[1 . . F(X>]n- j

... ~ ( j )

(n - j) [F(x)]j

j-",,+l

f(x)

[1 - F(x8n-j-lf(x)

The corresponding terms [in ,a(l .. F)bJ cancel each other except for
the first (j'" l.L + 1) term in the first summation which has no
corresponding term in the second sum.,

/\

By the usual limiting process applied to density functions, A


normal" with mean

\>
.

and variance . 2 1

f{\)

p(l..,p)
n

is asymptotically

- 117 ~ob1em

56: Let

J$;

ha11e density

co

) rex)

where

dx .. 1

""co

and

rex)

....1
c

f(

X"/J: )
C1

co

x rex)

dx .. 0

x 2f(x) dx -= 1

"00

is symmetric about Oc

Find the condition on rex) such that the AcR.E. of the median to the mean
is greater than 1" Use large sample normal approximations tor both tests"
Specialize this r.esult to

I x..,u I
o;-~ ..

f(x)" ~ e-

CJ.)

<x <0:>

--~-------------------~-------------~-----

III.

One Sample
H:
o

Tes~~

F .. F0

completely specified

~ \)

the smaller distribution has larger


observations.

-J..

:Ell:> F == F.,,..F
.I.
0
problems:

1)

goodness of fit (usually a 'W-.lder problem since F


o
completely specified)

2)

ilslippage test!i.

is usually not

3) combine.tion of tests -. ref: A Birnbaum, JASA, Septo 1954

4)

randomness il1 time or space _.

re~

Bartholomew, Barton" and David;


Biometrika; circa 195,.6

Randomness in Time:
Assume:

events in (O,t) (the time interval

i"e." is Poisson with parameter

to

t)] e

-At J.~t)n
n'

At
"-.

Probability of event oocuring in one t.ime irrlierval is in:iependent of an occurrence ir


any other time ~tervalo

GOO

.. 118 i th event occurred c

Let T. be the time at which the


J.

~ t.

j=l

Pr[ti>t
tet)

III

= density

f(t 1s t ,
2

I)

prGo event occurs in time (O,t)] ..


of the t
0,

Distribution ot Tl' T2'

T1

== t

At

),e" At (the exponential distribution)

~ e "A~ti

0 <u;

Tn

t ) ..

e"'

s:L"lce the

IS

are independent

is obtained by transformation

Therefore, let us find the conditional distribution of Tp T , ~


2
given that n events occurred in the fixed time interval (0, T)e
f(T 1 , T2~

~ ~, Tn ; n)

= density

ot Tl , T2,

!>

.,

Tn

0'

Tn and also the probability


that no events occur in the time interval (T , T)
$

..

,\n
-AT
1\ e

ten)

This distribution is the distribution of


random variables on (O~ T)c

ordered uniform independent

- 119 Ordered Observations:


~, X "

0,

I , I 2"
l

.,

If the XQs
between the

X have density f(x) and are independent.


n
In are the ordered XiS.

have a continuous distribution then we may disregard any equalities


or the I's~

Xrs

The marginal distribution of the Y


i
10 Integration:

Q)

00

nl

can be obtained by three methods, i .. e.

f'(Yn ) dyn

Yn""l

Ii

f(Yi+l) dYi+l

Yi

Y3

f(Yi...l) dYi_l

/)

"'00

f(y.)
J.

Y2
f(Y2) dY2)

f(Yl) dYl

-00-00

note: ...- the observations above Yi are constrained by the next lower
observation$ since this is an ordered sequence
-- the observations below Yi are constrained from above
integrating this we get
gi(y)

20

lllI

the marginal density of Yi

By differentiation:

Pr\!i.( Y] =

Prei

or more observations fall to the left of Y]

1:1

J=i

( j ) [F(y)]j

[1 .. F(y)l n- j
:.J

... 120

3(

tn.i)~ti.ln

[F(y)]i-l

.., F(y)]n...i

fey)

Heuristic Method (Wilks):


dy

_ _ _ _-"'~1!i.._;

1i
Y

(1-1) observations
in the interval
(-eo, y)

(n...i) observations
in the interval
(y" + 00)

The probability of this is essentially a trinomial distribution therefore

Example:

In particular, if we have the uniform distribution:


fey)

F(y)

=Y

n).
J.-,l
gi ( y ) = "{ii..n trI:IYl y

(1

.. y

)n..."i

We could also define the spacings~

(which is a Beta distribution)

81 " Yi-Y i - 1

i 1 , 2,

QI

n+l

n+l

i-1

8 1 -1

Y .. 0

Problem

5n Show that each 81 has the same densityo


a) find this density.

b)

find

1)

{~5~]

n+l

11) E [

11

1~ r51 - n~~

----~---------------~----~-~--------------

-121Lemma~

Probability

Transformation~

If X has distribution F{x).. F(X) has a uniform distribution on (0" l)c


u

define the inverse function as

PrI!(X)~U]

Pr[X~F-l(u)

F'"'l (u)

...r.-

_ _ _ _ _ _ _.......I - -

in! [F{X)

F~"'l(u)J

au

One Sample Problem can be put in the following form by use of the probability
t ransl'ormation:
Given Xl' X21

')

0);

In; let

increasingly and define

Ui

III

Xll)'~' I(2)~

H:
o

o<X(n) be the IVs ordered

i = 1,9 2,

Fo(I(i

we have a set of ordered cbservations


interval (Oi 1) which ul'lder

U1 , U2s ..

eo...

!9

Un on the

has uniform distribution (with density n,t')


[or equ~salently;,l that the original XiS had do.f o Fo(X); i.eo" that
the correct probability transformation was used]

and under HI ~

has distribution F
1

two-sided problem

[F~l

a Go

one-sided problem
G{U)~ u -- with the

strict inequality for


some interval

Some of the Tests for One Sample Problems


1.

n which is

AoN~(~,

rk )

- for the oneo:osided sitnation


-- consistent
-- reject 1i if it is "too large"" ioe~ if
(for the above pictured situation)
or if

ii is "too small"

'O'>~ + zl-.-<. (lin)

in the 'converse situation .. [i"e." G(u) >uJ '..

.. 12.2 -

n
20

-2

i=l

whioh is X2 with 2n do!~ (rafo problem 6)

In U
i

e~

X2 is exact, not asymptotic

....
-""

consistent
for the one-sided problem
used in combination problems
reject H if' X2 is Of too largeU

30

<.02

i-l

ln (1 ., U ' _.. also is ,,2 with 2n d.! 0


i

....., tor the one-sided problem


Pearson's counter to Fisher's advocating
",,- consistent
... reject H if' X2 is ~too small If

=.

4G Distanoe

NOa

Type Problem

Kolmogorov Statistic defined as follows

n+ ... sup [Fn(U) .. u] )


~u~l

n-...

sup [ u - F (u) ]

sup

III

o~uu

for the one..sided problem

, Fn(u) - u I

for the two,.,sided problem

o~u~l

.- is consistent

5a

Related to the Ko1.mogorov statistic is


R+ ...

sup
a~u!l

III

sup
a~u~l

-----

Fn(u)...u
u

one""sided problem

tFn(u)-u)
u

two-sided problem

not necessarily consistent


lIa" is arbitrary but positive
derived by a Hungarian, RenYi
not of very great merit

6.,

12.3 -

eu~

)[Fn(U)

....
__
---

ul du

two-sided or one-sided problem


sort of a continuous analogue of ,, 2
consistent
due to Von Mises and Smirnov

n+1

7~

wn = ~ i=l
~'8i ~ I
co

-- ref: Sherman$ Annals, 1950


-- one-sided or two-sided problem
.... consistent
-- is AoN.

n+l
8f)

~ = ~ S~
i-1

J.

a_

one~sided or two-sided problem


-- due to lII.bran
...- is consistent

90 Ul

., 1

is AoNo(O$ 1)

(Wilkinson Combination Procedure)

-- one-sided problem
generally not consistent
Problem

58~

a) Find the test based on


with g(u) = G'(u)

Ul

for the set of alternatives

b) Find the power of the test for the alternative


c) Find the limiting power as n

~co

G(u)

:I

G(u)")u
uk

O<k<l

(if the limit is 1" the test is


consistent)

10. ,,2
11$ NeymanYs smooth tests
-- discovered by Neyman about 1937, but never generally used
-- re.t'g Neyman; Skandinavisk Aktuarietidskrift; 1937
Pearson ... Biometrika; 1938
David"" Biometrika; 1938.

- 124 -.

Problem ,9~

Xl" X2,
let

o.

= max

are independent with dof~ F(x), density rex)

Xn

0,

Xi -

min Xi

l~i'!tn

l~i~n

a)

Find the distribution of

b)

Suppose F(x)

R...

has density of the form

E(R 1- k(f"

show that

(J

x 2(>:) dx

)f(X) dx 1

where

1: f( x""tL
a

~1

n)O'o

-------~----------------------------------

Theorem 31:
F

If
!!22!.~

is continuous,

Fn

We want to show that given

- Pr

[I Fn (x)

.. F(x)

I <' s

p>F for
8, (3

for all

au

Xv

we oan find
x]., 1 -

such that for

n '/ N

(3

Let Af) B be such that

F(A) <. 8/2


l-F(B) <: 6/2

Flok out points X(1).9 X(2)" 0 0 .. , X(k_l) in (A, B) suoh that


whioh we can do because of uniform continuity.
Now set

F[x(i)

1.. F(X(i_l)] <

= X(o)

ConSider a sample of size


(x(i_l)' xCi~

no

Let

has a

,,2

i=o
We oan find an

1>1

such that

n.

1.

=;/-{x.
n J

that fall in the

distribution with

k+l dof"

i~ interval

- 12,5 -

or

Pr

[X~l

MJ /' 1.. S

<.

Since the above sum is less than Mj each term and aJ. so its square root are certainly
less than MJ therefore~
Ini..nPii
PI'
[ -JnPi ... Mfor each i 0, 1, , k + >1. 6

<

which could be written

Recall that

~ n.

j=o J
n

Now

J Fn(X(i)J

~ F[X(i)l/ /~? - ~ P I -/' ~(~. P ~1~ .~ I~ - P I


j=o

j=o

(k+2)MJP!

Choose n

rn

so large that

< 2'

J=O

JUO

for all i

Henoe, i f n is chosen this large s with probability l .. S the following


relationship will hold:

Fn(X(i)] '" F [XCi)

11

<.

(i+l)M.JP"i

vn

<:.

Consider x lying between x(i-I) and xCi)

<.

*
F[X(i)] - F[x(i)]

+ F[X(i_l)l Fn[x(i)l

F[x(i_l)l ...Fn [XCi)]

F(x) Fn(x)

~ F[X(i)J

<C 8

8
'2'+~&'lS

with probability 1 - S

* lltUieiI1g the following

x'elalllonsblps:

'i.'* making use of the following


relationship:

.. 126 ..

..

~ <: F[X(i)l

~ < F(x(i_l)l

.., Fn[X(i)]
..

<~

F(x(i~ <0

Therefore the theorem holds c


------------~-------------~----~-~-------~

Example:

Referring to test No o

4 given

on page 1220

is true, F ~uniform, thus for


n

If

If

G is true,

IFn(U). G(u)

1<

Therefore" the test based on D


probability tending to 1;

eo

sufficiently large

and thus

sup

IF

I Fn(u)

(u) - u

DO

I Fn(u) -

u, ..(&0

uJ '>s.

will reject

with

henoe the test based on D is consistent$


Problem 59 (Addenduml (see p. 12~ ~
c)

Find the distribution of R for

1) Xl' X2,

()

Xl' X2,

2)

note:

Xn uniform on (0, 1)
~1 Xn with exponential density rex)
0$

=a

e- ax

in the uniform case U and R are dependent, in the exponential case


they are independent l since the upper limit of the observations is co

---~-~----------------~------------------~

Remark:

The Kolmogorov statistics, Dn, D+, and D- are in fact invariant under
the probability integral transform"

Proof g

we have to show that

sup
f F (x) "" F(x)
-a><x..(oo n

III

sup

FJF(X)] ...

o !F(x)~

S!

Fex)

sup

IF

0 ~u ~l

(u) - u ,

- 127 -

Fn (x)

where the YI S are the 0 rdered


observations
X2'
In

JS.,

.i
n

Therefore:

III

o ~u-{l n

sup

, F (x)

..c:o~x <con

F(x)

D~

Distributions of

When H is true, and Ul, U ,


2

a) lim . Pr
n~CD

.,

where

, F (u) ,. u I

sup

rJ

(l

III

[rn D: <z J ..

Un are uniform, then

_,

QI

e_2z 2

(due to Smirnov)

0 {z <co

_. Finite dist!'ibution of D+ given by Zc Birnbaum + Tingey in the


Annals of Math., Stat..,
-- Tabled by Miller in JASA, 1956, pPo 113-11'0
b) 1(z)

III

lim PrrJii D+
n-7 co l~
n

.c. z

J.

22

CD

~ (_l)m+l e- 2m z (due to Kolmogorov)


m=l
O~z <CD

-- Tabled by Smirnov in the Annals of

Ma.th~

Stat"" 1948

The simplest proof of both results is due to Doob in the Al".nals of Mith"

Stat~"

1949

Some tables on the finite distribution of D are given by Massey in JABA" 1949"
ppe 68-77Q
n
Consider now that

is t:rue, i.eo, that U has, de!. G(U):

n- test is to reject

H if

PI{D~ > en1 . . .,(

n'"n

> en

where

Pr[.rn D~ >frl I-;nl


thus

lIn. .,( I
2n
~

max. difference between up G(u)

D~ .. sup [u ... F (u)]


o ~u~l
n

- 128 Suppose

is ture so that

maximum / u - G(u)
Then:

Pr[D::> fin

A at u

1~
~
~

Fn
III

is the sample d.f

l)

from

G( u) with

Uo G

Pr[uo - Fn(uo )

> en ]

Pr[Fn(Uo ) - uo<-e n ]
Pr[nFn(uo ) t::n(uo - en>]

But Fn(u ) is a binomial random variable with expectation


o
n(u -e )
. 0 n
and
Binomial probability'" ~ F(k; n, u - A)
k=o

Prob1em.Q.:

D~ test where G(u)'" u 2

Find the bound on the power of the

Using the normal approximation given an explicit f.orm for this bound
in terms of
x
2
,J, (x) = .1n, -<, 'I
e _t / 2 dt

J2n

Test No 0

IlC

5; Renyi Statistic
+

R
R

=0

Fn(U)-U

sup

a~u!l

= sup

u
Fn(U)-U.l

a<u~l
.....

Limiting distribution:
lim Pr [
n"-1 oo

rn R+<z]

For the distribution of R and further discussion see an article by


Renyi in Acta Mathematica (Magyar), 1953.
Test No. 6:

CJ~

I
1

1
sa

[Fn(U)

lF~(~)

UY

co

du

-2uFn (u) + u

j
2

(Fn(x) - F(X)]2 dF(x)

du

- 129 ttl

i
n

eeg., F (u) is flat for an interval


n

fo

recall:

un+1" 1

n+1

.. i=l
~

i ( u2+'"" u 2) + !3
-n
i
i

2 "'"
( -i )2 (u 1 .. u. ) ...,:r...c::J
1
n
+
1
~ 1-1
n

.. 1 +

2
u du

~ ~ (1

n~ 1-1

.., 2 )u .. ~ (~
i i
n 1=1

u~)

.., 1 +

i
J

.:F.,2
'n

.""4J

Cramer shows that: GJ2


n

III

Tables of lim Pr(nw2

1 + 1 ~ [u 12n2 n 1-1
1
z]

~
2n

have been given by Te We Anderson and Darling in

n~co

in the Annals of Math o

StatQ~

19$2, P3 20611

Approache.s to Combining Probabilities:


Exampte~

The following probabilities are from a one-sided t-test:

...R-.
0026
.11$
,,27
036
&7$

-.1F
02

03

04
S

J....
,,76
081
,,89
,,92
.98

-06F

07
08
09
1 0
0

.. 130 ..
Under the basic J::wpothesis the

pts are uniform on (O,\i 1)

Alternative tests:
acoept H

P OeS7
21.'

To test alternatives of the form G(uu the Kolmogorov-Smirnov test


statistic is
n ., U

D:. F

sup D+

OQ08S

10

Pr[~O '> .342 J 015

From Miller's tables:


..
1
U ... ~

34> D. eS88

"
0088

J~r ~

Pr~ <' 0964]


Since

0e964

067

BtU] under

H.L <: ELU under Ho we will reject

therefore we cannot rejeot

H if'

Ho~

40 Against two-sided alternatives


D = 035

(,,75 - 040)

See Massey's tables for small sample size probabilitieso


Usi.l'lg the large sample approximationg
co
22
Pr
D
2 ~ (__l)m e- 2m z

[rn z ]

Pr

[D)~ ]

no

(II

at

L(z)

mal

L(035)

Example~
-

In the following-

table~

the Xi are taken from a table of N(O,t 1) normal


-- the XCi) are the ordered Xi
-- the U...
1.

..L (XCi) e- t 2i/2

{2n

)
..co

dt = PI'

f J
Z

<Xi

deviat~so

z <zco(

- 131 -

the 8 are the spacings between the Ui


1

042

-0038

-0002

-0 0 02

.49

088

,,08

.53

040

009

.54

031

064

'?40

066

e08

042

.66

1012

088

011

-Oe38

1;p12

081

031

1016

&96

X(1)

1.16
009

Under H:

Ui

Xi

.35

8i
~

-;"09

.14

009

&04

009

~01

~09

.10

t>09

.02

009

,,00

1109

.05

:>09

c16

~09

009

t}09

004

009

n--!

the XIS are N(O,\l 1)0

The test statistic is: 10


1
l.Jn =-2
~
i-O

Is i . n+l
.1.../

Ill;

Oe38

Ignoring the slight negative oorrelation between the S., one could use a
normal approximation with:
1.
E[W ]
n

l:

! .
e

0037

v(

W)
n

2e-?2 (0.07)2

lDe

~amp1e:

It you want to test

transformation:

H:t ~

u..
2

the X' s are X(,5) then you should use the probability

57~i?5)
It ~

X
t 2 i-l
r e- / dt
t'

- 132 ...
Combination
-

Probabilit1es~

- --

of Test

...

Ho:

U is unito rm

H:J.:

U has distribution G(uu

(i~e.J the observations tend to be smaller)

ref: Ao Birnbaum, JABA, September 1954


He oonsiders a oomparison of the following
1... Fisher:

-2

ln Ui

Fearson~

-2

In (1 .. U )

2<:

tests~

i
..- found unsatisfaotory for moet applicationso

3- tJ.J.. .... not consistent, but is this important?


Birnbaum conoluded that -2
testing normal meansc
other possible

tests~

D,

~ lnUi was better

~,.9

1'0 r the one particular case of

...

U ; have not been studied in this light.

Goodness of Fit Tests:


HO:

F - F0 completely specified

The test usually involves estimation of parametersQ


The only oompletely worked out theory is for the x2-testo
For other suggestions see:
.- F.

David~

Biometrika" 19389

-- Kac, Kiefer, and Wolfowitz, Annals of Math o Stato" June 1955.


They present a Mmte Carlo derived distribution of W 2 and n
for the case of testing H~ Xts are N(fJ., 0'2) where lJ.; n 0'2
nare
estimated by
s2 working with n-25, n=100 o

x,

For consideration of one-sided tests see:

Ohapman, Annals of Mathe> Stat., 1958

... The nff test is allminimax" test (among Fisher, Pearsen$


of the one-sided hypothesis He: ,
ioe~,

'0 versus Hl

oR, W~ , U)

F .. '1<' Fo.

timinimax ll in the sense that it has minimum power to piok up easyto-detect alternatives" maximum power to pick up hard-to-detect alternatives.

- 133 -

IV. Two Sample Problems s:


X1"~"
Yp Y ,
2
Ho : F

1:1

\!l

.,
II

9'

have d~f. F(x)

Yn have d"f. G(y)


H1I F < G

or

F>G

Hl :. F "G
In the parametric case one could use the normal approximation en d the two-sample
t-test on the means, but these hypotheses are somewhat wider~

Partial list of testsi


1. Median Test
2. 0 Runs Test
3. WilcoxentsTest (also called the Mann..v-lhitney test)
4~ KolmogorovoooSmirnov D-test
5. Ven der Wa.erdents X-test (or Terry's C-test)

All we need for these tests is to be able to order the observations; magnitudes
are not important; eeg. lv y. y \/~ \.'
\

~3~~W~

Test 1:

For the median test set up a 2x2 table classifying the observations
as above or below the median o! the combined sample. For example:
Below

Above

Total

X:

Y:

m+n

m+n

m +n

Total:

2
Test Statistics is the usual X for 2x2 contingency table s with one d.f,

Test

2:

For the runs test, set


I'

= number of runs of Xt S and of Yt s in the combined sample (in our


example r = 4)

If' H is true, then the Xis and yts are intermingled and the value of

o
r will be "large"; 1 Hl is true, then r will be "smalllt

- .134 Test j :

For the Wilcoxen (Mann-Whitney) test, we define:


=1

= 0

otherwise

m n
~ ~

0:

in our example U = 22 (4+4+4+5+5)

~j

i-1 j-l

Wilcoxen's original test was based on

Rf

where ~ = the sum of the ranks of ~ in the combined sample ordering from
the smallest.

S1m1larly R':! = the sum of the ranks of Y.


Problem 62:

Test

4:

2.,.

Prove: U = mn + m(m

1) ...

~ ~

n
0:

3.=1

~ R~ _ n(n + 1)

. 1 J
JD

Kolmogorov-8mirnov define D for the two-sample problem as:


Dmn

= sup

\ Fm(x) - Gn (x)

-oo<x<co
Test

2:

For Van der Waerden' s X-test we


let . (u) ..

.l:fin

_t2~
dt

-eo

then the test statistic is:


Problem 63:

a..)

let Q.

X=

:t~1 r(

RX
m + ~ ...

1)

number of Y's which exceed max(X p ~,

I!l

8'

~) 0

Find the distribution of Q in the general case"

b) Specialize the result in (a) when H is true.


o
c)

Find the lindting distribution of Q for case (b) as m~, n

~oo,

.. 135 Runs Test (Test No 2):


Q

Ret lI' :
r

==

Mood Chap 16
fl\

no c of runs of X' s or Y' s in t he ordered combined sample.

There are two cases to be considered-that of an even and that of an odd


number of runs.
Even case:

The number of arrangements of m X's and of n Yrs that have tl:e


property of giving rise to 2kruns can be found by a simple
generating function device.

Again~ there are two possibilities, i~ell starting with a run of l'S or of
yts, so starting with either, we must multiply the end result by 2 since
the two starts are symmetric.

Starting with the X's, they are divided into k groups, all non-zero. To
find the number of ways of doing this, consider the ccefficient of t m in the
expansion of (t + t 2 + t3 + ., )k.
(t + t

23
k
t
+ t .;. He) = (l-t)

==

=tk

k
(1 .. t)-

t k ( 1 + kt + (-k) (-k-l) t 2 +

The coefficient of t m is found when j

=m -

k" and

_ fk + m - k - 1) 1 _ (m .. 1)
-m k)t (k .. I)l - k .. 1

Similarly the number of arrangements of the n Irs into k non-zero groups


is

Therefore" the number of arrangements of X' s and Irs with 2k runs is


2 (~ :

i) (~ : i)

The total number of arrangements possible with m

XiS

and n Its is (m + n)t


mt nL

-136Therefore:

(r = 2k + 1) For the case w..l1en the number of runs is odd, i oe '),


to determine Pr [r ... 2k + IJ , the argument is similar, but we
start and end with either X or Y (instead of starting with one
end and ending with the other), therefore

OOd Case:

E (r)

=m
~
+n

V( r )

= em

t'

2mn (2mn - m - n)
+ n) em + n _ i):;:::::;

N =m+n

Where:

+ 1 ;::;:;; 2 N..(

= No<.

2 2
N,.( ~

= N!3

0(

+ ~

=1

By Sterling's approximation methods, we can show that

lim
N~

Test:

00

r
2..<13

H is rejected i f r
o

- 2N4.

is N(O,l)

rN
<

r 0 can be determined from the left hand tail of the normal


appro:ximation or from tables in
Swed + Eisenhart, 1943 Annals of Math. Stat.
Siegel, Table F
3 Dixon + Mas sey" Table 11
1
2e
0

Wilcoxen (Mann-Whitney) Test (Test No o 3):

U=

~ ~
i

where

u ' ... 1 if X.
~
iJ .

=0
If H is true"

~(U) ...

..2
i

~ E(uij ) = ~
j

< Y.J

otherwise

-137. Pr

For the general case, assume

[y

> X~

= P

Pr

[Y > x

I X = x] dF(x)

-co
CD

= ) [1 -

G(x)] dF(x)

-00

In this general case

E(tf)

=m n

E(U)

[{I -

G(x)}

I F(x)]

2 )
~4 E(u..
~J

::1

"

+-Z] ~

B(u.. , u. b)

E(uJ.' " u )
aJ
J

(3 )

E(u .. , u b)
~J
a

(4)

J.J

ijfb

(1)

+~

2i

.z

isla

+~

ifa

jfb

J.

To evaluate this, we can examine each part separately, e og.:


(1)

2
E(U ) = p
ij

(4)

E(Uij " uab)

(2)

E(uij , uib) = Pr)j

mn such terms

= E(Uij )

m(m-l)n(n-l) such terms

E(uab ) = p2
Xi' Yb

JS.:

> x,

>x

Pr [Y j

Yb

I~

x] dF(x)

-co
00

= )

[1 - G(xll dF(x)

-00

= B[ {I -

G(x)} 2 tFJ

mn(n-l) such terms


00

(3 )

by similar argument

2
E(uij " u ) = ) F (y) dG(y)
aj
-00

=E

C; (y) IGJ

mn (m-l) such terms

-1,38Thus~

V(U) = mnp + mn(m-l) (n_l)p2+ mn (n-l) E [(1_G)21 FJ


2 2
+ mn(m-l) E [ ; G] - m n p2

If H is true" verify that E(F2 )

, Exercise:

and then Var(U) ==

= E(l

_ G)2

==

IT (m + n + 1)

Mann-Whitney in their further work on Wilcoxen's test proved that


mn
U -2
is AN(O,l). This they found by discovering a recursion

/~

(min+l)

formula to get all the moments of the distribution of U" then observing
that the limits of the moments" as m -+ (x)" n -7 (X) were the moments of
the normal distribution.

U -- may

be used for ane-sided or two-sided tests.

-- the most complete tables are given by Hodges + Fix, Annals of


Math. Stat." 1955.
Problem 64:
Take 10 observations of (a) X which is N(O,l) and
(b) Y which is N(l"l)
Apply each of the fi ve tests to the data to obtain tests for

=G

H F

Problem 65:
let a 1o' a 2 ,
let Xl'

~"

Define:

zi

(a)

IOU'

~-l

be fixed points o

n., Xm, Y1o' Y2"

= Fm(ai )

u.,

Yn be independent observations from


F(x), G(y).

- Gn(ai )

Find E(Z), Var (Z) in general and for the case F


denote:

F(a.)
J.

=f i

==

G.

-139(b)

Find E(Z), Var(Z) for the case when

=u

F(u)

=0

G(u)

"'_ 1 + u - 1 ... b
1 - 215

=1
In this problem let a

(c)

S. u

l-b

.s:. l-b

<.. u < 1

Assuming Z is AN, is it consistent for alternatives of the


form of G in (b).

- - - - - - - - - - - - - -- - - - -- -

--

...

_-------

KoJ.mogorov-8mirnov Test (Test No.4):

Our basic sample (X X X X Y Y y X Y Y) could be expressed graphically in


two ways:

(b)

(a)

Y (5,5)

Fm(x)

-roj-1,-,_

..........

L!.

'I

l~:

--.J'

I
I
~>

Gn(y)

i
~
I

.r-

if

I
I

(0,0) ,.'"'-----~~-----
i.e., the sample could be plotted as
a two-dimensi ona1 random walk reaching
the point (5,5)--reject if the walk
strays beyond a line parallel to the
0
45 line"
Asymptotic distributions of D , if H is true:

ron

lim
m,n --7
1(z) has been tabled in the Annals of Math. Stat., 1948.
+
+
Also" D- have the same limiting distribution as D- (one-sample
mn
n
statistic) with the normalization
(see p. 127 )

1m:

-J.Lo- Test:
.

Reject H if Dron )' d.

Van der Waerden's X-test (Test NO e

5):

where
u

(~) =...L. (
.;rn)

e-t /2 dt

-00

X is AN(O, : 1 Q)
Example:

where

2 i
r
(N+1)
i=l

Q =j

=m + n

XXXXyyYXyy

ItJ.

RX
i

11

11

11

11

11

111+1

e09, 018, .27, 036, 073

using normal deviate tables

(111+1)

= z .09

z .18

-.60,

For determining the variance of X, tables of


Van der Waerden in an appendix to his text.

have been gi van by

- - - - - - - - - - - - - --- - - - --- - - - - - - - - - - - - -Theorem 32 (Pitman's Theorem on A.R.E.):


Assume:

Tn' T1i are A. N. Statistics

.!. L '

is a subset of ..!n L indexed by p such that

when H is true p

= 0"

Let the sequence of p'S tends to

e,

i.e., P1 P2 " Pn

--+

O.

-141-

Test:

T -- reject H if Tn

> t n.,(

T -- reject H if Tl!-

>

Assumptions:

>0

(1)

-dd Ep (T )
.p
n

(2)

d Ep (T ) p=O
lim ~
n.

pm"
d

(3)

lim
n",,"*oo

ered

-dp

=c

J.'f

CJ

o
Ep (Tn)
Ep (Tn)

I p = pn

Ip = 0

l{
P =n

rn

=1

plJ"l (T)
n

CJ

(4)

lim

CJ;rr;J- = 1

n-1'OO
Theorem:
.

Under these regularity conditions, the limiting power of .the Tn

=..!. as
n{r1

test for alternatives p


~.

The A.R.E. of

= (c*)2 =

T:~:'
"l!

to

lim
n-7oo

~~

is

00

is 1 -

~ (zO'( -

kc).

-142-

Proof:
Pr

>

Tn - EO (Tn)

t n:< EO

0"0

~J.

= ..<

0"0

For large n, since T is asymptotically normal, t n.,(


Pr [Reject H I Pn ]

lim

= z.,(O"O

EO(Tn)

n~co

= lim
n-1oo
0"

= Z.,(

(...9...)
0"

lim
n..-?oo

Pn

from assumption 4, as n

0"0

00,

( _..) ~
0"

kc

Pn

using a Taylor series expansion:

o < pIn

=Therefore:

n -1

kc

prlReject Hr

lim

< p.n

P~

=1

(z.. -

00

x-'"

kc)

By a similar argument, the limiting power of the Tll- test is .

d EP(Toll-)
n

where

c~l-

-d

= lim
n-j

. . , '"'OS

00

P= 0

-143We want to determine sequences {n l

~ , fn11 such that

1 - (Zl_..< - k*ci~) mich means that kc

to be the same p
n

= p*

or ---

y'ii1

k*
=---

0i

= k*c*.

1 -

(Zl_..< - kc) =-

Also" for the two sequences

Thus we can determine the equality of the following ratios:

The AeR.E. of T to T* is given by any of these ratios, or as stated in the


theorem:

AeR.E.

=( -r-C'

= lim

c ,)

Example:

n ....,

Coa2).(~d Ep

CD

(Tn)

dp Ff' (Tjt)

Ip = 0 )
Ip

Obtaining the A.R.E. for the Wilcoxen test versus the normal
mean test, i.e e

U versus Z

I-X

which is asymptotically equivalent to the


two-sample t-test e

X and Y have d.t. F(x) under Hoe


X has def. F(x), Y has d.f. F(x -IJ,) under HI.
In both cases the variance =

8'

For vfilcoxen' s test:

co
p Pr

[Y

> X]

[1 - G(x) ] <IF(x)

-h>

- co
f (x - IJ,)

(x) dx
-00

[1 - F(x -101)] rex) dx

-144~ E(u)

E(U) = mnp

= mn

(x) dx

IJ.=O

_ mn(m + n + 1)
Var (U) 12
E(Z)

x dF(x -

Xx dF(x)

IJ,J -

crV/ .!n + m
!
dE(Z)
dIJ.

~/1+1

vI

iii

ii

Var (Z)

mn
~

(J

in

==

Therefore, the ADR.E e of U to Z is


00

) f2 (x)
-00

I-mn<m ; n + 1) ).

n~

oo\!

I ron
(] y' iii-+n

12

which reduces to:

= 12

A.R.E. (U to Z)

;. [

Thus we can compare U and Z for any f (x) whatsoever.


For ins tance, if f (x)

=-L
Il1tcr

e-x2/2cr2 then

f2 (x) dx

2no-

using the transformation

1
0

.1

...........

V2n .j2

l /2 dy

e-

-00

and

12

J [ )

2cr/fl

00

(x) dx

12i

-00

12i
4ncf"

=1= 955
n

=~

All of which says that the A.R.E. of the Wilcoxen test to the Normal
(two-sample) test" if the underlying populations are normal, and we are
testing IIs1ippage of the mean" is 31n.
Problem 66:
(a)

Evaluate the A.R ..E. of U and Z when


(1) f(x)

=1

F(x)

=x

F (x - j.l.)

a<x

=x

<1

IJ. <x ~ 1 + IJ.

j.l.

a <x <CD

Find an f(x) such that the A.RIlE. of U to Z is + co

(b)

It has been shown that the A.ReE. of U to Z in this case" i.e e


testing slippage" is always > e864. - Hodges and Lehmen, Annals
of Math" Stat .. , 1955.
AeRoE. of test to Z
(testi~ for slippage)

Test
1. Median

2/n

2. Runs

3.

3/n

4.

K-S

???

5.

~ 1" i.e,

= e-x

(2) f(x)

Remark;

a~ x

Consistency
yes, if the median
median of F
l
consistent for all
yes" if the median
median of F
l
consistent for all

of Fa

Fa = Fl
of F ,.

Fa = Fl
X
1
yes, if the median of Fa f
median of F
1
Robustnes s of a test (as propounded by Box) refers to the behavior of a
test when the various assumpti ons made .for the validity of the test are
not fulfilled.
Type 1 error - P.r [reject H when true under the assumptions]
Power

- P.r [reject H when false under theassumptionsl

A test is sa:.d to be robust if Pr [reject H when true if assumptions are


not fulfilled] remains close to
Note:

..4

regardless of the assumptions.

the Z-test, or two-tailed t-test, is robust.

The proponents of distribution-free statistics argue that the disadvantage


of t,he Z-tes tis that the power may slip if the as sumpti ons (of normalcy, etc)
are not satisfied.

r
- 146 -

1.
2.

3.

4.

Median Tests
X 2_Test with arbitrary groupings
Kruskal-Wallis Rank Order Test
Kolmogorov-Smirnov T,ype Tests

samples:

1.

..

2.

..

k.
1.

~l

\2

k
N= }"n.
J~ J

~~

Median Test is made by setting up a 2xk: table:


sample

...

.....
./

Above

mi

N/2

Below

n - m.J.
i

N/2

n
l

...

n.J.

nk

Where mi = the number of observations in sample i above the median


of the combined sample.
Use a x2.test of the null hypothesis with the expected values
is even, and X2

= n./2
J.

when N

with k-l d.f. under the null hypothesis

that the k samples all came from the same distribution.


2. x2.Test:
Being given or arbitrarily choosing groups Aj (a j < X

a j +l ) define nij

the number of X's in sample i that fall in A.


J

Under the null hypothesis Pr[Xfalls in A. J = p. independently of i.


..
J
J
2
This can 'be tested by x in the usual manner -- as an rxk: test of homogeniety,
2
where x has (r-l)(k-l) d.f.

- 147 -

=:

12
N(I~H)

where Ri = sum of the ranks of sample i taken wi thin the combined sample
ii

=:

Ri/n i

H is asymptotically distributed as X2 with k-l d.f.


ref: March 1959 JASA for small sample approximations.
Problem 67:

!to

What does H reduce, to when k = 21


Prove your answer.

K-S Type _Tests:

Would involve drawing a step-function for each sample on the same graph.
Unfortunately nothing is now known about the distributions.
ref:

Kefer, Annals of Math. Stat., article to be published probably in 1959.

Consistenc"I:
The Median and H are consistent against all alternatives if at least one of
the sub-group medians differs from the others.
A.R.E. :
A.R.E. for slippage alternatives:
median test against the standard ANOVA test

2/rr

against the standard ANOVA test

3/rr

H test

where the underlying distribution is normal.


If the underlying distribution is rectangular, then the A.R.E.fs become:

median against ANOVA


H

against ANOVA

1/3
1

.. 148 CRAnER VI
TESTING OF HYPGrHESES -- PARAMETRIC THEORY -

Cramer" ch. 35
Kendall" vol 0 2" chs. 26-27
Lehmann" IlTheory of Testing Bypotheses,lI, U. of Cale Bookstore
(notes-by C. Blyth)
Fraser" IINonparametric Methods in Statisticsll"ch. 5

Refsg

1.

't.-

POWER

Generalities

-- usually the XIS are independent with density f(x, 9), the density having a
specified parametric form with one or more unknOwnparameters.
For the parameter space.fl HO'

~. ~ e '--\

e CJo

Recall that (~) is a test function of size

0<

such that

reject H with probability 1


reject H with probability k

=k

. do not reject H (accept H)

==0

where E [()
Power function'

Def'll ~:

~ (Q) ~ E

e Wo ]

~-<

[ I ~J

* is a unif'ormly most powerful. (uom.. p.) test of size ""-, if being an.
. other size cI.. test

'\}

2.

Defs" 33..38 in chapter 512

Ref 8
--

Probability Ratio

T~

Neyman-Pearson Theorem,
X is a continuous random variable with density f(x, 9)0
HOi

9 == 9

(simple

Hlll

0
~pothesis)

Assume!

f(x, 9 0 )

Assumes

f(x" 91 )

f{x, QoJ

>

OJ

f(x" Ql)

>0

9 == 9

1
(simple alternative)
for the same set So

is a continuous random variable 0

- 149 ..
Theorem 33 (Neyman-Pearson):
The most powerful test of HO against ~ is given by *(x) defined as follows~
*(x)

1
elsewhere

- 0

where k can be chosen so that it- is of size ..<.


If the test

* is in~ependent of Q1 for Q1 &GJ~\hen * is the u.mopo test of HO

against Hl :

&~ 0

Remark: This is what has been called the probability ratio test, since Neyman.-,
Pearson originally expressed the theorem that

I----+--~--

__~--~----

Proof:ltl

To show there is a required k that makes -:l- of size ..<


define:

..<{k)

-o) .,
since

r:

Pr

[1f'O{x)
(x)

:.>

X has density f

lJ

oI

'-'(00) == 0

is a continuous random variable, 1 -

~(k)$

which is the

d.~

of

this random variable, is a continuous function and is monotone non-decreasing,ll


hence for some k f we must have -k f )

==-<.

- 1,50 2.

To show that * is uomopo


Let

be

any other test of size ~.

We want to ShO'!l'l that g

j/(x) t(x,
Consider!

~ Y(x) t(x, !ll) dx


[t(x,~) - k t(x, 90 )]

!ll) dx

[(x) - lI(x)]

That this integral is


when f

when f

- k f

follows since

>0" then'!t

- k f ~ Os then

1.9 and

rf! ... ~O

=.0" and

ff .. ~O

=:

Expanding this integral we get8

)1 t

l dx

fllt

l!f

but

j*

Therefore;

and

l dx

f O dx
dx

::&.

- k [

r1

j f O dx

~5 f 1

I\~

..,

j2n' a

Reject Ho when

.L

1
O

~ (Xi

'"'

l.

known

~)2
f

1
O

I:"

- -

J2n'a

~
2
~
2
- 2u..
.:::JX.
""
nu..
..
~Xi
.' J.
J.
".L
,

2a'2

== '" ... the size condition

>k
~

or

to dx

IJ.=~> 0

2&2

~X.

is u"mQpO
a

... ~l_

to dx -

~0

dx

Example I i

dx

>

;?

-151 -

(In k) 20' +

or

nlJ:t. 2

To satisfy the size condition, K a Zl-..


If Ho is true,

Remark:

Pr[ aJ,pr
X > zl....<l.
J

~)

X is NCO,

Er,Pl) (which is independent

If the most powerful test of H against


O

H:J..

of

IJ:LJ

=0;"<

= Q1

is independent of

for some family~" then the probability ratio test, *~ is u$m.Pe for

He'

O against~.

Q =: Q

Therefore"
For Hoa

X > zl_..<

<0

and l.J.

>0

uQom~po
,

u~m9P.

It'-

<0

11.:

the uOm0p. test is

j,l.

'> 0",

X< zl-'.

J;

test for this problem since the u.m.p. tests for

differ.
here&
+\

'j'

.'\

Problem 68:

6~t

is u.m()p. for HO against

IJ.. 0 against ~.

There can be no
l.J.

.fn

X< K'

u.m.p, hereg

~4

X> K

--._---

l.J.1

Xl' X2" , Xn are independently distributed with density

f(x.)

= Q e-Qx

x '>

(1)

Find the u"m.p~ test explicitly (i"e., find the distribution of the
test statistic) c>

(2)

Write down the power function in terms of a familiar tabulation.

- 152 Remark:

--

flex)
fo(X)

is required to be a continuous random

variable~

If this ratio is not a continuous random variable, then .J.. (k)

may be discontinuous, so that .t.. (k)

lOt

..< has no solution o

nomal situations where one can't get ..<

lit

.05

Pr(~>~

(eogo small bi-

exactly.)

in this case is defined to be e

=1

flex)
when i'O(x)

>k

=0

flex)
when l' (x.)

<k

I:

when

rflex)
(x)
o

=:

where e is chosen so that the size


of the test comes out as .,(

~(k).
I

.t.(k-ltQ)

'----

(k)
(k .. 0)

Problem 69g

For 1 let 8 be the set where f ~ o.


l
1
For fa let So be the set where fa

(1)

> 0(1

If So and Sl are not coincident then there may be no test of size ..<.

(2). If there is no test of size ..< given by the Neyman-Pearson Lemma

(Theorem 33) there is a test of size

<.,( with

Hypotheses noted thus far' have been simple hypotheses(l


the hypothesis is called a composite hypothesis, eGg. 3
Xl' X2,

eGO,

Xn are.NID

(~, cr2)

If

power
Cy

=1

has more tha~ one point:

13(Q) = Pr[rejeot H]
for HO: I-.t. =: 0 against

I1.1

I-.t.

>0
Testg

..

,,(

- ---

Extension of the u.m(>pe test to oomposite h;y'potheses by means of a "most unfavorable


distribution of

@"'o

w.

Let A (Q) be a distribution of Gover


Then X has a density f (x,
~(x)

i).

~ (x, Q) d :\(iI)

-=

density of X under H plus the


O
additional information.

w
NV

Let Hobet

I1. $

X has density h4: (x)

i~e~1

Let

Th~~

34:

Proog

be the most powerful test of

= Q

the density of X is (x~ Ql)

%against ~.

If rp~ is of size d.. for the original test" it is mOp$ for this test.

Let be any other test of

(x) f(x,
Then

'"

=
Thus

andl

is

~) dx ~ -< for all Ii

J[5 ~

iii

Ha-

(x)

(x) (x, iI) dx

f J(x~

lit

d:\{iI}

iI) d:\(Q}}

~ (x) ~ (G) dx

AA-

of required size for H ~


o

~" (x) f(x,

Q} dx

(x) (x, Q)

dx

dx

~54

ci known
rf' test~
H~:

!f

(;J.

I:

I(. /, ]

Put

reject 1 X ';> zl-<...2:- which was found to be uomop. for

rn

0 against Hl :

(../,> O.

~"CO<

O.

for(../,

We know that Pr [rejecting H with

A(Q) = 0

'3 <0

-1

Q>O
,/

(ioe&$ concentrate the distribution at a which is the worst spot from


the standpoint of testing or distinguishing)
';fhis makes our composite distributioni

~ (x)"

( :rex.

~) d},(Q)

..

:rex,

0)

so that our problem is back to HO and we have for HO a u.mop" test which
is also a test of the original HO'
One way to get an optimum test in the absense of a u"mop" test is to restrict the
---------class of tests to be considered and look for probabilH,y ratio test for restricted
class.

Such restrictions area

1<)

..Defo

42~

2/j1

Def. 43g

Unbiasedness

rJ

is an unbiased test if l3 (Q) ~

co<

for all G e ~

Similarity
is a similar test if Eg (

rJ)

=..< for all! e

Wo

3e Invarianoe
X has density f{x, ~)
G :=, a family of transformations of the sample space of X onto itself\)
eo go a g(x)

=:

cx.

= x + d

change of scale
translation of axes

Let g(X) have density f [ x, g(Q) ]

g is

g(Q)

JL

a transformation of the parameter space induced by the transfor~


mation of the sample space.

- 155 if X is N(~" (l)

g(~.f (12)
if X is

lit

OJ,.L,,

N(~, 0'2)

g(~,\1 (12)

X ... d is

lit

~ e wand H.J.I ~

IT we have HOi

oX is N(0 j,J."
2
0 0'2

~+

2
0'2 )

N(~ of!- d, (12)

ci

d"

an. w

then a transformation in-

duced by g leaves the problem invariant i f


i(Q}

8: (,.J

i( i)

8:

8: UJ

n... _

~ when

0'2)

Example:

when Q

H08

Q oS

~.

n .. w

g(X) .. oX

>

g(j,.L1 (12) .. oj,J., 020'2

W is the (12 axis

-----+-----1-/.

under H g(X) is N(O, 0 20'2)


O
2
H g(X) is N(~, 0 0'2)
l

Def.44t

whioh doesn't change

eu

A test, ,s(x)" is invariant under a transformation g (for whieh the


oorrespo'nding

Example t

t.

g leaves

the problem invariant) if W[g(x)]

l~
n
-"i X.
J.

=:.

81m (] (Xi .. f.)2)~2


.

lit

~(x)

l~

(n-l) n

(under g

ii ~cXi
ex) ---"':::-_~r-J"I\'

(~~i_Iq.)~\1/2.

(n-l) n )

ii~:'X'

lit

~(Xj-f)2J 2
(n-l.) n

Def* 45:

If among all unbiased (or similar or invariant) tests there is a

which is uom.pol then ~ is u.m.p9u. (or u.m~poso or u.m.p.i.).

rjf

-156 Eniformly Most Powerful Unbiased (uom.pou,) Tests:


HOi Q

Single parameter Q

"f" is differentiable with respect to


Remarks

If

=:

which is inside an open interval


O of the parameter space.

~G

is unbiased, then
=:

~~, (iO) 6.,(

Proof:

~ .(~)

~_ (Q)
~

>-t

Hence

for

~ ~O

(Q) has a minimum at ~

~_ (x)

(x, Q)

d~

(Q) ..

'aQ

dx

Ill:

G .... Q
O

tor unbiase~~-j~La..l:t~.!'-1?:~~~yes._mu~!:?.~.~:w.Q~~!c.!ed (otherwise the power curve


has no minimum 0
"

Assuming that

He:

=:

(Q) is differentiable with respect to Q

and henoe

Notes

"Q., 9

.Theorem 35:

5f(x, Q)
11.$

dx is differentiable under the integral sign, and with

@l (two-sided)

If there exist
(

I:

I:

~,

we can get.

so that

1 when f(x, Ql) >kl f(x" i o) + k 2

~~

"I
Q

=: ~O

0 elsewhere

is of size ..,{for H and unbiased, then it is mope for alternatives


O
Q e If the test does not depend on G it is uom.po u. for \ U i 6 GJ.
l
1
ProofS

(refS

Cramer p.532)
Let be any other unbiased testo
Then

Sfu fa dx ='" =
J'(w: ~"' Q

'='

90

)_ fa
dx =

dx

a=

J'(- ~ I.

dx -

= Qo

Bi~ conditions

unbias~dness

conditJ.ons

- 157 -

)Cf -lI)

f 1 dx

Cf - ) [f1 -

This integrand must always be ~


since if f 1 -

kJ.

~~ f 1 dx ~

We want to show that.

f 0 - k2
.

~i

k:J.

1 dx

~ IQ ~ QJdx~O

f O - k2

i =

~0

~O

1.

lit

#}

>,

o=f~ti

Thus the desired relationship always, holds o


Comment:

It: you are trying to find abounded function" a ~

f dx subject to side conditionsJ

~ b which max~mizes

f i dx = c

i = 1,,2, "",9 n;

then this maximum will be given by choosing,


where k. are chosen to
J.
satisfy the n side conditil

=a

otherwise

Exampleg

Consider a particular alternative

,
!.

~,

known

lJ:Le

~ (Xi~)2
2 -

2a

- 1.58 -

k'a, the u.m.p.u. test

If we can find proper

is~

Reject H i f .

..
(derivative evaluated at

~x.

~.

J.
--r
cr

nlJ:L

. -

~ k +
r

n~

-:r-!
e

n~
.......

kJ.

: 0)

2a

,-

where,

+. k 2 x

k'

-""1

=:

2a

n ~2

k'

e
2 =T
a

2
2a

If we set Yl = the left hand side of this inequality" Y2 = the right hand

and restrict ourselves to the cases where


ing type of

'1.. ~ 0;

then we could get the

graph~

side~

follo~

~l

\
J

- . - --.+
I

I
I
I
I","

'"

___-==-_......:"'

..,-__-l-

'"

It

,It

(Y2" Y2, Y2 "Y2

are possible Y lines)


2
The test says to reject H i f < a or > b where a may be -

b may be +
.

41'

II

Iff

Y2 gives a two-tailed test (finite a" b) ..- Y2' Y2 "Y2


tests (only one interseotion with Yl) ]

CD
00

all give one-tailed

-159 Requiring that the test be of size


Pr[a<i<bJ

0<

specifies thatll

... l.-.,{

rn

i.e."

.--'.--

,r2n 0'

me

20'2

di =1. ..

..<

Also, the unbiasedness condition requires that:

d (

dPl

'-

orl

d~

rn
f2io;
1. ..

e"

or

,f2na

_rn-

-{2ncr

2ci

..
e

):
00

dX+

rn
$0'

-co

- rn

n(i - lJ.:L)2

n(x - lJ.:L)2

2ci

..

-0

dx

'2=0

n(x _ ~)2

.. - 2(i

lJ.:L

..2
nx

2r:i

=0

dX

..

nx

ill

di

[2]

==0

It

[2 ]

The function under the integral in


is an odd function" so that the integral
is zero only if a == - b (i.e", if a, bare symetrical).

[1]

..

thus become sg

so that 111e can determine a asg

1.60 Thus the test is:

1x

Reject Hii'

>

and since it is independent of

IJ.r

it is U1>m.p"u.

Problem 70.

Find the

u~m.p.uo

note~
Theorem 36e

~ (Xi
If

test

test for H
O
-

~)2

2
is sufficient for cr

is sufficient for ~ then given any test (~) there exists a

'Y (~O with the same power function.

optimum tests, only functions of

Proof:

define
Y(:) = E
by definition
Thus 8

[. (o!)

'. need

( TJ

Hence in looking for

be considered.

which is independent of Q

- 161.

-Invariant Tests
Refs:

We have8

Lehmann" "Notes on Testing Hypotheses"" ch o 4


Fraser" "Non Parametric Methods in Statistics", ch. 2
observations.

Transformations:

parameter.

...

g(x)

density&

has density f

._ X t

[!.,

f(~

Q)

g (Q)]

ioeo, there is an induced transformation on the parameter space:

the group of transformations that leave the problem invariant, i.e.

Gg

Examplee

-g (Q)

$,

Wo

if Q e

""g (f})

.~

if

~ 8

~l

2
Xl' X2, " ... "', In are NlD (.I., (J )
t

=OX

g(x)

:0

+ d

CX 0\+-

g (.I.,

11.~

If

we set d ... 0 and c


X

...

> 0,

(C(.l.

+ d, c 2(J2)

:(0

(.I.

is NID

Xl

(J2)

oft-

d, c 2(J2)

~ >0

then the problem is invariant, and we have.

g(x) ... ex

Sufficient statistics for

~,

..
'"
cr2
are
X and S ... ~(X.

X)2

J.

Under the transformation:

t a - is invariant (the inclusion of constants does

rs

not affect invariance).

(discussion to be continued after def.


Def. 46t

46 and theorem 370)

m(x) is a maximal invariant function under a group of transformations if


1) m[g(x}]

=m(x)

2) m(xl )'" m(x 2 )

>

there is a g such that g(x2 ) = Xl of vice verB

- 162 Theorem

37~

The necessary and sufficient condition that the test function (~)
be invariant under G is that it depend only on m<.~).

Proof~ 1)

(2)

Y(m(~) ]

I:

rJ [g(~)]

'ft ~(!)] 1 = "r[m(~.> J


m

Suppose that (!) is invariant,

2)

= (x)
,

We have to show that (?E.) is

a function of m<.~~).
Given [g(~)

show m(x ) ~ m(x )


1
2
m(x1 ) = m(x 2 )

(!)

>

> for some

(~) == [g t (x2 )

thusg

(~)

:: (x2 )
t

g, call it g , g (x 2 ) == xl
I:'

(x 2 ) which is what we set out

to prove.
Returning to the IIIStudent D problema
It remains to show that t

... i

JS

is maximal invariant

(l

""I

Setting t :. - "

rs

==.1.

:c.L.-

J'Si

we have to show that given t

=t II

we can find

_)

g such that X , S : g(X i S

-,
Consider..!.. and call this ratio

"a""
t
2
or S = a S

But this is just one of the members of the original family of transformations so
t is maximal invariant.
Hence the problem is reduced to finding the u.mGp. test based on t.
In summary.

Xl' X2.9
H
3
'0

eo.,

X are NID (~.?


n

/lk=0

Cf )

- 1631) Reduce by using sufficient statistics,


2)

X,

SW

Impose the invariance conditions under the transformation x'


This reduces the problem to tests based on t.

3) Find the distribution of t under


Ratio Test).
-

fb

at

CX, C

:> 0.

and H and apply Theorem 33 (Probabilit:.


1

-------------------------------------------.
Distribution of Non-Central t{t-distribution under H)a
Refs~

.'C 5

Neyman -Ii- To:tmrska, jASA 1936, pp. 318-326 (tables for the one-sided case)
Welcl?- + Johnson; Biometrika, 1940, PP. 362-389
Resnikoff + Lieberman, "Tables of the Non-Central t-distribution ll , Stanford
University Press, 1957
the- non-central t-variable, is defined bya

z is N(O, 1) .

where I

d.f~

w is X with f

8 is a constant

>

The usual t-variab1e is

='C

CO,

= rn

f)

i
CT

-rhE
2

(j

when H is true
o
If ~ is true, lJ.

lit

11.

and

Jil (X-/.L:1.)
(]

. is N(O" 1)

(n-1)

-164 The joint density of z and w is,

2'

-1

'2

Q)

<.Z

<,co

w ~O

To get the density of

't"

(the non-central t) we make the transformatiom

't"

= _e +8

rr

Thuss

1 /fU 't"

co

~
o
To get the distribution of the usual t, put

=0

r rr

8)2

!:! _ ~
u 2 e

O.

The integral in f ('t" ) become s

;1 - \

~( l +1)

du

which can be readily evaluated recalling that

e-ax xb-1 dx

o
Thus,
ref:

1 +t

Cramer, p.238

du

- 165-

EO

The u.m.p.i. test is to reject

Note.

if &

This ratio is a monotone increasing function of 't'. (The simplest


proof of the required monotonisity is given by Kruskal" Annals of Math
Stat" 1954, pp. 162-30)
R('t')
ratio
kl-------~

Since t >t o when the above ratio> k" the probability ratio test thus reduces to:
Reject Ho when t

>

to

Final result is that the m.p.io test for Hli

Reject Ho when t
This is independent of
Example.

'1.

fl: >

=~ >

~o

is.

t 1-< ("..1)

and hence is u.. m.p.i. for

(~"

Xl" X2,

000"

Xmare NID

Yl" Y2"

000,

Yn are NID (~2'

EO

against Hlo

el)
2
(J

i~ u

Sufficient statistics for the three parameters are:

sp

==

~ (Xi

- X)2

m+n...2

+ ~ (Yi - 'Y)2

V\

\;"-0

\l\

_ 166 ..
Consider.

N( a

X'=a.X+b

Xt i

yt : oY + d

yt is N(c

+ b, a 2a2)

~2 + d, c 2a 2 )

Invariance requires that,


a~~b=o~+d

when

=- j.L.2 for all j.L.

a~ +b>o ~2 +d

b dJ

The first line requires that.

requirementthat~

The seoond line adds the


Thereforec

a=-o

Xi' =- aX + b, yt == aY

oft

b, a ')P 0

leaves the problem invariant.

To be provent
is a maximal invariant statistic.
t =- t

t===} there exists an a, b such that

X'= aX + b" yt

=- aY

-11

x. - "

=-

,?

define:

o(x .. i)
now let XI

or~

==

j( t

..

!I

=d

eX

o(x ... Y) =

ef ...

.. ei

il

'if =- of

Xf
T~e

..

=:

l:<o

d ..

d -

+ d

eX ...

so that the same transformation has been applied to


'X and Yo

u.m.pe test invariant under the family of transformations,


Xt =- aX ... b,

iSI

Rejeet II if

Y': aY + b"

"> 0

-161 ~

Under the alternative

- 1J.2

I:

d" thus

(!..:-Xcr - d)
s

cr

"'"

JH
m. .
T

!!

cr

1
Ii

has a non-central t distribution with parameters [

crJl:+!
m n

.
Exercise.

.( == 0.05
-cr = 0.8
Find m" n required to have the power

'?

,,90 (put m = n)~

Try to find m, n also by normal approximation.


dof 0

Answer:

non-centrality parameter =p

2(n-l)

=0

-30n
28
27

25
For the power to exceed e90" from the Neyman-Tokarska tables we need:

= 30
d.,f o = 00

p == 2.99 at dGf o
P ., 2.93 at

Thus, by rough interpolation, the minimum size required is n == m 28.


From the normal approximation,
Problem 7lg

a)

crl

= 26.7

or 27

Xl' X2,
.
Y
,
Yl~ 2
HO~

0"00,

"Ill""
2

X are NJD (~, cr )


m
1

Yn are NJD (1J.2" cr 2 )


2

== cr2

1\$

crl

>- cr22

b)
c)

Find the group of transformations leaving H invariant.


O
Find sufficient statistics for the parameterso
Find the maximum invariant function.

d)

Find the u.m.p.i. test.

-168 ..
e) Find the u~m~p.u.i. test (ioe., set down conditions to get the rejection
2
2
region as in problem 70) forI HO' cr1 = r:J2 against H~U (J12 rJ cr 2
2
f) Show that the usual test which is to reject if F '> Fl _..<. (m-l, n-l) where:
s 2

1:0

~ (Xi"

x)2

m-l

~ (Y ...

z 2
y

.~

!)

n-l

1=

is a test of size 2..< with lIequaltail probabilities""


g)

= 0.05.

Plot the power of the test of (r) for m = n = 10 with 2..<.


2 2
(Include the points cr1 /r:J2 = 0.5, 0.8, 1.25, 1.5s 2~0)~

Tests for Variances:

~ (Xi ..

~) IJ. known

Reject

Ba

it

~ (Xi ~ 1J.)2

Reject Ho if ~ (Xi - f!.L) 2


ref: problem 70

2) IJ. unknown

is sufficient for

/.1-)2

-- u"mopc> for HO against

)t K

< IS.

~(X.

or
-

>K2

xl

J._

.,

cr

n-l

--

H:t.

uemopo u<) for HO against

are sufficient for IJ.,

H~.

cr

Problem is invariant under translation, i.e


XI = X

-It

To find a maximum invariant function


f(X', s,2)

= f(X

+ a, s2) = f(X, s2) must hold for all a.

This says that f(X, s2) is independent of


tions of i only.

X.

Thus invariant functions are func-

Hence, since (n-l~ s is ,,2 with (n-l) d~f. when H is true, the problem .is
O
(j
exactly that of (1) with the

d~fo

reduced by 1

(i~eo,

n-l vice n).

-169No~alTests:

Summary of

Xl~ X2'

oe", Xm are NID

XiS" Y's are independent

Yp Y m ... ~ Y are NID


n
2
Altel'native
Hypothesis
1.

Ha :

HI:

~ )0

&It

Test:

(~

Reject H if

Classification
Invariant under
(f or H against the transformations
given alternati va)
of the fom

known)

x )zl

O"x

-..<

rm
-

O"x

2 fl

"

I-I )Zl_~ 1m
-

Hi: ~

:f

HO: ~x

= ""0 ( O"~

HI: ""x

> ""0

t ) t 1_..<

H'I"" ""x

I: ""0

It

IX

= i0

3. Ha : 0"2x
HI: 0"2x
2

Hr: O"x
1

unknown )
Xf = cX

I > tl~2

e)

Xf = eX

e arbitrary

("" known )
2

~ (Xi .. ",,) 2

~ (Xi ;.. ",,) < Kl or

>' 0"0

F 0"0

2 (m)
> X1-..<

--:1

u.m.p.

>K2

(for equations for K1' K2


see problem 7a)

4e

HO: 0"

= 0"02

HI:

O"~ )

HI:

O"~ F O"g

og

("" unknown)
~

- 2
(Xi - X)

~ (Xi

2
> X1-..<
(m-l)

>

- j{)2( Kl or 1<2
(for K , K see problem
2
l
70 - use m-l d.f.)

xr = X + a

Xt = X

-I-

-170Alternative
Hypothesis

Test:

Re ject H if

01assification Invariant under


(for H against the transformations
given a1ternati va)
of the form

H : lJ.x :: lJ.y
O

x' = X + a
I'=I+a
x' = X + a
I' :: Y + a

x,

a.

'f, s~

m+n-2

are sufficient for j.Lx'~~O'

x-f

x' = aX
= aY

yl

sp/ ! .
m+ !
n

I X - y I > tl_~(m+n-2)
2
s p(lii-r"ii
J!":T
ix r iy

X, f, s~J

+b
+ b

Xl :: aX + b
It :: aY + b

s; are sufficient statistics far IJ.x '

~, ~,

0';.

This is the classical Fisher-Behrens Prob1em--no exact test


is known. The approximations thus far have tried to keep the
size of the test under control, and very little attention has
been paid to the power. The approximation used is:

X-I

s2
2:. + _l

t' is approximately distributed as t with


modified daf .. (i.e., modifications by:
Smith-Satterthwaite, Cochran-Oox, and
Dixon-Massey)

Ref:

Anderson and Bancroft, p. 80.

Tables far t I in the one-sided case (-< = 0.05, 0 ..01) have been
given by Aspen in Biometrika, 1949.

-171If m = n

X-I

= -:;:::::::::;;;:
+ s2

ix
I
V

X-I

Empirical results:
based on 1000 samples"

= n = 15

cri = 1"

cr~ = 4 :

significance level
actual rejections

5%

1%
4.9% 1.1%

rejections based pn 28 d.!.


significance level
m=n=5

actual rejections

5%

1%
6.4% 1.8%

rejections based on 8 d.f.

In re power in the Fisher-Behrens problem:


Ref:

Gronow; Biometrika, 1951" pp. 252-256

He gives a note on the power of the. U and the t tests for


the 2 sample problems with unequal variances e

cri

vfith n l f n "
f cr~, the U test stays fairly close to ..(
2
in size, whereas the t-test jumps wildly ani a comparison
of the power becomes very difficult.

-172Alternative
Hypothesis

Test:

H'cl=i
O' x
y

Hl :

2
x

(j

>

Reject H if

Classification
Invariant under
(for H against the transformations
g!!~m alternative)
of the form

known)

2
y

XI = aX + b
yl = (a)Y+c

(j

XI = aX .,.. b
yl = (:ta)Y+c

vJhere Fl' F are chosen to satisfy the unbiasedness


2
conditions as in problem 71.
Since Fl' F depend on complicated equations, we usually use
2
the test:
Reject HO if max

~2

2 )

1-, 1

sx

>F l~

(m,n or n,m)

d.f. depending on which term is in the numerator.

8"

(lJ.x '

~y

unknown )

u ..m.p.i.

XI = aX + b
Yt

u.m.p.u,.i.

= (ta)Y +c

XI = aX + b
yl = (ta)Y+c

Same comments on the determination of F and F apply as


l
2
in No.7" and the same alternative approach is usually
taken (with the appr'opriate modification of d.f.)

- 1733..

Maximum Likelihood Ratio Tests.

Ha:

~:

Q 8

n-

max f(!I Q)
Q 8 Wa
11.= max f(~ gj

8..0..

Maximum Likelihood Ratio Test is to reject H if A <: A where A is chosen to

the size conditions.

For small samples in general this procedure may not give reasonable results.

satisf~

For

large samples, as with the mel.e., results are fairly good\t


Remarkc

Under suitable regularity conditions -2 ln A is asymptotically distributed


as x2 The degrees of freedom depend upon the number of parameters speci-

fied by the hypothesis, ice", if Q has m components in nand k component


in W then the dof" in the asymptotic distribution of -2 lnA. (under H )
a
o
are (m - k).
See S. S. Wilksi

Proof:

Annals of Math Stat, 1938" or "Hathematical

Statistics" II Princeton University Press


Wald has proven that the ffieloro test is asymptotically most powerful. or asymptoticall
most powerful unbiased testG
A~

Refi

Example
of
...

m,l.r~

Walda

Annals of Math Stat, 1943Transactions of American Math Society" 1943


(both papers in his collected papers)

tests

2
Xl" X2 " ou" ~ are N(~" 0.)

HaS

=a

~Fa

R.t.8

~ (Xi

2
20

- ,,)2

unknown

-174 ~

~X.

When H is true"

J.

is the m.loe. of a "

X"

And in general"

n~X2

----~X
i

- ~ (Xi" X)2
An: ~X2
2

n~

(X._X)2
_ _...;;J.
_

2~ (X -X)2
i

(n-I) s2

(n-l ) S 2

-it

nx-.2

.if

.-n2

.
:::

h(~) :::
of Ae

+~n:r
n )'=2s

J n-l

(A. -2!n _ 1)1/2

can be used if it is a monotone function


/

- 175 -

(~)

hf(A) ==-[n-l
h(~)

Therefore

(X 2/ n _1r l / 2 i\..f/n)-l

(~) <

0
~o

is a monotone decreasing function of

h('iI.)

</\o
h(:\) > h
o
ltl >t o

Reject H whens
O

Problem 72& Find the m,l or e test for H :

against

lJ,=O

1\:

I-L

> O.

Does it reduce to the u.m.p.i. test?


Problem 73:

Xl" X2,

Hog

X are NID(ft./., (i)

0>01

(j

-::: C

lJ,

lJ,J

(specified)

Find a uem.po io test for HO against


Perform the test where X :: 5J

4.

General Linear

(j

HIg

S :::

== c 1

ci unknown
'>

Co

~.

12..

00

=:

1, n

II;';

20

00

~" N

Hypothesis~

assumptions8

X,1. are NID(lJ""


1.
P

lJ" ==
3.

j=l

a ..
3.J

i = 1.1/ 2"

~,
J

~1"

$.0,

~p

are unknown parameters

p ~N

rank A ::: (a

ij

) is p

k = 1,,: 2" eu" s


s ~ p
i.eo" s linearly independent
equations in the parameters.

aBs" bes known


Alternative formulationg
We may solve for
assumptions:

~l"

13 2"

ClU"

13 p in terms of p of the l,J>is and then get as the

i==l

".ei lJ.i

=t

.f, ::.

1" 2"

u.

N..,p

HO can be written as an additional set of s equations in the lJ.'SI


N

HOtJ

i=l

C1'...
'10.

~,3.

= 0

k -1" 2"

GU',

- 176 ..

Example;
i.e.~

the 2-factor, no interaction modeJ

=- 1" 2,

eo .t

ab observations

'"'1." ~.,

Parameters are iJI,

~a-l' ~l" ... , (3b-l

0<.

j=l

parameters in all

are the a..l equations in the parameters

It:

b -

a-l .

Lemma:

a. jY' (i = 1., 2" ..


1. J

fl,

m) are m linearly independent equations in

n unknowns (m .c: n) then there exists an equivalent set of equations with


matrix C which is orthogonal" i.e.
n

~ c~. ::

j=l

ProofS

for i

l.J

= 1,

2,

'Ref a 11ann, Analysis and Design of Experiments.


given m equations in n unknowns=

..
"

CI

o~.,

.. 1.7'7 ..

The first row in the equivalent set is determined by settingt

To get the second row:


t

c 2j ::: a 2j ..

~lj

~clc;.
e~cla2'
j
J J
j
J J

..

A~ci,
j
J

This = 0 (as required) if A. :

~clja2'
j

which will hold i f the denominator ,. 0 _. but by


virtue of the independence of the original equations the equality can not holdo
For the third row:

~Cl,c3'
, =~a3,cl'
j
J J
j
J J
This = 0 i f

Similarly

A:t

=.

..

A:L (1) - ~(O)

~a3 .c l
j

'
J

.~ =~a3jC2j
J

Finally we set:

The completion of the proof follows readily by induction.

--------

~~-

-1-78 hypothesis~

In the alternative formulation of the general linear


N-p+a (~ N) linearly independent equations,

we had the followin

i=:~

A. ti

~i

Ill:

From the lemma we may assume these form the first N-p+s rows of an orthogonal matrix,
These can be extended to form a complete orthogonal matrix (NxN) -- this is a well i
known matrix algebr~lemma,\l proof is in Cramer or Mann. If we call the complete
'
orthogonal matrix ./ '- , we have

.0.

~l

..
II

"N.p"

III

Pu

.0.
o

"

":w

~-p,
Pl N

'A,ts-"

P Vs orthogonalizE

't'ts added to complete


the matrix.

..
0

C)

fJ

PaN

Psl
III

'l;lN

"

<l>

'l;

p-s, N

E(Y,e)

= 1.=1
~ 4 Di ~
'CI

by the assumptions

=0

.0

if H is true
O

ci

yts are independent normal variables with means as shown and variances
(an orthogonal transformation changes NID variables into other NID variables with the same
varia.nce s) "

-179 This is the canonical form of the general linear hypothesis.


2

Yi are NID (1,.(.1" a)


jJ..

].

H:t

where lJ.i

at:

=- 0
I

one or more of these s IJ.i

"

S .;.

MeL.R" Test for the canonical form:

=- -

( {2i1 a)N

m.loe. of the IJ. 's in ware found by setting IJ..]. = Y.]. (i=N-p+s+1"
N-p+s

and~2=~

1=1

A =

u."

N)

y. 2 "N
].

8~1

(~

e - N/2
e - N/2

N-p

~y.2

i=l ].
N-p+s
~y2
i=~ i

Qr

N-p+s
y. 2
== Q + ]
a
i=N-p+~].

'A,N

Q
a

CJ;

=(~r

= absolute

minimum (nothing
specified about the hypothes
r = relative minimum

-180 -

Qr - ~a

- 'l:9
1'1

A. -1=

X (s)

N;f

=~

y.

i=1

has the F-distribution with parameters

Therefore

~d.2

N.;fItS

Recall that

is

J.

i~-~l

>

(S.ll

N-p)

J.

y.2 is the sum of squares of variables with non-zero means" and


J.

hence has a non-central

x2..distribution

with parameters (s,

i=1

d. 2 }e
J.

(Q - Q )/s
The ratio of a non-central X2 to a central X2 ia a non-central F ~ thus ~r_,...-a
__
Qa P
s
under H.. has a non-central F-distribution with parameters (a, N-p.t ~ d. 2 ).
--~
i=l J.

7N-

Thus the problem is solved in terms of the yts (the orthogonalized XiS).
The original problem was:
XiS are NID

..

1)

i=l

)..0'
",J.

(~i' a2 )

~i

= 0

2)

i=l

Pki jJ.i = 0

.e = 1" 2"

Q $

N-p

- 181-

HeLoR. test:

defineS

Q;
Q;
~

=:

min

~ (Xi

=:

min

(in .f).)

G (in

w)

~)2

under restrictions (1)

(Xi - lJ.i)2
=:

under restrictions (1) and (2)

(Q~)lJ2

~ (Q r )1/2
r

')/2e-N/2'
.

"
a
"" =n'
-

""r

-Nl ~

As in the other case, this is a monotone decreasing function of

Q - Q

r,

Qa
Recall that under an orthogonal transformation, sums of squares are preserved.
Thus:

Q is carried into Q

Q is carried into Q

Hence:
and has the same distributions under H (F-distributioj
and under
,Problem

74:

11.

(non-central F).

X. 'k is N(j.l.", cl)


~J

j.l.

~J

=~

=~,

2,
= 1.. 2,

i
j

~J

0.0,9

a
b

+ ~i + ~. + (~)i'
J
J

all ~.

Find the usual F-test for 1)

H
O

2)

= OJ

all (-<13)

~J

=:

- 182-

Power of the ANOVA test:


Power of the ANOVA test has been tabled by Tang (Statistical Research 11emoirsl Volt\> 2
for ~ = 0005, ~ = 0.01 for various values,of.

Tables ing

lit

2~JJi2

s+.

Mann
Kempthorne

s
2

aO' A

==

d.

k=l

i~

YN-p*k =

i==l

P ki Xi

~ ~

==

Q-Q

11,=

--,-

=~y'
k=l

(N

=~]

N-p+k

k=l

i=l

hence 20'2", is Q - Q with X. replaced by E(X )


r
a
1
i

I:

IrA.

Example:

or

X" k =
1J

f,J.

13J

+ -<i + ..<. ""


1

]13. ""
j

-I;

(0<[3) 1J

e"

-I;

1J k

~
(0<[3) ..
i
1J

..<. "" 0

"" 0

X eo. l
under

~,

-<i not all zero

Qr .. Qa ==

~(~13) ..

E(l.1 ., ) = ~ + -<.1
E(l

C'

) ""

E(l.1.. -

+ 0 + 0

X.... ) "" -<.1

1J

=0

- 183 a
hence

~ ~

20'2",:::

thus

Exercise:

i=l jl:~ k=l

..{.

nb

1.

.,(. 2
i=l 1.

2nb .::J~i 2]1/2

I:

[ 20"2

If a

3" b = 5 how large should n be so that we can detect

I:

with probability .75 (~ ::: 0 0 05).


Try n.. calculate , enter the tables and find the power. After successive
trials n will be obtained to give the required power. (daf for the numerato
:0 2,; for the denominator::: ab(n-l) ::. 15{n-l).
Verify that n ::: 13.
0

Randomized blocks;
i .. I" 2"
j = 1" 2"
~..

J.J

~ +

J..

oft

.. e ...

"0.''

b.

(this is the additional assumption of randomized


blocks)
or

X,.
J.J

I:'.~

where

+ "<i + b. + e ..
J
J.J

ij
j

is

N(O, 0'2)

is

N(O, O'b 2 )

and they are independent.


some

-1...1.

rj.

~b,

Given b."
J

If H is true"

X.1..

ib

are N(~ + ..(. +


1.

X.

1..

is

N(~ ' ..

i=l J

II>

2
0' )

""0

0'2)" and

~ ~ {f. - X

1=. 'J=
1.'11....

a
)2

~ (X. _ X )2
'11.
1.=

2
has a X -distribution with a-1 dGf. This conditional (on the b,) distribution does
not involve the b j " therefore the unconditional distribution of J
a

~ ~ (X, - X

i=l j=~

1.

)2 is

x2 (a_l).

..

...

-184 ...

In general" given b.,


J

~ ~ (XiJ.
i

- Xi....

X.

-J

2
is X (a-l)(b-l).

)2

Again we have the same result for the unconditional distribution.


Hence a test of

He

is based on the statisticB

ft

.>y'a

(ili. - il

~ (X'j~x-.-'"';"'--x-.-It-X-)2-~~(a---~)-(b-"-l)

[b x.1"

is N(IJ.'

X -distribuliion with parameter b

+$ ..<.,
1
~c(i2.9

.~~'

.J

1.

which has the F-distribution with


If H is false,

-1

(a-l)" (a-l)(b-l)

dof.

~ ~(X1.... XO'~ )2

ri)e

has a non-central

dot' ~

a-l~

The Power is also calculated as i f the b t S were fixed parameterso .


Random model:

(Heirarchical Classification or Nested Sampling)

00.,

i =~J1 2,
j = 1.~ 2" ..

where the a

u,

a
n

are N(O, ?"a ).

ANOVA Table

~
aSs
error

a-I
a(n.,l)

E(MS)

SaiS.

~ ~(X ....!

)2

~ (X i {'X i )2

1.

CI<ll

+na

has a x2.distribution with a(n-I) d.f.

-18$ Since this is conditional on the ai' but independent of


them" it is also the unconditional distribution"
....
e~g.,

unconditional distribution.

in terms of moment generating functions

j.tb

~ (X. - X

i=l

J. 1t

)2

is ,,2 with a-l d.f.

Qa

a2 itnaa 2

He;

eTa

a leads

...

to the usual F-test against

~g

--:2.
a

+na
a

has an F-distribution with [(a-I), a(n-l)l d 0 f.


Eroblem 75~

Xijk

1ll:LJ

a)

lJ.ij ... l..l. ... ..<..J.

b)

I.kij

=:0

,i)

is N(l';.j'

l..l.

-I;

it

b ..

J.J

a. + b ..
J.

ij

J.J

..,

= 1,
= 1.,
= 1.,

a.
2,
b
2, .,
2, 0 , n
k
where b
are N(O, CT 2 )
ij
b
2
where b
are N(O, a )
i
j

a.

J.

fj.~,

are N(O,

CT

2)

and the a's and b is are independent.


For n

= 21

= 2;
1.)

2)

a =

5;

: 0.05,

in (a) against

in (b) against

plot the power of the tests:

eT2 +
aa

2~ 2 2

-186
~

MUltiple Comparisonsa
Tests of all contrasts with the general linear hypothesis model following ANOVAI!>
Tukeyts Procedures
observations:
let R =- max
~

se

Xl' X2 ,

i Xi ~
~

'5 i

is an estimate of

eo.,

Xn are NID(O,

- min {Xi}
n
1. ~ i ~n
2 which is independent of Xl' X ,
2

qU,

Xn with

f d.f.
(iQe., f s 2;02 has a X2.distribution with f d.fQ)
e
The distribution of Ria can be found in a general form" and in particular when the
X9s are normal (since the X.la are N(O, 1) -- it is free of any parameters.
:L

The distribution of Ri

RI!a

= !..
s e "'" s e

::0

Studentized Range

has been found by

numerical integration and percentage points have been tabulated by Hartley and Pearsor
in Biometrika Tables, Table 29Q
Consider the one..way ANOVA

situation~

Xij are NID (~i' cr )

i
j

2
!i. are NID (I-l-i , ~-)

I'1SE =Consider HO~

~ ~ (Xij

2<0

Xi )2

a(n,,",l)~-

J.!i::O ~

= 1"

2,

~ ~,

=- 1 ~ 2~ 0". II n i = n

2
is an independent estimate of a with a(n-l)

or IJ.i .., ~ = 0

Given HO:
Pr

This t is the statistic Q (a" f) in Snedecor, section 10.6, po 251


where f =0 d.f () =' a(n-l) in this case.
This inequality will hold for any possible pairwise comparison (i.e lll " any i, k).

d~f.

~87

Frame of reference is the total experiment" not the individual pairwise tests -- is eli,
for the total experiment and making all possible pairwise tests, the probability of
the Type I error is ~..<" where co< is the level chosen for the Q(a"f) statistic.

i.1. - '1<:

Hence we reject H:

I:'

for any i" k when

rn (Xi. - X
{Tm
k)

Q..< (a"f)

Scheffe Test for all contrasts (most general of all).


Ref:

Scheffe, Biometrika, June 1952


2

X.. are NID (lJ.., (1 )


l.J

l.

1, 2,

1" 2,

se 2 is an estimate of (12 which is independent of

with f dof.

~ c.~l.'

i=1

...

i=1

l.

Consider the totality of all such


Theorem 38g

X.l..

0 ....
l.

HO(~).

If the lJ.i are all equal" then

Pr

(a-I). C

a-l
where 0'

()

hence, if' we reject any such H (') when


O
a
2

[ i=l 0i

(Xl.'

----

- X

>}

2
s 2~ 0i
e. i=l

n.l.

then the type I error

~-<.

a-l.1
F"l
(1

r(a-~+! )

r()r{f)

_ a-ltf

F)

dF

- 1.88 Proof:

Under --0
H_

For all c. with ~ c.

~c.X

1. ...

1.

1.

0,

=:0

>

~Pr

.......

~c i =0
Y.1. are NID(O, 1)

define:

i=l, 2, ".:1 a

d.

1.

tit

~o.1. =~.r;-:'J u i d.1. =

(]

d. Y)2

Expre8$ion we wish to maximize is

1(,2
1.

i_

with respect to d

=- 1" 2" ... " a

r~ rn:J y.J~di
Y~li=
J
~n.J

multiply

[lJ by dj

since

~ (ltd.
1. 1.

:=

and sum over j:

2[;EdlJ [;EdiYiJ - 0 - 2~(l) - 0


~ - [~diYi] 2

since: i)

~ v:'d.
J. J.

ii) ]d. 2
J.

=- 0
=:0

- J.89 put

":t"

~2 in [1] to obtain the jth equation:

2Yj

[~d1YiJ ~ Tn;~Jii',.~J[~diyJ -2dj~d1yJ ~O


2

nj

= ~

(Y . - i)
J

j=l

Reoall:

Pr

.,
J.=..

c.J. !.J.. ]

t
~C'!i
J.
,

:il:. Pr

>t

~ n. Y.
_ _...J.---.;J.;;;.._

~ni

:=Pr
6e

Y.J.

22

and are NID(O" 1)0

(]

>t

... 190 ..,


We can make an orthogonal transformation of the formi

Then

has a x2..distribution with (a-l) dof ..

Hence

>a.:rt

Thus

has an F-distribution with (a-l, f) d.f.

but

Therefore the type I error will be less than or equal to ..< if we reject any H (!!) ifa
O

L~Oi 1 '>
2

Xio

se

2~ci

~-

ni

(a-l) Fl _-< (a-l" f)

191 $"

~.Tests
Simple hypothesis

Single series of trialsi


events)~

classes (or

pr6babilities~

observations:

l'J.

P2

vI

"

Pk

~P.J

~v j

=1

=n

If H is true

X.

is asymptoticall1 normal subject to the restricti..

lOt

By an orthogonal transformation it can be shewn that

(
0,
"'Iv. -flP.'/

~. -1L.~ 6~
j=1
l1P
j

--71

The power of the test

ha.s an asymptotic X2.distribution with k-l


d"f. as n --7 (!) 0

as n--7oo"

~ower of a simple x2-test:

p. = p.
J

v. - np. o

V ...

np.

=_L.-...:2
X. =- -L-_:.L
.r::-Or:;::'J
p
, np
'I n

$ j
p.

now e

-:L
0
Pj

p.

1. + -J

Pj

to>

p. 0
J

as
Under HI the Yj are asymptotically normal!)

n~oo.

- 192 -

~ Xj 2

With the same type of orthogonal transformation, we have

under

H:L

is a sum

J~~_ and henea has a llOn.-oantral x200distribntion with parameters

of squares of Yj +

Pj
k

j=l
k

The non-centrality parameter can also be written as

n~

j=l

Example a

v." noo of families with


l.

boys in families of 2 children.

Assume the number of boys is binomially distributed with parameter Pit


H:
O

P .. Oc4

p: 005

p.

No .. boys

Pj ~under

under HO)

.16

~48

.36

A=n

=t

I1.)

,,0816 n

For n = 100
The power can be determined from the Tables of the Non..Central x 2 by E. Fix,
University of California Publications in Statistics, Volume 1, No.2, Pp. 15-19.
From the tables, for

= f =- 2" .,( = 0.05


(power) <: 008

A. ... 8/116" k-1

00 7

~ ~

x2_test may also be a one-sided test:


for j

~kw

for j

> k'

(kt exceed P.O)


J

(k - k' fall below Pjo)

- 193 -

x2.teat

o2
(v, - np, )
caloulate ~ a J - as usual

is modified as follows:

for j .f k':

nPj

for j

>kit

<.

if v

> np,O
J

if v j

x2

if v j

<

np j

np j

put the contribution == 0


put the contribution

=0

calculate x2 as usual

is rejected if the sample value ;> X21.2~(k-l)

Examples

8(10)

12(10)

20

12(10)

8(10)

20

Refs:

if v1I

X~

=0

--

0.5

=:

therefore we fail to
reject H
O

20

20
noteg

Hog

> 10 we would

then calculate x2

(on the X2...test) Z


Cochrane

1952 Annals of Math Stat


1954 Biometrics

x2.test of a composite hypothesis:


classes:
s series of observations:

1
vll
'"
0

probabilities:

2
v12

0
0

k
vlk

.eo

"
v sl

v s2

..

P1I

P12

oQ

"

v sk
Plk

"l)

PSI

P.s2

.,,.

Psk

= 1, 2,

2:1

= 1,

...0",
~

j=l.

Pij

Vi'
J

=:

=:

f(~)

s
k

- 194 -

~
[P;;
j
~J
HO~

Pij

X' j
~

1:0

for i

1" 2" ..." s

= PijO(~)

Theorem 39.

1\

IrQ is estimated by

then as n

~ 00,

m.1.e~ or any asymptotically equivalent method


under the regularity conditions as for estimation:

r: .

1\

""\2

LV i; - ni.Pij (~) I
-~

7t

n-lp, . (9)
~J -

-=-

.l.

has a X2~distribution w~Gh s(k-1)-t d$fo when H is true.


(t =: no~ of components in ~)
0

If H₁ is true:

$$p_{ij} = p_{ij}^0 + \frac{c_{ij}}{\sqrt{n}}\,, \qquad n = \sum_i n_i\,, \qquad \frac{n_i}{n} = \ell_i \;\;(\text{the sampling fractions})\,,$$

then χ² has a non-central χ²-distribution with d.f. as before, and non-centrality parameter

$$\lambda = \boldsymbol{\delta}\,\big[I - B(B'B)^{-1}B'\big]\,\boldsymbol{\delta}'$$

where δ is the 1 × ks vector with components

$$\delta_{ij} = \frac{\sqrt{\ell_i}\;c_{ij}}{\sqrt{p_{ij}^0}}$$

and B is the ks × t matrix with elements

$$b_{ij,\ell} = \sqrt{\frac{\ell_i}{p_{ij}^0}}\;\frac{\partial p_{ij}^0}{\partial\theta_\ell} \qquad i = 1, 2, \ldots, s;\quad j = 1, 2, \ldots, k;\quad \ell = 1, 2, \ldots, t.$$

Ref:  Cramér (Mathematical Methods of Statistics), χ² section
      Mitra, December 1958, Annals of Math Stat
      Mitra, Ph.D. Dissertation, UNC, Institute of Statistics Mimeograph Series No. 142

- 195 -

Note on the general χ² tests of composite hypotheses:

$$X_{ij} = \frac{v_{ij} - n_i p_{ij}}{\sqrt{n_i p_{ij}}}$$

are asymptotically normal subject to the s linear restrictions

$$\sum_{j=1}^{k}\sqrt{p_{ij}}\;X_{ij} = 0 \qquad \text{for all } i = 1, 2, \ldots, s. \tag{1}$$

For sufficiently large n the m.l.e. reduces to solving the equation:

$$\frac{\partial\ln L}{\partial\theta}\bigg|_{\hat\theta} = 0 = \frac{\partial\ln L}{\partial\theta}\bigg|_{\theta_0} + \left(\frac{\partial^2\ln L}{\partial\theta^2}\bigg|_{\theta=\theta_0}\right)(\hat\theta - \theta_0)$$

Recall:

$$\ln L = K + \sum_i\sum_j v_{ij}\,\ln p_{ij}(\theta)$$

ln L is linear in the v_ij, as are ∂ʰ ln L/∂θʰ, h = 1, 2.

Hence the estimation of θ (single component) means one additional linear restriction on the v_ij, or on the X_ij (since the two are linearly related).   (2)

If this restriction (2) is linearly independent of the s restrictions in (1), then we can transform the s+1 restrictions to s+1 orthogonal linear restrictions, and then by the usual extension get an orthogonal transformation that takes Σ_ij X_ij² into Σ_ij Y_ij² with s(k-1) - 1 = sk - (s+1) terms, and hence we get the result of the general χ² theorem (Theorem 39).


Example:  Power of a χ² test of homogeneity, 2 × 2 table.

Probabilities:

1) Under H₀:   p₁₁ = p₂₁ = θ;    p₁₂ = p₂₂ = 1 - θ

2) Under H₁:   p₁₁ = θ + c/√n        p₁₂ = 1 - θ - c/√n
               p₂₁ = θ - c/√n        p₂₂ = 1 - θ + c/√n

$$\lambda = \boldsymbol{\delta}\big[I - B(B'B)^{-1}B'\big]\boldsymbol{\delta}'\,;$$

take ℓ₁ = ℓ₂ = 1/2 (equal sampling fractions), so that

$$\boldsymbol{\delta} = \left(\frac{c}{\sqrt{2\theta}}\,,\;\; \frac{-c}{\sqrt{2(1-\theta)}}\,,\;\; \frac{-c}{\sqrt{2\theta}}\,,\;\; \frac{c}{\sqrt{2(1-\theta)}}\right)$$
$$B' = \left(\frac{1}{\sqrt{2\theta}}\,,\;\; \frac{-1}{\sqrt{2(1-\theta)}}\,,\;\; \frac{1}{\sqrt{2\theta}}\,,\;\; \frac{-1}{\sqrt{2(1-\theta)}}\right)$$

- 196 -

$$B'B = \frac{1}{2\theta} + \frac{1}{2(1-\theta)} + \frac{1}{2\theta} + \frac{1}{2(1-\theta)} = \frac{1}{\theta(1-\theta)}\,, \qquad (B'B)^{-1} = \theta(1-\theta)$$

$$B(B'B)^{-1}B' = \begin{pmatrix}
\frac{1-\theta}{2} & -\frac{\sqrt{\theta(1-\theta)}}{2} & \frac{1-\theta}{2} & -\frac{\sqrt{\theta(1-\theta)}}{2} \\
-\frac{\sqrt{\theta(1-\theta)}}{2} & \frac{\theta}{2} & -\frac{\sqrt{\theta(1-\theta)}}{2} & \frac{\theta}{2} \\
\frac{1-\theta}{2} & -\frac{\sqrt{\theta(1-\theta)}}{2} & \frac{1-\theta}{2} & -\frac{\sqrt{\theta(1-\theta)}}{2} \\
-\frac{\sqrt{\theta(1-\theta)}}{2} & \frac{\theta}{2} & -\frac{\sqrt{\theta(1-\theta)}}{2} & \frac{\theta}{2}
\end{pmatrix}$$

and $\boldsymbol{\delta}B = 0$   (the patterns of + and - signs in δ and B are opposite, and sufficient to cancel all terms).
Hence the non-centrality parameter:

$$\lambda = \boldsymbol{\delta\delta}' = \frac{c^2}{\theta(1-\theta)}$$

which can be expressed in terms of the p's as:

$$\theta = \frac{p_{11} + p_{21}}{2}\,, \qquad 1 - \theta = \frac{p_{12} + p_{22}}{2}\,, \qquad c = \sqrt{n}\;\frac{p_{11} - p_{21}}{2}$$

- 197 -

$$\lambda = \frac{n}{4}\;\frac{(p_{11} - p_{21})^2}{\theta(1-\theta)}$$
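A quick numerical check (not in the original notes; numpy assumed, and θ, c are made-up values) that the projection formula reproduces λ = c²/θ(1-θ):

    # lam = delta [I - B (B'B)^{-1} B'] delta' : the squared length of delta
    # after projecting out the column space of B.
    import numpy as np

    theta, c = 0.3, 1.7                        # illustrative values
    l1 = l2 = 0.5                              # equal sampling fractions
    p0 = np.array([theta, 1-theta, theta, 1-theta])
    cij = np.array([c, -c, -c, c])
    delta = np.sqrt([l1, l1, l2, l2]) * cij / np.sqrt(p0)
    B = (np.sqrt([l1, l1, l2, l2]) * np.array([1, -1, 1, -1]) / np.sqrt(p0)).reshape(-1, 1)

    P = B @ np.linalg.solve(B.T @ B, B.T)      # projection onto the columns of B
    lam = delta @ (np.eye(4) - P) @ delta
    print(lam, c**2 / (theta * (1 - theta)))   # the two agree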

In the December 1958 Annals of Math Stat, Mitra gives the following formula for λ in the 2 × k contingency table.

Probabilities:

$$p_{ij}^0 = \theta_j\,, \qquad p_{ij} = \theta_j + \frac{c_{ij}}{\sqrt{n}}\,, \qquad \sum_j\theta_j = 1\,, \qquad \sum_j c_{1j} = 0\,, \quad \sum_j c_{2j} = 0$$

$$\lambda = \ell_1\ell_2\sum_{j=1}^{k}\frac{(c_{1j} - c_{2j})^2}{\theta_j}$$

For the above example:  ℓ₁ = ℓ₂ = 1/2, c₁₁ - c₂₁ = 2c, c₁₂ - c₂₂ = -2c, and the formula again gives λ = c²/(θ(1-θ)).

6. Other Approaches to Testing Hypotheses and Other Problems

1. Most Stringent Tests:

Let the family of tests of size α be Φ(α), and let

$$\beta^*(\theta) = \sup_{\varphi\,\in\,\Phi(\alpha)}\beta_\varphi(\theta)\,.$$

- 198 -

A test φ(x) is most stringent if:

i)  it is of size α;

ii) for any other size α test φ':

$$\sup_\theta\big[\beta^*(\theta) - \beta_\varphi(\theta)\big] \;\le\; \sup_\theta\big[\beta^*(\theta) - \beta_{\varphi'}(\theta)\big]\,.$$
2. Minimax:  Decision Theory Approach

L[D(x), θ] = the loss when D(x) is the decision made and θ is the true value of the parameter.

Example:  X is N(μ, 1);  H₀: μ = 0.

D(x) = 1 means reject H₀;  D(x) = 0 means accept.

L[D(x), μ] = μ²       when D(x) = 0
L[D(x), μ] = C/|μ|    when D(x) = 1

Having set up a loss function, we then have a risk function defined as:

$$r_D(\theta) = E(L) = \int L\big[D(\underline{x}),\,\theta\big]\,f(\underline{x},\,\theta)\,d\underline{x}$$

It is frequently impossible to minimize this universally with respect to θ; thus:

D(x) is a minimax decision rule if it minimizes (with respect to all possible decision rules) sup_θ r_D(θ); or, expressed another way, we determine

$$\inf_D\,\sup_\theta\,r_D(\theta)\,.$$

ref:  Blackwell and Girshick, Theory of Games and Statistical Decisions, Wiley
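A sketch (not from the notes; scipy assumed, and the cutoff rules and constant C are illustrative) of how the risk functions of two competing rules for the example above would be compared in the minimax sense:

    # Rule: accept H0 when |X| <= c, with X ~ N(mu, 1); loss mu^2 on accepting,
    # C/|mu| on rejecting.  Compare sup-risk over mu for two cutoffs c.
    import numpy as np
    from scipy.stats import norm

    C = 1.0

    def risk(c, mu):
        # r_D(mu) = mu^2 * Pr(accept) + (C/|mu|) * Pr(reject)
        p_accept = norm.cdf(c - mu) - norm.cdf(-c - mu)
        return mu**2 * p_accept + (C / abs(mu)) * (1.0 - p_accept)

    mus = np.linspace(0.05, 5.0, 200)          # grid standing in for sup over mu
    for c in (1.0, 2.0):
        print(c, max(risk(c, mu) for mu in mus))   # prefer the smaller supremum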

3. Admissible Decision Rules:

D(x) is admissible if there is no D'(x) such that

$$r_{D'}(\theta) \le r_D(\theta) \quad \text{for all } \theta\,; \qquad r_{D'}(\theta) < r_D(\theta) \quad \text{for some } \theta\,;$$

i.e., you cannot improve on D(x) uniformly.

---------

- 199 -

CHAPTER VII

MISCELLANEOUS

A. Confidence Regions:

Let B(Ω) = the totality of subsets of Ω.

Def. 47:  Let A(x) be a function from X (the sample space) to B(Ω). If

$$\Pr\big[A(\underline{X}) \supset \theta\big] \ge 1 - \alpha \qquad \text{for every } \theta\,,$$

then A(X) is a confidence region with confidence coefficient 1 - α.

Theorem 40:  If a non-randomized test of size α exists for H: θ = θ₀, for every θ₀, then there exists a confidence region for θ of confidence coefficient 1 - α.

Proof:  X is a random variable with d.f. f(x, θ). Let R(X, θ₀) be the set of X for which H: θ = θ₀ is rejected, e.g., [sketch: rejection region R(X, θ₀) shown as shaded intervals on the x-axis], so that

$$\Pr\big[\underline{X} \in R(\theta_0)\,\big|\,\theta = \theta_0\big] \le \alpha\,. \tag{1}$$

This R(X, θ₀) is defined and satisfies (1) for every θ₀.

Define:  A(x) = {θ : x ∉ R(θ)}.  Then

$$\Pr\big[A(\underline{X}) \supset \theta_0\big] = \Pr\big[\underline{X} \notin R(\theta_0)\big] = 1 - \Pr\big[\underline{X} \in R(\theta_0)\big] \ge 1 - \alpha\,.$$

q.e.d.
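A minimal sketch of this construction (not in the notes; scipy assumed) for X ~ N(θ, 1) with the usual two-sided size-α test:

    # R(theta_0) = {x : |x - theta_0| > z}; A(x) collects every theta_0
    # whose test does NOT reject x, which here is an interval.
    from scipy.stats import norm

    def confidence_region(x, alpha=0.05):
        z = norm.ppf(1 - alpha / 2)
        return (x - z, x + z)

    print(confidence_region(1.3))   # all theta_0 accepted when x = 1.3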

Def. 48:  A confidence region A(X) is uniformly most powerful if

$$\Pr\big[A(\underline{X}) \supset \theta'\,\big|\,\theta \text{ is true}\big]$$

is minimized for all θ' ≠ θ.

note:  Kendall uses this same definition with "uniformly most powerful" replaced by "uniformly most selective." Neyman replaces "u.m.p." with "shortest."

Def. 49:  A confidence interval is "shortest" if the length of the interval is minimized uniformly in θ.

Confidence Interval for the ratio of two means:

X_i, Y_i (i = 1, 2, ..., n) are bivariate normal, BIV N(μ_x, μ_y, σ_x², σ_y², ρ), where ρ = correlation coefficient and may = 0.

Problem:  find a confidence interval for α = μ_y/μ_x.

$$Z_i = \alpha X_i - Y_i \;\text{ is }\; N\big(0,\;\alpha^2\sigma_x^2 - 2\alpha\rho\sigma_x\sigma_y + \sigma_y^2\big)$$

Σ(Z_i - Z̄)² and Z̄ are independent, and distributed as χ² and normal respectively.

$$\frac{\sum_i(Z_i - \bar{Z})^2}{n-1} = \alpha^2 s_x^2 - 2\alpha s_{xy} + s_y^2$$

therefore:

$$\frac{\sqrt{n}\,(\alpha\bar{X} - \bar{Y})}{\big(\alpha^2 s_x^2 - 2\alpha s_{xy} + s_y^2\big)^{1/2}}$$

has a t-distribution with n-1 d.f., and

$$\Pr\left[\frac{n\,(\alpha\bar{X} - \bar{Y})^2}{\alpha^2 s_x^2 - 2\alpha s_{xy} + s_y^2} < t^2\right] = 1 - \epsilon\,, \qquad t = t_{1-\epsilon/2}(n-1)\,.$$

We can solve this and get confidence limits for α: the confidence set is where the following quadratic in α is ≤ 0. This yields the quadratic equation:

$$Q(\alpha) \equiv \alpha^2\big(n\bar{X}^2 - t^2 s_x^2\big) - 2\alpha\big(n\bar{X}\bar{Y} - t^2 s_{xy}\big) + \big(n\bar{Y}^2 - t^2 s_y^2\big) = 0$$

- 201 -

Among the various possible solutions of this quadratic equation we can have the following situations:

A.  If nX̄² - t²s_x² > 0:  it can be shown that 2 real roots α₁ < α₂ always exist, and thus the confidence interval is:  α₁ < α < α₂.

For the above inequality to hold, X̄ must be significantly away from the origin; i.e., we must reject H₀: μ_x = 0 on the basis of X₁, X₂, ..., X_n at the ε level. e.g., the following condition must hold:

$$\frac{\sqrt{n}\,|\bar{X}|}{s_x} > t_{1-\epsilon/2}$$

Note:  One could, if desired, change t (and thus ε) to insure that this inequality always held, and thus that a "real" confidence interval exists.

B.  If nX̄² - t²s_x² < 0:  α₁, α₂ may be real or complex. If the roots are real, then the "confidence interval" is (-∞, α₁) ∪ (α₂, +∞), i.e., we accept "in the tails."

C.  If 4(nX̄Ȳ - t²s_xy)² < 4(nX̄² - t²s_x²)(nȲ² - t²s_y²), i.e., b² < 4ac, then the roots are complex and we have a confidence interval (-∞, +∞). [Thus we can say that -∞ < α < +∞ with confidence 1 - ε.]

D.  If we get complex roots, Q(α) lies entirely below the axis:

[sketch: the parabola Q(α) plotted against α, with no real roots]

so that for all α

$$0 \le \frac{\sqrt{n}\,|\alpha\bar{X} - \bar{Y}|}{\big(\alpha^2 s_x^2 - 2\alpha s_{xy} + s_y^2\big)^{1/2}} < t\,;$$

thus we always accept H₀, since the test statistic never exceeds t.

Refs:  Fieller, Journal of the Royal Statistical Society, 1940
       Paulson, Annals of Math Stat, 1942
       Bennett, Sankhya, 1957
       Symposium, Journal of the Royal Statistical Society, 1954
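A sketch of the computation (not from the notes; numpy and scipy assumed, data hypothetical, and the function name fieller is ours) that solves the quadratic and sorts out cases A-C:

    import numpy as np
    from scipy.stats import t as t_dist

    def fieller(x, y, eps=0.05):
        x, y = np.asarray(x, float), np.asarray(y, float)
        n = len(x)
        t = t_dist.ppf(1 - eps / 2, n - 1)
        sxx, syy = x.var(ddof=1), y.var(ddof=1)
        sxy = np.cov(x, y, ddof=1)[0, 1]
        a = n * x.mean()**2 - t**2 * sxx
        b = -2 * (n * x.mean() * y.mean() - t**2 * sxy)
        c = n * y.mean()**2 - t**2 * syy
        disc = b**2 - 4 * a * c
        if disc < 0:
            return (-np.inf, np.inf)                      # case C: whole real line
        r1, r2 = sorted(((-b - disc**0.5) / (2 * a),
                         (-b + disc**0.5) / (2 * a)))
        # case A (a > 0): between the roots; case B (a < 0): the two tails
        return (r1, r2) if a > 0 else ((-np.inf, r1), (r2, np.inf))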


- 202 -

LIST OF THEOREMS

Theorem                                                                    Page

1.  Properties of distribution functions
2.  1-1 Correspondence of distribution function and probability measure
3.  Discontinuities of a distribution function
4.  Multivariate distribution function properties
5.  Addition law for expectations of random variables ................. 22
6.  Multiplication law for expectations of random variables ........... 22
7.  Conditions for unique determination of a distribution function by
    moments (Second Limit Theorem) .................................... 23
8.  Tchebycheff's Inequality .......................................... 24
9.  Bernoulli's Theorem (Law of large numbers for binomial random
    variables) ........................................................ 25
10. Characteristic function of a linear function of a random variable . 28
11. Characteristic function of a sum of independent random variables .. 29
12. Inversion theorem for characteristic functions .................... 30
13. Continuity theorem for distribution functions (First Limit Theorem) 33
14. Central Limit Theorem (independently and identically distributed
    random variables) ................................................. 33
15. Liapounoff Theorem (Central limit theorem for non-identically
    distributed random variables) ..................................... 35
16. (Weak) Law of Large Numbers (LLN) ................................. 41
17. Khintchine's Theorem (Weak LLN assuming only E(x) < ∞) ............ 41
18. A Convergence Theorem (Cramér) .................................... 44
19. Convergence in probability of a continuous function ............... 46
20. Strong Law of Large Numbers (SLLN) ................................ 51
21. Gauss-Markov Theorem (b.l.u.e.) ................................... 61
22. Asymptotic properties of m.l.e. (scalar case) ..................... 65
23. Asymptotic properties of m.l.e. (multiparameter case) ............. 70

- 203 -

Theorem                                                                    Page

24. Information Theorem (Cramér-Rao) (scalar case) .................... 76
25. Information Theorem (Cramér-Rao) (multiparameter case) ............ 81
26. Blackwell's Theorem for improving an unbiased estimate ............ 88
27. Complete sufficient statistics and m.v.u.e. ....................... 92
28. Complete sufficient statistics in the non-parametric case ......... 96
29. Bayes, constant risk, and minimax estimators ...................... 109
30. Density of a sample quantile ...................................... 116
31. Convergence of a sample distribution function ..................... 124
32. Pitman's Theorem on Asymptotic Relative Efficiency ................ 140
33. Neyman-Pearson Theorem (Probability ratio test) ................... 149
34. Extension of Probability Ratio Test for composite hypotheses ...... 153
35. Sufficient conditions for u.m.p.u. test ........................... 156
36. Reduction of test functions to those depending on sufficient
    statistics ........................................................ 160
37. Conditions for invariance of a test function ...................... 162
38. Scheffé's Test of all linear contrasts ............................ 187
39. Asymptotic power of the χ² test (Mitra) ........................... 194
40. Correspondence of confidence regions and tests .................... 199

- 204 -

LIST OF DEFINITIONS

Definition                                                                 Page

1.  Notation of sets
2.  Families of sets
3.  Borel Sets
4.  Additivity of sets
5.  Probability measure
6.  Point function
7.  Distribution function
8.  Density function
9.  Random variable ................................................... 10
10. Conditional probability ........................................... 10
11. Joint distribution function ....................................... 16
12. Discrete random variable .......................................... 17
13. Continuous random variable ........................................ 17
14. Expectation of g(X) ............................................... 21
15. Moments ........................................................... 22
16. Characteristic function ........................................... 25
17. Cumulant generating function ...................................... 28
18. Convergence in probability ........................................ 41
19. Convergence almost everywhere ..................................... 49
20. Risk function ..................................................... 55
21. Minimum variance unbiased estimates (m.v.u.e.) .................... 56
22. Best linear unbiased estimates (b.l.u.e.) ......................... 57
23. Bayes estimates ................................................... 58
24. Minimax estimates ................................................. 60

- 205 -

Definition                                                                 Page

25. Constant risk estimates ........................................... 60
26. Consistent estimates .............................................. 60
27. Best asymptotically normal estimates (b.a.n.e.) ................... 60
28. Least Squares estimates ........................................... 61
29. Maximum likelihood estimates (m.l.e.) ............................. 64
30. Likelihood function ............................................... 64
31. Sufficient statistics ............................................. 86
32. Complete sufficient statistics .................................... 92
33. Test .............................................................. 113
34. Test of size α .................................................... 113
35. Power of a test ................................................... 113
36. Consistent sequence of tests ...................................... 113
37. Index of tests .................................................... 113
38. Asymptotic relative efficiency (A.R.E.) ........................... 114
39. Quantile .......................................................... 115
40. Sample quantile ................................................... 116
41. Uniformly most powerful tests ..................................... 148
42. Unbiased tests .................................................... 154
43. Similar tests ..................................................... 154
44. Invariant tests ................................................... 155
45. Uniformly most powerful unbiased tests (u.m.p.u.) ................. 155
46. Maximal invariant function ........................................ 161
47. Confidence region ................................................. 199
48. U.M.P. confidence region .......................................... 199
49. "Shortest" confidence interval .................................... 200

- 206 -

LIST OF PROBLEMS

Problem                                                                    Page

1.  Pr(Sum of sets) = Sum of Pr(sets)
2.  lim Pr(S_n) as n → ∞
3.  Cumulative probability at a point
4.  Joint and marginal distribution functions
5.  Transformations involving uniform distributions ................... 13
6.  Distribution of the sum of probabilities (combination of
    probabilities) .................................................... 19
7.  Prove that the density function of the negative binomial sums to 1  21
8.  Find the mean of a censored normal distribution ................... 21
9.  Uniqueness of moments (uniform) ................................... 24
10. Prove the normal density function integrates to 1 ................. 28
11. Derive the characteristic function of N(0, 1) ..................... 28
12. Density of the product of n geometric distributions ............... 32
13. Factorial m.g.f. for the binomial distribution .................... 32
14. Inversion of the characteristic function (Cauchy) ................. 33
15. Limiting distribution of the Poisson .............................. 35
16. Use of Mellin transforms .......................................... 36
17. Transformations on NID(0, σ²) ..................................... 38
18. Limiting distribution of the multinomial distribution ............. 38
19. Limiting distribution of the negative binomial .................... 41
20. Show E[g(X)] = E_Y{E_X[g(X)|Y]} ................................... 41
21. Distribution of the length of first run in a series of binomial
    trials ............................................................ 44
22. Distribution of the minimum observation from a uniform (0, 1)
    population ........................................................ 44
23. Prove theorem 19 .................................................. 46
24. Asymptotic distribution of (X - λ)²/X where X is Poisson λ ........ 48
25. Example of convergence almost everywhere .......................... 50

- 207 -

Problem                                                                    Page

26. Inversion of φ(t) = cos ta ........................................ 50
27. Asymptotic distribution of the product of two independent sample
    means ............................................................. 57
28. Show that x̄ is a b.l.u.e. of μ .................................... 57
29. Example of minimum risk estimation ................................ 59
30. Determination of risk function for Binomial p ..................... 59
31. Find a constant risk estimate for Binomial p ...................... 60
32. Proof of consistency of parameters in simple linear regression .... 63
33. Find the m.l.e. of 1/a², where f(x) = a² e^{-a²x} ................. 69
34. M.l.e. of Poisson λ and its asymptotic distribution ............... 69
35. M.l.e. of parameter "a" of R(0, a) ................................ 70
36. ... ............................................................... 74
37. M.l.e. of the parameter of the Laplace distribution ............... 75
38. Cramér-Rao bound for unbiased estimates of the Poisson parameter λ  78
39. Application of the multiparameter Cramér-Rao lower bound .......... 80
40. C-R lower bound for λ where X is Poisson λ, Y is Poisson ... ...... 80
41. C-R lower bound for estimates of the geometric parameter .......... 80
42. Application of the C-R lower bound to the negative binomial ....... 84
43. Derivation of sufficient statistics ............................... 87
44. Sufficient statistics and unbiased estimators for the negative
    binomial .......................................................... 89
45. Variance of sufficient statistics for the Poisson parameter λ ..... 91
46. M.v.u.e. of a quantile ............................................ 95
47. ... ............................................................... 98
48. Minimum and constant risk estimates of σ² (normal) ................ 98
49. M.l.e. of the multinomial parameters (p_ij = p_i. p_.j) ........... 101
50. M.l.e. and minimum modified χ² estimates of the multinomial
    parameters (p_ij = α_i + β_j) ..................................... 107

- 208 -

Problem                                                                    Page

51. M.l.e. and minimum modified χ² estimates of p_i = e^{-μ_i λ} ...... 108
52. Comparison of variance of minimax and m.l. estimators for
    Binomial p ........................................................ 110
53. Example of Wolfowitz minimum distance estimate of ... ............. 112
54. A.R.E. of median test to mean test (normal) ....................... 114
55. Power of the normal approximation for testing Binomial p .......... 115
56. A.R.E. of median test to mean test (general) ...................... 117
57. Distribution of intervals between R(0, 1) variables ............... 120
58. Example of the U test (Wilkinson Combination Procedure) ........... 123
59. Distribution of the range ......................................... 124-6
60. Power of the D_n test (Kolmogorov statistic) ...................... 128
61. Expected value of W_n² (von Mises-Smirnov statistic) .............. 129
62. Wilcoxon's U test ................................................. 134
63. Distribution and properties of the number of Y_i exceeding max X_i
    (two sample test) ................................................. 134
64. Application of two sample tests ................................... 138
65. Moments and properties of a particular non-parametric two-sample
    test statistic .................................................... 138
66. A.R.E. of Wilcoxon's U test to the normal test .................... 145
67. Equivalence of the Kruskal-Wallis H test and Wilcoxon's U test
    for k = 2 ......................................................... 147
68. U.M.P. test for negative exponential parameter .................... 151
69. Example of non-existence of tests of size α ....................... 152
70. ... ............................................................... 160
71. Example of application of Maximal Invariant Function .............. 167
72. Maximum likelihood ratio test for μ (normal) ...................... 175
73. U.m.p.i. test for the coefficient of variation (normal) ........... 175
74. Derivation of F-test for the general linear hypothesis
    (two-way ANOVA) ................................................... 181
75. Power of the two-way ANOVA F-test ................................. 185