1. Probability Postulates: Let Ai for i ∈ I, a countable index set, be events in the sample space S, i.e., subsets of S.
   1. 0 ≤ P(Ai) ≤ 1.
   2. P(∅) = 0 and P(S) = 1.
   3. If the Ai's are mutually exclusive events, then P(∪i Ai) = Σi P(Ai).
6. Conditional Probability: For two events A, B with P(A) ≠ 0, the conditional probability of B given A is P(B|A) = P(A ∩ B)/P(A).
11. Chebyshev's Theorem: Let X be a random variable with mean µ and standard deviation σ. Then
    P(|X − µ| < kσ) ≥ 1 − 1/k², for any k > 0.
    (The probability is at least 1 − 1/k² that X will take on a value within k standard deviations of the mean.)
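Chebyshev's bound can be checked empirically. The sketch below draws from an exponential population (an illustrative choice; the bound holds for any distribution with finite variance) and compares observed frequencies with the guaranteed lower bound:

```python
import random
import statistics

# Empirical check of Chebyshev's bound P(|X - mu| < k*sigma) >= 1 - 1/k^2
# on an exponential sample (illustrative choice; the bound holds for any
# distribution with finite variance).
random.seed(0)
xs = [random.expovariate(1.0) for _ in range(100_000)]
mu = statistics.fmean(xs)
sigma = statistics.pstdev(xs)

for k in (1.5, 2, 3):
    frac = sum(abs(x - mu) < k * sigma for x in xs) / len(xs)
    bound = 1 - 1 / k**2
    print(f"k={k}: P(|X-mu| < k*sigma) ≈ {frac:.3f} >= {bound:.3f}")
```

For a skewed distribution like this one, the observed probabilities are far above the bound, which is what makes Chebyshev's theorem distribution-free but conservative.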
12. Markov's Inequality: Let X be a random variable with probability density f(x), where f(x) = 0 for x < 0. If µ is the mean, then for any a > 0, P(X ≥ a) ≤ µ/a.
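A similar sampling check works for Markov's inequality; the exponential population with mean 2 below is an illustrative choice:

```python
import random
import statistics

# Empirical check of Markov's inequality P(X >= a) <= mu/a for a
# nonnegative random variable, here exponential with mean 2 (illustrative).
random.seed(1)
xs = [random.expovariate(0.5) for _ in range(100_000)]
mu = statistics.fmean(xs)  # should be close to 2

for a in (1.0, 3.0, 6.0):
    tail = sum(x >= a for x in xs) / len(xs)
    print(f"a={a}: P(X >= a) ≈ {tail:.3f} <= mu/a = {mu/a:.3f}")
```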
Fact Sheet 1
MA204 Statistics 2008
Department of Mathematics, IIT Madras
7. Exponential Distribution: Parameter θ > 0.
   Density is
       f(x) = (1/θ) e^(−x/θ) for x > 0, and f(x) = 0 elsewhere.
   µ = θ, σ² = θ².
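The stated mean and variance can be verified by sampling. Note that Python's `random.expovariate` is parameterized by the rate λ = 1/θ, not by θ itself:

```python
import random
import statistics

# Sampling check that the exponential distribution with parameter theta
# has mean theta and variance theta^2. expovariate takes the rate 1/theta.
random.seed(2)
theta = 3.0
xs = [random.expovariate(1 / theta) for _ in range(200_000)]

mean = statistics.fmean(xs)
var = statistics.pvariance(xs)
print(f"mean ≈ {mean:.2f} (theory {theta}), var ≈ {var:.2f} (theory {theta**2})")
```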
Beta Distribution: Parameters α > 0, β > 0.
    µ = α/(α + β), σ² = αβ/((α + β)²(α + β + 1)).
10. Normal Distribution: Parameters µ and σ > 0.
    Density is
        n(x; µ, σ) = (1/(σ√(2π))) e^(−(1/2)((x−µ)/σ)²), for −∞ < x < ∞.
    Mean = µ, Variance = σ².
    When µ = 0, σ² = 1, it is called the standard normal distribution.
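The density formula can be evaluated directly and checked against the standard library's `statistics.NormalDist.pdf`:

```python
import math
from statistics import NormalDist

# Direct evaluation of the normal density n(x; mu, sigma) from the fact
# sheet's formula, checked against the standard library's NormalDist.pdf.
def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 5.0, 2.0
for x in (3.0, 5.0, 8.0):
    assert math.isclose(normal_pdf(x, mu, sigma), NormalDist(mu, sigma).pdf(x))
print(f"n(5; 5, 2) = {normal_pdf(5, mu, sigma):.4f}")  # the peak, 1/(sigma*sqrt(2*pi))
```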
11. Theorem-1: Let X have a binomial distribution with the parameters n and θ and let Y = X/n. Then
    E(Y) = θ and σ²Y = θ(1 − θ)/n.
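Theorem-1 says the sample proportion is an unbiased estimator of θ with variance shrinking like 1/n; a quick simulation (with made-up n and θ) bears this out:

```python
import random
import statistics

# Sampling check of Theorem-1: the proportion Y = X/n of successes in n
# Bernoulli(theta) trials has mean theta and variance theta(1-theta)/n.
random.seed(3)
n, theta = 50, 0.3
ys = []
for _ in range(20_000):
    x = sum(random.random() < theta for _ in range(n))
    ys.append(x / n)

print(f"E(Y) ≈ {statistics.fmean(ys):.4f} (theory {theta})")
print(f"Var(Y) ≈ {statistics.pvariance(ys):.5f} (theory {theta*(1-theta)/n:.5f})")
```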
12. Theorem-2: If X has a normal distribution with mean µ and standard deviation σ, then Z = (X − µ)/σ has the standard normal distribution.
13. Theorem-3: If X is a random variable having binomial distribution with the parameters n and θ, then the moment generating function of Z = (X − nθ)/√(nθ(1 − θ)) approaches the moment generating function of the standard normal distribution, as n → ∞.
Fact Sheet 2
Theorem 2 : Law of Large Numbers For any positive constant c, the probability that X̄ will take on a value between µ − c and µ + c is at least 1 − σ²/(nc²).
Theorem 3 : Central Limit Theorem If X1, . . . , Xn constitute a random sample from an infinite population with mean µ, variance σ², then as n → ∞, the limiting distribution of Z = (X̄ − µ)/(σ/√n) is the standard normal distribution.
Theorem 4 If X1, . . . , Xn constitute a random sample from a normal population with mean µ, variance σ², then X̄ has a normal distribution with mean µ and variance σ²/n.
Theorem 5 If X̄ is the mean of a random sample of size n drawn from a population of size N with mean µ and variance σ², then E(X̄) = µ and Var(X̄) = (σ²/n) · (N − n)/(N − 1).
Theorem 6 If X1, . . . , Xn are independent random variables each having the standard normal distribution, then Y = X1² + X2² + · · · + Xn² has the chi-square distribution with ν = n degrees of freedom.
Theorem 7 If X1, . . . , Xn are independent random variables having chi-square distributions with degrees of freedom ν1, . . . , νn, respectively, then Y = X1 + · · · + Xn has the chi-square distribution with ν1 + · · · + νn degrees of freedom.
Theorem 8 If X1 and X2 are independent random variables, where X1 has a chi-square distribution with ν1 degrees of freedom and X1 + X2 has a chi-square distribution with ν1 + ν2 (> ν1) degrees of freedom, then X2 has a chi-square distribution with ν2 degrees of freedom.
Theorem 9 If X̄ and s² are the sample mean and the sample variance of a random sample of size n drawn from a normal population with mean µ and variance σ², then X̄ and s² are independent and the random variable Y = (n − 1)s²/σ² has a chi-square distribution with n − 1 degrees of freedom.
Theorem 10 If Y, Z are independent random variables, where Y has a chi-square distribution with ν degrees of freedom and Z has the standard normal distribution, then t = Z/√(Y/ν) has the t-distribution with ν degrees of freedom.
Theorem 11 If X̄ and s² are the sample mean and the sample variance of a random sample of size n drawn from a normal population with mean µ and variance σ², then t = (X̄ − µ)/(s/√n) has the t-distribution with n − 1 degrees of freedom.
Theorem 12 If U, V are independent random variables having chi-square distributions with degrees of freedom ν1, ν2, respectively, then F = (U/ν1)/(V/ν2) has the F-distribution with degrees of freedom ν1 and ν2.
Theorem 13 If s1² and s2² are the sample variances of random samples of sizes n1 and n2, respectively, drawn from normal populations with variances σ1² and σ2², respectively, then F = (s1²/σ1²)/(s2²/σ2²) = (σ2² s1²)/(σ1² s2²) has the F-distribution with degrees of freedom n1 − 1 and n2 − 1.
Fact Sheet 3
• zα/2 is such that the integral of the standard normal density from zα/2 to ∞ is equal to α/2.
• tα/2, n−1 is such that if T is a random variable having a t distribution with n − 1 degrees of freedom, then P(T ≥ tα/2, n−1) = α/2.
• χ²α/2, n−1 is such that if X is a random variable having a χ² distribution with n − 1 degrees of freedom, then P(X ≥ χ²α/2, n−1) = α/2.
• fα/2, n1−1, n2−1 is such that if X is a random variable having the F distribution with n1 − 1 and n2 − 1 degrees of freedom, then P(X ≥ fα/2, n1−1, n2−1) = α/2. We also have fα,m,n · f1−α,n,m = 1.
Theorem 1 Let X̄ be the mean of a random sample of size n from a normal population with the known variance σ². If X̄ is used as an estimator of the mean of the population, then the probability is 1 − α that the error will be less than zα/2 · σ/√n.
Theorem 1 is restated as:
Theorem 2 If x̄ is the value of the mean of a random sample of size n from a normal population with the known variance σ², then a (1 − α)100% confidence interval for the mean of the population is given by
    x̄ − zα/2 · σ/√n < µ < x̄ + zα/2 · σ/√n.
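This known-σ interval is easy to compute with the standard library; `NormalDist().inv_cdf(1 − α/2)` gives zα/2. The sample values and σ below are made up for illustration:

```python
from statistics import NormalDist, fmean

# Known-sigma confidence interval for the mean, following Theorem 2.
# The data values here are hypothetical.
def z_interval(xs, sigma, alpha=0.05):
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    xbar = fmean(xs)
    half = z * sigma / len(xs) ** 0.5
    return xbar - half, xbar + half

data = [4.8, 5.1, 5.0, 4.7, 5.3, 4.9, 5.2, 5.0]
lo, hi = z_interval(data, sigma=0.2)
print(f"95% CI for mu: ({lo:.3f}, {hi:.3f})")
```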
Theorem 4 If x̄ and s² are the values of the mean and the sample variance of a random sample of size n from a normal population, then a (1 − α)100% confidence interval for the mean of the population is given by
    x̄ − tα/2, n−1 · s/√n < µ < x̄ + tα/2, n−1 · s/√n.
Theorem 5 If x̄1 and x̄2 are the values of the means of independent random samples of sizes n1 and n2 from normal populations with the known variances σ1² and σ2², respectively, then a (1 − α)100% confidence interval for the difference between the two population means is given by
    x̄1 − x̄2 − zα/2 · √(σ1²/n1 + σ2²/n2) < µ1 − µ2 < x̄1 − x̄2 + zα/2 · √(σ1²/n1 + σ2²/n2).
Theorem 6 If x̄1, x̄2, s1², s2² are the values of the means and sample variances of independent random samples of sizes n1 and n2 from normal populations with equal (unknown) variances, then a (1 − α)100% confidence interval for the difference between the two population means is given by
    x̄1 − x̄2 − tα/2, n1+n2−2 · sp √(1/n1 + 1/n2) < µ1 − µ2 < x̄1 − x̄2 + tα/2, n1+n2−2 · sp √(1/n1 + 1/n2),
where sp² = {(n1 − 1)s1² + (n2 − 1)s2²}/(n1 + n2 − 2).
Theorem 7 Let X be a binomial random variable with the parameters n and p. For large n, with p∗ = x/n, an approximate (1 − α)100% confidence interval for p is given by
    p∗ − zα/2 · √(p∗(1 − p∗)/n) < p < p∗ + zα/2 · √(p∗(1 − p∗)/n).
That is, if p∗ is used as an estimate of p, then with (1 − α)100% confidence we can assert that the error is less than zα/2 · √(p∗(1 − p∗)/n).
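The large-sample proportion interval of Theorem 7 can be sketched the same way; the counts x and n below are made up for illustration:

```python
from statistics import NormalDist

# Approximate large-sample confidence interval for a binomial proportion
# (Theorem 7). The counts are hypothetical.
def proportion_interval(x, n, alpha=0.05):
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    p = x / n
    half = z * (p * (1 - p) / n) ** 0.5
    return p - half, p + half

lo, hi = proportion_interval(x=36, n=100)
print(f"95% CI for p: ({lo:.3f}, {hi:.3f})")
```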
Theorem 8 Let X1 and X2 be binomial random variables with the parameters n1, p1 and n2, p2, respectively. Suppose n1, n2 are large. With p1∗ = x1/n1 and p2∗ = x2/n2, an approximate (1 − α)100% confidence interval for p1 − p2 is given by
    p1∗ − p2∗ − zα/2 · √(p1∗(1 − p1∗)/n1 + p2∗(1 − p2∗)/n2) < p1 − p2 < p1∗ − p2∗ + zα/2 · √(p1∗(1 − p1∗)/n1 + p2∗(1 − p2∗)/n2).
Theorem 9 Let s² be the value of the sample variance of a random sample of size n from a normal population. A (1 − α)100% confidence interval for σ² is given by
    (n − 1)s²/χ²α/2, n−1 < σ² < (n − 1)s²/χ²1−α/2, n−1.
Theorem 10 Let s1² and s2² be the values of the sample variances of independent random samples of sizes n1 and n2 from normal populations. A (1 − α)100% confidence interval for σ1²/σ2² is given by
    (s1²/s2²) · 1/fα/2, n1−1, n2−1 < σ1²/σ2² < (s1²/s2²) · fα/2, n2−1, n1−1.
Fact Sheet 4
The likelihood ratio technique yields the following critical regions C for a given level of significance α:
1. A random sample of size n is drawn from a normal population with known variance σ².
   H0 : µ = µ0; H11 : µ > µ0; H12 : µ < µ0; H13 : µ ≠ µ0. Take z = (x̄ − µ0)/(σ/√n). Then
   C1 : z ≥ zα; C2 : z ≤ −zα; C3 : |z| ≥ zα/2.
   Note: If n ≥ 30, this can be used for any population, and also s² can be used in place of σ² if σ² is unknown.
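The two-sided case of test 1 can be sketched in a few lines; the sample, µ0, and σ below are made up for illustration:

```python
from statistics import NormalDist, fmean

# One-sample z-test of H0: mu = mu0 against the two-sided alternative
# (case 1 above). The sample, mu0, and sigma are hypothetical.
def z_test_two_sided(xs, mu0, sigma, alpha=0.05):
    z = (fmean(xs) - mu0) / (sigma / len(xs) ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    return z, abs(z) >= z_crit                      # True => reject H0

data = [12.4, 11.8, 12.9, 12.1, 12.6, 11.9, 12.3, 12.8, 12.0, 12.2]
z, reject = z_test_two_sided(data, mu0=12.0, sigma=0.4)
print(f"z = {z:.2f}, reject H0: {reject}")
```

The one-sided regions C1 and C2 differ only in comparing z against ±zα instead of |z| against zα/2.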
2. A random sample of size n < 30 is drawn from a normal population with unknown variance.
   H0 : µ = µ0; H11 : µ > µ0; H12 : µ < µ0; H13 : µ ≠ µ0. Take t = (x̄ − µ0)/(s/√n). Then
   C1 : t ≥ tα,n−1; C2 : t ≤ −tα,n−1; C3 : |t| ≥ tα/2,n−1.
3. Two independent random samples of sizes n1, n2 are drawn from normal populations with known variances σ1², σ2², respectively.
   H0 : µ1 − µ2 = δ; H11 : µ1 − µ2 > δ; H12 : µ1 − µ2 < δ; H13 : µ1 − µ2 ≠ δ.
   Take z = (x̄1 − x̄2 − δ)/√(σ1²/n1 + σ2²/n2). Then C1 : z ≥ zα; C2 : z ≤ −zα; C3 : |z| ≥ zα/2.
   Note: If n1 ≥ 30, n2 ≥ 30, this can be used for any populations, and also s1², s2² can be used in place of σ1², σ2² if the latter are unknown.
4. Two independent random samples of sizes n1 < 30, n2 < 30 are drawn from normal populations with the same unknown variance σ².
   H0 : µ1 − µ2 = δ; H11 : µ1 − µ2 > δ; H12 : µ1 − µ2 < δ; H13 : µ1 − µ2 ≠ δ.
   Take sp² = {(n1 − 1)s1² + (n2 − 1)s2²}/(n1 + n2 − 2) and t = (x̄1 − x̄2 − δ)/(sp √(1/n1 + 1/n2)).
   Then C1 : t ≥ tα,n1+n2−2; C2 : t ≤ −tα,n1+n2−2; C3 : |t| ≥ tα/2,n1+n2−2.
   Note: If n1 = n2, then sp² = (s1² + s2²)/2.
5. Given two independent random samples of sizes n1 and n2 from two normal populations.
   H0 : σ1² = σ2²; H11 : σ1² > σ2²; H12 : σ1² < σ2²; H13 : σ1² ≠ σ2². Then
   C1 : s1²/s2² ≥ fα,n1−1,n2−1; C2 : s2²/s1² ≥ fα,n2−1,n1−1;
   C3 : s1²/s2² ≥ fα/2,n1−1,n2−1 for s1² ≥ s2², and s2²/s1² ≥ fα/2,n2−1,n1−1 for s1² < s2².
6. A random sample of size n is drawn from a normal population.
   H0 : σ² = σ0²; H11 : σ² > σ0²; H12 : σ² < σ0²; H13 : σ² ≠ σ0². Take χ² = (n − 1)s²/σ0². Then
   C1 : χ² ≥ χ²α,n−1; C2 : χ² ≤ χ²1−α,n−1; C3 : χ² ≥ χ²α/2,n−1 or χ² ≤ χ²1−α/2,n−1.
9. Let xi be the observed value of a binomial random variable Xi with parameters ni and θi, respectively, for i = 1, 2, . . . , k. H0 : θ1 = θ2 = · · · = θk; H1 : θi ≠ θj for some i, j.
   Take θ̂ = (x1 + x2 + · · · + xk)/(n1 + n2 + · · · + nk) and
   χ² = Σ_{i=1}^k (xi − ni θ̂)²/(ni θ̂(1 − θ̂)) = Σ_{i=1}^k Σ_{j=1}^2 (fij − eij)²/eij.
   Then C is χ² ≥ χ²α,k−1.
10. In (9), if an r × c table is taken instead of the k × 2 table, where we denote by θi· the probability that an item falls into the i-th row, by θ·j the probability that an item falls into the j-th column, and by θij the probability that an item falls into the i-th row and j-th column, and H0 as θij = θi· · θ·j, H1 as θij ≠ θi· · θ·j, then C is χ² ≥ χ²α,(r−1)(c−1).
    Here, χ² is computed by χ² = Σ_{i=1}^r Σ_{j=1}^c (fij − eij)²/eij, with
    fij the observed frequency for the cell in the i-th row and j-th column,
    fi· the sum of all fij in the i-th row (the row total),
    f·j the sum of all fij in the j-th column (the column total),
    f the sum of all entries in the table, i.e., Σi Σj fij, and
    θ̂i· = fi·/f, θ̂·j = f·j/f, eij = θ̂i· · θ̂·j · f = fi· · f·j/f.
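The computation in (10) reduces to row totals, column totals, and eij = fi· · f·j / f. A minimal sketch on a made-up 2 × 3 table of observed counts:

```python
# Chi-square statistic for an r x c contingency table (item 10), with
# expected frequencies e_ij = (row total)(column total)/f. The observed
# counts below are hypothetical.
obs = [
    [20, 30, 10],
    [30, 25, 15],
]
row_totals = [sum(row) for row in obs]
col_totals = [sum(col) for col in zip(*obs)]
f = sum(row_totals)  # grand total of all entries

chi2 = 0.0
for i, row in enumerate(obs):
    for j, f_ij in enumerate(row):
        e_ij = row_totals[i] * col_totals[j] / f
        chi2 += (f_ij - e_ij) ** 2 / e_ij

df = (len(obs) - 1) * (len(obs[0]) - 1)
print(f"chi-square = {chi2:.3f} with {df} degrees of freedom")
# Reject H0 (independence) at level alpha if chi2 >= chi^2_{alpha, df}.
```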
Remark: Also, (9) is used for testing 'goodness of fit' when we expect that the observed data follow some particular distribution. There, we interpret the fi's as the observed frequencies, and the ei's as the expected frequencies obtained by the use of the particular distribution.