PART 9
Testing of hypotheses
M S Sridhar
Head, Library & Documentation
ISRO Satellite Centre
Bangalore 560017
Importance
It is very difficult, laborious & time-consuming to make
adequate discriminations in the complex interplay of facts
without a working hypothesis
1. Gives definite point to the inquiry
2. Helps establish direction
Directs our search for order among facts & provides a considerable
advantage in inquiry by suggesting an explanation or solution
3. Prevents blind search & indiscriminate gathering of data
While searching for significant & relevant facts to explain the
problem, shows the essential relationship that exists between
various elements within the complexity
4. Helps to delimit the field of inquiry
In the search, the researcher may fall back on previous experience,
his own or that of others, & single out those factors that are
known to have explained similar situations in the past, as
observed in the descriptive literature or speculative philosophy
[Flow chart of the testing procedure: state H0 as well as Ha; a Yes/No check then leads to the decision to reject H0 or accept H0]
3. F-Test
Based on the F-distribution
$F = \dfrac{\sigma_{s1}^2}{\sigma_{s2}^2}$
Used in the context of ANOVA and for testing the significance
of multiple correlation coefficients, comparing the variances of two
independent samples, etc.
4. χ2-Test
Based on the Chi-square distribution
$\chi^2 = \sum \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Used for comparing a sample variance to a theoretical population
variance
$\chi^2 = \dfrac{\sigma_s^2}{\sigma_p^2}\,(n - 1)$
$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{p1}^2}{n_1} + \dfrac{\sigma_{p2}^2}{n_2}}}$
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sum (X_{1i} - \bar{X}_1)^2 + \sum (X_{2i} - \bar{X}_2)^2}{n_1 + n_2 - 2}} \times \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
3. Comparing two related means
$t = \dfrac{\bar{D} - 0}{\sigma_{diff} / \sqrt{n}}$ ; df = n - 1 ; $\bar{D} = \dfrac{\sum D_i}{n}$ ; $\sigma_{diff} = \sqrt{\dfrac{\sum D_i^2 - (\bar{D})^2\, n}{n - 1}}$
$A = \dfrac{\sum D_i^2}{(\sum D_i)^2}$
-----------------------------------------------------------------------
i        Xi       (Xi - ¯X)     (Xi - ¯X)²
-----------------------------------------------------------------------
1        578         6            36
2        572         0             0
3        570        -2             4
4        568        -4            16
5        572         0             0
6        578         6            36
7        570        -2             4
8        572         0             0
9        596        24           576
10       544       -28           784
-----------------------------------------------------------------------
n = 10   ∑ Xi = 5720             ∑ (Xi - ¯X)² = 1456
-----------------------------------------------------------------------
-----------------------------------------------------------------------
i        Xi       Reference value    Di = Xi - 578     Di²
-----------------------------------------------------------------------
1        578         578                 0               0
2        572         578                -6              36
3        570         578                -8              64
4        568         578               -10             100
5        572         578                -6              36
6        578         578                 0               0
7        570         578                -8              64
8        572         578                -6              36
9        596         578                18             324
10       544         578               -34            1156
-----------------------------------------------------------------------
n = 10                               ∑ Di = -60     ∑ Di² = 1816
-----------------------------------------------------------------------
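As an illustrative sketch (not part of the original slides), the column totals of the table above and the ratio A = ∑Di² / (∑Di)² can be reproduced in a few lines of Python. The variable names are assumptions made for this example, and 578 is simply the value that appears in the third column of the table.

```python
# Observations Xi from the table; 578 is the value in the table's third column.
x = [578, 572, 570, 568, 572, 578, 570, 572, 596, 544]
reference = 578

# Differences Di = Xi - 578 and their squares
d = [xi - reference for xi in x]
sum_d = sum(d)                       # sum of Di
sum_d_sq = sum(di ** 2 for di in d)  # sum of Di squared

# A = (sum of Di squared) / (sum of Di) squared
a_stat = sum_d_sq / sum_d ** 2

print(sum_d, sum_d_sq, round(a_stat, 4))
```

The resulting A would then be compared with a tabulated critical value for n - 1 degrees of freedom; that lookup is not shown here.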
Example: Given below are the times taken in minutes by 7 untrained users (X1i)
and 5 trained users (X2i) for executing a query on an online database. Is there
any evidence at the 5% significance level that the training has reduced the time
taken for executing a query? Use 10 and 8 as assumed means for X1i and X2i
respectively.
H0 : µ1 = µ2 ; Ha: µ1 > µ2
$\bar{X}_1 = A_1 + \dfrac{\sum (X_{1i} - A_1)}{n_1} = 10 + \dfrac{28}{7} = 14$
$\bar{X}_2 = A_2 + \dfrac{\sum (X_{2i} - A_2)}{n_2} = 8 + \dfrac{15}{5} = 11$
$\sigma_{s1}^2 = \dfrac{\sum (X_{1i} - A_1)^2 - \left[\sum (X_{1i} - A_1)\right]^2 / n_1}{n_1 - 1} = 3.667$
$\sigma_{s2}^2 = \dfrac{\sum (X_{2i} - A_2)^2 - \left[\sum (X_{2i} - A_2)\right]^2 / n_2}{n_2 - 1} = 6$
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(n_1 - 1)\,\sigma_{s1}^2 + (n_2 - 1)\,\sigma_{s2}^2}{n_1 + n_2 - 2}} \times \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
$= \dfrac{14 - 11}{\sqrt{\dfrac{(7 - 1)(3.667) + (5 - 1)(6)}{7 + 5 - 2}} \times \sqrt{\dfrac{1}{7} + \dfrac{1}{5}}} = 2.389$
Table value of t for 10 df at α = 0.05 for a one-tail test is 1.812. Hence H0 is
rejected
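The pooled-variance t computation above can be checked with a short Python sketch. This is only an illustration under the summary figures quoted in the example (means 14 and 11, variances 3.667 and 6, n1 = 7, n2 = 5); the variable names are assumptions, and evaluating the expression with these inputs gives a value of about 2.39.

```python
import math

# Summary figures quoted in the example
n1, n2 = 7, 5
mean1, mean2 = 14.0, 11.0
var1, var2 = 3.667, 6.0

# Pooled variance with n1 + n2 - 2 degrees of freedom
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# t statistic for the difference between two independent means
t_stat = (mean1 - mean2) / (math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2))

critical_t = 1.812  # one-tail table value for 10 df at alpha = 0.05 (from the slide)
print(round(t_stat, 3), t_stat > critical_t)
```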
Mean difference, $\bar{D} = \dfrac{\sum D_i}{n} = \dfrac{-7}{9} = -0.778$
$\sigma_{diff} = \sqrt{\dfrac{\sum D_i^2 - (\bar{D})^2\, n}{n - 1}} = \sqrt{\dfrac{121 - (-3.5)^2 \times 6}{6 - 1}} = 3.08$
4. Testing the Equality of Variances of Two Normal Populations
Example: Given below are two random samples (X1i & X2i ) drawn
from two normal populations. Test using variance ratio at 5% and 2%
level of significance whether the two populations have the same
variance
H0: $\sigma_{p1}^2 = \sigma_{p2}^2$
$\bar{X}_1 = \dfrac{\sum X_{1i}}{n_1} = \dfrac{220}{10} = 22$
$\bar{X}_2 = \dfrac{\sum X_{2i}}{n_2} = \dfrac{420}{12} = 35$
M S Sridhar, ISRO Testing of Hypotheses 38
---------------------------------------------------------------------------------
          Sample 1                                  Sample 2
X1i   (X1i - ¯X1)   (X1i - ¯X1)²        X2i   (X2i - ¯X2)   (X2i - ¯X2)²
---------------------------------------------------------------------------------
20        -2             4               27       -8            64
16        -6            36               33       -2             4
26         4            16               42        7            49
27         5            25               35        0             0
23         1             1               32       -3             9
22         0             0               34       -1             1
18        -4            16               38        3             9
24         2             4               28       -7            49
25         3             9               41        6            36
19        -3             9               43        8            64
                                         30       -5            25
                                         37        2             4
---------------------------------------------------------------------------------
∑ X1i = 220   ∑ (X1i - ¯X1)² = 120       ∑ X2i = 420   ∑ (X2i - ¯X2)² = 314
---------------------------------------------------------------------------------
n1 = 10                                  n2 = 12
---------------------------------------------------------------------------------
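$\sigma_{s1}^2 = \dfrac{\sum (X_{1i} - \bar{X}_1)^2}{n_1 - 1} = \dfrac{120}{10 - 1} = 13.33$
$\sigma_{s2}^2 = \dfrac{\sum (X_{2i} - \bar{X}_2)^2}{n_2 - 1} = \dfrac{314}{12 - 1} = 28.55$
$F = \dfrac{\sigma_{s2}^2}{\sigma_{s1}^2} = \dfrac{28.55}{13.33} = 2.14$ (since $\sigma_{s2}^2 > \sigma_{s1}^2$)
df: V1 = n2 - 1 = 12 - 1 = 11
V2 = n1 - 1 = 10 - 1 = 9 (Since V1 > V2)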
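A minimal Python sketch of the variance-ratio calculation, using the two samples from the table, is given here for illustration. The function and variable names are assumptions made for this example; the larger sample variance is placed in the numerator, as in the worked figures.

```python
# The two samples from the worked example
sample1 = [20, 16, 26, 27, 23, 22, 18, 24, 25, 19]
sample2 = [27, 33, 42, 35, 32, 34, 38, 28, 41, 43, 30, 37]


def sample_variance(values):
    """Unbiased sample variance: sum of squared deviations over (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


var1 = sample_variance(sample1)  # 120 / 9
var2 = sample_variance(sample2)  # 314 / 11

# Variance ratio with the larger variance in the numerator
f_ratio = max(var1, var2) / min(var1, var2)

print(round(var1, 2), round(var2, 2), round(f_ratio, 2))
```

The computed ratio would then be compared with the tabulated F value for (11, 9) degrees of freedom at the chosen significance level.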
$\bar{X} = \dfrac{\sum X_i}{n} = \dfrac{470}{10} = 47$ kgs
$\sigma_s^2 = \dfrac{\sum (X_i - \bar{X})^2}{n - 1} = \dfrac{280}{10 - 1} = 31.11$ [H0: $\sigma_p^2 = \sigma_s^2$]
$\chi^2 = \dfrac{\sigma_s^2}{\sigma_p^2}\,(n - 1) = \dfrac{31.11}{20}\,(10 - 1) = 13.999$
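The same chi-square calculation can be sketched in Python from the summary figures quoted above (n = 10, sum of squared deviations 280, hypothesised population variance 20). The names are assumptions for this illustration; without intermediate rounding the statistic comes out as 14.0, which matches the slide's 13.999 up to rounding of 31.11.

```python
# Summary figures quoted in the example
n = 10
sum_sq_dev = 280        # sum of (Xi - mean) squared
pop_variance = 20       # hypothesised population variance

# Sample variance and the chi-square statistic with n - 1 degrees of freedom
sample_variance = sum_sq_dev / (n - 1)
chi_sq = sample_variance / pop_variance * (n - 1)

print(round(sample_variance, 2), round(chi_sq, 3))
```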
17, 15, 20, 29, 19, 18, 22, 25, 27, 9, 24, 20, 17, 6, 24, 14, 15, 23, 24,
26, 19, 23, 28, 19, 16, 22, 24, 17, 20, 13, 19, 23, 24, 17, 20, 13, 19,
10, 23, 18, 31, 13, 20, 171, 24, 14
There are 12 +ve and 30 -ve signs
X = 12, i.e., Number of +ve or -ve signs whichever is lower
$Z = \dfrac{X - np}{\sqrt{npq}} = \dfrac{12 - (46)(1/2)}{\sqrt{(46)(1/2)(1/2)}} = -3.2437 \Rightarrow |Z| = 3.2437$
As the tabulated critical value of Z at α = 0.05 is 1.645, H0 is rejected
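For illustration, the normal approximation used in this sign test can be written as a small Python function. The function name and the default p = 1/2 are assumptions for this sketch; the inputs X = 12 and n = 46 are those quoted above.

```python
import math


def sign_test_z(x, n, p=0.5):
    """Normal approximation for the sign test: Z = (X - np) / sqrt(npq)."""
    q = 1 - p
    return (x - n * p) / math.sqrt(n * p * q)


z = sign_test_z(12, 46)
critical_z = 1.645  # tabulated value at alpha = 0.05 quoted in the slide

print(round(z, 4), abs(z) > critical_z)
```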
Signs are + + - + + + +
There are 6 + and 1 -
Probability of 6 or more successes in 7 trials with p = 1/2 is 0.063
(see binomial probability distribution table)
This is less than α = 0.10; hence H0 is rejected, i.e., the training is
effective
Note: For large samples (i.e., both n.p & n.q are > 5) the normal
distribution can be used
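The binomial probability quoted for the small-sample sign test can be computed directly instead of being read from a table. The sketch below is an illustration only; the function name is an assumption, and it evaluates the probability of 6 or more plus signs in 7 trials with p = 1/2.

```python
from math import comb


def upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for a binomial variable with the given trials and p."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


prob = upper_tail(6, 7)  # (7 + 1) / 128
alpha = 0.10

print(round(prob, 3), prob < alpha)
```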
5. Fisher-Irwin Test
• To test whether 2 different treatments differ in terms of the results they
produce; the null hypothesis is that there is no difference between the 2 sets of data
• Applicable for situations where observations of each item can be classified
into one of 2 mutually exclusive categories
Example:
                     No. Passed    No. Failed    Total
New Training (A)         5             1           6
Old Training (B)         3             3           6
-------------------------------------------------------
Total                    8             4          12
H0: Two programmes are equally good
Probability of Group A doing as well or better = Probability (5 passing &
1 failing) + Probability (6 passing & 0 failing)
$= \dfrac{{}^{8}C_{5} \times {}^{4}C_{1}}{{}^{12}C_{6}} + \dfrac{{}^{8}C_{6} \times {}^{4}C_{0}}{{}^{12}C_{6}} = \dfrac{224}{924} + \dfrac{28}{924} = 0.24 + 0.03 = 0.27$
Alternatively, probability of Group B doing as well or worse = Probability
(3 passing & 3 failing) + Probability (2 passing & 4 failing)
$= \dfrac{{}^{8}C_{3} \times {}^{4}C_{3}}{{}^{12}C_{6}} + \dfrac{{}^{8}C_{2} \times {}^{4}C_{4}}{{}^{12}C_{6}} = 0.27$
As this probability (0.27) is greater than α = 0.05, H0 is not rejected
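The combinatorial probabilities in this example can be reproduced with Python's standard library. This is a sketch for illustration; the function and parameter names are assumptions, and the counts (8 passed, 4 failed, 6 trainees per programme) are those of the table above.

```python
from math import comb


def table_probability(passed_a, size_a, total_passed, total):
    """Probability of a 2x2 table with fixed margins (hypergeometric term)."""
    total_failed = total - total_passed
    failed_a = size_a - passed_a
    return (
        comb(total_passed, passed_a) * comb(total_failed, failed_a)
        / comb(total, size_a)
    )


# Group A doing as well or better: 5 or 6 passes among its 6 trainees
p_value = sum(table_probability(k, 6, 8, 12) for k in (5, 6))
alpha = 0.05

print(round(p_value, 2), p_value > alpha)
```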
${}^{n}C_{r} = \dfrac{n(n-1)\cdots(n-r+1)}{r(r-1)\cdots 3 \cdot 2 \cdot 1} = \dfrac{n!}{r!\,(n-r)!} = \dfrac{{}^{n}P_{r}}{r!}$
${}^{8}C_{3} = \dfrac{8 \times 7 \times 6}{3 \times 2 \times 1} = \dfrac{336}{6} = 56$
Note 2. Probability table can also be consulted for given n and y
H0: There is no change in people’s attitude before and after the treatment
H0: P(A) = P(D) , i.e., Probability (Favourable before + Unfavourable after)
= Probability (Unfavourable before + Favourable after)
$\chi^2 = \dfrac{(|A - D| - 1)^2}{A + D} = \dfrac{(|200 - 100| - 1)^2}{200 + 100} = \dfrac{(99)^2}{300} = 32.67$ with df = 1
Table value of χ2 for df = 1 at α = 0.05 is 3.84. Hence the null hypothesis is rejected
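The chi-square statistic above, with its correction of 1 in the numerator, is easy to evaluate in code. The following Python sketch is illustrative; the function name is an assumption, and A = 200 and D = 100 are the counts quoted in the example.

```python
def changed_pairs_chi_sq(a, d):
    """Chi-square for changes in opposite directions: (|A - D| - 1)^2 / (A + D)."""
    return (abs(a - d) - 1) ** 2 / (a + d)


chi_sq = changed_pairs_chi_sq(200, 100)
critical = 3.84  # table value for df = 1 at alpha = 0.05 quoted in the slide

print(round(chi_sq, 2), chi_sq > critical)
```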
7. The Median Test
To test whether two independent samples, of the same or different
sizes, come from the same population (or from different populations)
with the same median
The two samples are combined, the median of the combined series is determined,
and a 2x2 table is formed by sorting items into those above and below the median
Example: PRECIS and POPSI were adopted for indexing 8 sets of
micro documents and given below are their effectiveness. Test
the hypothesis that there is no difference between these two
scores at α = 0.05
Set No. 1 2 3 4 5 6 7 8
A. PRECIS 49 32 44 48 51 34 30 42
B. POPSI 40 45 50 43 37 47 55 57
The combined series has a median of 44.5
Grouping the elements above & below the median:
                  PRECIS     POPSI     Total
Above median      3 (a)      5 (b)     8 (n1)
Below median      5 (c)      3 (d)     8 (n2)
Expected frequency of any cell = $\dfrac{\text{total for the row of that cell} \times \text{total for the column of that cell}}{\text{Grand total}}$
df = (c-1) (r-1)
If the calculated value of χ2 is equal to or more than the tabulated value for the
given df, the association is significant
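As a sketch of how the median test table and the expected-frequency rule fit together, the Python below rebuilds the 2x2 table from the PRECIS and POPSI scores and computes χ2 from observed and expected frequencies. It is an illustration only: the variable names are assumptions, and no continuity correction is applied, which is a choice made for this example rather than something stated in the slides.

```python
from statistics import median

precis = [49, 32, 44, 48, 51, 34, 30, 42]
popsi = [40, 45, 50, 43, 37, 47, 55, 57]

# Median of the combined series
combined_median = median(precis + popsi)

# Observed 2x2 table: rows = above/below median, columns = PRECIS/POPSI
observed = [
    [sum(v > combined_median for v in precis), sum(v > combined_median for v in popsi)],
    [sum(v < combined_median for v in precis), sum(v < combined_median for v in popsi)],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square: sum over cells of (O - E)^2 / E, with E = row total x column total / grand total
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - expected) ** 2 / expected

df = (len(observed[0]) - 1) * (len(observed) - 1)
print(combined_median, observed, chi_sq, df)
```

With df = 1 the statistic would be compared with the tabulated value of 3.84 at α = 0.05.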
df = x-1 = 3 - 1 = 2
Tabulated value of χ2 at 2 df and α = 0.05 is 5.991. Hence H0 is
accepted, i.e., there is no significant difference
$Z = \dfrac{U - \mu_U}{\sigma_U} = \dfrac{U - n_1 n_2 / 2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1) / 12}}$
When neither of the samples (n1 & n2) is greater than 8, U is the number of
times that a score in the group with n2 elements precedes a score in the
group with n1 elements (if n1<n2)
8 10 11 12 14 15 16
(S2) (S2) (S1) (S2) (s1) (s2) (S2)
U=0+0+1+2=3
For U = 3, n1 = 3, n2 = 4 the probability p is 0.200, which is higher than α =
0.005 and hence H0 is accepted
Test the hypothesis that there is no difference in the ‘half-life’ of two sets
of books
Combined rank: 2.4 6.4 6.9 9.0 9.1 9.3 11.1 11.2 11.5 13.0 13.2 13.7
13.9 14.0 15.1 15.5 15.8 16.0 16.1 17.2 17.8 18.0 18.2 18.3
R1 =113, R2 = 187,
$U_1 = n_1 n_2 + \dfrac{n_1 (n_1 + 1)}{2} - R_1 = (12)(12) + \dfrac{12(13)}{2} - 113 = 109$, $U_2 = 35$
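The step from rank sums to U values can be sketched in Python as follows. The function name is an assumption for this illustration; the sample sizes (12 and 12) and rank sums (113 and 187) are the figures quoted above.

```python
def u_from_rank_sum(n_own, n_other, rank_sum):
    """U = n1*n2 + n_own*(n_own + 1)/2 - R, where R is the group's rank sum."""
    return n_own * n_other + n_own * (n_own + 1) / 2 - rank_sum


u1 = u_from_rank_sum(12, 12, 113)
u2 = u_from_rank_sum(12, 12, 187)

# The two U values always add up to n1 * n2, which is a handy check
print(u1, u2, u1 + u2 == 12 * 12)
```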
$Z = \dfrac{196.5 - (15)(15)/2}{\sqrt{(15)(15)(31)/12}} = \dfrac{84}{24.109} = 3.48$
At 5% significance, the critical value of Z is 1.96
Hence H0 is rejected, i.e., the two groups of students do not belong to the same
population
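For larger samples the U statistic is converted to Z with the mean and standard deviation given earlier. The Python sketch below is an illustration with assumed names; evaluating the expression with U = 196.5 and n1 = n2 = 15 gives a value of about 3.48, which is well above the critical value of 1.96.

```python
import math


def u_to_z(u, n1, n2):
    """Z = (U - n1*n2/2) / sqrt(n1*n2*(n1 + n2 + 1)/12)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


z = u_to_z(196.5, 15, 15)
critical_z = 1.96  # critical value at 5% significance quoted in the slide

print(round(z, 2), z > critical_z)
```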
10. Wilcoxon Matched Pair or Signed Rank Test
• Used in the context of two related samples where we can
determine both the direction and the magnitude of the difference. Examples:
wife & husband, subjects studied before & after an experiment,
comparing the output of two machines, etc.
• As it attaches greater weight to a pair which shows a larger
difference, it is a more powerful test than the sign test
• Null hypothesis (Ho) is that there is no difference between the two
groups with respect to the characteristic under study
Steps :
• Find the differences di between each pair of values
• Assign rank to the differences from smallest to largest without
regard to sign
• Find sum of the ranks of +ve &-ve separately
• T is the smaller of the two sums
• For small sample use table values of ‘T’ where ‘n’ is the number
of pairs (excluding those with di = 0)
• For large samples (n>25), the Z test is used with
$\mu_T = \dfrac{n(n + 1)}{4}$, $\sigma_T = \sqrt{\dfrac{n(n+1)(2n+1)}{24}}$ and $Z = \dfrac{T - \mu_T}{\sigma_T}$
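The steps listed above can be sketched in Python as follows. This is a minimal illustration, not the slides' own procedure in code: the function names and the sample data are hypothetical, and tied absolute differences are given the average of their ranks, which is an assumption the slides do not spell out.

```python
import math


def wilcoxon_t(before, after):
    """Return (T, n): the smaller signed-rank sum and the number of non-zero pairs."""
    # Differences for each pair, dropping pairs with no difference
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)

    # Rank absolute differences from smallest to largest (average rank for ties)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), len(ordered)


def wilcoxon_z(t, n):
    """Large-sample approximation: Z = (T - n(n+1)/4) / sqrt(n(n+1)(2n+1)/24)."""
    mu_t = n * (n + 1) / 4
    sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mu_t) / sigma_t


# Hypothetical paired scores, for illustration only
before = [10, 12, 9, 14, 11, 13]
after = [12, 11, 13, 14, 16, 10]
t_value, n_pairs = wilcoxon_t(before, after)
print(t_value, n_pairs)
```

For a sample this small, T would be compared with the table value for n pairs; `wilcoxon_z` applies only to the large-sample case.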
Since N ≤ ?, the value of S is compared with the critical value in the table for k = 4 & N = 7 at the
5% significance level, i.e., 217.0. As S = 332 exceeds this value, H0 is rejected. Hence the professors are applying
essentially the same standard in ranking the journals, i.e., W is significant.
$W = \dfrac{S}{\frac{1}{12}\, k^2 (N^3 - N)} = \dfrac{332}{\frac{1}{12}\,(4^2)(7^3 - 7)} = 0.741$
The lowest value observed amongst Rj is 7, and it belongs to journal J1. As such, the best estimate of the true
ranking puts journal J1 at the top. In other words, the professors on the whole place
journal J1 first.
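The coefficient W in this example can be recomputed with a one-line formula. The Python sketch below is illustrative; the function name is an assumption, and S = 332, k = 4 and N = 7 are the figures quoted above.

```python
def coefficient_w(s, k, n):
    """W = S / ((1/12) * k^2 * (N^3 - N)) for k judges ranking N objects."""
    return s / (k ** 2 * (n ** 3 - n) / 12)


w = coefficient_w(332, 4, 7)
critical_s = 217.0  # tabulated critical value of S for k = 4, N = 7 at 5%

print(round(w, 3), 332 > critical_s)
```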