
PARAMETRIC & NON-PARAMETRIC TESTS

By: Aarti, Research Scholar, Dept. of Hotel & Tourism Management, Kurukshetra University, Kurukshetra

HYPOTHESIS
An unproved theory; something taken to be true for the purpose of investigation; an assumption; the antecedent of a conditional statement; a supposition.


Measurement
1. Nominal or Classificatory Scale
Gender, ethnic background

2. Ordinal or Ranking Scale
Hardness of rocks, beauty, military ranks

3. Interval Scale
Celsius or Fahrenheit

4. Ratio Scale
Kelvin temperature, speed, height, mass or weight


DATA
Mean: the average, or arithmetic mean, of the data.
Median: the value which comes halfway when the data are ranked in order.
Mode: the most common value observed.

In a normal distribution, the mean and median are the same.
If the median and mean are different, this indicates that the data are not normally distributed.
The mode is of little use.
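A quick sketch of these measures in Python, using the standard-library statistics module (the sample values are invented for illustration):

```python
import statistics

# Hypothetical right-skewed sample (illustrative values only)
data = [2, 3, 3, 4, 5, 9]

print(statistics.mean(data))    # 4.33... (pulled up by the outlier 9)
print(statistics.median(data))  # 3.5
print(statistics.mode(data))    # 3 (most common value)

# mean > median here, hinting that the data are not normally distributed
```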



Parametric Data
Data that can be measured: for example, heights, weights, depths, amounts of money, etc. Interval and ratio measurements are considered parametric. Data are considered parametric if they satisfy the following three assumptions:
Normality: the distribution is normal.
Equal variances: the populations from which the data are obtained should have equal variances. The F-test can be used to test the hypothesis that the samples have been drawn from populations with equal variances.
Independence: the observations should be independent of one another, and the data should be measured on an interval (or ratio) scale.
If any of these assumptions is untrue, the results of the test may be invalid, and it is safest to use a non-parametric test.
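As a rough illustration of the equal-variances check, here is a minimal sketch of the classical two-sample F-test in Python with SciPy. The sample values are hypothetical; SciPy offers no one-call F-test for variances, so the variance ratio is computed by hand, with Levene's test shown as a common ready-made alternative:

```python
import numpy as np
from scipy import stats

# Hypothetical samples (illustrative values only)
a = np.array([5.1, 4.9, 6.0, 5.5, 5.8, 4.7, 5.2])
b = np.array([5.0, 5.4, 5.6, 4.8, 5.9, 5.3, 5.1])

# Classical F-test: ratio of sample variances, referred to an F distribution
F = a.var(ddof=1) / b.var(ddof=1)
dfn, dfd = len(a) - 1, len(b) - 1
p = 2 * min(stats.f.sf(F, dfn, dfd), stats.f.cdf(F, dfn, dfd))  # two-sided p

# Ready-made (and more robust to non-normality) alternative: Levene's test
stat, p_levene = stats.levene(a, b)
```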


PARAMETRIC TESTS


These techniques are termed parametric because they focus on specific parameters of the population, usually the mean and variance. Also known as standard tests of hypothesis. They use mean values, standard deviation and variance to estimate differences between measurements that characterize particular populations. They are applied to parametric data: parametric tests require data from which means and variances can be calculated, i.e., interval and ratio data. As long as the actual data meet the parametric assumptions, regardless of the origin of the numbers, parametric tests can be conducted.


TYPES
Student's t-tests (Student is the pen name of the statistician who developed the test)
Z-test: based on the normal probability distribution; used for judging the significance of several statistical measures, particularly the mean
Chi-square test
F-test
Analysis of variance (ANOVA)
Linear regression, and others


Example: t-test
Suppose you have two independent groups (corresponding to two drugs) on which some measurement has been made, for example, the length of time until relief of pain. You want to determine if one drug has a better overall (shorter) time to relief than the other. However, when you examine the data it is obvious that its distribution is not normal (you can test for normality using a statistical test). If the data had been normally distributed, you would have performed a standard independent-group t-test on them.
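A minimal sketch of this decision in Python with SciPy. The relief-time values are invented for illustration; Shapiro-Wilk is one of several possible normality tests, and Mann-Whitney is used here as the non-parametric fallback:

```python
from scipy import stats

# Hypothetical times (hours) to pain relief for two independent drug groups
drug_a = [2.1, 3.5, 1.8, 2.9, 8.0, 2.4, 3.1]
drug_b = [4.2, 5.1, 3.8, 9.5, 4.4, 6.0, 5.2]

# Shapiro-Wilk normality test on each group (p > 0.05 => plausibly normal)
normal = all(stats.shapiro(g)[1] > 0.05 for g in (drug_a, drug_b))

if normal:
    stat, p = stats.ttest_ind(drug_a, drug_b)              # parametric t-test
else:
    stat, p = stats.mannwhitneyu(drug_a, drug_b,
                                 alternative="two-sided")  # non-parametric
print(p < 0.05)  # True => the drugs differ in time to relief
```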


NON-PARAMETRIC TESTS


Although non-parametric techniques do not require the stringent assumptions associated with their parametric counterparts, this does not imply that they are assumption-free. CHARACTERISTICS:
Fewer assumptions regarding the population distribution
Sample-size requirements are less stringent (small samples are acceptable)
Measurement level may be nominal or ordinal
Independence of randomly selected observations, except when paired
Primary focus is on the rank ordering or frequencies of data
Hypotheses are posed regarding ranks, medians, or frequencies of data

Non-parametric tests are needed when the sample distribution is unknown, or when the population distribution is not normal.


Methods
Chi-square test (χ²): used to compare observed and expected data. Also known as the test of goodness of fit, test of independence, or test of homogeneity.

Kruskal-Wallis test: for testing whether samples originate from the same distribution; used for comparing more than two samples that are independent, or not related. An alternative to ANOVA.
Wilcoxon signed-rank test: used when comparing two related samples, or repeated measurements on a single sample, to assess whether their population mean ranks differ.
Median test: used to test the null hypothesis that the medians of the populations from which two samples are drawn are identical. The data in each sample are assigned to two groups: one consisting of values higher than the median of the two groups combined, and the other of values at the median or below.
Sign test: can be used to test the hypothesis that there is "no difference in medians" between the continuous distributions of two random variables X and Y.
Fisher's exact test: used in the analysis of contingency tables where sample sizes are small.
(Several of these tests are sketched in code below.)

Statistical tests for paired or matched observations:

Variable                              | Test
Nominal                               | McNemar's test
Ordinal                               | Wilcoxon
Quantitative (discrete or non-normal) | Wilcoxon
Quantitative (normal*)                | Paired t-test


Note
These tables should be considered as guides only, and each case should be considered on its merits. The Kruskal-Wallis test is used for comparing ordinal or non-normal variables for more than two groups, and is a generalisation of the Mann-Whitney U test. Analysis of variance is a general technique, and one version (one-way analysis of variance) is used to compare normally distributed variables for more than two groups; it is the parametric equivalent of the Kruskal-Wallis test. If the outcome variable is the dependent variable, then provided the residuals are plausibly normal, the distribution of the independent variable is not important. There are a number of more advanced techniques, such as Poisson regression, for dealing with these situations; however, they require certain assumptions, and it is often easier to either dichotomise the outcome variable or treat it as continuous. When valid, use a parametric test. Non-parametric tests are useful for non-normal data: if possible, transform the data towards normality; if normalization is not possible, use a non-parametric test. Commonly used non-parametric tests include:

Wilcoxon signed-rank test
Wilcoxon rank-sum test
Spearman rank correlation
Chi-square, etc.


ADVANTAGES
Can treat samples from several different populations.
If the sample size is small, there is no alternative.
Usable when the data are nominal or ordinal.
Easier to learn and apply than parametric tests.


DISADVANTAGES
Discard information by converting measurements to ranks.
Less powerful than parametric tests.
Tables of critical values may not be easily available.
Can give a false sense of security.


Wilcoxon signed-rank test

To test the difference between paired data.
CALCULATION:
Step 1: Exclude any differences which are zero. Ignoring their signs, put the remaining differences in ascending order and assign them ranks. If any differences are equal, average their ranks.
Step 2: Count up the ranks of the positive differences as T+ and the ranks of the negative differences as T-.


Step 3: If there is no difference between drug (T+) and placebo (T-), then T+ and T- would be similar. If there is a difference, one sum would be much smaller and the other much larger than expected. The larger sum is denoted T: T = larger of T+ and T-.

Step 4: Compare the value obtained with the critical values (5%, 2% and 1%) in the table, where N is the number of differences that were ranked (not the total number of differences, since zero differences are excluded).

EXAMPLE
Hours of sleep

Patient | Drug | Placebo
1       | 6.1  | 5.2
2       | 7.0  | 7.9
3       | 8.2  | 3.9
4       | 7.6  | 4.7
5       | 6.5  | 5.3
6       | 8.4  | 5.4
7       | 6.9  | 4.2
8       | 6.7  | 6.1
9       | 7.4  | 3.8
10      | 5.8  | 6.3

Null hypothesis: hours of sleep are the same using placebo and the drug.

Hours of sleep

Patient | Drug | Placebo | Difference | Rank (ignoring sign)
1       | 6.1  | 5.2     | 0.9        | 3.5*
2       | 7.0  | 7.9     | -0.9       | 3.5*
3       | 8.2  | 3.9     | 4.3        | 10
4       | 7.6  | 4.7     | 2.9        | 7
5       | 6.5  | 5.3     | 1.2        | 5
6       | 8.4  | 5.4     | 3.0        | 8
7       | 6.9  | 4.2     | 2.7        | 6
8       | 6.7  | 6.1     | 0.6        | 2
9       | 7.4  | 3.8     | 3.6        | 9
10      | 5.8  | 6.3     | -0.5       | 1

* The 3rd and 4th ranks are tied, hence averaged. T = larger of T+ (50.5) and T- (4.5). Here, the calculated value of T = 50.5 and the tabulated value of T = 47 (at 5%), so the result is significant at the 5% level, indicating that the drug (hypnotic) is more effective than placebo.
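The same example can be checked in SciPy; a sketch, noting that stats.wilcoxon reports the smaller of T+ and T- (here 4.5) rather than the larger sum used by the table in the slide:

```python
from scipy import stats

# Hours of sleep from the example above
drug    = [6.1, 7.0, 8.2, 7.6, 6.5, 8.4, 6.9, 6.7, 7.4, 5.8]
placebo = [5.2, 7.9, 3.9, 4.7, 5.3, 5.4, 4.2, 6.1, 3.8, 6.3]

stat, p = stats.wilcoxon(drug, placebo)
print(stat)      # 4.5 = min(T+, T-); the slide's T = max(T+, T-) = 50.5
print(p < 0.05)  # True, matching the conclusion above
```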

Wilcoxon rank-sum test


Compares two groups.
Calculation:
Step 1: Rank the data of both groups together in ascending order. If any values are equal, average their ranks.
Step 2: Add up the ranks in the group with the smaller sample size (if the two groups are of the same size, either one may be picked). T = sum of ranks in the group with the smaller sample size.
Step 3: Compare this sum with the critical ranges given in the table: look up the rows corresponding to the sample sizes of the two groups; a range is shown for the 5% significance level.

Birth weight (kg):

Non-smokers (n=15): 3.99, 3.79, 3.60*, 3.73, 3.21, 3.60*, 4.08, 3.61, 3.83, 3.31, 4.13, 3.26, 3.54, 3.51, 2.71
Heavy smokers (n=14): 3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23, 2.76, 3.60*, 3.75, 3.59, 3.63, 2.38, 2.34

Null hypothesis: mean birth weight is the same for non-smokers and heavy smokers.

Non-smokers (n=15)              Heavy smokers (n=14)
Birth wt (kg) | Rank            Birth wt (kg) | Rank
3.99          | 27              3.18          | 7
3.79          | 24              2.84          | 5
3.60*         | 18              2.90          | 6
3.73          | 22              3.27          | 11
3.21          | 8               3.85          | 26
3.60*         | 18              3.52          | 14
4.08          | 28              3.23          | 9
3.61          | 20              2.76          | 4
3.83          | 25              3.60*         | 18
3.31          | 12              3.75          | 23
4.13          | 29              3.59          | 16
3.26          | 10              3.63          | 21
3.54          | 15              2.38          | 2
3.51          | 13              2.34          | 1
              | Sum = 272                     | Sum = 163

* Ranks 17, 18 and 19 are tied (three values of 3.60), hence the ranks are averaged to 18. Hence the calculated value of T = 163; the tabulated value of T (14, 15) = 151. Mean birth weights are not the same for non-smokers and heavy smokers: they are significantly different.
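A sketch of the same comparison in SciPy; stats.mannwhitneyu reports the U statistic rather than the rank sum T, where U = T - n(n+1)/2 for the group whose ranks were summed (details of which U is reported vary slightly across SciPy versions):

```python
from scipy import stats

# Birth weights (kg) from the example above
non_smokers = [3.99, 3.79, 3.60, 3.73, 3.21, 3.60, 4.08, 3.61,
               3.83, 3.31, 4.13, 3.26, 3.54, 3.51, 2.71]
heavy_smokers = [3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23, 2.76,
                 3.60, 3.75, 3.59, 3.63, 2.38, 2.34]

u, p = stats.mannwhitneyu(non_smokers, heavy_smokers,
                          alternative="two-sided")
print(p < 0.05)  # True => the birth weights differ significantly
```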

Spearman's Rank Correlation Coefficient

Based on the ranks of the items rather than their actual values; it can also be applied when the actual values are available, by ranking them first.

Example: to find the correlation between the honesty and wisdom of the boys of a class.
It can also be used to find the degree of agreement between the judgments of two examiners or two judges.


R (rank correlation coefficient) = 1 - (6 ΣD²) / (N(N² - 1))

where D = difference between the ranks of the two items, and N = the number of observations.

Note: -1 ≤ R ≤ 1.
i) When R = +1: perfect positive correlation, or complete agreement in the same direction.
ii) When R = -1: perfect negative correlation, or complete agreement in the opposite direction.
iii) When R = 0: no correlation.


COMPUTATION
Give ranks to the values of the items. Generally the item with the highest value is ranked 1 and the others are given ranks 2, 3, 4, ... according to their values in decreasing order.
Find the difference D = R1 - R2, where R1 = rank of x and R2 = rank of y. Note that ΣD = 0 (always).
Calculate D² for each pair and then find ΣD².
Apply the formula.


If there is a tie between two or more items, give each the average rank. If m is the number of items of equal rank, the factor (m³ - m)/12 is added to ΣD². If there is more than one such case, this factor is added as many times as the number of such cases.

Student No. | Rank in Maths (R1) | Rank in Stats (R2) | D = R1 - R2 | D²
1           | 1                  | 3                  | -2          | 4
2           | 3                  | 1                  | 2           | 4
3           | 7                  | 4                  | 3           | 9
4           | 5                  | 5                  | 0           | 0
5           | 4                  | 6                  | -2          | 4
6           | 6                  | 9                  | -3          | 9
7           | 2                  | 7                  | -5          | 25
8           | 10                 | 8                  | 2           | 4
9           | 9                  | 10                 | -1          | 1
10          | 8                  | 2                  | 6           | 36
N = 10      |                    |                    | ΣD = 0      | ΣD² = 96
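Applying the formula to this table gives R = 1 - (6 × 96) / (10 × (10² - 1)) = 1 - 576/990 ≈ 0.42. A sketch of the same computation in SciPy:

```python
from scipy import stats

# Ranks in Maths (R1) and Stats (R2) from the table above
maths_rank = [1, 3, 7, 5, 4, 6, 2, 10, 9, 8]
stats_rank = [3, 1, 4, 5, 6, 9, 7, 8, 10, 2]

rho, p = stats.spearmanr(maths_rank, stats_rank)
print(round(rho, 3))  # 0.418, matching 1 - 6*96 / (10*(10**2 - 1))
```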


Chi-square
Commonly used to compare observed data with the data we would expect to obtain according to a specific hypothesis. Chi-square requires that you use numerical values, not percentages or ratios. The formula for calculating chi-square (χ²) is:

χ² = Σ (o - e)² / e

That is, chi-square is the sum of the squared differences between the observed (o) and expected (e) data (the deviations, d), divided by the expected data, over all possible categories.

Procedure
1. State the hypothesis being tested and the predicted results.
2. Gather the data by conducting the proper experiment (or, if working genetics problems, use the data provided in the problem).
3. Determine the expected numbers for each observational class. Remember to use numbers, not percentages. Chi-square should not be calculated if the expected value in any category is less than 5.
4. Calculate χ² using the formula. Complete all calculations to three significant digits and round off your answer to two significant digits.
5. Use the chi-square distribution table to determine the significance of the value: determine the degrees of freedom (df), locate the value closest to your calculated χ² on that df row, and move up the column to read the p value. Then state your conclusion in terms of your hypothesis. If the p value for the calculated χ² is p > 0.05, accept your hypothesis: the deviation is small enough that chance alone accounts for it. A p value of 0.6, for example, means that there is a 60% probability that any deviation from expected is due to chance only; this is within the range of acceptable deviation. If the p value for the calculated χ² is p < 0.05, reject your hypothesis and conclude that some factor other than chance is operating for the deviation to be so great. For example, a p value of 0.01 means that there is only a 1% chance that this deviation is due to chance alone; therefore, other factors must be involved.
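A sketch of this procedure in SciPy on a hypothetical genetics example, assuming a 3:1 expected ratio among 400 offspring (the counts are invented for illustration):

```python
from scipy import stats

observed = [290, 110]   # hypothetical observed counts
expected = [300, 100]   # 3:1 ratio of 400 total

chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
# chi2 = (290-300)**2/300 + (110-100)**2/100 = 1.33, df = 1, p ~ 0.25
print(p > 0.05)  # True => deviation attributable to chance (accept hypothesis)
```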


                       | Parametric                | Non-parametric
Assumed distribution   | Normal                    | Any
Assumed variance       | Homogeneous               | Any
Typical data           | Ratio or interval         | Ordinal or nominal
Data set relationships | Independent               | Any
Usual central measure  | Mean                      | Median
Benefits               | Can draw more conclusions | Simplicity; less affected by outliers

Tests                            | Choosing a parametric test          | Choosing a non-parametric test
Correlation test                 | Pearson                             | Spearman
Independent measures, 2 groups   | Independent-measures t-test         | Mann-Whitney test
Independent measures, >2 groups  | One-way, independent-measures ANOVA | Kruskal-Wallis test
Repeated measures, 2 conditions  | Matched-pair t-test                 | Wilcoxon test
Repeated measures, >2 conditions | One-way, repeated-measures ANOVA    | Friedman's test

Limitations of the tests of hypothesis


The tests should not be used in a mechanical fashion: testing is not decision-making itself; the tests are only useful aids for decision-making.
The tests do not explain why the difference exists, say, between the means of two samples.
Results of significance tests are based on probabilities and as such cannot be expressed with full certainty.
The inferences based on the tests cannot be said to be entirely correct evidence concerning the truth of the hypothesis. This is especially so for small samples, where the chance of erroneous inferences is generally higher. For greater reliability, the sample size should be sufficiently large.
