Justification for the test: Under the null hypothesis, all $(m+n)!$ arrangements of
the data are equally likely. Label the $m$ smallest observations as type H and the
rest as type T. Each H/T pattern then corresponds to exactly $m!\,n!$ arrangements of
the data, so all $\binom{m+n}{m}$ possible arrangements of H and T are equally
likely. So we are justified in using this test.
DISTRIBUTION OF R UNDER H0
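The exact distribution of R is:
$P(R = 2k) = \dfrac{2\binom{m-1}{k-1}\binom{n-1}{k-1}}{\binom{m+n}{m}}$,
$P(R = 2k+1) = \dfrac{\binom{m-1}{k}\binom{n-1}{k-1} + \binom{m-1}{k-1}\binom{n-1}{k}}{\binom{m+n}{m}}$.
Let $a = m+n$; then
$\dfrac{R - 2a\lambda(1-\lambda)}{2\sqrt{a}\,\lambda(1-\lambda)} \to N(0,1)$ in distribution as $a \to \infty$, with $\lim m/a = \lambda$.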
So we can use this as an asymptotic test for relatively large a.
We will use tables of the exact distribution of R, which are available for sample sizes
less than 24. For all other cases we will use the asymptotic test.
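As a rough illustration of the two routes described above, the sketch below tabulates the standard exact null distribution of the number of runs and computes the standardized statistic. The function names `runs_pmf` and `runs_z` are hypothetical, and using $m/a$ in place of the limit $\lambda$ is our assumption for finite samples.

```python
from math import comb, sqrt


def runs_pmf(m, n):
    """Exact null pmf of the number of runs R with m H's and n T's (m, n >= 1)."""
    total = comb(m + n, m)  # number of equally likely H/T arrangements
    pmf = {}
    for r in range(2, m + n + 1):
        k, odd = divmod(r, 2)
        if odd:  # r = 2k + 1: one type has k + 1 runs, the other k
            ways = (comb(m - 1, k) * comb(n - 1, k - 1)
                    + comb(m - 1, k - 1) * comb(n - 1, k))
        else:    # r = 2k: both types have k runs, either type may come first
            ways = 2 * comb(m - 1, k - 1) * comb(n - 1, k - 1)
        if ways:
            pmf[r] = ways / total
    return pmf


def runs_z(r, m, n):
    """Standardized runs statistic; lambda is replaced by m / a (assumption)."""
    a = m + n
    lam = m / a
    return (r - 2 * a * lam * (1 - lam)) / (2 * sqrt(a) * lam * (1 - lam))


if __name__ == "__main__":
    for r, p in runs_pmf(7, 7).items():
        print(r, round(p, 4))
```

A table of this kind could stand in for printed tables at small sample sizes, while `runs_z` would be compared with normal quantiles at larger ones.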
CHECKING DISTRIBUTION FREE NATURE
We choose 4 different distributions: Uniform(0,1), Normal(0,1), Exponential(1) and
Cauchy(0,1).
For each, we draw a random sample of size m+n and compute R.
We repeat the above 10000 times and estimate P[R=r] for different values of r.
We plot these estimates on a graph.
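The simulation steps above can be sketched as follows. This is a minimal version using only the standard library. The helper names are hypothetical, the Cauchy draw uses the inverse-CDF transform, and we assume samples contain no ties (which holds with probability one for continuous distributions).

```python
import math
import random
from collections import Counter


def runs_statistic(sample, m):
    """Label the m smallest observations H, the rest T, and count the runs."""
    cutoff = sorted(sample)[m - 1]          # m-th smallest value (no ties assumed)
    labels = [x <= cutoff for x in sample]  # True = H, False = T, original order
    return 1 + sum(1 for u, v in zip(labels, labels[1:]) if u != v)


# One draw from each of the four distributions named in the text.
SAMPLERS = {
    "uniform": lambda rng: rng.random(),
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "exponential": lambda rng: rng.expovariate(1.0),
    "cauchy": lambda rng: math.tan(math.pi * (rng.random() - 0.5)),
}


def estimate_pmf(draw, m, n, reps, rng):
    """Monte Carlo estimate of P[R = r] from reps samples of size m + n."""
    counts = Counter(
        runs_statistic([draw(rng) for _ in range(m + n)], m) for _ in range(reps)
    )
    return {r: c / reps for r, c in sorted(counts.items())}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    for name, draw in SAMPLERS.items():
        print(name, estimate_pmf(draw, 7, 7, 10000, rng))
```

If the test is distribution free, the four estimated tables should agree up to simulation noise.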
[Figure: Distribution free property check. Estimated P[R=r] against r for the uniform, normal, exponential and Cauchy samples; m=7, n=7.]
[Figure: Distribution free property check. Estimated P[R=r] against r for the same four distributions; r from 5 to 20.]
[Figure: Distribution free property check. Estimated P[R=r] against r for the same four distributions; r from 10 to 40.]
The estimated probabilities for all the different distributions are very close to
each other. Thus, from a visual inspection, the null distribution of R does not
appear to depend on the underlying distribution.
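$Z_n = \dfrac{R - 2a\lambda(1-\lambda)}{2\sqrt{a}\,\lambda(1-\lambda)} \to N(0,1)$ in distribution as $a \to \infty$,
where $a = m+n$ and $\lim m/a = \lambda$.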
We shall empirically verify this fact.
[Figures: histograms of the standardized statistic z (vertical axis: proportion of observations) for (m, n) = (50, 50), (60, 80), (100, 150), (200, 200), (200, 50) and (200, 15).]
CONCLUSIONS OF LIMITING DISTRIBUTION STUDY
As a=m+n becomes large, R, when properly standardized, appears to approach a
normal distribution, as expected.
The approximation seems to be best when λ is close to 0.5.
If m<<n or n<<m, the convergence does not seem to happen fast enough as a grows.
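A study of this kind could be reproduced along the following lines. Because the null distribution of R depends only on the H/T pattern, this sketch simulates R by shuffling m H's and n T's directly instead of drawing data. The function name `simulate_z` and the use of $m/a$ for $\lambda$ are our assumptions.

```python
import random
from math import sqrt


def simulate_z(m, n, reps, rng):
    """Draw reps standardized runs statistics under the null hypothesis."""
    a = m + n
    lam = m / a                            # stand-in for the limit lambda (assumption)
    center = 2 * a * lam * (1 - lam)
    scale = 2 * sqrt(a) * lam * (1 - lam)
    labels = [True] * m + [False] * n      # True = H, False = T
    zs = []
    for _ in range(reps):
        rng.shuffle(labels)                # every H/T arrangement equally likely
        r = 1 + sum(1 for u, v in zip(labels, labels[1:]) if u != v)
        zs.append((r - center) / scale)
    return zs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    for m, n in [(50, 50), (200, 200), (200, 15)]:
        zs = simulate_z(m, n, 5000, rng)
        inside = sum(abs(z) <= 1.96 for z in zs) / len(zs)
        print(m, n, "share of |z| <= 1.96:", round(inside, 3))
```

Comparing the printed share with the nominal 0.95 for balanced and unbalanced (m, n) gives a numerical counterpart to the histograms.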
STUDY OF POWER
We need to check the power of the test procedure against different alternatives.
As our alternative is nonparametric, we take 5 different possible alternative
distributions and check the power empirically for each.
The 5 chosen alternatives are:
1. The $X_i$ are independent normal, but the means of successive $X_i$ increase by 0.01
2. The $X_i$ come from an MA(1) model with parameter 0.2
3. The $X_i$ come from an AR(1) model with parameter 0.6
4. The $X_i$ come from an MA(2) model with parameters 0.2 and 0.1
5. The $X_i$ come from an ARMA(1,1) model with parameters 0.2 and 0.2
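The power study could be set up roughly as below. Several choices here are assumptions, since the text does not fix them: unit-variance Gaussian noise, a burn-in of 50 steps for the time series, a two-sided asymptotic test at the 5% level (critical value 1.96) even for small samples, and an i.i.d. normal baseline as index 0. All function names are hypothetical.

```python
import random
from math import sqrt


def runs_statistic(sample, m):
    """Mark the m smallest values H and count runs in the original order."""
    cutoff = sorted(sample)[m - 1]
    labels = [x <= cutoff for x in sample]
    return 1 + sum(1 for u, v in zip(labels, labels[1:]) if u != v)


def runs_z(r, m, n):
    """Standardized runs statistic; lambda is replaced by m / a (assumption)."""
    a = m + n
    lam = m / a
    return (r - 2 * a * lam * (1 - lam)) / (2 * sqrt(a) * lam * (1 - lam))


def arma_series(a, phi, thetas, rng, burn=50):
    """X_t = phi*X_{t-1} + e_t + sum_j thetas[j]*e_{t-1-j}, with N(0,1) noise."""
    q = len(thetas)
    e = [rng.gauss(0.0, 1.0) for _ in range(a + burn + q)]
    x_prev, out = 0.0, []
    for t in range(q, a + burn + q):
        x = phi * x_prev + e[t] + sum(th * e[t - 1 - j] for j, th in enumerate(thetas))
        out.append(x)
        x_prev = x
    return out[burn:]  # drop the burn-in, leaving a observations


ALTERNATIVES = {
    0: lambda a, rng: [rng.gauss(0.0, 1.0) for _ in range(a)],       # i.i.d. baseline
    1: lambda a, rng: [rng.gauss(0.01 * i, 1.0) for i in range(a)],  # increasing means
    2: lambda a, rng: arma_series(a, 0.0, [0.2], rng),               # MA(1)
    3: lambda a, rng: arma_series(a, 0.6, [], rng),                  # AR(1)
    4: lambda a, rng: arma_series(a, 0.0, [0.2, 0.1], rng),          # MA(2)
    5: lambda a, rng: arma_series(a, 0.2, [0.2], rng),               # ARMA(1,1)
}


def estimate_power(generate, m, n, reps, rng, crit=1.96):
    """Share of simulated series on which the two-sided test rejects."""
    rejections = 0
    for _ in range(reps):
        r = runs_statistic(generate(m + n, rng), m)
        rejections += abs(runs_z(r, m, n)) > crit
    return rejections / reps


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    for idx, gen in ALTERNATIVES.items():
        print(idx, estimate_power(gen, 75, 125, 2000, rng))
```

Under index 0 the estimate approximates the size of the test. Replacing the fixed critical value with exact tables for small samples would follow the procedure described earlier more closely.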
[Figures: Power comparison for sample sizes 7+7, 11+11, 20+30 and 75+125. Each panel plots power against the hypothesis index (0 to 5).]
CONCLUSION OF POWER STUDY
Power for all chosen alternatives is seen to increase with sample size.
Estimated sizes of the test for some values of m and n are given below. All of these
are close to 0.05.
m     n     Estimated size
7     7     0.0513
11    11    0.0493
20    30    0.061
40    60    0.0508
75    125   0.0519
STUDY OF CONSISTENCY OF THE TEST
For a consistent test, the power against each alternative should increase towards 1 as
the sample size grows.
For this study we again take the 5 alternatives used in the power study.
We take m=n and choose m from 25 to 100 in steps of 5.
ALTERNATIVE 1: INCREASING MEANS
[Figure: power against a.]
ALTERNATIVE 2: MA(1) PARAMETER=0.2
[Figure: Consistency for Alternative 2. Power against a.]
ALTERNATIVE 3: AR(1) PARAMETER=0.6
[Figure: Consistency for Alternative 3. Power against a.]
ALTERNATIVE 4: MA(2) PARAMETERS = (0.2,0.1)
[Figure: Consistency for Alternative 4. Power against a.]
ALTERNATIVE 5: ARMA(1,1) PARAMETERS=(0.2,0.2)
[Figure: power against a.]
CONCLUSION OF CONSISTENCY
From the plots, the power of the test increases with sample size for each choice of
alternative, so the test appears to be consistent under all our choices of
alternatives.
CONCLUDING REMARKS
1) The null distribution of the test statistic does not depend on the underlying
distribution, as long as that distribution is continuous. So this is a distribution free test.
2) The test statistic is asymptotically normal.
3) The test is consistent under the alternatives we considered.
4) There is no parametric counterpart of this test.
5) Other tests based on runs, such as the runs up and down test, can be used for testing
the same set of hypotheses.