or $\delta_R$) was calculated from the bootstrap sample.
Fourth, the 600 ES estimates were ranked from low to high. The lower limit of the CI was
determined by finding the 15th estimate in the rank order [i.e., the .025(600)th estimate]; the
upper limit was determined by finding the 585th estimate [i.e., the .975(600)th estimate].
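The ranking step can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the procedure used in the study: it assumes that each bootstrap sample is drawn with replacement within each group, and it uses the ordinary pooled-variance standardized mean difference as a stand-in estimator. The function names (`cohens_d`, `rank_positions`, `percentile_bootstrap_ci`) are hypothetical.

```python
import random
import statistics


def cohens_d(x, y):
    """Stand-in ES estimator: mean difference over the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / pooled_var ** 0.5


def rank_positions(n_boot=600, alpha=0.05):
    """1-based ranks of the lower and upper limits in the sorted estimates."""
    lower = round((alpha / 2) * n_boot)       # .025(600) -> 15th estimate
    upper = round((1 - alpha / 2) * n_boot)   # .975(600) -> 585th estimate
    return lower, upper


def percentile_bootstrap_ci(x, y, stat=cohens_d, n_boot=600, alpha=0.05, seed=None):
    """Percentile bootstrap CI for a two-group ES (assumed within-group resampling)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        boot_x = rng.choices(x, k=len(x))     # resample group 1 with replacement
        boot_y = rng.choices(y, k=len(y))     # resample group 2 with replacement
        estimates.append(stat(boot_x, boot_y))
    estimates.sort()                          # rank from low to high
    lower, upper = rank_positions(n_boot, alpha)
    return estimates[lower - 1], estimates[upper - 1]


# Illustrative data only; these values are simulated, not taken from the study.
data_rng = random.Random(1)
group1 = [data_rng.gauss(0.5, 1.0) for _ in range(20)]
group2 = [data_rng.gauss(0.0, 1.0) for _ in range(20)]
print(percentile_bootstrap_ci(group1, group2, seed=2))
```

Any two-group ES estimator with the same call signature could be passed as `stat` in place of the stand-in.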
Results
Table 3 contains estimated coverage probabilities for percentile bootstrap intervals for $\delta$
and $\delta_R$ for all conditions in Study 2. The results indicate that the CI for $\delta$ performed
much less adequately than did the CI for $\delta_R$. In particular, coverage probability for the
former CI could be
Cohens Effect Size 11
quite poor when the data were nonnormal. Coverage probabilities for the CI for $\delta_R$ ranged from
.942 to .971 and were, therefore, near the nominal .95 value for all distributions. Coverage for
this interval exhibited a tendency to increase as $\delta_R$ increased, but appeared to be largely
unaffected by the distribution.
Additional Comparisons of CIs for $\delta_R$
Comparison of Interval Widths for CIs for $\delta_R$. Study 1 and Study 2 indicated that, when
data are nonnormal, probability coverage for $\delta$ can be quite poor for both noncentral t
distribution-based CIs and percentile bootstrap CIs. When the data were nonnormal, probability
coverage for the noncentral t distribution-based CI for $\delta_R$ was adequate when
$\delta_R \le .80$ and was adequate under some conditions when $\delta_R > .80$. Probability
coverage for the percentile bootstrap CI for $\delta_R$ was good for all conditions investigated.
Since probability coverage for both types of CIs for $\delta_R$ is adequate in some conditions, it
is important to compare the width of the two types of intervals. Average widths for the two types
are reported in Table 4 and show that, in general, the noncentral t distribution-based CI is
shorter. The width advantage for the noncentral t distribution-based CI was larger with smaller
sample sizes and larger values of $\delta_R$.
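The pattern in the width advantage can be checked directly from a few Table 4 entries. The short Python sketch below copies four cells for the g = h = 0 condition and computes the difference and ratio of the average widths; the names `widths` and `width_gap` are hypothetical and serve only this illustration.

```python
# Average CI widths copied from Table 4 for the g = h = 0 condition.
# Keys are (effect size, per-group sample size); values are (NCT, BOOT).
widths = {
    (0.00, 20): (1.368, 1.513),
    (0.00, 100): (0.597, 0.606),
    (1.40, 20): (1.627, 1.944),
    (1.40, 100): (0.698, 0.747),
}


def width_gap(effect_size, n):
    """Return (BOOT - NCT, BOOT / NCT) for one Table 4 cell."""
    nct, boot = widths[(effect_size, n)]
    return boot - nct, boot / nct


for (effect_size, n) in sorted(widths):
    diff, ratio = width_gap(effect_size, n)
    print(f"ES = {effect_size:.2f}, n = {n:3d}: "
          f"difference = {diff:.3f}, ratio = {ratio:.3f}")
```

In these four cells the gap is largest for the smallest sample size combined with the largest effect size, and it nearly vanishes at n = 100 when the effect size is zero.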
Power. For each condition we determined the proportion of times that the intervals for $\delta_R$
did not contain zero. These proportions estimate the power for tests of the hypothesis
$H_0\colon \delta_R = 0$ against the non-directional alternative and are reported in Table 5.
Typically, but not always, the CI based on the noncentral t distribution is estimated to have more
power than the percentile bootstrap CI. However, the estimated power differences were very small.
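Both the power estimates and the coverage probabilities are simple proportions over the simulated intervals. A minimal Python sketch follows; the function names and the toy intervals are hypothetical and are not results from the study.

```python
def proportion_excluding(intervals, value=0.0):
    """Share of (lower, upper) intervals that do not contain `value`."""
    misses = sum(1 for lower, upper in intervals if not (lower <= value <= upper))
    return misses / len(intervals)


def coverage_probability(intervals, true_value):
    """Share of (lower, upper) intervals that contain the true parameter value."""
    return 1.0 - proportion_excluding(intervals, true_value)


# Toy intervals for illustration only.
toy_intervals = [(-0.2, 0.9), (0.1, 1.2), (0.3, 1.5), (-0.1, 1.0)]

estimated_power = proportion_excluding(toy_intervals, value=0.0)
estimated_coverage = coverage_probability(toy_intervals, true_value=0.5)
print(estimated_power, estimated_coverage)
```

With the toy intervals, two of the four exclude zero and all four contain 0.5, so the two proportions are 0.5 and 1.0.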
Discussion
Although the need to report ESs is becoming more widely acknowledged and interest in
confidence intervals for ESs is increasing, little appears to be known about the robustness of CIs
for ESs. Our research indicates that noncentral t distribution-based CIs for Cohen's ES, i.e.,
$\delta$, and for a robust version of Cohen's ES, i.e., $\delta_R$, may not have adequate coverage
probability when the data are sampled from nonnormal distributions. However, the difference
between the nominal confidence level and the empirical coverage probability tended to be much
smaller for the CI on $\delta_R$ and, depending on one's tolerance for this difference, one might
regard the coverage probability as adequate, particularly when $\delta_R \le .80$.
As a result of the performance of the noncentral t distribution-based CIs, we investigated
whether CIs constructed by using the percentile bootstrap would have adequate coverage
probability. The results indicated that percentile bootstrap CIs for $\delta$ might not have
adequate coverage probability when data are nonnormal. By contrast, percentile bootstrap CIs for
$\delta_R$ had adequate coverage probability for the three nonnormal distributions we investigated.
Perhaps most important, the coverage probabilities were adequate for the full range of ESs
investigated, rather than just for $\delta_R \le .80$.
Although percentile bootstrap CIs for $\delta_R$ had better coverage probability than did the
noncentral t distribution-based CI, the latter confidence interval was shorter. Thus, some might
argue that additional simulations are needed to determine the conditions under which the two CIs
maintain probability coverage close to the nominal level in order to provide a basis for selecting
the CI most appropriate for their data. Unfortunately, visual inspection of one's data can be
misleading with respect to the degree of nonnormality. In Figure 7.1, Wilcox (2001), for
example, provides a graph of a very long-tailed distribution that is almost
indistinguishable from a graph of the normal distribution. Estimates of measures of skew and
kurtosis can be misleading because these estimates tend to have large standard errors unless the
sample size is very large. In addition, our results suggest that the size of $\delta_R$, in part,
determines which of the two CIs has better coverage probability. Clearly, researchers will only
have an estimate of $\delta_R$ and it is not clear how valid the estimate will be as a guide to
selecting between the two CI methods. Our point of view is that we should try to find a CI that
has good probability coverage over a wide range of distributions and values for $\delta_R$. In
fact, this point of view motivated studying distributions that were strongly nonnormal. Since the
percentile bootstrap CI for $\delta_R$ best met this criterion, we recommend this confidence
interval from among the four we investigated.
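The remark about the sampling variability of skew estimates can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: it draws normal samples (true skew of zero), uses the simple moment estimator of skew, and reports the standard deviation of the estimates across replications. The function names, the seed, and the number of replications are hypothetical choices.

```python
import random
import statistics


def sample_skewness(values):
    """Moment estimator of skew: m3 / m2**1.5 with divisor n."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skewness_spread(n, reps=500, seed=0):
    """Standard deviation of skew estimates over `reps` normal samples of size n."""
    rng = random.Random(seed)
    estimates = [
        sample_skewness([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(estimates)


for n in (20, 200, 2000):
    print(n, round(skewness_spread(n), 3))
```

Even when the population is exactly normal, the estimates at n = 20 scatter widely around zero, and the scatter shrinks only slowly as n grows.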
Footnotes
1. Other, less popular, measures of ES have been proposed by Hedges and Olkin (1985),
Kraemer and Andrews (1982), McGraw and Wong (1992), Vargha and Delaney (2000), Cliff
(1993, 1996) and Wilcox and Muska (1999) [see Hogarty & Kromrey (2001) for the definitions
of these procedures].
2. Hogarty and Kromrey (2001) suggested a robust statistic of ES similar to the one we present.
References
American Psychological Association. (2001). Publication manual of the American Psychological
Association (5th ed.). Washington, DC.
Bird, K. D. (2002). Confidence intervals for effect sizes in analysis of variance. Educational and
Psychological Measurement, 62, 197-226.
Cliff, N. (1993). Dominance statistics: Ordinal analyses to answer ordinal questions.
Psychological Bulletin, 114, 494-509.
Cliff, N. (1996). Answering ordinal questions with ordinal data using ordinal statistics.
Multivariate Behavioral Research, 31, 331-350.
Cohen, J. (1965). Some statistical issues in psychological research. In B.B. Wolman (Ed.),
Handbook of clinical psychology (pp. 95-121). New York: Academic Press.
Cumming, G., & Finch, S. (2001). A primer on the understanding, use, and calculation of confidence
intervals that are based on central and noncentral distributions. Educational and
Psychological Measurement, 61, 532-574.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., & Stahel, W. A. (1986). Robust statistics.
New York: Wiley.
Hays, W. L. (1963). Statistics. New York: Holt, Rinehart and Winston.
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related
estimators. Journal of Educational Statistics, 6, 107-128.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic
Press.
Hoaglin, D. C. (1983). Summarizing shape numerically: The g-and-h distributions. In D. C.
Hoaglin, F. Mosteller, & J. W. Tukey (Eds.), Data analysis for tables, trends, and
shapes: Robust and exploratory techniques. New York: Wiley.
Hogarty, K. Y., & Kromrey, J. D. (2001). We've been reporting some effect sizes: Can you guess
what they mean? Paper presented at the annual meeting of the American Educational
Research Association (April), Seattle.
Huber, P. J. (1981). Robust statistics. New York: Wiley.
Kraemer, H. C., & Andrews, G. A. (1982). A nonparametric technique for meta-analysis effect
size calculation. Psychological Bulletin, 91, 404-412.
McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological
Bulletin, 111, 361-365.
Murphy, K. R. (1997). Editorial. Journal of Applied Psychology, 82, 3-5.
SAS Institute Inc. (1999). SAS/IML user's guide, version 8. Cary, NC: Author.
Staudte, R. G., & Sheather, S. J. (1990). Robust estimation and testing. New York: Wiley.
Steiger, J. H., & Fouladi, R. T. (1997). Noncentrality interval estimation and the evaluation of
statistical models. In L. Harlow, S. Mulaik, & J. H. Steiger (Eds.), What if there were no
significance tests? Hillsdale, NJ: Erlbaum.
Thompson, B. (1994). Guidelines for authors. Educational and Psychological Measurement, 54,
837-847.
Vargha, A., & Delaney, H. D. (2000). A critique and improvement of the CL Common Language
effect size statistics of McGraw and Wong. Journal of Educational and Behavioral
Statistics, 25, 101-132.
Wilkinson, L., & the Task Force on Statistical Inference. (1999). Statistical methods in
psychology journals. American Psychologist, 54, 594-604.
Wilcox, R. R. (2001). Fundamentals of modern statistical methods. New York: Springer.
Wilcox, R. R. (2003). Applying contemporary statistical techniques. San Diego: Academic Press.
Wilcox, R. R., & Keselman, H. J. (2003). Modern robust data analysis methods: Measures of
central tendency. Psychological Methods, 8, 254-274.
Wilcox, R. R., & Muska, J. (1999). Measuring effect size: A non-parametric analogue of $\omega^2$.
British Journal of Mathematical and Statistical Psychology, 52, 93-110.
Yuen, K. K., & Dixon, W. J. (1973). The approximate behaviour and performance of the two-
sample trimmed t. Biometrika, 60, 369-374.
Table 1
Example Means and Standard Deviations

Group       $\bar{Y}_j$     $S_j$
1           28.5            4.6
2           33.8            6.2

Note. $n_1 = n_2 = 21$.
Table 2
Estimated Coverage Probabilities for Nominal 95% Noncentral t Distribution-Based Confidence
Intervals for $\delta$ and $\delta_R$

                            g = h = 0               g = 0, h = .225         g = .76, h = -.098      g = .225, h = .225
$\delta_R$  $n_1 = n_2$     $\delta$    $\delta_R$  $\delta$    $\delta_R$  $\delta$    $\delta_R$  $\delta$    $\delta_R$
0.00        20              0.949       0.949       0.955       0.952       0.951       0.951       0.952       0.953
            40              0.950       0.944       0.955       0.955       0.951       0.953       0.955       0.954
            60              0.949       0.951       0.947       0.950       0.945       0.950       0.955       0.948
            80              0.950       0.948       0.954       0.952       0.949       0.956       0.951       0.954
            100             0.946       0.948       0.955       0.956       0.948       0.945       0.949       0.951
0.20        20              0.949       0.949       0.947       0.951       0.950       0.953       0.954       0.951
            40              0.950       0.948       0.950       0.952       0.954       0.945       0.954       0.946
            60              0.946       0.947       0.950       0.951       0.954       0.953       0.949       0.951
            80              0.950       0.947       0.953       0.952       0.950       0.954       0.943       0.945
            100             0.947       0.945       0.952       0.950       0.956       0.952       0.942       0.954
0.50        20              0.949       0.948       0.939       0.947       0.944       0.949       0.930       0.947
            40              0.951       0.949       0.935       0.947       0.940       0.943       0.921       0.944
            60              0.951       0.951       0.927       0.945       0.941       0.948       0.915       0.945
            80              0.948       0.951       0.930       0.948       0.944       0.945       0.918       0.940
            100             0.949       0.945       0.925       0.945       0.938       0.946       0.907       0.949
0.80        20              0.948       0.944       0.910       0.940       0.931       0.937       0.899       0.946
            40              0.955       0.943       0.903       0.937       0.929       0.937       0.875       0.941
            60              0.961       0.955       0.895       0.939       0.923       0.933       0.872       0.939
            80              0.942       0.944       0.903       0.947       0.925       0.935       0.861       0.943
            100             0.954       0.951       0.892       0.943       0.925       0.939       0.854       0.933
1.10        20              0.950       0.948       0.889       0.937       0.904       0.925       0.853       0.936
            40              0.951       0.945       0.872       0.938       0.909       0.923       0.838       0.930
            60              0.950       0.945       0.849       0.932       0.912       0.926       0.812       0.928
            80              0.951       0.940       0.850       0.934       0.896       0.914       0.804       0.932
            100             0.948       0.945       0.851       0.938       0.904       0.926       0.785       0.939
1.40        20              0.954       0.946       0.857       0.926       0.887       0.915       0.822       0.925
            40              0.954       0.937       0.841       0.934       0.883       0.907       0.787       0.929
            60              0.945       0.945       0.822       0.930       0.879       0.914       0.761       0.928
            80              0.948       0.941       0.816       0.934       0.882       0.907       0.748       0.930
            100             0.948       0.945       0.805       0.927       0.880       0.915       0.736       0.923
Table 3
Estimated Coverage Probabilities for Nominal 95% Confidence Bootstrap CIs for $\delta$ and
$\delta_R$

                            g = h = 0               g = 0, h = .225         g = .76, h = -.098      g = .225, h = .225
$\delta_R$  $n_1 = n_2$     $\delta$    $\delta_R$  $\delta$    $\delta_R$  $\delta$    $\delta_R$  $\delta$    $\delta_R$
0.00        20              0.938       0.951       0.924       0.954       0.924       0.946       0.922       0.954
            40              0.940       0.945       0.938       0.953       0.936       0.950       0.938       0.954
            60              0.945       0.950       0.934       0.951       0.935       0.948       0.939       0.947
            80              0.945       0.948       0.943       0.950       0.938       0.955       0.937       0.951
            100             0.942       0.947       0.945       0.953       0.942       0.945       0.937       0.949
0.20        20              0.935       0.954       0.919       0.953       0.921       0.950       0.919       0.950
            40              0.940       0.947       0.936       0.952       0.939       0.950       0.936       0.950
            60              0.939       0.946       0.936       0.951       0.946       0.950       0.937       0.952
            80              0.945       0.946       0.944       0.952       0.944       0.949       0.933       0.946
            100             0.944       0.947       0.943       0.952       0.947       0.950       0.934       0.952
0.50        20              0.934       0.955       0.902       0.953       0.908       0.954       0.882       0.955
            40              0.941       0.951       0.922       0.952       0.925       0.950       0.902       0.952
            60              0.946       0.954       0.920       0.952       0.933       0.954       0.907       0.951
            80              0.944       0.951       0.922       0.951       0.937       0.953       0.919       0.948
            100             0.946       0.946       0.928       0.947       0.935       0.952       0.916       0.953
0.80        20              0.928       0.954       0.870       0.958       0.885       0.957       0.843       0.962
            40              0.944       0.953       0.888       0.952       0.913       0.955       0.863       0.957
            60              0.953       0.958       0.899       0.950       0.917       0.949       0.877       0.952
            80              0.938       0.9466      0.911       0.957       0.926       0.950       0.880       0.954
            100             0.950       0.955       0.907       0.951       0.929       0.953       0.875       0.948
1.10        20              0.926       0.961       0.837       0.965       0.860       0.962       0.803       0.966
            40              0.938       0.957       0.872       0.958       0.896       0.960       0.831       0.959
            60              0.939       0.954       0.873       0.956       0.914       0.957       0.835       0.950
            80              0.944       0.952       0.891       0.952       0.914       0.948       0.880       0.954
            100             0.944       0.955       0.894       0.958       0.923       0.957       0.847       0.959
1.40        20              0.918       0.966       0.813       0.972       0.840       0.968       0.765       0.969
            40              0.938       0.956       0.857       0.966       0.889       0.960       0.798       0.964
            60              0.936       0.957       0.858       0.963       0.897       0.964       0.814       0.964
            80              0.942       0.957       0.870       0.959       0.912       0.956       0.821       0.961
            100             0.942       0.954       0.875       0.957       0.914       0.959       0.828       0.958
Table 4
Average Width of Noncentral t Distribution-Based (NCT) and Percentile Bootstrap (BOOT) CIs
for $\delta_R$

                            g = h = 0               g = 0, h = .225         g = .76, h = -.098      g = .225, h = .225
$\delta_R$  $n_1 = n_2$     NCT         BOOT        NCT         BOOT        NCT         BOOT        NCT         BOOT
0.00        20              1.368       1.513       1.368       1.468       1.368       1.470       1.367       1.463
            40              0.952       0.995       0.952       0.979       0.952       0.973       0.952       0.976
            60              0.774       0.795       0.774       0.787       0.774       0.782       0.774       0.786
            80              0.668       0.682       0.668       0.676       0.668       0.673       0.668       0.675
            100             0.597       0.606       0.597       0.603       0.597       0.601       0.597       0.601
0.20        20              1.373       1.516       1.374       1.482       1.373       1.484       1.373       1.481
            40              0.956       0.999       0.956       0.986       0.956       0.984       0.956       0.984
            60              0.777       0.799       0.776       0.793       0.776       0.789       0.776       0.791
            80              0.671       0.685       0.671       0.681       0.671       0.680       0.671       0.681
            100             0.599       0.609       0.599       0.606       0.599       0.606       0.599       0.606
0.50        20              1.404       1.571       1.401       1.549       1.405       1.578       1.403       1.548
            40              0.975       1.028       0.974       1.023       0.975       1.041       0.975       1.027
            60              0.792       0.820       0.792       0.821       0.792       0.831       0.792       0.824
            80              0.684       0.703       0.684       0.705       0.684       0.712       0.684       0.707
            100             0.611       0.625       0.611       0.628       0.611       0.635       0.611       0.630
0.80        20              1.458       1.659       1.453       1.663       1.462       1.749       1.456       1.682
            40              1.010       1.083       1.009       1.096       1.010       1.126       1.010       1.102
            60              0.819       0.860       0.819       0.874       0.820       0.901       0.818       0.877
            80              0.708       0.736       0.707       0.748       0.708       0.773       0.707       0.754
            100             0.632       0.653       0.632       0.665       0.631       0.684       0.632       0.669
1.10        20              1.532       1.784       1.527       1.827       1.541       1.958       1.527       1.848
            40              1.058       1.153       1.057       1.186       1.060       1.250       1.057       1.196
            60              0.857       0.913       0.857       0.945       0.859       0.991       0.858       0.957
            80              0.740       0.782       0.740       0.808       0.741       0.852       0.740       0.816
            100             0.661       0.694       0.661       0.717       0.661       0.754       0.661       0.722
1.40        20              1.627       1.944       1.618       2.024       1.633       2.183       1.619       2.043
            40              1.120       1.239       1.116       1.301       1.122       1.400       1.116       1.317
            60              0.906       0.983       0.905       1.027       0.907       1.108       0.905       1.044
            80              0.781       0.840       0.781       0.881       0.783       0.940       0.781       0.893
            100             0.698       0.747       0.697       0.780       0.698       0.833       0.697       0.792
Table 5
Estimated Power for Non-directional Tests of $H_0\colon \delta_R = 0$

                            g = h = 0               g = 0, h = .225         g = .76, h = -.098      g = .225, h = .225
$\delta_R$  $n_1 = n_2$     NCT         BOOT        NCT         BOOT        NCT         BOOT        NCT         BOOT
0.20        20              0.090       0.089       0.088       0.086       0.083       0.091       0.088       0.084
            40              0.131       0.132       0.123       0.120       0.132       0.130       0.117       0.113
            60              0.174       0.174       0.164       0.163       0.170       0.169       0.161       0.160
            80              0.219       0.217       0.210       0.204       0.215       0.215       0.219       0.214
            100             0.260       0.263       0.265       0.258       0.267       0.259       0.257       0.250
0.50        20              0.290       0.302       0.286       0.272       0.315       0.312       0.300       0.277
            40              0.538       0.548       0.514       0.507       0.534       0.527       0.533       0.527
            60              0.705       0.714       0.702       0.697       0.710       0.697       0.713       0.701
            80              0.830       0.833       0.827       0.818       0.831       0.828       0.824       0.815
            100             0.907       0.906       0.897       0.891       0.898       0.896       0.898       0.892
0.80        20              0.627       0.654       0.590       0.570       0.620       0.612       0.603       0.577
            40              0.905       0.909       0.886       0.880       0.892       0.884       0.894       0.879
            60              0.984       0.987       0.976       0.974       0.977       0.975       0.974       0.970
            80              0.994       0.994       0.996       0.994       0.995       0.994       0.994       0.994
            100             1.000       1.000       1.000       1.000       0.998       0.999       1.000       1.000
1.10        20              0.873       0.891       0.849       0.834       0.862       0.855       0.853       0.825
            40              0.996       0.996       0.992       0.991       0.992       0.988       0.989       0.988
            60              1.000       1.000       0.999       0.999       0.999       0.999       0.999       0.999
            80              1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000
            100             1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000
1.40        20              0.979       0.985       0.960       0.950       0.962       0.957       0.959       0.947
            40              1.000       1.000       1.000       0.999       1.000       1.000       1.000       0.999
            60              1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000
            80              1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000
            100             1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000
[Figure 1 image: density curves labeled "Central t" and "Noncentral t with $\lambda = 4$", plotted against t.]
Figure 1. A central and a noncentral t distribution.
[Figure 2 image: density curves labeled "Noncentral t with $\lambda = 1.05$" and "Noncentral t with $\lambda = 5.21$", plotted against t, with t = 3.14 marked and two tail areas labeled .025.]
Figure 2. Graphical representation of finding a confidence interval for the noncentrality
parameter
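
The search depicted in Figure 2 can be sketched numerically. The Python fragment below is a rough illustration, not the software used in the study: it evaluates the noncentral t distribution function by Simpson's rule and finds, by bisection, the two noncentrality values that place t = 3.14 at the .975 and .025 points. The degrees of freedom (40, from the two groups of 21 in Table 1) and the conversion from a noncentrality parameter to an effect size are assumptions made for this illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf


def nct_cdf(t, df, nc, steps=2000):
    """P(T <= t) for a noncentral t variable (df > 2), by Simpson's rule.

    Uses T = (Z + nc) / sqrt(V / df), with Z standard normal and V chi-square(df),
    so that P(T <= t) = E[Phi(t * sqrt(V / df) - nc)].
    """
    upper = df + 12.0 * math.sqrt(2.0 * df)   # chi-square mass beyond this is negligible
    h = upper / steps
    log_norm = (df / 2.0) * math.log(2.0) + math.lgamma(df / 2.0)

    def integrand(v):
        if v <= 0.0:
            return 0.0                        # chi-square density is 0 at v = 0 when df > 2
        log_pdf = (df / 2.0 - 1.0) * math.log(v) - v / 2.0 - log_norm
        return _PHI(t * math.sqrt(v / df) - nc) * math.exp(log_pdf)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def solve_nc(t, df, target, lo=-20.0, hi=20.0, tol=1e-6):
    """Bisection for the nc with nct_cdf(t, df, nc) == target (CDF decreases in nc)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if nct_cdf(t, df, mid) > target:
            lo = mid                          # CDF still too large: nc must increase
        else:
            hi = mid
    return (lo + hi) / 2.0


def noncentrality_ci(t, df, alpha=0.05):
    """Lower and upper confidence limits for the noncentrality parameter."""
    return solve_nc(t, df, 1.0 - alpha / 2.0), solve_nc(t, df, alpha / 2.0)


n1 = n2 = 21                                  # assumed from the note to Table 1
lam_lo, lam_hi = noncentrality_ci(t=3.14, df=n1 + n2 - 2)
scale = math.sqrt((n1 + n2) / (n1 * n2))      # assumed conversion to an effect size
print(round(lam_lo, 2), round(lam_hi, 2))
print(round(lam_lo * scale, 2), round(lam_hi * scale, 2))
```

Under these assumptions the two limits should land close to the values 1.05 and 5.21 shown in Figure 2; multiplying them by `scale` gives the corresponding limits on the effect size scale.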