Variance Heterogeneity and Non-Normality

• Heterogeneity of variance can be classified into two types:
1. Where the variance is functionally related to the mean
2. Where there is no functional relationship between the variance and the mean
Variance is functionally related to the mean

• Usually associated with data whose distribution is not normal
• Example I: count data, such as the number of infested plants per plot or the number of galls per leaf
  Such data usually follow the Poisson distribution, in which the variance is equal to the mean: s² = x̄
• Example II: binomial data, such as percent survival of insects or percent plants infected with a disease (e.g., alive or dead, infested or not)
  The variance and the mean are related as s² = x̄(1 − x̄)
There is no functional relationship between the variance and the mean

• Usually occurs in experiments where, due to the nature of the treatments tested, some treatments have errors that are substantially higher (or lower) than others.
• Example: the variance of the F2 generation can be expected to be higher than that of the F1 generation, because genetic variability in F2 is much higher than in F1.
Handling Variance Heterogeneity

There are two remedial measures for handling variance heterogeneity:
1. The method of data transformation, for variances that are functionally related to the mean
2. The method of error partitioning, for variances that are not functionally related to the mean
Handling Variance Heterogeneity

Procedure for detecting the presence of variance heterogeneity and for diagnosing its type:

1. For each treatment, compute the variance and the mean across replications
2. Plot a scatter diagram of the variance against the mean
3. Visually examine the scatter diagram to identify the pattern of the relationship (Fig. 7.2)
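Step 1 of this procedure can be sketched in a few lines of Python. The counts below are hypothetical illustration values, not data from the book, and the names `plots` and `mean_variance_table` are made up for this sketch. The printed pairs are the points that would go on the scatter diagram in step 2.

```python
from statistics import mean, variance

# Hypothetical counts (NOT from the book): treatment -> values across replications.
plots = {
    "T1": [2, 4, 3, 5],
    "T2": [12, 18, 9, 21],
    "T3": [55, 80, 40, 95],
}

def mean_variance_table(data):
    """Return {treatment: (mean, sample variance)} computed across replications."""
    return {name: (mean(values), variance(values)) for name, values in data.items()}

table = mean_variance_table(plots)
for name, (m, s2) in table.items():
    # Each (mean, variance) pair is one point on the scatter diagram.
    print(f"{name}: mean = {m:.2f}, variance = {s2:.2f}")
```

If the variance rises steadily with the mean, as it does in these made-up numbers, the heterogeneity is of the first type and a transformation is the candidate remedy.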
Data Transformation
• Most appropriate for variance heterogeneity where the variance and the mean are functionally related
• The appropriate data transformation depends on the specific type of relationship between the variance and the mean.
Data Transformation
• The three most commonly used transformations are:
1. Logarithmic transformation
2. Square-root transformation
3. Arc sine transformation
Logarithmic Transformation
• Most appropriate for data where the standard deviation is proportional to the mean or where the effects are multiplicative.
• Generally found in data that are whole numbers and cover a wide range of values.
• Examples: number of insects per plot or number of egg masses per plant (per unit area)
Logarithmic Transformation
• If the data set involves small values (e.g., less than 10), log(x + 1) should be used instead of log x, where x is the original data
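A minimal sketch of this rule, assuming base-10 logarithms (the slides do not state the base) and assuming that a single value below 10 is enough to switch to log(x + 1). The function name `log_transform` and the threshold parameter are hypothetical.

```python
import math

def log_transform(values, small_threshold=10):
    """Apply log10(x + 1) if any value is below the threshold, else log10(x).

    Assumptions: base 10, and "involves small values" read as "at least one
    value below small_threshold".
    """
    use_plus_one = any(x < small_threshold for x in values)
    shift = 1 if use_plus_one else 0
    return [math.log10(x + shift) for x in values]

# Hypothetical counts; the zero would be undefined under plain log x.
transformed = log_transform([0, 9, 99, 999])
print(transformed)
```

The same choice (plain or shifted) is applied to the whole data set, so that all observations stay on one scale.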
The Procedure for Applying the Logarithmic Transformation
STEP 1. Verify the functional relationship between the mean and the variance using the scatter diagram.
STEP 2. Because some of the values in Table 7.14 are less than 10, log(x + 1) is applied instead of log x.
STEP 3. Verify the success of the logarithmic transformation in achieving the desired homogeneity of variance, by applying step 1 to the transformed data in Table 7.15.
STEP 4. Construct the analysis of variance, in the usual manner, on the transformed data in Table 7.15.
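Steps 1 to 3 can be sketched as follows. The data are hypothetical stand-ins, not the contents of Tables 7.14 and 7.15, and the largest-to-smallest variance ratio is only a convenient numeric summary chosen for this sketch; the procedure itself relies on inspecting the scatter diagram. Base-10 logarithms are assumed.

```python
import math
from statistics import variance

# Hypothetical counts (NOT Table 7.14): treatment -> values across replications.
raw = {
    "T1": [2, 4, 3, 5],
    "T2": [12, 18, 9, 21],
    "T3": [55, 80, 40, 95],
}

def variance_ratio(data):
    """Largest treatment variance divided by the smallest (illustrative summary)."""
    variances = [variance(values) for values in data.values()]
    return max(variances) / min(variances)

# Step 2: some values are below 10, so log(x + 1) is used throughout.
logged = {name: [math.log10(x + 1) for x in values] for name, values in raw.items()}

# Step 3: repeat the step 1 check on the transformed data.
ratio_before = variance_ratio(raw)
ratio_after = variance_ratio(logged)
print(f"variance ratio before: {ratio_before:.1f}, after: {ratio_after:.1f}")
```

A ratio much closer to 1 after the transformation suggests the variances have become more homogeneous, after which the analysis of variance in step 4 is run on the transformed values.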
Square-Root Transformation
• Appropriate for data consisting of small whole numbers, for example, data obtained in counting rare events, such as:
  • the number of infested plants in a plot,
  • the number of insects caught in traps, or
  • the number of weeds per plot.
• Also appropriate for percentage data where the range is between 0 and 30% or between 70 and 100%.
• If most of the values in the data set are small (e.g., less than 10), especially with zeroes present, (x + 0.5)^(1/2) should be used instead of x^(1/2)
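A small sketch of the square-root rule, assuming that "most of the values" means more than half of them. The function name `sqrt_transform` and its threshold parameter are hypothetical.

```python
import math

def sqrt_transform(values, small_threshold=10):
    """Apply sqrt(x + 0.5) when most values are small, else sqrt(x).

    Assumption: "most" is read as more than half of the values being
    below small_threshold.
    """
    n_small = sum(1 for x in values if x < small_threshold)
    offset = 0.5 if n_small > len(values) / 2 else 0.0
    return [math.sqrt(x + offset) for x in values]

# Hypothetical percentages, mostly below 10 and including a zero.
roots = sqrt_transform([0, 3.5, 8.5, 24.5])
print(roots)
```

The 0.5 offset keeps zero observations from all collapsing onto the same transformed value of exactly zero while barely changing the larger values.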
Square-Root Transformation
The range of the data is from 0 to 26.39%. Many values are less than 10, so the data are transformed into (x + 0.5)^(1/2).
Arc Sine Transformation

• Appropriate for data on proportions, data obtained from a count, and data expressed as decimal fractions or percentages.
• Note: percentages that are not derived from count data, such as percentage protein or carbohydrates, are not included.
• The transformation is carried out using a table of the arc sine transformation.
• The value of 0% should be substituted by 1/(4n) and the value of 100% by 100 − 1/(4n), where n is the number of units on which the percentage is based.
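Where a printed table is not at hand, the transformation can be computed directly. The sketch below assumes the usual definition of the arc sine (angular) transformation, the arc sine of the square root of the proportion, expressed in degrees, and it assumes `n` is the number of units each percentage is based on. The function name `arcsine_transform` is hypothetical.

```python
import math

def arcsine_transform(percent, n):
    """Arc sine transformation of a percentage, returned in degrees.

    Assumptions: the transformation is asin(sqrt(p / 100)), and n is the
    number of units on which the percentage is based.
    0% is replaced by 1/(4n) and 100% by 100 - 1/(4n) before transforming.
    """
    if percent == 0:
        percent = 1 / (4 * n)
    elif percent == 100:
        percent = 100 - 1 / (4 * n)
    return math.degrees(math.asin(math.sqrt(percent / 100)))

# Hypothetical percentages, each assumed to be based on n = 75 units.
angles = [arcsine_transform(p, 75) for p in (0, 25, 50, 100)]
print([round(a, 2) for a in angles])
```

The substitution keeps 0% and 100% slightly inside the ends of the scale, so they map to angles just above 0 and just below 90 degrees.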
Arc Sine Transformation
• Not all percentage data need to be transformed. The rules for choosing the proper transformation are as follows:
• RULE 1. For percentage data lying within the range of 30 to 70%, no transformation is needed.
• RULE 2. For percentage data lying within the range of either 0 to 30% or 70 to 100%, but not both, the square-root transformation should be used.
• RULE 3. For percentage data that do not follow the ranges specified in either rule 1 or rule 2, the arc sine transformation should be used.
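The three rules amount to a check on the minimum and maximum of the percentage data. The sketch below is one possible reading: the function name `choose_transformation` is hypothetical, and treating the boundary values 30 and 70 as belonging to rule 1 is an assumption, since the rules do not say which range owns the endpoints.

```python
def choose_transformation(percentages):
    """Pick a transformation for percentage data from its range.

    Assumption: the boundary values 30 and 70 are counted under rule 1.
    """
    low, high = min(percentages), max(percentages)
    if low >= 30 and high <= 70:
        return "none"          # Rule 1: everything within 30 to 70%
    if high <= 30 or low >= 70:
        return "square root"   # Rule 2: all within 0-30% or all within 70-100%
    return "arc sine"          # Rule 3: any other spread

print(choose_transformation([0, 12.5, 26.39]))  # all within 0 to 30%
print(choose_transformation([0, 45, 100]))      # spans the whole scale
```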
Illustration of Arc Sine Transformation

Original data
The arc sine transformation should be used because the percentage data range from 0 to 100%. All zero values are replaced by 1/[4(75)] and all 100 values by 100 − 1/[4(75)].
Illustration of Arc Sine Transformation

Transformed data
Result of the arc sine transformation applied to the original data in Table 7.23
Statistical Procedures for Agricultural Research

Kwanchai A. Gomez and Arturo A. Gomez
