Motivational research is a type of marketing research that attempts to explain why consumers behave as they do.
The analysis begins at the cultural level. Cultural values and influences are the ocean in which we all swim and of which most of us are completely unaware.
The global leader in analytical research systems
604 Avenue H East
Arlington, TX 76011-3100, USA
(1) 817.640.6166 or 1.800. ANALYSIS
http://www.decisionanalyst.com
© 1998 Decision Analyst, Inc
shares, the role of advertising in the
category, and trends in the marketplace.
Only part of this business
environment knowledge can come
from the respondent, of course, but
understanding the business context
is crucial to the interpretation of
consumer motives in a way that
will lead to useful results. Understanding
the consumer’s motives
is worthless unless somehow that
knowledge can be translated into
actionable marketing and advertising
recommendations.
Sometimes a motivational study is
followed by quantitative surveys to
confirm the motivational hypotheses
as well as to measure the relative
extent of those motives in the
general population. But many times
motivational studies cannot be
proved or disproved by survey research,
especially when completely
unconscious motives are involved.
In these cases, the final evaluation
Studies often collect data on categorical variables that can be summarized as a series of counts. These counts are
commonly arranged in a tabular format known as a contingency table. For example, a study designed to determine
whether or not there is an association between cigarette smoking and asthma might collect data that could be
assembled into a 2×2 table. In this case, the two columns could be defined by whether the subject smoked or not,
while the rows could represent whether or not the subject experienced symptoms of asthma. The cells of the table
would contain the number of observations or patients as defined by these two variables.
The chi-square test statistic can be used to evaluate whether there is an association between the rows and columns
in a contingency table. More specifically, this statistic can be used to determine whether there is any difference
between the study groups in the proportions of the risk factor of interest. Returning to our example, the chi-square
statistic could be used to test whether the proportion of individuals who smoke differs by asthmatic status.
The chi-square test statistic is designed to test the null hypothesis that there is no association between the rows
and columns of a contingency table. This statistic is calculated by first obtaining, for each cell in the table, the
expected number of events that will occur if the null hypothesis is true. When the observed number of events deviates
significantly from the expected counts, then it is unlikely that the null hypothesis is true, and it is likely that there is a
row-column association. Conversely, a small chi-square value indicates that the observed values are similar to the expected
values, leading us to conclude that the null hypothesis is plausible. The general formula used to calculate the
chi-square (χ²) test statistic is as follows:
$$\chi^2 = \sum \frac{(O - E)^2}{E}, \qquad df = (r - 1)(c - 1) \qquad \text{(Equation 1)}$$
where O = observed count in category; E = expected count in the category under the null hypothesis; df = degrees of
freedom; and c, r represent the number of columns and rows in the contingency table.
[Table 1: notation for the observed counts in a 2×2 contingency table]
The value of the chi-square statistic cannot be negative and can assume values from zero to infinity. The p-value
for this test statistic is based on the chi-square probability distribution and is generally extracted from published
tables or estimated using computer software programs. The p-value represents the probability that the chi-square
test statistic is as extreme as or more extreme than observed if the null hypothesis were true. As with
the t and F distributions, there is a different chi-square distribution for each possible value of degrees of freedom.
Chi-square distributions with a small number of degrees of freedom are highly skewed; however, this skewness is
attenuated as the number of degrees of freedom increases. In general, the degrees of freedom for tests of
hypothesis that involve an r×c contingency table is equal to (r−1)×(c−1); thus for any 2×2 table, the degrees of
freedom is equal to one. A chi-square distribution with one degree of freedom is the distribution of the square of a
standard normal variable, and, consequently, either the chi-square or standard normal table can be used to determine
the corresponding p-value.
[Table 2: formula for the expected cell counts under the null hypothesis]
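The link between the one-degree-of-freedom chi-square distribution and the standard normal distribution can be checked numerically. The sketch below is illustrative only: the function names are made up for this example, and it relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math

def normal_two_sided_tail(z):
    """P(|Z| > z) for a standard normal Z, via the complementary error function."""
    return math.erfc(z / math.sqrt(2))

def chi_square_df1_tail(x):
    """P(chi-square with 1 df > x), using the fact that the variable is Z squared."""
    return normal_two_sided_tail(math.sqrt(x))

# The familiar normal cutoff 1.96 corresponds to the chi-square cutoff 1.96 ** 2.
print(chi_square_df1_tail(1.96 ** 2), normal_two_sided_tail(1.96))
```

Both printed values should agree, which is why either table gives the same p-value for a 2×2 table.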
The chi-square test is most widely used to conduct tests of hypothesis that involve data that can be presented in a
2×2 table. Indeed, this tabular format is a feature of the case-control study design that is commonly used in public
health research. Within this contingency table, we could denote the observed counts as shown in Table 1. Under
the null hypothesis of no association between the two variables, the expected number in each cell under the null
hypothesis is calculated from the observed values using the formula outlined in Table 2.
The use of the chi-square test can be illustrated by using hypothetical data from a study investigating the
association between smoking and asthma among adults observed in a community health clinic. The results
obtained from classifying 150 individuals are shown in Table 3. As Table 3 shows, among asthmatics the proportion
of smokers was 40 percent (20/50), while the corresponding proportion among asymptomatic individuals was 22
percent (22/100). By applying the formula presented in Table 2, for the observed cell counts of 20, 30, 22, and 78
(Table 3) the corresponding expected counts are 14, 36, 28, and 72. The observed and expected counts can then
be used to calculate the chi-square test statistic as outlined in Equation 1. The resulting value of the chi-square
test statistic is approximately 5.36, and the associated p-value for this chi-square distribution that has one degree
of freedom is 0.02. Therefore, if there were truly no association between smoking and asthma, there would be a 2 out of
100 probability of observing a difference in proportions that is at least as large as 18 percent (40%–22%) by
chance alone. We would therefore conclude that the observed difference in the proportions is unlikely to be
explained by chance alone, and consider this result statistically significant.
Table 3. Hypothetical data on smoking and asthma among 150 adults
              Smoker   Non-smoker   Total
Asthma            20           30      50
No asthma         22           78     100
Total             42          108     150
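The worked example can be reproduced in a few lines. This is a minimal sketch, not a general-purpose routine: the function name is hypothetical, the table layout (rows = asthma status, columns = smoking status) is an assumption taken from the description above, and the p-value shortcut is valid only for one degree of freedom.

```python
import math

def chi_square_2x2(table):
    """Return (statistic, expected counts, p-value) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # With one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, expected, p_value

observed = [[20, 30], [22, 78]]  # asthma: smoker, non-smoker; no asthma: smoker, non-smoker
stat, expected, p_value = chi_square_2x2(observed)
print(expected, round(stat, 2), round(p_value, 2))
```

The expected counts come out as 14, 36, 28, and 72, matching the figures quoted in the text.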
Because the construction of the chi-square test makes use of discrete data to estimate a continuous distribution,
some authors will apply a continuity correction when calculating this statistic. Specifically,
$$\chi^2 = \sum_i \frac{(|O_i - E_i| - 0.5)^2}{E_i}$$
where |Oi−Ei| is the absolute value of the difference between Oi and Ei, and the term 0.5 in the numerator is often
referred to as Yates' correction factor. This correction factor serves to reduce the chi-square value and, therefore,
increases the resulting p-value. It has been suggested that this correction yields an overly conservative test that may
fail to reject a false null hypothesis. However, as long as the sample size is large, the effect of the correction factor is
negligible.
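To see the size of the continuity correction on the smoking and asthma example, the sketch below applies it to the observed and expected counts quoted earlier. The function name is hypothetical, and the p-value shortcut again assumes one degree of freedom.

```python
import math

def yates_chi_square(observed, expected):
    """Chi-square statistic with 0.5 subtracted from each absolute deviation."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

observed = [20, 30, 22, 78]
expected = [14, 36, 28, 72]

uncorrected = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
corrected = yates_chi_square(observed, expected)
p_corrected = math.erfc(math.sqrt(corrected / 2))  # valid for 1 degree of freedom only

print(round(uncorrected, 2), round(corrected, 2), round(p_corrected, 3))
```

The corrected statistic is smaller than the uncorrected one, so the p-value is larger, as the text describes.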
When there is a small number of counts in the table, the use of the chi-square test statistic may not be
appropriate. Specifically, it has been recommended that this test not be used if any cell in the table has an
expected count of less than one, or if more than 20 percent of the cells have an expected count of less than five.
Under this scenario, Fisher's exact test is recommended for conducting tests of hypothesis.
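A small-count check of this kind is easy to automate. The sketch below assumes the commonly cited form of the rule (no expected count below one, and no more than 20 percent of cells with an expected count below five); the function name is hypothetical.

```python
def chi_square_is_appropriate(expected_counts):
    """Apply the small-count rule of thumb to a flat list of expected cell counts."""
    cells = list(expected_counts)
    if any(e < 1 for e in cells):
        return False
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return share_below_five <= 0.20

print(chi_square_is_appropriate([14, 36, 28, 72]))  # expected counts from the asthma example
print(chi_square_is_appropriate([3, 4, 10, 10]))    # half of the cells fall below five
```

Note that in a 2×2 table a single cell below five already makes up 25 percent of the cells, so under this reading any such table would call for an exact test.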
Bibliography
Grizzle, J. E. (1967). "Continuity Correction in the X2 Test for 2×2 Tables." The American Statistician 21:28–32.
Pagano, M., and Gauvreau, K. (2000). Principles of Biostatistics, 2nd edition. Pacific Grove, CA: Duxbury Press.
— PAUL J. VILLENEUVE
Wikipedia:
Chi-square test
distribution of the test statistic is a chi-square distribution when the null hypothesis is true, or any in which this
is asymptotically true, meaning that the sampling distribution (if the null hypothesis is true) can be made to
approximate a chi-square distribution as closely as desired by making the sample size large enough.
Some examples of chi-squared tests where the chi-square distribution is only approximately valid:
Pearson's chi-square test, also known as the chi-square goodness-of-fit test or chi-square test for
independence. When mentioned without any modifiers or without other precluding context, this test is usually
understood (for an exact test used in place of χ2, see Fisher's exact test).
Likelihood-ratio tests in general statistical modelling, for testing whether there is evidence of the need to
move from a simple model to a more complicated one (where the simple model is nested within the
complicated one).
One case where the distribution of the test statistic is an exact chi-square distribution is the test that the variance
of a normally-distributed population has a given value based on a sample variance. Such a test is uncommon in
practice because values of variances to test against are seldom known exactly.
If a sample of size n is taken from a population having a normal distribution, then there is a well-known result
(see distribution of the sample variance) which allows a test to be made of whether the variance of the population
has a pre-determined value. For example, a manufacturing process might have been in stable condition for a long
period, allowing a value for the variance to be determined essentially without error. Suppose that a variant of the
process is being tested, giving rise to a small sample of product items whose variation is to be tested. The test
statistic T in this instance could be set to be the sum of squares about the sample mean, divided by the nominal
value for the variance (i.e. the value to be tested as holding). Then T has a chi-square distribution with n–1
degrees of freedom. For example if the sample size is 21, the acceptance region for T for a significance level of 5%
See also
Statistics portal
G-test
T Test
A statistical hypothesis test is a method of making decisions using experimental data. In statistics, a result is
called statistically significant if it is unlikely to have occurred by chance. The phrase "test of significance" was coined
by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may
discover whether a second sample is or is not significantly different from the first."[1]
Hypothesis testing is sometimes called confirmatory data analysis, in contrast to exploratory data analysis. In frequency
probability, these decisions are almost always made using null-hypothesis tests (i.e., tests that answer the
question Assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at
least as extreme as the value that was actually observed?)[2] One use of hypothesis testing is deciding whether experimental
Statistical hypothesis testing is a key technique of frequentist statistical inference, and is widely used, but also much
criticized. While controversial,[3] the Bayesian approach to hypothesis testing is to base rejection of the hypothesis
on the posterior probability.[4] Other approaches to reaching a decision based on data are available via decision
theory and optimal decisions.
The critical region of a hypothesis test is the set of all outcomes which, if they occur, will lead us to decide that there is a
difference; that is, they cause the null hypothesis to be rejected in favor of the alternative hypothesis. The critical region is
usually denoted by C.
Examples
A statistical test procedure is comparable to a trial; a defendant is considered innocent as long as his guilt is not proven. The
prosecutor tries to prove the guilt of the defendant. Only when there is enough incriminating evidence is the defendant
convicted.
At the start of the procedure, there are two hypotheses, H0: "the defendant is innocent", and H1: "the defendant is guilty". The
first one is called null hypothesis, and is for the time being accepted. The second one is called alternative (hypothesis). It is
The hypothesis of innocence is only rejected when an error is very unlikely, because one doesn't want to condemn an
innocent defendant. Such an error is called error of the first kind (i.e. the condemnation of an innocent person), and the
occurrence of this error is controlled to be seldom. As a consequence of this asymmetric behaviour, the error of the second
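                          H0 is true (innocent)            H1 is true (guilty)
Accept Null Hypothesis    Right decision                   Wrong decision (Type II Error)
Reject Null Hypothesis    Wrong decision (Type I Error)    Right decision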
A person (the subject) is tested for clairvoyance. He is shown the reverse of a randomly chosen playing card 25 times and
asked which suit it belongs to. The number of hits, or correct answers, is called X.
As we try to find evidence of his clairvoyance, for the time being the null hypothesis is that the person is not clairvoyant. The
If the null hypothesis is valid, the only thing the test person can do is guess. For every card, the probability (relative
frequency) of guessing correctly is 1/4. If the alternative is valid, the test subject will predict the suit correctly with probability
greater than 1/4. We will call the probability of guessing correctly p. The hypotheses, then, are:
H0: p = 1/4 (null hypothesis, just guessing)
and
H1: p > 1/4 (alternative hypothesis)
When the test subject correctly predicts all 25 cards, we will consider him clairvoyant, and reject the null hypothesis. The same
holds for 24 or 23 hits. With only 5 or 6 hits, on the other hand, there is no cause to consider him so. But what about 12 hits,
or 17 hits? What is the critical number, c, of hits, at which point we consider the subject to be clairvoyant? How do we
determine the critical value c? It is obvious that with the choice c=25 (i.e. we only accept clairvoyance when all cards are
predicted correctly) we're more critical than with c=10. In the first case almost no test subjects will be recognized to be
clairvoyant; in the second case, somewhat more will pass the test. In practice, one decides how critical one will be. That
is, one decides how often one accepts an error of the first kind - a false positive, or Type I error. With c=25 the probability of
such an error is
$$P(X = 25 \mid p = 1/4) = (1/4)^{25} \approx 10^{-15}$$
and hence very small. The probability of a false positive is the probability of randomly guessing correctly all 25 times.
Before the test is actually performed, the desired probability of a Type I error is determined. Typically, values in
the range of 1% to 5% are selected. Depending on this desired Type I error rate, the critical value c is
From all the numbers c, with this property, we choose the smallest, in order to minimize the probability of
a Type II error, a false negative. For the above example, we select: c = 12.
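The search for the critical value can be sketched directly from the binomial distribution. The error rate used for the example is not visible in this text, so the 1% bound below is an assumption, and the function names are hypothetical. Under a strict 1% bound this computation gives 13 as the smallest qualifying c, because the chance of 12 or more hits by pure guessing is about 1.07%, slightly above the bound.

```python
from math import comb

def upper_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))

def smallest_critical_value(n, p, alpha):
    """Smallest c such that P(X >= c) <= alpha when guessing succeeds with probability p."""
    for c in range(n + 2):  # c = n + 1 always qualifies, since the tail is then empty
        if upper_tail(n, p, c) <= alpha:
            return c

# 25 cards, chance 1/4 of guessing a suit, assumed Type I error bound of 1%.
print(smallest_critical_value(25, 0.25, 0.01))
print(round(upper_tail(25, 0.25, 12), 4), round(upper_tail(25, 0.25, 13), 4))
```

Choosing the smallest qualifying c keeps the rejection region as large as the error bound allows, which is what minimizes the Type II error.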
But what if the subject did not guess any cards correctly at all? Having zero correct answers is clearly an oddity
too. The probability of guessing incorrectly once is equal to p'=(1-p)=3/4. Using the same approach we
can calculate that the probability of randomly calling all 25 cards wrong is:
$$P(X = 0 \mid H_0 \text{ is valid}) = (3/4)^{25} \approx 0.00075$$
This is highly unlikely (less than 1 in 1,000 chance). While the subject can't guess the cards
correctly, dismissing H0 in favour of H1 would be an error. In fact, the result would suggest a trait on
the subject's part of avoiding calling the correct card. A test of this could be formulated: for a
selected 1% error rate the subject would have to answer correctly at least twice, for us to believe
Placed under a Geiger counter, it produces 10 counts per minute. The null hypothesis is that no
radioactive material is in the suitcase and that all measured counts are due to ambient radioactivity
typical of the surrounding air and harmless objects. We can then calculate how likely it is that we
would observe 10 counts per minute if the null hypothesis were true. If the null hypothesis predicts
(say) on average 9 counts per minute and a standard deviation of 1 count per minute, then we say
that the suitcase is compatible with the null hypothesis (this does not guarantee that there is no
radioactive material, just that we don't have enough evidence to suggest there is). On the other
hand, if the null hypothesis predicts 3 counts per minute and a standard deviation of 1 count per
minute, then the suitcase is not compatible with the null hypothesis, and there are likely other
The test described here is more fully the null-hypothesis statistical significance test. The null
hypothesis represents what we would believe by default, before seeing any evidence. Statistical
significance is a possible finding of the test, declared when the observed sample is unlikely to have
occurred by chance if the null hypothesis were true. The name of the test describes its formulation
and its possible outcome. One characteristic of the test is its crisp decision: to reject or not reject
the null hypothesis. A calculated value is compared to a threshold, which is determined from the
Again, the designer of a statistical test wants to maximize the good probabilities and minimize the
bad probabilities.
The following example is summarized from Fisher, and is known as the Lady tasting tea example.[5]
Fisher thoroughly explained his method in a proposed experiment to test a Lady's claimed ability
to determine the means of tea preparation by taste. The article is less than 10 pages in length and
is notable for its simplicity and completeness regarding terminology, calculations and design of the
experiment. The example is loosely based on an event in Fisher's life. The Lady proved him wrong.[6]
1. The null hypothesis was that the Lady had no such ability.
2. The test statistic was a simple count of the number of successes in 8 trials.
3. The distribution associated with the null hypothesis was the binomial distribution familiar
4. The critical region was the single case of 8 successes in 8 trials based on a conventional
If and only if the 8 trials produced 8 successes was Fisher willing to reject the null hypothesis –
effectively acknowledging the Lady's ability with > 98% confidence (but without quantifying her
ability). Fisher later discussed the benefits of more trials and repeated tests.
1. The first step in any hypothesis testing is to state the relevant null and alternative
2. The second step is to consider the statistical assumptions being made about the sample
in doing the test; for example, assumptions about the statistical independence or about
the form of the distributions of the observations. This is equally important as invalid
assumptions will mean that the results of the test are invalid.
4. Derive the distribution of the test statistic under the null hypothesis from the
assumptions. In standard cases this will be a well-known result. For example the test
5. The distribution of the test statistic partitions the possible values of T into those for which
the null hypothesis is rejected, the so-called critical region, and those for which it is not.
6. Compute from the observations the observed value tobs of the test statistic T.
The decision rule is to reject the null hypothesis H0 if the observed value tobs is in the
It is important to note the philosophical difference between accepting the null hypothesis and
simply failing to reject it. The "fail to reject" terminology highlights the fact that the null hypothesis is
assumed to be true from the start of the test; if there is a lack of evidence against it, it simply
continues to be assumed true. The phrase "accept the null hypothesis" may suggest it has been
proved simply because it has not been disproved, a logical fallacy known as the argument from
ignorance. Unless a test with particularly high power is used, the idea of "accepting" the null
hypothesis may be dangerous. Nonetheless the terminology is prevalent throughout statistics,
Definition of terms
The following definitions are mainly based on the exposition in the book by Lehmann and Romano:[7]
Simple hypothesis
The set of values for which we fail to reject the null hypothesis.
Region of rejection / Critical region
The set of values of the test statistic for which the null hypothesis is rejected.
Power of a test (1 − β)
The test's probability of correctly rejecting the null hypothesis. The complement of the false negative rate, β.
Size / Significance level of a test (α)
For simple hypotheses, this is the test's probability of incorrectly rejecting the null hypothesis. The false
positive rate. For composite hypotheses this is the upper bound of the probability of rejecting the null hypothesis
A test with the greatest power for all values of the parameter being tested.
Consistent test
When considering the properties of a test as the sample size grows, a test is said to be consistent if, for a fixed
size of test, the power against any fixed alternative approaches 1 in the limit.[8]
Unbiased test
For a specific alternative hypothesis, a test is said to be unbiased when the probability of rejecting the null
hypothesis is not less than the significance level when the alternative is true and is less than or equal to the
A test is conservative if, when constructed for a given nominal significance level, the true probability
of incorrectly rejecting the null hypothesis is never greater than the nominal level.
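The definitions of size and power can be made concrete with the card-guessing test from the examples. This is a sketch under stated assumptions: the critical value c = 12 is the one quoted in that example, while the alternative success probability of 0.5 is a hypothetical value chosen purely for illustration.

```python
from math import comb

def rejection_probability(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p): the chance that the test rejects H0."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))

n, c = 25, 12

size = rejection_probability(n, 0.25, c)   # alpha: rejecting although H0 (p = 1/4) is true
power = rejection_probability(n, 0.50, c)  # 1 - beta: rejecting when p is really 0.5 (assumed)
beta = 1 - power                           # false negative rate against that alternative

print(round(size, 4), round(power, 3), round(beta, 3))
```

Raising c lowers the size but also lowers the power, which is the trade-off between the two kinds of error described above.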
Steps in Hypothesis Testing (1 of 5)
The basic logic of hypothesis testing has been presented somewhat informally in the
sections on "Ruling out chance as an explanation" and the "Null hypothesis." In this
section the logic will be presented in more detail and more formally.
2. The next step is to select a significance level. Typically the 0.05 or the 0.01
level is used.
Subjects in the drug group scored significantly higher (M = 23) than did
subjects in the control group (M = 17), t(18) = 2.4, p = 0.027.
The statement that "t(18) =2.4" has to do with how the probability value (p)
was calculated. A small minority of researchers might object to two aspects of
this wording. First, some believe that the significance level rather than the
probability level should be reported. The argument for reporting the probability
value is presented in another section. Second, since the alternative hypothesis
was stated as µ1 ≠ µ2, some might argue that it can only be concluded that the
population means differ and not that the population mean for the drug group is
higher than the population mean for the control group.
This argument is misguided. Intuitively, there are strong reasons for inferring that
the direction of the difference in the population is the same as the difference in the
sample. There is also a more formal argument. A non significant effect might be
described as follows:
Although subjects in the drug group scored higher (M = 23) than did subjects in the
control group, (M = 20), the difference between means was not significant, t(18) =
1.4, p = 0.179.
It would not have been correct to say that there was no difference between
the performance of the two groups. There was a difference. It is just that
the difference was not large enough to rule out chance as an explanation of
the difference. It would also have been incorrect to imply that there is no
difference in the population. Be sure not to accept the null hypothesis.
Steps in Hypothesis Testing (5 of 5)
At this point you may wish to see a concrete example of using these seven
steps in hypothesis testing. If so, jump to the section on "Tests of μ, σ
known."