
Welcome to Powerpoint slides for

Chapter 9

ANOVA and the Design of Experiments

Marketing Research Text and Cases by Rajendra Nargundkar

Slide 1

Introduction and Applications

1. Surveys are the most popular research method used in Marketing Research.
2. The other widely used class of study is known as experimentation. Just as in a laboratory, we manipulate certain variables (usually marketing-related ones in Marketing Research) and observe changes in other variables (for example sales, or consumer preferences, behaviour or attitudes).
3. The application areas for experiments are wide. Whenever a marketing mix variable (independent variable) such as price, a specific promotion or type of distribution, or even a specific element such as shelf space or colour of packaging is changed, we would want to know its effect.
4. Under proper conditions, an experiment can tell us the effects of specific variations in one or more elements of the marketing mix.
5. An experiment can be done with only one independent variable (factor) or with multiple independent variables.

Slide 2

Methods
1. An experiment with one independent variable is analysed using one-way ANOVA. ANOVA stands for Analysis of Variance, the generic name given to a set of techniques for studying the cause-and-effect relationship of one or more factors with a single dependent variable.
2. If we hypothesise that there is also a Blocking Variable (to be explained later under the Randomised Block Design) in addition to the one independent variable, we can use a randomised block design.
3. When more than one factor (independent variable) is studied, the experiment is known as a factorial experiment. This design also allows us to study possible interaction effects among the independent variables. We will explore this further when we discuss factorial experiments.
4. When more than one dependent variable is studied, the technique called MANOVA, or Multivariate Analysis of Variance, is used. However, we will limit ourselves to a discussion of three major types of ANOVA.

Slide 3

Variables

The Analysis of Variance technique is used when the independent variables are of nominal scale (categorical) and the dependent variable is metric (continuous).

Design

The design of the experiment is the most critical step in performing any experiment to be analysed through the technique of ANOVA. There are four major types of designs, of which three frequently used types will be illustrated with a worked-out example each. These four major types are:

1. Completely Randomised Design in a One-Way ANOVA (Single Factor)
2. Randomised Block Design (Single Blocking Factor)
3. Latin Square Design (Two Blocking Factors)
4. Factorial Design with 2 or more Factors

We will discuss in detail the first two, and the fourth.

Slide 4

One-Way ANOVA

This particular design is used when there is only one categorical independent variable and one dependent (metric) variable. Each category of an independent variable is called a level. The independent variable may be different levels of prices, or different pack sizes, or different product colours, and the effect (dependent variable) could be sales, preferences or attitudes towards the brand. In the example that follows, we will look at advertising copy alternatives as the independent variable, and preference rating for the advertising copy as the dependent variable.

Worked Example

Problem: In this example, we assume that three different versions of advertising copy have been created by an advertising agency for a campaign. Let us call these versions of copy ADCOPY 1, 2 and 3. The ad agency wants to test which of these three versions of the advertising copy is preferred by its target population, before it launches the campaign.

A sample of 18 respondents is selected from the target population in the nearby areas of the city. These 18 respondents are assigned at random to the 3 versions of ad copy. Each version of ad copy is thus shown to six of the respondents. The respondents are asked to rate their liking for the ad copy shown to them on a scale of 1 to 10 (1 = Not liked at all, 10 = Liked a lot, with the other values in between). The ratings given by the 18 respondents are tabulated.

Slide 5

Input Data

Fig. 1 shows the input data for the 18 respondents.

Fig. 1

Sr. No.   Ad copy   Rating
   1         1       6.00
   2         1       7.00
   3         1       5.00
   4         1       8.00
   5         1       8.00
   6         1       8.00
   7         2       4.00
   8         2       4.00
   9         2       5.00
  10         2       7.00
  11         2       7.00
  12         2       6.00
  13         3       5.00
  14         3       5.00
  15         3       4.00
  16         3       7.00
  17         3       8.00
  18         3       7.00

The codes in the Ad copy column (1, 2, 3) indicate the different versions of the ad. The last column, Rating, is the rating given by a respondent to the ad copy seen by him/her. Thus, six respondents have rated each ad. Please note that these eighteen respondents were randomly assigned among the three ad versions. This random assignment is called a completely randomised assignment or design.
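The random assignment described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the slides: the function name `assign_completely_randomised`, the respondent numbering 1 to 18 and the fixed seed are all assumptions made for the example.

```python
import random


def assign_completely_randomised(respondent_ids, n_treatments, seed=None):
    """Shuffle the respondents and deal them into equal-sized treatment groups.

    Hypothetical helper for illustration; returns {treatment: [respondent ids]}.
    """
    rng = random.Random(seed)  # a seed makes the illustration repeatable
    ids = list(respondent_ids)
    if len(ids) % n_treatments != 0:
        raise ValueError("respondents must divide evenly among treatments")
    rng.shuffle(ids)
    group_size = len(ids) // n_treatments
    return {
        t + 1: sorted(ids[t * group_size:(t + 1) * group_size])
        for t in range(n_treatments)
    }


# 18 respondents, 3 versions of ad copy, 6 respondents per version
assignment = assign_completely_randomised(range(1, 19), n_treatments=3, seed=42)
for ad_copy, respondents in assignment.items():
    print(f"Ad copy {ad_copy}: respondents {respondents}")
```

Every respondent has the same chance of seeing any version, which is what lets the later F-test attribute differences in mean rating to the ad copy rather than to how respondents were chosen.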

Slide 6

The input data in fig. 1 are entered into a statistical package for performing a One-Way ANOVA, because we have only 1 categorical factor (Ad copy) at 3 levels (1, 2, 3) and 1 dependent variable (Rating).

Output

The output of the computerised One-Way ANOVA is shown in fig. 2.

Fig. 2

Source of Variation   Sum of Squares   DF   Mean Square     F      Sig. of F
Main Effects               7.000        2      3.500      1.780      .203
  ADCOPY                   7.000        2      3.500      1.780      .203
Explained                  7.000        2      3.500      1.780      .203
Residual                  29.500       15      1.967
Total                     36.500       17      2.147

Slide 6 contd.

The first column is titled Source of Variation. Under this, labelled Main Effects, is the single independent variable called ADCOPY. We then go to the last column, where the significance of the F-test is given. It is .203 in this case, for the factor ADCOPY. Since .203 is greater than 0.05, the effect of ADCOPY is not statistically significant at the 95 percent confidence level (corresponding to a significance level of 0.05). In other words, the mean Ratings given to the three ad copy versions are not significantly different from each other.

Slide 7

The ANOVA has thus told us something we could not have judged simply by computing and comparing the mean ratings for each ad copy.

For example, the ratings for ad copy version 1 are 6, 7, 5, 8, 8, 8 and the mean rating is (6+7+5+8+8+8) / 6, or 42/6 = 7. Similarly, the mean rating of ad copy version 2 is (4+4+5+7+7+6) / 6, or 33/6 = 5.5. The mean rating for ad copy version 3 is (5+5+4+7+8+7) / 6, or 36/6 = 6.

At a glance, the three mean ratings (7, 5.5 and 6) appear to be different. But the ANOVA tells us that this difference is not statistically significant at the 95 percent confidence level. It does this by performing an F-test, which compares the variation between the three group means with the variation among respondents within each group.

The null hypothesis for this F-test is that the mean ratings for the three ad copy versions are equal (H0: M1 = M2 = M3, where M1, M2 and M3 are the mean ratings for the three versions of ad copy). In this case, we have failed to reject the null hypothesis (loosely, we have "accepted" it) at the 95 percent confidence level.

If the significance of F in the last column of fig. 2 had been less than 0.05, we would have rejected the null hypothesis. In that case, we would have concluded that significant differences exist between the mean ratings given to the three ad copy versions.
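As a check on the package output, the sums of squares and the F ratio in fig. 2 can be reproduced by hand. The following is a minimal sketch in plain Python using the ratings from fig. 1; the function name `one_way_anova` and the dictionary layout are assumptions made for this illustration. It stops at the F ratio: the Sig. of F value needs the F distribution, which a statistical package (for example SciPy's `scipy.stats.f.sf`) would supply.

```python
# Ratings from fig. 1, grouped by ad copy version
ratings = {
    1: [6, 7, 5, 8, 8, 8],
    2: [4, 4, 5, 7, 7, 6],
    3: [5, 5, 4, 7, 8, 7],
}


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-groups (explained) variation: group means around the grand mean
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Total variation: every rating around the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Within-groups (residual) variation is what is left over
    ss_within = ss_total - ss_between

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


table = one_way_anova(ratings)
print(f"ADCOPY   SS={table['ss_between']:.3f} DF={table['df_between']} "
      f"MS={table['ms_between']:.3f} F={table['F']:.3f}")
print(f"Residual SS={table['ss_within']:.3f} DF={table['df_within']} "
      f"MS={table['ms_within']:.3f}")
print(f"Total    SS={table['ss_total']:.3f}")
```

The F ratio is small because the residual mean square (variation among respondents who saw the same ad) is large relative to the variation between the three group means.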

Slide 8

Randomised Block Design

Let us continue with the same input data as in fig. 1, with one more column added to it. This dataset is shown in fig. 3.

Fig. 3

sr. no.   adcopy   rating   magazine
   1         1      6.00       1
   2         1      7.00       2
   3         1      5.00       3
   4         1      8.00       4
   5         1      8.00       5
   6         1      8.00       6
   7         2      4.00       1
   8         2      4.00       2
   9         2      5.00       3
  10         2      7.00       4
  11         2      7.00       5
  12         2      6.00       6
  13         3      5.00       1
  14         3      5.00       2
  15         3      4.00       3
  16         3      7.00       4
  17         3      8.00       5
  18         3      7.00       6

Slide 8 contd..
We have made a slightly different assumption in this case. We assume that the three versions of the ad copy were each used in 6 different magazines. These six magazines are coded 1, 2, 3, 4, 5, 6 and appear in the column titled magazine. Out of the people who saw these ads, 18 respondents are picked at random, one for each combination of magazine and ad version. Thus, we finally have one respondent who has seen a given version of the ad in a given magazine. In other words, we have one respondent for every combination of magazine and ad copy.

Slide 9

Hypothesis

1. The assignment of our sample of 18 in the above manner assumes that the magazine in which the version of ad copy appears may have an impact on the ratings. We can test this hypothesis (in fact, two hypotheses) by doing an ANOVA with a randomised block design.
2. For this purpose, we use the variable Rating as the dependent variable, Adcopy as the factor, and Magazine as the block.
3. A block is defined as some variable which could affect the relationship between the independent factor and the dependent variable under study in an ANOVA. In our example, the magazine in which the advertisement appears could influence the Rating given to Adcopy by the respondents. We are trying to remove the effect of the magazine used, by "blocking" its effect, or treating the block separately.
4. If we do not block on a variable, its effect gets included with the error (residual) term. This may lead to wrong conclusions about the relationship between the independent and dependent variables. In that sense, a randomised block design is more "powerful" than a simple one-way ANOVA, if the block effect is significantly influencing the relationship.

Output

The computer output for this problem using a randomised block design is shown in fig. 4.

Fig. 4 Tests of significance for RATING using UNIQUE sums of squares.

Source of Variation     SS     DF     MS       F      Sig of F
Residual               3.67    10     .37
Adcopy                 7.00     2    3.50     9.55      .005
Magazine              25.83     5    5.17    14.09      .000
(Model)               32.83     7    4.69    12.79      .000
(Total)               36.50    17    2.15

Slide 10

This table is similar to the output table of the one-way ANOVA we got earlier (fig. 2), except that there is an additional source of variation called Magazine in the first column of fig. 4. This is the block we have used. We now test two null hypotheses:

1. The first null hypothesis is that the mean rating of the ADCOPY is the same for all 3 versions. This is the same as the null hypothesis we had used earlier for the one-way ANOVA.
2. The second null hypothesis is that the block used (Magazine in this case) has no effect on the mean ratings given to ADCOPY versions by respondents.

Slide 11

1. To test whether the null hypotheses are rejected or not, we turn to the last column of fig. 4, which gives the significance of the F-test for each source of variation. This value can be compared with any chosen significance level. We will assume we wanted to test these hypotheses at the 95 percent confidence level.
2. We know that the significance of F in the last column should be less than 0.05 for the null hypothesis to be rejected. We see that for both the rows labelled ADCOPY and MAGAZINE, the significance of F is less than .05. It is .005 for ADCOPY and .000 for MAGAZINE. This means that both the null hypotheses are rejected.
3. We conclude that the mean ratings given to the 3 versions of ADCOPY are significantly different, and also that the MAGAZINE in which the ADCOPY appears has an impact on its rating.
4. Please note that considering the Blocking Factor separately has now led us to a different conclusion from that in a completely randomised test of the same basic data. The magazine effect, which was earlier part of the residual, has been removed from the error term, so the same ADCOPY sum of squares is now compared against a much smaller residual. This makes the randomised block test a better test when we suspect that a blocking factor affects the relationship between the independent variable and the dependent variable.
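The way blocking shrinks the residual can be seen by recomputing fig. 4 by hand. The sketch below is plain Python on the data of fig. 3, arranged with one row per magazine and one column per ad copy version; the function name `randomised_block_anova` and this layout are assumptions made for the illustration, and it relies on there being exactly one rating per magazine and ad copy combination. As before, it stops at the F ratios and leaves the Sig of F values to a statistical package.

```python
# Ratings from fig. 3: rows are magazines 1..6 (blocks),
# columns are ad copy versions 1..3 (treatments)
ratings = [
    [6, 4, 5],
    [7, 4, 5],
    [5, 5, 4],
    [8, 7, 7],
    [8, 7, 8],
    [8, 6, 7],
]


def randomised_block_anova(table):
    """Split total variation into treatment, block and residual parts."""
    n_blocks = len(table)
    n_treatments = len(table[0])
    values = [x for row in table for x in row]
    grand_mean = sum(values) / len(values)

    treatment_means = [
        sum(row[j] for row in table) / n_blocks for j in range(n_treatments)
    ]
    block_means = [sum(row) / n_treatments for row in table]

    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_treatment = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_block = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)
    # What neither the factor nor the block explains is the residual
    ss_residual = ss_total - ss_treatment - ss_block

    df_treatment = n_treatments - 1
    df_block = n_blocks - 1
    df_residual = df_treatment * df_block
    ms_residual = ss_residual / df_residual
    return {
        "ss_treatment": ss_treatment, "ss_block": ss_block,
        "ss_residual": ss_residual, "ss_total": ss_total,
        "df_residual": df_residual,
        "F_treatment": (ss_treatment / df_treatment) / ms_residual,
        "F_block": (ss_block / df_block) / ms_residual,
    }


result = randomised_block_anova(ratings)
print(f"Adcopy   SS={result['ss_treatment']:.2f} F={result['F_treatment']:.2f}")
print(f"Magazine SS={result['ss_block']:.2f} F={result['F_block']:.2f}")
print(f"Residual SS={result['ss_residual']:.2f} DF={result['df_residual']}")
```

The Adcopy sum of squares is the same 7.00 as in the one-way analysis; only the residual it is compared against has changed, from 29.50 on 15 degrees of freedom to 3.67 on 10.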

Slide 12

Latin Square Design


The Latin Square Design is an extension of the Randomised Block Design. It consists of one independent variable (FACTOR) and two Blocks, instead of the one Block which we saw in the Randomised Block Design. It has no special significance in marketing research, so we will move on to the more general case of a factorial design, where any number of factors can be tested simultaneously for their effects on the dependent variable.

Factorial Designs

This type of design is employed when we have 2 or more independent variables or factors. The major advantage of this design is that multiple factors can be tested simultaneously. There are two kinds of effects that we can test. One is called the Main Effect. The second is called the Interaction Effect. To illustrate, we will take up an example.

Slide 13

Worked Example
In this example, we assume that we are testing, for a toilet soap brand, the effect of two Factors (independent variables), Pack Design and Price, on Sales (dependent variable). We would like to know (1) if each of the Factors independently affects Sales (called the Main Effects), and (2) if there is a combined effect of Pack Design and Price (called the 2-way Interaction Effect) on Sales. Incidentally, if there are 3 factors in a study, then we could test for all 2-way interaction effects and the 3-way interaction effect, in addition to the Main Effects of the individual factors.

To continue with our example, the experiment is conducted in a simulated environment on 18 randomly selected respondents. There are 3 levels of Price (Rs. 8, Rs. 11 and Rs. 14) and 3 levels of Pack Design, designated by the main colours used (Blue, Red and Green). The coding of these variables is 1, 2, 3 respectively for Rs. 8, 11 and 14 in the case of Price, and 1, 2, 3 for Blue, Red and Green in the case of Pack Design.

Slide 14

Input Data

The input dataset is shown in fig. 5.

Fig. 5

sr. no.   sales   packdesn   price
   1       500       1         1
   2       440       2         1
   3       360       3         1
   4       300       1         2
   5       280       2         2
   6       250       3         2
   7       200       1         3
   8       150       2         3
   9       250       3         3
  10       600       1         1
  11       450       2         1
  12       510       3         1
  13       400       1         2
  14       350       2         2
  15       300       3         2
  16       250       1         3
  17       275       2         3
  18       220       3         3

After the serial number, the columns are Sales, Pack Design and Price. Please note that even though Price is a continuous metric variable, it is an independent variable here and so, for the purpose of ANOVA, it has to be treated as a categorical variable. Hence the coding (1, 2, 3) for Price.

Slide 15

Also note from fig. 5 that each combination of Price and Pack Design appears twice in the dataset. For example, Packdesign = 1 and Price = 1 appears in Row 1 and also in Row 10. This is known as a replication in design of experiments. It is similar to having a larger sample size in a survey. As the number of Factors and the number of levels of each Factor increase, the minimum sample size required for ANOVA may go up. In such cases, multiple observations or replications become necessary. In a factorial design, replication is also what allows the interaction effect to be estimated separately from the residual (error) term. In general, replications reduce the chances of random error affecting the results of ANOVA experiments, similar to the effect of increasing sample size in surveys.

Output

The output data for our factorial experiment are presented in fig. 6.

Fig. 6
Source of Variation     Sum of Squares   DF   Mean Square      F      Sig of F
Main Effects              209305.556      4     52326.389    13.645     .001
  Packdesn                 12536.111      2      6268.056     1.635     .248
  Price                   196769.444      2     98384.722    25.656     .000
2-Way Interactions          9838.889      4      2459.722      .641     .646
  Packdesn Price            9838.889      4      2459.722      .641     .646
Explained                 219144.444      8     27393.056     7.143     .004
Residual                   34512.500      9      3834.722
Total                     253656.944     17     14920.997

Slide 16

Let us first look at Sources of Variation listed in the first column. The last source of variation listed is the Residual or error term. But we are interested in the two Main Effects and one Interaction Effect.
In this case, we are testing three null hypotheses:

1. The mean level of Sales remains the same for all 3 levels of Pack Design (Main Effect 1).
2. The mean level of Sales remains the same for all 3 levels of Price (Main Effect 2).
3. There is no interaction between Pack Design and Price, that is, the effect of one factor on mean Sales does not depend on the level of the other (Interaction Effect).

Assuming a 0.05 level of significance, we check, for each of the rows corresponding to the above hypotheses, whether the significance of F in the last column of fig. 6 is below 0.05.

Slide 17

We find that the significance of F values are:

Pack Design: .248 (Main Effect 1)
Price: .000 (Main Effect 2)
Pack Design by Price: .646 (Interaction Effect)

Therefore, only the Price effect, one of the two main effects, is statistically significant at the 95 percent confidence level. This means that hypothesis no. 2 is rejected. Hypotheses 1 and 3 cannot be rejected, as the significance of F values are greater than .05 in both cases (.248 and .646 respectively).

Thus, we conclude that Price alone has a significant impact on Sales. Neither Pack Design alone nor the combination of Pack Design with Price has any significant impact on Sales of the toilet soap.
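The sums of squares behind these three tests can also be recomputed from the data in fig. 5. The sketch below is plain Python; the function name `two_way_anova` and the tuple layout are assumptions made for this illustration, and the subtraction used for the interaction term is valid only because the design is balanced (every Pack Design and Price combination has the same number of observations). It reports the F ratios of fig. 6 but not the Sig of F values, which require the F distribution.

```python
from collections import defaultdict

# (sales, packdesn, price) rows from fig. 5
observations = [
    (500, 1, 1), (440, 2, 1), (360, 3, 1),
    (300, 1, 2), (280, 2, 2), (250, 3, 2),
    (200, 1, 3), (150, 2, 3), (250, 3, 3),
    (600, 1, 1), (450, 2, 1), (510, 3, 1),
    (400, 1, 2), (350, 2, 2), (300, 3, 2),
    (250, 1, 3), (275, 2, 3), (220, 3, 3),
]


def two_way_anova(rows):
    """Main effects and interaction for a balanced two-factor design."""
    n = len(rows)
    correction = sum(sales for sales, _, _ in rows) ** 2 / n

    def ss_for(key):
        """Sum of squares explained by grouping the rows with `key`."""
        totals, counts = defaultdict(float), defaultdict(int)
        for sales, pack, price in rows:
            totals[key(pack, price)] += sales
            counts[key(pack, price)] += 1
        ss = sum(totals[k] ** 2 / counts[k] for k in totals) - correction
        return ss, len(totals)

    ss_pack, n_pack = ss_for(lambda pack, price: pack)
    ss_price, n_price = ss_for(lambda pack, price: price)
    ss_cells, n_cells = ss_for(lambda pack, price: (pack, price))
    ss_total = sum(sales ** 2 for sales, _, _ in rows) - correction

    # Interaction: what the cells explain beyond the two main effects
    ss_interaction = ss_cells - ss_pack - ss_price
    ss_residual = ss_total - ss_cells

    df_pack, df_price = n_pack - 1, n_price - 1
    df_interaction = df_pack * df_price
    df_residual = n - n_cells
    ms_residual = ss_residual / df_residual
    return {
        "ss_pack": ss_pack, "ss_price": ss_price,
        "ss_interaction": ss_interaction, "ss_residual": ss_residual,
        "df_residual": df_residual,
        "F_pack": (ss_pack / df_pack) / ms_residual,
        "F_price": (ss_price / df_price) / ms_residual,
        "F_interaction": (ss_interaction / df_interaction) / ms_residual,
    }


anova = two_way_anova(observations)
for name in ("pack", "price", "interaction"):
    print(f"{name:12s} SS={anova['ss_' + name]:.3f} F={anova['F_' + name]:.3f}")
print(f"{'residual':12s} SS={anova['ss_residual']:.3f} DF={anova['df_residual']}")
```

Price accounts for most of the explained variation, which is why it is the only effect whose F ratio is large relative to the residual mean square.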

Slide 18

Additional Comments

Experiments are today widely used in many ways in Marketing Research. For example, test marketing of new concepts, products or prototypes is usually done through the procedures explained above, or procedures similar to these. STM, or Simulated Test Marketing, procedures are extensions of the basic ANOVA-type experiments, with the added tools of forecasting based on the results of the experiments conducted. Separate software packages are now available for many specialised applications such as STM.

Pairwise Tests

If any main effect or interaction effect turns out to be significant, and has more than two levels, one additional test is required to check for pairwise differences in the means. For instance, in our example of one-way ANOVA, if the mean Ratings had turned out to be significantly different at the 95 percent confidence level, we still would not know whether only one of the pairs (say, ADCOPY 1 and ADCOPY 2) is significantly different, or whether the remaining pairs (ADCOPY 1 and 3, and ADCOPY 2 and 3) are also significantly different.

To find out, we can use tests such as Tukey's Test, Duncan's Test or Scheffe's Test. These can be requested while doing the ANOVA on most computer packages. These tests give us a pairwise test result of significant difference among means. They are meaningful only if the F-test for a main effect or interaction effect with more than two levels turns out to be significant.
