
Workshop 1

Data: 124 weekly observations of Heinz tomato ketchup sales, measured in US
dollars, between 1985 and 1988 in one supermarket in Sioux Falls, South
Dakota (by A. C. Nielsen). The data include sales, price, coupon promotion
and major display promotion information.

The characteristics of the sales (volume and revenue) and prices

Descriptive Statistics

                                                                       Skewness
                      N   Minimum   Maximum       Mean   Std. Dev.   Statistic   Std. Error
Sales               124     18.89    773.94   114.4666   129.85165       2.599         .217
Price               124       .91      1.28     1.1572      .10324       -.499         .217
volume              124     17.99    626.87   100.5410   112.76683       2.321         .217
Display_only        124         0         1        .27        .444       1.071         .217
Coupon_only         124         0         1        .10        .297       2.761         .217
Display_Coupon      123         0         1        .06        .233       3.873         .218
Valid N (listwise)  123

According to the descriptive statistics above, weekly sales of Heinz tomato
ketchup ranged from 18.89 at the lowest to 773.94 at the highest during the
period. The average weekly sale was 114.47. The standard deviation tells us
that sales were between 0 and 244.32 (114.47 + 129.85) in most of the
weeks. Since sales in the majority of the weeks lie between 0 and 244.32
while the maximum is 773.94, this may be an indication of outliers. The
histogram below confirms my suspicion. In situations like this, when we
have outliers, it is better to use the median to show the typical sale in
one week. According to the histogram below, the median is 60.02. The
normal distribution has no skew (it is perfectly symmetrical), and its
mean and median are both exactly at the peak. Here the mean (114.47) is
almost twice the median, so the sales variable is not normally
distributed. There is a positive skew: the tail in the histogram below is
on the positive side of the peak, and the skewness (2.599) in the
descriptive statistics is positive. According to the time-series plots
below, there are some fluctuations in sales, but they are still relatively
stable. Season does not seem to influence sales.
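
The one-standard-deviation range quoted above can be reproduced from the
reported mean and standard deviation. The sketch below is only arithmetic on
the figures in the descriptive statistics table. The variable names are my
own, and comparing the skewness with twice its standard error is a common
rule of thumb that I assume here, not something taken from the SPSS output.

```python
# Figures copied from the descriptive statistics table (row "Sales").
mean_sales = 114.4666
sd_sales = 129.85165
skewness = 2.599
skewness_se = 0.217

# One standard deviation around the mean. Sales cannot be negative,
# so the lower bound is clipped at zero.
lower = max(0.0, mean_sales - sd_sales)
upper = mean_sales + sd_sales

# Rule of thumb (an assumption): a skewness larger than about twice its
# standard error points to a clearly asymmetric distribution.
skew_ratio = skewness / skewness_se

print(f"typical range: {lower:.2f} to {upper:.2f}")
print(f"skewness / std. error: {skew_ratio:.1f}")
```

The maximum of 773.94 lies far outside this range, which is consistent with
the suspicion of outliers.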

According to the descriptive statistics above, the histogram below and the
time-series plot below, the sales variable and the volume variable behave
in much the same way. The minimum, maximum, mean, standard deviation and
skewness are of similar size. As with the sales variable, the maximum
volume (626.87) is much higher than the typical volume. As mentioned
above, this may be an indication of outliers, and the histogram below
confirms the suspicion. Like the sales variable, the volume variable is
not normally distributed; there is a positive skew. And again, as with the
sales variable, there are some fluctuations in the volume during the
period, but it is still relatively stable.

According to the descriptive statistics above, the price was 0.91 at its
lowest and 1.28 at its highest. The mean was 1.16. According to the
histogram below, the median is 1.15. In contrast to the sales and volume
variables, the mean and the median of the price variable are almost the
same, and both lie close to the peak. In addition, there is less skew
(-.499). The price variable is thus closer to a normal distribution. The
time-series plot below shows some fluctuation in the price over the whole
period. The price is relatively stable until week 60 and declines after
that week. What happened? It is difficult to tell, but tougher
competition, lower costs or deflation may explain the decline. As we saw
above, demand was relatively stable throughout the period, so we cannot
attribute the decline to demand.

We have three dummy variables, which are dichotomous variables with only
two values (0 or 1). These dummy variables represent different types of
promotion. The supermarket uses display promotion only, coupon promotion
only, and display and coupon promotion at the same time. The mean is the
only measure that makes sense for dummy variables because they have only
two values: it gives the share of weeks in which the variable equals 1.
The supermarket uses only display promotion 27 % of the time, only coupon
promotion 10 % of the time and both display and coupon promotion 6 % of
the time. It uses no promotion 57 % of the time. I will investigate the
effects of the different types of promotion on sales below.
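
The 57 % figure follows from the three dummy means, on the assumption that
the three promotion types never overlap in the same week (which the variable
names Display_only, Coupon_only and Display_Coupon suggest). A minimal
sketch, with dictionary keys of my own choosing:

```python
# Means of the three promotion dummies from the descriptive statistics table.
# The mean of a 0/1 variable is the share of weeks in which it equals 1.
promo_share = {
    "display_only": 0.27,
    "coupon_only": 0.10,
    "display_and_coupon": 0.06,
}

# Assumption: the three categories are mutually exclusive, so the
# remaining share of weeks had no promotion at all.
no_promo_share = 1.0 - sum(promo_share.values())

print(f"no promotion: {no_promo_share:.0%}")
```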

The relationships between volume and price

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .163 (a)   .026       .018                111.71922
a. Predictors: (Constant), Price

Coefficients (a)

                    Unstandardized Coefficients
Model               B            Std. Error        Sig.
1   (Constant)       306.201     113.356           .008
    Price           -177.723      97.574           .071
a. Dependent Variable: volume

Y = a + bx = 306.201 - 177.723x, where Y is the predicted volume and x is
the price.

According to the scatter plot, there is a weak negative correlation between
volume and price. With a negative correlation, an increase in price goes
together with a decrease in volume. In the second table, the row
(Constant) shows the number 306.20 in column B under Unstandardized
Coefficients (a in the equation above). This is the predicted volume if
the price is zero, a situation that will never occur. The row below gives
the coefficient of the price, -177.72 (b in the equation above). It tells
us that the volume decreases when the price increases. A larger absolute
value means a steeper line; the strength of the correlation is given by R
(.163), not by the slope. Example: if the price increases by 1 US dollar,
the predicted volume decreases by 177.72 units. Sig. is 0.071, which means
that the relationship is statistically significant at the 10 % level but
not at the 5 % level.
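
To make the fitted line concrete, the sketch below evaluates it for a price
change of 10 cents, which stays inside the observed price range of 0.91 to
1.28 (a full dollar does not). It also recomputes the t statistic of the
price coefficient as B divided by its standard error. The function name and
the two example prices are my own choices.

```python
# Coefficients copied from the regression table (dependent variable: volume).
intercept = 306.201
slope = -177.723
slope_se = 97.574


def predicted_volume(price):
    """Predicted weekly volume from the fitted line Y = a + b * price."""
    return intercept + slope * price


# Effect of raising the price from 1.10 to 1.20 US dollars (example values).
change = predicted_volume(1.20) - predicted_volume(1.10)

# t statistic of the price coefficient: estimate divided by standard error.
t_price = slope / slope_se

print(f"change in predicted volume: {change:.2f}")
print(f"t statistic for price: {t_price:.2f}")
```

A 10-cent increase lowers the predicted volume by about 17.8 units, one
tenth of the coefficient.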
An important question in regression analysis is: how good is this model?
How much of the variation in the dependent variable (volume in this case)
can the independent variable (price in this case) explain? Adjusted R
Square gives us the answer. Adjusted R Square in this model is 0.018.
This means that the price explains only 1.8 % of the total variation in
volume; 98.2 % of the variation in volume cannot be ascribed to
differences in price. The higher the Adjusted R Square, the better the
model.
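
Adjusted R Square can be recovered from R Square, the number of observations
and the number of predictors with the standard formula
1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below applies it to the values in
the model summary, assuming all 124 weeks entered the regression; the
function name is my own.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# R Square from the model summary; n = 124 weeks is an assumption,
# k = 1 because price is the only predictor.
adj = adjusted_r2(0.026, 124, 1)

print(f"adjusted R-squared: {adj:.3f}")
```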

This regression analysis indicates a weak negative correlation between
price and volume. But regression analysis rests on certain assumptions,
and they do not appear to be fulfilled: among other things, the variables
are not normally distributed (especially not the volume variable). Taking
this into consideration, we cannot trust these results. I will later use
log transformations to get more reliable results.

The effects of different types of promotions on the sales

In the grouped scatter plot above, the points are quite mixed together.
It is therefore difficult to see the effects of the different types of
promotion on sales. On the other hand, the volume when no promotion is
used is almost the same irrespective of the price. Demand seems to be
relatively inelastic when there is no promotion. To see the effects of
the different types of promotion, it is better to use a box plot.

A box plot like this makes it easier to investigate the effects. We can see
the minimum volume, maximum volume, median, first quartile and third
quartile for each type of promotion and for no promotion. Judged by the
median, the typical sales volume, no promotion results in the lowest
volume. Display only is better than no promotion, coupon only is even
better, and display plus coupon results in the highest volume and is thus
the best option. If we look at the maximum volume instead, coupon only is
the best option. But the maximum volume for coupon only is much higher
than the median, which may be an indication of outliers. The grouped
scatter plot above confirms my suspicion.
However, it is difficult to recommend one type of promotion or no
promotion. Display plus coupon results in higher volume but is also more
costly. When we take this into account, it may be more profitable to use
other types of promotion or no promotion. In addition, if the supermarket
always uses promotion, it will no longer be a promotion and the effect
may disappear. To investigate this further, I will look at the customers'
price elasticity below.
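
The five numbers a box plot displays can be computed directly for each
promotion group. The sketch below shows the idea on invented numbers: the
volumes in `toy_volumes` are made up for illustration and are not the
Nielsen data. The quartile method of Python's `statistics.quantiles` may
also differ slightly from the one SPSS uses.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)


# Hypothetical weekly volumes, NOT taken from the workshop data set.
toy_volumes = {
    "no_promotion": [40, 45, 50, 55, 60],
    "display_only": [60, 80, 100, 120, 300],
}

for group, values in toy_volumes.items():
    print(group, five_number_summary(values))
```

In the second toy group the maximum lies far above the median, the same
pattern that suggests outliers for coupon only in the report.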

Customers' price elasticity


We can construct a demand model from the regression model above:
D(p) = 306.201 - 177.723p

$$
\varepsilon(p) = -p\,\frac{D'(p)}{D(p)} = -\frac{1.16 \times (-177.72)}{100.5} = 2.05
$$

When the price elasticity satisfies 1 < ε < ∞, we have an elastic demand.
But we know from above that the price variable and the volume variable are
not normally distributed. As a result, we cannot trust this price
elasticity. We therefore apply a log transformation and run a new
regression analysis.
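
The point elasticity of the linear demand model can be checked numerically.
The sketch below evaluates the formula ε(p) = -p·D'(p)/D(p) at the unrounded
mean price of 1.1572 from the descriptive statistics, whereas the
calculation above uses the rounded values 1.16 and 100.5. The function names
are my own. It also evaluates the lowest and highest observed prices,
because the elasticity of a linear demand curve changes along the line.

```python
def linear_demand(p, a=306.201, b=-177.723):
    """Predicted volume D(p) = a + b * p from the linear regression."""
    return a + b * p


def point_elasticity(p, a=306.201, b=-177.723):
    """Price elasticity -p * D'(p) / D(p); for a linear demand D'(p) = b."""
    return -p * b / linear_demand(p, a, b)


mean_price = 1.1572  # mean of Price in the descriptive statistics
e_mean = point_elasticity(mean_price)

print(f"elasticity at the mean price: {e_mean:.2f}")
for p in (0.91, 1.28):  # lowest and highest observed price
    print(f"elasticity at p = {p}: {point_elasticity(p):.2f}")
```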

Descriptive Statistics

                          Skewness
                     Statistic   Std. Error
log_price                -.631         .217
log_volume               1.013         .217
Valid N (listwise)

The distribution of log_volume is more symmetric than that of volume: its
skewness falls from 2.321 to 1.013. For log_price the skewness is -.631,
slightly more negative than the -.499 of the untransformed price. The
normal distribution has no skew, and its mean and median are exactly at
the peak. We can see from the histogram that the mean and the median of
the log variables are closer to the peak. Thus, the new log variables can
be approximated by a normal distribution, although some skew remains.
Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .226 (a)   .051       .043                .79073
a. Predictors: (Constant), log_price

Coefficients (a)

                    Unstandardized Coefficients
Model               B            Std. Error        Sig.
1   (Constant)       4.497       .131              .000
    log_price       -1.994       .777              .011
a. Dependent Variable: log_volume

Using the regression analysis above, we can calculate the price elasticity
in the log-log model:
log(d) = 4.497 - 1.994 log(p)

$$
\varepsilon(p) = -p\,\frac{D'(p)}{D(p)} = -\frac{\mathrm{d}\log(d)}{\mathrm{d}\log(p)} = 1.994
$$
The price elasticity before the log transformation was 2.05, and after it
is 1.994. As we can see in the model summary for the log-log model,
Adjusted R Square is a bit higher in this model (.043) but is still low.
In this model, the negative correlation between volume and price is
statistically significant at the 5 % level (Sig. = .011). Despite the low
Adjusted R Square, we can assume that the price elasticity is about 2,
since both estimates are close to 2. This means that if the price
increases by 1 %, the volume will decrease by about 2 %.
We can investigate customers' price elasticity further by looking at the
price elasticity for each type of promotion.
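
That the slope of a log-log regression is the price elasticity can be
illustrated on synthetic data. The sketch below generates noise-free volumes
from a constant-elasticity demand curve built from the estimated
coefficients, fits a simple least-squares line to the logs and recovers the
slope. The prices are arbitrary example values, I assume natural logarithms,
and `ols_simple` is a small helper of my own, not the SPSS procedure.

```python
import math


def ols_simple(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic, noise-free data generated from log(d) = 4.497 - 1.994 log(p).
prices = [0.95, 1.00, 1.05, 1.10, 1.15, 1.20, 1.25]
volumes = [math.exp(4.497) * p ** -1.994 for p in prices]

intercept, slope = ols_simple(
    [math.log(p) for p in prices],
    [math.log(v) for v in volumes],
)

# Volume change implied by a 1 % price increase under this model.
pct_change = (1.01 ** slope - 1.0) * 100.0

print(f"slope (minus the elasticity): {slope:.3f}")
print(f"volume change for +1 % price: {pct_change:.2f} %")
```

Because the data contain no noise, the fit returns the coefficients exactly;
with the real weekly data the low Adjusted R Square shows that the scatter
around the line is large.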

Coefficients (a)

                               Unstandardized        Standardized
                               Coefficients          Coefficients
Model                          B       Std. Error    Beta           t        Sig.
1   (Constant)                 3.716   .072                         51.446   .000
    Log_NoPromotionsPrice       .297   .396          .090             .752   .455
a. Dependent Variable: log_volume

Coefficients (a)

                               Unstandardized        Standardized
                               Coefficients          Coefficients
Model                          B        Std. Error   Beta           t        Sig.
1   (Constant)                  5.229    .183                       28.565   .000
    Log_DisplayPrice           -5.869   1.304        -.629          -4.502   .000
a. Dependent Variable: log_volume

Coefficients (a)

                               Unstandardized        Standardized
                               Coefficients          Coefficients
Model                          B        Std. Error   Beta           t        Sig.
1   (Constant)                  5.314    .501                       10.602   .000
    Log_CouponPrice            -1.004   2.968        -.106           -.338   .742
a. Dependent Variable: log_volume

Coefficients (a)

                               Unstandardized        Standardized
                               Coefficients          Coefficients
Model                          B       Std. Error    Beta           t        Sig.
1   (Constant)                 5.645    .514                        10.977   .000
    Log_DisplayAndCouponPrice   .090   3.640         .011             .025   .981
a. Dependent Variable: log_volume

When we use a log-log model, the coefficient of the independent variable
is the price elasticity. According to the tables above, the price
coefficients are not statistically significant when the supermarket uses
coupon promotion only, display plus coupon promotion or no promotion. I
therefore treat the demand as inelastic in these three cases, although
the large standard errors make this conclusion uncertain. On the other
hand, the demand is very elastic when the supermarket uses display only
(coefficient -5.869), and this negative relationship is statistically
significant at the 1 % level. An inelastic demand is desirable for the
supermarket, because it can then increase the price without losing
customers. Judged by the price elasticity for each type of promotion,
coupon promotion only, display plus coupon promotion and no promotion are
therefore the best options for the supermarket.
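
The significance statements above can be checked from the four coefficient
tables by dividing each price coefficient by its standard error. The sketch
below also adds a rough 95 % interval of the form B ± 1.96 × Std. Error. The
value 1.96 is the normal approximation, which I assume here because the
number of weeks in each promotion group is not reported; the dictionary keys
and the function name are my own.

```python
# (B, Std. Error) of the log-price coefficient in each of the four tables.
estimates = {
    "no_promotion": (0.297, 0.396),
    "display_only": (-5.869, 1.304),
    "coupon_only": (-1.004, 2.968),
    "display_and_coupon": (0.090, 3.640),
}

Z_95 = 1.96  # normal approximation; the exact t value depends on group size


def summarize(b, se):
    """Return the t statistic and an approximate 95 % interval for b."""
    return b / se, b - Z_95 * se, b + Z_95 * se


summary = {name: summarize(b, se) for name, (b, se) in estimates.items()}

for name, (t, low, high) in summary.items():
    print(f"{name:20s} t = {t:6.2f}   interval = [{low:6.2f}, {high:6.2f}]")
```

Only the display-only interval excludes zero. The other three intervals are
wide and include both zero and clearly elastic values, so the data do not
pin down the elasticity in those cases.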

Implications of these observations on the pricing and promotion decisions
The variables we used in the models above are not normally distributed,
and we have a limited amount of data. The models above are therefore not
fully reliable, and some of the results are not statistically
significant. In spite of this, we can see some patterns in the data, and
these patterns can be very helpful when pricing and promotion decisions
are made.
It is difficult to recommend one type of promotion or no promotion.
According to this report, display and coupon promotion at the same time
seems to be the best option. It results in the highest volume, and the
demand is inelastic when this combination is used. The supermarket uses
display plus coupon promotion infrequently (only 6 % of the time), which
may explain some of the positive effect. It is also important to mention
that this combination is more costly than the other options. When we take
this into account, it may be more profitable to use other types of
promotion or no promotion.
We can say one thing with a relatively high degree of certainty: the
demand is very elastic when the supermarket uses display promotion only.
In this case, the supermarket cannot increase the price without losing
many customers. In addition, display only results in the second lowest
volume. This indicates that display only is a bad solution. On the other
hand, it is a cheaper type of promotion, so it may be more profitable
than other types of promotion after all.
Another interesting result from this report is the price elasticity. The
aggregate demand is elastic, but the demand is inelastic when the
supermarket uses coupon promotion only, display plus coupon promotion or
no promotion. In addition, the price can explain only 1.8 % (4.3 % in the
log-log model) of the total variation in volume, which indicates that
price is not the most important factor for volume. Thus, this report
suggests that the supermarket should increase the price under some
circumstances.
