
24-Grover.

qxd 5/8/2006 8:35 PM Page 506

24
MODELING MARKETING MIX
GERARD J. TELLIS
University of Southern California

CONCEPT OF THE MARKETING MIX

The marketing mix refers to variables that a marketing manager can control to influence a brand’s sales or market share. Traditionally, these variables are summarized as the four Ps of marketing: product, price, promotion, and place (i.e., distribution; McCarthy, 1996). Product refers to aspects such as the firm’s portfolio of products, the newness of those products, their differentiation from competitors, or their superiority to rivals’ products in terms of quality. Promotion refers to advertising, detailing, or informative sales promotions such as features and displays. Price refers to the product’s list price or any incentive sales promotion such as quantity discounts, temporary price cuts, or deals. Place refers to delivery of the product measured by variables such as distribution, availability, and shelf space.

The perennial question that managers face is, what level or combination of these variables maximizes sales, market share, or profit? The answer to this question, in turn, depends on the following question: How do sales or market share respond to past levels of or expenditures on these variables?

PHILOSOPHY OF MODELING

Over the past 45 years, researchers have focused intently on trying to find answers to this question (e.g., see Tellis, 1988b). To do so, they have developed a variety of econometric models of market response to the marketing mix. Most of these models have focused on market response to advertising and pricing (Sethuraman & Tellis, 1991). The reason may be that expenditures on these variables seem the most discretionary, so marketing managers are most concerned about how they manage these variables. This chapter reviews this body of literature. It focuses on modeling response to these variables, though most of the principles apply as well to other variables in the marketing mix. It relies on elementary models that Chapters 12 and 13 introduce. To tackle complex problems, this chapter refers to advanced models, which Chapters 14, 19, and 20 introduce.

The basic philosophy underlying the approach of response modeling is that past data on consumer and market response to the marketing mix contain valuable information that can enlighten our understanding of response. Those data also enable us to predict how consumers
might respond in the future and therefore how best to plan marketing variables (e.g., Tellis & Zufryden, 1995). While no one can predict the future with certainty, no one should ignore the past entirely. Thus, we want to capture as much information as we can from the past to make valid inferences and develop good strategies for the future.

Assume that we fit a regression model in which the dependent variable is a brand’s sales and the independent variable is advertising or price. Thus,

$$Y_t = \alpha + \beta A_t + \varepsilon_t. \tag{1}$$

Here, $Y$ represents the dependent variable (e.g., sales), $A$ represents advertising, $\alpha$ and $\beta$ are coefficients that the researcher wants to estimate, and the subscript $t$ represents various time periods. A section below discusses the problem of the appropriate time interval, but for now, the researcher may think of time as measured in weeks or days. The $\varepsilon_t$ are errors in the estimation of $Y_t$ that we assume to independently and identically follow a normal distribution (IID normal). Equation (1) can be estimated by regression (see Chapter 13). Then the coefficient $\beta$ of the model captures the effect of advertising on sales. In effect, this coefficient nicely summarizes much that we can learn from the past. It provides a foundation to design strategies for the future. Clearly, the validity, relevance, and usefulness of the parameters depend on how well the models capture past reality. Chapters 13, 14, and 19 describe how to correctly specify those models. This chapter explains how we can implement them in the context of the marketing mix. We focus on advertising and price for three reasons. First, these are the variables most often under the control of managers. Second, the literature has a rich history of models that capture response to these variables. Third, response to these variables has a wealth of interesting patterns or effects. Understanding how to model these response patterns can enlighten the modeling of other marketing variables.

The first step is to understand the variety of patterns by which contemporary markets respond to advertising and pricing. These patterns of response are also called the effects of advertising or pricing. We then present the most important econometric models and discuss how these classic models capture or fail to capture each of these effects.

PATTERNS OF ADVERTISING RESPONSE

We can identify seven important patterns of response to advertising. These are the current, shape, competitive, carryover, dynamic, content, and media effects. The first four of these effects are common across price and other marketing variables. The last three are unique to advertising. The next seven subsections describe these effects.

Current Effect

The current effect of advertising is the change in sales caused by an exposure (or pulse or burst) of advertising occurring in the same time period as the exposure. Consider Figure 24.1. It plots time on the x-axis, sales on the y-axis, and the normal or baseline sales as the dashed line. Then the current effect of advertising is the spike in sales from the baseline given an exposure of advertising (see Figure 24.1A). Decades of research indicate that this effect of advertising is small relative to that of other marketing variables and quite fragile. For example, the current effect of price is 20 times larger than the effect of advertising (Sethuraman & Tellis, 1991; Tellis, 1989). Also, the effect of advertising is so small as to be easily drowned out by the noise in the data. Thus, one of the most important tasks of the researcher is to specify the model very carefully to avoid exaggerating or failing to observe an effect that is known to be fragile (e.g., Tellis & Weiss, 1995).

Carryover Effect

The carryover effect of advertising is that portion of its effect that occurs in time periods following the pulse of advertising. Figure 24.1 shows long (1B) and short (1C) carryover effects. The carryover effect may occur for several reasons, such as delayed exposure to the ad, delayed
508– • –CONCEPTUAL APPLICATIONS

[Figure 24.1 Temporal Effects of Advertising. Four panels, each plotting sales (y-axis) against time (x-axis): A: Current Effect; B: Carryover Effects of Long Duration; C: Carryover Effects of Short Duration; D: Persistent Effect. Legend: ad exposure, baseline sales, sales due to ad exposure.]

consumer response, delayed purchase due to consumers’ backup inventory, delayed purchase due to shortage of retail inventory, and purchases from consumers who have heard from those who first saw the ad (word of mouth). The carryover effect may be as large as or larger than the current effect. Typically, the carryover effect is of short duration, as shown in Figure 24.1C, rather than of long duration, as shown in Figure 24.1B (Tellis, 2004). The long duration that researchers often find is due to the use of data with long intervals that are temporally aggregate (Clarke, 1976). For this reason, researchers should use data that are as temporally disaggregate as they can find (Tellis & Franses, in press). The total effect of advertising from an exposure of advertising is the sum of the current effect and all of the carryover effect due to it.
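To make the arithmetic of current, carryover, and total effects concrete, the following Python sketch traces the sales lift from a single ad pulse. It assumes, purely for illustration, that the lift decays geometrically from one period to the next. The function names, the retention rates, and the lift of 100 units are hypothetical and are not drawn from any study cited in this chapter.

```python
def lift_profile(current_effect, retention, n_periods):
    """Sales lift above baseline in each period after one ad pulse.

    Assumption (illustrative only): the lift shrinks by the fixed
    factor `retention` each period, i.e., geometric decay.
    """
    return [current_effect * retention ** k for k in range(n_periods)]


def carryover_effect(profile):
    """Lift in all periods after the period of the exposure."""
    return sum(profile[1:])


def total_effect(profile):
    """Current effect plus all carryover due to the same exposure."""
    return profile[0] + carryover_effect(profile)


# Hypothetical pulse: 100 extra units sold in the exposure period.
short = lift_profile(current_effect=100.0, retention=0.5, n_periods=6)
long_ = lift_profile(current_effect=100.0, retention=0.9, n_periods=6)

print("short-duration profile:", short)
print("short: carryover =", carryover_effect(short), "total =", total_effect(short))
print("long:  carryover =", carryover_effect(long_), "total =", total_effect(long_))
```

Under these assumed numbers, the short-duration case yields a six-period total of 196.875 units, so the carryover is nearly as large as the current effect. A higher retention rate produces the slowly decaying pattern sketched in Figure 24.1B.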
[Figure 24.2 Linear and Nonlinear Response to Advertising. Sales (y-axis) plotted against advertising (x-axis) for three curves: Concave Response, Linear Response, and S-Shaped Response.]

Shape Effect

The shape of the effect refers to the change in sales in response to increasing intensity of advertising in the same time period. The intensity of advertising could be in the form of exposures per unit time and is also called frequency or weight. Figure 24.2 describes varying shapes of advertising response. Note, first, that the x-axis now is the intensity of advertising (in a period), while the y-axis is the response of sales (during the same period). With reference to Figure 24.1, Figure 24.2 charts the height of the bar in Figure 24.1A, as we increase the exposures of advertising.

Figure 24.2 shows three typical shapes: linear, concave (increasing at a decreasing rate), and S-shape. Of these three shapes, the S-shape seems the most plausible. The linear shape is implausible because it implies that sales will increase indefinitely as advertising increases. The concave shape addresses the implausibility of the linear shape. However, the S-shape seems the most plausible because it suggests that at some very low level, advertising might not be effective at all because it gets drowned out in the noise. At some very high level, it might not increase sales because the market is saturated or consumers suffer from tedium with repetitive advertising.

The responsiveness of sales to advertising is the rate of change in sales as we change advertising. It is captured by the slope of the curve in Figure 24.2 or the coefficient of the model used to estimate the curve. This coefficient is generally represented as $\beta$ in Equation (1). Just as we expect the advertising sales curve to follow a certain shape, we also expect this responsiveness of sales to advertising to show certain characteristics. First, the estimated response should preferably be in the form of an elasticity. The elasticity of sales to advertising (also called advertising elasticity, in short) is the percentage change in sales for a 1% change in advertising. So defined, an elasticity is units-free and does not depend on the measures of advertising or of sales. Thus, it is a pure measure of advertising responsiveness whose value can be compared across products, firms, markets, and time. Second, the elasticity should neither always increase with the level of advertising nor be always constant but should show an inverted U-shaped (bell-shaped) pattern in the level of advertising. The reason is the following.
We would expect responsiveness to be low at low levels of advertising because it would be drowned out by the noise in the market. We would expect responsiveness to be low also at very high levels of advertising because of saturation. Thus, we would expect the maximum responsiveness of sales at moderate levels of advertising. It turns out that when sales have an S-shaped response to advertising, the advertising elasticity has this inverted U-shaped (bell-shaped) pattern with respect to advertising. So the model that can capture the S-shaped response would also capture advertising elasticity in its theoretically most appealing form.

Competitive Effects

Advertising normally takes place in free markets. Whenever one brand advertises a successful innovation or successfully uses a new advertising form, other brands quickly imitate it. Competitive advertising tends to increase the noise in the market and thus reduce the effectiveness of any one brand’s advertising. The competitive effect of a target brand’s advertising is its effectiveness relative to that of the other brands in the market. Because most advertising takes place in the presence of competition, trying to understand advertising of a target brand in isolation may be erroneous and lead to biased estimates of the elasticity. The simplest method of capturing advertising response in competition is to measure and model sales and advertising of the target brand relative to all other brands in the market.

In addition to just the noise effect of competitive advertising, the effectiveness of a target brand’s advertising might differ due to its position in the market or its familiarity with consumers. For example, established or larger brands may generally get more mileage than new or smaller brands from the same level of advertising because of the better name recognition and loyalty of the former. This effect is called differential advertising responsiveness due to brand position or brand familiarity.

Dynamic Effects

Dynamic effects are those effects of advertising that change with time. Included under this term are carryover effects discussed earlier and wearin, wearout, and hysteresis discussed here. To understand wearin and wearout, we need to return to Figure 24.2. Note that for the concave and the S-shaped advertising response, sales increase until they reach some peak as advertising intensity increases. This advertising response can be captured in a static context (say, the first week or the average week of a campaign). However, in reality, this response pattern changes as the campaign progresses.

Wearin is the increase in the response of sales to advertising, from one week to the next of a campaign, even though advertising occurs at the same level each week (see Figure 24.3). Figure 24.3 shows time on the x-axis (say in weeks) and sales on the y-axis. It assumes an advertising campaign of 7 weeks, with one exposure per week at approximately the same time each week. Notice a small spike in sales with each exposure. However, these spikes keep increasing during the first 3 weeks of the campaign, even though the advertising level is the same. That is the phenomenon of wearin. Indeed, if it occurs at all, wearin typically occurs at the start of a campaign. It could occur because repetition of a campaign in subsequent periods enables more people to see the ad, talk about it, think about it, and respond to it than would have done so in the very first period of the campaign.

Wearout is the decline in the response of sales to advertising from week to week of a campaign, even though advertising occurs at the same level each week. Wearout typically occurs at the end of a campaign because of consumer tedium. Figure 24.3 shows wearout in the last 3 weeks of the campaign.

Hysteresis is the permanent effect of an advertising exposure that persists even after the pulse is withdrawn or the campaign is stopped (see Figure 24.1D). Typically, this effect does not occur more than once. It occurs because an ad established a dramatic and previously unknown fact, linkage, or relationship. Hysteresis is an unusual effect of advertising that is quite rare.

Content Effects

Content effects are the variation in response to advertising due to variation in the content or creative cues of the ad. This is the most
[Figure 24.3 Wearin and Wearout in Advertising Effectiveness. Sales (y-axis) plotted against time in weeks (x-axis), with base sales, one ad exposure per week, and the periods of advertising wearin and advertising wearout marked.]

important source of variation in advertising responsiveness and the focus of the creative talent in every agency. This topic is studied mainly in the field of consumer behavior using laboratory or theater experiments. However, experimental findings cannot be easily and immediately translated into management practice because they have not been replicated in the field or in real markets. Typically, modelers have captured the response of consumers or markets to advertising measured in the aggregate (in dollars, gross rating points, or exposures) without regard to advertising content. So the challenge for modelers is to include measures of the content of advertising when modeling advertising response in real markets.

Media Effects

Media effects are the differences in advertising response due to various media, such as TV or newspaper, and the programs within them, such as channel for TV or section or story for newspaper.

MODELING ADVERTISING RESPONSE

This section discusses five different models of advertising response, which address one or more of the above effects. Some of these models are applications of generic forms presented in Chapters 12, 13, and 14. The models are presented in the order of increasing complexity. Discussing the strengths and weaknesses of each model should help the reader appreciate its value and the progression to more complex models. By combining one or more models below, a researcher may be able to develop a model that can capture many of the effects listed above. However, that task is achieved at the cost of great complexity. Ideally, an advertising model should
be rich enough to capture all the seven effects discussed above. No one has proposed a model that has done so, though a few have come close.

Basic Linear Model

The basic linear model can capture the first of the effects described above, the current effect. The model takes the following form:

$$Y_t = \alpha + \beta_1 A_t + \beta_2 P_t + \beta_3 R_t + \beta_4 Q_t + \varepsilon_t. \tag{2}$$

Here, $Y$ represents the dependent variable (e.g., sales), while the other capital letters represent variables of the marketing mix, such as advertising ($A$), price ($P$), sales promotion ($R$), or quality ($Q$). The parameters $\alpha$ and $\beta_k$ are coefficients that the researcher wants to estimate. $\beta_k$ represents the effect of the independent variables on the dependent variable, where the subscript $k$ is an index for the independent variables. The subscript $t$ represents various time periods. A section below discusses the problem of the appropriate time interval, but for now, the researcher may think of time as measured in weeks or days. The $\varepsilon_t$ are errors in the estimation of $Y_t$ that we assume to independently and identically follow a normal distribution (IID normal). This assumption means that there is no pattern to the errors so that they constitute just random noise (also called white noise). Our simple model assumes we have multiple observations (over time) for sales, advertising, and the other marketing variables. This model can best be estimated by regression, a simple but powerful statistical tool discussed in Chapter 13. While simple, this model can only capture the first of the seven effects discussed above.

Multiplicative Model

The multiplicative model derives its name from the fact that the independent variables of the marketing mix are multiplied together. Thus,

$$Y_t = \operatorname{Exp}(\alpha) \times A_t^{\beta_1} \times P_t^{\beta_2} \times R_t^{\beta_3} \times Q_t^{\beta_4} \times \varepsilon_t. \tag{3}$$

While this model seems complex, a simple transformation can render it quite simple. In particular, the logarithmic transformation linearizes Equation (3) and renders it similar to Equation (2); thus,

$$\log(Y_t) = \alpha + \beta_1 \log(A_t) + \beta_2 \log(P_t) + \beta_3 \log(R_t) + \beta_4 \log(Q_t) + \log(\varepsilon_t). \tag{4}$$

The main difference between Equation (2) and Equation (4) is that the latter has all variables as the logarithmic transformation of their original state in the former. After this transformation, the error terms in Equation (4), $\log(\varepsilon_t)$, are assumed to be IID normal.

The multiplicative model has many benefits. First, this model implies that the dependent variable is affected by an interaction of the variables of the marketing mix. In other words, the independent variables have a synergistic effect on the dependent variable. In many advertising situations, the variables could indeed interact to have such an impact. For example, higher advertising combined with a price drop may enhance sales more than the sum of the effects of higher advertising and the price drop occurring alone.

Second, Equations (3) and (4) imply that response of sales to any of the independent variables can take on a variety of shapes depending on the value of the coefficient. In other words, the model is flexible enough that it can capture relationships that take a variety of shapes by estimating appropriate values of the response coefficient.

Third, the $\beta$ coefficients not only estimate the effects of the independent variables on the dependent variable, but they are also elasticities. Estimating response in the form of elasticities has a number of advantages listed above.

However, the multiplicative model has three major limitations. First, it cannot estimate the latter five of the seven effects described above. For this purpose, we have to go to other models. Second, the multiplicative model is unable to capture an S-shaped response of sales to advertising. Third, the multiplicative model implies that the elasticity of sales to advertising is constant. In other words, the percentage rate at which sales increase in response to a percentage increase in advertising is the same whatever the level of sales or advertising. This result is quite implausible. We would expect that the percentage increase in sales in response to a percentage increase in advertising would be lower as the firm’s sales or advertising become very large. Equation (4) does not allow such variation in the elasticity of sales to advertising.
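The claim that the coefficient of the log-linear form is a constant elasticity can be checked numerically. The Python sketch below simulates sales from a one-variable version of the multiplicative model (advertising only, with price, promotion, and quality assumed held fixed), estimates the log-linear form by ordinary least squares, and confirms that a 1% rise in advertising implies the same percentage rise in sales at any advertising level. All numbers, including the "true" elasticity of 0.1, and all function names are hypothetical choices made for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the simulated data are reproducible

TRUE_ALPHA = 2.0        # hypothetical intercept
TRUE_ELASTICITY = 0.1   # hypothetical advertising elasticity

# Simulated data: sales = Exp(alpha) * A^beta * multiplicative error.
adv = [random.uniform(10.0, 100.0) for _ in range(500)]
sales = [
    math.exp(TRUE_ALPHA) * a ** TRUE_ELASTICITY * math.exp(random.gauss(0.0, 0.05))
    for a in adv
]


def ols_slope_intercept(x, y):
    """Ordinary least squares for a single regressor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Taking logs of both sides makes the model linear in its coefficients.
log_adv = [math.log(a) for a in adv]
log_sales = [math.log(s) for s in sales]
beta_hat, alpha_hat = ols_slope_intercept(log_adv, log_sales)


def pct_sales_change(adv_level, beta, pct_adv_change=1.0):
    """Percentage change in predicted sales for a given % change in advertising."""
    ratio = ((adv_level * (1 + pct_adv_change / 100)) ** beta) / (adv_level ** beta)
    return 100 * (ratio - 1)


print("estimated elasticity:", round(beta_hat, 3))
print("% sales change at A=10:  ", pct_sales_change(10.0, beta_hat))
print("% sales change at A=1000:", pct_sales_change(1000.0, beta_hat))
```

The two percentage changes printed at the end coincide, which is the constant-elasticity property the text describes as implausible at very high levels of advertising.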
Exponential Attraction and Multinomial Logit Model

Attraction models are based on the premise that market response is the result of the attractive power of a brand relative to that of other brands with which it competes. The attraction model implies that a brand’s share of market sales is a function of its share of total marketing effort; thus,

$$M_i = \frac{S_i}{\sum_j S_j} = \frac{F_i}{\sum_j F_j}. \tag{5}$$

Here, $M_i$ is the market share of the $i$th brand (measured from 0 to 1), $S_i$ is the sales of brand $i$, $\sum_j$ implies a summation of the values of the corresponding variable over all the $j$ brands in the market, and $F_i$ is brand $i$’s marketing effort, that is, the effort expended on the marketing mix (advertising, price, promotion, quality, etc.). Equation (5) has been called Kotler’s fundamental theorem of marketing. Also, the right-hand-side term of Equation (5) has been called the attraction of brand $i$. Attraction models intrinsically capture the effects of competition.

A simple but inaccurate form of the attraction model is the use of the relative form of all variables in Equation (2). So for sales, the researcher would use market share. For advertising, he or she would use share of advertising expenditures or share of gross rating points (share of voice) and so on. While such a model would capture the effects of competition, it would suffer from other problems of the linear model, such as linearity in response. Also, it is inaccurate because the right-hand side would not be exactly the share of marketing effort but the sum of the individual shares of effort on each element of the marketing mix.

A modification of the linear attraction model can resolve the problem of linearity in response and the inaccuracy in specifying the right-hand side of the model plus provide a number of other benefits. This modification expresses the market share of the brand as an exponential attraction of the marketing mix; thus,

$$M_i = \frac{\operatorname{Exp}(V_i)}{\sum_j \operatorname{Exp}(V_j)}, \tag{6}$$

where $M_i$ is the market share of the $i$th brand (measured from 0 to 1), $V_j$ is the marketing effort of the $j$th brand in the market, $\sum_j$ stands for summation over the $j$ brands in the market, Exp stands for exponent, and $V_i$ is the marketing effort of the $i$th brand, expressed as the right-hand side of Equation (2). Thus,

$$V_i = \alpha + \beta_1 A_i + \beta_2 P_i + \beta_3 R_i + \beta_4 Q_i + e_i, \tag{7}$$

where $e_i$ are error terms. By substituting the value of Equation (7) in Equation (6), we get

$$M_i = \frac{\operatorname{Exp}(V_i)}{\sum_j \operatorname{Exp}(V_j)} = \frac{\operatorname{Exp}\left(\sum_k \beta_k X_{ik} + e_i\right)}{\sum_j \operatorname{Exp}\left(\sum_k \beta_k X_{jk} + e_j\right)}, \tag{8}$$

where $X_k$ ($k$ = 0 to $m$) are the $m$ independent variables or elements of the marketing mix, and $\alpha = \beta_0$ and $X_{i0} = 1$. The use of the ratio of exponents in Equations (6) and (8) ensures that market share is an S-shaped function of share of a brand’s marketing effort. As such, it has a number of nice features discussed earlier.

However, Equation (8) also has two limitations. First, it is not easy to interpret because the right-hand side of Equation (8) is in the form of exponents. Second, it is intrinsically nonlinear and difficult to estimate because the denominator of the right-hand side is a sum of the exponent of the marketing effort of each brand summed over each element of the marketing mix. Fortunately, both of these problems can be solved by applying the log-centering transformation to Equation (8) (Cooper & Nakanishi, 1988). After applying this transformation, Equation (8) reduces to

$$\log\left(\frac{M_i}{\bar{M}}\right) = \alpha_i^* + \sum_k \beta_k X_{ik}^* + e_i^*, \tag{9}$$

where the terms with * are the log-centered version of the normal terms; thus, $\alpha_i^* = \alpha_i - \bar{\alpha}$, $X_{ik}^* = X_{ik} - \bar{X}_k$, $e_i^* = e_i - \bar{e}$, for $k$ = 1 to $m$, and the terms with a bar are the geometric means of the normal variables over the $m$ brands in the market.

The log-centering transformation of Equation (8) reduces it to a type of multinomial logit model in Equation (9). The nice feature of this model is that it is relatively simpler, more easily interpreted, and more easily estimated than Equation (8). The right-hand side of Equation (9) is a linear sum of the transformed independent variables. The left-hand side of Equation (9) is a type of logistic transformation of market share and can be interpreted as the log odds of consumers as a whole preferring the
target brand relative to the average brand in the market.

The particular form of the multinomial logit in Equation (9) is aggregate. That is, this form is estimated at the level of market data obtained in the form of market shares of the brand and its share of the marketing effort relative to the other brands in the market. An analogous form of the model can be estimated at the level of an individual consumer’s choices (e.g., Tellis, 1988a). This other form of the model estimates how individual consumers choose among rival brands and is called the multinomial logit model of brand choice (Guadagni & Little, 1983). Chapter 14 covers this choice model in more detail than done here.

The multinomial logit model (Equation (9)) has a number of attractive features that render it superior to any of the models discussed above. First, the model takes into account the competitive context, so that predictions of the model are sum and range constrained, just as are the original data. That is, the predictions of the market share of any brand range between 0 and 1, and the sum of the predictions of all the brands in the market equals 1.

Second, and more important, the functional form of Equation (6) (from which Equation (9) is derived) suggests a characteristic S-shaped curve between market share and any of the independent variables (see Figure 24.2). In the case of advertising, for example, this shape implies that response to advertising is low at levels of advertising that are very low or very high. This characteristic is particularly appealing based on advertising theory. The reason is that very low levels of advertising may not be effective because they get lost in the noise of competing messages. Very high levels of advertising may not be effective because of saturation or diminishing returns to scale. If the estimated lower threshold of the S-shaped relationship does not coincide with 0, this indicates that market share maintains some minimal floor level even when marketing effort declines to zero. We can interpret this minimal floor to be the base loyalty of the brand. Alternatively, we can interpret the level of marketing effort that coincides with the threshold (or first turning point) of the S-shaped curve as the minimum point necessary for consumers or the market to even notice a change in marketing effort.

Third, because of the S-shaped curve of the multinomial logit model, the elasticity of market share to any of the independent variables shows a characteristic bell-shaped relationship with respect to marketing effort. This relationship implies that at very high levels of marketing effort, a 1% increase in marketing effort translates into ever smaller percentage increases in market share. Conversely, at very low levels of marketing effort, a 1% decrease in marketing effort translates into ever smaller percentage decreases in market share. Thus, market share is most responsive to marketing effort at some intermediate level of market share. This pattern is what we would expect intuitively of the relationships between market share and marketing effort.

Despite its many attractions, the exponential attraction or multinomial logit model as defined above does not capture the latter four of the seven effects identified above.

Koyck and Distributed Lag Models

The Koyck model may be considered a simple augmentation of the basic linear model (Equation (2)), which includes the lagged dependent variable as an independent variable. What this specification means is that sales depend on sales of the prior period and all the independent variables that caused prior sales, plus the current values of the same independent variables.

$$Y_t = \alpha + \lambda Y_{t-1} + \beta_1 A_t + \beta_2 P_t + \beta_3 R_t + \beta_4 Q_t + \varepsilon_t. \tag{10}$$

In this model, the current effect of advertising is $\beta_1$, and the carryover effect of advertising is $\beta_1 \lambda / (1 - \lambda)$. The higher the value of $\lambda$, the longer the effect of advertising. The smaller the value of $\lambda$, the shorter the effects of advertising, so that sales depend more on only current advertising. The total effect of advertising is $\beta_1 / (1 - \lambda)$.

While this model looks relatively simple and has some very nice features, its mathematics can be quite complex (Clarke, 1976). Moreover, readers should keep in mind the following limitations of the model. First, this model can capture carryover effects that only decay smoothly and do not have a hump or a nonmonotonic
decay. Second, estimating the carryover of any one variable is quite difficult when there are multiple independent variables, each with its own carryover effect. Third, the level of data aggregation is critical. The estimated duration of the carryover increases or is biased upwards as the level of aggregation increases. A recent paper has proved that the optimal data interval that does not lead to any bias is not the interpurchase time of the category, as commonly believed, but the largest period with at most one exposure and, if it occurs, does so at the same time each period (Tellis & Franses, in press).

The distributed lag model is a model with multiple lagged values of both the dependent variable and the independent variable. Thus,

$$Y_t = \alpha + \lambda_1 Y_{t-1} + \lambda_2 Y_{t-2} + \lambda_3 Y_{t-3} + \ldots + \beta_{10} A_t + \beta_{11} A_{t-1} + \beta_{12} A_{t-2} + \ldots + \beta_2 P_t + \beta_3 R_t + \beta_4 Q_t + \varepsilon_t. \tag{11}$$

This model is very general and can capture a whole range of carryover effects. Indeed, the Koyck model can be considered a special case of the distributed lag model with only one lagged value of the dependent variable. The distributed lag model overcomes two of the problems with the Koyck model. First, it allows for decay functions that are nonmonotonic or hump-shaped (see Figure 24.4). Second, it can partly separate out the carryover effects of different independent variables. However, it also suffers from two limitations. First, there is considerable multicollinearity between lagged and current values of the same variables. Second, because of this problem, estimating how many lagged variables are necessary is difficult and unreliable. Thus, if the researcher has sufficiently extensive data that minimize the latter two problems, then he or she should use the distributed lag model. Otherwise, the Koyck model would be a reasonable approximation.

Hierarchical Models

The remaining effects of advertising that we need to capture (content, media, wearin, and wearout) involve changes in the responsiveness itself of advertising (i.e., the $\beta$ coefficient) due to advertising content, media used, or time of a campaign. These effects can be captured in one of two ways: dummy variable regression or a hierarchical model.

Dummy variable regression is the use of various interaction terms to capture how advertising responsiveness varies by content, media, wearin, or wearout. We illustrate it in the context of a campaign with a few ads. First, suppose the advertising campaign uses only a few different types of ads (say, two). Also, assume we start with the simple regression model of Equation (2). Then we can capture the effects of these different ads by including suitable dummy variables. One simple form is to include a dummy variable for the second ad, plus an interaction effect of advertising times this dummy variable. Thus,

$$Y_t = \alpha + \beta_1 A_t + \delta A_t A_{2t} + \beta_2 P_t + \beta_3 R_t + \beta_4 Q_t + \varepsilon_t, \tag{12}$$

where $A_{2t}$ is a dummy variable that takes on the value of 0 if the first ad is used at time $t$ and the value of 1 if the second ad is used at time $t$. $\delta$ is the effect of the interaction term ($A_t A_{2t}$). In this case, the main coefficient of advertising, $\beta_1$, captures the effect of the first ad, while $\beta_1$ plus the coefficient of the interaction term ($\delta$) captures the effect of the second ad.

While simple, these models quickly become quite complex when we have multiple ads, media, and time periods, especially if these are occurring simultaneously. This is the situation in real markets. The problem can be solved by the use of hierarchical models.

Hierarchical models are multistage models in which coefficients (of advertising) estimated in one stage become the dependent variable in the other stage. The second stage contains the characteristics by which advertising is likely to vary in the first stage, such as ad content, medium, or campaign duration. Consider the following example.

Example

A researcher gathers data about the effect of advertising on sales for a brand of one firm over a 2-year period. The firm advertises the brand using a large number of different ads (or copy content), in campaigns of varying duration (say, 2 to 8 weeks), in a number of different cities or
markets. Assume that the researcher has data at a highly disaggregate level, say the hour of
the day. Such data are possible because of electronic databases such as those recorded by
Internet retailers, telemarketers, or retail firms with scanners. The researcher analyzes the
effect of advertising on sales, separately for each city, campaign (ad), and week of the
campaign. These effects vary substantially across the various estimations of the model. Why do
they vary?

[Figure 24.4 Alternate Shapes of Advertising Carryover. The plot shows sales over time
following advertising exposures. Curve labels: Koyck model (sales = function of lagged sales
and advertising); sales = function of twice-lagged sales; sales = function of twice-lagged
sales, advertising, and lagged advertising.]

The researcher suspects that the variation could be due to varying responsiveness by market,
by campaign, or by week of the campaign. The researcher has information on all three of these
factors (market, campaign, and week of campaign). Then in a second-stage model, the researcher
can analyze how the coefficients of advertising estimated in the first stage vary due to these
three factors. The dependent variable is the coefficients of advertising from the first stage,
and the independent variables are the factors that gave rise to those coefficients. Such a
multistage model is called a hierarchical model (e.g., Chandy, Tellis, MacInnis, & Thaivanich,
2001; Tellis, Chandy, & Thaivanich, 2000).

Two features are essential for hierarchical models. First, we should be able to obtain
multiple estimates of the effects (or coefficient values) of advertising on some dependent
variable such as sales or market share for the same brand across different contexts, such as
at least one of the following: the ad campaign, week of the campaign, market, or medium. Then
we can use the estimates of the effects of advertising from the first stage as dependent
variables in the second stage. Second, as far as possible, we need to minimize excessive
covariation among factors. Thus, a particular ad should not always occur with a particular
channel, or an ad of a particular duration should not always be run in a particular channel.
Such co-occurrence leads to the problem of multicollinearity among the
created variables in the second-stage model (see Chapter 13). As long as the three factors
have sufficient cross-variation, estimates of the second-stage model should be reliable.

Depending on the richness of the data, hierarchical models can estimate the last three effects
of advertising that we identified above. That is, with such models and given suitable data,
the researcher can estimate what ad content is the most effective, what duration of the
campaign is the most effective, and which media are the most effective. The duration of the
campaign could be estimated in terms of weeks. For example, if the effectiveness of the ad
first increases slowly and then decreases suddenly, one could conclude that wearin is slow but
wearout is rapid. On the other hand, if the effectiveness of the ad steadily declines over
time, then there is no wearin, and wearout sets in from the start. Furthermore, if the data
are sufficiently rich and detailed, the researcher can also obtain interaction effects such as
which media are most suitable for particular ads or which ad content needs to be run over
campaigns of long versus short duration.

Note that to address all of the seven effects of advertising identified above, the researcher
would have to use a hierarchical model, which itself contains an exponential attraction or
multinomial logit model with a Koyck-type or distributed lag enhancement. In other words,
suitably integrating the models described above would enable a researcher to address the most
important phenomena associated with advertising. In reality, such fully integrated models that
can capture all the effects of advertising are very complex and require substantial data
(e.g., see Chandy et al., 2001). If researchers want to focus on only a few effects or their
data are not rich, they might want to simplify the model they use to focus on only the most
important effects.

PATTERNS AND MODELS OF PRICE RESPONSE

The first four effects of advertising response also apply to price: current, shape,
competition, and carryover effects. The current effect of price is the change in sales that
occurs in the same period as that in which prices change. In contrast to response to
advertising, response to price is typically strong and immediate, with most of the effect
occurring in the current period (Sethuraman & Tellis, 1991).

However, price changes can also have carryover effects. These effects could occur because
consumers take time to learn of the price change, wait to respond until their next shopping
trip, or wait to respond because of their current inventory. Typically, carryover effects are
less pronounced for price than for advertising. One type of carryover is the dip in sales
following a price cut, which occurs because consumers buy excess stock during the discount and
then hold back regular purchases until they deplete their stocks.

The exponential attraction or multinomial logit model specified for advertising response also
serves very well to capture S-shaped response and competitive effects, if any, in response to
pricing. In addition, the integration of these models with a Koyck or distributed lag
specification can capture any carryover effects that may exist in response to pricing.

In addition, response to price has three more effects that are unique to price: promotional
price effect, reference price effect, and price interaction effect. To capture the three
effects, the researcher has merely to modify the linear, multinomial logit, or distributed lag
model by including relevant independent variables. The basic structure of the model need not
change. Thus, in the interests of parsimony, here we discuss only the unique effects of
pricing and how modifications of the classic models discussed above can capture these effects.

Modeling Promotion Price Effect

A pervasive feature of pricing in contemporary markets is that prices are constantly in flux.
Retailers have a certain list price, and frequently, for a variety of reasons, they offer
discounts or “sales” from these prices (Tellis, 1986). Thus, pricing strategies have two
components: (1) a list price component, which is basically how a brand is listed on price
relative to other brands, and (2) a promotion price component, which basically involves a
temporary discount off this list price. So, models of response to pricing should contain both
of these components to correctly specify and fully capture all the effects of price.
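
The two price components named above can be separated from a raw price series before they enter a response model. The following is only a minimal sketch on made-up weekly prices. It assumes that the list price is taken as the most frequent price over the horizon, with ties resolved toward the higher price, and that the discount is the list price minus the price actually charged; the function name `decompose_prices` and the data are hypothetical.

```python
from collections import Counter


def decompose_prices(prices):
    """Split a price series into one list price and per-period discounts.

    Assumption for illustration: the list price is the most frequent
    (modal) price over the horizon; ties go to the higher price.
    """
    counts = Counter(prices)
    top = max(counts.values())
    list_price = max(p for p, n in counts.items() if n == top)
    # Discount in each period = list price minus the price actually charged.
    discounts = [round(list_price - p, 2) for p in prices]
    return list_price, discounts


# Hypothetical weekly shelf prices for one brand in one store.
weekly_prices = [2.99, 2.99, 2.49, 2.99, 2.99, 1.99, 2.99, 2.99]
list_price, discounts = decompose_prices(weekly_prices)
print(list_price)   # the regular price level
print(discounts)    # zero in regular weeks, positive in promotion weeks
```

The two resulting series (a constant list price and a mostly zero discount) would then enter the response model as separate independent variables, computed by the same rule for each competing brand.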

Assume one has chosen the multinomial logit model discussed above. Then, to fully capture the
promotional price effect, one would use two independent variables for price, instead of only
one: One variable would represent the list price of the brand; the other variable would
represent the promotion price of the brand. The key question would be how to measure the list
and promotion price.

In markets today, firms generally keep the list price of the brand high for an extended period
of time but occasionally drop its price by offering a sale or discount (Tellis, 1998). Thus,
one can define and capture the list price as the high modal price of the brand over a given
time horizon (see Figure 24.5). One can define the promotional price or discount of the brand
as the list price minus the actual price charged or paid in a particular time period within
that horizon. One would use the same rules to compute the list and promotional prices for
competing brands.

[Figure 24.5 Price Path of One Brand in One Store Over Time. The plot shows price against
time: a list price level with occasional discounts below it.]

The estimated coefficients (elasticities) of these variables would then reflect the response
of markets to these respective variables. By their definition, the effect of the list price
would generally be negative. That is, the higher the list price of the brand, the lower its
sales or market share. The effect of the promotional price would generally be positive. That
is, the steeper the promotional discount of the brand, the higher its sales or market share.

Modeling Reference Price Effect

Reference prices are latent internal norms that consumers use as a basis against which to
compare current prices (Tellis, 1998; Winer, 1986). Reference prices are not observed and
cannot be ascertained by survey because of the problem of demand bias. Even if they did not
exist, consumers would be tempted to answer in the affirmative about them just to please the
researcher. The best way to test for reference prices is by the prediction of behavior with
and without reference prices. For example, a researcher can ascertain a model’s improvement in
fit with the data, if any, from the inclusion of terms that capture reference price.

Current research suggests at least two components of reference price (Rajendran & Tellis,
1994): first, a temporal or internal reference price based on memory that probably develops in
response to past prices a consumer has paid and, second, an external or contextual reference
price based on visible prices that probably relates to the prices of other competing brands
available to the consumer at the time of purchase. A complete model of response to pricing
should capture these effects of reference price. Any of the models discussed above can account
for reference price effects by including independent variables for these effects. In effect,
instead of a single variable for price, the researcher
would include a variable for the temporal reference price minus price paid, plus another
variable for the contextual reference price minus price paid. The next problem is how exactly
to measure these reference prices.

To measure the contextual reference price of a target brand, the researcher could use any one
of the following: the average of the other brands’ prices, the lowest among the other brands’
prices, or the price of the leading rival brand (Rajendran & Tellis, 1994). The other or rival
brands being considered in this case are those with which the target brand is available. Which
of these three prices a researcher uses depends on which price is most salient to consumers
when they make decisions based on price. In the absence of a strong theory about this issue, a
researcher would try out each of these three reference prices and use the one that gives the
best fit with the data.

To capture the temporal reference price, the researcher would use some moving average of past
prices that the consumer has paid for the target brand. Instead of a simple average, some
researchers advocate a weighted moving average of past prices. The key issue here is, how does
one estimate the weights and the number of prior periods that should be included in the
definition? The current thinking is that one should fit a time-series model that best captures
the string of past prices for a brand (Winer, 1986). The logic for this thinking is that the
prices that can best be predicted are those that a consumer is most likely to be able to
recollect and respond to. However, there is no absolute rule that any one measure of past
prices is the best for the temporal reference price component. In effect, a researcher would
use the measure that he or she finds to fit the data best.

Modeling Promotional and Reference Price Effects Jointly

A model can get quite unwieldy if one attempts to capture both promotional and reference price
effects and, for each of these, capture both temporal and contextual components. Fortunately,
reference price effects are probably related to promotional price effects. In particular, list
prices are more likely to need a contextual or external reference price. The reason is that
list prices do not change much over time, so consumers probably form them from the list prices
of competing brands at the point of purchase. On the other hand, promotional prices are more
likely to be compared to a temporal or internal reference price because they vary over time
and depend on consumer memory and experience of these prices.

Thus, despite many pricing effects, a researcher might capture most of these effects
parsimoniously with just two independent variables for price. The first variable would be the
reference list price minus the actual list price. This term would capture the effect of list
prices relative to contextual reference prices. The second term would be the temporal
reference discount minus the discount actually obtained. This term would capture the effect of
discounts with regard to temporal reference prices. The discount itself is the list price
minus the actual price paid in any one period.

Modeling Interaction Effects

Often, marketing variables affect consumers synergistically. That is, the effect of two of
them together is greater than the sum of the effect of each of them separately. We refer to
this synergistic effect as an interaction effect. One might argue that the whole concept of
the marketing mix is that these variables do not act alone but have some joint effect that is
much greater than the sum of the parts. The general way in which response models capture
interaction effects is by including an additional term that is formed by the product of the
two variables that interact. For example, if the researcher believes that advertising would be
more effective during the time of a discount, the researcher would include a new independent
variable formed from the multiplication of advertising and discounts.

When one already has a large number of independent variables, some of which have multiple
components (such as lagged values of advertising or temporal and contextual reference prices),
then testing out all sorts of interactions can get quite complex. What is needed is a model
that can do so parsimoniously. Some of the past models may do so under certain assumptions.

Consider the multiplicative model in Equation (3). This model in its original form (with all
the variables measured naturally)
implies that sales result from the multiplicative mix of the independent variables. In other
words, it assumes that sales result from the interaction of the marketing mix. However, in its
logarithmic form (after taking logs of all the variables) in Equation (4), which is used to
linearize and estimate the model, it no longer contains interaction effects. So if theory
suggests that the interaction effects hold in the natural state of the variables but not in
their logarithmic state, then the multiplicative model serves as a parsimonious means of
capturing those interaction effects. Alternatively, if the researcher believes that the
interaction effects persist even after taking logs of the natural variables or if the
researcher is not sure, he or she could just run a model that includes additional interaction
terms of the logs of the marketing variables suspected to have interaction effects.

If the researcher has reason to believe that a strong interaction effect exists between some
variables and the researcher is using a model other than the multiplicative model, then he or
she is best advised to model the interaction effect explicitly. This modeling can be achieved
by including an additional independent variable formed by multiplication of those variables
that the researcher assumes interact with each other.

A PARTIALLY INTEGRATED HIERARCHICAL MODEL FOR AD RESPONSE

No researcher has published a model that captures all of the seven characteristics of
marketing-mix models. However, a recent example published in two studies by a team of four
authors shows how one could capture all of these effects except competition. Now, many readers
will argue that competition is pervasive in markets today and is the most important dimension
to capture. However, in this particular example, competitors were not present. Also,
advertising was the only element of the marketing mix that the firm used. Given these two
caveats, the authors were able to integrate the other six desirable characteristics of
marketing models quite nicely.

This example is due to a study done by Tellis et al. (2000) and Chandy et al. (2001). The
researchers have referrals (sales) and TV advertising data for a referral service over several
years across more than 30 cities. In each city, the service provider can draw from a bank of
about 70 creatives developed over the years. Fortunately, the firm uses different creatives in
different cities, in each of which the firm has operated for a varying length of time. The
researchers were able to describe the differences in those creatives by a set of key
characteristics, such as the use of emotion, argument, endorser, certain types of copy, and so
on. They were also able to calibrate differences in the various cities by the age of the
market at the time the ad was aired.

Given this scenario, a first-stage model could explain what effects each creative has in each
city. Then, a second-stage model can explain how those effects vary by type of creative and
type of city. This is a hierarchical model. We now proceed to describe the equations in each
stage.

Stage 1: Estimating Response to Ads (Creatives)

The authors began with a distributed lag model such as that in Equation (11). The authors then
included a dummy variable in the model for the presence or absence of each creative. The
coefficient of this variable determines how the effect of advertising varies from the common
effect captured in Equation (11) due to the use of a particular creative and the age of the
market at that time. The authors also included many control variables to account for other
differences, such as hour of the day and day of the week when the sales occurred, the station
and day-part (morning or evening) in which the ad aired, and whether the service was open.

The first-stage model is

$$
R = \alpha + (R_{-l}\lambda + A\beta_A + C\beta_c + S\beta_S + SH\beta_{SH} + HD\beta_{HD} + AM\beta_M)\,O + \varepsilon_t, \tag{13}
$$

where

$R$ = a vector of referrals by hour,
$R_{-l}$ = a matrix of lagged referrals by hour,
$A$ = a matrix of current and lagged ads by hour,
$C$ = a matrix of dummy variables indicating whether a creative is used in each hour,¹
$S$ = a matrix of current and lagged ads in each TV station by hour,
$AM$ = a matrix of current and lagged morning ads by hour,
$H$ = a matrix of dummy variables for time of day by hour,
$D$ = a matrix of dummy variables for day of week by hour,
$O$ = a vector of dummies recording whether the service is open by hour,
$\alpha$ = constant term to be estimated,
$\lambda$ = a vector of coefficients to be estimated for lagged referrals,
$\beta_i$ = vectors of coefficients to be estimated, and
$\varepsilon_t$ = a vector of error terms initially assumed to be IID normal.

Note that in this study, the authors are able to capture many of the key effects of
advertising. For example, $\beta_A$ captures the main effect of advertising by hour of the
day. A combination of $\lambda$ and $\beta_A$ captures the carryover effect of advertising.
$\beta_c$ captures the effects of various creatives that were used, plus the main effects of
advertising by hour of the day. $\beta_S$ captures the effect of the various media (TV
stations) that were used.

Note that the authors included the creatives as dummy variables in Equation (13), indicating
whether a creative is used in a particular market. They chose to drop the creatives that had
an average effectiveness and to include only those that were significantly above or below the
average. Thus, the coefficient of a creative in Equation (13) represents the increase or
decrease in expected referrals due to that creative, relative to the average of creatives in
that particular market. This specification had the most practical relevance. Managers are not
much interested in a global optimization of the best mix of creatives. Rather, they are
interested in making improvements over their strategy in the previous year. For this reason,
they seek analyses that highlight the best creatives (to use more often) or the worst
creatives (to drop).

The results showed that although advertising has small effects, these effects varied
dramatically by type of ad and TV channel. Thus, managers could drop the least effective ads
and TV channels and spend more on the most effective ads and TV channels. The detailed data and
specification of the model revealed a number of other interesting phenomena about how
advertising’s effects vary and decay by time of the day and day of the week.

Stage 2: Explaining Effectiveness of Ad Response by Type of Creative

In the second stage, the authors collected the coefficients ($\beta_c$) for each creative for
each market ($m$) in which it is used and explained their variation as a function of creative
characteristics and the age of the market in which it ran as follows:

$$
\begin{aligned}
\beta_{c,m} ={}& \varphi_1 \text{Argument}_c + \varphi_2 (\text{Argument}_c \times \text{Age}_m)
+ \varphi_3 \text{Emotion}_c + \varphi_4 (\text{Emotion}_c \times \text{Age}_m) \\
&+ \varphi_5\, \text{800 Visible}_c + \varphi_6 (\text{800 Visible}_c \times \text{Age}_m)
+ \varphi_7 \text{Negative}_c + \varphi_8 (\text{Negative}_c \times \text{Age}_m) \\
&+ \varphi_9 \text{Positive}_c + \varphi_{10} (\text{Positive}_c \times \text{Age}_m)
+ \varphi_{11} \text{Expert}_c + \varphi_{12} (\text{Expert}_c \times \text{Age}_m) \\
&+ \varphi_{13} \text{Nonexpert}_c + \varphi_{14} (\text{Nonexpert}_c \times \text{Age}_m)
+ \varphi_{15} \text{Age}_m + \varphi_{16} (\text{Age}_m)^2 + \Gamma\, \text{Market} + v,
\end{aligned} \tag{14}
$$

where

$\beta_{c,m}$ = coefficients of creative $c$ in market $m$ from Equation (13),
Age = market age (number of weeks since the inception of service in the market),
Market = matrix of market dummies,
$\Gamma$ = vector of market coefficients,
$v$ = vector of errors,
$\varphi$ = second-stage coefficients to be estimated, and
other variables are as defined in Equation (4).

The characteristics of creatives that were particularly important were the use of argument,
emotion, expert endorsers, visibility of the brand name, negative versus positive arguments,
and expert versus nonexpert endorsers. The authors’ most important finding was that emotional
appeals were effective in mature markets while argument appeals were effective in new markets.
Furthermore, a nonlinear regression of the effectiveness of ads on the age of the creatives
enabled the researchers to assess the effects of wearin and wearout. They found that ads have
no wearin period, and wearout starts from the very first week of the campaign and is steepest
in the first few weeks. Thus, frequently changing campaigns and developing new campaigns would
be very useful. When developing new campaigns, using appeals that were the most effective for
the age of the market would be highly advisable.
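
The two-stage logic described above can be sketched on simulated data. This is not the authors' specification or their data: it is a deliberately small illustration, assuming a single creative characteristic (emotion), a linear market-age term, and a plain least-squares fit in each stage. All names and numbers (`phi_true`, `first_stage`, the market ages, the noise level) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical second-stage truth: intercept, emotion, emotion x age, age.
# The positive interaction mimics emotional appeals gaining in older markets.
phi_true = np.array([0.5, -0.2, 0.002, -0.001])

emotion_flags = [0, 1]             # one creative without, one with emotion
market_ages = [10, 50, 100, 200]   # weeks since the service entered a market


def first_stage(ads, referrals):
    """Stage 1: least-squares slope of referrals on ads for one context."""
    X = np.column_stack([np.ones_like(ads), ads])
    coef, *_ = np.linalg.lstsq(X, referrals, rcond=None)
    return coef[1]


rows, betas = [], []
for emotion in emotion_flags:
    for age in market_ages:
        z = np.array([1.0, emotion, emotion * age, age])
        beta = z @ phi_true                    # true ad effect in this context
        ads = rng.uniform(0, 5, size=200)      # simulated hourly ad exposures
        referrals = 3.0 + beta * ads + rng.normal(0, 0.1, size=200)
        betas.append(first_stage(ads, referrals))
        rows.append(z)

# Stage 2: regress the estimated ad coefficients on creative and market traits.
Z = np.array(rows)
phi_hat, *_ = np.linalg.lstsq(Z, np.array(betas), rcond=None)
print(phi_hat)
```

In this sketch, the first stage yields one advertising coefficient per creative and market, and the second stage treats those coefficients as the dependent variable. A fuller version would add the lagged terms, control variables, and market dummies of Equations (13) and (14), and would need enough cross-variation among factors to avoid the multicollinearity problem noted earlier.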

CONCLUSION

Planning the marketing mix is a central task in marketing management. Prudent planning
requires that marketing managers take into account how markets have responded to the marketing
mix in the past. The underlying assumption is not that the past predicts the future with
certainty but that it contains valuable lessons that might enlighten the future.

The econometrics of response modeling describes how a researcher should model response to the
marketing mix so as to capture the most important effects validly. This chapter provides an
overview of the essential issues and principles in this area. It first describes the important
effects that occur in markets today. It then discusses the strengths and limitations of
various models that capture those effects.

The chapter focuses on two elements of the marketing mix: advertising and pricing. The reason
for this focus is that these variables are the most commonly managed and analyzed and
encompass a wide range of response patterns. Understanding how to model response to these two
variables should provide researchers with the essential tools to model response to other
elements of the marketing mix. The chapter provides references to articles and chapters of
this book that provide further details on these issues.

NOTE

1. We use C to refer to the matrix of creatives here and c to refer to individual creatives
later in the chapter.

REFERENCES

Chandy, R., Tellis, G. J., MacInnis, D., & Thaivanich, P. (2001). What to say when:
Advertising appeals in evolving markets. Journal of Marketing Research, 38, 399–414.

Clarke, D. G. (1976). Econometric measurement of the duration of advertising effect on sales.
Journal of Marketing Research, 13, 345–357.

Cooper, L. G., & Nakanishi, M. (1988). Market share analysis. Norwell, MA: Kluwer.

Guadagni, P., & Little, J. D. C. (1983). A logit model of brand choice calibrated on scanner
data. Marketing Science, 2, 203–238.

McCarthy, J. (1996). Basic marketing: A managerial approach (12th ed.). Homewood, IL: Irwin.

Rajendran, K. N., & Tellis, G. J. (1994). Is reference price based on context or experience?
An analysis of consumers’ brand choices. Journal of Marketing, 58, 10–22.

Sethuraman, R., & Tellis, G. J. (1991). An analysis of the tradeoff between advertising and
pricing. Journal of Marketing Research, 31, 160–174.

Tellis, G. J. (1986). Beyond the many faces of price: An integration of pricing strategies.
Journal of Marketing, 50, 146–160.

Tellis, G. J. (1988a). Advertising exposure, loyalty and brand purchase: A two-stage model of
choice. Journal of Marketing Research, 15, 134–144.

Tellis, G. J. (1988b). The price sensitivity of competitive demand: A meta-analysis of sales
response models. Journal of Marketing Research, 15, 331–341.

Tellis, G. J. (1989). Interpreting advertising and price elasticities. Journal of Advertising
Research, 29(4), 40–43.

Tellis, G. J. (1998). Advertising and sales promotion strategy. Reading, MA: Addison-Wesley.

Tellis, G. J. (2004). Effective advertising: How, when, and why advertising works. Thousand
Oaks, CA: Sage.

Tellis, G. J., Chandy, R., & Thaivanich, P. (2000). Decomposing the effects of direct
advertising: Which brand works, when, where, and how long? Journal of Marketing Research, 37,
32–46.

Tellis, G. J., & Franses, P. H. (in press). The optimal data interval for econometric models
of advertising. Marketing Science.

Tellis, G. J., & Weiss, D. (1995). Does TV advertising really affect sales? Journal of
Advertising, 24(3), 1–12.

Tellis, G. J., & Zufryden, F. (1995). Cracking the retailer’s decision problem: Which brand to
discount, how much, when and why? Marketing Science, 14(3), 271–299.

Winer, R. (1986). A reference price model for demand of frequently purchased goods. Journal of
Consumer Research, 13, 250–256.