
Summary Advanced Marketing Research

Session 1:
Decompositional view (how one sees a product): a product is characterized by many attributes
(= features/characteristics of the product), and one needs to anticipate which attributes consumers
want. An attribute is characterized by several levels (= the different possibilities per attribute). To
know what consumers want, it is sufficient to know what value consumers put on each level.
Then you could produce the best product.

Utility: if the utility of product A is higher than the utility of product B, a consumer will buy product A.
To determine the utility of a product, simply add up the utilities of all levels of that product.

Conjoint analysis (conjoint = to join together)


A product is composed by joining different attributes together. Conjoint analysis is a quantitative
market research technique that asks respondents to rank, rate or choose among multiple
products/services. Compared to choice-based conjoint analysis, ranking/rating-based conjoint
analysis does not show whether the consumer will actually buy a product. However, the methods to
analyze choice-based conjoint data are more complex. The managerial goal of conjoint analysis
is to unveil the relative importance of different attributes of products/services, i.e. to determine
how individuals make trade-offs between the attributes of a product.
Examples of managerial goals of conjoint analysis:
- Optimize product lines and estimate demand under different scenarios
- Estimate price elasticity: see how changes in X (price) affect Y (choice)
- Measure brand strength: see how changes in X (brand) affect Y (choice); by changing the attribute brand, one can determine the sensitivity towards this attribute
- Segment the market according to customer needs: look at the extent to which consumers make the same choices, then segment the market based on preferences
- Understand the importance of product attributes
- Understand product preferences
Main outcomes of conjoint analysis:
- Importance, utilities: which product attribute levels influence the purchase decisions?
- Preference shares: how do customers choose between different products in specific market situations? Do we observe different customer segments?
- Market simulation: how should product attributes be changed to compete with the products available on the market?

| Type | Task | Independent variables | Dependent variable | Method |
|---|---|---|---|---|
| Ranking-based | Choose the most-preferred product, then the second most-preferred, and so on until the least-preferred product | Attributes (numerical or categorical) | Ranking | Linear regression |
| Rating-based | Give a score to each product in turn | Attributes (numerical or categorical) | Rating | Linear regression |
| Choice-based | Choose the most-preferred product only | Attributes (numerical or categorical) | Choice (2 choices: binary dummy variable; more: categorical) | (Multinomial) logit model |

Why conjoint analysis? (advantages)

- People will always choose the best. In direct surveys, respondents might say they consider all attributes important, which is not informative. To get rid of this, let consumers choose between different products with different attributes.
- Conjoint analysis enforces trade-offs between attributes. All attributes are evaluated at once: respondents evaluate complete products with both strong and weak attributes.
- Conjoint reduces the problem of socially desirable answers, because you don't ask respondents which specific attributes they want, but offer them products to choose from.
- Conjoint is more realistic, because in real life consumers evaluate products and not isolated attributes.
- Conjoint analysis is straightforward, because suitable software is available.

Disadvantages of conjoint analysis:

- Typically, only a small number of attributes can be included, because respondents need to process multiple attributes simultaneously and cannot assess all attributes at one time. Therefore, a relevant subset needs to be selected (limitation).
- Assumes high-involvement search: knowing the attribute levels is usually only possible after search effort. Answers might be biased because consumers are not experts.

Rating-based conjoint. Example: evaluate hypothetical cameras on a 10-point scale:

- 8 stimuli
- 3 attributes
- each attribute has 2 levels

Possible stimuli: 2³ = 8

Estimate the preferences with a linear regression model (Ordinary Least Squares):

rating_i = β0 + β1·x_i1 + β2·x_i2 + β3·x_i3 + ε_i

We are interested in β1, β2, β3: the part-worths in the regression, which show the importance of the attributes. In this case we have only 8 observations (per individual) to estimate 4 parameters (β0, β1, β2 and β3), which will not give a reliable result.

How to interpret the coefficients? The attribute zoom (0 if zoom is 3x, 1 if zoom is 5x) has coefficient 1.5. So the expected rating for a camera with zoom 5x will be 1.5 higher than for a camera with zoom 3x (ceteris paribus). In other words: if zoom is 5x, the rating increases compared to a zoom of 3x, since the coefficient is positive. However, the p-value is 0.086, so the result is not significant at the 5% level, but it is significant at the 10% level, which is quite good for such a small sample. If a result is insignificant, consumers don't care about the corresponding attribute.
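
A minimal sketch of this estimation step (the design is the 2³ full factorial from the example; the ratings below are hypothetical, not the actual course data):

```python
import numpy as np

# 2^3 full factorial design: resolution, zoom, screen (0 = low level, 1 = high level)
X = np.array([[r, z, s] for r in (0, 1) for z in (0, 1) for s in (0, 1)], dtype=float)
ratings = np.array([4.5, 5.5, 6.0, 7.0, 5.5, 6.5, 7.5, 8.0])  # hypothetical respondent

# Add an intercept column and solve the least-squares problem
X1 = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X1, ratings, rcond=None)
print(dict(zip(["intercept", "resolution", "zoom", "screen"], beta.round(2))))
```

With only 8 observations for 4 parameters, the estimates are exactly as fragile as the text describes.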
Problems of rating-based conjoint:
- Unreliable part-worths if the model is estimated at the individual level
- Not clear whether spread in ratings is due to real preference or due to response style
- Implications for sales levels and market shares are not clear

Choice-based conjoint
Repeatedly ask consumers to choose between options and record the choices. Sometimes a no-choice
option is added (relevant for market share). The choices made are recorded for every customer
during n tasks (n can be any number). One task represents one choice set. As a different combination
of attribute levels is used in every choice set, we can derive the relative importance and effect
of each attribute on choice (preferences = attribute part-worths).
As the conjoint exercise is repeated across many customers, it can be detected whether different
customers have different preferences (customer-specific preferences).

Estimate the preferences with a multinomial logit regression model (Maximum Likelihood).

We are again interested in β1, β2, β3: the part-worths in the regression, which show the importance of the attributes. The number of observations available to estimate the part-worths of one customer depends on the number of choice sets.

Advantages of choice-based conjoint:

- Trade-offs are enforced even more: you HAVE to choose, while in rating-based conjoint you can give the same rating if you like products equally much
- Realistic: the choice setting mimics real life
- Accommodates a no-choice option: sales proxy
- Avoids the need for ad-hoc rules to predict market shares
- No subjective scaling: you don't have the problem of comparing the results of someone who only gives high ratings to someone who only gives low ratings; in other words, you don't have the problem that someone perceives a rating of 7 as low while others perceive 7 as high
- Choice is cognitively less demanding than rating

Disadvantages of choice-based conjoint:

- Observations are less informative
o More observations needed for the same accuracy
o Part-worths are not estimated at the individual level
- Implementation is more complicated than traditional conjoint analysis

Introduction to Latent-Class analysis:
A latent class is a group of customers who share similar preferences for product attributes.
Latent class = customer segment
We do not observe which customers belong to the same class, hence we call it latent. However,
we can estimate customer preferences and similarities in preferences across customers using
conjoint data. Customers with similar part-worths will be assigned to the same class/segment.

By grouping consumers based on part-worths, you have to estimate fewer parameters (per segment instead of per individual). Individual analysis and an individual strategy would be optimal; however, this is too expensive and mostly infeasible. Therefore, segmentation is used.
Using latent-class analysis, we can:
- Determine how many customer segments the market contains
- Determine the size of each customer segment
- Determine the specific preferences per segment
With this information, we can:
- Segment the market of customers (S)
- Choose which segment a company wants to target (T)
- Position the product towards this customer segment (P)

There are three levels of analysis


| Level | Pros/cons |
|---|---|
| Individual level (different part-worths for each respondent) | + Realistic, as respondents have different preferences; − Unreliable due to small number of observations per respondent |
| Aggregate level (same part-worths for all respondents) | − Assuming the same preferences may give misleading results; + High precision, as all respondents are combined |
| Segment level (different part-worths for different segments) | + Realistic, as segments take into account different preferences; + High precision if all respondents are used in one big analysis |

Why segment the market?

- To deal with the structure of heterogeneity in consumer needs and wants
- Better matching of products to customers
- To find homogeneous clusters of consumers who are likely to exhibit similar responses to marketing efforts
- Reduced costs of marketing (economies of scale)
- To focus on the best segments

How does it work? All people who answer the questions form the population. Compute the distance
between every pair of objects (Euclidean distance). Two people who are very close have the same preferences. Put
people with the same preferences in a segment. Minimize the distance within segments while maximizing the
distance between segments.

Why not cluster analysis?

- Requires individual-level part-worths, which need to be estimated for each respondent
- Unreliable part-worths = unreliable segments
- Uncertainty with respect to segment membership of respondents cannot be assessed
- Cluster analysis has little statistical theory
- (Many variants exist; "the" cluster analysis does not exist)

New technique: latent-class segmentation

- Estimates segments and part-worths simultaneously, using all respondents together
- High precision
- Assumes that each respondent belongs to exactly one segment, but membership is uncertain
- Able to make probability statements
- Strong statistical foundation (there is only one latent-class technique, even though determining the number of segments is still subjective)
- Latent-class segmentation was the clear winner in our extensive testing of clustering approaches. We don't use traditional clustering techniques anymore.

Session 2: conjoint analysis


Goal of conjoint analysis: understanding and predicting customer trade-offs and preferences. It
originated in psychology, economics and marketing and is based on Random Utility Theory. Utility
depends on the value customers attach to the attributes. Consumers make trade-offs, which
determine their preferences. This is also about the perception of the attributes (e.g. the perception of
the brand Apple); as a marketer you can influence this perception.
In Random Utility Theory there is no interaction effect: the utility of the product is just the sum of the
utilities of the attributes.
Utilities are customer specific, while the product's objective attributes are the same for everyone.

Random Utility Theory

Consumers try to choose the alternatives that they like best, subject to constraints such as income
and time. Sometimes they don't choose what they like best, because some randomness is involved
(there is some noise around customer choices). Therefore, true utility ≠ observed utility.

U_i = V_i + e_i (unobservable true utility = observable, systematic utility + random component)

Probability of choosing product i from the choice set C (j = all products in the choice set):

P(i | C) = P(U_i ≥ U_j for all j in C)

With the (holistic) utility function V_i and extreme-value distributed error terms, this leads to the logit form:

P(i | C) = exp(V_i) / Σ_j exp(V_j)

Example in class: iPad Air

Product: iPad Air, with attributes memory (x_i1), speed (x_i2), weight (x_i3), size (x_i4), price (x_i5) and brand name (x_i6). The perceived product (same attribute set, but perceived slightly differently per customer) maps into utilities u_i1, ..., u_i6; the holistic utility V_i = u_i1 + u_i2 + u_i3 + u_i4 + u_i5 + u_i6 drives the product choice probability P(buying).

Holistic utility: V_i = u_i1 + u_i2 + u_i3 + u_i4 + u_i5 + u_i6, where u_i1 = β1 · x_i1, with
x_i1 = memory (16gb, 32gb or 64gb), coded via dummies: d_i11 = 1 if 16gb, 0 otherwise; d_i12 = 1 if 32gb, 0 otherwise; 64gb is the base level.
If both dummies are 0, the intercept includes the effect of the base level. Thus β
measures the effect of deviating from the benchmark. Mostly, pick the lowest level as base level for
easier interpretation. Then: u_i1 = β11 · d_i11 + β12 · d_i12
So: V_i = β′x_i, where β = (β0, β1, β2, β3, β4, β5, β6) and x_i = (1, x_i1, x_i2, x_i3, x_i4, x_i5, x_i6)
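
A small sketch of this dummy coding (the part-worth numbers are assumed for illustration only):

```python
# Dummy coding for the 3-level memory attribute; 64gb is the base level,
# whose effect is absorbed by the intercept.
def memory_dummies(level):
    return {"16gb": (1, 0), "32gb": (0, 1), "64gb": (0, 0)}[level]

beta11, beta12 = -0.8, -0.3              # assumed part-worths of 16gb and 32gb vs 64gb
d11, d12 = memory_dummies("32gb")
u_memory = beta11 * d11 + beta12 * d12   # memory's contribution to V_i
print(u_memory)                          # -0.3: effect of 32gb relative to 64gb
```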

Conjoint vs direct questionnaire (a direct questionnaire might be useful before you do conjoint):

| Direct questionnaire | Conjoint analysis |
|---|---|
| + simple: #attributes can be large | − complex: #attributes is small |
| + no high involvement assumed | − high involvement assumed |
| − everything is important | + forced to make trade-offs |
| − not realistic: isolated attributes | + more realistic: hypothetical products |
| − explicit: social answers | + implicit: less social answers |
| − subjective ranges of levels | + predefined ranges of levels |

Steps in conjoint analysis (same for all types)

Step 1: determine the type of study


- Ranking-based conjoint (seventies, early 80s)
o Dependent variable: ordinal responses
o Method of estimation: monotonic or linear regression
- Rating-based conjoint (eighties, early 90s)
o Dependent variable: rating responses
o Method of estimation: linear regression (OLS)
- Choice-based conjoint (nineties, new millennium)
o Dependent variable: qualitative responses (choices)
o Method of estimation: maximum likelihood (conditional logit)

Rating or choices?

| Rating-based conjoint | Choice-based conjoint |
|---|---|
| Some trade-offs are made | Trade-offs are enforced even more |
| Rating products is not common in daily life | Realistic: the choice setting mimics real life |
| All products are considered | Accommodates a no-choice option (sales proxy) |
| Implications for sales/market share are not clear (you get utilities, but these are difficult to translate into choices) | Avoids the need for ad-hoc rules to predict market shares (since it is about choosing instead of attitude) |
| Not clear whether spread in ratings is due to real preferences or due to response style | No subjective scaling |
| Ratings are informative of the intensities of preference | Choice is cognitively less demanding than rating |
| Easy to implement and estimate (also fewer respondents needed to evaluate more profiles) | Observations are less informative |
| (If you want to measure the intensity of preference, you'd better use rating-based) | Implementation is more complicated |
Step 2: choosing attributes (which & how many)
Attributes in conjoint analysis should:
- be relevant and feasible for the management (discuss with them!)
- have varying levels in real life (not "4 wheels" for a car); don't vary attributes that don't have varying levels in real life: if a variable does not change, its effect cannot be measured
- be expected to influence preferences (theory, qualitative research)
- be clearly defined and communicable (the respondent should understand them correctly, e.g. verbal descriptions, pictures, an intro movie); if not easy to understand, add pictures
- preferably not exhibit strong correlations (as price and brand name often do); try to avoid this because of the resulting multicollinearity

The number of attributes should be less than or equal to 6 (rule of thumb); with more than 6,
consumers might not be able to process all the information. It is important to include not only attributes
that are valued highly by your own customers, but also attributes that are valued highly by
customers who go to the competition. Techniques for large numbers of attributes do not outperform
conjoint, e.g.:
- Direct survey
- Partial-profile conjoint (only subsets of attributes)
- Hybrid conjoint (direct survey + small full-profile conjoint)
- Adaptive conjoint (direct survey + dynamic paired comparisons)

Step 3: choosing levels (which & how many)

Levels of attributes should be:
- interesting for the management (discuss with them!)
- unambiguous ("low" versus "high" is too imprecise); be as concrete as possible so that the interpretation is the same for all respondents
- separated enough (otherwise they get too little weight)
- realistic (but they are allowed to be a little outside the current range); find out whether you should invest in levels outside the range
- such that no attribute can a priori be expected to be a clear winner

Number of levels: two levels is the minimum, since you want to observe variation. In case of linearity,
two levels is both sufficient and efficient; in case of nonlinearity, more than two levels are needed.
However, more levels than necessary is inefficient, because more parameters need to be estimated
and the complexity for the respondent increases. In addition, make sure (if possible) that there is an equal
number of levels per attribute, else attention is drawn to the attribute with more levels.

Step 4: questionnaire design (for rating-based conjoint)

Choose the right number of stimuli/levels/attributes in the questionnaire. It is of great importance
to choose an optimal subset of stimuli among all possibilities. Number of stimuli:
the advantage of large questionnaires is more observations per respondent and an increase in quality, as
respondents learn how to answer. The disadvantage of a large questionnaire is a decrease in quality, as
respondents get fatigued or bored. Respondents can complete up to 20 tasks without a decrease in
quality. However, other factors like involvement, mood, time constraints and product
complexity also play a role, so that in some cases 12 tasks may be the maximum.

Stimulus: a hypothetical product. Which ones to include?

In a full factorial design it is feasible to include all possible combinations. In a fractional factorial
design it is only feasible to include a subset of all possible combinations. An optimal design provides
as much information as possible about respondents' preferences for a given number of tasks. It
minimizes the standard errors of the part-worth estimates and is both balanced and orthogonal.
- Level balance: the levels of an attribute occur with equal frequency. E.g. the number of stimuli with level 'black' for a specific attribute equals the number of stimuli with level 'white' for that attribute.

- Orthogonality: the levels of any two attributes occur independently; then P(32gb & black) = P(32gb) · P(black). Quick check: rows (and columns) are proportional to each other.

First questionnaire:
- 9 = 9 and 9 = 9 → balanced
- P(32gb AND white) ≠ P(32gb) · P(white): 6/18 ≠ 9/18 · 9/18
- Or: rows and columns are not proportional → not orthogonal

Second questionnaire:
- 12 ≠ 6 → not balanced
- Rows (8;4) and (4;2) are proportional, since (8;4) = 2 · (4;2)
- Columns (8;4) and (4;2) are proportional, since (8;4) = 2 · (4;2)
- Thus orthogonal

Third questionnaire:
- 10 = 10 and 10 = 10 → balanced
- Rows (5;5) and (5;5) are proportional
- Columns (5;5) and (5;5) are proportional
- Thus orthogonal
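
These two checks are mechanical enough to automate. A sketch (the toy design below is an assumption; any design matrix coded as level indices works):

```python
import numpy as np
from itertools import combinations

# Toy 2-attribute design (memory: 2 levels, colour: 3 levels), 18 stimuli
design = np.array([[m, c] for m in (0, 1) for c in (0, 1, 2)] * 3)

def balanced(col):
    # Level balance: every level of the attribute occurs equally often
    _, counts = np.unique(col, return_counts=True)
    return len(set(counts)) == 1

def orthogonal(a, b):
    # Orthogonality: joint frequencies equal the product of the marginals
    for la in np.unique(a):
        for lb in np.unique(b):
            joint = np.mean((a == la) & (b == lb))
            if not np.isclose(joint, np.mean(a == la) * np.mean(b == lb)):
                return False
    return True

print(all(balanced(design[:, j]) for j in range(design.shape[1])))       # True
print(all(orthogonal(design[:, i], design[:, j])
          for i, j in combinations(range(design.shape[1]), 2)))          # True
```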

Sometimes it is not possible to achieve both level balance and orthogonality for all numbers of
stimuli. Necessary conditions:
1. The number of stimuli should be divisible by the number of levels of each attribute
2. The number of stimuli should be divisible by the product of the numbers of levels of any pair of attributes
Example: 2 attributes with 2 levels and 2 attributes with 3 levels.
1. According to the first condition, the number of stimuli should be divisible by 2 and 3
2. According to the second condition, it should be divisible by 2·2 = 4, 2·3 = 6 and 3·3 = 9
The smallest number of stimuli satisfying all of these conditions is 36 (see the sketch below).
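
A short sketch of checking these divisibility conditions for the example, reproducing that 36 is the smallest feasible number:

```python
from itertools import combinations

levels = [2, 2, 3, 3]   # 2 attributes with 2 levels, 2 attributes with 3 levels

def feasible(n_stimuli):
    single = all(n_stimuli % l == 0 for l in levels)               # condition 1
    pairs = all(n_stimuli % (a * b) == 0
                for a, b in combinations(levels, 2))               # condition 2
    return single and pairs

print([n for n in range(1, 80) if feasible(n)])   # [36, 72]
```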

What to do if a balanced and orthogonal design cannot be found?

The ultimate objective is to obtain small standard errors → minimize the D-error measure (average
variance), which is equivalent to maximizing D-efficiency.
Level balance and orthogonality are tools to achieve this. D-efficiency increases when more stimuli
are added, but too many stimuli lead to too many tasks. Thus, sometimes one needs to make a
trade-off. Furthermore, sometimes a design with neither balance nor orthogonality may have a higher
D-efficiency than a design with exactly one of the two. One can try different seeds in SAS,
so that the search algorithm tries different routes.

Session 3: Rating- and Choice-Based Conjoint
Step 3 and 4: determine the number of attributes and the number of levels:
%mktruns(2 2 2 3 3 3 3): 3 attributes with 2 levels, 4 attributes with 3 levels.
- Saturated = 12 = the number of parameters you need to estimate = the minimum number of stimuli; we don't want to go beyond 20. If there are 12 parameters to estimate and there are 12 stimuli → 0 degrees of freedom.
- Full factorial = 648 = 2³ · 3⁴ = the number of possible stimuli
- 36*/72* = the numbers of stimuli that are optimal, leading to level balance and orthogonality (divisible by 2, 3, 4, 6 and 9). But both are > 20: too high.
- Number of violations: 3 → see next page

Create the linear design: %mktex(2**3 3**4, n=18, seed=17)

- This creates a design with 18 stimuli and automatically saves the design as 'randomized'.
- D-efficiency = 99.7096% reflects how well the level balance and orthogonality criteria are satisfied.
- In determining the optimal number of stimuli, one has to look at how large the drop in D-efficiency is for different numbers of stimuli.
- Canonical correlations between the factors: "There are 0 Canonical Correlations Greater Than 0.316".
Evaluate the linear design: %mkteval(data=randomized)

The output table shows the correlations between attributes: X1, X2 and X3 have pairwise correlations of 0.11; all other pairs have correlation 0 (and each attribute correlates 1 with itself). There is no problem, since the correlations are low → no multicollinearity. If a correlation is too high, choose a different number of stimuli.
What to do if two attributes are highly correlated?
Combine them into a new 'super-attribute'. However, it is then hard to
disentangle the effect of each underlying attribute, and the super-attribute might have too many
levels. If there are too many levels, restrictions can solve this; however, restrictions mostly reduce
D-efficiency.
Command for a restriction:
%macro restrict;
bad = (x1 = 1 & x2 = 2) + (x1 = 2 & x2 = 1);
%mend;
%mktex(2**3 3**4, n=18, seed=17, restrictions=restrict);
proc print data=design; run;

Alternatively, keep all attributes but exclude totally unrealistic stimuli; however, it may then be necessary to
sacrifice level balance and orthogonality, and D-efficiency is likely to decrease.
The Frequencies output lists how often each level occurs per attribute (e.g. X1: 9 9; X4: 6 6 6) and per pair of attributes (e.g. X1 X4: 3 3 3 3 3 3); these numbers are used to check the level balance criterion. A * marks the problem areas where P(A and B) ≠ P(A) · P(B); there are 3 violations in this case (X1 X2: 4 5 5 4; X1 X3: 5 4 4 5; X2 X3: 4 5 5 4).

E.g. X1 X2 = 4 5 5 4 means:
- 4 times level one of X1 with level one of X2
- 5 times level one of X1 with level two of X2
- 5 times level two of X1 with level one of X2
- 4 times level two of X1 with level two of X2

Print the linear design: proc print data=randomized; run; (this lists the 18 stimuli with their levels on X1-X7)
Step 5: collect data from respondents (for rating-based conjoint)
Number of respondents: rating-based conjoint can be done per respondent, but the larger the set of
respondents, the more accurate the estimation of the part-worths. The literature suggests at least
10 to 50 observations per parameter to estimate.

Step 6: estimation of the part-worths (for rating-based conjoint)

The slides show how to do this regression in Excel. The coefficients give the part-worths for respondent 1. E.g. zoom (x2) has coefficient 1.5; because the coefficient is positive, we can say that level 2 is preferred over level 1.

Step 7: design market simulators (for rating-based conjoint)

Choice simulator: suppose you observe ratings for 5 respondents.
o First, estimate the part-worths for each of the 5 respondents
o Second, derive these 5 respondents' choices for 3 new hypothetical products → choice simulator. Example: 3 new digital cameras, each with a different strength:
- Camera 1 has high resolution (1, 0, 0)
- Camera 2 has more zoom (0, 1, 0)
- Camera 3 has a large screen (0, 0, 1)

Utility of camera 1 for respondent 1 (using the part-worths of respondent 1):

U_1,1 = 4.5 + (1 × 1.0) + (0 × 1.5) + (0 × 1.0) = 5.5

Once you have the utilities, these have to be translated into market shares. There are three possibilities to achieve this:

- Max utility rule: chosen = 1, non-chosen = 0
- Bradley-Terry-Luce rule: proportional to utility, summing to 1. Take the utility of a product and divide by the sum of the utilities of all products. Market share for camera 1, respondent 1: 5.5 / (5.5 + 6.0 + 5.5) = 0.32
- Logit rule: proportional to exp(utility), summing to 1: exp(5.5) / (exp(5.5) + exp(6.0) + exp(5.5)) = 0.27
Problem: which method should you use to determine market share? (A numeric sketch of all three rules follows below.)
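
A sketch of the three rules, using the utilities from the example above (5.5, 6.0, 5.5 for respondent 1); averaging the resulting shares over respondents gives predicted market shares:

```python
import numpy as np

u = np.array([5.5, 6.0, 5.5])                 # cameras 1-3, respondent 1

max_rule = (u == u.max()).astype(float)       # max utility rule: chosen = 1, rest = 0
btl = u / u.sum()                             # Bradley-Terry-Luce: proportional to utility
logit = np.exp(u) / np.exp(u).sum()           # logit rule: proportional to exp(utility)

print(max_rule)            # [0. 1. 0.]
print(btl.round(2))        # [0.32 0.35 0.32]
print(logit.round(2))      # [0.27 0.45 0.27]
```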

Session 4: Choice-Based Conjoint
Steps 1 to 3 are the same for choice-based conjoint as for rating-based conjoint.

Step 4: Design questionnaire


Step 4 becomes more complex for choice-based conjoint, because it requires choice sets. There are
four conditions for a good choice design:
1. Level balance: the levels of an attribute occur with equal frequency (as before)
2. Orthogonality: the levels of any two attributes occur independently (as before)
3. Minimal level overlap: alternatives within a choice set should not share the same attribute levels
4. Utility balance: (almost) equally attractive alternatives give the respondent a difficult choice
First make sure there is level balance/orthogonality, then minimal overlap. It is OK to have some
overlap, but not too much; the less overlap, the more information can be extracted. The last
criterion, utility balance, means we want products with similar utility in order to present the
respondent with difficult choices. In general, utility balance is a very difficult criterion to satisfy,
because in order to define difficult trade-offs we would need to know in advance how respondents
will evaluate the different attribute levels. But this is exactly what we want to learn (a classical
chicken-and-egg problem). Solution:
Specify assumed part-worths based on managerial expectations and small-sample pre-tests. In
most cases, assume the expected signs, but do not impose strength differences (magnitudes). If you are
not sure which level will be preferred, set all assumed part-worths equal to 0 for that attribute.
These assumed betas are used for the construction of the choice sets, but the estimation results can still be
very different. Even quite bad assumptions are better than zero assumptions; however, a
wrong sign is worse than zero part-worths.
Example: suppose an attribute with 3 levels with actual (unknown) part-worths 0 1 1
o The safe middle option is zero part-worths: 0 0 0
o A too-small scale (0 0.5 0.5) is clearly preferred over zeros
o A too-large scale (0 2 2) is clearly preferred over zeros
o A skewed scale (0 0.5 1.5) is clearly preferred over zeros
o A wrong direction (0 -1 -1) is not preferred to zero part-worths

How many alternatives per choice set?

Usually 2 to 4 alternatives per choice set. The more alternatives per set, the more information the
respondent has to process. As a result, the larger the number of attributes, the fewer
alternatives a respondent can handle → try to limit it to 18-20 bits of information (max).

CBC design in SAS (example: 7 attributes with 2 levels, 2 alternatives per set)
Step A: determine the number of choice sets: %mktruns(2**7)
This gives the number of stimuli, which has to be translated into the number of choice sets.
Make sure that the number of choice sets ≥ saturated (20 choice sets is a good upper bound).
Necessary condition for optimality: the number of stimuli should be divisible by 2 and by 2·2 = 4 →
the number of choice sets should be divisible by 2/#alternatives and 4/#alternatives (here: 1 and 2).

Step B: create a candidate set using %mktex
Only stimuli in the candidate set are used for constructing the choice sets. 2 options:
1. Construct the final set of stimuli: %mktex(2**7, n=40, seed=17) → level balance and orthogonality issues
2. (Best alternative) Use a larger candidate set, which is more flexible in the choice-set stage, e.g. take the full factorial design: %mktex(2**7, n=128, seed=17)

Step C: define the structure of the choice sets using %mktlab

First tell the number of options per choice set: %mktlab(data=design, int=f1-f2)
(f1-f2 indicates 2 options, f1-f3 indicates 3 options, etc.)
Then decide for each stimulus whether it is allowed.

Step D: create the choice design using %choiceff (SAS will pick the stimuli considering all 4 criteria)
%choiceff(data=final, model=class(x1-x7), nsets=20, flags=f1-f2, beta=1 -1 -1 -1 -1 1 -1, seed=17, maxiter=20)
- class(x1-x7): specifies the attributes
- nsets: the number of choice sets in the design
- flags: indicates how many options per choice set
- beta: the assumed part-worths for the attributes, which ensure that the design satisfies the utility balance criterion (SAS treats the last attribute level as the benchmark level with value 0; the beta of the first level is then the effect of being in level 1 compared to the base level). The number of components in beta equals the number of parameters to be estimated. Since it is a multinomial logit, no intercept is estimated.
- seed: the starting value for the algorithm
- maxiter: the maximum number of iterations the algorithm will run to obtain the design

Step E: print and investigate the choice design: proc print; id set; by set; run;

- Efficiency: not important here
- Index: shows whether a stimulus is used multiple times
- Prob: the probability that a customer will choose that product (based on the assumed betas). This shows whether we have utility balance.

Step 5: collect data
Choice-based conjoint requires multiple respondents: the more, the better.
In practice, sample sizes range from 150 to 1200 respondents, with a minimum of 200 respondents per
group/segment (Orme, 1998). Own experience: good results with 100 respondents, or a multiple of that if
you want to consider segments.

Step 6: estimate preferences (estimate part-worths)

Response variable: choice; explanatory variables: the attributes of the hypothetical products.
The multinomial logit model assumes that the probability that an individual chooses alternative i
from the choice set C of m alternatives is:

P(i | C) = exp(β′x_i) / Σ_{j∈C} exp(β′x_j)

Since this formula models a probability, the outcome lies between 0 and 1 and
the probabilities over all alternatives sum to 1.

Here x_i is a vector of alternative attributes and β a vector of unknown parameters (the part-worths).
How to code the x variables (attributes)? The easiest way is 0/1 dummies. You can choose any
level as the base level in this case (not necessarily the last one), but you have to interpret the estimated
part-worths according to the choice you make.

(See slides 37-40 for the model with two alternatives.)

Assumptions of the model for the estimation of choice-based part-worths:
- Utility depends on the attribute levels and on other, unobserved factors represented by an error term:
U_i,A = β1·Resolution_A + β2·Zoom_A + β3·Screen_A + ε_i,A
U_i,B = β1·Resolution_B + β2·Zoom_B + β3·Screen_B + ε_i,B
- The uncertainty in ε_i,A and ε_i,B is described by a probability distribution
- Each individual chooses the alternative providing the highest utility:
Respondent i chooses alternative A ⟺ U_i,A > U_i,B
Respondent i chooses alternative B ⟺ U_i,B > U_i,A

Thus, choice probabilities are computed as:
1. Compute the utility of each alternative (without the error term)
2. Take the exponents of these utilities
3. Divide the exponent of the considered alternative by the sum of all exponents

Conditional logit model:

Pr(camera A) = exp(0.9) / (exp(0.9) + exp(1.2)) = 0.43
Pr(camera B) = exp(1.2) / (exp(0.9) + exp(1.2)) = 0.57
The part-worths are the preferences of the customers with respect to each attribute's levels. Choice
probabilities are determined by the levels (the x variables) and the part-worths (the parameter estimates).
The levels are given, and the part-worths are estimated such that the probabilities match the
actual choices (maximum likelihood: the part-worths are estimated such that they maximize the
product of the probabilities of all actual choices).
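
A sketch of this maximum-likelihood step on simulated toy data (the number of attributes, the design and the true part-worths are assumptions, not the course data); the part-worths are recovered by minimizing the negative log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(17)
n_sets, n_alts, n_attr = 200, 2, 3                       # toy dimensions
X = rng.integers(0, 2, size=(n_sets, n_alts, n_attr)).astype(float)  # 0/1 dummies
true_beta = np.array([1.0, 1.5, 0.5])                    # assumed part-worths

# Simulate choices from the logit probabilities
v = X @ true_beta                                        # systematic utilities
p = np.exp(v) / np.exp(v).sum(axis=1, keepdims=True)
choices = np.array([rng.choice(n_alts, p=pi) for pi in p])

def neg_loglik(beta):
    v = X @ beta
    logp = v - np.log(np.exp(v).sum(axis=1, keepdims=True))
    return -logp[np.arange(n_sets), choices].sum()       # product of probs -> sum of logs

fit = minimize(neg_loglik, np.zeros(n_attr))
print(fit.x.round(2))   # close to true_beta when there are enough choice sets
```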

Step 7: Design market simulators


Create your own market scenario and learn what happens. (see excel simulation file)

Session 5: latent-class analysis


Market segmentation: viewing a heterogeneous market as multiple homogeneous markets, in
response to differing consumer preferences, to better satisfy their varying wants. Split the
population into homogeneous groups regarding preferences: maximize homogeneity within segments
and maximize heterogeneity between segments. Goals of market segmentation:
- To deal with the structure of heterogeneity in consumer needs and wants across borders → better matching of products to customers
- To find homogeneous clusters of consumers who are likely to exhibit similar responses to marketing efforts → reduced costs of marketing
- To focus on the best segments
In this way we move from a supply/cost focus (the product) to a demand focus (the market).

Use of market segmentation:

- Segmentation: how to divide the market?
- Targeting: which segment(s) to focus on?
- Positioning: how to communicate to the target? Unique and consistent positioning in each segment.

Three core elements of market segmentation (level/basis/method):
Which refinement level to consider? Segmentation level: determines the unit of analysis.
- Macro: country segments (results in different strategies for different countries)
- Micro: consumer segments (mostly within countries)
- For conjoint: unit of analysis = micro level, because you consider individual consumers
Which distance(s) to consider? Segmentation basis: the variable that you use to determine the
difference between consumers; it tells you in which segment a consumer fits.
- General basis: independent of the domain
o Observable: geographic regions, socio-demographic variables (population size, age, education, language)
o Unobservable: cultural dimensions, lifestyles
- Domain-specific basis (used in the course): type of usage, financial product ownership, brand loyalty; these only hold for the company you focus on.

Examples:
Survey on food habits for 500 consumers across 3 regions: micro level/domain-specific
National sales over 6 years of 5 consumer durables in 5 countries: macro level/ domain-specific
Brand loyalty ratings on bath soap in 15 Dutch supermarkets: micro level/domain-specific

6 factors determine the effectiveness of market segmentation (criteria for a good segmentation
level + basis):
- Identifiability: easily measured segmentation variables are needed; you need to see which customer has which characteristics.
- Substantiality: segments should be large enough to be profitable → the number of segments should not be too large, otherwise there are too few customers within each segment.
- Accessibility: effective promotional/distribution tools are needed (make sure you can reach the customers you focus on)
- Stability: the composition of segments should not change rapidly; consumers should stay long enough in a segment (e.g. price sensitivity is a good stability criterion)
- Responsiveness: homogeneous, unique response within a segment (we want customers within a segment to react similarly to the strategy)
- Actionability: segments and the firm's goals/competencies should match; the segmentation choice needs to be in line with what the company wants.

Differences between latent class and cross-tabulation/discriminant analysis/cluster analysis:

Latent class analysis is post hoc: first collect consumer behavior data (observe consumer
preferences and choices), and based on that determine the segments.

Latent class can make predictions, e.g. market-share predictions. If you have done a CBC analysis,
you can predict in which segment new consumers will be. This is interesting from a marketing point of view.

Identifiability/responsiveness issue in post-hoc predictive segmentation → solution: profiling.

- Preferences (attribute importance) are relevant for the segmentation strategy but hardly visible.
- Socio-demographic variables have low relevance for segmentation but are highly visible.
Profiling is a way to connect relevant preferences to observed socio-demographics. You relate the
membership of a segment to variables that you observe. It allows making predictions regarding
new customers.

Which iterative method to use to find the segments? → Segmentation method

Latent class analysis compared to traditional methods:
The traditional way to get preferences and segments:
1. Estimate preferences, e.g. obtain part-worths in a conjoint context [estimation at the individual level (very limited degrees of freedom) → unreliable estimates]
2. Assign respondents to segments based on the estimates from step 1 [cluster analysis]
3. Regress segment membership from step 2 on socio-demographic variables [discriminant analysis or a logit framework]

A better way: latent class (LC) segmentation, which integrates the 3 steps above:
- LC determines the number of segments and who belongs to which segment at the same time.
- LC estimates preferences and segments simultaneously for all respondents → avoids estimation at the individual level → higher accuracy and validity
- LC provides estimates of the uncertainty regarding the segment membership of respondents
- LC is broad and has many applications (traditional conjoint, CBC, scanner data from supermarkets, ...)
- LC (mixture) methods currently provide the most powerful algorithms for market segmentation

Latent class approach:

For each customer you determine to which segment he or she belongs; the segment membership is
uncertain → you get probability statements (we allow for uncertainty in the memberships).

Basic principles:
- Prior probability (assumption): the probability of segment membership before observing any individual data; prior probability = segment size (with 2 segments, take for example 50/50)
- Posterior probability: the probability of segment membership after observing individual data. The data collected from a respondent contain useful segment information (e.g. the choices in the choice sets) → segment sizes are combined with the individual data → learning: prior → likelihood → posterior → new likelihood → updated posterior → etc.

The prior probability is the same for every individual; the posterior depends on the observations
and is thus customer specific.

Example: calculating posterior probabilities

The 50/50 is the prior = P(segment); the other numbers are the likelihoods = P(product | segment).
Multiply them: prior × likelihood, e.g. 0.50 × 0.8.
To get the posterior probabilities, standardize these numbers so that they lie between 0 and 1
and add up to 1 (divide each prior × likelihood by their sum): posterior = P(segment | product).

0.73 = the probability of being in the price-sensitive segment, given that you chose product A.
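
A sketch of this Bayes-rule update; the 0.8 likelihood and 50/50 prior are from the example, while the 0.3 likelihood for the second segment is an assumed illustration (chosen so that the 0.73 posterior is reproduced):

```python
priors = {"price-sensitive": 0.5, "quality-sensitive": 0.5}   # segment sizes
lik_A = {"price-sensitive": 0.8, "quality-sensitive": 0.3}    # P(choose A | segment)

joint = {s: priors[s] * lik_A[s] for s in priors}             # prior * likelihood
total = sum(joint.values())
posterior = {s: joint[s] / total for s in joint}              # standardize to sum to 1
print({s: round(p, 2) for s, p in posterior.items()})
# {'price-sensitive': 0.73, 'quality-sensitive': 0.27}
```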

Session 6: latent-class analysis (part 2)

Estimation of part-worths and segment sizes: maximum likelihood

The beta represents the preferences of a segment (it becomes segment specific).
The assumption we make: all customers that belong to the same segment have the same betas
and therefore the same utilities.

If you have only one segment: take the part-worths such that the choices made by all respondents
over all choice sets are as likely as possible.

If there are multiple segments: the optimization becomes more complex, because more parameters
need to be estimated. We now want the optimal parameters, given the segments.
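
Written out (a standard latent-class logit likelihood; the notation here is an assumption, not copied from the slides), with segment sizes \(\pi_s\) and segment part-worths \(\beta_s\):

\[
L(\beta_1,\dots,\beta_S,\pi_1,\dots,\pi_S) \;=\; \prod_{n=1}^{N}\,\sum_{s=1}^{S}\pi_s\prod_{t=1}^{T}P\big(y_{nt}\mid\beta_s\big)
\]

Each respondent's whole choice sequence is evaluated under every segment's part-worths and weighted by the segment sizes; maximizing this gives the betas and the segment sizes simultaneously.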

Determination of the number of segments:
First estimate a model with 1 segment, then a model with 2 segments, etc., and determine
which is best. If you would simply pick the model with the highest likelihood, the model with the most
segments would always win; in the end, the number of segments would equal the number of customers,
which is not feasible. → Make a trade-off: likelihood as high as possible and complexity as low as possible.

P (= the penalty for complexity) depends on which criterion you use.
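
For reference (standard definitions of the usual criteria, assumed here rather than taken from the slides):

\[
\mathrm{IC} \;=\; -2\ln L + P,\qquad P_{\mathrm{AIC}} = 2k,\qquad P_{\mathrm{BIC}} = k\ln N
\]

where k is the number of free parameters and N the sample size; a model with more segments is only preferred if the gain in likelihood outweighs the larger penalty.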

Evaluation of the segmentation quality


We want people within a segment to be homogeneous and people across segments to be heterogeneous.

Profiling segments:
Relate the unobserved response segments to observed socio-demographics.
Gain: new subjects can be assigned to segments after observing general characteristics like age,
gender, income, ...
Example: low-income subjects may have a large probability of belonging to the price-sensitive segment
→ marketing strategy.
Cluster analysis: profiling after clustering; latent classes: profiling either during or after segmentation.

Latent Gold (purpose: conjoint analysis + segmentation)

3 files are used:
1. A file containing choices and backgrounds: cbcRESP
2. A file containing alternatives/stimuli: cbcALT
3. A file containing choice sets: cbcSET

We don't know a priori how many segments there are; let Latent Gold determine which number is best.
We want the information criterion to be as low as possible.

We want the numbers on the diagonal to be as high as possible (then observed = estimated) and the
numbers outside the diagonal to be as small as possible. The hit rate tells you how often the
prediction is correct compared to incorrect.

In 37.7% of the cases, we made an accurate prediction.
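
A sketch of the hit-rate computation from a confusion matrix of observed versus predicted choices (the counts are made up for illustration, so the rate differs from the 37.7% above):

```python
import numpy as np

# Rows = observed choices, columns = predicted choices (hypothetical counts)
confusion = np.array([[30, 10,  5],
                      [12, 25,  8],
                      [ 6,  9, 20]])

hit_rate = np.trace(confusion) / confusion.sum()   # share of correct predictions
print(round(hit_rate, 3))                          # 0.6
```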

These are the part-worths.

- The Wald test tells us whether a variable is jointly significant (across all segments), i.e. whether a given attribute matters for all segments. H0: β1,s1 = β1,s2 = β1,s3 = 0. You see that the p-value for the (quadratic) price term is 0.82 → there is no quadratic effect of price → remove this variable from the model.
- The Wald(=) test tells us whether the part-worths are significantly different across segments. H0: β1,s1 = β1,s2 = β1,s3 (i.e. whether the effect of an attribute is the same across segments). This tells you whether preferences are the same across segments, hence whether you actually need segmentation.
- The Z-test tells us whether, within a given segment, a preference is significant or not. H0: β1,s1 = 0.

Impose restrictions based on the previously mentioned tests (re-run the model with the new specification):
by imposing more restrictions, you reduce the number of parameters (making the model less complex),
so the information criteria should decrease.

- The relative importances tell you which attribute is important in which class. For example, in the first class fashion is the most important attribute, while in the second class quality is most important.
- The Profile tab tells you what each segment looks like; it can help you profile each segment.
- The class size tells you the size of each segment (e.g. 50% of the people belong to the first segment).
- You can also see how the segments vary in their preferences per attribute. E.g. segment 1 is indifferent about quality.
- You can also see the composition of the segments per demographic variable. For example, 75% of the first segment is female.

Using Probmeans, you can compare segments. In Profile, the columns sum to one, while in
Probmeans the rows sum to one. Looking at, for example, all the people who choose modern shoes:
61% come from segment 1.

This tab is very important when you have new customers: you can predict in which segment they
belong. For example, if a new customer is a male of age 40+, he will probably belong to segment 3.

Session 7: Market Simulation
Based on the unveiled preferences, we can design market-share simulators to guide managerial
decisions, such as which assortment to compose, which product to introduce on the market and
which price to charge.

First example: which assortment to compose in order to maximize sales (primary demand)?

Why add a no-choice option in the market simulation? We want to measure sales (primary demand).
Hence, we need to observe the customers who come into the store and buy shoes versus the people
who don't buy anything. Primary demand = all the people who do buy shoes.

Create new SET and ALT files: these additional files include inactive sets; they are not used for
estimation, but predictions are computed for them. Hence, the response file does not change. The
consumers do not see the inactive choice sets. New stimuli are included, so the ALT file needs to be
adapted as well (add all missing stimuli).

The datasets have changed, but the model does not change. Only attach the new files to the existing
models. Make sure that the coding of the variables and the parameter specification are correct. Only
the total number of alternatives and the total number of choice sets change:

You should get exactly the same parameter estimates, because we didn't collect new answers;
nothing has changed, except that you now also make predictions for the last choice sets (which
people didn't see). The likelihood is then also the same.

The nice feature of random utility theory is that we can use the part-worth utility estimates to
simulate choice shares for the inactive sets. We can now see the predicted choices for the new
choice sets (look at Setprofiles, the final choice set). Only 18% decided not to buy (the no-choice
option). Furthermore, you can see what percentage of each segment would go for which shoe.

Setprofile also shows us how the model performs in predicting the actual choices posed to the
respondents.

Next, we can see what happens if we reduce our assortment: update the SET file and remove the
least-chosen product from the inactive set; the ALT file does not change.

What are the implications of offering only 2 shoes instead of 3? Demand for one of the shoes has
increased, while total primary demand has decreased. Which is better? The decision depends on the
storage space available: can we carry a broader assortment (which requires a wider inventory), or is
it sufficient to work with a limited offer (and lose a few customers)? Does the additional inventory
cost compensate for the 3% share loss? The decision depends not only on market share but also on
the revenues and profits we can get from each product.

Advantages of fewer products in your assortment could be: better positioning, cost savings in terms
of production/inventory/ordering, better logistics. (A small simulation sketch follows below.)

Disadvantage of offering less choice: losing market share.

Second example: which assortment to compose in order to generate higher revenues and higher profits?
Suppose now that we are a shoe producer and that our competition is:
- A modern, standard-quality shoe, priced at $75 (MS3)
- A modern, higher-quality shoe, priced at $75 (MH3)
Which product should we offer to maximize market share (relative to the competition and no-choice):
modern or traditional? Higher or standard quality? A higher, the same or a lower price?

To find the best new combination of attribute levels, add each candidate to the current competitive
set (MS3 and MH3); for the new product, we try each possible combination of attribute levels.
First, update the SET file to include the new choice sets. Then re-estimate and compare the shares,
revenues and profits for each scenario.

Choices 1 and 2 are the same each time, and choice 4 is the no-choice option. Looking only at market
share, the first product (set 9) is preferred. However, do we still have a positive margin?

How can we see whether we are affecting the competition or increasing primary demand? Look at
the percentage that did not buy before introducing the new product and compare it with the
percentage that will not buy after introducing the new product. Also compare the market shares of
the competitors. Keep in mind whether you want to decrease the market share of the other products.

Comparing products based on market shares or on revenues gives different winners!

- Market-share winner: the modern, higher-quality shoe at $25. We give the customer everything at a lower price than the competition, but we won't make money out of it.
- Revenue winner: the modern, higher-quality shoe at $75. We offer the same price as the competition.
- Third option: maximizing profits. Focus on a smaller market (lower market share) but on a high-margin product. The margin depends on the price (+) and the attributes' marginal costs (−). Typically, we have price as one of the attributes. If we determine the (marginal) costs per attribute level, we can determine the profitability of each product.
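
A sketch of this profit comparison (the shares, prices, costs and market size are all hypothetical): the share winner can still lose money once margins are included.

```python
market_size = 10_000
candidates = {   # product: (predicted share, price, marginal cost)
    "modern, higher quality, $25": (0.55, 25, 30),
    "modern, higher quality, $75": (0.35, 75, 30),
    "traditional, standard, $75":  (0.10, 75, 15),
}

for name, (share, price, cost) in candidates.items():
    revenue = market_size * share * price
    profit = market_size * share * (price - cost)   # negative if price < cost
    print(f"{name}: revenue {revenue:,.0f}, profit {profit:,.0f}")
```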
