Session 1:
Decompositional view (how one sees a product): a product is characterized by many attributes
(= features/characteristics of the product), and one needs to anticipate which attributes consumers
want. An attribute is characterized by several levels (= the different possibilities per attribute). To
know what consumers want, it is sufficient to know what value consumers put on each level; then
you can produce the best product.
Utility: if the utility of product A is higher than the utility of product B, a consumer will buy product A.
To determine the utility of a product, simply add the utilities of all levels of that product.
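This additive-utility rule can be sketched in a few lines of Python (all part-worth numbers below are hypothetical, purely for illustration):

```python
# Additive utility: a product's utility is the sum of the part-worths of its levels.
# All part-worths here are hypothetical illustration values, not from the notes.
part_worths = {
    "zoom": {"3x": 0.0, "5x": 1.5},
    "resolution": {"10MP": 0.0, "20MP": 0.75},
    "price": {"$200": 0.5, "$300": 0.0},
}

def utility(product):
    """Sum the part-worths of the product's attribute levels."""
    return sum(part_worths[attr][level] for attr, level in product.items())

camera_a = {"zoom": "5x", "resolution": "20MP", "price": "$300"}
camera_b = {"zoom": "3x", "resolution": "10MP", "price": "$200"}

# The consumer buys the product with the higher utility.
print(utility(camera_a))  # 2.25
print(utility(camera_b))  # 0.5
```

If utility(camera_a) > utility(camera_b), the consumer buys camera A.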
Ranking-based: choose the most-preferred product, then the second most-preferred, down to the
least-preferred product. Independent variables: attributes (numerical or categorical); dependent
variable: ranking; method: linear regression.
Rating-based: give a score to each product in turn. Independent variables: attributes (numerical or
categorical); dependent variable: rating; method: linear regression.
Choice-based: choose the most-preferred product only. Independent variables: attributes
(numerical or categorical); dependent variable: choice (2 choices: binary dummy variable; more:
categorical); method: (multinomial) logit model.
Rating-based conjoint. Example: evaluate hypothetical cameras on a 10-point scale:
8 stimuli
3 attributes, each with 2 levels
Possible stimuli: 2^3 = 8
Estimate the preferences with a linear regression model (Ordinary Least Squares).
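Because the 2^3 full factorial is balanced and orthogonal, the OLS part-worth of each 0/1 dummy reduces to the difference in mean rating between its two levels. A minimal Python sketch (the notes use SAS; the ratings below are made up):

```python
import itertools

# Full factorial: 3 attributes with 2 levels each -> 2**3 = 8 stimuli (0/1 dummies).
stimuli = list(itertools.product([0, 1], repeat=3))

# Hypothetical 10-point ratings for one respondent (illustrative numbers only).
ratings = [4, 6, 5, 7, 5, 7, 6, 8]

# For an orthogonal, balanced design, the OLS part-worth of a dummy equals the
# difference in mean rating between its two levels.
def part_worth(attr):
    hi = [r for s, r in zip(stimuli, ratings) if s[attr] == 1]
    lo = [r for s, r in zip(stimuli, ratings) if s[attr] == 0]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

print([part_worth(a) for a in range(3)])  # [1.0, 1.0, 2.0]
```

With a non-orthogonal design this shortcut no longer holds and a full OLS fit is needed.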
Interested in: the betas (β1, β2, β3) = the part-worths in the regression; they show the importance
of the attributes.
How to interpret coefficients? The attribute zoom (0 if zoom is 3x, 1 if zoom is 5x) has coefficient
1.5. So the expected rating for a camera with zoom 5x will be 1.5 higher than for a camera with
zoom 3x (ceteris paribus).
In other words: if zoom is 5x then the rating will increase compared to if zoom were 3x, since the
coefficient is positive. However, the p-value is 0.086, so the result is not significant at the 5% level,
but it is significant at the 10% level, which is quite good for such a small sample.
If a result is insignificant, there is no evidence that consumers care about the corresponding attribute.
Problems of rating-based conjoint:
Unreliable part-worths if model is estimated at individual level.
Not clear whether spread in ratings is due to real preference or due to response style
Implications for sales levels and market shares are not clear
Choice-based conjoint
Repeatedly ask consumers to choose between options and record the choices. Sometimes add a no-
choice option (relevant for market share). The choices are recorded for every customer during n
tasks (n can be any number). One task represents one choice set. As every choice set uses a
different combination of attribute levels, we can derive the relative importance and effect of each
attribute on choice (preferences = attribute part-worths).
As the conjoint exercise is repeated across many customers, it can be detected whether different
customers have different preferences (customer-specific preferences).
Estimate the preferences with a multinomial logit regression model (Maximum Likelihood)
Interested in: the betas (β1, β2, β3) = the part-worths in the regression; they show the importance
of the attributes. The number of observations available to estimate the part-worths of one
customer depends on the number of choice sets.
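The multinomial logit probabilities and the maximum-likelihood objective can be sketched as follows (a minimal Python illustration with hypothetical utilities and choices; the course itself estimates this in SAS):

```python
import math

# Multinomial logit: P(choose j) = exp(V_j) / sum_k exp(V_k),
# where V_j is the deterministic utility of alternative j in the choice set.
def choice_probs(V):
    exps = [math.exp(v) for v in V]
    total = sum(exps)
    return [e / total for e in exps]

# One choice set with three alternatives (hypothetical utilities).
probs = choice_probs([0.5, 1.0, 1.5])
print(probs)

# Maximum Likelihood: the part-worths behind the utilities are chosen so as to
# maximize the sum of log-probabilities of the choices respondents actually made.
observed_choices = [2, 1, 2]  # indices of the chosen alternatives in three tasks
log_lik = sum(math.log(choice_probs([0.5, 1.0, 1.5])[c]) for c in observed_choices)
```

More choice sets per respondent means more terms in this log-likelihood, hence more information about that respondent's part-worths.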
Introduction to Latent-Class analysis:
A latent class is a group of customers who share similar preferences for product attributes.
Latent class = customer segment
We do not observe which customers belong to the same class, hence we call it latent. However,
we can estimate customer preferences and similarities in preferences across customers using
conjoint data. Customers with similar part-worths will be assigned to the same class/segment.
By grouping consumers based on part-worths, you have to estimate fewer parameters (per segment
instead of per individual). Individual analysis and an individual strategy would be optimal, but this
is too expensive and mostly infeasible; therefore, segmentation is used.
Using latent-class analysis, we can
Determine how many customer segments the market contains
Determine the size of each customer segment
Determine the specific preferences per segment
With this information, we can
Segment the market of customers S
Choose which segment a company wants to target T
Position product towards this customer segment P
How does it work? All people who answer the questions form the population. Compute the distance
between every pair of respondents (Euclidean distance). Two people who are very close have similar
preferences. Put people with similar preferences in a segment: minimize the distance within
segments while maximizing the distance between segments.
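One assignment step of this distance-based grouping can be sketched as below (hypothetical part-worth vectors; a full latent-class or k-means procedure would also re-estimate the segment centers iteratively):

```python
import math

# Hypothetical part-worth vectors for 6 respondents (2 attributes).
respondents = [(1.0, 0.2), (1.1, 0.3), (0.9, 0.1),
               (0.1, 1.5), (0.2, 1.4), (0.0, 1.6)]

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Assign each respondent to the nearest of two segment centers; this is the
# assignment step that minimizes within-segment distance.
centers = [(1.0, 0.2), (0.1, 1.5)]
segments = [min(range(len(centers)), key=lambda s: euclidean(r, centers[s]))
            for r in respondents]
print(segments)  # [0, 0, 0, 1, 1, 1]
```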
Random Utility Theory
Consumers try to choose the alternatives that they like best, subject to constraints such as income
and time. Sometimes they don't choose what they like best because some randomness is involved
(there is some noise around customer choices). Therefore, true utility ≠ observed utility.
Product: iPad Air, with attributes memory (xi1), speed (xi2), weight (xi3), size (xi4), price (xi5) and
brand name (xi6). The perceived product (same set of attributes, but perceived slightly differently)
yields partial utilities ui1, ..., ui6; the holistic utility Vi determines the product choice probability
P(buying).
Holistic utility: Vi = ui1 + ui2 + ui3 + ui4 + ui5 + ui6, where ui1 = β1 * xi1, for which
xi1 = memory (16GB, 32GB or 64GB). With dummy coding: di11 = 1 if 16GB, 0 else; di12 = 1 if 32GB,
0 else; 64GB is the base level. If both dummies are 0, the intercept includes the effect of the base
level; thus β measures the effect of deviating from the benchmark. Mostly, pick the lowest level as
base level because of easier interpretation. Then: ui1 = β0 + β11 * di11 + β12 * di12.
So: Vi = β' * xi + εi, where β = (β0, β1, β2, β3, β4, β5, β6) and xi = (1, xi1, xi2, xi3, xi4, xi5, xi6).
Conjoint vs. direct questionnaire: a direct questionnaire might be useful before you do conjoint.
Direct questionnaire:
+ simple: #attributes can be large
+ no high involvement assumed
- everything is important
- not realistic: isolated attributes
- explicit: social answers
- subjective ranges of levels
Conjoint analysis:
- complex: #attributes is small
- high involvement assumed
+ forced to make tradeoffs
+ more realistic: hypothetical products
+ implicit: less social answers
+ predefined ranges of levels
Steps in conjoint analysis (same for all types)
Step 1: Rating or choices?
Rating-based conjoint:
- Some tradeoffs are made
- Rating products is not common in daily life
- All products are considered
- Implications for sales/market share are not clear (you get utilities, but these are difficult to
translate into choices)
- Not clear whether spread in ratings is due to real preferences or due to response style
- Ratings are informative of the intensities of preference
- Easy to implement and estimate (also fewer respondents needed to evaluate more profiles)
- (if you want to measure the intensity of preference, you'd better use rating-based)
Choice-based conjoint:
- Tradeoffs are enforced even more
- Realistic: the choice setting mimics real life
- Accommodates a no-choice option (sales proxy)
- Avoids the need for ad-hoc rules to predict market shares (since it is about choosing instead of
attitude)
- No subjective scaling
- Choice is cognitively less demanding than rating
- Observations are less informative
- Implementation is more complicated
Step 2: Choosing attributes (which & how many)
Attributes in conjoint analysis should:
be relevant and feasible for the management (discuss with them!)
have varying levels in real life (not: 4 wheels for a car); don't vary attributes that don't have
varying levels in real life, because if a variable does not vary, its effect cannot be measured
be expected to influence preferences (theory, qualitative research)
be clearly defined and communicable (the respondent should understand them correctly, e.g., via
verbal descriptions, pictures, an intro movie); if not easy to understand, add pictures
preferably not exhibit strong correlations (exceptions: price, brand name); try to avoid the
resulting multicollinearity.
The number of attributes should be less than or equal to 6 (rule of thumb); with more than 6,
consumers might not be able to process all the information. It is important to include not only
attributes that are valued highly by your own customers, but also attributes that are valued highly
by customers who go to the competition. Techniques for large numbers of attributes do not
outperform conjoint, e.g.:
Direct survey
Partial-profile conjoint (only subsets of attributes)
Hybrid conjoint (direct survey, small full-profile conjoint)
Adaptive conjoint (direct survey, dynamic paired comparisons)
Number of levels: two levels is the minimum, since you want to observe variation; in case of
linearity, two levels is both sufficient and efficient. In case of nonlinearity, more than two levels
are needed. However, more levels than necessary is inefficient, because more parameters need to
be estimated and the complexity for the respondent increases. In addition, (if possible) make sure
there is an equal number of levels per attribute, otherwise attention would be drawn to the
attribute with more levels.
Orthogonality: the levels of any two attributes occur independently; then
P(32GB & black) = P(32GB) * P(black).
Quick check: rows (and columns) are proportional to each other.
First questionnaire:
9 = 9 and 9 = 9, so balanced.
P(32GB and white) ≠ P(32GB) * P(white): 6/18 ≠ 9/18 * 9/18.
Alternatively: rows and columns are not proportional, so not orthogonal.
Second questionnaire:
12 ≠ 6, so not balanced.
Rows (8;4) and (4;2) are proportional, since (8;4) = 2 * (4;2).
Columns (8;4) and (4;2) are proportional, since (8;4) = 2 * (4;2).
Thus orthogonal.
Third questionnaire:
10 = 10 and 10 = 10, so balanced.
Rows (5;5) and (5;5) are proportional.
Columns (5;5) and (5;5) are proportional.
Thus orthogonal.
Sometimes it is not possible to achieve both level balance and orthogonality for all numbers of
stimuli. Necessary condition:
1. The number of stimuli should be divisible by the number of levels of each attribute.
2. The number of stimuli should be divisible by the product of the numbers of levels of any pair of
attributes.
Example: 2 attributes with 2 levels and 2 attributes with 3 levels.
1. According to the first condition, it should be divisible by 2 and 3.
2. According to the second condition, it should be divisible by 2*2 = 4, 2*3 = 6 and 3*3 = 9.
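These balance and orthogonality checks can be automated. A minimal Python sketch (the 2x2 full factorial is used here as an illustrative design; the course does this via SAS output):

```python
import itertools
from collections import Counter

# A design is a list of stimuli; each stimulus is a tuple of attribute levels.
# Example: 2 attributes with 2 levels, full factorial -> balanced and orthogonal.
design = list(itertools.product([1, 2], [1, 2]))

def is_balanced(design, attr):
    # Level balance: every level of the attribute occurs equally often.
    counts = Counter(row[attr] for row in design).values()
    return len(set(counts)) == 1

def is_orthogonal(design, a, b):
    # Orthogonality: P(level_a and level_b) = P(level_a) * P(level_b)
    # for every pair of levels of attributes a and b.
    n = len(design)
    pa = Counter(row[a] for row in design)
    pb = Counter(row[b] for row in design)
    pab = Counter((row[a], row[b]) for row in design)
    return all(abs(pab[(la, lb)] / n - pa[la] / n * pb[lb] / n) < 1e-9
               for la in pa for lb in pb)

print(is_balanced(design, 0), is_orthogonal(design, 0, 1))  # True True
```

Dropping a stimulus from the full factorial (e.g. design[:3]) breaks orthogonality, which mirrors what happens when unrealistic stimuli are excluded.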
Session 3: Rating- and Choice-Based Conjoint
Step 3 and 4: Determine the number of attributes and the number of levels:
%mktruns(2 2 2 3 3 3 3): 3 attributes with 2 levels, 4 attributes with 3 levels.
* saturated = 12 = the number of parameters you need to estimate = the minimum number of
stimuli; we don't want to go beyond 20. If there are 12 parameters to estimate and 12 stimuli,
there are 0 degrees of freedom.
* full factorial = 648 = 2^3 * 3^4 = the number of possible stimuli.
* 36*/72* = the numbers of stimuli that are optimal, leading to level balance and orthogonality
(divisible by 2, 3, 4, 6 and 9). But both are > 20, too high.
* nr violations: 3 = see next page.
Keep all attributes but exclude totally unrealistic stimuli; however, it may be necessary to
sacrifice level balance and orthogonality. Furthermore, D-efficiency is likely to decrease.
Frequencies:
X1: 9 9    X2: 9 9    X3: 9 9
X4: 6 6 6    X5: 6 6 6    X6: 6 6 6    X7: 6 6 6
Two-way frequencies of a 2-level with a 3-level attribute (X1 X4, X1 X5, ..., X3 X7): 3 3 3 3 3 3
Two-way frequencies of two 3-level attributes (X4 X5, X4 X6, ..., X6 X7): 2 2 2 2 2 2 2 2 2
* X1 X2: 4 5 5 4
* X1 X3: 5 4 4 5
* X2 X3: 4 5 5 4
N-Way: 1 for each of the 18 stimuli
These numbers indicate the occurrence of levels per attribute and can be used to check the
level-balance criterion. The * indicates where the problem areas are: P(A and B) ≠ P(A) * P(B);
3 violations in this case.
E.g. X1 X2 = 4 5 5 4:
4 times level one of X1 with level one of X2
5 times level one of X1 with level two of X2
5 times level two of X1 with level one of X2
4 times level two of X1 with level two of X2
Print the linear design: proc print data=randomized; run;
Obs X1 X2 X3 X4 X5 X6 X7
 1   2  2  2  2  3  2  1
 2   1  1  1  3  3  3  2
 3   2  2  2  1  2  3  2
 4   2  2  1  3  3  1  1
 5   1  1  2  1  3  1  2
 6   2  1  1  1  3  2  3
 7   1  2  1  3  2  2  3
 8   2  2  1  1  1  3  3
 9   1  1  2  2  2  1  3
10   1  1  1  2  1  3  1
11   2  1  2  3  2  3  1
12   1  2  2  2  3  3  3
13   1  2  1  1  2  1  1
14   2  1  2  3  1  1  3
15   1  2  2  3  1  2  2
16   2  2  1  2  1  1  2
17   2  1  1  2  2  2  2
18   1  1  2  1  1  2  1
Step 5: Collect data from respondents (for rating-based conjoint).
Number of respondents: rating-based conjoint can be done per respondent, but the larger the set
of respondents, the more accurate the estimation of the part-worths. The literature suggests at
least 10 to 50 observations per parameter to estimate.
The coefficients give the part-worths for respondent 1. E.g. zoom (x2) has coefficient 1.5. Because
the coefficient is positive, we can say that level 2 is preferred over level 1.
Session 4: Choice-Based Conjoint
Step 1 up to step 3 are the same for choice-based conjoint compared to rating-based conjoint.
CBC design in SAS (example, 7 attributes with 2 levels, 2 alternatives per set)
Step A: Determine the number of choice sets: %mktruns(2**7)
This specifies the number of stimuli, which has to be translated into the number of choice sets.
Make sure that the number of choice sets ≥ saturated (20 choice sets is a good upper bound).
Necessary condition for optimality: the number of stimuli should be divisible by 2 and by 2*2 = 4,
so the number of choice sets should be divisible by 2/#alternatives = 1 and 4/#alternatives = 2.
Step B: Create candidate set using %mktex
Only stimuli in candidate set are used for constructing the choice sets, 2 options:
1. Construct the final set of stimuli: %mktex(2**7, n=40, seed=17)
Level balance and orthogonality issues
2. (Best alternative) Larger candidate set, more flexible in choice set stage
e.g. take full-factorial design: %mktex(2**7, n=128, seed=17)
Step D: Create choice design using %Choiceff (SAS will pick the stimuli considering all 4 criteria)
%Choiceff(data=final, model=clas(x1-x7), nsets =20, flags=f1-f2, beta=1 -1 -1 -1 -1 1 -1,
seed=17, maxiter=20)
clas(x1-x7): tells you the number of attributes
nsets: is number of choice sets in the design
flags: indicates how many options per choice set
beta: the assumed part-worths for the attributes, which ensure that the design satisfies the
utility-balance criterion. SAS treats the last level of each attribute as the benchmark level with
value 0; the beta of the first level is then the effect of being in level 1 compared to the base
level. The number of components in beta is equal to the number of parameters to be estimated.
Since it is a multinomial logit, the intercept is not estimated.
seed: the starting value for the algorithm
maxiter: the maximum number of iterations that the algorithm will run to obtain the design
Step E: Print and investigate the choice design: proc print; id set; by set; run;
Step 5: Collect data
Choice-based conjoint requires multiple respondents: the more, the better.
In practice, the range is 150 to 1200 respondents, with a minimum of 200 respondents per
group/segment (Orme, 1998). Own experience: good results with 100 respondents, or a multiple
of that if you want to consider segments.
The model: the deterministic utility of alternative i is Vi = β' * xi, where xi is a vector of
alternative attributes and β a vector of unknown parameters (the part-worths). How to code the x
variables (attributes)? The easiest way is 0/1 dummies. You can choose any level as the base level
in this case (not necessarily the last one), but you have to interpret the estimated part-worths
based on the choice you make.
See slides 37-40 for the choice-based model and the estimation of part-worths (two alternatives).
Assumptions:
- Utility depends on attribute levels and on other, unobserved factors represented by an error
term:
Ui,A = β1 * ResolutionA + β2 * ZoomA + β3 * ScreenA + εi,A
Ui,B = β1 * ResolutionB + β2 * ZoomB + β3 * ScreenB + εi,B
- Uncertainty in εi,A and εi,B is described by a probability distribution.
- Each individual chooses the alternative providing the highest utility:
Respondent i chooses alternative A if Ui,A > Ui,B; respondent i chooses alternative B if Ui,B > Ui,A.
Thus, choice probabilities are computed as follows:
1. Compute the utility of each alternative (without the error term).
2. Take the exponents of these utilities.
3. Divide the exponent of the considered alternative by the sum of all exponents.
Pr(camera A) = exp(0.9) / (exp(0.9) + exp(1.2)) = 0.43
Pr(camera B) = exp(1.2) / (exp(0.9) + exp(1.2)) = 0.57
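The three steps can be verified directly with the utilities 0.9 and 1.2 from the example:

```python
import math

# Deterministic utilities of the two cameras (from the worked example).
V_A, V_B = 0.9, 1.2

# Logit choice probabilities: exponentiate, then divide by the sum of exponents.
p_A = math.exp(V_A) / (math.exp(V_A) + math.exp(V_B))
p_B = math.exp(V_B) / (math.exp(V_A) + math.exp(V_B))
print(round(p_A, 2), round(p_B, 2))  # 0.43 0.57
```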
The part-worths are the preferences of the customers with respect to each attribute's levels.
Choice probabilities are determined by the levels (x variables) and the part-worths (parameter
estimates). The levels are given, and the part-worths are estimated such that the probabilities
match the actual choices (Maximum Likelihood: part-worths are estimated such that they maximize
the product of the probabilities of all actual choices).
Three core elements of market segmentation: (level/basis/method)
Which refinement level to consider? Segmentation level: Determines the unit of analysis
Macro: Country segments (results in different strategies for different countries)
Micro: consumer segments (mostly within countries)
For conjoint: unit of analysis = micro level because you consider individual consumers
Which distance(s) to consider? Segmentation basis: variable that you use to determine the
difference between consumers, it tells you in which segment a consumer fits.
General basis: independent of the domain
o Observable: geographic regions, socio-demographic variables (population size, age,
education, language)
o Unobservable: cultural dimensions, life styles
Domain-specific basis: (used in the course) Type of usages, financial product ownership,
brand loyalty; those only hold for the company you focus on.
Examples:
Survey on food habits for 500 consumers across 3 regions: micro level/domain-specific
National sales over 6 years of 5 consumer durables in 5 countries: macro level/ domain-specific
Brand loyalty ratings on bath soap in 15 Dutch supermarkets: micro level/domain-specific
6 factors determining the effectiveness of market segmentation (criteria for good segmentation,
level + basis):
Identifiability: easily measured segmentation variables are needed; you need to see which
customer has which characteristics.
Substantiality: segments should be large enough to be profitable; the number of segments should
not be too large, otherwise there are too few customers within each segment.
Accessibility: effective promotional/distributional tools are needed (make sure you can reach the
customers you focus on).
Stability: the composition of segments should not change rapidly; consumers should stay long
enough in a segment (a good stability criterion: price sensitivity).
Responsiveness: homogeneous, unique response within a segment (we want customers within a
segment to react similarly to the strategy).
Actionability: segments and the firm's goals/competencies should match. The segmentation choice
needs to be in line with what the company wants.
Differences between latent class and cross-tabulation/discriminant analysis/cluster analysis:
Latent class analysis is post hoc: first collect consumer behavior data (observe consumer
preferences and choices); based on that, you determine the segments.
Latent class can make predictions, e.g. market-share predictions. If you have done a CBC analysis,
you can predict to which segment new consumers will belong. This is interesting from a marketing
point of view.
Latent class analysis compared to traditional methods:
Traditional way to get preferences and segments:
1. Estimate preferences, e.g. obtain part-worths in a conjoint context
[Estimation at the individual level (very limited degrees of freedom) → unreliable estimates]
2. Assign respondents to segments based on the estimates from step 1 [Cluster analysis]
3. Regress segment membership from step 2 on socio-demographic variables
[Discriminant analysis or a logit framework]
A better way to do this: Latent Class segmentation (integrate the 3 steps above):
LC determines the number of segments and who belongs to which segment at the same time.
LC estimates preferences and segments simultaneously for all respondents → avoids estimation at
the individual level → higher accuracy and validity
LC provides estimates of uncertainty regarding segment membership of respondents
LC is broad and has many applications (traditional conjoint, CBC, scanner data from supermarkets..)
LC (mixture) methods currently provide the most powerful algorithms for market segmentation
Basic principles:
Prior probability (assumption):
probability of segment membership before observing any individual data
prior probability = segment size (with 2 segments, take for example 50/50)
Posterior probability:
probability of segment membership after observing individual data
Data collected from a respondent contain useful segment information (e.g. choices in choice sets);
segment sizes are combined with the individual data (learning):
prior × likelihood → posterior; update with new data → new posterior; etc.
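The prior-to-posterior update is plain Bayes' rule. A minimal sketch with hypothetical numbers:

```python
# Bayesian updating of segment membership (all numbers hypothetical).
# Prior = assumed segment sizes; likelihood = probability of the respondent's
# observed choices given each segment's part-worths.
prior = [0.5, 0.5]
likelihood = [0.08, 0.02]  # P(observed choices | segment s)

evidence = sum(p * l for p, l in zip(prior, likelihood))
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]
print(posterior)  # [0.8, 0.2]
```

A respondent whose choices are four times as likely under segment 1 ends up with an 80% posterior probability of belonging to it.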
Example: calculating posterior probabilities
Session 6: latent-class analysis (part 2)
Determination of the number of segments
First estimate the model with 1 segment, then with 2 segments, etc. Then determine which is best.
If you pick the model with the highest likelihood, the model with the most segments will always
win; in the end, the number of segments would equal the number of customers, which is not
feasible. Therefore, make a tradeoff: likelihood as high as possible and complexity as low as
possible.
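This tradeoff is exactly what information criteria such as BIC formalize: fit minus a penalty for parameters. A sketch with hypothetical log-likelihoods and parameter counts:

```python
import math

# Hypothetical (log-likelihood, #parameters) for models with 1-4 segments.
models = [(-1200.0, 8), (-1100.0, 17), (-1080.0, 26), (-1075.0, 35)]
n = 300  # number of observations

# BIC = -2*logL + k*ln(n): rewards fit, penalizes complexity; pick the lowest.
bic = [-2 * ll + k * math.log(n) for ll, k in models]
best = min(range(len(bic)), key=bic.__getitem__)
print(best + 1, "segments")
```

Here the likelihood keeps rising with more segments, but after 2 segments the gain no longer outweighs the extra parameters, so BIC selects 2.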
Profiling segments:
Relate the unobserved response segments to observed socio-demographics.
Gain: new subjects can be assigned to segments after observing general characteristics like age,
gender, income, ...
Example: low-income subjects may have a large probability of belonging to the price-sensitive
segment → marketing strategy.
Cluster analysis: profiling after clustering; latent classes: profiling either during or after
segmentation.
We want the diagonal numbers to be as high as possible (then observed = estimated) and the
numbers outside the diagonal to be as small as possible. The hit rate tells you how often you have
a correct prediction compared to an incorrect prediction.
The Wald test tells us whether a variable is jointly significant (across all segments): whether a
given attribute matters for all segments.
H0: β1,s1 = β1,s2 = β1,s3 = 0
You see that the p-value for price is 0.82 → there is no quadratic effect for price → remove this
variable from the model.
The Wald(=) test tells us whether the variables are significantly different across segments.
H0: β1,s1 = β1,s2 = β1,s3 (whether the effect of an attribute is the same across segments)
This tells you whether preferences are the same across segments, hence whether you actually
need segmentation.
The z-test tells us whether, within a given segment, a preference is significant or not.
H0: β1,s1 = 0
Impose restrictions based on the previously mentioned tests (re-run the model with the new
specification): by imposing more restrictions, you reduce the number of parameters (the model
becomes less complex), so the information criteria should decrease.
Using probmeans, you can compare segments. In profiles, the columns sum to one, while in
probmeans, the rows sum to one.
This tab is very important when you have new customers: you can predict to which segment they
belong. For example, if a new customer is male and aged 40+, he will probably belong to
segment 3.
Session 7: Market Simulation
Based on the unveiled preferences, we can design market-share simulators to guide managerial
decisions, such as which assortment to compose, which product to introduce to the market, and
which price to charge.
First example: Which assortment to compose in order to maximize sales (primary demand)?
You should get exactly the same parameter estimates because we didn't collect new answers;
nothing has changed, except that now you also make predictions for the last choice sets (which
people didn't see). The likelihood is therefore also the same.
The nice feature of random utility theory is that we can use the part-worth utility estimates to
simulate choice shares for the inactive sets. We can now see the predicted choices for the new
choice sets (look at setprofiles, the final choice set). Only 18% decided not to buy (no-choice
option). Furthermore, you can see what percentage in each segment would go for which shoe.
Next, we can see what happens if we reduce our assortment: update the SET file and remove the
less-chosen product in the inactive set; the ALT file does not change.
What are the consequences of only having 2 shoes instead of 3? Demand for one of the remaining
shoes has increased, while total primary demand has decreased. Which is better? The decision
depends on the store storage space available: can we have a broader assortment (which requires a
wider inventory), or is it sufficient to work with a limited offer (and lose a few customers)? Does
the additional inventory cost compensate for the 3% share loss? The decision depends not only on
the market share but also on the revenues and profits we can get from each product.
Advantages of less products in your assortment could be: better in terms of positioning, cost savings
in terms of production/inventory/ordering, better in terms of logistics.
Disadvantages of having less choice: losing market share.
Second example: Which assortment to compose in order to generate higher revenues and
higher profits?
Suppose now that we are a shoe producer and that our competition is:
Modern, standard quality shoe, priced at $75 (MS3)
Modern and higher-quality shoe, priced at $75(MH3)
Which product should we offer to maximize market share (relative to the competition and
no-choice): modern or traditional? Higher or standard quality? Higher, same or lower price?
In order to find the best new combination of attributes levels, add each to the current competitive
set: MS3 and MH3. For the new product, we try each possible combination of attribute levels.
First, update the SET file including the new choice sets. Then re-estimate and compare shares,
revenues and profits for each scenario.
Choice 1 and 2 are each time the same, and choice 4 is the no-choice option. Looking at only market
share, the first product (set 9) is preferred. However, do we still have a positive margin?
How can we see if we are affecting the competition or whether we increase primary demand? Look
at percentage that did not buy before introducing the new product and compare with percentage
that will not buy after introducing new product. Also compare market shares of competitors. Keep
in mind whether you want to decrease the market share of the other product.
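The share comparisons described above can be sketched with logit choice shares (a minimal Python sketch; the product labels MS3/MH3 follow the notes, but all utility numbers are hypothetical):

```python
import math

# Logit choice shares for an assortment including a no-choice option
# (its utility is normalized to 0), using hypothetical part-worth utilities.
def shares(utilities):
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Utilities for [our candidate shoe, MS3, MH3, no-choice].
full = shares([1.0, 0.8, 0.6, 0.0])

# Drop our shoe from the assortment: [MS3, MH3, no-choice].
reduced = shares([0.8, 0.6, 0.0])

# Demand redistributes over the remaining products, and the no-choice
# share rises, i.e. primary demand falls.
print(full[-1], reduced[-1])
```

Comparing the no-choice share before and after a change shows whether a new product steals from competitors or grows primary demand.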
Comparing products based on market shares or based on revenues gives different winners!
Market-share winner: the modern, higher-quality shoe at $25. We give the customer everything at
a lower price than the competition, but we won't make any money out of it.