
Marketing Research:

Preparing Survey Data for Analysis and Data Analysis


Dwi Krisnanto Adji (122120039)
Feri Lasman (122120051)
Obby J.W. Purba (122120100)
Zenal Mutakin (122120144)
Chapter 14
Preparing Survey Data for Analysis
Learning Objectives

1. Describe the process of data preparation for analysis.
2. Discuss validation, editing, and coding of survey data.
3. Explain data entry procedures as well as how to detect errors.
4. Describe data tabulation approaches.

14-3
Overview of Data Preparation and Analysis

14-4
Data Preparation
. . . process of converting information from a questionnaire so it can be transferred to a data warehouse. Four Steps . . .

1. Data Validation
2. Editing and Coding
3. Data Entry
4. Data Tabulation

14-5
Data Validation . . . determine if the surveys, interviews, or observations were conducted correctly and free of interviewer fraud or bias.

14-6
Data Validation Process

Fraud
Screening
Procedure
Completeness
Courtesy

14-7
Editing
Process that checks the data for mistakes made by either the interviewer or the
respondent.

Areas of concern to check . . .


Asking the proper questions
Accurate recording of answers
Correct screening questions
Responses to open-ended questions

14-8
Responses to an Open-Ended Question

14-9
Coding
Grouping and assigning values to various responses from the survey instrument . . .

Codes are numerical: a number from 0 to 9.

Well-planned and constructed questionnaires reduce time spent on coding.
Numeric codes should be designed into the questionnaire from the beginning.
If questionnaires do not use coded responses, a master coding system must be established.

14-10
An Illustration of a Master Code Sheet

14-11
Illustration of Response Consolidation for Open-Ended Questions

14-12
Coding open-ended questions is a four-step process

1. Generate a master list of potential responses, used to assign values to the responses
2. Consolidate responses
3. Specify a numerical value as a code
4. Assign a coded value to each response
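
The four steps above can be sketched in Python. This is only an illustration: the sample answers, the keyword-to-category mapping, and the code values are all hypothetical, and a real project would build the master list from the actual questionnaires.

```python
from collections import Counter

# Hypothetical raw answers to an open-ended question (illustrative only).
raw_responses = [
    "Good prices", "cheap", "Friendly staff", "low price",
    "nice employees", "close to home",
]

# Steps 1-2: a master list of potential responses, consolidated into
# broader categories. The keyword mapping is an assumption of this sketch.
consolidation = {
    "price": "Price", "cheap": "Price",
    "staff": "Service", "employees": "Service",
}

# Step 3: specify a numerical value as the code for each category.
codes = {"Price": 1, "Service": 2, "Other": 9}


def code_response(text: str) -> int:
    """Step 4: assign a coded value to one response."""
    lowered = text.lower()
    for keyword, category in consolidation.items():
        if keyword in lowered:
            return codes[category]
    return codes["Other"]  # anything not on the master list


coded = [code_response(r) for r in raw_responses]
code_counts = Counter(coded)  # how often each code was assigned
```

Keeping the consolidation table separate from the code values mirrors the master code sheet idea: categories can be merged later without renumbering the codes.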

14-13
Data Entry

4 Major Ways

PC (most popular)
Scanner
Touch screen
Light pen

14-14
Example of Optical Character Recognition
Questionnaire

14-15
Error Detection

First step: determine if the software used for data entry and tabulation includes error editing routines:

To identify the wrong type of data
To prepare a printed representation of the data entered
To produce a data/column list of the data
To find individual questionnaires and verify the proper response (code)
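
A minimal sketch of such an error editing routine, assuming a hypothetical codebook of valid codes per question and a few made-up records. It flags the wrong type of data and invalid codes, and reports the questionnaire ID so the original form can be located and verified.

```python
# Hypothetical codebook: valid codes per question (assumption of this sketch).
valid_codes = {
    "q1_gender": {0, 1},
    "q2_satisfaction": {1, 2, 3, 4, 5},
}

# Hypothetical entered records keyed by questionnaire ID.
records = {
    101: {"q1_gender": 1, "q2_satisfaction": 4},
    102: {"q1_gender": 3, "q2_satisfaction": 5},    # code not in the codebook
    103: {"q1_gender": 0, "q2_satisfaction": "x"},  # wrong type of data
}


def find_entry_errors(records, valid_codes):
    """Return (questionnaire_id, question, value, problem) for each bad entry."""
    errors = []
    for qid, answers in records.items():
        for question, allowed in valid_codes.items():
            value = answers.get(question)
            if not isinstance(value, int):
                # Non-numeric or missing entries are reported as wrong type.
                errors.append((qid, question, value, "wrong type"))
            elif value not in allowed:
                errors.append((qid, question, value, "invalid code"))
    return errors


entry_errors = find_entry_errors(records, valid_codes)
```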

14-16
SPSS Data View of Coded Values for Santa Fe
Grill Observations

14-17
Example of Data/Column List Procedure

14-18
Data Tabulation . . . process of counting the number of observations classified into certain categories; e.g., one-way tabulation or cross-tabulation.

14-19
One-Way Tabulation Purposes:

Determine the frequency of non-response to individual questions.

Locate errors or blunders in data entry.

Calculate summary statistics such as means, standard deviations, range, etc.

Communicate results of research project.

14-20
One-way frequency table . . . the number of respondents who gave each of the available alternative answers to a question.

One-way frequency tables . . .


Identify missing data
Determine valid percentages
Provide summary statistics
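
As a rough illustration, a one-way frequency table with missing data and valid percentages can be built with the Python standard library. The answer codes are hypothetical, and `None` is an assumed marker for a missing response.

```python
from collections import Counter

# Hypothetical coded answers to one question; None marks a missing response.
answers = [1, 2, 2, 3, None, 2, 1, None, 3, 2]

counts = Counter(a for a in answers if a is not None)
n_total = len(answers)
n_missing = sum(1 for a in answers if a is None)
n_valid = n_total - n_missing

# Each row: code -> (frequency, percent of all cases, valid percent).
# Valid percent uses only respondents who answered the question.
frequency_table = {
    code: (freq, 100 * freq / n_total, 100 * freq / n_valid)
    for code, freq in sorted(counts.items())
}
```

The gap between the percent and the valid percent columns grows with the share of missing data, which is why the two are reported side by side.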

14-21
Example of One-Way Frequency Distribution

14-22
One-Way Frequency Table Illustrating Missing
Data

14-23
Cross-Tabulation . . . determine whether variables differ when compared across sample subgroups. Results show frequencies and percentages for both rows and columns.
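
A small sketch of a cross-tabulation that reports counts with row and column percentages. The gender and ad-recall observations are invented for illustration, and the helper name `cell_summary` is hypothetical.

```python
from collections import Counter

# Hypothetical (gender, ad_recall) pairs, one per respondent.
observations = [
    ("male", "yes"), ("male", "no"), ("male", "yes"),
    ("female", "no"), ("female", "no"), ("female", "yes"),
    ("female", "no"), ("male", "yes"),
]

cell_counts = Counter(observations)
row_totals = Counter(gender for gender, _ in observations)
col_totals = Counter(recall for _, recall in observations)


def cell_summary(row, col):
    """Frequency plus row and column percentages for one cell."""
    count = cell_counts[(row, col)]
    return {
        "count": count,
        "row_pct": 100 * count / row_totals[row],
        "col_pct": 100 * count / col_totals[col],
    }


male_yes = cell_summary("male", "yes")
```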

14-24
Overview of Descriptive Statistics

14-25
Data Tabulation Issues to be Considered

Judgment of the analyst: selection of variables (questions) to use in examining relationships.
Demographic variables or lifestyle/psychographic characteristics are the starting point in developing cross-tabulations.
Technique is simple but findings may be difficult to interpret . . .
Keep research objectives in mind when constructing and using tables
Spreadsheets help

14-26
Descriptive Statistics
. . . summarize and describe data obtained from a sample of respondents.

Central Tendency
Dispersion

14-27
Graphical Illustrations

. . . translation of one-way frequency and cross-tabulation tables into graphs.

. . . technique for communicating research results from preliminary data analysis to the client.

14-28
Deli Depot Questionnaire

14-29
Deli Depot Questionnaire
(continued)

14-30
Chapter 15
Data Analysis
Learning Objectives
1. Explain measures of central tendency and dispersion.
2. Describe how to test hypotheses using univariate and bivariate statistics.
3. Apply and interpret analysis of variance (ANOVA).
4. Utilize perceptual mapping to present research findings.

15-32
Value of Testing for Differences in Data

Common to all marketing research projects

Basic Statistics and Descriptive Analysis

Central tendency and dispersion
t-distribution and associated confidence interval estimation
Hypothesis testing
Analysis of variance (ANOVA)

15-33
Measures of Central Tendency

Mean, Median, Mode

15-34
Measures of Central Tendency defined . . .

Mean: arithmetic average of the sample; all values of a distribution of responses are summed and divided by the number of valid responses.
Mode: most common value in the set of responses to a question; i.e., the response most often given to a question.
Median: middle value of a rank-ordered distribution; half of the responses are above and half below the median value.
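
The three measures can be checked quickly with Python's standard `statistics` module. The ratings below are hypothetical responses on a numeric scale, used only to make the definitions concrete.

```python
import statistics

# Hypothetical responses to one rating question (valid responses only).
ratings = [3, 5, 4, 5, 2, 5, 4]

mean_rating = statistics.mean(ratings)      # sum of values / number of values
median_rating = statistics.median(ratings)  # middle value after rank ordering
mode_rating = statistics.mode(ratings)      # most frequently given response
```

Here the mean and median agree while the mode is higher, a reminder that the three measures need not coincide when the distribution is skewed.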

15-35
Measures of Central Tendency

Types of Data

Nominal = Mode
Ordinal = Median
Interval & Ratio = Mean

15-36
Dialog Boxes for Calculating the Mean, Median and Mode

15-37
Output for Mean, Median and Mode for X25: Frequency of Eating at . . .

15-38
Measures of Dispersion

Range
Standard Deviation
Variance

. . . describe how close to the mean or other measure of central tendency the other values in the distribution fall.

15-39
Measures of Dispersion

Range: the distance between the smallest and largest values of the variable.

Standard deviation: the average distance of the dispersion of the values from the mean.

Variance: the average squared deviation about the mean of a distribution of values.
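
A short sketch of the three dispersion measures using the standard `statistics` module. The values are hypothetical. Note one assumption: `statistics.variance` and `statistics.stdev` divide by n - 1 (the sample versions usually reported by statistical packages), whereas a literal "average squared deviation" would divide by n (`statistics.pvariance`).

```python
import statistics

# Hypothetical responses to one interval-scaled question.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)        # largest minus smallest value
sample_variance = statistics.variance(values)  # squared deviations / (n - 1)
sample_sd = statistics.stdev(values)           # square root of the variance

# Population versions (divide by n), shown for comparison.
population_variance = statistics.pvariance(values)
population_sd = statistics.pstdev(values)
```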

15-40
Output Measures of Dispersion

15-41
Analyzing Relationships of
Sample Data
Purpose of inferential statistics: to make a determination about a population on the basis of a sample.
Sample: a subset of the population.
Sample statistics: measures obtained directly from sample data.
Population parameter: a measured characteristic of the population.
Actual population parameters are unknown since the cost to perform a census of the population is prohibitive.

Frequency distribution: used to display data calculated from the sample.

15-42
Hypothesis Testing

Univariate statistical test = hypothesis tests one variable at a time.

Bivariate statistical test = hypothesis tests two variables.

15-43
Hypothesis Testing

Hypothesis . . . a preconceived notion that is empirically testable but unproven, and developed in order to explain phenomena.

15-44
Hypothesis Testing
Null Hypothesis (H0): a statement that asserts the status quo.
Alternative Hypothesis (H1): a statement that is the opposite of the null hypothesis, that the difference in reality is not simply due to random error. Represents the condition desired.
Null hypothesis is accepted: there is no change in the status quo.
Null hypothesis is rejected: the alternative hypothesis is accepted and the conclusion is that there has been a change in opinions or actions.
Null hypothesis refers to a population parameter, not a sample statistic.

15-45
Hypothesis Testing
Independent samples: two or more groups of respondents that are tested as though they may come from different populations (independent samples t-test).
Related samples: two or more groups of respondents that originated from the same population (paired samples t-test).
Paired samples: questions are independent but respondents are the same.

15-46
Hypothesis Testing
First step: develop the hypotheses that are to be tested . . .
Developed prior to the collection of data.
Developed as part of a research plan.
Make comparisons between two groups of respondents to determine if there are important differences between the groups.
Important considerations in hypothesis testing are:
Magnitude of the difference between the means.
Size of the sample used to calculate the means.

15-47
Hypothesis Testing
Statistical Significance
Inference regarding the population
Type I Error: made by rejecting the null hypothesis when it is true; its probability is alpha (α).
Level of Significance: .10, .05, or .01

15-48
Hypothesis Testing

Type II Error: failing to reject the null hypothesis when the alternative hypothesis is true; its probability is beta (β).
Unlike alpha (α), which is specified by the researcher, beta (β) depends on the actual population parameter.
Type I and Type II errors: sample size can help control these errors.
Can select an alpha (α) and the sample size in order to increase the power of the test (1 − β) and so reduce beta (β).
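
How alpha and sample size together drive beta and power can be sketched for a simple case. The setup is an assumption, not something the slides specify: a one-sided z-test of a mean with a known population standard deviation, where `mu0` is the null value and `mu1` is a hypothetical true mean. The function name and all numbers are illustrative.

```python
import math
from statistics import NormalDist


def beta_and_power(mu0, mu1, sigma, n, alpha):
    """Beta and power for a one-sided z-test of H0: mean = mu0 vs H1: mean > mu0.

    Assumes the population standard deviation sigma is known and that the
    true mean is mu1 (both are assumptions of this sketch).
    """
    standard_error = sigma / math.sqrt(n)
    # Sample means above this cutoff lead to rejecting H0 at level alpha.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Beta: chance the sample mean falls below the cutoff when mu1 is true.
    beta = NormalDist(mu1, standard_error).cdf(critical_mean)
    return beta, 1 - beta


beta_small, power_small = beta_and_power(4.0, 4.5, 1.0, n=25, alpha=0.05)
beta_large, power_large = beta_and_power(4.0, 4.5, 1.0, n=100, alpha=0.05)
```

With alpha held fixed, the larger sample yields a smaller beta and therefore a more powerful test.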

15-49
Analyzing Relationships of
Sample Data

Univariate Tests of Significance . . . involve hypothesis testing using one variable at a time.

t-test . . . if sample size <30 and the standard deviation is unknown, assumption of a normal distribution is not valid, use t-test.

z-test . . . if sample size >30 and the standard deviation is unknown, use z-test.
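
A minimal sketch of the univariate t statistic for a small sample with unknown standard deviation. The ratings and the hypothesized mean of 4.0 are made up, and `one_sample_t` is a hypothetical helper. Judging significance still requires comparing the result with a critical value from the t-distribution for the given degrees of freedom, which is not shown here.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """Return the t statistic and degrees of freedom for one sample."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)       # estimated from the sample
    standard_error = sample_sd / math.sqrt(n)  # standard error of the mean
    t_value = (sample_mean - hypothesized_mean) / standard_error
    return t_value, n - 1


# Hypothetical ratings from 8 respondents, tested against a mean of 4.0.
price_ratings = [4, 5, 3, 4, 6, 5, 4, 5]
t_value, degrees_of_freedom = one_sample_t(price_ratings, 4.0)
```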
15-50
Analyzing Relationships
of Sample Data

Univariate and Bivariate t-tests . . . require interval or ratio data.

Bivariate t-test . . . assumption is the samples are drawn from populations with normal distributions and the variances of the populations are equal.

15-51
Univariate Hypothesis Test Using X16: Reasonable Prices

15-52
Analyzing Relationships of
Sample Data

Bivariate Hypothesis . . . more than one group is involved.

Null Hypothesis . . . there is no difference between the group means: $\mu_1 = \mu_2$, or equivalently $\mu_1 - \mu_2 = 0$.

15-53
Analyzing Relationships
of Sample Data

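The formula for calculating the t value is . . .

$$t = \frac{\bar{x}_1 - \bar{x}_2}{S_{\bar{x}_1 - \bar{x}_2}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the two sample means and $S_{\bar{x}_1 - \bar{x}_2}$ is the standard error of the difference between the means.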
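
The calculation can be sketched as follows. One assumption is made explicit: the standard error of the difference is computed from a pooled variance, which matches the equal-variance assumption stated for the bivariate t-test but is not spelled out on the slide. The two groups of ratings are hypothetical.

```python
import math
import statistics


def independent_samples_t(group1, group2):
    """t statistic for two independent samples using a pooled variance."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    # Pooled variance: weighted average of the two sample variances.
    pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    # Standard error of the difference between the two means.
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Hypothetical satisfaction ratings from two groups of respondents.
group_a = [5, 6, 7, 6, 6]
group_b = [4, 5, 4, 5, 4]
t_value, degrees_of_freedom = independent_samples_t(group_a, group_b)
```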

15-54
Bivariate Statistical Tests

Cross-tabulation is useful for examining relationships and reporting the findings for two variables. The purpose of cross-tabulation is to determine if differences exist between subgroups of the total sample.

15-55
Dialog Boxes for Crosstab

15-56
Example of a Cross-Tabulation: Gender by Ad Recall

15-57
Chi-Square (χ²) Analysis

. . . test for significance between the frequency distributions of two or more nominally scaled variables in a cross-tabulation table to determine if there is any association.

15-58
Chi-Square (χ²) Analysis

Assesses how closely the observed frequencies fit the pattern of the expected frequencies and is referred to as a goodness-of-fit test.

Used to analyze nominal data which cannot be analyzed with other types of statistical analysis, such as ANOVA or t-tests.

Results will be distorted if more than 20 percent of the cells have an expected count of less than 5.
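
A rough sketch of the chi-square calculation for a cross-tabulation table, including the share of cells with an expected count below 5. The observed counts are hypothetical and `chi_square` is an illustrative helper, not a library function. The resulting statistic would still need to be compared with a chi-square critical value for the given degrees of freedom.

```python
def chi_square(observed):
    """Chi-square statistic, degrees of freedom, and share of small cells.

    `observed` is a list of rows of observed counts from a cross-tabulation.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected count for a cell = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    n_cells = len(row_totals) * len(col_totals)
    share_small = sum(1 for row in expected for e in row if e < 5) / n_cells
    return statistic, df, share_small


# Hypothetical 2 x 2 table: gender (rows) by ad recall (columns).
observed_counts = [[30, 20], [20, 30]]
chi2, df, share_small = chi_square(observed_counts)
```

A `share_small` above 0.20 would signal the distortion problem noted above.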

15-59
Chi-Square Analysis
Examples of Research Questions

Is usage of the Internet (low, moderate and high) related to gender?

Does frequency of eating out (infrequent, moderately frequent, and very frequent) differ between males and females?

Do part-time and full-time workers differ in terms of how often they are absent from work (seldom, occasionally, frequently)?

Do college students and high school students differ in their preference for Coke versus Pepsi?

15-60
SPSS Chi-Square Crosstab Example

15-61
Analyzing Relationships of
Sample Data

Independent Samples Example . . . interviews with male and female coffee drinkers.

Related Samples Example . . . interviews of only female students and comparing number of Cokes consumed versus number of cups of coffee.

15-62
Analyzing Data Relationships
Requirements for ANOVA
Dependent variable can be either interval or ratio scaled.
Independent variable is categorical.

Null hypothesis for ANOVA states there is no difference between the groups; the null hypothesis is . . .
$\mu_1 = \mu_2 = \mu_3$

15-63
ANOVA

Determining Statistical Significance

F-test: used to statistically evaluate the differences between the group means.

Total variance: separated into between-group and within-group variance.

15-64
ANOVA Testing Statistical Significance

. . . Examines the ratio of two components of total variance and is calculated as shown below . . .
Based on the F-distribution . . .

$$F\ \text{ratio} = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$

The larger the F ratio . . .
. . . the larger the difference in the variance between groups.
. . . implies significant differences between the groups.
. . . the more likely the null hypothesis will be rejected.
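
The F ratio can be worked through by hand for a one-way ANOVA. This sketch uses three hypothetical groups of ratings; the function name is illustrative, and the variances are the usual mean squares (sum of squares divided by degrees of freedom).

```python
import statistics


def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA, plus between and within degrees of freedom."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)
    group_means = [statistics.mean(group) for group in groups]

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-group sum of squares: values around their own group mean.
    ss_within = sum(
        (v - mean) ** 2
        for group, mean in zip(groups, group_means)
        for v in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical ratings from three customer groups.
ratings_by_group = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova_f(ratings_by_group)
```

The third group's mean sits well apart from the other two, so the between-group variance dominates and the F ratio is large.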

15-65
Analyzing Relationships of Sample Data
ANOVA cannot identify which pairs of means are significantly different from each other.
Must perform follow-up tests to identify the means that are statistically different from each other, including:
Scheffé
Tukey, Duncan and Dunn

15-66
Analyzing Data Relationships

ANOVA (analysis of variance) . . . determines if three or more means are statistically different from each other (single dependent variable).

MANOVA (multivariate analysis of variance) . . . same as ANOVA but multiple dependent variables can be analyzed together.
15-67
15-68
Perceptual Maps

. . . have a vertical and a horizontal axis that are labeled with descriptive adjectives.

To develop perceptual maps, researchers can use rankings, mean ratings, and multivariate methods.

15-69
Perceptual Mapping

Applications in Marketing Research

New product development
Image development
Advertising
Distribution

15-70
