22-12-2014
Key Concepts
Construct (or Concept or Variable)
Measurement
Scaling
Scale Characteristics
Description
By description, we mean the unique labels or
descriptors that are used to designate each
value of the scale. All scales possess
description.
Order
By order, we mean the relative sizes or
positions of the descriptors. Order is denoted
by descriptors such as greater than, less than,
and equal to.
AMR @ Dr. Vikas Goyal
Scale Characteristics
Distance
The characteristic of distance means that
absolute differences between the scale
descriptors are known and may be expressed
in units.
Origin
The origin characteristic means that the scale
has a unique or fixed beginning or true zero
point.
Scaling
Type of scale depends on type of data!
Type of Scale: Nominal, Ordinal, Interval, Ratio (information content increases from Nominal to Ratio)
Comparative Scales: Paired Comparison, Rank Order, Constant Sum
Continuous Rating Scales
Itemized Rating Scales: Likert, Semantic Differential, Stapel
SPSS-Variable View
Column: What it Means
Decimals: This column allows you to control the number of digits shown after the decimal point.
Label: This column allows you to provide a more extensive description of the variable.
Values: This column allows you to provide a key for what the numbers of a numeric variable represent (e.g., 1=Female, 2=Male).
Missing: This column allows you to indicate which values of a variable should be treated as missing. Values marked as missing are excluded from analyses in SPSS.
Columns: This column sets the display width of the variable's column in the Data View.
Align: This column indicates the alignment of the variable in the Data View.
Measure: This last column indicates the level of measurement of the variable. There are three from which you can choose: Nominal, Ordinal, and Scale.
Data Preparation
Missing Value Treatment
User-defined missing values
System-missing values
Coding
Pre-coded
Coding open-ended questions
Re-coding
Compute Variable
Sub-setting data
Select-if
Split file
Frequency Distribution
In a frequency distribution, one variable is
considered at a time.
A frequency distribution for a variable produces
a table of frequency counts, percentages, and
cumulative percentages for all the values
associated with that variable.
Frequency Distribution (single variable all levels)
for descriptive stats of data
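As a rough illustration of the table described above, the following Python sketch builds counts, percentages, and cumulative percentages for a single variable. The function name `frequency_table` and the ratings data are hypothetical and are not part of the course material.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent), sorted by value."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100.0 * counts[value] / total
        cumulative += percent
        rows.append((value, counts[value], round(percent, 1), round(cumulative, 1)))
    return rows


# Hypothetical familiarity ratings on a 1-5 scale (illustrative data only).
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

print("value count percent cum_percent")
for value, count, percent, cum_percent in frequency_table(ratings):
    print(f"{value:>5} {count:>5} {percent:>7.1f} {cum_percent:>11.1f}")
```

The last row's cumulative percentage should always reach 100, which is a quick sanity check on the table.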
Measures of Variability
Range (Largest-Smallest)
Deviation from the mean
Variance: mean squared deviation from the mean
Std. Deviation (s): square root of the variance
Coefficient of Variation (s/mean)
Unitless & expressed as %
Measure of relative variability (can be used in segmentation)
Measures of Shape
Skewness
Kurtosis (zero for normal)
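The measures listed above can be computed in a few lines. The sketch below is only illustrative: the data are made up, the function name `describe` is hypothetical, and skewness and kurtosis use simple moment formulas (divisor n), so statistical packages that apply small-sample corrections may report slightly different values.

```python
import statistics


def describe(values):
    """Return basic measures of variability and shape for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance (divisor n - 1)
    std_dev = statistics.stdev(values)      # square root of the sample variance

    # Central moments with divisor n, used for the shape measures.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n

    return {
        "range": max(values) - min(values),
        "variance": variance,
        "std_dev": std_dev,
        "cv_percent": 100.0 * std_dev / mean,   # unitless, expressed as %
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # zero for a normal distribution
    }


# Hypothetical data (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in describe(data).items():
    print(f"{name}: {value:.3f}")
```

Because the coefficient of variation is unitless, it can be compared across variables measured in different units.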
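Variable                           Population            Sample
Mean                               $\mu$                 $\bar{X}$
Proportion                         $\pi$                 $p$
Variance                           $\sigma^2$            $s^2$
Standard deviation                 $\sigma$              $s$
Size                               $N$                   $n$
Standard error of the mean         $\sigma_{\bar{X}}$    $s_{\bar{X}}$
Standard error of the proportion   $\sigma_p$            $s_p$
Standardized variate (z)           $(X - \mu)/\sigma$    $(X - \bar{X})/s$
Coefficient of variation (CV)      $\sigma/\mu$          $s/\bar{X}$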
Cross-Tabulation
While a frequency distribution describes one variable at a
time, a cross-tabulation describes two or more variables
simultaneously.
Cross-tabulation results in tables that reflect the joint
distribution of two or more variables with a limited number
of categories or distinct values.
Cross Tab (multiple variables all levels) for exploring interdependence of variables, for example:
How many brand-loyal customers are males?
Is product ownership related to income levels?
Is familiarity with the new product related to age and
education levels?
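A cross-tabulation of two categorical variables amounts to counting joint occurrences and adding row and column totals. The following is a minimal sketch under assumed data: the `crosstab` helper, the field names, and the customer records are all hypothetical.

```python
from collections import Counter


def crosstab(records, row_key, col_key):
    """Count joint occurrences of two categorical fields, plus marginal totals."""
    cells = Counter((r[row_key], r[col_key]) for r in records)
    row_totals = Counter(r[row_key] for r in records)
    col_totals = Counter(r[col_key] for r in records)
    return cells, row_totals, col_totals


# Hypothetical survey records (illustrative only).
customers = (
    [{"gender": "Male", "brand_loyal": "Yes"}] * 3
    + [{"gender": "Male", "brand_loyal": "No"}] * 2
    + [{"gender": "Female", "brand_loyal": "Yes"}] * 4
    + [{"gender": "Female", "brand_loyal": "No"}] * 1
)

cells, row_totals, col_totals = crosstab(customers, "brand_loyal", "gender")

# Print the table with row totals and a final column-total line.
genders = sorted(col_totals)
print("brand_loyal", *genders, "Row Total")
for loyal in sorted(row_totals):
    print(loyal, *(cells[(loyal, g)] for g in genders), row_totals[loyal])
print("Column Total", *(col_totals[g] for g in genders), sum(row_totals.values()))
```

Reading the "Yes" row answers the first question above (how many brand-loyal customers are males) directly from the joint counts.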
Male
Female
Row
Total
10
15
10
15
Column Total
15
15
Pet Adoption by Gender
Pet             Male    Female    Row Total
Dog             10      50        60
Cat             30      10        40
Column Total    40      60        100
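Chi-Square
Chi-Squared Test
Tests the statistical significance of the observed association in a cross-tabulation, i.e. whether the association reflects a systematic relationship rather than random chance.
Chi-Square stats
$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$
$f_e = \frac{n_r n_c}{n}$
where
$f_o$ = observed cell frequency
$f_e$ = expected cell frequency
$n_r$ = total number in the row
$n_c$ = total number in the column
$n$ = total sample size
Contingency coefficient:
$C = \sqrt{\chi^2 / (\chi^2 + n)}$
Measure of the strength of association between the
variables
Ranges from 0 to 1 (in practice its maximum is below 1 and depends on the table size)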
Chi-Square stats
Cramér's V: $V = \sqrt{\frac{\chi^2 / n}{\min(r-1,\ c-1)}}$
Measures strength of association, for any sized
table
Ranges from 0 to 1
Phi-Coefficient: $\phi = \sqrt{\chi^2 / n}$
Degrees of Freedom: $df = (r-1)(c-1)$
Chi-Square stats
Chi-Square = 34.0278
Contingency coefficient = 0.5038
Cramér's V = 0.5833
Phi-Coefficient = 0.5833
DOF = (2-1)*(2-1) = 1
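The figures above can be reproduced from the pet adoption counts (Dog: 10 and 50; Cat: 30 and 10). The sketch below is one way to do it in plain Python; the function name `chi_square_stats` is hypothetical, and the code assumes every expected cell frequency is greater than zero.

```python
import math


def chi_square_stats(table):
    """Chi-square and association measures for a table of observed counts.

    `table` is a list of rows; assumes no expected frequency is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # f_e = n_r * n_c / n
            chi2 += (observed - expected) ** 2 / expected

    r, c = len(table), len(table[0])
    return {
        "chi2": chi2,
        "df": (r - 1) * (c - 1),
        "phi": math.sqrt(chi2 / n),
        "contingency_c": math.sqrt(chi2 / (chi2 + n)),
        "cramers_v": math.sqrt((chi2 / n) / min(r - 1, c - 1)),
    }


# Observed counts from the pet adoption table: rows Dog, Cat; columns by gender.
observed = [[10, 50], [30, 10]]
for name, value in chi_square_stats(observed).items():
    print(f"{name}: {value:.4f}")
```

For a 2 x 2 table, min(r-1, c-1) is 1, which is why Cramér's V and the phi coefficient coincide in this example.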
Chi-square critical values by Degrees of Freedom (df) and Probability (p)
df    0.95    0.90    0.80    0.70    0.50    0.30    0.20    0.10    0.05    0.01    0.001
1     0.004   0.02    0.06    0.15    0.46    1.07    1.64    2.71    3.84    6.64    10.83
2     0.10    0.21    0.45    0.71    1.39    2.41    3.22    4.60    5.99    9.21    13.82
3     0.35    0.58    1.01    1.42    2.37    3.66    4.64    6.25    7.82    11.34   16.27
4     0.71    1.06    1.65    2.20    3.36    4.88    5.99    7.78    9.49    13.28   18.47
5     1.14    1.61    2.34    3.00    4.35    6.06    7.29    9.24    11.07   15.09   20.52
6     1.63    2.20    3.07    3.83    5.35    7.23    8.56    10.64   12.59   16.81   22.46
7     2.17    2.83    3.82    4.67    6.35    8.38    9.80    12.02   14.07   18.48   24.32
8     2.73    3.49    4.59    5.53    7.34    9.52    11.03   13.36   15.51   20.09   26.12
9     3.32    4.17    5.38    6.39    8.34    10.66   12.24   14.68   16.92   21.67   27.88
10    3.94    4.86    6.18    7.27    9.34    11.78   13.44   15.99   18.31   23.21   29.59
Nonsignificant: a chi-square value below the p = 0.05 column for its df
Significant: a chi-square value at or above the p = 0.05 column (p = 0.05, 0.01, 0.001)
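Conclusion
As Chi-Square (34.03) > threshold value (3.84, at p = 0.05 and df = 1),
there is a statistically significant relationship between
gender and pet adoption behaviour.
The strength of this relationship, measured by Cramér's V (equal to the phi coefficient for a 2 x 2 table), is about
0.58 on a scale from 0 to 1.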
Session 4:
Statistical Hypothesis Testing
Determine Probability Associated with Test Statistic
Compare with Level of Significance, $\alpha$
Example
A firm is testing the effect of a new kind of Sales
Promotion (SP) on sales. The firm offered
the SP in 100 stores and recorded their levels of
sales.
It is known that the mean level of sales without
the SP = 61 Cr./month
The obtained results:
The mean sales with SP across 100 stores = 66.7 Cr./month
Sample Std. Deviation = 18.69 Cr./month
Did the SP have an effect on sales?
The Error
When we draw inferences about a population based
on a sample, there is a risk of making two types
of errors.
Type I Error
Type I error occurs when the sample results lead
to the rejection of the null hypothesis when it is
in fact true.
The probability of type I error ($\alpha$) is also called
the level of significance.
The Error
Type II Error
Type II error occurs when, based on the sample
results, the null hypothesis is not rejected when it
is in fact false.
The probability of type II error is denoted by $\beta$.
Unlike $\alpha$, which is specified by the researcher, the
magnitude of $\beta$ depends on the actual value of
the population parameter (e.g. the mean or proportion).
The risk of both $\alpha$ and $\beta$ can be controlled by
increasing the sample size.
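The link between sample size and the type II error can be illustrated numerically. The sketch below makes several assumptions that are not in the slides: a one-sided (upper-tail) z test with a known standard deviation, a fixed level of significance of 0.05, and a true mean above the hypothesised one. The values 61, 66.7 and 18.69 are borrowed from the sales promotion example purely for illustration; treating 66.7 as the true population mean is an assumption. The function name `type_ii_error` is hypothetical.

```python
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """Beta for an upper-tail z test of H0: mu = mu0, assuming sigma is known
    and the true mean mu_true is greater than mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)          # rejection threshold for Z
    shift = (mu_true - mu0) / (sigma / n ** 0.5)    # mean of Z under mu_true
    # H0 is (wrongly) retained whenever Z falls at or below the threshold.
    return std_normal.cdf(z_crit - shift)


# With alpha held fixed, beta shrinks as the sample size grows.
for n in (10, 25, 50, 100):
    beta = type_ii_error(mu0=61, mu_true=66.7, sigma=18.69, n=n)
    print(f"n = {n:>3}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Here the level of significance is held constant, so the gain from a larger sample shows up entirely as a lower type II error (higher power).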
Hypothesis Testing
Parametric tests assume that the variables of interest are
measured on at least an interval scale.
Nonparametric tests assume that the variables are
measured on a nominal or ordinal scale.
These tests can be further classified based on whether one
or two or more samples are involved.
Parametric Tests
(Metric Tests)
One Sample
* t test
* Z test
Two or More Samples
Independent Samples
* Two-Group t test (Mean)
* Z test (Proportion)
Paired Samples
* Paired t test
Non-parametric Tests
(Nonmetric Tests)
One Sample
* Chi-Square
* K-S
* Runs
* Binomial
Two or More Samples
Independent Samples
* Chi-Square
* Mann-Whitney
* Median
* K-S
Paired Samples
* Sign
* Wilcoxon
* McNemar
* Chi-Square
One Sample
H0: SP had no effect on sales, i.e.
mean sales = 61 (even when the SP is offered)
$H_0: \mu = 61.0$
$H_1: \mu \neq 61.0$
$t = (\bar{X} - \mu) / s_{\bar{X}}$
$s_{\bar{X}} = s / \sqrt{n}$
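Plugging the sales promotion figures into the one-sample formulas (sample mean 66.7, hypothesised mean 61, s = 18.69, n = 100) gives the test statistic directly. The sketch below is only a worked calculation; the function name `one_sample_t` is hypothetical.

```python
import math


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """Return the one-sample t statistic and its degrees of freedom."""
    standard_error = sample_sd / math.sqrt(n)   # s / sqrt(n)
    t = (sample_mean - mu0) / standard_error    # (X-bar - mu) / standard error
    return t, n - 1


t, df = one_sample_t(sample_mean=66.7, mu0=61.0, sample_sd=18.69, n=100)
print(f"t = {t:.2f}, df = {df}")  # t = 3.05, df = 99
```

A t value of about 3.05 with 99 degrees of freedom is well above the two-tailed 5% critical value of roughly 1.98, so under these assumptions H0 would be rejected.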
F-test
Two sample test:
F-test: this is performed when it is not known whether the two
groups have equal variance. For example, it checks whether the
variance of male and female respondents on the variable in
question (e.g. internet usage) is the same.
H0: The two groups have equal variance
H1: The two groups DO NOT have equal variance
If the F-test comes out to be significant, H0 can be rejected,
i.e. the two groups do not have equal variance. In that case,
take the t-test results under the "Equal variances not assumed" heading.
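The idea behind the variance comparison can be sketched as a simple ratio of sample variances. This is an illustration only: the data and the function name `variance_ratio_f` are hypothetical, and SPSS itself reports Levene's test for equality of variances in its independent-samples t-test output rather than this plain ratio. The resulting F value would be compared with a critical value from an F table for the given degrees of freedom.

```python
import statistics


def variance_ratio_f(group_a, group_b):
    """Return (F, df_numerator, df_denominator), larger sample variance on top."""
    var_a = statistics.variance(group_a)  # sample variance (divisor n - 1)
    var_b = statistics.variance(group_b)
    if var_a >= var_b:
        return var_a / var_b, len(group_a) - 1, len(group_b) - 1
    return var_b / var_a, len(group_b) - 1, len(group_a) - 1


# Hypothetical internet usage (hours per week) for two groups; illustrative only.
male_usage = [2, 4, 6, 8, 10]
female_usage = [4, 5, 6, 7, 8]

f_value, df1, df2 = variance_ratio_f(male_usage, female_usage)
print(f"F = {f_value:.2f} with ({df1}, {df2}) degrees of freedom")
```

Putting the larger variance in the numerator keeps F at or above 1, so only the upper tail of the F distribution needs to be consulted.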
SPSS Example
Please Run the following Parametric Tests:
Determine if the mean level of familiarity with internet
is more than 4.0
Determine if the internet usage is significantly different
for male and female
Determine if the mean level of familiarity with internet
is significantly different for male and female
Determine if the respondents significantly differed in
their attitude toward the Internet and attitude toward
technology.
Determine if the male and female respondents
significantly differed in their attitude toward the
Internet and attitude toward technology.