AMR Concept Note-1 (Freq Dist, Cross Tab, T-Test and ANOVA)

ADVANCED MARKETING RESEARCH
Concept Note-I
Dr. VIKAS GOYAL

For class circulation only

A frequency distribution is a convenient way of looking at the consolidated values of a variable.
In a frequency distribution, one variable is considered at a time.
A frequency distribution for a variable produces a table of frequency counts, percentages, and cumulative
percentages for all the values associated with that variable.
A frequency distribution is a tool for organizing data. We use it to group data into categories and show the
number of observations in each category.
Frequency distribution also indicates the extent of valid responses.
Frequency distribution is a mathematical distribution whose objective is to obtain a count of the number
of responses associated with different values of one variable and to express these counts in % terms.
A supplementary extension of the frequency distribution is to look at the percentage of values that fall
in each category. This is called a relative frequency distribution or percent frequency distribution.
The frequency data may be used to construct a histogram, or a vertical bar chart, in which the values of
the variable are portrayed along the X-axis and the absolute or relative frequencies of the values are
placed along the Y-axis.
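As a rough illustration of the frequency table described above, the following Python sketch tabulates counts, percentages, and cumulative percentages for a single variable. The data, the function name frequency_table, and the convention of marking a missing response as None are assumptions made for this example only.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent).

    Missing responses (None) are excluded, so percentages are based
    on valid responses only. This convention is an assumption.
    """
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    n = len(valid)
    rows, cumulative = [], 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value, counts[value],
                     100 * counts[value] / n,
                     100 * cumulative / n))
    return rows

# Hypothetical answers on a 5-point scale; None marks a missing answer.
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, None]
table = frequency_table(responses)

print("value count percent cum_pct")
for value, count, pct, cum_pct in table:
    print(f"{value:>5} {count:>5} {pct:>7.1f} {cum_pct:>7.1f}")

valid_n = sum(row[1] for row in table)
print(f"valid responses: {valid_n} of {len(responses)}")
```

The last line reports how many responses were valid, which is the sense in which a frequency distribution indicates the extent of valid responses.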
Frequency distribution can help in assessing the following three characteristics of any variable:
o Measures of Location: Central tendency (Mean, Median, Mode)
o Measures of Variability: Range, Variance & Std. Deviation (s), Coefficient of Variation (s/mean)
o Measures of Shape: Skewness, Kurtosis (zero for normal)
Variance is the mean squared deviation of all the values from the mean.
Standard Deviation is the square root of the variance.
Coefficient of variation is a useful expression in sampling theory for the standard deviation as a % of the
mean.
Skewness and Kurtosis provide an idea of the shape of the distribution.
Kurtosis is a measure of the relative peakedness or flatness of the curve defined by the frequency
distribution. The Kurtosis of a normal distribution is zero (when reported as excess kurtosis, i.e. relative
to the normal curve).
Skewness is a characteristic of a distribution that assesses its symmetry about the mean.
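The measures of location, variability, and shape listed above can be computed in a few lines. The sketch below is illustrative only: the data are made up, the function name describe is hypothetical, the variance uses the sample formula (dividing by n - 1), and skewness and kurtosis use simple moment-based formulas. Statistical packages often apply small-sample corrections, so their output may differ slightly.

```python
import statistics

def describe(values):
    """Summarize location, variability, and shape for one variable."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    # Central moments (divide by n), used for the shape measures.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "std_dev": s,
        "cv": s / mean,                     # coefficient of variation
        "skewness": m3 / m2 ** 1.5,         # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }

# Hypothetical data for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(data)
for name, value in summary.items():
    print(f"{name:>16}: {value:.4f}")
```

A positive skewness here reflects the longer right tail of the made-up data; a negative excess kurtosis indicates a curve flatter than the normal.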
A null hypothesis is a statement of the status quo, one of no difference or no effect. The null hypothesis
can either be rejected or fail to be rejected, but cannot be accepted. The null hypothesis is always about
the population parameters rather than the sample statistics.
An alternative hypothesis is one in which some difference or effect is expected. Accepting the alternative
hypothesis will lead to changes in opinions or actions.
The test statistic measures how close the sample has come to the null hypothesis.
The test statistic often follows a well-known distribution, such as the normal, t, or chi-square distribution.
Type I error occurs when the sample results lead to the rejection of the null hypothesis when it is in fact
true. The probability of type I error (α) is also called the level of significance.
Type II error occurs when, based on the sample results, the null hypothesis is not rejected when it is in fact
false. The probability of type II error is denoted by β.
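To make the two error probabilities concrete, the sketch below computes the probability of a type II error for a one-sided z-test of a mean with known standard deviation. The choice of a z-test, the numbers, and the function name type_ii_error are assumptions for illustration; the note itself does not prescribe this setup.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha=0.05):
    """Probability of a type II error for H0: mu = mu0 vs H1: mu > mu0.

    Assumes a one-sided z-test with known sigma, where mu1 is the
    true population mean.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)    # rejection cutoff under H0
    shift = (mu1 - mu0) * n ** 0.5 / sigma    # true effect in standard errors
    return std_normal.cdf(z_crit - shift)     # chance of failing to reject

# Hypothetical values: H0 mean 100, true mean 105, sigma 15, n = 36.
beta = type_ii_error(mu0=100, mu1=105, sigma=15, n=36)
print(f"type I error probability  = {0.05:.2f}")
print(f"type II error probability = {beta:.3f}")   # about 0.36
```

Lowering the level of significance moves the cutoff further out, which reduces the type I error probability but raises the type II error probability for a given sample size.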
Cross-tabulation is a statistical technique that describes two or more variables simultaneously and results
in tables that reflect the joint distribution of two or more variables that have a limited number of
categories or distinct values.
A cross-tabulation is the merging of the frequency distribution of two or more variables in a single table.
Cross-tabulation with two variables is also known as Bivariate Cross-tabulation.
Cross-tabulation tables are also called contingency tables.
The chi-square statistic (χ²) is used to test the statistical significance of the observed association in a
cross-tabulation.
The chi-square distribution is a skewed distribution whose shape depends solely on the number of
degrees of freedom. As the number of degrees of freedom increases, the chi-square distribution becomes
more symmetrical.
Chi-square requires that you use raw frequency counts, not percentages or ratios.
Chi-square should not be calculated if the expected value in any category is less than 5.
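The chi-square computation for a cross-tabulation can be sketched as follows. The 2 x 2 table of counts and the function name chi_square are made up for illustration; expected counts are derived from the row and column totals under the assumption of no association.

```python
def chi_square(observed):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Hypothetical counts (not percentages): rows = group, columns = response.
observed = [[30, 20],
            [10, 40]]
stat, df, expected = chi_square(observed)
smallest_expected = min(min(row) for row in expected)

print(f"chi-square = {stat:.3f}, df = {df}")
print(f"smallest expected count = {smallest_expected:.1f}")
# With df = 1 the critical value at the 0.05 level is 3.841.
print("association is significant" if stat > 3.841 else "not significant")
```

The smallest expected count is printed so that the rule about expected values below 5 can be checked before the statistic is interpreted.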
The phi coefficient (φ) is used as a measure of the strength of association between the two variables, in
the special case of a table with two rows and two columns (a 2 x 2 table).
Cramér's V can be used to check the strength of the relationship in a cross-tabulation for a table of any size.
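Both strength-of-association measures are simple functions of the chi-square statistic. The sketch below uses the usual textbook formulas, phi = sqrt(chi-square / n) and V = sqrt(chi-square / (n * min(r - 1, c - 1))); the tables and function names are hypothetical.

```python
import math

def chi_square_stat(observed):
    """Chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum((o - r * c / n) ** 2 / (r * c / n)
               for row, r in zip(observed, row_totals)
               for o, c in zip(row, col_totals))

def phi_coefficient(observed):
    """Strength of association for a 2 x 2 table."""
    n = sum(sum(row) for row in observed)
    return math.sqrt(chi_square_stat(observed) / n)

def cramers_v(observed):
    """Strength of association for a table of any size."""
    n = sum(sum(row) for row in observed)
    smaller_dim = min(len(observed), len(observed[0])) - 1
    return math.sqrt(chi_square_stat(observed) / (n * smaller_dim))

# Hypothetical tables of counts.
two_by_two = [[30, 20], [10, 40]]
three_by_two = [[20, 10], [15, 15], [5, 35]]

phi = phi_coefficient(two_by_two)
v_small = cramers_v(two_by_two)
v_large = cramers_v(three_by_two)
print(f"phi = {phi:.3f}, V (2 x 2) = {v_small:.3f}, V (3 x 2) = {v_large:.3f}")
```

For a 2 x 2 table the two measures coincide, which is why phi is treated as the special case.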
t-tests are conducted to examine hypotheses about means.
A t-test can be conducted on the mean of one or two samples of observations.
t-tests are used to make inferences about the means of the parent populations.
Parametric tests are hypothesis-testing procedures that assume that the variables of interest are
measured on at least an interval scale.
Nonparametric tests are hypothesis-testing procedures that assume that the variables are measured on a
nominal or ordinal scale.
A single-sample t-test is performed when we wish to test a hypothesis about the mean of a variable
against an absolute number (say, that the mean height is greater than 4).
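A single-sample t statistic for the height example can be sketched as below. The sample values and the function name one_sample_t are invented for illustration; the statistic is the difference between the sample mean and the hypothesized value, divided by the standard error of the mean.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean = mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return (mean - mu0) / std_error, n - 1

# Hypothetical heights; H0: mean = 4, H1: mean > 4.
heights = [4.1, 4.3, 4.5, 4.7, 4.4]
t_stat, df = one_sample_t(heights, mu0=4)

print(f"t = {t_stat:.3f}, df = {df}")
# One-tailed critical t at the 0.05 level with 4 df is 2.132.
print("reject H0" if t_stat > 2.132 else "fail to reject H0")
```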
Two-sample t-tests are used for testing the difference in means for independent samples, paired
samples, or overlapping samples.
Two independent sample t-tests allow researchers to evaluate the mean difference between two
populations using the data from these two separate samples.
Two independent samples t-test is used when two separate sets of independent and identically
distributed samples are obtained, one from each of the two populations being compared.
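A minimal sketch of the two independent samples t statistic follows. It assumes equal population variances (the pooled-variance form); the data and the function name pooled_t are hypothetical.

```python
import math
import statistics

def pooled_t(sample_1, sample_2):
    """Return (t statistic, df) assuming equal population variances."""
    n1, n2 = len(sample_1), len(sample_2)
    var_1 = statistics.variance(sample_1)
    var_2 = statistics.variance(sample_2)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    return diff / std_error, n1 + n2 - 2

# Hypothetical scores from two separate groups of respondents.
group_a = [5, 6, 7, 8, 9]
group_b = [2, 3, 4, 5, 6]
t_stat, df = pooled_t(group_a, group_b)

print(f"t = {t_stat:.3f}, df = {df}")
# Two-tailed critical t at the 0.05 level with 8 df is 2.306.
print("means differ" if abs(t_stat) > 2.306 else "no significant difference")
```

If the equal-variance assumption is doubtful, a version of the test that does not pool the variances would be the safer choice.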
The paired samples t-test is used when the hypothesis needs to compare the means of two different
variables for a single population, without identifying separate groups. Thus, it is called the paired or
related sample t-test. It characteristically comprises a sample of matched pairs, or one group of units that
has been tested twice.
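The paired test reduces to a single-sample t-test on the differences within each pair, as the sketch below shows. The before and after scores and the function name paired_t are made up for illustration.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t statistic, df) for the mean of paired differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_error, n - 1

# Hypothetical scores for the same five units, measured twice.
before = [70, 65, 80, 75, 60]
after = [74, 68, 82, 80, 61]
t_stat, df = paired_t(before, after)

print(f"t = {t_stat:.3f}, df = {df}")
# Two-tailed critical t at the 0.05 level with 4 df is 2.776.
print("means differ" if abs(t_stat) > 2.776 else "no significant difference")
```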
Kolmogorov-Smirnov (K-S) one-sample test is a one-sample nonparametric goodness-of-fit test that
compares the cumulative distribution function for a variable with a specified distribution.
Mann-Whitney U Test is a statistical test for a variable measured on an ordinal scale, comparing the
difference in the location of two populations based on observations from two independent samples.
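The U statistic is built from the ranks of the pooled observations. The following sketch ranks the combined data (tied values share the average of their ranks) and derives U for each sample. The ratings and function names are hypothetical, and the significance lookup is left out.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(sample_1, sample_2):
    """Return (U for sample 1, U for sample 2)."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))
    rank_sum_1 = sum(ranks[:n1])
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    return u1, n1 * n2 - u1

# Hypothetical ordinal ratings from two independent samples.
ratings_a = [3, 4, 2, 5]
ratings_b = [1, 2, 1, 3]
u1, u2 = mann_whitney_u(ratings_a, ratings_b)
print(f"U1 = {u1}, U2 = {u2}, test statistic U = {min(u1, u2)}")
```

The smaller of the two U values is usually the one compared against the tabled critical value.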
Wilcoxon test can be used as the non-parametric equivalent for paired sample t-test.
Analysis of variance (ANOVA) & Analysis of Covariance (ANCOVA) are used for examining the differences
in the mean values of the dependent variable associated with the effect of independent variables with
more than 2 categories/levels or treatments. The dependent variable needs to be metric and the
independent variable needs to be categorical for ANOVA.
Analysis of variance (ANOVA) is a statistical technique for examining the differences among means for two
or more populations.
Treatment in ANOVA is a particular combination of factor levels or categories.
One-way ANOVA is a technique in which there is only one factor or independent variable.
SS_y is the total variation in Y, i.e. the total sum of squares. It is the sum of SS between groups and SS
within groups: SS_y = SS_between + SS_within.
SS_between, also denoted SS_x, is the variation in Y related to the variation in the mean level of the
different categories of X. This represents variation between the categories of X, or the portion of the sum
of squares in Y related to X.
SS_within, also referred to as SS_error, is the variation in Y due to the variation within each of the
categories of X.
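The decomposition of the total variation can be checked numerically. The sketch below computes the total, between-group, and within-group sums of squares and the F ratio for a one-way ANOVA; the three groups of data and the function name one_way_anova are invented for illustration.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, and F for one factor."""
    all_values = [y for group in groups for y in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    ss_within = sum((y - mean) ** 2
                    for group, mean in zip(groups, group_means)
                    for y in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "df": (df_between, df_within),
            "f": f_ratio}

# Hypothetical responses under three levels of one factor.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
result = one_way_anova(groups)

print(result)
# Critical F at the 0.05 level with (2, 6) df is about 5.14.
print("means differ" if result["f"] > 5.14 else "no significant difference")
```

In this example the between and within sums of squares add up to the total, as the decomposition requires.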
The strength of the effect of an individual X (independent variable or factor) on Y (dependent variable) is
measured by eta-squared (η²). The value of η² varies between 0 and 1.
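Eta-squared is the share of the total variation in Y that is related to X, i.e. the between-group sum of squares divided by the total sum of squares. A short sketch with made-up data and a hypothetical function name:

```python
def eta_squared(groups):
    """Between-group sum of squares as a proportion of the total."""
    all_values = [y for group in groups for y in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    ss_between = sum(len(group) * (sum(group) / len(group) - grand_mean) ** 2
                     for group in groups)
    return ss_between / ss_total

# Hypothetical responses under three levels of one factor.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
eta_sq = eta_squared(groups)
print(f"eta-squared = {eta_sq:.2f}")  # 0.80
```

A value of 0 means the group means are identical; a value of 1 means there is no variation within any group.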
N-way ANOVA is a model where two or more factors are involved.
Strength of relationship between individual factors and the dependent variable can be measured by using
omega-squared.
Analysis of covariance (ANCOVA) is an advanced analysis of variance procedure in which the effects of one
or more metric-scaled independent variables are removed from the dependent variable before conducting
the ANOVA. Metric independent variables in ANCOVA are treated as covariates.
The covariates are generally used as control variables in ANCOVA.
Multivariate analysis of variance (MANOVA) is similar to analysis of variance (ANOVA), except that
instead of one metric dependent variable, we have two or more.
A Classification of Hypothesis Testing Procedures for Examining Differences:

A Concept map for frequency distribution:

A Concept map for cross-tabulation:

A Concept map for conducting t-test

Relationship Amongst t Test, Analysis of Variance, Analysis of Covariance, & Regression:

A Concept map for One-Way ANOVA:
