Professional Documents
Culture Documents
Samples
Who = Population:
all individuals of interest
What = Parameter
Characteristic of population
Types of Sampling
Representative Sample
Sample should be representative
of the target population
Random sampling
Types of Statistics
Descriptive statistics:
Inferential statistics:
Describe characteristics
Test hypothesis, Make conclusions,
organize, summarize, condense data
interpret data, understand
relations
DESCRIPTIVE STATISTICS
INFERENTIAL STATISTICS
Some Definitions:
Construct:
abstract, theoretical, hypothetical
cant observe/measure directly
Variable: reflects
IQ
Intelligence
Vocabulary
Achievement
construct, but is directly measurable and can differ from subject to subject
(not a constant). Variables can be Discrete or Continuous.
Operational
Definition:
WISC
SAT Vocab
Test
Grades
concrete, measurable
Defines variable by specific operations used to measure it
Types of Variables
Quantitative
Measured in
amounts
Ht, Wt, Test score
Discrete:
separate categories
Letter grade
Qualitative
Measured in categories
Gender, race, diagnosis
Continuous:
infinite values in
between
GPA
Scales of Measurement
Nominal Scale: Categories, labels, data carry no numerical
value
Sigma Notation
= summation
X = Variable
What do each of these Mean?
X
X + 2 versus (X + 2)
X2 versus (X)2
(X + 2)2 versus (X2 + 2)
Types of Research
Correlational Method
No manipulation: just observe 2+
Also called:
Descriptive
experimental
Naturalistic
Survey design
NonObservational
Experimental Method
True experiments:
IV
Levels, conditions,
treatments
DV
outcome
(observe level)
Control
Random assignment equivalent groups
Experimental Validity
Internal Validity:
Is the experiment free from
confounding?
External Validity:
How well can the results be generalized
to other situations?
Quasi-experimental
designs
IV not manipulated
no random assignment
Research Settings
Laboratory Studies:
Advantages:
Control over situation (over IV, random assignment,
etc)
Provides sensitive measurement of DV
Disadvantages:
Subjects know they are being studied, leading to
demand characteristics
Limitations on manipulations
Questions of reality & generalizability
Research Settings
Field Studies:
Advantages:
Minimizes suspicion
Access to different subject populations
Can study more powerful variables
Disadvantages:
Lack of control over situation
Random Assignment may not be possible
May be difficult to get pure measures of the DV
Can be more costly and awkward
Frequency Distributions
Frequency Distributions
After collecting data, the first task for a
Regular Frequency
Distribution
Grouped Frequency
Distribution
Relative
Class Interval Frequency Frequency
100 to <115
2
0.025
115 to <130
10
0.127
130 to <145
21
0.266
145 to <160
15
0.190
160 to <175
15
0.190
175 to <190
8
0.101
190 to <205
3
0.038
205 to <220
1
0.013
220 to <235
2
0.025
235 to <250
2
0.025
79
1.000
Frequency Distribution
Graphs
Histograms
In a histogram, a bar is centered above
each score (or class interval) so that the
height of the bar corresponds to the
frequency and the width extends to the
real limits, so that adjacent bars touch.
Polygons
In a polygon, a dot is centered above
each score so that the height of the dot
corresponds to the frequency. The dots
are then connected by straight lines. An
additional line is drawn at each end to
bring the graph back to a zero frequency.
Bar graphs
When the score categories (X values) are
measurements from a nominal or an
ordinal scale, the graph should be a bar
graph.
A bar graph is just like a histogram except
that gaps or spaces are left between
adjacent bars.
Smooth curve
If the scores in the population are
measured on an interval or ratio scale, it is
customary to present the distribution as a
smooth curve rather than a jagged
histogram or polygon.
The smooth curve emphasizes the fact
that the distribution is not showing the
exact frequency for each category.
Frequency distribution
graphs
Shape
A graph shows the shape of the distribution.
A distribution is symmetrical if the left side of
Characteristics of z Scores
Z scores tell you the number of standard deviation
size n
The larger the n, the less variability there will be
The sample means will cluster around the population
mean
The distribution will be normal if the distribution of the
population is normal
Even if the population is not normally distributed, the
distribution of sample means will be normal when n > 30
What is a Sampling
Distribution?
Formulating Hypotheses
The Null Hypothesis (H0)
Differences between means are due only to chance
fluctuation
Type II Errors
Statistical Power
How sensitive is a test to detecting real
effects?
A powerful test decreases the chances of
making a Type II Error
Ways of Increasing Power:
Assumptions of Parametric
Hypothesis Tests (z, t, anova)
Disadvantages of Independent
Sample Designs
Usually requires more subjects (larger n)
The effect of a variable cannot be assessed for each
individual, but only for groups as a whole
There will be more individual differences between
groups, resulting in more variability
Advantages of Paired-Sample
Designs
Planned Comparisons
Can be done with t tests (must be few in number)
Doyouknow
populationSD?
Yes
No
UseZ
Test
Arethereonly
2groupsto
Compare?
Yes
Only2
No
Morethan
2
No
Morethan
2Groups
Yes
Only2
Groups
Use
ANOVA
Doyouhave
Independentdata?
Yes
No
IfFnot
Significant,
RetainNull
IfFis
Significant,
RejectNull
IfFnot
Significant,
RetainNull
Doyouhave
Independentdata?
Yes
Use
Independent
Sample
Ttest
IfFis
Significant,
RejectNull
No
UsePaired
Sample
Ttest
CompareMeans
WithMultiple
ComparisonTests
Use
Independent
Sample
Ttest
UsePaired
Sample
Ttest
Isttest
Significant?
Yes
Compare
means
Reject
Null
Hypothesis
No
Retain
Null
Hypothesis
Correlational Method
No manipulation: just observe 2+
Also called:
Descriptive
experimental
Naturalistic
Survey design
NonObservational
Assessing Reliability
Consistency over time, across raters, etc
Hypothesis Testing
Correlation Coefficients
Can range from -1.0 to +1.0
The DIRECTION of a relationship is indicated by the sign
of the coefficient (i.e., positive vs. negative)
The STRENGTH of the relationship is indicated by how
closely the number approaches -1.0 or +1.0
The size of the correlation coefficient indicates the
degree to which the points on a scatterplot approximate
a straight line
As correlations increase, standard error of estimate gets smaller
& prediction becomes more accurate
Introduction to Regression
In any scatterplot, there is a line that provides the best
fit for the data
This line identifies the central tendency of the data and it can be
used to make predictions in the following form:
Y = bx + a
b is the slope of the line, and a is the Y intercept (the value of Y when X = 0)
Simple Regression
Discovers the regression line that provides
the best possible prediction (line of best
fit)
Tells you if the predictor variable is a
significant predictor
Tells you exactly how much of the
variance the predictor variable accounts
for
Multiple Regression
Gives you an equation that tells you how
well multiple variables predict a target
variable in combination with each other.
Nonparametric Statistics
Used when the assumptions for a
parametric test have not been met:
Data not on an interval or ratio scale
Observations not drawn from a normally
distributed population
Variance in groups being compared is not
homogeneous
Chi-Square test is the most commonly used
when nominal level data is collected