
Psych 115

CH. 9: Basic Between-Subjects Design

Experimental Design: general structure of the experiment (not its specific content)
Determining design:
o Number of IVs
o Number of treatment cond.
o Same or different subjects
Between-Subjects Design: different subjects in each condition

Selecting and Recruiting Subjects


How accurately we can generalize depends on representativeness
Random sampling: supports external validity
Practical Limits
Convenience sample: really very low in representativeness
Individuals may share the same characteristics and attributes
Asking people you know
o Personal conflicts
o Not completely ignorant of hypo
o Sensitive to subtle cues to behave in
particular way
o Obliged to participate (ethical qs)
No way to guarantee truly random sample
Follow ethical guidelines and keep subjects
informed (people iffy about psych exp.)
To obtain subjects:
o Make appeal interesting, nonthreatening, meaningful
o Exp will help others
o Pay with token
o Assess if publicly or privately
How Many Subjects?
A between-subjects design should have more than two subjects
Subjects are not all the same
Individuals differ: larger samples needed
Larger samples represent the full range of behaviors on the DV
Behavior under different treatments should also be noticeably different
Did our IV have an effect?
Use stat procedures to decide whether differences are significant: are they due only to differences among the subjects to begin with, or to the IV?
Effect size: estimate of the size or magnitude of the treatment effect
Larger ES, fewer subjects needed to detect it
ES of IV strong: 10-20 subjects
Moderate: 20-30 subjects
A larger sample is no guarantee the exp will turn out as expected
Have at least 20 subjects

One IV: Two-Group Designs


One IV: simplest experiment
At least two treatment conditions: two-group
design
Two Independent Groups
Subjects placed through random assignment
Random Assignment
Every subj has equal chance of being placed
in a condition
Groups are independent: the makeup of one group has no effect on the other
Not randomly placed: confounding
Random assignment forms groups that are roughly the same on EVs
Individual differences are spread out across groups
RA guards against the possibility that subject variables will systematically change with the IV
Nonrandom selection affects external validity
Random assignment is critical to internal validity
Otherwise a selection threat may occur
Not sure if the IV caused the differences: they could just come from how subjects were placed
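A minimal sketch of random assignment for a two-group design, assuming 20 subjects identified by number. The function name, condition labels, and seed are illustrative assumptions, not part of the notes. Shuffling and then dealing subjects out gives every subject the same chance of landing in either condition.

```python
import random

def random_assignment(subjects, conditions=("experimental", "control"), seed=None):
    """Shuffle the subjects, then deal them out so group sizes differ by at most one."""
    rng = random.Random(seed)          # seed is only for a repeatable illustration
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, subject in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(subject)
    return groups

# Hypothetical subject numbers 1..20
groups = random_assignment(range(1, 21), seed=115)
for condition, members in groups.items():
    print(condition, sorted(members))
```

Because placement depends only on the shuffle, subject variables cannot systematically line up with the IV.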
Experimental Group-Control Group Design
Experimental condition: value of IV applied (experimental group)
Control condition: no experimental manipulation
^Point of comparison
*Maturation: internal cognitive changes
Control cond helps internal validity by ruling
out alt. explanations
Two-Experimental-Groups Design
Control groups useful for first stages of
experimentation
Once established that the IV has a significant effect: two-experimental-groups design for more precise info
Both treatments now involve a manipulation
Forming Two Independent groups
Assigning subjects at random: the groups should hopefully be roughly the same
We might not always be aware of every
variable that should be controlled
Difference (subj variable) should not be
significant
More subjects: randomization more likely to produce equivalent groups
When to Use a Two-Independent-Groups Design
Look at hypothesis (one IV, could be tested
with two treatments)
We assume randomization is successful
About the same on EVs, but if not...
Two Matched Groups
Randomization does not always guarantee
comparable treatment groups
Matching Before Experiment
Subjects must be measured on EVs to be used for matching
If not possible to form identical pairs: decide how much discrepancy is allowed
If no suitable match: eliminate the subject
Assign members of each pair to conditions at random
If not before experiment: Ex. intelligence scores (do random assignment as usual, give the intelligence test at the same time, then match subjects on scores after)
Precision matching: insist members of matched pairs have identical scores
Range matching: members of a pair must fall within a specified range of scores
Discard if no match is available
Many subjects: minimizes chances of no match
Rank-ordered matching: rank order subjects, adjacent scores form matched pairs
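The rank-ordered procedure can be sketched as follows, assuming made-up intelligence scores for six hypothetical subjects. The function name and data are illustrative only. Subjects are sorted on the matching variable, adjacent subjects form a pair, and the two members of each pair are split between conditions at random.

```python
import random

def matched_pairs(scores, seed=None):
    """scores maps subject -> score on the matching variable (the EV)."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)      # rank order on the EV
    group_a, group_b = [], []
    # Pair adjacent ranks; with an odd number of subjects the last one is discarded
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                        # random placement within the pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b

# Hypothetical intelligence scores
iq = {"s1": 98, "s2": 120, "s3": 101, "s4": 118, "s5": 110, "s6": 109}
group_a, group_b = matched_pairs(iq, seed=9)
print(group_a, group_b)
```

Range matching would add a check that the two scores in a pair differ by no more than the allowed discrepancy before accepting the pair.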
When to Use Two Matched Groups
Matching makes effects of the IV easier to detect
Matching useful with very small nos. of subj
o Greater chance randomization will produce dissimilar groups
o With a big sample you won't have to worry
EV to match should be highly related to DV
o Increase ability to detect changes
Matching might result in larger differences in
other EVs (dampen IV effects)
Two matched vs. two independent: check the research literature

Multiple Groups
More than two groups
Multiple-independent more common
Matching also possible
More treatment cond, more difficult to match
Assigning Subjects
Block randomization?
Closing our eyes??? (number table)
More than two conditions: better picture of how the IV operates
Choosing Treatments
Think in terms of hypothesis
Select the simplest design that can test the hypothesis
Practical Limits
Assume multiple groups formed by random
assignments
Several levels: more info than just two groups
Might be difficult to find subjects if there are many levels of the IV
Additional levels take more time
Prior research helps
Pilot study to pretest IV levels
o Allows to make changes before
investing time and resources
CH. 10 Between-Subjects Factorial Design
To explore more than one IV

More than One IV
Efficient, provide more info
Two or more IVs at the same time
IVs now called factors
Simplest: two-factor experiment
Effects of each IV (main effects)
Interaction: the influence of one IV affects the influence of the other

Looking for Main Effects


Action of a single independent variable
Looking for Interactions
Interaction: effect of one IV changes across the levels of the other IV
When present, main effects alone cannot give the complete picture
Effects of one factor will change depending
on the levels of the other
o Main effects of one factor will be
altered by other factor
Limits or exceptions to main effects could be
there
Interaction qualifies main effects
More than two IVs: higher-order interactions
Laying Out a Factorial Design
Make a simple diagram: design matrix (visual image of design)
Describing the Design
When describing the design in a report, cannot use design matrix
2 x 2 factorial design = shorthand notation
(two IVs, two levels each etc.)
Factor Labeling Methods
Names of each factor are placed in
parentheses
Example 1: 2 x 2 (Type of Name x Length of
Name) between-subjects factorial design
Example 2: 2 (type of name) x 2 (length of
name) between-subjects factorial design
Example 3: 2 x 2 (Type of Name: given,
nickname x Length of Name: short, long)
between-subjects factorial design
Example 4: 2 (given name or nickname) x 2
(short or long name) between-subjects
factorial design
*If a factor is a subject variable, the manipulation is a selection process
Understanding Effects from Factorial Designs
Graphs that reflect interaction: lines are not
parallel
o Diverge
o Converge
o Intersect
o Cross-over interaction: effects of each factor completely reverse at each level of the other factor
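Whether the lines of a 2 x 2 graph are parallel can also be checked numerically. The sketch below assumes made-up cell means and a 0/1 coding for the two levels of each factor. The function name and all numbers are hypothetical. A nonzero interaction contrast means the simple effect of one factor differs across levels of the other, so the lines are not parallel.

```python
def effects_2x2(m):
    """m maps (level_of_A, level_of_B) -> cell mean, with levels coded 0 and 1."""
    main_a = (m[1, 0] + m[1, 1]) / 2 - (m[0, 0] + m[0, 1]) / 2
    main_b = (m[0, 1] + m[1, 1]) / 2 - (m[0, 0] + m[1, 0]) / 2
    # Difference between the simple effects of B at each level of A
    interaction = (m[1, 1] - m[1, 0]) - (m[0, 1] - m[0, 0])
    return main_a, main_b, interaction

# Hypothetical cell means
parallel = {(0, 0): 4, (0, 1): 6, (1, 0): 5, (1, 1): 7}
crossover = {(0, 0): 4, (0, 1): 6, (1, 0): 6, (1, 1): 4}

print(effects_2x2(parallel))    # two main effects, interaction contrast of 0
print(effects_2x2(crossover))   # no main effects, nonzero interaction contrast
```

In the cross-over case both main effects are zero even though each factor clearly matters, which is why an interaction qualifies the main effects.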
Choosing a Between-Subjects Design

As many groups of subjects as you have treatment conditions
More treatment conditions, more time to run exp, more time to do stat analysis
Very difficult to explain interaction of 3 factors
Two factors b/n subj design: 3 effects
o Main effect for each factor
o Interaction
Design to be determined largely by no. of
IVs + treatment conditions
All designs discussed: participants in only
one treatment condition, measured on DV
after exp. manip
Within-subjects factorial design: subj receive
all conditions

CH. 11: Within-Subjects Design


Each subj serves in more than one cond.
Same subj improve chances of detecting diff.
Power of experiment is increased
Higher power, greater chance of detecting
genuine effect
AKA repeated-measures design
A Within-Subjects Experiment: Homophone
Priming of Proper Names
A comparison within each subj is more
precise
o Eliminates error from differences between subjects
Controls for individual differences, greater
power
o Use of fewer subjects allowed
Within-Subjects Factorial Designs
Requires fewer subjects than a between-subjects design testing the same hypothesis (even when there are many conditions)
Mixed Designs
One factor manip within subjects
Between-subjects factor (often a subject
variable; cannot be manipulated by
experimenter)
Within-subjects var. + between subj. var.
Advantages of Within-Subjects Design
Using the same subjects is a big help when we cannot get many subjects
4 treatment conditions, 15 subjects in each condition:
o 60 subjects in between-subjects
o only 15 in within-subjects
Best chance of detecting effect of IV
Controls for extraneous subj. variable
Most perfect form of matching
o Influence of subj. var is controlled
Ongoing record of subjects behavior over
time
Disadvantages of Within-Subjects Design

Practical Limitations
Several hours of testing (when it could be just one hour for between-subjects)
Resetting equipment for each condition (sensitive instruments)
Can become tedious for subjects
o Hasty judgments: inaccurate data
^Inconveniences
Interference Between Conditions
If one treatment cond precludes another
(between subj design required)
Scores on the DV might be influenced by the order in which treatments are given: order effect
Use special counterbalancing procedures to offset interference and control for potential order effects
Controlling Within-Subjects Design
Controlling for Order Effects: Counterbalancing
Ex. New Cola vs. Old Cola
o If the new cola is always given first after 2 hours of not drinking, subjects will probably find the new cola tastier, etc.
Progressive Error: as experiment progresses, results are distorted
o Fatigue effects: subjects get tired, performance declines
o Practice effects: lead to improvement, subjects more familiar with experiment
Control EV by making sure it affects all
treatment conditions
Eliminate/hold constant/balance across
conds.
Within-subjects: cannot eliminate order effects, nor can we hold them constant (same order for all subjects: systematic effect)
^Can balance them out: distribute across all conds so they affect all conds equally
Counterbalancing: used to distribute progressive error
o Order effects on one cond will be offset or counterbalanced by order effects on other conditions
Subject-by-Subject Counterbalancing
Reverse Counterbalancing
For cola, two glasses of each
o Cola A, Cola B, Cola B, Cola A
o Add up units of PE for each condition
o (A = 1, B = 2, B = 3, A = 4): A = 1 + 4 = 5, B = 2 + 3 = 5
o ^Assuming PE is linear
o True PE may be curvilinear or nonmonotonic (changing direction)
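The arithmetic behind reverse counterbalancing can be sketched as follows. The function name is an assumption, and the squared curve is just one arbitrary example of nonlinear progressive error. With linear PE the ABBA order gives both conditions the same total; with a curved PE it does not.

```python
def pe_totals(sequence, pe_at_position):
    """Sum the units of progressive error each condition receives."""
    totals = {}
    for position, condition in enumerate(sequence, start=1):
        totals[condition] = totals.get(condition, 0) + pe_at_position(position)
    return totals

linear = pe_totals("ABBA", lambda pos: pos)        # PE grows 1, 2, 3, 4
curved = pe_totals("ABBA", lambda pos: pos ** 2)   # hypothetical nonlinear PE: 1, 4, 9, 16

print(linear)   # {'A': 5, 'B': 5}
print(curved)   # {'A': 17, 'B': 13}
```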
Block Randomization
Used when PE is nonlinear
Each set of treatments (ABCD): one block
Treatments within each block given in random order
For BR to be successful in controlling nonlinear PE, present each treatment several times
Ex. (ABCD) 5 times
o BDCA, DBAC, ACDB, CABD, BADC
o 20 trials for each subject
o Used in cogn, perc, psychophysics
etc. when treatment conditions are
short
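A minimal sketch of block randomization for one subject, assuming four treatments labeled A to D and five blocks as in the example. The function name and seed are illustrative assumptions. Each block contains every treatment exactly once, in a freshly shuffled order.

```python
import random

def block_randomize(treatments, n_blocks, seed=None):
    """Return n_blocks orders, each a random permutation of all treatments."""
    rng = random.Random(seed)       # seed only makes the illustration repeatable
    sequence = []
    for _ in range(n_blocks):
        block = list(treatments)
        rng.shuffle(block)
        sequence.append("".join(block))
    return sequence

blocks = block_randomize("ABCD", 5, seed=1)
print(blocks)                              # five shuffled blocks
print(sum(len(block) for block in blocks)) # 20 trials for this subject
```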

Across-Subjects Counterbalancing
Drawback of subject-by-subject CB: must present each condition more than once
Instead, distribute effects of PE across subjects
Complete Counterbalancing
Using all possible sequences of cond and
using every sequence same number of times
(AB, BA, half will get 1, half will get 2)
Give each subject only one sequence (as compared to subject-by-subject CB, where each subject gets both sequences)
Find number of possible sequences: N!
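The N! count and the "every sequence the same number of times" rule can be sketched with the standard library. The function name and subject labels are assumptions. For simplicity the sketch hands out sequences in a fixed rotation; in practice which subject gets which sequence would itself be decided at random.

```python
from itertools import permutations
from math import factorial

def complete_counterbalance(conditions, subjects):
    """Give each subject one sequence, using every possible sequence equally often."""
    sequences = ["".join(p) for p in permutations(conditions)]
    if len(subjects) % len(sequences) != 0:
        raise ValueError("number of subjects must be a multiple of N!")
    return {subject: sequences[i % len(sequences)]
            for i, subject in enumerate(subjects)}

print(factorial(2), factorial(3), factorial(5))   # 2 6 120

# Hypothetical subjects for a two-condition experiment
plan = complete_counterbalance("AB", ["s1", "s2", "s3", "s4"])
print(plan)   # half get AB, half get BA
```

The jump from 6 sequences at three conditions to 120 at five is what makes complete counterbalancing impractical for larger designs.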
Partial Counterbalancing
Keep number of treatments to a
minimum
Omit any condition that is not necessary
for a good test of hypothesis
When we cannot do complete
counterbalancing, but still want some
control over PE across subjects
Using some subset of available order
sequences
Randomized partial counterbalancing
(Ex. 120 possible seq, 30 available subjects: use only 30 sequences)
Still, use complete CB to be safer
But still better than using just one order
Does not make sense to use just 3 seq
Use at least as many randomly selected
sequences as there are experimental
conditions
Latin Square Balancing: protection against order effects, but cannot control other kinds of systematic differences (cond. A comes before B two times, etc.): carryover effects
Carryover Effects
Effects of some treatments will persist, even
after treatments are removed
We do not want effects of early conditions to
contaminate later conditions
Order effect: position of treatment vs. carryover: function of treatment itself
Control for CO: subject-by-subject counterbalancing, complete counterbalancing (balance out)
Control less certain with randomized CB
Might not be controlled if Latin square used
Balanced Latin Square: each condition appears only once in each position, and precedes and follows every other condition an equal number of times
Formula for first row: A, B, N, C, N-1, D, N-2, E, N-3, F, etc. (N = last condition; each later row shifts every condition by one)
Ex.
ABDC

BCAD
CDBA
DACB
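The construction can be sketched in code, assuming an even number of conditions labeled with letters. The function name is an assumption. The first row alternates from the low and high ends of the condition list, and every later row shifts each condition forward by one, wrapping around.

```python
def balanced_latin_square(n):
    """Build a balanced Latin square for an even number n of conditions (n <= 26)."""
    if n % 2 != 0 or not 2 <= n <= 26:
        raise ValueError("this sketch handles even n from 2 to 26 only")
    first = [0]                       # first row: 1, 2, N, 3, N-1, ... (zero-based here)
    low, high = 1, n - 1
    take_low = True
    while len(first) < n:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    # Each later row shifts every condition by one, wrapping around
    return ["".join(letters[(c + shift) % n] for c in first) for shift in range(n)]

square = balanced_latin_square(4)
print(square)   # ['ABDC', 'BCAD', 'CDBA', 'DACB']
```

In the four-condition square every ordered pair of adjacent conditions (AB, BA, AC, and so on) occurs exactly once, which is the property that helps with carryover.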
Choosing Among CB Procedures
Within-subj conditions will need some form of CB
Counterbalance for each subject when we expect large differences in the pattern of PE
In mixed design, only within-subjects factors need to be CB
If effects will be about the same for everyone: no need to worry about PE within each subject, CB across subjects
Avoid randomized and Latin square CB if carryover effects are expected
CB can be useful in between-subjects exp too: order of items should be CB to avoid confounding
Order as a Design Factor
Treatment order always a between-subj
factor
If order produced no sig effects, CB worked
If one cond has more impact: CO effects are asymmetrical
How Can You Choose a Design?
Think about hypothesis of exp
Possible to have each subj in more than one cond? Yes: within-subj
Treatments interfere with one another? Yes: between-subj
Can get only a few subj: within-subj might be better
Tradeoff: the longer the exp, the harder it
might be to find willing subj (fatigue more
likely)
Subject var best controlled in within-subj
Review related literature
CH. 13 Why We Need Statistics
Applying Statistical Inference
Directional hypothesis (we know which way
the difference goes)
H0 = the scores will look like they came from
the same population
HA = The scores will look like they came from
different populations
Normal curve: normally distributed DV (parametric tests)
Groups differ: some variation would occur even if there had been no exp intervention
No way to directly verify the alternative to the null
Results could always be due to sampling error
Choosing a Significance Level
Sig level: criterion for deciding whether to reject the null hypo or not
0.05: pattern of data so unlikely that it could have occurred by chance less than 5% of the time
Unlikely differences: stat. significant
Valid test of hypo: decide sig. level BEFOREHAND

Variability can be caused by exp errors produced by uncontrolled EVs
Type 1 and Type 2 Errors
Type 1: reject null when it is really true (alpha, α)
o 0.05: 5 times out of 100 we reject the null hypo when we shouldn't
Type 2: fail to reject null when it is really false (beta, β)
o More extreme sig level: more Type 2 errors
Prob of making Type 2 error: depends on amount of overlap between the populations
o The more overlap, the harder to detect effects
(1 - β): power of stat test
o β can be reduced by increasing sample size
o Reduce by reducing variability
o By using more powerful stat tests (parametric tests)
Assumptions: normally dist., comparable variability, interval/ratio data
o By accepting a less extreme sig level
α: chance of rejecting the null hypo when we should have retained it

                      Null hypo true           Null hypo false
                      (data from same pop)     (data from diff pop)
Do not reject null    You are correct:         You made Type 2 error:
                      p = 1 - α                p = β
Reject null           You made Type 1 error:   You are correct:
                      p = α                    p = 1 - β

Going Beyond Testing H0


Cohen: provide more evidence that the treatment worked than simply reporting p values for rejecting the null
Effect size & Confidence Intervals
The Odds of Finding Significance
Amount of variability
Directional or nondirectional hypo
Importance of Variability
If null is true: differences between means are normally distributed and close to zero (normal curve)
Critical regions: most extreme 5% of differences between means
o Occur by chance less than 5% of the time
As amount of variability increases, CR falls farther from center of distribution
More variability: larger difference between samples required to reject null
We want treatments to be the only source of variability
Highly variable data: highly variable dependent measure
One-Tailed and Two-Tailed
Two-tailed: CR divided between 2 tails (nondirectional)
One-tailed: directional
o Advantage: CR larger, closer to center of dist. (easier for means to fall there)

Test Statistics
Inferential stats: indicators of what is going on in population
AKA test statistics
Relationship between treatment differences and variability, expressed quantitatively
The larger the value of test stat, the more likely IV produced change
Differences across treatments are large relative to amount of variability
Higher value of test stat: we are more likely to be able to reject null hypo
Organizing and Summarizing Data
Organize
Summarize
Stat to interpret
Organize Data
Layout in sheet
Summarizing Data
Data we record as we run exp: raw data
When we report, it should be summary data (we are not concerned with individual scores)
o Summarize with descriptive stats
Measures of Central Tendency
Typical of distribution
Mean most commonly used
Median
Mode
If dist is symmetrical and has only one mode (no ties): mean, median, mode will coincide
Mean sensitive to skew: pulled in direction of extreme scores
Mode useful to describe dists that contain many identical scores
Positively skewed (tail on positive side)
Bimodal (two scores tied for most frequent)
Skewed: mean, median, mode have different values
Measures of Variability
Range: difference between largest and smallest scores in data
o Computed quickly, straightforward
o Does not reflect precise amount of variability
Variance
o Puts variability in standard form
o Good but simple description of how indiv scores differ
o Average squared deviation from mean
How scores are spread out around mean of data
Standard Deviation
o Sq root of variance
o Ave deviation of scores about the mean
o Total of deviations from mean always 0
When we report summary stat, mean &
SD
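The measures of variability above can be sketched for a small made-up data set. The function name and scores are hypothetical. The variance here is the plain average of squared deviations, as in the notes; inferential formulas usually divide by N - 1 instead of N.

```python
from math import sqrt

def describe(scores):
    """Return mean, range, variance (average squared deviation), SD, and deviations."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]
    variance = sum(d * d for d in deviations) / n
    return {
        "mean": mean,
        "range": max(scores) - min(scores),
        "variance": variance,
        "sd": sqrt(variance),
        "sum_of_deviations": sum(deviations),
    }

# Hypothetical raw data
summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)   # mean 5.0, range 7, variance 4.0, sd 2.0, deviations sum to 0.0
```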

CH 14: Analyzing Results


Which Test Do I Use?
No of IV, no of levels, within/between,
matching?
Selecting a Stat Test
Selection should be made beforehand
Create meas scale for DV
Statistics for Two-Group Exp
Use of nominal data
Chi Square Test
Critical value comparison
Nonparametric: no assumption of normal dist or equal variances
Tests whether freq of responses in our sample represent frequencies expected in pop
o As diff between expected and obtained freq becomes greater, X2 increases
o When X2 is larger than the critical value, we reject the null
Use parametric tests when possible: more powerful
Degrees of Freedom
How many members of a set of data could change without changing the value of a stat we already know
2x2 contingency table: df = (rows - 1) x (columns - 1) = 1
Critical values organized by df
o When using diff stats, or the same stat applied in diff ways, diff df even though same sample
Interpreting Chi Square
Cramér's coefficient
o .10 small deg of assoc
o .30 medium
o .50 large
o Squared estimate of effect size
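A minimal sketch of the chi-square computation for a contingency table, using made-up frequencies. The function name and data are hypothetical. Expected frequencies come from the row and column totals, and Cramér's coefficient is computed as the square root of X2 divided by N times the smaller of (rows - 1) and (columns - 1).

```python
from math import sqrt

def chi_square(table):
    """Return (chi-square, df, Cramér's coefficient) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    rows, cols = len(table), len(table[0])
    df = (rows - 1) * (cols - 1)
    cramers = sqrt(chi2 / (n * min(rows - 1, cols - 1)))
    return chi2, df, cramers

# Hypothetical 2x2 frequencies: two groups by two response categories
chi2, df, cramers = chi_square([[10, 20], [20, 10]])
print(round(chi2, 2), df, round(cramers, 2))   # 6.67 1 0.33
```

With df = 1 the .05 critical value is 3.84, so this made-up table would lead to rejecting the null, with a coefficient in the medium range.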
The T Test
t test for independent groups: t relates differences between treatment means to amount of variability (expected of any two sets of data drawn from same pop)
Likelihood of obtaining a particular value of t: from the t distribution
Effects of Sample size
Small samples vary more from mean of pop
Closer to normal curve as sample size increases
Requirements: normally distributed
Use large samples
t test is robust: assumptions may be violated without changing chances of Type 1, 2 errors
Appropriate t dist chosen from df
As t dist changes shape, CV of t needed to reject null changes too
Is the computed value of t more or less extreme than the CV?
Fewer df: more variability
More variability: more cases far from mean
Large differences can occur just by chance
Larger sample: CV gets smaller
Smaller df: larger CV of t
Parametric test: compares treatment effects and variability
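A minimal sketch of the independent-groups t computation, assuming the usual pooled-variance formula and made-up scores. The function name and data are hypothetical. The difference between treatment means is divided by an estimate of the variability expected between two samples from the same population.

```python
from math import sqrt

def independent_t(group1, group2):
    """Return (t, df) for two independent groups using the pooled variance."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((x - mean1) ** 2 for x in group1)
    ss2 = sum((x - mean2) ** 2 for x in group2)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Hypothetical scores for two very small groups
t, df = independent_t([5, 7, 9], [1, 3, 5])
print(round(t, 3), df)   # 2.449 4
```

With only 4 df the two-tailed .05 critical value is 2.776, so even this 4-point mean difference falls short of significance, which illustrates why small samples need larger differences.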

Confidence intervals: range of values above and below the sample mean likely to contain the pop mean, with a stated probability that the pop mean falls there
The T Test for Matched Groups
If analyzed the same way as independent groups: overestimate amount of variability
Variability computation changes for matched groups
df based on number of pairs of scores
IV varies within each matched pair
Within-subjects design also affects t value
Fewer df: more difficult to reject null
More powerful than t test for independent groups (decreases chance of Type 2 error)
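A minimal sketch of the matched-groups t computation on made-up paired scores. The function name and data are hypothetical. The test works on the difference within each pair, so variability between pairs drops out and df is the number of pairs minus one.

```python
from math import sqrt

def matched_t(first, second):
    """Return (t, df) for paired scores; first[i] and second[i] form one pair."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    standard_error = sqrt(variance / n)
    return mean_diff / standard_error, n - 1

# Hypothetical scores for four matched pairs
t, df = matched_t([10, 12, 15, 11], [8, 11, 12, 9])
print(round(t, 3), df)   # 4.899 3
```

Treating the same eight scores as two independent groups gives t of about 1.41 with 6 df, so the matched analysis detects the effect (critical value 3.182 at 3 df) where the independent analysis would not.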
Analysis of Variance
Evaluate differences among three or more treatment means
Within-groups variability: scores of subjects in the same treatment group differ from one another
Between-groups variability: scores differ across treatment groups
ANOVA: likelihood that the proportions of variability we observe are due just to chance
Sources of Variability
Individual differences
Procedures not handled well
^^ Error (+ mistakes in recording data, variations in testing conditions, other EVs)
Experimental manipulation as well
If IV had an effect, between-groups variability should be larger
F = (variability from treatment + error) / (variability from error) = between-groups / within-groups
Larger effect of IV, larger F ratio
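The F ratio can be sketched for a one-way between-subjects design with made-up scores. The function name and data are hypothetical. Between-groups and within-groups sums of squares are each divided by their df to give mean squares, and F is their ratio.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between   # treatment + error
    ms_within = ss_within / df_within      # error only
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores for three treatment groups
f_ratio, df_between, df_within = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 2), df_between, df_within)   # 13.0 2 6
```

The .05 critical value of F for 2 and 6 df is about 5.14, so this made-up F would be significant, but the F test alone does not say which groups differ.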
One-Way ANOVA
Assumptions: independent groups, samples selected at random, normally distributed, variances homogeneous
Within and Between Groups Variability
Mean square
In the F table: df for MSB gives the column, df for MSW gives the row
Interpreting Results
F test shows only the overall pattern
Post hoc test: pair-by-pair comparison
o Pinpoint source of differences
o Guard against Type 1 error
o Less power (Type 2 error)
o Conservative
A priori comparisons: part of original ANOVA, planned comparisons, fewer than the no. of treatment groups
o More powerful
Graphing Results
y axis about three-fourths the length of the x axis
IV on X axis, DV on y axis
Statistical Control for Differences Between Groups
Use of ANCOVA to control statistically for potential moderating variables
o Like holding MV constant or statistically equating subjects before experiment
Increases sensitivity of exp to IV effects

One Way Repeated Measures ANOVA


Within subjects design
Denominator calculated somewhat
differently
Analyzing Data from a Between-Subjects
Factorial Exp
Each IV may produce own unique treatment
effects; each can produce portion of b/n var
or main effect
Two-Way ANOVA
Steps:
o Within-groups variability computation
o Compute total sum of squares between groups (variability we have among treatment groups)
o Partition into SS1, SS2, SS1x2 (interaction)
o Main effects: treat data as if that variable were the only one in the experiment
o Variability of interaction: what remains after main effects
Evaluating F Ratios
Compare computed F with table values of F
Use correct df to find correct critical value
With only two levels of each factor, no need to worry about post hoc tests (look at means instead)
Graphing Factorials
Useful for summarizing results of experiment
Interpreting Significant Effects
o Cannot interpret a significant interaction by itself: post hoc test to pinpoint differences
