
ANOVA: Analysis of Variance

1. One-Way ANOVA

Credit Card Debt: Rent, Own or Mortgage vs. Loan Amount

This is a one-IV (one-way) ANOVA.
IV: Nominal (called a factor, with levels); here home_ownership with levels MORTGAGE, OWN and RENT
DV: Ratio; here funded_amnt

2. Hypothesis:
H0: The mean loan amount sanctioned is the same for every type of owner
Ha: The mean loan amount sanctioned differs for at least one type of owner

3. Assumptions:
1. Normality: ANOVA assumes the DV is approximately normally distributed within each group
QQ Plot
> s.mort = subset(s, home_ownership == "MORTGAGE")
> qqnorm(s.mort$funded_amnt)
> qqline(s.mort$funded_amnt)
> hist(s.mort$funded_amnt)

2. Homogeneity of Variance: The variance of the DV should be roughly equal across groups


> library(car)
# Levene Test similar to SPSS
> leveneTest(funded_amnt~home_ownership,s)
# Bartlett Test
> bartlett.test(funded_amnt ~ home_ownership, data=s)

3. Independence of Observations:
Observations in one group should not influence observations in another group.
We don't want the same person to appear in both the mortgage list and the own list.
That would push the result toward H0, because the same people would be counted in both groups.

Summary Table

Source of Variation   df          SS          MS
Between               dfbetween   SSbetween   MSbetween
Within                dfwithin    SSwithin    MSwithin
Total                 dftotal     SStotal
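
The entries of the summary table can be computed by hand and cross-checked against aov(). The following is a minimal sketch on made-up toy data (the vectors y and g are hypothetical and are not the loan data used in these notes).

```r
# Toy data (hypothetical): three groups of four observations each
y <- c(10, 12, 11, 13,  14, 15, 16, 15,  20, 19, 21, 20)
g <- factor(rep(c("A", "B", "C"), each = 4))

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)     # one mean per level
group_sizes <- tapply(y, g, length)   # observations per level

# Sums of squares
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((y - group_means[as.character(g)])^2)
ss_total   <- sum((y - grand_mean)^2)   # equals ss_between + ss_within

# Degrees of freedom
df_between <- nlevels(g) - 1
df_within  <- length(y) - nlevels(g)
df_total   <- length(y) - 1

# Mean squares and the F ratio
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within
f_value    <- ms_between / ms_within

# Cross-check against R's own ANOVA table
tab <- summary(aov(y ~ g))[[1]]
print(tab)
```

For this toy data SSbetween = 146, SSwithin = 9 and SStotal = 155, so the two components add up to the total, as the table layout implies.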

ANOVA Commands & sample result


> a = aov(funded_amnt ~ home_ownership, data = s)
> summary(a)
                Df    Sum Sq   Mean Sq F value   Pr(>F)
home_ownership   2 1.140e+09 570034237   11.02 2.19e-05 ***
Residuals      397 2.053e+10  51707342

MSE & MSB


In short, MSE estimates σ² whether or not the population means are equal, whereas MSB
estimates σ² only when the population means are equal and estimates a larger quantity
when they are not equal.
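
A small simulation can illustrate this behaviour of MSE and MSB. The sketch below is only an illustration under assumed settings: the population SD (sigma = 2), the group size (n = 20), the group means and the helper sim() are all hypothetical choices, not values from the loan data.

```r
set.seed(1)
sigma <- 2    # assumed population SD, so sigma^2 = 4
n     <- 20   # assumed observations per group
k     <- 3    # number of groups

# Simulate one data set with group means `mu`; return MSB and MSE
sim <- function(mu) {
  g   <- factor(rep(seq_len(k), each = n))
  y   <- rnorm(n * k, mean = rep(mu, each = n), sd = sigma)
  tab <- summary(aov(y ~ g))[[1]]
  c(MSB = tab[["Mean Sq"]][1], MSE = tab[["Mean Sq"]][2])
}

null_runs <- replicate(2000, sim(c(0, 0, 0)))  # population means equal
alt_runs  <- replicate(2000, sim(c(0, 1, 2)))  # population means differ

rowMeans(null_runs)  # MSB and MSE both average close to sigma^2 = 4
rowMeans(alt_runs)   # MSE still averages close to 4, MSB is much larger
```

This is why the ratio F = MSB/MSE stays near 1 when H0 holds and grows when the means differ.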

Effect Size:

R2 = Percent of variance explained by IV

R2 = (F*dfbetween)/(F*dfbetween+dfwithin)

R2 = 0.053

About 5% of the variance in loan amount is explained by home ownership

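Alternatively, from the sums of squares (two factors a and b and their interaction a×b):
R2 = (SSa + SSb + SSa×b)/SStotal
With a single factor this reduces to R2 = SSbetween/SStotal.

R2 is the proportion of variance in the DV explained by the IV(s)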
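
As a worked check, both routes to R2 can be evaluated with the numbers printed in the sample ANOVA table above. This is a sketch only; the variable names are made up for illustration, and for a single factor the sums-of-squares form is SSbetween/SStotal.

```r
# Values copied from the sample ANOVA table in these notes
f_value    <- 11.02
df_between <- 2
df_within  <- 397
ss_between <- 1.140e+09
ss_within  <- 2.053e+10

# Route 1: from F and the degrees of freedom
r2_from_f <- (f_value * df_between) / (f_value * df_between + df_within)

# Route 2: from the sums of squares (SStotal = SSbetween + SSwithin)
r2_from_ss <- ss_between / (ss_between + ss_within)

round(r2_from_f, 3)   # 0.053
round(r2_from_ss, 3)  # 0.053
```

The two results agree up to the rounding of the printed table values.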

Power
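> library(pwr)
> df.between = 2; df.within = 397; F.value = 11.02   # from the ANOVA table
> R2 = (F.value*df.between)/(F.value*df.between + df.within)
> f2 = R2/(1 - R2)
> pwr.f2.test(df.between, df.within, f2)

     Multiple regression power calculation

              u = 2
              v = 397
             f2 = 0.05551637
      sig.level = 0.05
          power = 0.9916698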

Pairwise Comparison using t tests


Familywise Error
Familywise error = 1 - (1 - α)^(#tests)

ANOVA Correction
Bonferroni Correction
Per-test alpha = Familywise error/#tests
0.05/3 = 0.017
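
The arithmetic for three groups can be sketched in R as follows, assuming alpha = 0.05 per test and all pairwise comparisons (the variable names are made up for illustration). The familywise error rate is 1 - (1 - alpha)^(number of tests), and the Bonferroni per-test alpha is alpha divided by the number of tests.

```r
alpha   <- 0.05
n_tests <- choose(3, 2)   # 3 pairwise comparisons among 3 groups

# Chance of at least one false positive across all tests, uncorrected
fwer <- 1 - (1 - alpha)^n_tests
fwer                      # 0.142625

# Bonferroni: per-test alpha that keeps the familywise error near 0.05
per_test_alpha <- alpha / n_tests
round(per_test_alpha, 3)  # 0.017
```

Without a correction, three tests at 0.05 each carry roughly a 14% chance of at least one false positive.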

Pairwise t-tests
> pairwise.t.test(s$funded_amnt, s$home_ownership, p.adjust.method = "bonferroni")

	Pairwise comparisons using t tests with pooled SD

data:  s$funded_amnt and s$home_ownership

     MORTGAGE OWN
OWN  0.00301  -
RENT 0.00017  0.65248
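
With the Bonferroni option, pairwise.t.test() reports adjusted p-values: each raw p-value is multiplied by the number of tests and capped at 1, so the printed values are compared with 0.05 rather than 0.017. A minimal sketch using base R's p.adjust() on made-up raw p-values (the vector raw_p is hypothetical, not taken from the loan data):

```r
# Hypothetical raw (unadjusted) p-values from three pairwise tests
raw_p <- c(0.001, 0.02, 0.4)

# Bonferroni: multiply by the number of tests, cap at 1
adj_p <- p.adjust(raw_p, method = "bonferroni")
adj_p          # 0.003 0.060 1.000

# Adjusted p-values are compared with the familywise alpha directly
adj_p < 0.05   # TRUE FALSE FALSE
```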

Conclusion:
Pairwise comparisons showed that:

Those with mortgages were given larger loans than those who
owned outright ($15,470 vs. $10,596; p = 0.003).

Those with mortgages were given larger loans than those who
rented ($15,470 vs. $12,396; p < 0.001).

There was no statistically significant difference between those who owned
outright and those who rented (p = 0.65).