
Stat 151 Formula Sheet

Numerical Summaries:
• Sample Mean: $\bar{y} = \frac{\sum y_i}{n}$
• Sample Variance: $s^2 = \frac{\sum (y_i - \bar{y})^2}{n-1}$
• Range: $\max - \min$
• Interquartile Range: $IQR = Q_3 - Q_1$
• Sample Standard Deviation: $s = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n-1}}$
• 5-number summary: min, $Q_1$, median, $Q_3$, max
• Standardized Value: $z\text{-score} = \frac{y - \bar{y}}{s}$
• Least-Squares regression: $\hat{y} = b_0 + b_1 x$, where $b_1 = r\,\frac{s_y}{s_x}$, $b_0 = \bar{y} - b_1\bar{x}$
• Correlation: $r = \frac{1}{n-1}\sum_{i=1}^{n} z_x z_y = b_1\,\frac{s_x}{s_y}$
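
A quick numerical check of these summary and regression formulas, sketched in Python with numpy/scipy; the data values below are made up purely for illustration.

import numpy as np
from scipy import stats

# made-up illustrative data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

ybar = y.sum() / len(y)                              # sample mean
s2 = ((y - ybar) ** 2).sum() / (len(y) - 1)          # sample variance
s = np.sqrt(s2)                                      # sample standard deviation
iqr = np.percentile(y, 75) - np.percentile(y, 25)    # interquartile range

r = stats.pearsonr(x, y)[0]                          # correlation
b1 = r * y.std(ddof=1) / x.std(ddof=1)               # slope: b1 = r * sy/sx
b0 = ybar - b1 * x.mean()                            # intercept: b0 = ybar - b1*xbar

# cross-check against scipy's least-squares fit; slope/intercept should agree
fit = stats.linregress(x, y)
print(b1, fit.slope, b0, fit.intercept)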

Probability Formulas:
Complement: $P(A^C) = 1 - P(A)$
Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
General addition law: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
If A and B are disjoint, then $P(A \cup B) = P(A) + P(B)$
General multiplication law: $P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$
If A and B are independent, then $P(A \cap B) = P(A)\,P(B)$
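
A small sanity check of these rules in Python; the probabilities for the two events A and B are assumed values chosen only for illustration.

# assumed joint and marginal probabilities for a toy experiment
p_A_and_B = 0.12
p_A = 0.30
p_B = 0.40

p_A_or_B = p_A + p_B - p_A_and_B     # general addition law
p_A_given_B = p_A_and_B / p_B        # conditional probability
p_B_given_A = p_A_and_B / p_A
p_not_A = 1 - p_A                    # complement rule

# the multiplication law written both ways returns P(A and B)
assert abs(p_A * p_B_given_A - p_A_and_B) < 1e-12
assert abs(p_B * p_A_given_B - p_A_and_B) < 1e-12

# here P(A and B) = P(A)P(B), so these particular A and B are independent
print(p_A_or_B, p_A_given_B, p_not_A, abs(p_A * p_B - p_A_and_B) < 1e-9)
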
Sampling Distribution of the sample mean:
For a random sample of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the sample mean, $\bar{Y}$, has a mean of $\mu_{\bar{Y}} = \mu$ and a standard deviation of $SD(\bar{Y}) = \sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}}$.
If population distribution is normal, then $\bar{Y}$ has a normal distribution.
Central Limit Theorem: For large $n$ ($n > 30$), the sampling distribution of $\bar{Y}$ is approximately normal.
Sampling Distribution of the sample proportion:
For a random sample of size $n$ from a population with proportion $p$, the sampling distribution of the sample proportion, $\hat{p}$, has a mean of $\mu_{\hat{p}} = p$ and a standard deviation of $SD(\hat{p}) = \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$.
For large $n$ ($np \ge 10$ and $n(1-p) \ge 10$), the sampling distribution of $\hat{p}$ is approximately normal.
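
A simulation sketch (Python/numpy) illustrating both sampling-distribution results: the simulated standard deviations of $\bar{Y}$ and $\hat{p}$ should land close to $\sigma/\sqrt{n}$ and $\sqrt{p(1-p)/n}$. The population parameters are assumed values for illustration.

import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 10000

# sample mean: population with mu = 10, sigma = 3 (assumed values)
mu, sigma = 10.0, 3.0
ybars = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
print(ybars.mean(), ybars.std(), sigma / np.sqrt(n))         # ~ mu, ~ sigma/sqrt(n)

# sample proportion: population proportion p = 0.3 (assumed value)
p = 0.3
phats = rng.binomial(n, p, size=reps) / n
print(phats.mean(), phats.std(), np.sqrt(p * (1 - p) / n))   # ~ p, ~ sqrt(p(1-p)/n)
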
Inference for one population mean:
Sample size: For a $100(1-\alpha)\%$ confidence with margin of error $E$, $n = \left(\frac{z^*_{\alpha/2}\,\sigma}{E}\right)^2$
Case 1: $\sigma$ is known
A $100(1-\alpha)\%$ C.I. for $\mu$: $\bar{Y} \pm z^*_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Test-Statistic for testing $H_0: \mu = \mu_0$:
$TS = \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}} \sim$ Normal distribution if $H_0$ is true.
Case 2: $\sigma$ is unknown
A $100(1-\alpha)\%$ C.I. for $\mu$: $\bar{Y} \pm t^*_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$
Test-Statistic for testing $H_0: \mu = \mu_0$:
$TS = \frac{\bar{Y} - \mu_0}{S/\sqrt{n}} \sim t$ distribution with $df = n-1$, if $H_0$ is true.
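
A sketch of Case 2 in Python ($\sigma$ unknown): the interval and t test computed directly from the formulas above, then cross-checked against scipy.stats.ttest_1samp. The data and $\mu_0$ are made up.

import numpy as np
from scipy import stats

y = np.array([9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.9, 10.6])   # made-up sample
mu0, alpha = 10.0, 0.05
n, ybar, s = len(y), y.mean(), y.std(ddof=1)

tstar = stats.t.ppf(1 - alpha / 2, df=n - 1)          # t*_{n-1, alpha/2}
ci = (ybar - tstar * s / np.sqrt(n), ybar + tstar * s / np.sqrt(n))

ts = (ybar - mu0) / (s / np.sqrt(n))                  # test statistic
pval = 2 * stats.t.sf(abs(ts), df=n - 1)              # two-sided p-value

print(ci, ts, pval)
print(stats.ttest_1samp(y, mu0))                      # same TS and p-value from scipy
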
Large Sample Inference for one proportion:
Sample size: For a $100(1-\alpha)\%$ confidence with margin of error $E$, $n = \left(\frac{z^*_{\alpha/2}}{E}\right)^2 \hat{p}(1-\hat{p})$
An approximate $100(1-\alpha)\%$ C.I. for $p$: $\hat{p} \pm z^*_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Test-Statistic for testing $H_0: p = p_0$:
$TS = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} \sim$ Normal distribution if $H_0$ is true.
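The one-proportion interval, z test, and sample-size calculation, sketched with scipy.stats.norm; the counts are made up for illustration.

import numpy as np
from scipy import stats

y, n = 56, 120                    # made-up successes and sample size
p0, alpha = 0.40, 0.05
phat = y / n

zstar = stats.norm.ppf(1 - alpha / 2)                  # z*_{alpha/2}
me = zstar * np.sqrt(phat * (1 - phat) / n)            # margin of error
ci = (phat - me, phat + me)

ts = (phat - p0) / np.sqrt(p0 * (1 - p0) / n)          # z test statistic
pval = 2 * stats.norm.sf(abs(ts))                      # two-sided p-value

# required n for a target margin of error E at this confidence, using phat
E = 0.05
n_needed = (zstar / E) ** 2 * phat * (1 - phat)
print(ci, ts, pval, np.ceil(n_needed))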

Inference for two population means:
Independent samples:
Case 1: Assuming $\sigma_1 = \sigma_2$
$S_p = \sqrt{\frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}}$
A $100(1-\alpha)\%$ C.I. for $\mu_1 - \mu_2$: $(\bar{Y}_1 - \bar{Y}_2) \pm t^*_{n_1+n_2-2,\,\alpha/2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Test-Statistic for testing $H_0: \mu_1 - \mu_2 = 0$:
$TS = \frac{\bar{Y}_1 - \bar{Y}_2 - 0}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t$ distribution with $df = n_1 + n_2 - 2$ if $H_0$ is true.
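
A pooled (Case 1) two-sample sketch in Python; the two samples are made up. scipy.stats.ttest_ind with equal_var=True uses this same pooled procedure, so its statistic and p-value should match the formulas.

import numpy as np
from scipy import stats

y1 = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3])   # made-up samples
y2 = np.array([4.4, 4.9, 4.2, 4.6, 4.8])
n1, n2 = len(y1), len(y2)
alpha = 0.05

sp = np.sqrt(((n1 - 1) * y1.var(ddof=1) + (n2 - 1) * y2.var(ddof=1)) / (n1 + n2 - 2))
se = sp * np.sqrt(1 / n1 + 1 / n2)
tstar = stats.t.ppf(1 - alpha / 2, df=n1 + n2 - 2)

diff = y1.mean() - y2.mean()
ci = (diff - tstar * se, diff + tstar * se)
ts = (diff - 0) / se

print(ci, ts)
print(stats.ttest_ind(y1, y2, equal_var=True))   # same TS and two-sided p-value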

Case 2: Not Assuming $\sigma_1 = \sigma_2$
A $100(1-\alpha)\%$ C.I. for $\mu_1 - \mu_2$: $(\bar{Y}_1 - \bar{Y}_2) \pm t^*_{\operatorname{Min}(n_1-1,\,n_2-1),\,\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
Test-Statistic for testing $H_0: \mu_1 - \mu_2 = 0$:
$TS = \frac{\bar{Y}_1 - \bar{Y}_2 - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} \sim t$ distribution with $df = \operatorname{Min}(n_1-1,\, n_2-1)$ if $H_0$ is true.
Paired samples: $D = Y_1 - Y_2$
A $100(1-\alpha)\%$ C.I. for $\mu_D = \mu_1 - \mu_2$: $\bar{D} \pm t^*_{n-1,\,\alpha/2}\,\frac{S_D}{\sqrt{n}}$
Test-Statistic for testing $H_0: \mu_D = \mu_1 - \mu_2 = 0$:
$TS = \frac{\bar{D} - 0}{S_D/\sqrt{n}} \sim t$ distribution with $df = n-1$ if $H_0$ is true.
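
A paired-samples sketch on made-up before/after measurements; scipy.stats.ttest_rel runs the same test as a one-sample t on the differences.

import numpy as np
from scipy import stats

before = np.array([12.0, 11.4, 13.1, 12.6, 11.9, 12.8])   # made-up paired data
after  = np.array([11.2, 11.0, 12.5, 12.1, 11.3, 12.0])
d = before - after
n, alpha = len(d), 0.05

tstar = stats.t.ppf(1 - alpha / 2, df=n - 1)
ci = (d.mean() - tstar * d.std(ddof=1) / np.sqrt(n),
      d.mean() + tstar * d.std(ddof=1) / np.sqrt(n))
ts = d.mean() / (d.std(ddof=1) / np.sqrt(n))

print(ci, ts)
print(stats.ttest_rel(before, after))   # same TS and p-value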

Large sample inference for two proportions:
A $100(1-\alpha)\%$ C.I. for $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) \pm z^*_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Test-Statistic for testing $H_0: p_1 - p_2 = 0$:
$TS = \frac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim$ Normal distribution if $H_0$ is true, where $\hat{p} = \frac{y_1 + y_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$.
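
A two-proportion sketch in Python (made-up counts), showing the unpooled SE for the interval and the pooled $\hat{p}$ for the test, as in the formulas above.

import numpy as np
from scipy import stats

y1, n1 = 45, 100          # made-up successes / sample sizes
y2, n2 = 30, 90
p1, p2 = y1 / n1, y2 / n2
alpha = 0.05

zstar = stats.norm.ppf(1 - alpha / 2)
ci_se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled SE for the C.I.
ci = ((p1 - p2) - zstar * ci_se, (p1 - p2) + zstar * ci_se)

p_pool = (y1 + y2) / (n1 + n2)                             # pooled phat for the test
ts = (p1 - p2) / np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
pval = 2 * stats.norm.sf(abs(ts))
print(ci, ts, pval)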

One Factor ANOVA F-test:
Total number of observations: $N = n_1 + \dots + n_k$
Grand mean: $\bar{Y} = \frac{n_1\bar{Y}_1 + n_2\bar{Y}_2 + \dots + n_k\bar{Y}_k}{N}$
Between-group variability: $MS(B) = \frac{n_1(\bar{Y}_1 - \bar{Y})^2 + \dots + n_k(\bar{Y}_k - \bar{Y})^2}{k-1}$
Within-group variability: $MS(W) = \frac{(n_1-1)S_1^2 + \dots + (n_k-1)S_k^2}{N-k}$
Total sum of squares: $SS(\text{Total}) = SS(\text{Between}) + SS(\text{Within})$
Test-Statistic for testing $H_0: \mu_1 = \mu_2 = \dots = \mu_k$:
$TS = \frac{MS(B)}{MS(W)} \sim F$ distribution, if $H_0$ is true, with $df_1 = k-1$ and $df_2 = N-k$.
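
A one-way ANOVA sketch on made-up data for k = 3 groups, computing MS(B), MS(W), and F from the formulas and cross-checking with scipy.stats.f_oneway.

import numpy as np
from scipy import stats

groups = [np.array([4.2, 4.8, 5.1, 4.5]),        # made-up data for 3 groups
          np.array([5.6, 5.9, 6.1, 5.4, 5.8]),
          np.array([4.9, 5.2, 5.0, 5.3])]
k = len(groups)
ns = np.array([len(g) for g in groups])
N = ns.sum()
grand = sum(g.sum() for g in groups) / N          # grand mean

msb = sum(n * (g.mean() - grand) ** 2 for n, g in zip(ns, groups)) / (k - 1)
msw = sum((n - 1) * g.var(ddof=1) for n, g in zip(ns, groups)) / (N - k)
F = msb / msw
pval = stats.f.sf(F, k - 1, N - k)

print(F, pval)
print(stats.f_oneway(*groups))   # same F statistic and p-value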

Simple Linear Regression:
Model: $\mu\{Y \mid X\} = \beta_0 + \beta_1 X$, $\beta_0$ = Intercept, $\beta_1$ = Slope
Estimated model: $\hat{\mu}\{Y \mid X\} = b_0 + b_1 X$, where $b_1 = r\,\frac{s_y}{s_x}$ and $b_0 = \bar{y} - b_1\bar{x}$
$\hat{\sigma} = s_e = \sqrt{MS(\text{Error})} = \sqrt{\frac{SS(\text{Error})}{n-2}}$, the standard error of the model (the estimated standard deviation of the model)
$SS(\text{Total}) = SS(\text{Regression}) + SS(\text{Error})$
$SS(\text{Regression}) = b_1^2 (n-1) s_x^2$
$SS(\text{Total}) = (n-1) s_y^2$
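
A Python sketch of the fitted model and the sums-of-squares decomposition, using made-up data; scipy.stats.linregress should reproduce the same slope and intercept.

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])     # made-up data
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.8])
n = len(x)

r = stats.pearsonr(x, y)[0]
b1 = r * y.std(ddof=1) / x.std(ddof=1)           # b1 = r * sy/sx
b0 = y.mean() - b1 * x.mean()                    # b0 = ybar - b1*xbar

ss_total = (n - 1) * y.var(ddof=1)
ss_reg = b1 ** 2 * (n - 1) * x.var(ddof=1)
ss_err = ss_total - ss_reg
se = np.sqrt(ss_err / (n - 2))                   # standard error of the model

print(b0, b1, se)
print(stats.linregress(x, y))                    # same slope and intercept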

Inference for slope:
$SE(b_1) = \frac{s_e}{\sqrt{(n-1)s_x^2}}$
A $100(1-\alpha)\%$ C.I. for $\beta_1$: $b_1 \pm t^*_{n-2,\,\alpha/2}\, SE(b_1)$
Test-Statistic for testing $H_0: \beta_1 = 0$:
$TS = \frac{b_1 - 0}{SE(b_1)} \sim t$ distribution with $df = n-2$ if $H_0$ is true.

Inference for population mean:
Estimated value for the mean when $X = x$: $\hat{\mu}\{Y \mid X = x\} = \hat{\mu} = b_0 + b_1 x$
$SE(\hat{\mu}) = s_e\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{(n-1)s_x^2}}$
A $100(1-\alpha)\%$ C.I. for $\mu$: $\hat{\mu} \pm t^*_{n-2,\,\alpha/2}\, SE(\hat{\mu})$

Inference for single observation:
Estimated value for a single future observation $Y$ when $X = x$: $\hat{y} = b_0 + b_1 x$
$SE(\hat{y}) = s_e\sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{(n-1)s_x^2}}$
A $100(1-\alpha)\%$ C.I. for $Y$: $\hat{y} \pm t^*_{n-2,\,\alpha/2}\, SE(\hat{y})$
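
A sketch of the three interval formulas above on the same made-up data; the value x_new is an arbitrary illustrative point.

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])     # same made-up data as above
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.8])
n, alpha, x_new = len(x), 0.05, 3.5

fit = stats.linregress(x, y)
b0, b1 = fit.intercept, fit.slope
se = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))   # s_e
sxx = (n - 1) * x.var(ddof=1)                              # (n-1)*sx^2
tstar = stats.t.ppf(1 - alpha / 2, df=n - 2)

se_b1 = se / np.sqrt(sxx)                                      # SE(b1)
ci_slope = (b1 - tstar * se_b1, b1 + tstar * se_b1)

mu_hat = b0 + b1 * x_new
se_mu = se * np.sqrt(1 / n + (x_new - x.mean()) ** 2 / sxx)    # SE(mu_hat)
se_y = se * np.sqrt(1 + 1 / n + (x_new - x.mean()) ** 2 / sxx) # SE(y_hat)

print(ci_slope,
      (mu_hat - tstar * se_mu, mu_hat + tstar * se_mu),
      (mu_hat - tstar * se_y, mu_hat + tstar * se_y))
print(fit.stderr)   # scipy's standard error of the slope; should match se_b1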


ANOVA F-test:
Test-Statistic for testing $H_0: \beta_1 = 0$:
$TS = \frac{MS(\text{Regression})}{MS(\text{Error})} = \frac{SS(\text{Regression})/1}{SS(\text{Error})/(n-2)} \sim F$ distribution, if $H_0$ is true, with $df_1 = 1$ and $df_2 = n-2$.

Coefficient of Determination:
$R^2 = r^2 = \frac{SS(\text{Total}) - SS(\text{Error})}{SS(\text{Total})} = \frac{SS(\text{Regression})}{SS(\text{Total})}$
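
A short check, on the same made-up data, that the regression F statistic equals the square of the slope t statistic and that $R^2 = r^2$.

import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])     # same made-up data as above
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.8])
n = len(x)

fit = stats.linregress(x, y)
ss_total = (n - 1) * y.var(ddof=1)
ss_reg = fit.slope ** 2 * (n - 1) * x.var(ddof=1)
ss_err = ss_total - ss_reg

F = (ss_reg / 1) / (ss_err / (n - 2))
t = fit.slope / fit.stderr
print(F, t ** 2)                            # F equals t^2 in simple linear regression
print(ss_reg / ss_total, fit.rvalue ** 2)   # R^2 = r^2
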
Chi-Square Test:
$TS = \sum_{i=1}^{k} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} \sim \chi^2$ distribution with $df = k-1$ if $H_0$ is true.
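
A goodness-of-fit sketch on made-up counts for k = 4 categories, assuming equal expected counts under H0, cross-checked with scipy.stats.chisquare.

import numpy as np
from scipy import stats

obs = np.array([18, 22, 31, 29])          # made-up observed counts
exp = np.full(4, obs.sum() / 4)           # H0: all 4 categories equally likely

ts = np.sum((obs - exp) ** 2 / exp)
pval = stats.chi2.sf(ts, df=len(obs) - 1)

print(ts, pval)
print(stats.chisquare(obs, exp))          # same statistic and p-value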

Expected Values:
• $\mu = E[X] = \sum x\, p(x)$
• $\sigma^2 = Var[X] = \sum (x - \mu)^2 p(x)$
• $E[aX \pm bY \pm c] = aE[X] \pm bE[Y] \pm c$
• If $X$ and $Y$ are independent, then $Var[aX \pm bY \pm c] = a^2 Var[X] + b^2 Var[Y]$
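
A quick numerical illustration of these rules for a made-up discrete distribution; the values for E[Y] and Var[Y] are likewise assumed.

import numpy as np

# made-up discrete distribution: values of X and their probabilities
x = np.array([0, 1, 2, 3])
p = np.array([0.1, 0.3, 0.4, 0.2])

mu = np.sum(x * p)                      # E[X]
var = np.sum((x - mu) ** 2 * p)         # Var[X]

# linear combination aX + bY + c, with X and Y independent
a, b, c = 2.0, -1.0, 5.0
mu_Y, var_Y = 1.5, 0.75                 # assumed E[Y], Var[Y]
print(mu, var)
print(a * mu + b * mu_Y + c, a ** 2 * var + b ** 2 * var_Y)   # E and Var of aX + bY + c
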
Standardized values:
$z\text{-value} = \frac{\text{Observation} - \text{Mean}}{\text{Standard Deviation}}$
