
Expectation, Covariance, and Regression

Examining the concept of expectation will provide us with a better feel for the concepts
of means, variances and covariances.
Consider a discrete random variable Y with a probability distribution p(y). The random
variable Y is said to be discrete if there is associated with Y a finite set of points
having positive probabilities which sum to unity. The probability distribution is
simply defined as the set of all pairs (y, p(y)) for which p(y) > 0. You might think
of a chlorophyll mutant which segregates 1:2:1 in the F2 for the phenotypes green,
yellow, and albino.
y     P(y)
0     0.25
1     0.50
2     0.25

(The probability that Y takes on the value 0 is 0.25, etc.)

Then we define expectation as


$$E(Y) = \sum_{y} y\,P(y)$$

or in this example
E(Y) = 0(.25) + 1(.5) + 2(.25) = 1.0
We call this the mean of the distribution and usually symbolize it with $E(Y) = \mu$. You are probably most familiar with unweighted means, where $P(y) = 1/n$.

We define the variance of the random variable Y to be:


$$\mathrm{Var}(Y) = E\left[(Y - \mu)^2\right]$$
In this case, Var(Y) = .25(0 - 1)^2 + .5(1 - 1)^2 + .25(2 - 1)^2 = 0.5
We generally symbolize Var(Y) as $\sigma^2$ and use the following computing formula:

$$\sigma^2 = E(Y^2) - \mu^2$$
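As a quick numeric check, here is a minimal Python sketch (plain Python, no libraries) that computes E(Y) and Var(Y) for the 1:2:1 segregation example, both from the definition and from the computing formula:

```python
# Discrete distribution for the 1:2:1 chlorophyll mutant example: y -> P(y)
dist = {0: 0.25, 1: 0.50, 2: 0.25}

mu = sum(y * p for y, p in dist.items())                        # E(Y)
var_def = sum(p * (y - mu) ** 2 for y, p in dist.items())       # E((Y - mu)^2)
var_comp = sum(p * y ** 2 for y, p in dist.items()) - mu ** 2   # E(Y^2) - mu^2

print(mu, var_def, var_comp)  # 1.0 0.5 0.5
```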
Rules for evaluating expected values
1. If C is a constant, E( C ) = C
2. If C is a constant and X is a random variable, then E (CX) = C E(X)
3. If X1, X2, X3, ..., Xk are k random variables, then E(X1 + X2 + X3 + ... + Xk) = E(X1) + E(X2) + E(X3) + ... + E(Xk)
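These rules are easy to verify by direct enumeration. The sketch below (assuming two independent copies X1 and X2 of the 1:2:1 random variable, and an arbitrary constant c = 3) checks rules 2 and 3:

```python
dist = {0: 0.25, 1: 0.50, 2: 0.25}

def E(f, d):
    """Expectation of f(y) under the distribution d."""
    return sum(p * f(y) for y, p in d.items())

c = 3.0
mu = E(lambda y: y, dist)

# Rule 2: E(cX) = c E(X)
assert abs(E(lambda y: c * y, dist) - c * mu) < 1e-12

# Rule 3, for two independent copies X1 and X2: E(X1 + X2) = E(X1) + E(X2)
joint = {(x1, x2): p1 * p2 for x1, p1 in dist.items() for x2, p2 in dist.items()}
e_sum = sum(p * (x1 + x2) for (x1, x2), p in joint.items())
assert abs(e_sum - 2 * mu) < 1e-12

print("E(cX) =", c * mu, " E(X1 + X2) =", e_sum)
```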

Consider two random variables X and Y that are jointly distributed. The joint (or
bivariate) probability distribution for X and Y is given by:
P(x,y) = P (X=x, Y=y)
It is useful to find a measure of association between X and Y. The covariance is such a
measure and is defined as follows:
$$\mathrm{Cov}(X,Y) = E\left[(X - E(X))(Y - E(Y))\right]$$
This reduces to:
$$\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = E(XY) - \mu_x\mu_y$$
This quantity may not always be easy to evaluate as stated. However, you will often be given information about the two random variables. For example, you may be told that X and Y are independent, which means that their joint expectation is the product of their individual expectations, or $E(XY) = E(X)E(Y)$, and thus the covariance of X and Y is zero. Covariance is usually denoted $\sigma_{xy}$.
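To make this concrete, here is a small sketch that builds an independent joint distribution from the 1:2:1 marginal and confirms that Cov(X, Y) = E(XY) - E(X)E(Y) comes out to zero:

```python
marginal = {0: 0.25, 1: 0.50, 2: 0.25}

# Under independence, P(x, y) = P(x) P(y)
joint = {(x, y): px * py for x, px in marginal.items() for y, py in marginal.items()}

e_x = sum(p * x for (x, y), p in joint.items())
e_y = sum(p * y for (x, y), p in joint.items())
e_xy = sum(p * x * y for (x, y), p in joint.items())

print(e_xy - e_x * e_y)  # 0.0: independent variables have zero covariance
```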
Note the similarity between this expression for covariance ($E(XY) - \mu_x\mu_y$) and the numerator in the expression for a correlation coefficient. In fact, the estimator for $\sigma_{xy}$ is

$$\hat{\sigma}_{xy} = \widehat{\mathrm{Cov}}(X,Y) = \frac{\sum x_i y_i - \frac{1}{n}\sum x_i \sum y_i}{n-1} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1}$$

We know that if two variables are independent they are uncorrelated, and thus the numerator in the correlation coefficient would be zero. This agrees with the foregoing on covariance.
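In practice you compute this estimate from paired observations. A minimal sketch, using made-up numbers purely for illustration:

```python
# Hypothetical paired sample (any x, y columns work the same way)
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0]
n = len(x)

sum_xy = sum(a * b for a, b in zip(x, y))
cov_hat = (sum_xy - sum(x) * sum(y) / n) / (n - 1)  # (sum xy - (1/n) sum x sum y) / (n - 1)

print(cov_hat)  # 11/3 ~= 3.67 for these numbers
```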
Variance of a Linear Function of Random Variables
1. Assume that X and Y are normally distributed random variables with respective means $\mu_1$ and $\mu_2$ and common variance $\sigma^2$. Assume that X and Y are independent. Then
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
Since X and Y are independent, Cov(X, Y) = 0, and Var(X + Y) = Var(X) + Var(Y) = $2\sigma^2$.
2. Where C is a constant and X is a random variable with variance $\sigma^2$, $\mathrm{Var}(CX) = C^2\sigma^2$.
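A quick Monte Carlo sketch (using numpy; the means 1 and 2, σ² = 4, and C = 3 are just assumed values for illustration) shows both results:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, n = 4.0, 1_000_000

# Independent normal draws with means 1 and 2 and common variance sigma2
x = rng.normal(1.0, np.sqrt(sigma2), n)
y = rng.normal(2.0, np.sqrt(sigma2), n)

print(np.var(x + y))  # ~ 2 * sigma2 = 8, since Cov(X, Y) = 0
c = 3.0
print(np.var(c * x))  # ~ c^2 * sigma2 = 36
```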

Other useful equations and identities:


Cov(x, x) = Var(x), i.e., the covariance of a random variable with itself is equal to the variance of the random variable.
Where c is a constant, Cov(c, x) = 0.
And Cov(cx, y) = c Cov(x, y).
Where b is also a constant,
Cov(cx, by) = cb Cov(x, y).
And
Cov[(c + x), y] = Cov(x, y).
And
Cov[(x + y), (w + z)] = Cov(x, w) + Cov(x, z) + Cov(y, w) + Cov(y, z),
where x, y, w, and z are all random variables.
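All of these identities can be checked numerically; the sketch below (numpy, with arbitrary constants c = 2 and b = 5) confirms each one on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
x, y, w, z = rng.normal(size=(4, 100_000))
c, b = 2.0, 5.0

def cov(u, v):
    return np.cov(u, v)[0, 1]  # sample covariance of u and v

print(np.isclose(cov(x, x), np.var(x, ddof=1)))          # Cov(x, x) = Var(x)
print(np.isclose(cov(c * x, y), c * cov(x, y)))          # Cov(cx, y) = c Cov(x, y)
print(np.isclose(cov(c * x, b * y), c * b * cov(x, y)))  # Cov(cx, by) = cb Cov(x, y)
print(np.isclose(cov(c + x, y), cov(x, y)))              # Cov(c + x, y) = Cov(x, y)
print(np.isclose(cov(x + y, w + z),                      # Cov(x+y, w+z) expands
                 cov(x, w) + cov(x, z) + cov(y, w) + cov(y, z)))
```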
Regression
Consider this set of data
Observation   Nitrogen (x)   Yield (y)    xy      x^2     y^2
1             30             73           2190    900     5329
2             20             50           1000    400     2500
3             60             128          7680    3600    16384
4             80             170          13600   6400    28900
5             40             87           3480    1600    7569
6             50             108          5400    2500    11664
7             60             135          8100    3600    18225
8             30             69           2070    900     4761
9             70             148          10360   4900    21904
10            60             132          7920    3600    17424
Sum           500            1100         61800   28400   134660
The simple linear model for regression of yield on nitrogen level is

$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$$

Since $E(\epsilon_i) = 0$, $E(Y_i) = \beta_0 + \beta_1 X_i$, and thus the regression function is $E(Y) = \beta_0 + \beta_1 X$ (you may be more familiar with $Y = \beta_0 + \beta_1 X$), in which the y intercept is $\beta_0$ and the slope is $\beta_1$.
We estimate these parameters using the method of least squares. Given that

$$\epsilon = Y - \hat{Y} = Y - \beta_0 - \beta_1 X,$$

we wish to minimize

$$Q = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2,$$

that is, the sum of the squared deviations from regression.

[Figure: the regression function E(Y) = 9.5 + 2.1X plotted against X (shown from X = 25 to X = 45), with the fitted value E(Y_i) = 104 marked at X = 45.]

The least squares estimators of the parameters $\beta_0$ and $\beta_1$, i.e., those that minimize Q, are:

$$b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$

$$b_0 = \bar{Y} - b_1 \bar{X}$$
Using the data from the table above:

$$b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{61{,}800 - \frac{(500)(1{,}100)}{10}}{28{,}400 - \frac{(500)^2}{10}} = \frac{6{,}800}{3{,}400} = 2.0$$

$$b_0 = \bar{Y} - b_1 \bar{X} = \frac{1}{10}\left[1{,}100 - (2.0)(500)\right] = 10.0$$
In terms of deviations:

$$Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$$

or in words: the total deviation equals the deviation of the fitted value from the mean plus the deviation around the regression line.

The sums of squares follow this same partitioning:

$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$$

Computing formulas:

$$SSTO = \sum y_i^2 - n\bar{y}^2$$
$$SSR = b_1^2 \sum (x_i - \bar{x})^2$$
$$SSE = SSTO - SSR$$

In the data set presented above:

SSTO = 134660 - 121000 = 13660
SSR = (2.0)^2 (3400) = 13600
SSE = 13660 - 13600 = 60
R^2 = 13600/13660 = 0.9956
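The same sums of squares fall out of a few lines of Python (with b1 = 2.0 as estimated above):

```python
x = [30, 20, 60, 80, 40, 50, 60, 30, 70, 60]
y = [73, 50, 128, 170, 87, 108, 135, 69, 148, 132]
n, b1 = len(x), 2.0

ssto = sum(v * v for v in y) - n * (sum(y) / n) ** 2       # 13660.0
ssr = b1 ** 2 * (sum(v * v for v in x) - sum(x) ** 2 / n)  # 13600.0
sse = ssto - ssr                                           # 60.0

print(ssto, ssr, sse, ssr / ssto)  # R^2 ~= 0.9956
```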
The correlation coefficient, r, should confirm this:

$$r = \frac{\widehat{\mathrm{Cov}}(x,y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}} = \frac{\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}}{\sqrt{\left(\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right)\left(\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right)}}$$

Using the data above, the numerator is 61800 - (500)(1100)/10 = 6800, and the corrected sums of squares are 3400 for x and 13660 for y, so

$$r = \frac{6{,}800}{\sqrt{(3{,}400)(13{,}660)}} = \frac{6{,}800}{6{,}814.98} = 0.9978$$

and $r^2 = 0.9956 = R^2$, confirming the regression result.
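And the correlation itself, computed the same way:

```python
import math

x = [30, 20, 60, 80, 40, 50, 60, 30, 70, 60]
y = [73, 50, 128, 170, 87, 108, 135, 69, 148, 132]
n = len(x)

sp_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n  # 6800.0
ss_x = sum(a * a for a in x) - sum(x) ** 2 / n                  # 3400.0
ss_y = sum(b * b for b in y) - sum(y) ** 2 / n                  # 13660.0

r = sp_xy / math.sqrt(ss_x * ss_y)
print(r, r ** 2)  # ~0.9978, ~0.9956; r^2 matches R^2 = SSR/SSTO
```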
