
Random Variables and

Probability Distributions
Ravindra S. Gokhale
IIM Indore
Random Variables
A variable that associates a number with the outcome of a
random experiment is a random variable
Denoted by an uppercase letter such as X, Y, etc.


A random variable can take only numeric values.

Toss of a coin is NOT a random variable.
[It is an experiment that yields random results]
However, number of heads from toss of a coin is a random
variable




Random Variables
Examples:
If two dice are thrown, the sum of the faces is a random variable, as in:
X = 3, X = 11, etc.
If two coins are tossed, then the number of heads is a random variable,
as in: X = 0, X = 2, etc.
In a speed (rpm) measurement: X = 457, X = 1209, etc.
In a dimension measurement with the help of a caliper: X = 23.46, X =
48.97, etc.





Random Variables (cont)
Discrete Random Variables
A discrete random variable is a random variable with a finite or
countably infinite range
Examples:
Number of scratches on a car surface
Proportion of defective parts among 1000 tested
Number of people arriving at a bank in a given time interval
Random Variables (cont)
Continuous Random Variables
A continuous random variable is a random variable with an interval
(either finite or infinite) of real numbers for its range
Examples:
Length dimension (like length of a table)
Time dimension (like time between failures for a machine)
Temperature dimension (like temperature inside a room)
Random Variables (cont)
Discrete or Continuous?
Population in a particular state of India.
Total weight of consignments handled by a courier company in a
day.
Time to complete an exam.
Number of participants in an exit poll.
Total number of goals scored in a football game.
Life of a particular medicine.
Height of the ocean's tide at a given location.
Amount of rain on a particular day.
Number of train derailments in a year.
Random Variables (cont)
Expression of Random Variables
The manner in which random variables are expressed sometimes
depends on the problem at hand
Sometimes a random variable is discrete in nature, but it is
treated as continuous
This is because the range of values it can take is too large
Example: Marks of a student in a 100-mark paper
Random Variables (cont)
Expression of Random Variables (cont)
Sometimes a random variable is continuous in nature, but it is
treated as discrete
This is because the exact value (to the smallest level) is not required
Example: Age of a person may be expressed as a discrete random
variable forming different categories: 0-21, 21-35, 35-50, 50-65, 65+
Probability Distributions
Probability distribution of a random variable X is a formula, table,
or graph that gives all possible values of X and the corresponding
probabilities P(X = x) for every possible value x of X.
Example: Probability distribution of the roll of a die:

x          1    2    3    4    5    6
P(X = x)  1/6  1/6  1/6  1/6  1/6  1/6
Probability Distributions
Standard probability models (probability distributions) are
available in the literature and have been studied in detail.

These models can mimic many real life scenarios very well and
have mathematically tractable representation.


Discrete Random Variables
Examples:
If two coins are tossed and we are interested in the number of
heads obtained (X), then:
P(X = 0) = 0.25 P(X = 1) = 0.50
P(X = 2) = 0.25 P(X = 3) = 0
P(X > 1) = 1 - [P(X = 0) + P(X = 1)] = 1 - (0.25 + 0.50) = 0.25

In a lot that contains 10% defective pieces, if we are interested in the
number of defective pieces (X) in a sample of 5, then:
P(X = 0) = 0.590 P(X = 1) = 0.328
P(X = 2) = 0.073 P(X = 3) = 0.008
P(X = 4) ≈ 0.0005 P(X = 5) ≈ 0.00001
P(X <= 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.590 + 0.328 + 0.073
= 0.991






Discrete Random Variables (cont)
Terminologies associated with discrete random variables
Probability mass function (pmf) denoted by f(x)
Cumulative distribution function (cdf) denoted by F(x)






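Probability mass function of a fair die:
$f(x_i) = P(X = x_i)$
[Figure: pmf of a fair die; x-axis: x = 1, ..., 6; y-axis: f(x)]

Cumulative distribution function of a fair die:
$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
[Figure: cdf of a fair die; x-axis: x = 0, ..., 6; y-axis: F(x) = 1/6, 2/6, ..., 1]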
Discrete Random Variables (cont)
Mean and Variance of a Discrete Random Variable
Mean of a discrete random variable
Mean is the expected value of the random variable, denoted by
$\mu$ or E(X)
It is the measure of the center of the probability distribution
Formula: $\mu = E(X) = \sum_x x \, f(x)$

If we make an infinite number of draws from the distribution of a
random variable and calculate the average of the data, then the
average is the expected value (or mean) of the random variable.

Note: The expected value should not be confused with the most
likely value.
Discrete Random Variables (cont)
A simple example:

You can insure a Rs.500,000 jewellery against theft for its total
value by annual premium of Rs. R. If the probability of theft in a
given year is estimated to be 0.01, what premium should the
insurance company charge if it wants an annual expected gain
equal to Rs. 10,000?
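
One possible setup, sketched under the assumption that the company's only cash flows are the premium R collected and a single payout of the full Rs. 500,000 if a theft occurs. The symbol G (the company's annual gain) is introduced here only for illustration.

```latex
\begin{aligned}
E(G) &= (0.99)(R) + (0.01)(R - 500000) = R - 5000 \\
R - 5000 &= 10000 \;\Rightarrow\; R = 15000
\end{aligned}
```

Under these assumptions the premium would be Rs. 15,000 per year.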

Discrete Random Variables (cont)
Mean and Variance of a Discrete Random Variable

Variance of a discrete random variable
Denoted by $\sigma^2$ or V(X)
It is the measure of the dispersion or variability in the probability
distribution
Formula: $\sigma^2 = V(X) = \sum_x (x - \mu)^2 f(x) = \left[\sum_x x^2 f(x)\right] - \mu^2$

The standard deviation ($\sigma$) of X is the (positive) square root of the
variance
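
As a quick numerical check of the mean and variance formulas, here is a minimal Python sketch for the fair die used earlier in these slides. The names `pmf`, `mean`, `variance_def`, and `variance_alt` are illustrative, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# pmf of a fair die: f(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: sum over x of x * f(x)
mean = sum(x * p for x, p in pmf.items())

# Variance, computed with both forms of the formula
variance_def = sum((x - mean) ** 2 * p for x, p in pmf.items())
variance_alt = sum(x**2 * p for x, p in pmf.items()) - mean**2

print(mean, variance_def, variance_alt)
```

The mean comes out as 7/2, a value the die can never show, which illustrates why the expected value should not be confused with the most likely value.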
Bernoulli Trial
A basic building block for all the discrete probability distributions

A trial has only two possible outcomes
Usually termed as: success and failure

Examples:
Did tossing of a coin lead to a head (success) or not?
Did the student pass the exam (success) or not?
Did India lose the match (success) or not?
Was the part defective (success) or not?

Probability of success is denoted by p

Bernoulli Trial
Mean of a Bernoulli Trial = p

Variance of a Bernoulli Trial = p (1 - p)




Derive
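
A sketch of the derivation, taking X = 1 for a success (probability p) and X = 0 for a failure (probability 1 - p):

```latex
\begin{aligned}
E(X) &= 0 \cdot (1 - p) + 1 \cdot p = p \\
E(X^2) &= 0^2 \cdot (1 - p) + 1^2 \cdot p = p \\
V(X) &= E(X^2) - [E(X)]^2 = p - p^2 = p(1 - p)
\end{aligned}
```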

Binomial Random Variable
A random experiment consists of n Bernoulli trials such that:
The trials are independent
Each trial results in only two possible outcomes: success and
failure
The probability of success in each trial (denoted as p) remains
constant

The random variable X that equals the number of trials that result
in a success follows a Binomial Distribution


Binomial Random Variable
Exercise:

A jar contains five rings: three red and two white. Two
rings are randomly selected without replacement from the
jar, and the number X of red rings obtained is recorded.
Explain why X is or is not a binomial random variable.




Binomial Random Variable
A random variable X following a Binomial Distribution is denoted
by:
X ~ Binomial(n, p)

n and p are the parameters of the binomial distribution
n = 1, 2, 3, ... and 0 < p < 1


What is the meaning of Parameters of a Distribution?

Binomial Random Variable

The pmf of the binomial random variable X is given by:
$f(x) = {}^{n}C_{x} \, p^{x} (1 - p)^{n-x}$, x = 0, 1, 2, ..., n
This is the probability of x successes in n trials

Note: ${}^{n}C_{x} = n! / [x! \, (n-x)!]$

Understanding the pmf of a Binomial Distribution:
Probability of x successes is $p^{x}$.
Probability of (n - x) failures is $(1 - p)^{n-x}$.
x successes in n trials can happen in ${}^{n}C_{x}$ ways
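
The pmf translates directly into code. The sketch below uses a hypothetical helper `binomial_pmf` and re-computes the earlier example of a sample of 5 from a lot with 10% defectives, on the assumption that the sample behaves like 5 independent Bernoulli trials with p = 0.1.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Number of defectives in a sample of 5 when 10% of the lot is defective.
pmf = [binomial_pmf(x, n=5, p=0.1) for x in range(6)]

for x, prob in enumerate(pmf):
    print(f"P(X = {x}) = {prob:.5f}")
print(f"P(X <= 2) = {sum(pmf[:3]):.5f}")
```

`math.comb` keeps the binomial coefficient exact, so the only rounding comes from the floating-point powers of p and 1 - p.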

Binomial Random Variable

For a binomial random variable with parameters n and p:
mean = $\mu = np$ and variance = $\sigma^2 = np(1 - p)$

Derive
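
One common route, sketched here, writes X as a sum of n independent Bernoulli indicators and uses the rules for linear combinations of independent random variables covered later in these slides. The indicator symbols $X_1, \ldots, X_n$ are introduced only for illustration.

```latex
\begin{aligned}
X &= X_1 + X_2 + \cdots + X_n, \qquad X_i \sim \mathrm{Bernoulli}(p) \\
E(X) &= \sum_{i=1}^{n} E(X_i) = np \\
V(X) &= \sum_{i=1}^{n} V(X_i) = np(1 - p)
\end{aligned}
```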


Binomial Random Variable

Effect of parameters on the shape of the distribution.


Problems
A batch of 500 machined parts contains 10 that are defective.
Parts are selected successively without replacement, until a non-
conforming part is obtained. The random variable is the number of
parts selected.
What is the range of the random variable?

Discrete Random Variables
Problems (cont)
In a semiconductor manufacturing process, three wafers from a
lot are tested. Each wafer is classified as pass or fail. Assume
that the probability that the wafer passes the test is 0.8, and that
the wafers are independent.
Determine the probability mass function of the number of wafers from a
lot that passes the test.
Determine the cumulative distribution function for the random variable.
Determine the mean and the variance of the random variable.

pmf, cdf, mean and variance of Discrete Random Variables
Problems (cont)
Because not all airline passengers show up for their reserved seat,
an airline adopts a policy of overbooking 5 seats for a flight that
has a capacity of 120 seats. The probability that a passenger does
not show up is 0.1 and all the passengers are assumed to behave
independently.
What is the probability that every passenger who shows up can take the
flight?
What is the probability that the flight departs with empty seat(s)?
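
A sketch of how these two probabilities could be computed, under the reading that 125 tickets are sold (120 seats plus 5 overbooked) and that X, the number of passengers who show up, follows Binomial(125, 0.9). That reading and the helper names are assumptions, not part of the problem statement.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

tickets_sold = 125   # assumed: capacity 120 plus 5 overbooked seats
seats = 120
p_show = 0.9         # 1 minus the no-show probability of 0.1

# Every passenger who shows up can fly when X <= 120.
p_all_seated = sum(binomial_pmf(x, tickets_sold, p_show) for x in range(seats + 1))

# The flight departs with at least one empty seat when X <= 119.
p_empty_seats = sum(binomial_pmf(x, tickets_sold, p_show) for x in range(seats))

print(f"P(X <= 120) = {p_all_seated:.4f}")
print(f"P(X <= 119) = {p_empty_seats:.4f}")
```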

Binomial Distribution
Problems (cont)
A manufacturer has 100 customer orders to be satisfied. Each
order requires one component part that is purchased from a
supplier. However, 2% of the components are identified as
defective and the components are assumed to be independent.
If the manufacturer stocks 100 components what is the probability that
the 100 orders can be filled up?
If the manufacturer stocks 105 components what is the probability that
the 100 orders can be filled up?

Binomial Distribution
Problems (cont)
A multiple choice test contains 25 questions, each with 4 answers.
Assume that a student just guesses each question.
What is the probability that a student answers more than 20 questions
correctly?
What is the probability that the student answers less than 5 questions
correctly?
What will be the mean marks scored by a student?
What is the variance of the marks scored?

Binomial Distribution
Poisson Distribution
It is derived from a binomial distribution
Limiting case of the binomial distribution
Applied to systems with a large number of possible events, each
of which is rare

Poisson Distribution
With reference to a binomial distribution, if n becomes very large
and p becomes considerably small, such that the product of n
and p (denoted by $\lambda$) remains some manageable constant, then
$\lim_{n \to \infty} P(X = x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$


The assumption of independence is still required
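
The limit can be observed numerically. The sketch below holds n * p fixed at an illustrative value of 5 (not taken from the slides) and measures the largest gap between the binomial and Poisson pmfs as n grows. The helper names are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 5.0  # illustrative value of n * p, held fixed while n grows
gaps = []
for n in (10, 100, 1000, 10000):
    p = lam / n
    xs = range(0, min(n, 30) + 1)  # compare the pmfs on x = 0, ..., 30
    gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in xs)
    gaps.append(gap)
    print(f"n = {n:>5}, p = {p:.4f}, largest pmf gap = {gap:.6f}")
```

The gap shrinks as n increases, which is the sense in which the Poisson distribution is a limiting case of the binomial.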

Poisson Distribution
Example (to distinguish Poisson distribution from Binomial
distribution):
A page in a book can have two outcomes: with error and without error
The probability that a page has an error, in a book published by a good
publishing house, is very small (this is the p)
But the number of pages in the book will be large, say 500 (this is n)
In this case, the distribution of the number of pages with error in a
book may be modeled as a Poisson distribution

Another Example:
Number of bike accidents in a city in a week

Poisson Distribution
Typical application:
Poisson distribution is appropriate for a random variable that
counts the number of occurrences of an event of interest in a
given time interval.

Other application:
Number of surface defects
Number of errors

Poisson Distribution (cont)
A random variable X following a Poisson distribution is denoted by
X ~ Poisson($\lambda$)
$\lambda$ is the parameter of the Poisson distribution
$\lambda$ can be considered as the product of n and p
Note, however, that the Poisson Distribution does not require
knowledge of n and p

Poisson Distribution (cont)
The pmf of the Poisson random variable X is given by:
$f(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$, x = 0, 1, 2, ...; $\lambda > 0$
Note:

The range of X is integers from 0 to infinity (and not
bounded by n unlike Binomial Distribution)

For a Poisson random variable with parameter $\lambda$:
mean = $\mu = \lambda$ and variance = $\sigma^2 = \lambda$


Derive

Poisson Distribution (cont)
Important Notes:

In practical applications, $\lambda$ will correspond to the rate of
something per some unit (example: number of people arriving
at a bank per hour, number of defects per square meter of a
surface)
It is important to use consistent units in calculating
probabilities, means, and variances involving Poisson random
variables


Poisson Distribution (cont)
Effect of the value of parameter $\lambda$ on the shape of the
distribution


Problems (cont)
Suppose that the number of customers that enter a bank in an
hour is a Poisson random variable, and suppose that P(X = 0) =
5%.
Determine the mean of X?
Determine the variance of X?
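
A minimal sketch of the reasoning: for a Poisson random variable the pmf at zero is e^(-lambda), so the given P(X = 0) fixes lambda, and the mean and variance both equal lambda. The variable names are illustrative.

```python
from math import exp, log

p_zero = 0.05        # given: P(X = 0) = 5%
lam = -log(p_zero)   # because P(X = 0) = e^(-lambda)

mean = lam           # mean of a Poisson random variable
variance = lam       # variance of a Poisson random variable

print(f"lambda = mean = variance = {lam:.4f} customers per hour")
print(f"check: e^(-lambda) = {exp(-lam):.4f}")
```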

Poisson Distribution
Problems (cont)
The number of surface flaws in plastic panels used in the interior
of automobiles has a Poisson distribution with a mean of 0.05
flaws per square foot of plastic panel. Assume that an automobile
interior contains 10 square feet of plastic panel.
What is the probability that there are no surface flaws in an auto's
interior?
If 10 cars are sold to a rental company, what is the probability that none
of the 10 cars has any surface flaws?
If 10 cars are sold to a rental company, what is the probability that at
most one car has any surface flaws?

Poisson Distribution
Problems (cont)
The number of failures of a testing instrument from contamination
particles on the product is a Poisson random variable with a mean
of 0.02 failures per hour.
What is the probability that the instrument does not fail in an 8-hour
shift?
What is the probability that there is at least one failure in a 24-hour
day?

Poisson Distribution
Problems (cont)
The increased number of small commuter planes in major airports
has heightened concern over air safety. An airport has recorded a
monthly average of five near-misses on landings and take offs in
the past 5 years.
Find the probability that during a given month there are no near-
misses on landings and take offs at the airport.
Find the probability that during a given month there are five near-
misses.
Find the probability that there are at least five near-misses during a
particular month.

Poisson Distribution
Problems (cont)
In a food processing and packaging plant, there are, on an
average, two packaging machine breakdowns per week.
What is the probability that there are no machine breakdowns
in a given week?
Calculate the probability that there are no more than two
machine breakdowns in two weeks?

Poisson Distribution
Case
US Public Healthcare Service

Continuous Random Variables
Examples:
When a machine breaks down, it is serviced. It runs for some time until
it again breaks down. We are interested in the event time (in hours)
between successive breakdowns, then:
P(X < 10) = ? P(X > 250) = ?
P(50 < X < 150) = ?

A finance executive wants to predict the various financial ratios (say X,
Y, etc.) of different organizations, based on past data. For a particular
organization, he may be interested in:
P(X > 0.75) = ? P(Y < 0.6) = ?
P(0.35 < X < 0.50) = ?





Continuous Random Variables (cont)
Terminologies associated with continuous random variables
Probability density function (pdf) denoted by f(x)
Cumulative distribution function (cdf) denoted by F(x)

pdf
Resembles a histogram
Used to calculate an area that represents the probability that X
takes a value in [a, b]
$P(a \le X \le b) = \int_{a}^{b} f(x) \, dx$

cdf
An alternative way to represent the distribution
Extends the definition of f(x) to the entire real line
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u) \, du$ for $-\infty < x < \infty$

Important: Probability that a continuous random variable takes a
particular value is zero.
Continuous Random Variables (cont)
Mean and Variance of a Continuous Random Variable
Mean of a continuous random variable
Defined similarly to that of a discrete random variable
Denoted by $\mu$ or E(X)
Formula: $E(X) = \mu = \int_{-\infty}^{\infty} x \, f(x) \, dx$

Variance of a continuous random variable
Defined similarly to that of a discrete random variable
Denoted by $\sigma^2$ or V(X)
Formula: $V(X) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx = \left[\int_{-\infty}^{\infty} x^2 f(x) \, dx\right] - \mu^2$

The standard deviation ($\sigma$) of X is the square root of the variance
Uniform Distribution
The simplest type of continuous distribution to understand
A random variable X following a Uniform Distribution is denoted
by: X ~ Uniform(a, b)
The pdf of a uniform distribution is:
$f(x) = 1 / (b - a)$, a <= x <= b

a and b are the parameters of the uniform distribution
For a uniform random variable over a <= x <= b:
mean = $\mu = (a + b)/2$ and variance = $\sigma^2 = (b - a)^2 / 12$

Derive
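
A sketch of the derivation for X ~ Uniform(a, b), using the pdf f(x) = 1/(b - a) on a <= x <= b:

```latex
\begin{aligned}
E(X) &= \int_a^b \frac{x}{b-a}\,dx = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2} \\
E(X^2) &= \int_a^b \frac{x^2}{b-a}\,dx = \frac{b^3 - a^3}{3(b-a)} = \frac{a^2 + ab + b^2}{3} \\
V(X) &= E(X^2) - [E(X)]^2 = \frac{a^2 + ab + b^2}{3} - \frac{(a+b)^2}{4} = \frac{(b-a)^2}{12}
\end{aligned}
```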

Problems
The probability density function of the length of a metal rod is f(x)
= 2 for 2.3 < x < 2.8 meters
If the specifications of this process are 2.25 to 2.75 meters, what
proportion of the rods fail to meet the specifications?
Determine the mean and the variance of the length of the metal rod
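
Because f(x) = 2 on an interval of width 0.5, this density is that of a Uniform(2.3, 2.8) random variable. The sketch below works the problem on that basis; `uniform_cdf` is a hypothetical helper, not a library function.

```python
a, b = 2.3, 2.8                       # support of the rod length, in meters
lower_spec, upper_spec = 2.25, 2.75   # specification limits, in meters

def uniform_cdf(x: float, a: float, b: float) -> float:
    """P(X <= x) for X ~ Uniform(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Proportion of rods inside the specification limits, then its complement.
p_within = uniform_cdf(upper_spec, a, b) - uniform_cdf(lower_spec, a, b)
p_fail = 1.0 - p_within

mean = (a + b) / 2
variance = (b - a) ** 2 / 12

print(f"P(fail) = {p_fail:.3f}, mean = {mean:.3f} m, variance = {variance:.5f} m^2")
```

Only the upper limit matters here: the lower specification of 2.25 m lies below the smallest possible length of 2.3 m.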

pdf, mean and variance of Continuous Random Variables
Problems (cont)
The net weight (in kilogram) of a packaged chemical powder
follows a uniform distribution for 49.75 < x < 50.25 kilogram
Determine the mean and the variance of the weight of the packages.
Determine the probability of a randomly selected package being less
than 50.1 kilogram.

Uniform Distribution
Problems (cont)
The manager of a local soft-drink company believes that when a
new beverage-dispensing machine is set to dispense 7 ounces, it
in fact dispenses an amount X at random anywhere between 6.5
and 7.5 ounces inclusive. Suppose X has a uniform probability
distribution.

Draw the graph of distribution function.
Find the mean and the standard deviation

Uniform Distribution
Problems (cont)
An officer of the highway patrol is assigned to assist motorists
should they become involved in an accident or have a mechanical
breakdown. He can use either of the two strategies:
Locating his patrol point at the midpoint of the highway.
Patrolling the entire stretch of the highway.

Which strategy is better in terms of faster response?

Uniform Distribution
Exponential Distribution
Preamble:
For a Poisson distribution, the number of people arriving at a
bank in one hour (i.e. arrival rate) is of interest
In some cases the time between arrivals may be of interest
This is exactly what is described by an exponential distribution
If the arrival rate is a Poisson random variable then the
corresponding time between arrivals is an exponential random
variable


Exponential Distribution
A random variable X following an Exponential Distribution is
denoted by:
X ~ Exponential($\lambda$)

$\lambda$ is the parameter of the exponential distribution
Important: $\lambda$ is the mean of the corresponding Poisson process
(example: arrival rate, and not the mean time between arrivals)

The pdf of an exponential distribution is:
$f(x) = \lambda e^{-\lambda x}$ for $0 \le x < \infty$

Exponential Distribution (cont)
The cdf of an exponential distribution is:
$F(x) = P(X \le x) = 1 - e^{-\lambda x}$ for x >= 0

For an exponential random variable with parameter $\lambda$:
mean = $\mu = 1/\lambda$ and variance = $\sigma^2 = 1/\lambda^2$



Derive

Exponential Distribution (cont)
Lack of memory property of the exponential distribution
Mathematically: $P(X < t_1 + t_2 \mid X > t_1) = P(X < t_2)$
Implication (Example): Suppose the time between arrival of a
city bus is exponentially distributed with a mean of 15 minutes.
If you have already waited at the bus stop for 1 hour, then the
probability that a bus will arrive in the next 10 minutes is equal
to the probability that a bus would have arrived in the next 10
minutes as soon as you come to the bus stop (that is, without
the fact that you waited for one hour)


Derive
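
A sketch of the derivation, using the definition of conditional probability and the exponential cdf $F(x) = 1 - e^{-\lambda x}$:

```latex
\begin{aligned}
P(X < t_1 + t_2 \mid X > t_1)
  &= \frac{P(t_1 < X < t_1 + t_2)}{P(X > t_1)}
   = \frac{F(t_1 + t_2) - F(t_1)}{1 - F(t_1)} \\
  &= \frac{e^{-\lambda t_1} - e^{-\lambda (t_1 + t_2)}}{e^{-\lambda t_1}}
   = 1 - e^{-\lambda t_2} = P(X < t_2)
\end{aligned}
```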

Problems
A catalog company that receives the majority of its orders by
telephone conducted a study to determine how long customers
are willing to wait on hold before ordering a product. The length of
time was found to be a random variable best approximated by an
exponential distribution with mean equal to 2.8 minutes. What
proportion of customers have to hold more than 3 minutes before
placing an order?

Exponential Distribution
Problems (cont)
The time between arrivals of taxis at a busy intersection is
exponentially distributed with a mean of 10 minutes.
What is the probability that you wait longer than one hour for a taxi?
Suppose you have already been waiting for one hour for a taxi, what is
the probability that one arrives within the next 10 minutes?
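
The two questions can be worked with the exponential cdf, and the second one shows the lack of memory property numerically. In this sketch `exponential_cdf` is a hypothetical helper and the rate is taken as 1 / (mean time between arrivals).

```python
from math import exp

mean_wait = 10.0        # minutes between taxi arrivals (given)
lam = 1.0 / mean_wait   # rate parameter, taxis per minute

def exponential_cdf(x: float, lam: float) -> float:
    """P(X <= x) for X ~ Exponential(lam)."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0

# P(wait longer than one hour)
p_over_hour = 1.0 - exponential_cdf(60.0, lam)

# P(taxi within the next 10 minutes | already waited 60 minutes)
p_next_10_given_60 = (
    exponential_cdf(70.0, lam) - exponential_cdf(60.0, lam)
) / (1.0 - exponential_cdf(60.0, lam))

# P(taxi within 10 minutes) for someone who has just arrived
p_next_10_fresh = exponential_cdf(10.0, lam)

print(f"P(X > 60) = {p_over_hour:.5f}")
print(f"P(X < 70 | X > 60) = {p_next_10_given_60:.5f}")
print(f"P(X < 10) = {p_next_10_fresh:.5f}")
```

The last two printed values agree, which is the lack of memory property: the hour already spent waiting does not change the distribution of the remaining wait.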

Exponential Distribution
Problems (cont)
The lifetime of a mechanical assembly in a vibration test is
exponentially distributed with a mean of 400 hours.
What is the probability that an assembly on test fails in less than 100
hours?
What is the probability that an assembly operates for more than 500
hours before failure?
If an assembly has been on test for 400 hours without a failure, what is
the probability of a failure in the next 100 hours?

Exponential Distribution
Normal Distribution
Preamble:
The most widely used model for describing a random variable
Outcomes of a large number of real-life situations have a bell-
shaped frequency distribution that can be modelled by a Normal
Distribution.
Central Limit Theorem is associated with the Normal Distribution
The mean, median, and mode of a Normal Distribution are
theoretically the same.
The range of the variable extends from $-\infty$ to $+\infty$


Normal Distribution

A random variable X following a Normal Distribution is denoted
by:
X ~ Normal($\mu$, $\sigma^2$)
$\mu$ and $\sigma^2$ are the parameters of the normal distribution


Normal Distribution (cont)
A normal random variable described by Normal(0, 1),
that is $\mu = 0$ and $\sigma^2 = 1$, is called a standard normal random
variable
Denoted by Z, that is, Z ~ Normal(0, 1)

Any normal random variable X ~ Normal($\mu$, $\sigma^2$) can be mapped to
the standard normal random variable:
$Z = (X - \mu) / \sigma$

How to read and use the standard normal distribution table?
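
A printed table lists values of Phi(z) = P(Z <= z). As a rough stand-in for the table, the sketch below evaluates Phi(z) through the error function, using the identity Phi(z) = 0.5 * (1 + erf(z / sqrt(2))). The function names and the example values of mu, sigma, and x are hypothetical.

```python
from math import erf, sqrt

def standard_normal_cdf(z: float) -> float:
    """Phi(z) = P(Z <= z) for Z ~ Normal(0, 1)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ Normal(mu, sigma^2), via standardization."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)

# Hypothetical example: X ~ Normal(50, 10^2), so x = 65 maps to z = 1.5.
mu, sigma = 50.0, 10.0
p_below_65 = normal_cdf(65.0, mu, sigma)

print(f"Phi(1.96) = {standard_normal_cdf(1.96):.4f}")
print(f"P(X < 65) = {p_below_65:.4f}")
```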


Problems
A Normal random variable is denoted by X ~ Normal(1.2, $0.15^2$).
Find the following probabilities:
P(X < 1.10)
P(X > 1.38)
P(1.35 < X < 1.5)

Normal Distribution
Problems (cont)
Find $z_0$ such that:
$P(Z > z_0) = 0.025$
$P(Z < z_0) = 0.925$
$P(-z_0 < Z < z_0) = 0.8262$

Normal Distribution
Problems (cont)
The compressive strength of samples of cement can be modeled
by a normal distribution with a mean of 6000 kilograms per
square centimeter and a standard deviation of 100 kilograms per
square centimeter.
What is the probability that a sample's strength is less than
6250 kg/cm$^2$?
What is the probability that a sample's strength is between 5800
and 5900 kg/cm$^2$?
What strength is exceeded by 95% of the samples?
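
These three questions can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which offers `cdf` and `inv_cdf`. The variable names below are illustrative.

```python
from statistics import NormalDist

# Compressive strength in kg/cm^2: mean 6000, standard deviation 100.
strength = NormalDist(mu=6000, sigma=100)

# P(X < 6250)
p_below_6250 = strength.cdf(6250)

# P(5800 < X < 5900)
p_between = strength.cdf(5900) - strength.cdf(5800)

# Strength exceeded by 95% of samples: the 5th percentile.
x_exceeded_by_95 = strength.inv_cdf(0.05)

print(f"P(X < 6250) = {p_below_6250:.4f}")
print(f"P(5800 < X < 5900) = {p_between:.4f}")
print(f"Strength exceeded by 95% of samples = {x_exceeded_by_95:.1f}")
```

"Exceeded by 95%" means 95% of the probability lies above the value, so the quantile to look up is 0.05, not 0.95.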

Normal Distribution
Problems (cont)
The reaction time of a driver to visual stimulus is normally
distributed with a mean of 0.4 seconds and a standard deviation
of 0.05 seconds.
What is the probability that a reaction requires more than 0.5
seconds?
What is the probability that a reaction requires between 0.4 and
0.5 seconds?
What is the reaction time that is exceeded 90% of the time?

Normal Distribution
In a certain city, the daily supply of electric power (in mega watt) can be
treated as a random variable having a normal distribution with mean 300 MW
and s.d. 50 MW.

Since supply is not a constant, the local authorities have imposed a system
of rationing to deal with the problem. It is known that to ensure proper
rationing a minimum of 250 MW supply is required; otherwise load shedding
is to be imposed.

There is no need of rationing whenever supply exceeds 350 MW. On the other
hand, maximum consumption of the city can never exceed 425 MW.
Find the percentage of the days
1. in which the city experiences load-shedding.
2. in which proper power rationing is implemented.
3. when there is an excess of power supply.
Problems (cont)
Normal Distribution
The average mileage before a major breakdown of a particular
bike is 60000 km, with an s.d. of 10000 km. The
manufacturer wishes to warranty these bikes, offering to
make necessary service free of charge if the new bike
has a breakdown before covering a certain number of km.

Assume that the mileage before a major breakdown is
normally distributed. Determine for how many km the
manufacturer should warranty the bikes, so that not more than 3% of the
new bikes come for free service.
Problems (cont)
Normal Distribution
A wholesale distributor of fertilizer products finds that the
annual demand for one type of fertilizer is normally distributed
with a mean of 120 tonnes and s.d. of 16 tonnes.

If he orders only once a year, what quantity should be ordered
to ensure that there is only a 5% chance of running short of
stock?
Problems (cont)
Normal Distribution
Case
Specialty Toys

Normal Distribution
Joint Distribution
Let $X_1$ denote the random variable describing the outcome from
the roll of die 1 and let $X_2$ denote the random variable describing
the outcome from the roll of die 2; then the joint probability
distribution of $X_1$ and $X_2$ is the following:

X1 \ X2    1     2     3     4     5     6
   1     1/36  1/36  1/36  1/36  1/36  1/36
   2     1/36  1/36  1/36  1/36  1/36  1/36
   3     1/36  1/36  1/36  1/36  1/36  1/36
   4     1/36  1/36  1/36  1/36  1/36  1/36
   5     1/36  1/36  1/36  1/36  1/36  1/36
   6     1/36  1/36  1/36  1/36  1/36  1/36
Joint Distribution
$P(X_1 = 2, X_2 = 3) = ?$
$P(X_1 = 3, X_2 = 1) = ?$
Joint Distribution
$P(X_1 + X_2 = 3) = ?$

$(X_1 + X_2)$ is another random variable having its own distribution.
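
The distribution of the sum can be built by enumerating the 36 cells of the joint table and adding up the probabilities of cells that give the same total. This is a minimal sketch with illustrative names (`joint`, `sum_pmf`).

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of two fair dice: every (x1, x2) pair has probability 1/36.
joint = {(x1, x2): Fraction(1, 36) for x1, x2 in product(range(1, 7), repeat=2)}

# pmf of the sum: add the probabilities of all cells with the same total.
sum_pmf = defaultdict(Fraction)
for (x1, x2), prob in joint.items():
    sum_pmf[x1 + x2] += prob

print("P(X1 = 2, X2 = 3) =", joint[(2, 3)])
print("P(X1 + X2 = 3)    =", sum_pmf[3])
for total in sorted(sum_pmf):
    print(total, sum_pmf[total])
```

Two cells, (1, 2) and (2, 1), give a total of 3, so the sum takes that value with probability 2/36 = 1/18.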
Linear Combinations of Random Variables
The linear combination of random variables leads to another
random variable.
Example: If length (say $X_1$) and width (say $X_2$) are random
variables, then the perimeter (say Y) is another random variable
and $Y = 2(X_1 + X_2)$
Y is a linear combination of $X_1$ and $X_2$

In general, given random variables $X_1, X_2, \ldots, X_n$ and constants
$c_1, c_2, \ldots, c_n$, then:
$Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n$ is a linear combination of
$X_1, X_2, \ldots, X_n$


Linear Combination of Random Variables (cont)
If Y is a linear combination of $X_1, X_2, \ldots, X_n$ AND if
$X_1, X_2, \ldots, X_n$ are independent, then
Mean of Y is:
$E(Y) = c_1 E(X_1) + c_2 E(X_2) + \cdots + c_n E(X_n)$

Variance of Y is:
$V(Y) = c_1^2 V(X_1) + c_2^2 V(X_2) + \cdots + c_n^2 V(X_n)$

Linear Combination of Random Variables (cont)
Special Case 1
Combination that represents the average of n independent
random variables with identical means and variances
If $X_1, X_2, \ldots, X_n$ are independent random variables, with each having
a mean $\mu$ and a variance $\sigma^2$, and
if $Y = (X_1 + X_2 + \cdots + X_n) / n$, then,
Mean of Y is:
$E(Y) = (1/n)\mu + (1/n)\mu + (1/n)\mu + \cdots$ [n times] $= \mu$
Variance of Y is:
$V(Y) = (1/n)^2 \sigma^2 + (1/n)^2 \sigma^2 + (1/n)^2 \sigma^2 + \cdots$ [n times] $= \sigma^2 / n$
Linear Combination of Random Variables (cont)
Special Case 2
Reproductive property of Normal Distribution
If $X_1, X_2, \ldots, X_n$ are independent normal random variables, such that:
$X_1 \sim \mathrm{Normal}(\mu_1, \sigma_1^2)$, $X_2 \sim \mathrm{Normal}(\mu_2, \sigma_2^2)$, $X_3 \sim \mathrm{Normal}(\mu_3, \sigma_3^2)$, ...,
$X_n \sim \mathrm{Normal}(\mu_n, \sigma_n^2)$,
and
if $Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n$ then,
Y is also a normal random variable
Mean of Y is:
$E(Y) = c_1 \mu_1 + c_2 \mu_2 + \cdots + c_n \mu_n$

Variance of Y is:
$V(Y) = c_1^2 \sigma_1^2 + c_2^2 \sigma_2^2 + \cdots + c_n^2 \sigma_n^2$
Problems
The VP of Marketing at a breakfast cereal company wants to
implement a promotion idea. Each cereal box will contain any one of a
set of game pieces which a consumer can collect. The pieces will be
placed in cereal boxes at random so that a box is equally likely to
contain any one from the set. When the consumer has collected all
the pieces from a set, the consumer can claim a prize. The number of
pieces that should be in the set to maximize the promotion effect is
not clear. The VP wants to base this decision on the expected value
and variance of the number of boxes a consumer needs to buy to be
able to claim the prize. Compute the expected value of the number of
boxes a consumer has to buy to be able to claim the prize for a case:
(a) two game pieces, and
(b) three game pieces.
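
One way to approach this, offered as a sketch and not as the intended solution: the total number of boxes is a sum of waiting times, one for each new piece. With k distinct pieces already in hand out of n, the next box holds a new piece with probability (n - k)/n, so the expected wait for it is n/(n - k) boxes, and the expected total is the sum of these waits. The function name `expected_boxes` is hypothetical.

```python
from fractions import Fraction

def expected_boxes(n_pieces: int) -> Fraction:
    """Expected number of boxes needed to collect all n equally likely pieces."""
    # Expected wait for the next new piece when k distinct pieces are in hand
    # is n / (n - k); add these waits for k = 0, 1, ..., n - 1.
    return sum(Fraction(n_pieces, n_pieces - k) for k in range(n_pieces))

for n in (2, 3):
    e = expected_boxes(n)
    print(f"{n} game pieces: expected boxes = {e} = {float(e):.2f}")
```

The expected value of the sum is the sum of the expected values, which is the mean rule for linear combinations with every constant equal to 1.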

Linear combination of random variables
Problems (cont)
Assume that the weights of individuals are independent and normally
distributed with a mean of 160 pounds and a standard deviation of 30
pounds. Suppose that 25 people squeeze into an elevator that is designed
to hold 4300 pounds.
What is the probability that the load (total weight) exceeds the design limit?
What design limit is exceeded by 25 occupants with probability 0.001?
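
By the reproductive property, the total weight of 25 independent Normal(160, 30^2) individuals is itself normal, with mean 25 * 160 and variance 25 * 30^2. The sketch below uses `statistics.NormalDist` from the standard library, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_people = 25
mu_each, sigma_each = 160.0, 30.0   # pounds, per individual

# Total weight: mean n * mu, standard deviation sqrt(n) * sigma.
total = NormalDist(mu=n_people * mu_each, sigma=sqrt(n_people) * sigma_each)

# P(total weight > 4300 pounds)
p_exceeds_limit = 1.0 - total.cdf(4300)

# Design limit that is exceeded with probability 0.001
limit_for_0001 = total.inv_cdf(1.0 - 0.001)

print(f"mean = {total.mean:.0f} lb, s.d. = {total.stdev:.0f} lb")
print(f"P(load > 4300) = {p_exceeds_limit:.4f}")
print(f"limit exceeded with probability 0.001 = {limit_for_0001:.1f} lb")
```

The standard deviation of the total grows with the square root of n (150 lb here), not with n itself, because variances add while standard deviations do not.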

Linear combination of random variables
Problems (cont)
The width of a casing for a door is normally distributed with a mean of 24
inches and a standard deviation of 0.125 inch. The width of a door is
normally distributed with a mean of 23.875 inches and a standard
deviation of 0.0625 inch. Assume independence.
Determine the mean and standard deviation of the difference between the
width of the casing and the width of the door.
What is the probability that the width of the casing minus the width of the
door exceeds 0.25 inch?
What is the probability that the door does not fit in the casing?

Linear combination of random variables
Covariance and Correlation between two random
variables
Let $X_1$ and $X_2$ be two random variables

The Covariance between $X_1$ and $X_2$ is given by:
$\mathrm{Cov}(X_1, X_2) = E[(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})]$

The Correlation between $X_1$ and $X_2$ is given by:
$\rho_{X_1 X_2} = \mathrm{Cov}(X_1, X_2) / (\sigma_{X_1} \sigma_{X_2})$

Covariance and Correlation between two random
variables

If X1 and X2 are independent then Cov(X1, X2) = 0.

The reverse need not be true
That is, even if Cov(X1, X2) = 0, X1 and X2 may still not
be independent.

Cov(X1, X1) = V(X1)


Linear Combination of Dependent Random
Variables
If Y is a linear combination of $X_1, X_2, \ldots, X_n$

Mean of Y is:
$E(Y) = c_1 E(X_1) + c_2 E(X_2) + \cdots + c_n E(X_n)$

Variance of Y is:
$V(Y) = c_1^2 V(X_1) + c_2^2 V(X_2) + \cdots + c_n^2 V(X_n)$
$\quad + 2 c_1 c_2 \mathrm{Cov}(X_1, X_2) + 2 c_1 c_3 \mathrm{Cov}(X_1, X_3) + \cdots$
$\quad + 2 c_{n-2} c_{n-1} \mathrm{Cov}(X_{n-2}, X_{n-1}) + 2 c_{n-2} c_n \mathrm{Cov}(X_{n-2}, X_n)$
$\quad + 2 c_{n-1} c_n \mathrm{Cov}(X_{n-1}, X_n)$

Linear Combination of Dependent Random
Variables - Example
Management of a chain of retail stores has the opportunity to lock
in prices for electricity and natural gas, the two energy sources
used in the stores. A typical store in this chain uses electricity for
lighting and air conditioning. In the winter, natural gas supplies
heat. Managers at a recent meeting settled on the following
estimates of typical annual use of electricity and natural gas by
the stores. They estimated the chances of varying levels of use
based on their own experiences operating stores and their
expectation for the coming long-term weather patterns. The cost
of electricity is roughly $100 per thousand kilowatt-hours, and the
cost of natural gas is about $12 per thousand cubic feet.

Source: Statistics for Business Decision Making and Analysis by Stine and Foster
Linear Combination of Dependent Random
Variables Example cont
The usage pattern for electricity is as follows:

thousand kilowatt-hours   200   300   400   500
chances                    5%   25%   40%   30%

The usage pattern for natural gas is as follows:

thousand cubic feet       600   800  1000  1200
chances                    5%   25%   40%   30%
Linear Combination of Dependent Random
Variables Example cont
Identify random variables for the amount of electricity that is used
(X) and the amount of natural gas that is used (Y).
What are the marginal probability distributions for these random
variables?
Define a third random variable T that combines these two random
variables to determine the annual energy operating costs.
We do not have the joint distribution for X and Y. Do you think
that it is appropriate to model the two random variables X and Y
as independent?
The correlation between X and Y is believed to be 0.4. Using this
value, find the mean and variance of T.
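
A sketch of the last calculation, assuming the cost variable is defined as T = 100 X + 12 Y (dollars, using the prices quoted in the example) and taking the stated correlation of 0.4 at face value. The dictionaries and helper functions are illustrative names.

```python
from math import sqrt

# Marginal distributions from the usage tables: value -> probability.
electricity = {200: 0.05, 300: 0.25, 400: 0.40, 500: 0.30}   # thousand kWh
gas = {600: 0.05, 800: 0.25, 1000: 0.40, 1200: 0.30}         # thousand cubic feet

def mean(pmf: dict) -> float:
    """E(X) = sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf: dict) -> float:
    """V(X) = sum of (x - mu)^2 * f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

price_x, price_y = 100, 12   # dollars per thousand kWh, per thousand cubic feet
rho = 0.4                    # correlation between X and Y (given)

mean_x, var_x = mean(electricity), variance(electricity)
mean_y, var_y = mean(gas), variance(gas)

# Cov(X, Y) = rho * sigma_X * sigma_Y
cov_xy = rho * sqrt(var_x * var_y)

# T = 100 X + 12 Y, a linear combination of dependent random variables.
mean_t = price_x * mean_x + price_y * mean_y
var_t = price_x**2 * var_x + price_y**2 * var_y + 2 * price_x * price_y * cov_xy

print(f"E(X) = {mean_x:.0f}, V(X) = {var_x:.0f}")
print(f"E(Y) = {mean_y:.0f}, V(Y) = {var_y:.0f}")
print(f"E(T) = {mean_t:.0f}, V(T) = {var_t:.0f}, s.d.(T) = {sqrt(var_t):.0f}")
```

Dropping the covariance term, which treats X and Y as independent, would understate V(T) whenever the correlation is positive.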

