
Probability Distributions

Terminology
Random Variable

A variable whose value is determined by the outcome of a random experiment.


Discrete Random Variable: A random variable that assumes countable values is called a discrete random variable. Example: the number of houses in a certain block.

Continuous Random Variable: A random variable whose values are not countable is called a continuous random variable. Example: the age

Probability Distributions
The value of a random variable cannot be predicted exactly, so probabilities are assigned to all the values that the variable may take. By enumerating the possible values and assigning a probability to each of them, we describe a probability distribution.

Expected Value: $E(X) = \sum X \cdot P(X)$

Example: Question
Q. A person expects a gain of Rs. 80, Rs. 120, Rs.160 & Rs. 20 by investing in a share. The probability distribution of the gain is as follows:
Gain (Rs.)    80    120   160   20
Probability   0.2   0.4   0.3   0.1

The expected gain from the share is = (80*0.2)+(120*0.4) + (160*0.3)+(20*0.1) = Rs. (16+48+48+2) = Rs.114
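
The same calculation can be sketched in a few lines of Python. The function name `expected_value` is a hypothetical helper introduced for this illustration; the gains and probabilities are taken from the table above.

```python
# Gains (Rs.) and their probabilities, copied from the example table.
gains = [80, 120, 160, 20]
probs = [0.2, 0.4, 0.3, 0.1]


def expected_value(values, probabilities):
    """Return the sum of value * probability over all outcomes."""
    # A probability distribution must sum to 1 (allowing for float rounding).
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in zip(values, probabilities))


expected_gain = expected_value(gains, probs)
print(f"Expected gain: Rs. {expected_gain:.2f}")
```

The check on the sum of probabilities is a small safeguard: a table whose probabilities do not add to 1 is not a valid distribution.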

Example 2
Q. An investor is deciding whether to invest in bonds that will pay 9% interest or in stocks. The stock prices may go up by 20%, decrease by 10%, or remain unchanged. The investor estimates the probabilities of these three outcomes as 0.3, 0.2 & 0.5 respectively. If the investor has initial capital of Rs. 1000, what should he do?

Solution 2
Expected Money Value (EMV) from bonds:
Interest from bonds = 1000 * 9/100 = 90
Total EMV = 1000 + 90 = 1090

Expected Money Value from stocks:
Price X       P(X)   X * P(X)
1200          0.3    360
900           0.2    180
1000          0.5    500
Total E(X)           1040

Comparing the EMV of the two options, the EMV of bonds (1090) is higher than that of stocks (1040), so it is advisable to invest in bonds.
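
A minimal sketch of the comparison in Python, assuming the figures from the question (capital of Rs. 1000, 9% bond interest, and the three stock outcomes). The variable names are illustrative only.

```python
capital = 1000.0

# Bonds: a certain 9% return on the capital.
bond_emv = capital * (1 + 0.09)

# Stocks: (relative price change, probability) for each outcome.
stock_outcomes = [(0.20, 0.3), (-0.10, 0.2), (0.00, 0.5)]
stock_emv = sum(capital * (1 + change) * prob for change, prob in stock_outcomes)

# Pick the option with the higher expected money value.
best_option = "bonds" if bond_emv > stock_emv else "stocks"
print(f"Bonds EMV: {bond_emv:.0f}, Stocks EMV: {stock_emv:.0f}, choose {best_option}")
```

Note that this compares expected values only; it says nothing about the risk of each option, which is where variance comes in.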

Variance:
The variance is a measure of the variability or dispersion in a variable or a data set.
Variance = $(\text{S.D.})^2$

Covariance
A measure of variability of one variable (or a data set) in relation to another variable (or a data set), is the covariance.

Factors affecting Covariance are:


The variability of X which may be measured by the standard deviation of X.

The variability of Y which may be measured by the standard deviation of Y.


The correlation between X & Y.
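
These three factors combine in the standard identity that links covariance and correlation. The slides do not state it explicitly, so it is given here only as a reference formula; the symbols $\rho_{XY}$ (correlation), $\sigma_X$ and $\sigma_Y$ (standard deviations) are notation introduced for this note.

```latex
\operatorname{Cov}(X, Y) = \rho_{XY} \, \sigma_X \, \sigma_Y
```

For instance, with $\sigma_X = 2$, $\sigma_Y = 3$ and $\rho_{XY} = 0.5$ (hypothetical values), the covariance is $0.5 \cdot 2 \cdot 3 = 3$.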

Types of Probability Distributions


Discrete Probability
The variable can take only a limited number of values, which can be listed, i.e. discrete numbers. Examples:
Binomial Distribution
Poisson Distribution
Hypergeometric Distribution

Continuous Probability
The variable is allowed to take on any value within a given range, so we cannot list all the possible outcomes. Example:
Normal Distribution

Binomial Distribution
It describes discrete, non-continuous data resulting from an experiment known as a Bernoulli process.

$P(X = r) = {}^{n}C_{r} \, p^{r} q^{n-r} = \frac{n!}{(n-r)! \, r!} \, p^{r} q^{n-r}$

Expected value (or mean) = $np$
Variance = $\sigma^2 = npq$

where n = total no. of trials, r = no. of successes, p = probability of success, q = probability of failure.

Characteristics of Binomial Distribution


Each trial can have only two possible outcomes: success or failure.
The probability of the outcome of any trial remains constant over time, i.e. the probability of getting a tail with a fair coin is always 0.5, irrespective of the number of times the coin is tossed.
Each trial is statistically independent, i.e. the outcome of one trial cannot influence the outcome of any other trial.

Example: Binomial Distribution


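Q. Find the probability of getting exactly 3 heads in 4 tosses of a coin, where $P(H) = 3/4$ and $P(T) = 1/4$.

Solution:
$P(X = 3) = {}^{4}C_{3} \left(\tfrac{3}{4}\right)^{3} \left(\tfrac{1}{4}\right)^{1} = \frac{4 \cdot 27 \cdot 1}{64 \cdot 4} = \frac{27}{64} \approx 0.42$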
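
The binomial formula translates directly into code. The sketch below assumes P(H) = 3/4 and P(T) = 1/4, the values implied by the 27/64 in the solution's arithmetic; `binomial_pmf` is a hypothetical helper name, and `Fraction` is used only so the result stays exact.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) = nCr * p^r * q^(n-r), with q = 1 - p."""
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)


# Exactly 3 heads in 4 tosses of a biased coin (assumed P(H) = 3/4).
p_three_heads = binomial_pmf(3, 4, Fraction(3, 4))
print(p_three_heads, float(p_three_heads))
```

Passing an ordinary float for `p` works as well; the exact fraction just makes it easy to compare against the hand calculation.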

Example 2
Q. If 5% of the items produced turn out to be defective, find the probability that out of 20 items selected at random:
i. Exactly three are defective
ii. At least two are defective
iii. Exactly four are defective
iv. Find the mean & variance

Given: P(d) = 0.05, n = 20, r = ?

i. Probability of exactly three defective

r = 3
$P(X = r) = {}^{n}C_{r} \, p^{r} q^{n-r} = \frac{n!}{(n-r)! \, r!} \, p^{r} q^{n-r}$
$P(d = 3) = {}^{20}C_{3} \, (0.05)^{3} (0.95)^{20-3} = \frac{20!}{(20-3)! \, 3!} \, (0.05)^{3} (0.95)^{17} \approx 0.0596$

ii. Probability of at least two defective

P(at least two defectives) = 1 - [P(0 defective) + P(1 defective)]

a. r = 0, n = 20: P(0 defective) = ${}^{20}C_{0} \, (0.05)^{0} (0.95)^{20}$ = 0.358486
b. r = 1, n = 20: P(1 defective) = ${}^{20}C_{1} \, (0.05)^{1} (0.95)^{20-1}$ = 0.377354

P(at least two defectives) = 1 - (0.358486 + 0.377354) = 0.26416
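
All four parts of this question follow from the same formula, so they can be checked together. This is a sketch under the question's assumptions (n = 20 items, defect probability 0.05); the helper `binomial_pmf` and the variable names are introduced here for illustration.

```python
from math import comb

n, p = 20, 0.05  # sample size and probability that an item is defective
q = 1 - p


def binomial_pmf(r, n, p):
    """P(X = r) = nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# i. exactly three defective
p_exactly_3 = binomial_pmf(3, n, p)
# ii. at least two defective = 1 - [P(0) + P(1)]
p_at_least_2 = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))
# iii. exactly four defective
p_exactly_4 = binomial_pmf(4, n, p)
# iv. mean and variance of a binomial variable
mean = n * p
variance = n * p * q

print(f"i.   {p_exactly_3:.4f}")
print(f"ii.  {p_at_least_2:.5f}")
print(f"iii. {p_exactly_4:.4f}")
print(f"iv.  mean = {mean:.2f}, variance = {variance:.2f}")
```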

Poisson Distribution: Parameter $\mu$


Named for Simeon Poisson, the Poisson distribution is a discrete probability distribution and refers to the number of events (a.k.a. successes) within a specific time period or region of space. For example:
The number of cars arriving at a service station in 1 hour. (The interval of time is 1 hour.)
The number of flaws in a bolt of cloth. (The specific region is a bolt of cloth.)
The number of accidents in 1 day on a particular stretch of highway. (The interval is defined by both time, 1 day, and space, the particular stretch of highway.)

Poisson Probability Distribution


The probability that a Poisson random variable assumes a value of x is given by:

$P(x) = \frac{e^{-\mu} \mu^{x}}{x!}$ for x = 0, 1, 2, ...

Note: the mean $\mu$ is the only parameter (once it is known, the probabilities can be calculated) and e is the natural logarithm base.


Poisson Distribution
As mentioned on the Poisson experiment slide, the probability of a success is proportional to the size of the interval. Thus, knowing an error rate of 1.5 typos per 100 pages, we can determine a mean value for a 400 page book as:

$\mu = 1.5(4) = 6$ typos / 400 pages.



Example 7.13
For a 400 page book, what is the probability that there are no typos?

With $\mu = 6$ and e = 2.7183:
$P(X = 0) = \frac{e^{-6} \, 6^{0}}{0!} = e^{-6} \approx 0.0025$

There is a very small chance that there are no typos.


Example 7.13
For a 400 page book, what is the probability that there are five or fewer typos?

$P(X \le 5) = P(0) + P(1) + \dots + P(5)$

This is rather tedious to solve manually. With k = 5 and $\mu = 6$, $P(X \le k) = .446$

There is about a 45% chance that there are 5 or fewer typos.



Example
The number of typographical errors in new editions of textbooks varies considerably from book to book. After some analysis, an instructor concludes that the number of errors is Poisson distributed with a mean of 1.5 typos per 100 pages. The instructor randomly selects 100 pages of a new book. What is the probability that there are no typos?

That is, what is P(X = 0) given that $\mu = 1.5$?

$P(X = 0) = \frac{e^{-1.5} \, (1.5)^{0}}{0!} = e^{-1.5} \approx .2231$

There is about a 22% chance of finding zero errors.
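
The three typo examples can be reproduced with a short script, which also removes the tedium of summing P(0) through P(5) by hand. This is a sketch that assumes the rate of 1.5 typos per 100 pages from the examples; `poisson_pmf` and `poisson_cdf` are hypothetical helper names.

```python
from math import exp, factorial


def poisson_pmf(x, mean):
    """P(X = x) for a Poisson variable with the given mean."""
    return exp(-mean) * mean**x / factorial(x)


def poisson_cdf(k, mean):
    """P(X <= k), obtained by summing the pmf from 0 to k."""
    return sum(poisson_pmf(x, mean) for x in range(k + 1))


rate_per_100_pages = 1.5

# 400 page book: the mean scales with the size of the interval.
mean_400 = rate_per_100_pages * (400 / 100)
p_no_typos_400 = poisson_pmf(0, mean_400)
p_at_most_5_400 = poisson_cdf(5, mean_400)

# 100 page sample: the mean is the rate itself.
p_no_typos_100 = poisson_pmf(0, rate_per_100_pages)

print(f"400 pages, no typos:    {p_no_typos_400:.4f}")
print(f"400 pages, at most 5:   {p_at_most_5_400:.3f}")
print(f"100 pages, no typos:    {p_no_typos_100:.4f}")
```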


Poisson Distribution
It is used to approximate the binomial distribution when n is large & p is very small.
Condition: the occurrence of each event is independent of the others.

$P(X = x) = \frac{e^{-m} \, m^{x}}{x!}$

where m = mean, x = no. of occurrences of the random event, e = constant (the natural logarithm base).
Mean = m = variance
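
To see how close the approximation is, the sketch below compares the exact binomial probability with the Poisson value for m = np. It reuses the defective-items figures (n = 20, p = 0.05) purely as an illustration; with n this small the match is rough, and it improves as n grows and p shrinks.

```python
from math import comb, exp, factorial

n, p = 20, 0.05  # figures borrowed from the defective-items example
m = n * p        # Poisson mean used for the approximation


def binomial_pmf(x):
    """Exact binomial P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x):
    """Poisson P(X = x) with mean m = n * p."""
    return exp(-m) * m**x / factorial(x)


for x in range(5):
    print(f"x = {x}: binomial {binomial_pmf(x):.4f}, Poisson {poisson_pmf(x):.4f}")

# Largest gap between the two distributions over x = 0..4.
max_gap = max(abs(binomial_pmf(x) - poisson_pmf(x)) for x in range(5))
```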

Normal Distribution
Normal distribution reflects the various values taken by variables like heights/ weights of people. A large no. of observations are found to be clustered around the mean values & their frequency drops sharply as we move away from the mean in either direction.

Normal Distribution
When n is large (n tends to infinity) and p is not very small, i.e. p and q are close to equal, the binomial distribution can be approximated by the normal distribution using the Standard Normal Variate (Z):

$Z = \frac{X - np}{\sqrt{npq}}$
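
As a rough illustration of this standardization, the sketch below computes Z for a hypothetical case (n = 100 trials, p = 0.5, X = 55; none of these numbers come from the slides) and compares the normal approximation of P(X ≤ 55) with the exact binomial sum. `NormalDist` is from Python's standard `statistics` module.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical values chosen only for illustration.
n, p = 100, 0.5
q = 1 - p
x = 55

# Standard normal variate: Z = (X - np) / sqrt(npq)
z = (x - n * p) / sqrt(n * p * q)

# Normal approximation to P(X <= x), without any continuity correction.
approx = NormalDist().cdf(z)

# Exact binomial probability, for comparison.
exact = sum(comb(n, r) * p**r * q ** (n - r) for r in range(x + 1))

print(f"Z = {z:.2f}, normal approx = {approx:.4f}, exact binomial = {exact:.4f}")
```

The two values differ by a couple of percentage points here; the gap narrows as n increases.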

Characteristics: Normal Distribution


The mean, median and mode of a normal distribution all lie at the centre.
The curve has a single peak & thus it is unimodal.
The two tails extend indefinitely & never touch the horizontal axis.
$\mu \pm 1\sigma$ covers about 68% of the observations.
$\mu \pm 2\sigma$ covers about 95% of the observations.
$\mu \pm 3\sigma$ covers about 99.7% of the observations.

Standard Normal Distribution: $\mu = 0$ & $\sigma = 1$
Standardizing a Normal Distribution: $Z = \frac{X - \mu}{\sigma}$

Normal Distribution

$P(Y \le \mu) = 0.50$
$P(\mu - \sigma \le Y \le \mu + \sigma) \approx 0.68$
$P(\mu - 2\sigma \le Y \le \mu + 2\sigma) \approx 0.95$

68-95-99.7 Rule

[Figure: normal curve with shaded bands marking 68% of the data, 95% of the data and 99.7% of the data]
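
The 68-95-99.7 figures can be checked numerically. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `within` is introduced here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)


def within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within(k):.4f}")
```

Because any normal variable can be standardized, the same three proportions hold for every mean and standard deviation.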


Example
For example: What's the probability of getting a math SAT score of 575 or less, given $\mu = 500$ and $\sigma = 50$?

$Z = \frac{575 - 500}{50} = 1.5$

i.e., a score of 575 is 1.5 standard deviations above the mean.

Looking up Z = 1.5 in a standard normal chart (or entering it into SAS) gives $P(Z \le 1.5) = .9332$

Practice problem
If birth weights in a population are normally distributed with a mean of 109 oz and a standard deviation of 13 oz,
a. What is the chance of obtaining a birth weight of 141 oz or heavier when sampling birth records at random?
b. What is the chance of obtaining a birth weight of 120 oz or lighter?

Answer
a. What is the chance of obtaining a birth weight of 141 oz or heavier when sampling birth records at random?

$Z = \frac{141 - 109}{13} = 2.46$

From the chart or SAS, Z of 2.46 corresponds to a right tail (greater than) area of:
$P(Z \ge 2.46) = 1 - .9931 = .0069$ or .69%

Answer
b. What is the chance of obtaining a birth weight of 120 oz or lighter?

$Z = \frac{120 - 109}{13} = .85$

From the chart or SAS, Z of .85 corresponds to a left tail area of:
$P(Z \le .85) = .8023 = 80.23\%$
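
Where a chart or SAS is not at hand, the same tail areas can be obtained with `NormalDist` from Python's standard `statistics` module. The sketch below covers the SAT example and both birth-weight questions; the variable names are illustrative. Because it uses the unrounded Z values, the results can differ from the chart lookups in the third or fourth decimal place.

```python
from statistics import NormalDist

# SAT example: mean 500, standard deviation 50.
sat = NormalDist(mu=500, sigma=50)
p_sat_575_or_less = sat.cdf(575)

# Birth weights: mean 109 oz, standard deviation 13 oz.
weights = NormalDist(mu=109, sigma=13)
p_141_or_heavier = 1 - weights.cdf(141)  # right tail area
p_120_or_lighter = weights.cdf(120)      # left tail area

print(f"P(SAT <= 575)     = {p_sat_575_or_less:.4f}")
print(f"P(weight >= 141)  = {p_141_or_heavier:.4f}")
print(f"P(weight <= 120)  = {p_120_or_lighter:.4f}")
```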

Example: Normal Distribution


Q1. A normal variable has a mean of 10 & a standard deviation of 5. What is the probability that the normal variable will take a value in the interval 0.2 to 19.8?

Solution:
$Z = \frac{X - \mu}{\sigma}$

For X = 0.2, $\mu = 10$, $\sigma = 5$: Z = (0.2 - 10) / 5 = -1.96
For X = 19.8, $\mu = 10$, $\sigma = 5$: Z = (19.8 - 10) / 5 = 1.96

Area under the curve from -1.96 to 1.96 = 0.475 * 2 = 0.95
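
An interval probability like this one is the difference of two cumulative probabilities. A short sketch with `NormalDist` from the standard `statistics` module, using the mean of 10 and standard deviation of 5 from Q1:

```python
from statistics import NormalDist

variable = NormalDist(mu=10, sigma=5)

# Standardized endpoints of the interval 0.2 to 19.8.
z_low = (0.2 - variable.mean) / variable.stdev
z_high = (19.8 - variable.mean) / variable.stdev

# P(0.2 <= X <= 19.8) = F(19.8) - F(0.2)
p_interval = variable.cdf(19.8) - variable.cdf(0.2)

print(f"z from {z_low:.2f} to {z_high:.2f}, probability = {p_interval:.4f}")
```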

Q2. Suppose that 4% of all TVs made by W&B Company in 1995 are defective. If eight of these TVs are randomly selected from across the country and tested, what is the probability that exactly three of them are defective? Assume that each TV is made independently of the others.

Q3. Take S = {1, 2, 3, 4, 5, 6}. The following table gives a probability distribution for S.
Outcome       1    2    3    4    5    6
Probability   .3   .3   0    .1   .2   .1

a) P({1, 6}) =
b) P({2, 3, 4}) =
c) P({1, 5}) =

Sol:
Q2. n = 8, X = 3, p = 0.04, and q = (1 - p) = 0.96
$P(X = 3) = {}^{8}C_{3} \, (0.04)^{3} (0.96)^{5} = 56 \cdot 0.000064 \cdot 0.8154 \approx 0.0029$ or 0.29%
Mean = (n)(p) = (8)(0.04) = 0.32
Variance = np(1 - p) = (0.32)(0.96) = 0.31
Standard deviation = $\sqrt{0.31} \approx 0.55$

Q3.
a) P({1, 6}) = .3 + .1 = .4
b) P({2, 3, 4}) = .3 + 0 + .1 = .4
c) P({1, 5}) = .3 + .2 = .5
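
The Q2 figures are easy to verify numerically. This sketch assumes the question's values (8 TVs, 4% defect rate, exactly 3 defective); the variable names are introduced here for illustration.

```python
from math import comb, sqrt

n, p = 8, 0.04  # TVs tested and probability that a TV is defective
q = 1 - p

# P(X = 3) = 8C3 * p^3 * q^5
p_three_defective = comb(n, 3) * p**3 * q ** (n - 3)

mean = n * p
variance = n * p * q
std_dev = sqrt(variance)

print(f"P(X = 3) = {p_three_defective:.4f}")
print(f"mean = {mean:.2f}, variance = {variance:.2f}, std dev = {std_dev:.2f}")
```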

Q4. In a survey of voters, 30% of the respondents reported supporting the Republican candidate in the last presidential election, and 35% reported supporting the Democrat candidate. (Voting machines made it impossible to "spoil the ballot" by voting for both candidates.) Based on this data, the experimental probability that a voter will vote for a presidential candidate of either major political party is:
Q5. Binomial Random Variable: A binomial random variable is a random variable that counts the number of successes in a sequence of independent Bernoulli trials with fixed probability of success.

Now decide which of the following is a binomial random variable.
a) Throw two dice 10 times; X = the number of doubles. X is
b) A bag of marbles contains 10 red ones and 10 blue ones. You draw 10 of them; X = the number of red ones. X is

Q6. What is the probability of getting heads exactly 3 times if you flip a fair coin 6 times?

Q4. P(PC) = .3+.35=.65


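Q5. a) Throw two dice 10 times; X = the number of doubles. X is a binomial random variable (10 independent trials, each with the same probability of a double).

b) A bag of marbles contains 10 red ones and 10 blue ones. You draw 10 of them; X = the number of red ones. X is a binomial random variable only if each marble is put back before the next draw. If the marbles are drawn without replacement, the draws are not independent and X is not a binomial random variable (it follows the hypergeometric distribution).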
Q6. x = number of successes = 3
n = number of trials = 6
p = probability of success = .5
q = probability of failure = 1 - p = 1 - .5 = .5
Probability of getting 3 heads = $P(X = 3) = C(6, 3) \, (.5)^{3} (.5)^{6-3} = 20 \cdot .125 \cdot .125 = .3125$

Q7. Your name is Edison and you are an expert penalty goal shooter. Your skill has improved over the past 10 years, and now you are as good as you will ever be. Your success rate has been measured at 80%. Thus, p = .8 and q = .2. You take n = 6 shots on goal, so the possible values of X (the number of successes) are 0, 1, 2, 3, 4, 5, 6. Here is the probability for each value of X:

$P(X = 0) = C(6, 0) \, (.8)^{0} (.2)^{6} = 1 \cdot 1 \cdot .000064 = .000064$
$P(X = 1) = C(6, 1) \, (.8)^{1} (.2)^{5} = 6 \cdot .8 \cdot .00032 = .001536$
$P(X = 2) = C(6, 2) \, (.8)^{2} (.2)^{4} = 15 \cdot .64 \cdot .0016 = .01536$
$P(X = 3) = C(6, 3) \, (.8)^{3} (.2)^{3} = 20 \cdot .512 \cdot .008 = .08192$
$P(X = 4) = C(6, 4) \, (.8)^{4} (.2)^{2} = 15 \cdot .4096 \cdot .04 = .24576$
$P(X = 5) = C(6, 5) \, (.8)^{5} (.2)^{1} = 6 \cdot .32768 \cdot .2 = .393216$
$P(X = 6) = C(6, 6) \, (.8)^{6} (.2)^{0} = 1 \cdot .262144 \cdot 1 = .262144$

Putting them all together gives the probability distribution for X:

x          0         1         2        3        4        5         6
P(X = x)   .000064   .001536   .01536   .08192   .24576   .393216   .262144

Probability of at least 5 successes = .393216 + .262144 = .65536
Probability of at most 2 successes = .000064 + .001536 + .01536 = .01696
Probability of at least 3 successes = 1 - .01696 = .98304
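
The whole distribution, and the three cumulative questions, can be generated in one pass. The sketch below assumes the values from Q7 (n = 6 shots, success rate p = .8); the variable names are illustrative. A useful sanity check is that the seven probabilities sum to 1.

```python
from math import comb

n, p = 6, 0.8  # shots taken and probability of scoring each one
q = 1 - p

# Full probability distribution: P(X = x) for x = 0..6.
distribution = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

for x, prob in enumerate(distribution):
    print(f"P(X = {x}) = {prob:.6f}")

p_at_least_5 = sum(distribution[5:])   # P(5) + P(6)
p_at_most_2 = sum(distribution[:3])    # P(0) + P(1) + P(2)
p_at_least_3 = 1 - p_at_most_2         # complement of "at most 2"

print(f"at least 5: {p_at_least_5:.5f}")
print(f"at most 2:  {p_at_most_2:.5f}")
print(f"at least 3: {p_at_least_3:.5f}")
```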
