Professional Documents
Culture Documents
When the numerical value of a variable is determined by a chance event, that variable is called a random variable.
Discrete. Discrete random variables take on integer values, usually the result of counting. Suppose, for example, that we flip a coin and count the number of heads. The number of heads results from a random process - flipping a coin. And the number of heads is represented by an integer value - a number between 0 and plus infinity. Therefore, the number of heads is a discrete random variable. Continuous. Continuous random variables, in contrast, can take on any value within a range of values. For example, suppose we flip a coin many times and compute the average number of heads per flip. The average number of heads per flip results from a random process - flipping a coin. And the average number of heads per flip can take on any value between 0 and 1, even a non-integer value. Therefore, the average number of heads per flip is a continuous random variable.
elections in the 20th century is an integer, but it does not vary and it does not result from a random process; so it is not a random variable.
The random variable Y is a function of X; that is, y = f(x). The value of y is greater than or equal to zero for all values of x. The total area under the curve of the function is equal to one.
The charts below show two continuous probability distributions. The chart on the left shows a probability density function described by the equation y = 1 over the range of 0 to 1 and y = 0 elsewhere. The chart on the right shows a probability density function described by the equation y = 1 - 0.5x over the range of 0 to 2 and y = 0 elsewhere. The area under the curve is equal to 1 for both charts.
y=1
y = 1 - 0.5x
The probability that a continuous random variable falls in the interval between a and b is equal to the area under the pdf curve between a and b. For example, in the first chart above, the shaded area shows the probability that the random variable X will fall between 0.6 and 1.0. That probability is 0.40. And in the second chart, the shaded area shows the probability of falling between 1.0 and 2.0. That probability is 0.25. Note: With a continuous distribution, there are an infinite number of values between any two data points. As a result, the probability that a continuous random variable will assume a particular value is always zero.
What is the probability that 4 or more adults reside at a randomly selected home? (A) 0.10 (B) 0.15 (C) 0.25 (D) 0.50 (E) 0.90 Solution
The correct answer is A. The sum of all the probabilities is equal to 1. Therefore, the probability that four or more adults reside in a home is equal to 1 - (0.25 + 0.50 + 0.15) or 0.10.
[ xi * P(xi) ]
where xi is the value of the random variable for outcome i, x is the mean of random variable X, and P(xi) is the probability that the random variable will be outcome i. Example 1 In a recent little league softball game, each player went to bat 4 times. The number of hits made by each player is described by the following probability distribution. Number of hits, x Probability, P(x) 0 0.10 1 0.20 2 0.30 3 0.25 4 0.15
What is the mean of the probability distribution? (A) 1.00 (B) 1.75 (C) 2.00 (D) 2.25 (E) None of the above. Solution
The correct answer is E. The mean of the probability distribution is 2.15, as defined by the following equation. E(X) = [ xi * P(xi) ] E(X) = 0*0.10 + 1*0.20 + 2*0.30 + 3*0.25 + 4*0.15 = 2.15
{ [ xi - E(x) ]2 * P(xi) }
where xi is the value of the random variable for outcome i, P(xi) is the probability that the random variable will be outcome i, E(x) is the expected value of the discrete random variable x. Example 2 The number of adults living in homes on a randomly selected city block is described by the following probability distribution. Number of adults, x Probability, P(x) 1 0.25 2 0.50 3 0.15 4 0.10
What is the standard deviation of the probability distribution? (A) 0.50 (B) 0.62 (C) 0.79 (D) 0.89 (E) 2.10
Solution The correct answer is D. The solution has three parts. First, find the expected value; then, find the variance; then, find the standard deviation. Computations are shown below, beginning with the expected value. E(X) = [ xi * P(xi) ] E(X) = 1*0.25 + 2*0.50 + 3*0.15 + 4*0.10 = 2.10 Now that we know the expected value, we find the variance. = { [ xi - E(x) ]2 * P(xi) } = (1 - 2.1) * 0.25 + (2 - 2.1)2 * 0.50 + (3 - 2.1)2 * 0.15 + (4 - 2.1)2 * 0.10 2 = (1.21 * 0.25) + (0.01 * 0.50) + (0.81) * 0.15) + (3.61 * 0.10) = 0.3025 + 0.0050 + 0.1215 + 0.3610 = 0.79
2 2 2
And finally, the standard deviation is equal to the square root of the variance; so the standard deviation is sqrt(0.79) or 0.889.
and
x-y
The above equations for general variables also apply to random variables. If X and Y are random variables, then E(X + Y) = E(X) + E(Y) and E(X - Y) = E(X) - E(Y)
where E(X) is the expected value (mean) of X, E(Y) is the expected value of Y, E(X + Y) is the expected value of X plus Y, and E(X - Y) is the expected value of X minus Y.
P(x|y) = P(x), for all values of X and Y. P(x y) = P(x) * P(y), for all values of X and Y.
The above conditions are equivalent. If either one is met, the other condition also met; and X and Y are independent. If either condition is not met, X and Y are dependent. Note: If X and Y are independent, then the correlation between X and Y is equal to zero.
The table on the right shows the joint probability distribution between two random variables - X and Y. (In a joint probability distribution table, numbers in the cells of the table represent the probability that particular values of X and Y occur together.) What is the mean of the sum of X and Y? (A) 1.2 (B) 3.5 (C) 4.5 (D) 4.7 (E) None of the above. Solution The correct answer is D. The solution requires three computations: (1) find the mean (expected value) of X, (2) find the mean (expected value) of Y, and (3) find the sum of the means. Those computations are shown below, beginning with the mean of X. E(X) = [ xi * P(xi) ] E(X) = 0 * (0.1 + 0.1) + 1 * (0.2 + 0.2) + 2 * (0.2 + 0.2) = 0 + 0.4 + 0.8 = 1.2 Next, we find the mean of Y. E(Y) = [ yi * P(yi) ] E(Y) = 3 * (0.1 + 0.2 + 0.2) + 4 * (0.1 + 0.2 + 0.2) = (3 * 0.5) + (4 * 0.5) = 1.5 + 2 = 3.5 And finally, the mean of the sum of X and Y is equal to the sum of the means. Therefore, E(X + Y) = E(X) + E(Y) = 1.2 + 3.5 = 4.7 Note: A similar approach is used to find differences between means. The difference between X and Y is E(X - Y) = E(X) - E(Y) = 1.2 - 3.5 = -2.3; and the difference between Y and X is E(Y X) = E(Y) - E(X) = 3.5 - 1.2 = 2.3
Problem 2 The table on the left shows the joint probability distribution between two random variables - X and Y; and the table on the right shows the joint probability distribution between two random variables - A and B. X 0 1 2 0 A 1 2
3 Y 4
0.1 0.1
0.2 0.2
0.2 B 0.2
3 4
0.1 0.2
0.2 0.2
0.2 0.1
Which of the following statements are true? I. X and Y are independent random variables. II. A and B are independent random variables. (A) I only (B) II only (C) I and II (D) Neither statement is true. (E) It is not possible to answer this question, based on the information given. Solution The correct answer is A. The solution requires several computations to test the independence of random variables. Those computations are shown below. X and Y are independent if P(x|y) = P(x), for all values of X and Y. From the probability distribution table, we know the following: P(x=0) = 0.2; P(x=1) = 0.4; P(x=2) = 0.4; P(x=0 | y=3) = 0.2; P(x=1 | y=3) = 0.4; P(x=2 | y=3) = 0.4; P(x=0 | y = 4) = 0.2 P(x=1 | y = 4) = 0.4 P(x=2 | y = 4) = 0.4
Thus, P(x|y) = P(x), for all values of X and Y, which means that X and Y are independent. We repeat the same analysis to test the independence of A and B. P(a=0) = 0.3; P(a=1) = 0.4; P(a=2) = 0.3; P(a=0 | b=3) = 0.2; P(a=1 | b=3) = 0.4; P(a=2 | b=3) = 0.4; P(a=0 | b = 4) = 0.4 P(a=1 | b = 4) = 0.4 P(a=2 | b = 4) = 0.2
Thus, P(a|b) is not equal to P(a), for all values of A and B. For example, P(a=0) = 0.3; but P(a=0 | b=3) = 0.2. This means that A and B are not independent.
Problem 3 Suppose X and Y are independent random variables. The variance of X is equal to 16; and the variance of Y is equal to 9. Let Z = X - Y. What is the standard deviation of Z?
(A) 2.65 (B) 5.00 (C) 7.00 (D) 25.0 (E) It is not possible to answer this question, based on the information given. Solution The correct answer is B. The solution requires us to recognize that Variable Z is a combination of two independent random variables. As such, the variance of Z is equal to the variance of X plus the variance of Y. Var(Z) = Var(X) + Var(Y) = 16 + 9 = 25 The standard deviation of Z is equal to the square root of the variance. Therefore, the standard deviation is equal to the square root of 25, which is 5.
Adding a constant: Y = X + b Subtracting a constant: Y = X - b Multiplying by a constant: Y = mX Dividing by a constant: Y = X/m Multiplying by a constant and adding a constant: Y = mX + b Dividing by a constant and subtracting a constant: Y = X/m - b
Note: Suppose X and Z are variables, and the correlation between X and Z is equal to r. If a new variable Y is created by applying a linear transformation to X, then the correlation between Y and Z will also equal r.
where m and b are constants, Y is the mean of Y, X is the mean of X, Var(Y) is the variance of Y, and Var(X) is the variance of X. Note: The standard deviation (SD) of the transformed variable is equal to the square root of the variance. That is, SD(Y) = sqrt[ Var(Y) ].
A Christmas bonus of $500. An incentive bonus equal to 10 percent of the employee's salary.
What is the mean bonus received by employees? (A) $500 (B) $3,000 (C) $3,500 (D) None of the above. (E) There is not enough information to answer this question. Solution The correct answer is C. To compute the bonus, management applies the following linear transformation to the each employee's salary. Y = mX + b Y = 0.10 * X + 500 where Y is the transformed variable (the bonus), X is the original variable (the salary), m is the multiplicative constant 0.10, and b is the additive constant 500. Since we know that the mean salary is $30,000, we can compute the mean bonus from the following equation.
Problem 2 The average salary for an employee at Acme Corporation is $30,000 per year, with a variance of 4,000,000. This year, management awards the following bonuses to every employee.
y y
A Christmas bonus of $500. An incentive bonus equal to 10 percent of the employee's salary.
What is the standard deviation of employee bonuses? (A) $200 (B) $3,000 (C) $40,000 (D) None of the above. (E) There is not enough information to answer this question. Solution The correct answer is A. To compute the bonus, management applies the following linear transformation to the each employee's salary. Y = mX + b Y = 0.10 * X + 500 where Y is the transformed variable (the bonus), X is the original variable (the salary), m is the multiplicative constant 0.10, and b is the additive constant 500. Since we know the variance of employee salaries, we can compute the variance of employee bonuses from the following equation. Var(Y) = m2 * Var(X) = (0.1)2 * 4,000,000 = 40,000 where Var(Y) is the variance of employee bonuses, and Var(X) is the variance of employee salaries. And finally, since the standard deviation is equal to the square root of the variance, the standard deviation of employee bonuses is equal to the square root of 40,000 or $200.