
Unit 5b: Midterm Review

Practice Problems Solutions


Practice Problem #1
The following are a collection of unrelated quick problems. Briefly justify
your answer for each problem.
a) Suppose that A and B are two disjoint events within the same sample
space. In addition, let P(A) = 1/8 and P(B) = 1/4. Are events A and B
independent? Explain.

b) Suppose a particular outcome from a random event has a probability of
0.02. Which of the following statements represent correct interpretations of
this probability? Circle the right answer and provide justification.
i) The outcome will never happen.
ii) The outcome will happen two times out of every 100 trials, for
certain.
iii) The outcome will happen two times out of every 100 trials, on the
average.
iv) The outcome could happen, or it couldn't, the chances of either
result are the same.
Practice Problem #1
The following are a collection of unrelated quick problems. Briefly justify
your answer for each problem.
a) Suppose that A and B are two disjoint events within the same sample
space. In addition, let P(A) = 1/8 and P(B) = 1/4. Are events A and B
independent? Explain.
They are dependent. Since A and B are disjoint, there is no
intersection: P(A and B) = 0. Thus P(A)*P(B) ≠ P(A and B) [1/32 ≠ 0].
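The independence check for part (a) can be reproduced with exact fractions. This is only a minimal sketch; the variable names are illustrative and not part of the original problem.

```python
from fractions import Fraction

# Probabilities given in the problem statement
p_a = Fraction(1, 8)
p_b = Fraction(1, 4)

# Disjoint events share no outcomes, so their intersection has probability 0
p_a_and_b = Fraction(0)

# Independence would require P(A and B) to equal P(A) * P(B)
product = p_a * p_b
independent = (product == p_a_and_b)

print("P(A)*P(B) =", product)
print("Independent?", independent)
```

Disjoint events with nonzero probabilities can never pass this check, because the product is positive while the intersection is zero.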

b) Suppose a particular outcome from a random event has a probability of
0.02. Which of the following statements represent correct interpretations of
this probability? Circle the right answer and provide justification.
Option (iii) is correct. In any particular set of 100 trials the outcome
might happen once, or 3 times, etc., so nothing is "for certain." But the
count of occurrences is a binomial random variable, which has
mean = n*p = 100*0.02 = 2.
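To see why "on the average" is the right reading, the binomial distribution for 100 trials can be tabulated directly. The sketch below assumes independent trials with success probability 0.02; the helper name binom_pmf is hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.02

# Expected count: sum of k * P(X = k) over all possible counts
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))

# Chance of seeing exactly two occurrences in one batch of 100 trials
p_exactly_two = binom_pmf(2, n, p)

print("mean count:", round(mean, 4))
print("P(exactly 2):", round(p_exactly_two, 4))
```

The mean comes out to 2, yet the chance of exactly two occurrences in a single batch is well below 1, which is why statement (ii) fails.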
Practice Problem #1 (cont.)
c) If females of a certain species of lizard always mate with males that are
0.75 years younger than they are, what would the correlation
coefficient between the ages of these male and female lizards be?
Circle the right answer and provide justification.
i) 0.75
ii) -0.75
iii) 0
iv) 1
v) -1
vi) Not enough information to tell

d) Consider the annual salaries of mutual fund managers in the Boston
area. The mean salary is $450,000 and the median salary is $380,000.
Circle the correct answer below. The probability that the salary of a
randomly selected mutual fund manager from the Boston area is larger
than the mean of $450,000 is:
i) > 0.5
ii) < 0.5
iii) = 0.5
iv) Cannot be determined
Practice Problem #1 (cont.)
c) If females of a certain species of lizard always mate with males that are
0.75 years younger than they are, what would the correlation coefficient
between the ages of these male and female lizards be? Circle the right
answer and provide justification.
Option (iv) is correct. Since this is a deterministic relationship (meaning:
male lizards are ALWAYS 0.75 years younger than their female mates),
the points lie exactly along a line with positive slope
(male age = female age - 0.75), so the correlation is perfect:
it is exactly 1 (draw the scatterplot to convince yourself).
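As a numerical stand-in for drawing the scatterplot, the correlation can be computed for a made-up set of ages. The female ages below are hypothetical values chosen only for illustration; any set of ages with some spread gives the same result.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical female ages in years (not from the original problem)
female_ages = [1.0, 1.5, 2.25, 3.0, 4.5]

# Each male is exactly 0.75 years younger than his mate
male_ages = [age - 0.75 for age in female_ages]

r = pearson_r(female_ages, male_ages)
print("correlation:", round(r, 6))
```

Subtracting a constant shifts the points but leaves them on a line with slope 1, so the size of the age gap (0.75) never appears in the correlation.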

d) Consider the annual salaries of mutual fund managers in the Boston
area. The mean salary is $450,000 and the median salary is $380,000.
Circle the correct answer below. The probability that the salary of a
randomly selected mutual fund manager from the Boston area is larger
than the mean of $450,000 is:
Option (ii) is correct. Since the mean > median, this implies the
distribution is right-skewed. We know 50% of the data is to the right of
the median, and since the mean is larger than the median, less than 50%
must be to the right of the mean. (Draw a distribution that is
right-skewed to convince yourself.)
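A small made-up data set can stand in for the sketch of a right-skewed distribution. The nine salaries below (in thousands of dollars) are hypothetical, chosen only so that the mean is 450 and the median is 380 as in the problem.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars (illustration only)
salaries = [300, 330, 350, 370, 380, 400, 450, 570, 900]

mean_salary = mean(salaries)
median_salary = median(salaries)

# Share of salaries strictly above the mean
share_above_mean = sum(s > mean_salary for s in salaries) / len(salaries)

print("mean:", mean_salary, "median:", median_salary)
print("share above mean:", round(share_above_mean, 3))
```

In this example a couple of large salaries pull the mean above the median, and only a minority of the values end up above the mean.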
Practice Problem #2
Cancer is the #2 cause of death in the United States, yet is not
nearly as deadly in other parts of the world. An investigator
looks at the cancer mortality rate (per 100,000 person-years)
vs. Population Growth Rate per year (in percent) for 191
countries. She starts by looking at the scatterplot and some
summary statistics from her data:
Practice Problem #2 (cont.)

a) What is the equation for the least squares regression line?

b) The US has a growth rate of 0.90%. What is the predicted
cancer mortality rate for the US (the true cancer mort. rate is 123.8)?
a) What is the equation for the least squares regression line?

b1 = r*(sy/sx) = -0.461*(29.99/1.097) = -12.60
b0 = ȳ - b1*(x̄) = 111.69 - (-12.60)*(1.179) = 126.5
ŷ = 126.5 - 12.60(x)

b) The US has a growth rate of 0.90%. What is the predicted
cancer mortality rate for the US (the true cancer mort. rate is 123.8)?

ŷ = 126.5 - 12.60(x) = 126.5 - 12.60(0.9) = 115.2
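The slope, intercept, and prediction can be recomputed from the summary statistics used in the solution (r = -0.461, sx = 1.097, sy = 29.99, mean x = 1.179, mean y = 111.69). This is a sketch with illustrative variable names; the residual at the end is an extra step not asked for in the problem.

```python
# Summary statistics quoted in the worked solution
r = -0.461                    # correlation between growth rate and mortality
s_x, s_y = 1.097, 29.99       # standard deviations of x and y
x_bar, y_bar = 1.179, 111.69  # means of x and y

# Least squares slope and intercept
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

def predict(growth_rate):
    """Predicted cancer mortality rate for a given growth rate (percent)."""
    return b0 + b1 * growth_rate

us_pred = predict(0.90)

# Observed minus predicted for the US (true rate given as 123.8)
us_residual = 123.8 - us_pred

print("b1 =", round(b1, 2), " b0 =", round(b0, 1))
print("US prediction:", round(us_pred, 1), " residual:", round(us_residual, 1))
```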


Practice Problem #2 (cont.)
c) What percentage of total variability in cancer
mortality rate can be predicted using growth rate?

d) The investigator believes cancer mortality rates
could be lowered if countries encouraged more
baby-making and more immigration. Briefly
explain why this statement may not be correct.
Practice Problem #2 (cont.)
c) What percentage of total variability in cancer mortality rate can
be predicted using growth rate?
R^2 = r^2 = (-0.461)^2 = 0.213. About 21.3% of the total
variability in y can be predicted using this model on x.

d) The investigator believes cancer mortality rates could be
lowered if countries encouraged more baby-making and more
immigration. Briefly explain why this statement may not be
correct.

This regression does not imply that x (growth rate in a
country) causes changes in y (cancer mortality rates). This is
simply an association, and it could be driven by other variables
related to both.
Practice Problem #3
Not everybody likes Britney Spears. In fact, an internet poll
run by Rolling Stone magazine showed that 66% of
college-aged men said they liked Britney, while 30% of
college-aged women like her (don't make this problem
more complicated than it seems).

a) Imagine Harvard, made up of 52% women, is hosting a
Britney Spears concert (and only fans of Britney attend,
randomly selected). What's the probability that the person
sitting next to you at the concert is a woman?

b) There is a line at the snack bar for the concert with 10
people. What is the probability that exactly 5 of these
students are women?
Practice Problem #3
Not everybody likes Britney Spears. In fact, an internet poll run by Rolling Stone
magazine showed that 66% of college-aged men said they liked Britney, while 30%
of college-aged women like her (don't make this problem more complicated than it
seems).

a) Imagine Harvard, made up of 52% women, is hosting a Britney Spears concert (and
only fans of Britney attend, randomly selected). What's the probability that the
person sitting next to you at the concert is a woman?

P(Woman | Brit Fan) = P(Woman and Brit Fan) / P(Brit Fan)

= P(Brit Fan | Woman)*P(Woman) /
[P(Brit Fan | Woman)*P(Woman) + P(Brit Fan | Man)*P(Man)]

= 0.30*0.52 / [0.30*0.52 + 0.66*0.48] = 0.156/0.4728 = 0.330
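The Bayes' rule calculation for part (a) can be checked step by step. This is a minimal sketch using the 30% and 66% figures from the formula; the variable names are illustrative.

```python
# Inputs from the problem
p_woman = 0.52
p_man = 1 - p_woman
p_fan_given_woman = 0.30
p_fan_given_man = 0.66

# Law of total probability: overall chance a random student is a fan
p_fan = p_fan_given_woman * p_woman + p_fan_given_man * p_man

# Bayes' rule: chance a random fan is a woman
p_woman_given_fan = p_fan_given_woman * p_woman / p_fan

print("P(Brit Fan) =", round(p_fan, 4))
print("P(Woman | Brit Fan) =", round(p_woman_given_fan, 3))
```

Even though women are the majority of students, they make up only about a third of the fans because their rate of liking Britney is less than half the men's rate.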

b) There is a line at the snack bar for the concert with 10 people. What is the
probability that exactly 5 of these students are women?

Let X = the count of women out of 10 people. Thus X ~ Bin(n = 10, p = 0.330)

P(X=5) = (10 choose 5)*(0.330)^5*(0.670)^5 = 252*(0.0039)*(0.1350) = 0.133
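The binomial probability in part (b) is easy to verify directly. The sketch below assumes, as the solution does, that the 10 people in line behave like independent draws from the fans, each a woman with probability 0.330.

```python
from math import comb

n = 10      # people in the snack bar line
p = 0.330   # probability a random fan is a woman, from part (a)
k = 5       # number of women we are asking about

# Binomial probability of exactly k women among n people
p_five = comb(n, k) * p**k * (1 - p)**(n - k)

print("10 choose 5 =", comb(n, k))
print("P(X = 5) =", round(p_five, 3))
```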


Practice Problem #3 (cont.)
c) There is a line of 100 students to get into the
concert (all of whom are Brit-fans). What is the
probability that the majority of them are women?

d) The internet poll run by Rolling Stone
reported that 66% of all college-aged men like
Britney Spears. Why could this be a mistake?
Practice Problem #3 (cont.)
c) There is a line of 100 students to get into the concert. What is the probability
that the majority of them are women?

Let Y = the count of women out of 100 people. Thus Y ~ Bin(n = 100,
p = 0.330) and approximately [Note: √ is the square root symbol]:
Y ~ N(μ = np = 100*0.330 = 33.0, σ = √(np(1-p)) = √(100*0.330*0.670) = 4.70)
P(Y ≥ 51) = P((Y - 33)/4.7 ≥ (51 - 33)/4.7) = P(Z ≥ 3.83) < 0.0002 (off the table)
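The normal approximation in part (c) can be compared against the exact binomial tail. This is a sketch only; it uses Python's standard-library NormalDist in place of a z-table, and no continuity correction, matching the worked solution.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.330

# Mean and standard deviation of the binomial count
mu = n * p
sigma = sqrt(n * p * (1 - p))

# z-score for "at least 51 women" and the matching upper-tail area
z = (51 - mu) / sigma
approx_tail = 1 - NormalDist().cdf(z)

# Exact binomial probability of 51 or more women, for comparison
exact_tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(51, n + 1))

print("sigma =", round(sigma, 2), " z =", round(z, 2))
print("normal approximation:", approx_tail)
print("exact binomial tail: ", exact_tail)
```

Both numbers are tiny, so the conclusion does not depend on which method is used: a female majority in this line would be very surprising.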

d) The poll reported that 66% of all college-aged men like Britney Spears. Why
could this be a mistake?

This could easily be a mistake due to selection bias (and non-response bias).
The people who read Rolling Stone online may feel a certain way
about Brit-Brit (selection bias). Plus, those who see the link and don't answer
the poll may not really care about Britney, while only those who are passionate
for or against Miss Spears spend the time to fill out the poll (non-response
bias).
Practice Problem #4
A friend of yours is curious to see how confident Harvard students are in
their looks. He asks a random sample of n = 130 Harvard students, "What
percent of Harvard students do you believe are better looking than you?"
This sample had a mean of 30.8% and a standard deviation of 24.2%.
***Note: if people had realistic judgments about themselves, the mean
in the population should be 50%.

a) Calculate the 95% confidence interval to estimate the true mean
percent of students that Harvard students think are better looking
than they are.

b) Based on your confidence interval in part (a), would you expect a
two-sided hypothesis test of H0: μ = 50 to reject the null
hypothesis?

c) Perform the formal hypothesis test as stated in part (b).

Practice Problem #4
a) Calculate the 95% confidence interval to estimate the true mean percent of
students that Harvard students think are better looking than they are.
x̄ ± t*(s/√n) = 30.8 ± 1.984(24.2/√130) = (26.6, 35.0)
b) Based on your confidence interval in part (a), would you expect a
two-sided hypothesis test of H0: μ = 50 to reject the null hypothesis?
Yes, we would expect this null hypothesis to be rejected since the value 50 falls
outside the confidence interval (the range of plausible values).
c) Perform the formal hypothesis test as stated in part (b).
H0: μ = 50
HA: μ ≠ 50
t = (x̄ - μ0)/(s/√n) = (30.8 - 50)/(24.2/√130) = -9.05
Looking up this t-value in the t-table, we see it is off the charts, so we can
say p-value < 2(0.001) = 0.002.
Since this p-value < 0.05, we can reject the null. There is evidence that the
mean rating is different from 50% (in fact it is less, implying Harvard
students tend to think they are better looking than they should).
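The test statistic can be recomputed as a check. The sketch below stops at the t statistic and compares it with the two-sided 5% critical value of 1.984 used for the confidence interval, since Python's standard library has no t distribution for an exact p-value; that comparison is a simplification, not the table lookup itself.

```python
from math import sqrt

n = 130       # sample size
x_bar = 30.8  # sample mean (percent)
s = 24.2      # sample standard deviation (percent)
mu_0 = 50     # mean under the null hypothesis

# One-sample t statistic
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Two-sided decision at the 5% level, using the table critical value
t_crit = 1.984
reject_null = abs(t_stat) > t_crit

print("t =", round(t_stat, 2))
print("Reject H0 at the 5% level?", reject_null)
```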