
BZAN 6310 - Solution to HW 2

1. Answers to these WL W problems are provided in the text-related files that are available online. Those solutions will not follow the four steps I advised you to use in the opening paragraph of this assignment, however, so I may add to them at the end of this solution set.

2. Let x denote the gain on one of these policies. Then x = 36 if there is no fire and x = 36 - 15,000 = -14,964 if there is a fire. Now write down the prob. dist'n. of x:

   x          P(x)
   -14,964    0.002   (2 fires per 1000)
   36         0.998   (1 - .002)

   E[x] = Σ xP(x) = (-14964)(.002) + 36(0.998) = $6.00
   No, on this policy the company will earn $36 or lose $14,964.

3. Let a success be a number greater than 2. Then P(success) = 4/6 = 2/3. Let x be the number of successes in 10 rolls of the die, and this sounds like (and is) a binomial problem where x ~ b(n = 10, p = 2/3), and we want to find P(x ≥ 8) = 1 - P(x ≤ 7); remember: draw a picture. Since 0.67 (2/3 rounded) is not tabled, we would have to either compute the needed probability manually (not the sort of messy calculation one would want to deal with) or get Excel to do it for us. Excel is the clear choice, and the formula would be

   Excel: 1 - binom.dist(7,10,2/3,1) = 0.2991

4. Let a success be the event that a person takes a room. We're given that P(success) = 0.1. P(motel still will not be full) = P(fewer than 2 successes). If we let x = number of successes out of 20 trials, it seems clear that x is a binomial random variable (check the three conditions that must be satisfied). Thus, x ~ b(n = 20, p = 0.1), and we want to find P(x < 2) = P(x ≤ 1), which is

   Excel: binom.dist(1,20,0.1,1) = 0.3917

5. This problem is similar to problem 4. Let x = number of birds hit. It seems reasonable to model x as b(n = 20, p = 0.3) - check the 3 conditions - and we want P(x < 4), which equals P(x ≤ 3). Again using Excel, the desired probability is

   Excel: binom.dist(3,20,0.3,1) = 0.1071

6. If Larry is simply guessing the shape shown on each card drawn, we basically have the "guessing on a multiple-choice exam" problem, in this case with 16 trials (questions) and P(success) = 0.20 on each trial. The conditions for modeling x = number of correct answers as a binomial random variable appear to be well-satisfied, so we assume x ~ b(n = 16, p = 0.20), and we wish to find P(at most 3 correct) = P(x ≤ 3) and P(at least 10 wrong) = P(at most 6 correct) = P(x ≤ 6):

   Excel: binom.dist(3,16,0.2,1) = 0.5981
          binom.dist(6,16,0.2,1) = 0.9733

7. It is probably reasonable to assume that such injuries occur randomly over time, and we'll assume that in this plant individual injuries are the rule and simultaneous injuries to multiple individuals are very rare (although clearly the plants in some industries would not satisfy this description), which should make the Poisson distribution a suitable probability model for the number of work-related injuries within an interval of time. In this case the concern is with a one-year time period, and the expected annual frequency is 12 x 0.75 = 9 = λ. Thus, we assume that x = number of work-related injuries in the one-year period is distributed P(λ = 9) assuming the new safety procedures have no impact on injury frequency. The question of interest becomes "How unlikely was it to see only 7 injuries if the expected number was 9?" and this is then commonly rephrased as "What's the chance the plant would have seen 7 or even fewer injuries in the year following adoption of the new safety procedures if they
HW 2 Solution - page 2

in fact had no effect on the expected number of injuries?" Thus, we look for P(x ≤ 7):

   Excel: poisson.dist(7,9,1) = 0.3239

   A chance of roughly 1/3 says that a count of 7 or fewer would not be that surprising if the expected count were equal to 9, so the results are not very convincing that the new safety rules have been effective.

8. A batch of 60 cookies will contain 450 chocolate chips, which means an average of 450 / 60 = 7.5 chocolate chips per cookie. Some will have more and some will have less (and in fact none will have exactly 7.5 chips assuming individual chips remain intact during the making of the cookies). We're told that a Poisson probability model is reasonable for the number of chips (x) in a cookie, so we assume x ~ P(λ = 7.5), and we want to find P(x < 5) = P(x ≤ 4):

   Excel: poisson.dist(4,7.5,1) = 0.1321

   We would then "expect" 0.1321 x 60 = 7.93, or roughly 8, upsetting cookies in a batch of 60. (Note that you could now use a binomial model to find the probability of, say, more than a dozen upsetting cookies in a batch.)

9. Let x be the megawatt demand at the Amgar Power Plant. We're told x ~ N(μ = 120, σ = 10), and we wish to find the probability of an overload = P(x > 150). Since 150 is 3σ above the mean,

   P(x > 150) = 0.5 - 0.4987 = 0.0013.
   Excel: 1 - norm.dist(150,120,10,1) = 0.00135

10. Let x denote the income (in $000s) of a randomly selected member of the community. We're given that x ~ N(μ = 70, σ = 12).

   a. The area between 70 and x0 is 0.40, which means that x0 is about 1.28σ above 70, so x0 = 70 + 1.28(12) = 85.36 ==> $85,360.
      Excel: norm.inv(0.90,70,12) = $85,379

   b. The area between 70 and x0 is 0.25, which means that x0 is 0.675σ above μ, so x0 = 70 + 0.675(12) = 78.1 ==> $78,100.
      Excel: norm.inv(0.75,70,12) = $78,094

(Continue to page 3)
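The Excel results quoted for problems 3 through 10 can be cross-checked without a spreadsheet. The following is a minimal sketch using only the Python standard library; the helper names `binom_cdf` and `poisson_cdf` are illustrative and are not part of the assignment, and they are meant to mirror Excel's cumulative `binom.dist(..., 1)` and `poisson.dist(..., 1)` calls.

```python
from math import comb, exp, factorial
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ b(n, p), like Excel's binom.dist(k, n, p, 1)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), like Excel's poisson.dist(k, lam, 1)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


results = {
    "3":   1 - binom_cdf(7, 10, 2 / 3),         # P(x >= 8), b(10, 2/3)
    "4":   binom_cdf(1, 20, 0.1),               # P(x <= 1), b(20, 0.1)
    "5":   binom_cdf(3, 20, 0.3),               # P(x <= 3), b(20, 0.3)
    "6a":  binom_cdf(3, 16, 0.2),               # P(at most 3 correct)
    "6b":  binom_cdf(6, 16, 0.2),               # P(at most 6 correct)
    "7":   poisson_cdf(7, 9),                   # P(x <= 7), lambda = 9
    "8":   poisson_cdf(4, 7.5),                 # P(x <= 4), lambda = 7.5
    "9":   1 - NormalDist(120, 10).cdf(150),    # P(x > 150), N(120, 10)
    "10a": NormalDist(70, 12).inv_cdf(0.90),    # 90th percentile, in $000s
    "10b": NormalDist(70, 12).inv_cdf(0.75),    # 75th percentile, in $000s
}

for problem, value in results.items():
    print(f"Problem {problem}: {value:.4f}")
```

Small differences from the table-based answers (for example 85.36 versus 85.379 in problem 10a) come from rounding the z multiple to two decimals when using the printed table.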
HW 2 Solution - page 3

4.1. Continuous: b, c   Discrete: a, d, e

4.12.e. 0.7
     f. 0.9
     g. E[x] = 1(0.1) + 3(0.2) + ... + 9(0.1) = 5.0

4.53.a. Assuming the identification trials are independent, which seems reasonable, the probability of the expert getting all five trials correct is (.92)(.92)(.92)(.92)(.92) = 0.6591. Also, since the number of correct matches in 5 independent trials is a binomial random variable (call it x), we want P(x = 5) where x is distributed b(n = 5, p = 0.92), which can be found using Excel with
        Excel: binom.dist(5,5,0.92,0)
     b. Similar to part a, in this case obtaining the answer with either (0.75)^5 = 0.2373 or
        Excel: binom.dist(5,5,0.75,0).

4.93.a. P(10 ≤ x ≤ 12) = 0.1915 + 0.1915 = 0.3830
        Excel: norm.dist(12,11,2,1) - norm.dist(10,11,2,1)

     e. P(x > 13.24) = 0.5 - 0.3686 = 0.1314
        Excel: 1 - norm.dist(13.24,11,2,1)

4.94.a. Since half (0.50) the distribution lies below the mean, it follows that the area between 50 and x0 must be 0.3413, so x0 must be 1.0σ above the mean. Thus, x0 = 50 + 1.0(3) = 53.0.
        Excel: norm.inv(0.8413,50,3)

     b. The area beyond x0 is 0.025, so the area between 50 and x0 must be 0.50 - 0.025, or 0.475. This means that x0 must be 1.96σ above the mean of 50. Thus, x0 = 50 + 1.96(3) = 55.88.
        Excel: norm.inv(0.975,50,3)

     e. The area between x0 and 50 is 0.40, so x0 must be roughly 1.28σ below 50, or x0 = 50 - 1.28(3) = 46.16.
        Excel: norm.inv(0.10,50,3)

     f. The area between 50 and x0 must be 0.49, so x0 must be roughly 2.33σ above 50, or x0 = 50 + 2.33(3) = 56.99.
        Excel: norm.inv(0.99,50,3)

4.114. Let x be the quantity injected and assume that x ~ N(μ = 10, σ = 0.02). [I've assumed σ = 0.02, not 0.2, due to wording of part c.]

     a. Clearly, P(container is underfilled) = P(x < 10) = 0.5 and P(overfilled) = P(x > 10) = 0.5.
HW 2 Solution - page 4

     b. I assume that if a container must be reprocessed, the procedure is to add more liquid to what is already in the container. In that case, the amount of underfilling is not important - ultimately, for this container, the total amount of liquid injected is 10.60 units at a cost of 10.60 x $20, or $212, plus $10 for reprocessing, or $222, leaving a profit of $8 for this container (assuming there is not a similar $10 cost associated with the initial filling).

     c. In this case, where P(underfilling) is approx. zero (note that 10 is 5σ below the new mean of 10.10 units),
        E[Profit] = E[230 - 20x] = 230 - 20E[x] = 230 - 20(10.10) = $28

4.180. Let x = number of loaves demanded; we are given that x ~ N(μ = 7200, σ = 300).

     a. We want to produce x0 loaves of bread, where P(x < x0) = 0.94.
        The area between 7200 and x0 is 0.44, which means that x0 must be 1.55 to 1.56 - call it 1.555 - standard deviations above 7200, so that x0 = 7200 + 1.555(300) = 7200 + 466.5 = 7,667 loaves.
        Excel: norm.inv(0.94,7200,300)

     b. If the company produces 7,667 loaves, it will have more than 500 loaves left over if demand falls below 7,167. So, must find P(x < 7167).
        P(x < 7167) = 0.5 - 0.0438 = 0.4562.
        Excel: norm.dist(7167,7200,300,1)

4.202. Let s denote the tensile strength of one of these metal parts; we are given that s ~ N(μ = 25, σ = 2).

        P(s < 21) = 0.5 - 0.4772 = 0.0228
        P(s > 30) = 0.5 - 0.4938 = 0.0062
        P(21 < s < 30) = 0.4772 + 0.4938 = 0.9710

        Excel: norm.dist(21,25,2,1)       (= a)
               1 - norm.dist(30,25,2,1)   (= b)
               1 - a - b

        Profit    P(Profit)
        -2        0.0228
        -1        0.0062
        10        0.9710

        E[Profit] = (-2)(.0228) + (-1)(.0062) + (10)(.9710) = $9.6582 per part

Results using Excel:

4.53.  a. 0.6591   b. 0.2373
4.93.  a. 0.3829   e. 0.1314
4.94.  a. 52.999   b. 55.880   e. 46.155   f. 56.979
4.180. a. 7,666    b. 0.4562
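The "Results using Excel" values and the expected profit in 4.202 can be reproduced with a short script. This is only a sketch using Python's standard-library `statistics.NormalDist` in place of Excel's `norm.dist` and `norm.inv`; the variable names are illustrative and not from the text.

```python
from statistics import NormalDist

# 4.53: all five independent identifications correct.
p_53a = 0.92 ** 5
p_53b = 0.75 ** 5

# 4.93: x ~ N(11, 2), as used in the Excel formulas above.
x93 = NormalDist(mu=11, sigma=2)
p_93a = x93.cdf(12) - x93.cdf(10)
p_93e = 1 - x93.cdf(13.24)

# 4.94: x ~ N(50, 3); each part asks for a percentile x0.
x94 = NormalDist(mu=50, sigma=3)
x0_94 = {part: x94.inv_cdf(area)
         for part, area in {"a": 0.8413, "b": 0.975, "e": 0.10, "f": 0.99}.items()}

# 4.180: demand ~ N(7200, 300).
demand = NormalDist(mu=7200, sigma=300)
loaves = demand.inv_cdf(0.94)       # production level with P(x < x0) = 0.94
p_leftover = demand.cdf(7167)       # more than 500 loaves left over

# 4.202: strength ~ N(25, 2); profit is -2 below 21, -1 above 30, else 10.
strength = NormalDist(mu=25, sigma=2)
p_low = strength.cdf(21)
p_high = 1 - strength.cdf(30)
p_ok = 1 - p_low - p_high
expected_profit = (-2) * p_low + (-1) * p_high + 10 * p_ok

print(f"4.53  a {p_53a:.4f}  b {p_53b:.4f}")
print(f"4.93  a {p_93a:.4f}  e {p_93e:.4f}")
print("4.94 ", {part: round(x0, 3) for part, x0 in x0_94.items()})
print(f"4.180 a {loaves:.0f}  b {p_leftover:.4f}")
print(f"4.202 E[Profit] = {expected_profit:.4f}")
```

The expected profit computed this way differs from $9.6582 only in the fourth decimal, because the hand solution rounds the three probabilities to four places first.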
BZAN 6310 - Solution to HW 6

3.39. Let A = Active, I = Inactive, C = Caisson, W = Well protector, F = Fixed platform
     a. S = {AC, AW, AF, IC, IW, IF}
     b. 503/3400, 225/3400, 1447/3400, 598/3400, 177/3400, 450/3400, where 3400 = 2175 + 1225
     c. P(A) = 2175/3400 = 0.640
     d. P(W) = 225/3400 + 177/3400 = 402/3400 = 0.118
     e. P(IC) = 598/3400 = 0.176
     f. P(I or F) = P(I) + P(F) - P(I and F)
        = 1225/3400 + (1447 + 450)/3400 - 450/3400 = 2672/3400 = 0.786
     g. P(C^c) = 1 - P(C) = 1 - (503 + 598)/3400 = 0.676

3.41. Suggestion: First construct a two-way table with Compensation across the top (Full, Partial, and Volunteer) and Retired along the side (Yes and No), then fill in the cells and margins.
     a. 127/244
     b. 7/244
     c. (45 + 72)/244 = 117/244
     d. 127/244 + 28/244 - 7/244 = 148/244

3.76.a. (a + d) / (a + b + c + d)
     b. d / (b + d)
     c. c / (a + c)
     d. d / (c + d)
     e. 420/498 = 0.843
        20/49 = 0.408
        49/449 = 0.109
        20/69 = 0.290

3.117. Let S1 denote sale on the first visit and S2 denote sale on the second. We're given that P(S1) = 0.40, P(S2 | S1^c) = 0.65, and implicitly that P(S1 and S2) = 0, so it follows that
        P(S2) = P(S1 and S2) + P(S1^c and S2)
              = 0 + P(S1^c)P(S2 | S1^c)
              = (1 - 0.40)(0.65) = 0.39.
        Thus, P(Sale) = P(S1 or S2)
              = P(S1) + P(S2) - P(S1 and S2)
              = 0.40 + 0.39 - 0 = 0.79
        (Also see p. 2 for a two-way table solution.)

3.121.a. P(A operates properly) = P(all 3 components operate properly)
        = (1 - 0.12)(1 - 0.09)(1 - 0.11) = 0.7127
        Note that this product is possible due to the independence of the components.
     b. P(A fails) = 1 - 0.7127 = 0.2873

     For the remaining parts,
        let C denote Subsystem C works properly,
        let D denote Subsystem D works properly.

        P(C) = P(components 1 and 2 both work properly)
             = P(C1 and C2) = (0.9)(0.9) = 0.81 due to indep. of components
        P(D) = P(C3 and C4) = 0.81 by the same argument as for P(C).
        Also, events C and D are independent due to independence of the components.

     c. P(C or D) = P(C) + P(D) - P(C and D)
        = 0.81 + 0.81 - (0.81)(0.81) = 0.9639
        [(0.81)(0.81) from indep. of C, D]
     d. P(exactly one fails)
        = P[(C and D^c) or (C^c and D)]
        = P(C and D^c) + P(C^c and D) - 0
        = P(C)P(D^c) + P(C^c)P(D) by indep. of C, D
        = (0.81)(0.19) + (0.19)(0.81)
        = 0.3078
     e. P(system fails) = P(C^c and D^c)
        = (1 - 0.81)(1 - 0.81) = 0.0361
        product is due to indep. of events
        or = 1 - 0.9639 (complement of part c)
     f. We want P(system operates properly) to be > 0.99, which is the same as P(fail) < 0.01.
        P(fail) = P(all subsystems fail) = (1 - 0.81)^k, where k = number of subsystems.
        Thus find k such that (1 - 0.81)^k < 0.01.
        Through trial and error (try k = 1, then k = 2, etc.), k > 2, so we would need 3 subsystems to get P(system operates properly) > 0.99.
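The trial-and-error search in 3.121 part f, along with the arithmetic in parts a through e, can be sketched in a few lines of Python. The function name `subsystems_needed` is hypothetical and used only for illustration; the calculation assumes, as the solution does, that all components and subsystems operate independently.

```python
# 3.121 a-b: system A works only if all three independent components work.
p_a_works = (1 - 0.12) * (1 - 0.09) * (1 - 0.11)
p_a_fails = 1 - p_a_works

# Each subsystem (C or D) works only if both of its components work.
p_sub = 0.9 * 0.9                                  # P(C) = P(D) = 0.81

p_c_or_d = p_sub + p_sub - p_sub * p_sub           # part c
p_exactly_one = 2 * p_sub * (1 - p_sub)            # part d
p_system_fails = (1 - p_sub) ** 2                  # part e


def subsystems_needed(p_works, target):
    """Smallest k with 1 - (1 - p_works)**k > target (parallel subsystems)."""
    k = 1
    while 1 - (1 - p_works) ** k <= target:
        k += 1
    return k


k = subsystems_needed(p_sub, 0.99)                 # part f

print(round(p_a_works, 4), round(p_a_fails, 4))
print(round(p_c_or_d, 4), round(p_exactly_one, 4), round(p_system_fails, 4))
print("subsystems needed:", k)
```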
HW 6 Solution - Page 2

        P(Incorrect Adj | Defective) = 0.05 / 0.095 = 0.526

Non-text problems

2.a. P(N) must equal 1 - P(A) by definition.
  b. Probabilities cannot exceed 1.
  c. Probabilities cannot be negative.
  d. How can the intersection of A and B (shaded) be larger than A (or than B)? (It can't be.)
     Also note that P(B | A) = P(A and B) / P(A), which will be > 1 if P(A and B) > P(A), but probabilities cannot exceed 1.
  e. P(A or B) is shaded. How could this area be < P(A)? (It can't be.)
     Also note that P(A or B) = P(A) + P(B) - P(A and B). Since P(B) ≥ P(A and B) by part d, it follows that the last two terms in the above expression for P(A or B) cannot produce a negative result, so that P(A or B) ≥ P(A).

3. The two-way table below summarizes the information given in the problem statement. Complete the table (start with the margins), then answer the questions.

   [Two-way table: completed by hand in the original.]

   Answers: a. 1/4   b. 1/2, 1/2   c. 1/3

4.a. 83/100, 17/100
  b. 74/100
  c. 5/17
  d. No. This answer can be supported in several ways, but the way that best reinforces what independence means is to look at the conditional distribution of marital status (MS) given gender:

     MS   P(MS | M)          MS   P(MS | F)
     1    7/83 = 0.084       1    10/17 = 0.588
     2    72/83 = 0.867      2    2/17 = 0.118
     3    4/83 = 0.048       3    5/17 = 0.294

     Clearly the conditional distribution of MS is affected by whether we condition on gender equal to M or gender equal to F, so gender and marital status are dependent, not independent, variables in this population.

5.a. 2/100
  b. (8 + 43 + 28 + 3)/100 = 82/100
  c. 25/26
  d. 12/14

6.a. 6
  b. 44.90 years
  c. $27,710; $44,000
  d. 2,200 sq. ft.

Alternative (better) solution for problem 3.117

The two-way table below permits a more intuitive solution to this problem. Note that it is assumed that there is no second visit if the first visit produces a sale, so the probability is 0 for the cell corresponding to a sale on both visits. The problem statement also gives the marginal probability for a sale on the first visit (0.40) and sufficient information to compute the probability that goes in the cell for No sale on the first visit and Sale on the second visit (0.60 x 0.65 = 0.39). With those three cells filled in, the remaining cells can be determined. From the completed table, it follows that
     P(Sale) = P(S1 or S2)
             = P(S1) + P(S2) - P(S1 and S2)
             = 0.40 + 0.39 - 0 = 0.79

                         Sale on 2nd   No sale on 2nd   Total
     Sale on 1st             0             0.40          0.40
     No sale on 1st          0.39          0.21          0.60
     Total                   0.39          0.61          1.00
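The independence check in non-text problem 4d compares the conditional distribution of marital status within each gender. A minimal sketch of that comparison follows, using the counts that appear in the solution's fractions (7, 72, 4 of 83 males; 10, 2, 5 of 17 females); the dictionary layout and function name are illustrative assumptions.

```python
# Counts by gender and marital-status (MS) category, taken from the
# fractions shown in the solution to problem 4d.
counts = {
    "M": {1: 7, 2: 72, 3: 4},
    "F": {1: 10, 2: 2, 3: 5},
}


def conditional_distribution(row):
    """Convert one gender's counts into P(MS | gender)."""
    total = sum(row.values())
    return {ms: n / total for ms, n in row.items()}


cond = {gender: conditional_distribution(row) for gender, row in counts.items()}

# Under independence, P(MS | M) and P(MS | F) would agree for every category.
independent = all(
    abs(cond["M"][ms] - cond["F"][ms]) < 1e-9 for ms in cond["M"]
)

for gender, dist in cond.items():
    print(gender, {ms: round(p, 3) for ms, p in dist.items()})
print("independent:", independent)
```

Because these are population proportions rather than a sample, an exact comparison is appropriate here; with sample data a formal test would be needed instead.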
BZAN 6310 - Solution to HW 7a

_ _ _
6.27. x̄ = 30/6 = 5; s² = [(4-5)² + ... + (3-5)²]/(6-1) = 26/5 = 5.2 ==> s = √5.2 = 2.28

     a. 90%: 5 ± 2.015 (2.28 / √6) = 5 ± 1.876
     b. 95%: 5 ± 2.571 (2.28 / √6) = 5 ± 2.393
     c. 99%: 5 ± 4.032 (2.28 / √6) = 5 ± 3.753

     Now assume same x̄ and s, but let n = 25.

     a. 90%: 5 ± 1.711 (2.28 / √25) = 5 ± 0.780
     b. 95%: 5 ± 2.064 (2.28 / √25) = 5 ± 0.941
     c. 99%: 5 ± 2.797 (2.28 / √25) = 5 ± 1.275

6.32.a. x̄ = 3.8, s = 1.2, n = 20
        90% C.I. est. for μ:
        3.8 ± 1.729 (1.2 / √20) = 3.8 ± 0.464
     b. We can be 90% confident that the interval (3.336, 4.264) contains the true mean LOS for women in this state's hospitals in 2008.
     c. It's an interval constructed in such a way that 90% of such intervals will contain the population measure being estimated.

6.43.a. Your text uses the rule that n is large enough if np̂ and n(1 - p̂) are both ≥ 15. These conditions are satisfied here, so n is large enough.
     b. p̂ ± 1.96 σ_p̂ = 0.46 ± 1.96(0.033) = 0.46 ± 0.0647
     c. 95% confident that this interval contains the true population proportion (in the sense that 95% of the intervals constructed this way would do so).

6.45.a. p̂ = 818 / 2045 = 0.40
     b. p̂ is distributed approximately normal with mean = p and std. error = √(p(1-p)/n). To build a CI, must estimate p in the standard error with p̂.
     c. 95% CI: p̂ ± mult √[p̂(1 - p̂)/n] = 0.40 ± 1.96 √(0.40 x 0.60 / 2045) = 0.40 ± 0.021 ==> (0.379, 0.421)
     d. 95% confident that this interval contains the true population proportion (in the sense that 95% of the intervals constructed this way would do so).

6.122.a. 99% CI: x̄ ± mult σ_x̄ = x̄ ± mult σ/√n.
        Estimate σ with s, multiple becomes t_{n-1}, or
        x̄ ± t_{n-1} (s/√n) = 1.13 ± 2.648 (2.21/√72) = 1.13 ± 0.690.
        We are 99% confident that the true mean number of pecks is contained in the interval (0.44, 1.82).
        Note: The t-multiple of 2.648 is for d.f.=70, which is the closest value to 71 that is higher up in the t table in the 12th edition of MBS. In earlier editions, the t table jumped from 30 to 40 to 60 to 120, so d.f.=70 was not in the table. In that older version of the table, the closest value to 71 that is higher up in the table is 2.660, which corresponds to d.f.=60; using 2.660 would produce a slightly wider CI.
     b. The result in part a indicates that it is very likely that the mean for blue string is less than 2 pecks, which provides very convincing evidence that chickens are more likely to peck at white string (in light of previous research supporting a μ of 7.5 for white string).

6.123.a. 99% CI: x̄ ± mult σ_x̄ = x̄ ± mult σ/√n.
        Estimate σ with s, multiple becomes t_{n-1}, or
        x̄ ± t_{n-1} (s/√n) = 49.3 ± 9.925 (1.5/√3) = 49.3 ± 8.60.
     b. We are 99% confident that the true mean percentage of B(a)p removed by the toxin is contained in the interval (40.7%, 57.9%).
     c. The probability distribution of x - the pct. of B(a)p removed from a soil specimen by the toxin - is roughly mound-shaped and symmetric.
     d. Based on the confidence interval in part a, 50% is certainly a plausible value for the true mean percent removed.
     e. Omit this part.
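The interval arithmetic in this set follows one pattern: point estimate ± multiple × standard error. As a rough sketch, the two helpers below (hypothetical names, not from the text) reproduce the 6.32 and 6.45 intervals. The t multiple is passed in from the table because the Python standard library has no t-distribution quantile function; the z multiple is computed with `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(xbar, s, n, t_mult):
    """CI for a mean: xbar ± t_mult * s / sqrt(n), with t_mult read from a t table."""
    half_width = t_mult * s / sqrt(n)
    return xbar - half_width, xbar + half_width


def proportion_ci(successes, n, conf=0.95):
    """Large-sample CI for a proportion: p_hat ± z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # about 1.96 for 95%
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


ci_632 = mean_ci(3.8, 1.2, 20, t_mult=1.729)       # 6.32: 90% CI, 19 d.f.
ci_645 = proportion_ci(818, 2045, conf=0.95)       # 6.45: 95% CI

print("6.32:", tuple(round(v, 3) for v in ci_632))
print("6.45:", tuple(round(v, 3) for v in ci_645))
```

The same `mean_ci` call with the multiples quoted in 6.27, 6.122, and 6.123 gives the half-widths shown there.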
BZAN 6310 - Solution to HW 7b

7.11. H0: p = 0.07 vs. Ha: p < 0.07

7.13. H0: μ = 863 vs. Ha: μ < 863

7.53. H0: μ = 6000 vs. Ha: μ < 6000

      Test Stat: (pt. est. - hyp. value) / (std. error of pt. est.) = (x̄ - μ0) / σ_x̄
                 ==> (x̄ - μ0) / (s/√n) ~ t_{n-1} (estimate σ with s)

      D.R. - Reject H0 if tcalc < -1.363, o/w don't reject.
      tcalc = (3642.5 - 6000) / (4486.929 / √12) = -1.82
      Since -1.82 < -1.363, reject H0, conclude the true mean radon level in the tombs is less than 6,000 Bq/m³.
      The p value in this case is the tail area below -1.82 on a t distribution with 11 d.f. Using Excel - T.DIST(-1.82,11,1) - this area is 0.048, which is less than α = 0.10, and that confirms that the null hypothesis should be rejected at the 10% significance level.

7.54.a. H0: μ = 5.70 vs. Ha: μ < 5.70
     b. Test Stat: (pt. est. - hyp. value) / (std. error of pt. est.) = (x̄ - μ0) / σ_x̄
                   ==> (x̄ - μ0) / (s/√n) ~ t_{n-1} (estimate σ with s)
        D.R. - Reject H0 if tcalc < -2.821, o/w don't reject.
     c. tcalc = (5.07 - 5.70) / (0.46 / √10) = -4.33
     d. Since -4.33 < -2.821, reject H0, conclude the new bonding adhesive is not as strong as the current standard.
     e. The probability distribution of breaking strength of the new bonding adhesive is roughly mound-shaped and symmetric.

7.78. H0: p = 0.70 vs. Ha: p ≠ 0.70

      Test Stat: (pt. est. - hyp. value) / (std. error of pt. est.) = (p̂ - p0) / σ_p̂
                 = (p̂ - p0) / √(p0(1 - p0)/n) ~ appx N(0,1)

      D.R. - Reject H0 if zcalc < -1.645 or > 1.645, o/w don't reject.
      p̂ = 1554 / 2376 ≈ 0.654
      zcalc = (.654 - .70) / √((.70)(.30)/2376) = -4.893
      -4.893 is much less than -1.645, so we can easily reject H0 at α = 0.10.

7.128. H0: μ = 1220 vs. Ha: μ < 1220

      Test Stat: (pt. est. - hyp. value) / (std. error of pt. est.) = (x̄ - μ0) / σ_x̄
                 ==> (x̄ - μ0) / (s/√n) ~ t_{n-1} (estimate σ with s)

      By default, use 0.05 significance level, so reject H0 if tcalc < -1.833, o/w don't reject.
      I used Excel to get the needed descriptive statistics of x̄ = 989.8 and s = 160.6755.
      tcalc = (989.8 - 1220) / (160.6755 / √10) = -4.53. Since -4.53 < -1.833, reject H0, conclude very strongly that the new pricing reduced the average number of vehicles trying to use the tunnel at peak rush hour. Using T.DIST, the p-value is 0.000713, meaning the observed sample mean would have been extremely unlikely if new pricing had no effect on demand for the tunnel.
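The three lower-tailed t tests on this page (7.53, 7.54, and 7.128) share the same test statistic and decision rule, which can be sketched as below. The helper name `t_stat` is illustrative, and the critical values are the table values quoted in the solutions, since the Python standard library does not provide t-distribution quantiles.

```python
from math import sqrt


def t_stat(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))


# (problem, xbar, mu0, s, n, lower-tail critical value from the t table)
tests = [
    ("7.53", 3642.5, 6000, 4486.929, 12, -1.363),
    ("7.54", 5.07, 5.70, 0.46, 10, -2.821),
    ("7.128", 989.8, 1220, 160.6755, 10, -1.833),
]

decisions = {}
for problem, xbar, mu0, s, n, t_crit in tests:
    t = t_stat(xbar, mu0, s, n)
    decisions[problem] = (round(t, 2), t < t_crit)   # (tcalc, reject H0?)
    print(f"{problem}: tcalc = {t:.2f}, reject H0: {t < t_crit}")
```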
HW 7b Solution- page 2

7.132.a. H0: p = 0.50 vs. Ha: p ≠ 0.50

        Test Stat: (pt. est. - hyp. value) / (std. error of pt. est.) = (p̂ - p0) / σ_p̂
                   = (p̂ - p0) / √(p0(1 - p0)/n) ~ appx N(0,1)

        D.R. - Reject H0 if zcalc < -1.96 or > 1.96, o/w don't reject.
        zcalc = (.60 - .50) / √((.50)(.50)/40) = 1.265
        1.265 is not < -1.96 or > 1.96, so we cannot reject H0 at α = 0.05.
     b. np0 = (40)(0.50) = 20 > 15, and similarly for n(1 - p0), so yes, n is large enough.
     c. Area in right tail = 0.5 - 0.3962 = 0.1038; p-value = 0.1038 times 2, or 0.2076, because this is a 2-tailed test.
     d. α > 0.2076
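The large-sample z test for a proportion used in 7.132 (and 7.78) can be sketched as follows. The function name `proportion_z_test` is an illustrative assumption; the normal tail area comes from `statistics.NormalDist` rather than the printed table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p_hat, p0, n):
    """Return (zcalc, two-tailed p-value) for H0: p = p0."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z_132, p_132 = proportion_z_test(0.60, 0.50, 40)      # problem 7.132
z_78, p_78 = proportion_z_test(0.654, 0.70, 2376)     # problem 7.78

print(f"7.132: zcalc = {z_132:.3f}, two-tailed p-value = {p_132:.4f}")
print(f"7.78:  zcalc = {z_78:.3f}")
```

The p-value computed this way for 7.132 is slightly smaller than the 0.2076 found from the table, because the table lookup rounds z down to 1.26 before reading the area 0.3962.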
BZAN 6310 - Solution to HW 8

11.23.a. ŷ = 1.150 + 0.078767x, where
        y = revenue in $millions
        and x = avg. number of tweets per hour one week before movie's release
     b. Test H0: β1 = 0 against Ha: β1 ≠ 0
        Test Stat: (β̂1 - (β1)0) / s_β̂1 ~ t_{n-2}
        tcalc = (.078767 - 0) / .007938 = 9.92 > 2.080
        ==> Reject H0, conclude average tweet rate one week before movie's release is a useful predictor of opening weekend box office revenue.
     c. Estimate of σ is s = 13.3165. Note that s is an estimate of the standard deviation of y for a given value of x. In this problem, for a given average hourly tweet rate one week before opening, there is a high probability (≈95%) that a movie's actual opening weekend box office revenue will fall within $27M (≈2 times s) of the estimated expected revenue.
     d. r² = 82.4% ==> 82.4% of the variation in opening weekend box office revenue across the movies in this sample can be explained by variation in the average hourly one-week-ahead tweet rate across those movies.
     e. β̂1 = 0.078767, which says that on the average, opening weekend box office revenue increases by $0.078767M, or $78,767, for each additional tweet in the average hourly one-week-ahead tweet rate. Equivalently, it says that on the average, opening weekend box office revenue increases by roughly $7.88M for each additional 100 tweets in the average hourly one-week-ahead tweet rate.
     f. See page 3. (Note in this scatter plot that the data suggest a violation in one of the underlying population assumptions: the variability in y appears to increase as x increases; that is, it appears that Var(y) is not a constant σ² for all values of x.)

11.24.a. ŷ = 187.013 - 0.27081x, where
        y = an elementary school's average 3rd-grader score on the FCAT
        and x = percent of students below the poverty level
     b. Test H0: β1 = 0 against Ha: β1 ≠ 0
        Test Stat: (β̂1 - (β1)0) / s_β̂1 ~ t_{n-2}
        tcalc = (-0.27081 - 0) / .03036 = -8.92 < -2.086
        ==> Reject H0, conclude percentage of students below the poverty level is a useful predictor of a school's average 3rd-grade score on the FCAT reading exam.
     c. Estimate of σ is s = 3.42319. In this problem, for a given percentage of students below the poverty level, there is a high probability (≈95%) that a school's actual average 3rd-grade score on the FCAT reading exam will fall within 6.85 points (≈2 times s) of the estimated average.
     d. r² = 79.9% ==> 79.9% of the variation in average 3rd-grade score on the FCAT reading exam across the schools in this sample can be explained by variation in the percentage of students below the poverty level across those schools.
     e. β̂1 = -0.27081, which says that on the average, the average 3rd-grade score on the FCAT reading exam drops by 0.27081 points (? - units aren't provided) for each one percentage point increase in the percentage of students below the poverty level.
     f. See page 3.

11.103.
     a. ŷ = 44.130 + 0.2366x, where
        y = manager success index
        and x = no. of interactions w/ outsiders
     b. Test H0: β1 = 0 against Ha: β1 ≠ 0
        Test Stat: (β̂1 - (β1)0) / s_β̂1 ~ t_{n-2}
HW 8 Solution - page 2

        tcalc = (0.2366 - 0) / 0.1865 = 1.27, which is not > 2.110
        ==> Cannot reject H0 at α = 0.05, conclude the number of interactions with outsiders is not a useful predictor of a manager's success index.
     c. Estimate of σ is s = 19.4038. For a given number of interactions, s is the estimate of the standard deviation of the success index. Thus, for a given number of interactions with outsiders, there is a high probability (≈95%) that a manager's actual success index will be within ± 2s, or 38.8 points, of the estimated expected value given by the fitted regression model. Clearly there is considerable variability in y for each value of x in this data (look at the scatter plot). The index value itself likely is scaled to fall between 0 and 100 - no observations fall outside those values - so ± 38.8 points may cover most of the possible values of the index; in fact, ± 38.8 gives a wider interval (77.6 points) than the range of the sample observations, which is 95 - 25 = 70. This is consistent with the predictor providing almost no explanatory power concerning the response.
     d. r² = 8.6% ==> 8.6% of the variation in manager success index across the managers in this sample can be explained by variation in the number of interactions with outsiders across those managers.
     e. No real reason to interpret this estimate since it is not significantly different from 0.
     f. Might as well plot the fitted line for practice; see page 3.
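The slope tests and the "± 2s" statements in 11.23, 11.24, and 11.103 all use the same two calculations, which the sketch below repeats from the regression output quoted in the solutions. The dictionary layout and key names are illustrative; the critical values are the two-tailed t table values cited above.

```python
# Regression output quoted in the solutions: slope, its standard error,
# the estimate s of sigma, and the two-tailed 5% critical t value.
fits = {
    "11.23": {"b1": 0.078767, "se_b1": 0.007938, "s": 13.3165, "t_crit": 2.080},
    "11.24": {"b1": -0.27081, "se_b1": 0.03036, "s": 3.42319, "t_crit": 2.086},
    "11.103": {"b1": 0.2366, "se_b1": 0.1865, "s": 19.4038, "t_crit": 2.110},
}

summary = {}
for problem, fit in fits.items():
    t_calc = (fit["b1"] - 0) / fit["se_b1"]        # test of H0: beta1 = 0
    reject = abs(t_calc) > fit["t_crit"]           # two-tailed decision rule
    two_s = 2 * fit["s"]                           # rough 95% band around the line
    summary[problem] = (round(t_calc, 2), reject, round(two_s, 2))
    print(f"{problem}: tcalc = {t_calc:.2f}, reject H0: {reject}, 2s = {two_s:.2f}")
```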
HW 8 Solution - Page 3

[Figure: Scatterplot of Revenue vs Tweets (x-axis: Tweets, 0 to 1400; y-axis: Revenue, 0 to 160), with the fitted line ŷ = 1.150 + 0.078767x drawn in by hand.]

[Figure: Scatterplot of READING vs POVERTY (x-axis: POVERTY, 10 to 100; y-axis: READING), with the fitted line ŷ = 187.013 - 0.27081x drawn in by hand.]

[Figure: Scatterplot of Success vs Interacts (x-axis: Interacts, 0 to 90; y-axis: Success, 20 to 100), with the fitted line ŷ = 44.130 + 0.2366x drawn in by hand.]
BZAN 6310 - Solution to HW 9a

1.a. ŷ = 45.103192 + 8.592449x1 + 1.212715x2 + 9.945568x3, where
     y = systolic blood pressure (mm of mercury)
     x1 = Quetelet index (metric version of BMI)
     x2 = age in years
     x3 = smoking preference (1=yes, 2=no)
  b. Test H0: β1 = 0 against Ha: β1 ≠ 0
     Test Stat: (β̂1 - (β1)0) / s_β̂1 ~ t_{n-(k+1)}
     [Figure: t distribution with rejection regions below -2.048 and above 2.048.]
     tcalc = (8.592449 - 0) / 4.498681 = 1.910, which does not fall in either tail, so cannot reject H0 at the 5% significance level. (Note that the p-value for this test is 0.0664, and the decision not to reject H0 could also be made by noting that 0.0664 > 0.05.)
     Interpretation: QUET does not contribute significantly to explaining the variation in SBP after the effects of AGE and SMK have been accounted for.
  c. H0: β1 = β2 = β3 = 0
     Ha: At least one slope ≠ 0
     Test Stat: F = MSR / MSE ~ F(3,28).
     I don't have a handy picture to use, but the F(3,28) sampling distribution (which is correct if H0 is actually true) is anchored at F=0 and skewed to the right. For α = 0.01, the critical value (tail cut-off value) for the test is 4.568 using =F.INV(0.99,3,28), so we reject H0 if Fcalc > 4.568.
     Fcalc = 1629.942 / 54.862252 = 29.710 >> 4.568, so reject H0, conclude the regression is significant (i.e., at least one of the three slopes is ≠ 0). Equivalently, SAS gives the p-value as 0.0001 << 0.01 -> reject H0.
  d. s = 7.406906 is the point estimate for σ.
     Interpretation: The estimated standard deviation of SBP around its mean (i.e., the population regression line) for any given set of values for the predictors is 7.407. This statistic s is therefore an estimate of a measure of the variation in SBP in the population. More precisely, for given values of QUET, AGE, and SMK, there is a sub-population of people who have those given values for those three predictors. SBP in this sub-population is a random variable whose mean, by our assumptions, is given by β0 + β1 QUET + β2 AGE + β3 SMK. We assume that SBP is normally dist'd in that sub-population, and that it has a standard deviation σ. Our estimate of that standard deviation in this case is 7.407, and this value gives us an indication of how spread out the systolic blood pressures are in this sub-population. For example, we would expect roughly 95% of those SBPs to fall within about ± 15 (≈ 2 x 7.407) mm of mercury of the true mean SBP for that sub-population.
  e. For specified values of relative size and smoking preference (i.e., holding QUET and SMK constant), SBP increases by 1.212715 mm of mercury, on average, for each additional year in a person's age.
  f. For specified values of QUET and AGE, the average SBP of smokers is 9.945568 mm of mercury more than the avg for non-smokers.
  g. R² = 0.7609. Interpretation: Roughly 76% of the variation in systolic blood pressure across the 32 males in this sample is explained by variation in the relative sizes (measured by the Quetelet index), ages, and smoking preferences of those males.
  h. The highest r² would be obtained by using AGE if we wanted a one-predictor model. In that case, r² would be (0.7752)² = 0.601. The other questions can be answered by using the fact that SST = 6425.969, so that SSR in this single-variable model would be (0.601) x (6425.969) = 3862. The corresponding ANOVA table would then be

     Source        SS     df    MS
     Regression    3862    1    3862
     Error         2564   30    85.47
     Total         6426   31

     s = √MSE = √85.47 = 9.245
     Fcalc = MSR / MSE = 3862 / 85.47 = 45.2

     (Note: The corresponding t-statistic for testing H0: β1 = 0 can be found from tcalc = √Fcalc = √45.2 = 6.72, and it would be +6.72 because β̂1 would be positive since the correlation between SBP and AGE is positive.
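The one-predictor ANOVA table in part h follows entirely from the correlation r = 0.7752, SST = 6425.969, and n = 32. The sketch below walks through that chain of calculations; the variable names are illustrative, and the results differ from the hand values only by rounding.

```python
from math import sqrt

r = 0.7752          # correlation between SBP and AGE (from the solution)
sst = 6425.969      # total sum of squares
n = 32              # number of males in the sample

r_squared = r ** 2                  # share of variation explained by AGE
ssr = r_squared * sst               # regression sum of squares
sse = sst - ssr                     # error sum of squares
df_reg, df_err = 1, n - 2           # one predictor, so error d.f. = n - 2

msr = ssr / df_reg
mse = sse / df_err
s = sqrt(mse)                       # estimate of sigma
f_calc = msr / mse
t_calc = sqrt(f_calc)               # positive, since r is positive

print(f"r^2 = {r_squared:.3f}")
print(f"Regression: SS = {ssr:.0f}, df = {df_reg}, MS = {msr:.0f}")
print(f"Error:      SS = {sse:.0f}, df = {df_err}, MS = {mse:.2f}")
print(f"s = {s:.3f}, Fcalc = {f_calc:.1f}, tcalc = {t_calc:.2f}")
```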
