Instructions:
o Write the last 4 digits of your ID number in the space provided on each page (top
right).
o Write clearly and legibly; avoid writing on the back of these pages.
o Show all your work and include units where appropriate.
o Write all answers and computations on these pages.
1. Which of the following best describes the retrospective design where subjects are
sampled by disease status and is often used when the investigator is interested in rare
diseases? (4 pts)
A. intervention trial
B. case control study
C. retrospective cohort
D. ecologic study
E. none of the above
2. Which of the following best describes the study design that can be either retrospective or
prospective and is often used when the investigators are interested in rare exposures? (4
pts)
A. intervention trials
B. cohort studies
C. prevalence studies
D. case control study
E. none of the above
3. The strength of an association is one of the criteria for evaluating the cause and effect
relationship between an exposure and outcome. Which of the following is a measure of
the strength of association? (Choose one best answer). (4 pts)
____ ____ a. A "J" or "U" shaped relationship of a continuous risk factor and continuous
measure of disease suggests a Pearson product-moment correlation coefficient of
near plus one or minus one.
____ ____ b. A risk ratio measure and a correlation coefficient are both measures of
association.
____ ____ c. A population attributable risk proportion depends on the prevalence of exposure
and is not directly related to the strength of an association.
____ ____ d. The study base for a case-control study consists of those people who if they
developed the disease could have been counted as cases.
____ ____ e. The Bradford Hill criterion "coherence" means that the association has been
observed repeatedly in different places, by different observers, and at different
times.
____ ____ f. If an exposure is a cause of a disease, then "temporality" is the Bradford Hill
criterion for causal inference that must hold true between exposure and disease.
7. The death rates from various conditions are often compared across geographic areas.
These comparisons are usually based on directly age-standardized mortality rates. Which
of the following best describes what is meant by an age-standardized rate created by the
direct method? (Choose one best answer). (4 pts)
A. The number of events in each age stratum of a standard population is used to
create a weighted average rate.
B. The event rates in each age stratum in the standard population are used to create a
weighted average rate.
C. The event rates in the geographic area of interest are applied to the age-stratum
sizes of a standard population to create a rate that is a weighted average.
D. The event rates in the geographic area of interest are compared to the event rates
of a standard population to create a summary rate that is a weighted average.
8. In order to estimate counts and rates of work-related fatalities, the National Traumatic
Occupational Fatality system has introduced a tick-box on the death certificate to indicate
"injury at work." Kraus et al. (Am J Epidemiol 1995; 141: 973-9) attempted to validate
this "injury at work" classification system against a gold standard [International
Classification of Diseases (ICD) death certificate codes designating deaths that occurred
during work-related activities]. After reviewing a sample of 100,000 death certificates,
the authors reported the following: 1,195 true positives; 788 false positives; 97,672 true
negatives; 345 false negatives. ("positive" indicates that the tick-box was checked;
"negative" indicates that it was not checked; "true" indicates agreement between the tick-
box and the ICD code).
a. Using the counts provided above, complete the 2x2 table below: (2 pts)
                     ICD Classification
Death Certificate    Work-related    Not work-related    TOTAL
TOTAL
b. What are the sensitivity and specificity of the "injury at work" classification
system? (4 pts)
c. What is the positive predictive value? In your own words, how would you interpret
this value? (3 pts)
d. Based on these data is the death certificate "injury at work" classification system
likely to underestimate or overestimate the true number of work-related fatal
injuries? (2 pts)
e. The use of data from the "tick-box" on the death certificates to track work-related
mortality trends is an example of which kind of surveillance system? (choose one
best answer). (4 pts)
A. Active surveillance
B. Passive surveillance
C. Retrospective cohort surveillance
D. Cross-sectional survey surveillance
f. The sensitivity and specificity computed above are quantitative measures of which of
the following aspects of death certificate classification of work-related fatalities?
(choose one best answer). (4 pts)
9. Age-related maculopathy is a leading cause of blindness among people 65 and older in the
United States, and is estimated to affect between 16 and 26% of people in this age group. In
a recent study by Klein, residents aged 43 to 86 years in the town of Beaver Dam, Wisconsin
were asked to participate in a study to determine whether cigarette smoking was related to
age-related maculopathy. At a baseline examination, participants were asked to report their
lifetime smoking habits. After 5 years, participants had an examination to determine whether
they had developed age-related maculopathy. The following table presents the number of
cases of age-related maculopathy measured at the follow-up examination among the 1232
male participants ages 43-86 who did not have age related maculopathy (ARM) at the
baseline examination:
a. Which of the following best describes the research design used in this study?
(choose one best answer) (3 pts)
b. Create a 2 x 2 table where one axis is smoking status and the other is age-related
maculopathy status. (4 pts)
10. The following data come from a national survey of the occurrence of back pain. A case of
low back pain was defined as having at least one episode of severe back pain occurring over
a period of 6 months. The number of cases was obtained from surveys of different
occupation groups as well as a national random sample.
Age Persons Cases Rate Persons Cases Rate Persons Cases Rate
c. Can these two ratios in part (a) and (b) be compared? Briefly explain why or why not.
(3 pts)
11. The evidence supporting obesity as a risk factor for colon cancer remains inconclusive,
especially among women. A recent study (Am J Epidemiol 1999;150:390-398) reported the
association between obesity (measured at baseline) and colon cancer morbidity as
determined from review of medical records and death certificates in a nationally
representative cohort of men and women age 25-74 years who participated in the First
National Health and Nutrition Examination Survey from 1971 to 1975 and were
subsequently followed up through 1992. The following table is from this study for men and
women combined.
<22 28 53,475
22 - <24 41 38,919
24 - <26 36 36,610
26 - <28 40 32,635
28 - <30 35 21,122
30+ 42 34,904
a. Which of the following best describes the research design used in this study? (choose
one best answer). (2 pts)
A. Cross-sectional survey
B. Ecological study
C. Population based case control study
D. Cohort study
E. None of the above
b. Complete the table by calculating the crude body mass index-specific incidence rates.
(3 pts)
c. Calculate the relative risk (RR) of colon cancer associated with a BMI of 28-<30. Use
the lowest BMI category as referent. In one sentence interpret your answer. (2 pts)
d. Calculate the attributable risk proportion of those in the 28-<30 BMI category. In
one sentence interpret your answer. (the attributable risk formulas provided in class
can be used even though the data provided are for rates) (2 pts)
12. Analyses of data from cohort studies often have to deal with the reality that participants have
unequal lengths of follow up. Given the data below, calculate (a) the total person-time
(months) of follow up, (b) the overall incidence density rate, (c) the 13-month cumulative
incidence, and (d) the product limit estimate of failure. Each horizontal line represents a
cohort participant. Each vertical line represents one month. Arrows indicate time of loss to
follow up. Black boxes indicate onset of disease (failure). (2 pts each)
a. ______________
b. ______________
c. ______________
d. ______________
Answer Guide
1. B. Case-control studies are said to use sampling by disease and are suited for studying rare
diseases.
2. B. Cohort studies can be either retrospective or prospective and are often used to study rare
exposures.
3. The ratio of odds of exposure among cases to odds of exposure among noncases is the odds
ratio, which is a measure of association.
4. Incidence rates cannot be estimated from case-control studies without additional
information. In the case-control design selection of subjects is based on disease status, so the
number of cases is under the control of the investigator. If the investigator has access to all
cases and knows the size of the population from which they arise s/he can estimate
incidence, but knowledge of the population size is not available from the case-control
design.
5.
d. TRUE – The study base for a case-control study consists of those people who if they
developed the disease could have been counted as cases.
e. FALSE – The Bradford Hill criterion "coherence" means that all of the known facts
about the relationship fit into place; the criterion of "consistency" means that the
association has been observed repeatedly in different places, by different observers,
and at different times.
f. TRUE – "Temporality" is the one Bradford Hill criterion for causal inference that
must hold true between exposure and disease.
7. C. "The event rates in the geographic area of interest are applied to the age-stratum sizes of a
standard population to create a rate that is a weighted average" describes a directly-
standardized rate.
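The direct method described in answer 7 can be sketched in a few lines of Python. This is only an illustration: the stratum-specific rates and the standard population sizes below are hypothetical, and the function name is not from the course materials.

```python
# Direct age standardization: apply the study area's stratum-specific rates
# to the age-stratum sizes of a standard population.
# All numbers below are hypothetical, chosen for illustration only.

def directly_standardized_rate(area_rates, standard_sizes):
    """Weighted average of area_rates, with standard_sizes as the weights."""
    if len(area_rates) != len(standard_sizes):
        raise ValueError("one rate per age stratum is required")
    expected_events = sum(r * n for r, n in zip(area_rates, standard_sizes))
    return expected_events / sum(standard_sizes)

# Hypothetical death rates (per person-year) in three age strata of the area
area_rates = [0.001, 0.005, 0.020]
# Hypothetical age-stratum sizes of the standard population
standard_sizes = [50_000, 30_000, 20_000]

rate = directly_standardized_rate(area_rates, standard_sizes)
print(f"Directly standardized rate: {rate * 1000:.1f} per 1,000")
```

Because the weights come from the standard population rather than from the area itself, two areas standardized to the same standard can be compared directly.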
8. a.
ICD Classification
d. Based on these data the death certificate "injury at work" classification system will
overestimate the true number of work-related fatal injuries, since more non-work-
related injuries will be classified as work-related than vice-versa.
f. C. Sensitivity and specificity are measures of validity, since there is a standard for
"truth".
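The validity measures asked for in question 8 follow directly from the four counts given in the question. The short Python sketch below shows the arithmetic; the variable names are illustrative, and the ICD code is treated as the reference standard, as the question states.

```python
# Counts reported in question 8 (tick-box result vs. ICD code as the reference).
tp, fp, tn, fn = 1195, 788, 97672, 345

sensitivity = tp / (tp + fn)  # tick-box checked among ICD work-related deaths
specificity = tn / (tn + fp)  # tick-box unchecked among ICD non-work-related deaths
ppv = tp / (tp + fp)          # ICD work-related among tick-box-checked deaths

tickbox_positive = tp + fp    # deaths the tick-box counts as work-related
icd_positive = tp + fn        # deaths the ICD codes count as work-related

print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
print(f"PPV         = {ppv:.3f}")
print(f"tick-box positives = {tickbox_positive}, ICD positives = {icd_positive}")
```

The tick-box flags more deaths than the ICD codes do, which is consistent with the overestimation described in answer 8d.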
9. D. Prospective cohort, since the investigators monitored people without the condition over
time to detect its development.
c. (was labeled "f") PARP = (overall incidence – incidence in never smokers) / overall
incidence of ARM
= (0.0852 – 0.0707) / 0.0852 = 17%
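The PARP arithmetic above can be checked with a small Python helper. The function name is illustrative; the two incidence values are the ones quoted in the answer.

```python
def parp_from_incidence(overall_incidence, unexposed_incidence):
    """Population attributable risk proportion from two cumulative incidences."""
    return (overall_incidence - unexposed_incidence) / overall_incidence

# Values quoted in the answer: overall incidence and incidence in never smokers
parp = parp_from_incidence(0.0852, 0.0707)
print(f"PARP = {parp:.0%}")
```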
10.
a. Standardized event ratio (for cell phones) = SMR (cell phone) = observed/expected
= 42/{(.003)(1000) + (.06)(700) + (.08)(50)} = 42/49 = 0.86
c. These two ratios cannot be compared directly. An SMR is a weighted average where
the weights (e.g., age structure) come from the population for which indirect
standardization is being carried out. So SMRs for two populations use different
weights. Unless the populations have identical age structures, the stratum-specific
rates are the same for all strata, or the stratum-specific rates for one population are a
constant multiple of those for the second population, the comparison is invalid. With
indirect standardization, it is actually the "standard population" rates that are being
"standardized" to the age distribution of the study population.
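The indirect standardization in answer 10a can be sketched as follows. The numbers are taken from the expression in that answer; reading 0.003, 0.06, and 0.08 as standard-population rates and 1000, 700, and 50 as stratum sizes of the study group is an assumption, since the original data table is not reproduced here.

```python
# Indirect standardization: expected events are the standard population's
# stratum-specific rates applied to the study group's own stratum sizes.
standard_rates = [0.003, 0.06, 0.08]  # assumed: rates in the standard population
study_sizes = [1000, 700, 50]         # assumed: persons per stratum in the study group
observed = 42                         # observed cases in the study group

expected = sum(r * n for r, n in zip(standard_rates, study_sizes))
smr = observed / expected

print(f"expected = {expected:.0f}, SMR = {smr:.2f}")
```

The weights here are the study group's stratum sizes, which is why SMRs computed for two groups with different age structures are not directly comparable.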
11.
a. D. Cohort study
c. RR of colon cancer for BMI 28-<30 kg/m² vs. lowest = 165.7/52.4 = 3.16
d. ARP for BMI 28-<30 kg/m² vs. lowest = (3.16 – 1) / 3.16 = 68%
The ARP of 68% means that 68% of the incidence in the 28-<30 kg/m² group is
attributable to elevated BMI.
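The rates used in answer 11 can be reproduced from the table in question 11. In the sketch below, the second column of that table is assumed to be person-years of follow-up, and rates are scaled per 100,000, which matches the 165.7 and 52.4 quoted in the answer; the column headers themselves are missing from the table.

```python
# BMI category -> (colon cancer cases, assumed person-years of follow-up)
bmi_table = {
    "<22": (28, 53475),
    "22-<24": (41, 38919),
    "24-<26": (36, 36610),
    "26-<28": (40, 32635),
    "28-<30": (35, 21122),
    "30+": (42, 34904),
}

# Crude BMI-specific incidence rates per 100,000 person-years
rates = {bmi: cases / py * 100_000 for bmi, (cases, py) in bmi_table.items()}

rr = rates["28-<30"] / rates["<22"]  # rate ratio, lowest BMI category as referent
arp = (rr - 1) / rr                  # attributable risk proportion in the exposed

for bmi, rate in rates.items():
    print(f"{bmi:>7}: {rate:6.1f} per 100,000")
print(f"RR = {rr:.2f}, ARP = {arp:.0%}")
```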
12.
a. 43 person-months
The questions on this examination are largely based on Cantor KP, Lynch CF, Hildesheim ME,
Dosemeci M, Lubin J, Alavanja M, Craun G. Drinking water source and chlorination byproducts in
Iowa. III. Risk of brain cancer. Am J Epidemiol 1999;150:552-60. You may refer to an unannotated
copy of this article during the examination.
1. Briefly discuss two reasons why a case-control study is (or is not) well suited to examine risk
factors for brain cancer. (3 pts)
2. The authors describe the study design they used as a "population-based case-control study".
Briefly explain how this is different than a non-population based case-control study. Include
in your answer issues regarding the selection of cases, selection of controls, and validity. (3
pts)
3. Cases were identified by the State Health Registry of Iowa. Which of the following
categories of study design best describes this method of case finding? Choose one best
answer. (3 pts)
A. Prospective follow-up
B. Passive surveillance
C. Cross-sectional survey
D. Community-based screening
E. Hospital-based surveillance
4. The authors state that cases had to be newly diagnosed with histologically confirmed glioma
without previous diagnosis of a malignant neoplasm. Which of the following best describes an
advantage of using incident cases instead of prevalent cases? Choose one best answer. (3 pts)
A. Using incident cases allows the investigators to directly compute relative risks.
B. Using incident cases reduces the non-systematic error of case-control studies.
C. Estimates of exposure from incident cases may be less influenced by disease status.
D. Using incident cases allows for the investigation of effects on risk versus those
affecting duration.
E. Incident cases are less likely to be lost to follow up than prevalent cases.
5. Even if the investigators are careful in the selection of cases and controls, selection bias can
make interpretation of results difficult. Which of the following is NOT a situation that can
produce selection bias? Choose one best answer. (3 pts)
A. The exposure has some influence on the process by which controls are selected.
B. The exposure has some influence on the process of case ascertainment.
C. The disease status has some influence on the recall of exposures.
D. The exposed cases are reported to registries more than unexposed.
E. All of the above will produce selection bias.
6. In this study, exposure information for many of the brain cancer cases was provided by proxy
respondents. The authors did not have information from independent sources that could be
used to directly verify information provided by these surrogates. However, suppose a
follow-up questionnaire was administered to cases, and for 85 of the cases, the investigators were
able to obtain information about whether or not they used a private well directly from the
cases (self report). Assuming that self report is the best available assessment of whether they
used a private well or not, complete the table below so that it reflects a sensitivity, specificity,
and positive predictive value of a proxy response of 77%, 75%, and 57%, respectively.
Assume that 26 of the cases reported that they used private wells. Show your calculation. (6 pts)
YES
NO
7. Cases in this study were histologically confirmed. This is an example of which of the
following disease classification criteria? Choose one best answer. (3 pts)
A. Causal criteria
B. Ecologic criteria
C. Manifestational criteria
D. Etiologic criteria
E. None of the above
8. Consider the data presented in Table 1 of this article. Which of the following best represents
the proportion of the risk of brain cancer in the population that is attributable to working on
a farm (farm occupation). Assume that a farm occupation is causally related to brain cancer
risk. Choose one best answer. (4 pts)
A. 33%
B. 57%
C. 10%
D. 29%
E. Cannot be calculated from case-control studies
9. A case-control study like the one described in this paper is most useful when it helps us
understand what is happening in the study base (underlying population). Which of the
following best describes the study base in this article? Choose one best answer. (3 pts)
A. The study base is those who if they developed brain cancer could have been selected
as a case.
B. The study base is those who have an equal probability to be selected as a case or
control.
C. The study base is those who are identified as cases or controls after excluding non-
responders.
D. The study base is those who if exposed would have been identified as exposed.
E. None of the above.
10. In Table 3 the odds ratios for incident brain cancer by duration of chlorinated surface water
exposure are given. The odds ratio (95% confidence interval) in men estimating the risk of
brain cancer with 1-19 years of exposure is 1.3 (0.8, 2.1) and 2.5 (1.2, 5.0) for 40 years or
more of exposure. Which of the following best describes the role of chance in observing
these two estimates? Choose one best answer. (3 pts).
A. The odds ratio for 40 years exposure is more likely due to chance because it is
based on fewer cases and controls.
B. The odds for 1-19 years of exposure is more likely due to chance because the point
estimate is closer to the null value (1.0).
C. The odds ratio for 40 years exposure is more likely due to chance because the
confidence interval is so wide.
D. The odds ratio for 1-19 years of exposure is less likely due to chance because the
confidence interval is narrower.
E. The odds ratio for 40 years exposure is less likely due to chance because the
confidence interval does not include 1.0.
11. Table 3 presents odds ratios for the association of incident brain cancer with various levels
of lifetime average THM exposure. The odds ratio (95% confidence interval) for lifetime
average THM concentration of 0.8-2.2 µg/liter for men was 0.9 (0.6, 1.6). The odds ratio
(95% confidence interval) for lifetime average THM concentration of >=32.6 µg/liter for
women was 0.9 (0.4, 1.8). Which of the following best describes the precision of these two
estimates of risk? Choose one best answer. (3 pts)
A. The estimate is equal because the point estimates are the same.
B. The estimate is equal because neither confidence interval excludes 1.0.
C. The estimate in men is slightly more precise because the confidence interval is
narrower.
D. The estimate in women is slightly more precise because the exposure level is much
higher.
E. The precision of the estimates cannot be compared because they are from different
exposure groups.
12. Using the data in Table 4, which of the following best describes the crude unadjusted odds
ratios estimating the risk of brain cancer associated with 40 years exposure to chlorinated surface
water in men with above median tap water intake? Use the category of 0 years exposure to
chlorinated surface water as the reference group. Choose one best answer. (4 pts)
A. 4.0
B. 1.5
C. 3.6
D. 2.6
E. Cannot be computed from data in Table 4.
13. Table 1 shows the adjusted odds ratio estimating the risk of brain cancer by population size.
Using the 25,000 population size as a reference, calculate the crude (unadjusted) odds ratio
associated with the > 50,100 population. In 2 sentences or less explain why the two estimates
agree or disagree. (4 pts)
14. The authors state that they "found a dose-response relationship among men between brain
cancer and duration of consuming drinking water from chlorinated surface water…". Using
3 Bradford Hill criteria, in 3-4 sentences, address causality (or the lack of causality) of the
relationship of drinking water to brain cancer. (4 pts)
15. An early study of drinking water and brain cancer was an ecological study conducted by the
lead author of the present article. In this study, brain cancer mortality rates in 923 U.S.
counties were compared with average levels of THM measured in the drinking water
supplies of those counties. For counties in which the sampled water supply served at least
85% of the residents of that county, the correlation coefficient between county-specific
mortality rates from brain cancer and trihalomethane levels was 0.24 in White men and 0.19
in White women. After reviewing this paper, your colleague concluded that THM in drinking
water are causally related to brain cancer. However, you are more cautious in your
interpretation, citing the "ecological fallacy." Please define the ecologic fallacy (2 pts) and
describe why it limits the causal inferences that can be made from the ecological study
described above (2 pts).
16. The authors used information provided by cases and controls on place of residence, primary
source of drinking water, and tap water and total fluid consumption to create an index of
cumulative lifetime exposure. However, the natural history of cancer (initiation, promotion,
conversion, and progression) may encompass many years. If drinking water is involved at the
earliest stages of brain cancer (initiation), then drinking water exposures in the recent past
may be more important than present exposures or those in the distant past (e.g., in
childhood). As defined in class, which of the following periods would be important in
defining the minimal and maximal length of time expected between drinking water exposure
and diagnosis with histologically confirmed glioma? Choose one best answer. (3 pts)
A. Induction period
B. One year case fatality
C. Latent period
D. Both A and C
E. None of the above
17. The authors included all cases of histologically confirmed malignant brain cancers, including
glioblastoma, fibrillary and gemistocytic astrocytoma, and mixed glioma. If authors suspected
that drinking water exposure was associated with only certain subtypes of brain cancer (i.e.,
disease heterogeneity), which of the following strategies could they employ at the analysis
stage? (3 pts)
A. Adjustment for cancer type using mathematical modeling (e.g., logistic regression)
B. Stratification of cases by brain cancer type
C. Direct standardization by brain cancer type
D. Indirect standardization by brain cancer type
E. Matching cases and controls by brain cancer type
18. The authors restricted their analysis to those cases and controls with at least 70 percent of
their lifetime years with a known source of drinking water. This approach was used to reduce
which type of bias? Choose one best answer (3 pts)
A. Confounding bias
B. Selection bias
C. Information bias
D. Random error
E. None of the above
20.
a. Using the data in Table 3, label and complete a 2x2 table for the association between
brain cancer and >=40 years’ residence with a chlorinated surface water source
(versus 0 years), collapsing over sex (i.e., combine the data for men and women). (4
pts)
b. Calculate the odds ratio for your 2x2 table in part a. Show your work. (3 pts)
c. Suppose that the sex-adjusted OR for the relationship between brain cancer and
>=40 years’ residence with a chlorinated surface water source is 1.1. Is sex a
confounder of this relationship? Justify your answer. (3 pts)
d. Is sex an effect modifier (assuming a multiplicative model for joint effects) of the
relationship between brain cancer and >=40 years’ residence with a chlorinated
surface water source? Justify your answer. (3 pts)
e. According to Table 1, having a farming occupation (ever vs. never) is a risk factor
for brain cancer (OR=1.5). Assume that among the controls, farming occupation is
associated with duration of residence with a chlorinated surface water source. Could
farming occupation be a confounder of the associations reported under the Total
column in Table 3? Explain your answer. (3 pts)
21. Characteristics of cases and controls included in this study are shown in Table 1. Using this
information answer the following questions.
YES
NO
b. Assume that 10% of the cases that were labeled as never having worked on a farm truly had
worked in such an environment. Furthermore assume that 15% of the controls that were
labeled as having ever worked on a farm, in fact never really did work on a farm. What
would the true association be between farm occupation and brain cancer? Assume that the
classification of disease status is valid. (4 pts)
c. Which of the following best describes a comparison of the odds ratios you computed in
parts (a) and (b)? Choose one best answer. (3 pts)
A. The odds ratios are different as a result of differential misclassification of exposure.
B. The odds ratios are different as a result of nondifferential misclassification of
exposure.
C. The odds ratios are different as a result of differential misclassification of disease
status.
D. The odds ratios are different as a result of nondifferential misclassification of disease
status.
E. The odds ratios are different as a result of random variation in the exposure
assessment.
22. Which of the following is a measure of the validity of methods used to classify exposures
such as having worked on a farm? Choose one best answer. (3 pts)
23.
a. Using data in Table 1, assess whether the crude OR of brain cancer associated with
farm occupation is confounded by age and/or sex. Support your answer with
relevant calculations. Table 1 shows the adjusted odds ratios estimating the risk of
brain cancer due to having farm occupation. (2 pts)
b. What feature of the study design could have contributed to the crude ORs in Table
1 being confounded by age and/or sex? (2 pts)
University of North Carolina at Chapel Hill
School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)
Final Examination, Fall 1999
Answer Guide
1. Case-control studies are well-suited for studying risk factors for brain cancer because the
disease is rare (hence difficult to study in a cohort design). Also, the case-control design
facilitates examining many risk factors of current interest, a substantial advantage when so
few risk factors have been identified. A retrospective cohort study can examine only
exposures for which historical data are available.
2. A "population-based case-control study" is a case-control study for which the study base is a
defined population. With a hospital-based case-control study, it is difficult to specify the
study base, since which cases come to a given hospital is influenced by such factors as
seriousness and treatability of the disease, type of hospital, and health care financing ability
and arrangements. A representative sample from this same defined population yields a
control group that permits valid estimation of odds ratios. In contrast, the validity of
measures of association estimated using a control group selected from among hospitalized
persons is always somewhat uncertain, since it is generally impossible to know how well such
controls provide valid estimates of the study base.
3. B. The method of finding cases was passive surveillance.
4. D. Using incident cases allows the odds ratio to estimate the incidence density ratio or risk
ratio. In contrast, the exposure distribution among prevalent cases will reflect differential
survival in relation to exposures as well as differential incidence.
5. C. Selective recall (the disease status has an influence on the recall of exposures) is a form of
information bias, not selection bias.
6. Since 26 of the cases reported using a private well, 85 − 26 = 59 cases did not. Sensitivity = 0.77
means that the proxy respondents correctly classified as "exposed" 0.77 × 26 ≈ 20
brain cancer cases. Specificity = 0.75 means that the proxy respondents correctly classified as
"unexposed" 0.75 × 59 ≈ 44 brain cancer cases. The rest of the table can be completed
by subtraction and addition. As a check on the arithmetic, the positive predictive value is
20/35 ≈ 0.57.
                       Self report
Proxy report       Private well    No private well    Total
Private well       20              15                 35
No private well    6               44                 50
Total              26              59                 85
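The table for answer 6 can be rebuilt from the margins and the stated sensitivity and specificity. The Python sketch below follows the same steps as the answer; the cell names a, b, c, d are illustrative, and counts are rounded to whole cases as in the answer.

```python
n_cases = 85
self_yes = 26                  # cases who self-reported using a private well
self_no = n_cases - self_yes   # cases who self-reported no private well (59)
sens, spec = 0.77, 0.75

a = round(sens * self_yes)     # proxy "yes", self-report "yes" (true positives)
d = round(spec * self_no)      # proxy "no", self-report "no" (true negatives)
c = self_yes - a               # proxy "no", self-report "yes" (false negatives)
b = self_no - d                # proxy "yes", self-report "no" (false positives)

ppv = a / (a + b)              # check against the stated 57%

print(f"proxy yes: {a:>3} {b:>3} {a + b:>3}")
print(f"proxy no:  {c:>3} {d:>3} {c + d:>3}")
print(f"total:     {self_yes:>3} {self_no:>3} {n_cases:>3}")
print(f"PPV = {ppv:.2f}")
```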
7. C. Manifestational criteria – histological criteria are observable characteristics of tumor cells in
microscopic examination.
8. C. 10% – the proportion of cases who are exposed is 85/291 ≈ 0.29, and the OR ≈ 1.5.
Substituting into the formula for PARP in a case-control study gives
0.29 × (1.5 – 1)/1.5 ≈ 0.097.
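The case-control form of the PARP used in answer 8 can be written as a one-line function. This is a sketch with an illustrative function name; it takes the proportion of cases who are exposed and uses the odds ratio as the estimate of relative risk.

```python
def parp_case_control(prop_cases_exposed, odds_ratio):
    """PARP = p_c * (OR - 1) / OR, where p_c is the exposed proportion among cases."""
    return prop_cases_exposed * (odds_ratio - 1) / odds_ratio

# 85 of 291 cases had a farm occupation; adjusted OR of about 1.5 (Table 1)
parp = parp_case_control(85 / 291, 1.5)
print(f"PARP = {parp:.3f}")
```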
9. A. The study base consisted of those people who if they developed brain cancer could have
been selected as a case.
10. E. The OR for the longest exposure category (40 years or more) is less likely to be due to
chance because the confidence interval does not include 1.0 (although not without problems,
this response was the best).
11. C. The narrower confidence interval indicates that the estimate for men is slightly more
precise.
12. D. 2.6 = (7x423)/(30x38) for men with above median tap water intake
Years of exposure to chlorinated surface water
            >=40    0
Cases       7       30
Controls    38      423
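The cross-product calculation in answer 12 can be expressed as a small Python function. The function and argument names are illustrative; the counts are the ones shown in the answer.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Crude odds ratio from a 2x2 table, computed as the cross-product ratio."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Men with above-median tap water intake: 40 or more years vs. 0 years of exposure
or_men = odds_ratio(exposed_cases=7, unexposed_cases=30,
                    exposed_controls=38, unexposed_controls=423)
print(f"crude OR = {or_men:.1f}")
```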
13.
Average population
            ≥50,010    ≤2,500
Cases       32         112
Controls    246        780
Crude OR = (32x780)/(112x246) = 0.91 versus 0.7 adjusted. The estimates differ because
the OR in the table has been adjusted for age and sex (according to the footnote to Table 1).
14. The associations observed were of moderate strength (OR 1.7 for 20-39 years
of exposure to chlorinated surface water, 2.5 for >=40 years). The authors measured lifetime
exposure (through recall), so in spite of the prolonged induction and latent periods for brain
cancer, the criterion of temporality is satisfied to some extent. Some of the exposure history
in Table 3 must have occurred after the brain cancer had begun and is therefore not relevant.
However, it seems unlikely that the association, if causal, could run in the opposite
direction (i.e., brain cancer causes exposure to chlorinated water). There is little evidence to
support the plausibility of the association or to explain why it was found for men but not for
women. Studies of the association have not yielded consistent results. (The remaining criteria –
coherence, experiment, and analogy – are not applicable to the information in the article.)
15. The "ecologic fallacy" is the inference from aggregate data that a relationship exists at the
level of the individual. The flaw in this inference is that the prevalences of a characteristic
(e.g., exposure to trihalomethanes in drinking water) and a condition (e.g., brain cancer) can
both be elevated in a population even if the individuals who possess the characteristic are
not those with the condition. In the study described in the question, people who developed
brain cancer may not themselves have ingested large amounts of THM despite living in
counties with high THM levels in the county water supplies. A related analytic problem is
that the absence of individual-level data precludes individual-level control for potential
confounders, such as farming occupation.
16. D (both A and C). "Induction period" refers to the time between exposure and the onset of
the disease; "latent period" refers to the time from disease onset to diagnosis. For exposure
to be causal in early stages of tumor development, the exposure must be present prior to the
latent period. In principle, exposure prior to the sum of the longest possible induction
period and the longest possible latent period would not be relevant, either.
17. B. Stratification of cases by brain cancer type would permit examination of the relationship
for the individual subtypes.
18. C. Information bias. "We selected cases and controls with at least 70 percent of their lifetime
years with a known source of drinking water in order to …minimize misclassification of exposure …"
(end of p 554).
19. (question was not asked)
20. a.
21. a.
              Farming occupation
              Yes      No       Total
Cases         85       206      291
Controls      628      1,355    1,983
The OR of 0.89 indicates no (or possibly a slight inverse) crude association between
brain cancer risk and having had a farming occupation.
Farming occupation
Yes No Total
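The correction asked for in question 21b can be sketched as follows. This is not the answer guide's own table, which is not reproduced here: the counts are taken from the crude OR calculation in answer 23a, and leaving the reclassified counts as fractions rather than rounding them is an assumption.

```python
# Observed counts (from the crude OR calculation in answer 23a)
cases_yes, cases_no = 85, 206
controls_yes, controls_no = 628, 1355

crude_or = (cases_yes * controls_no) / (cases_no * controls_yes)

# Question 21b: 10% of "never farm" cases truly worked on a farm;
# 15% of "ever farm" controls truly never did.
moved_cases = 0.10 * cases_no
moved_controls = 0.15 * controls_yes

true_cases_yes = cases_yes + moved_cases
true_cases_no = cases_no - moved_cases
true_controls_yes = controls_yes - moved_controls
true_controls_no = controls_no + moved_controls

corrected_or = (true_cases_yes * true_controls_no) / (true_cases_no * true_controls_yes)

print(f"crude OR = {crude_or:.2f}, corrected OR = {corrected_or:.2f}")
```

The error rates differ between cases and controls, so the misclassification in this scenario is differential with respect to disease status.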
22. D. Sensitivity is a measure of validity (kappa is a measure of agreement that gives equal
weight to both classifications; standard error measures variability of an estimate)
23.
a. The crude OR = (85 x 1,355) / (206 x 628) = 0.89. This value is substantially
different from the adjusted value of 1.5, indicating that confounding by age and/or sex
is present.
b. Controls were matched by age and sex to cancer cases for five cancer sites. Thus, the
control group is not a simple random sample from the study base, so that analyses
must control for the matching variables.
1. a. Briefly summarize two criteria on which disease classifications are based. Discuss a reason
why these two criteria do not always correspond with one another. (3 pts)
1. b. List two examples of each of the two types of criteria you mentioned in 1A. (2 pts)
2. Cohort studies can form the framework for efficient substudies, using nested case-control
and case-cohort study designs. Which of the following best compares and contrasts these
nested case control studies and case-cohort studies. (3 pts)
A. Both nested case control and case-cohort studies select controls that are matched on
time of case development but only case-cohort studies allow for multiple
comparisons with different case groups.
B. Both nested case control and case-cohort studies select controls from the entire
baseline cohort, but in case-cohort studies the selection is done at random.
C. In case-cohort studies a single group of controls can be used for comparison with
several case groups.
D. In nested case control studies, cases are selected entirely from the non-exposed
cohort group.
E. both C and D
3. Name the three component parts of any kind of incidence measure. (3 pts)
4. Over a ten-year period the number of bicycle injury events in a population increases even as
the age adjusted bicycle injury rate decreases in the population. Describe two conditions that
could cause this outcome (assume the definition of a bicycle injury and the quality of the
data remain constant over the 10 year period) (3 pts)
5. Which of the following best describes the condition(s) that are required for the odds ratio
(OR) to estimate the risk ratio (RR) in a case-control study. (choose one best answer) (3pts)
A. Incident cases are identified for a defined population at risk.
B. The controls represent the base population that gave rise to the cases.
C. The disease outcome is rare in the base population at risk.
D. All of the above.
6. The association between induced abortion and breast cancer has been the subject of
previous epidemiological studies. Cohort studies have found no association, while at least
one case-control study has found a positive association. Possible explanations for the
different results in case-control and cohort studies of this topic include (choose single best
answer). (3pts)
A. Case-control studies are prone to selection bias, whereas cohort studies are not
vulnerable to selection bias.
B. Recall bias might explain the association observed in a case-control study, but this
would not be a problem in prospective cohort studies.
C. The method of disease classification is different in case-control and cohort studies.
D. All of the above
7. Swaen et al (1998) conducted a study of 6,803 males who worked for at least six months
before 1/1/80 at one of nine chemical plants in the Netherlands. The workers were
followed for mortality from 1/1/56 until 1/1/96. Before 1/1/80, 2,842 of the workers were
occupationally exposed to acrylonitrile and the other 3,961 workers were not exposed to
acrylonitrile. After 1/1/80, there was no exposure to acrylonitrile. To measure the
association between occupational exposure to acrylonitrile and several outcomes, the
investigators calculated standardized mortality ratios (SMRs) for both the exposed and the
unexposed workers. Age-interval-specific person-years were generated for specific exposure
groups and were multiplied by the mortality rates for the total male population of the
Netherlands to generate expected numbers of cause specific deaths.
b. What was the (crude) cumulative incidence ratio (CIR) for mortality comparing the
exposed to the unexposed men? What are two reasons why this measure is
problematic with these data?
c. For brain cancer, the SMR for the exposed workers (SMR=173.9) was more than
twice the SMR for the unexposed workers (SMR=85.7). Why are these two SMRs
not strictly comparable? (3 pts)
d. There were 290 deaths due to all causes among the exposed group and 983 deaths
due to all causes among the unexposed group. What measure of effect could be
calculated to strictly compare all-cause mortality between the exposed and the
unexposed group. (2 pts)
Background information:
A panel of experts reviewed the medical records of 525 patients discharged from the hospital
with diagnosis codes indicative of a stroke (ICD 430-438). The panel classified strokes as
either ischemic or not ischemic. Assume the diagnosis reached by the panel is the most
accurate classification possible. Of the 525 cases, 325 had a discharge diagnosis code for
ischemic stroke (ICD code 434). Of these 325 patients, 85 were determined by the panel not
to be ischemic strokes. All but 20 of the patients with discharge diagnosis codes other than
434 were determined by the panel to have non-ischemic strokes.
Given the background information, compute the sensitivity, specificity, and positive
predictive value of a hospital discharge code for ischemic stroke (ICD code 434) in
classifying a patient as truly having an ischemic stroke.
e. If you were to use a 434 discharge code to identify a group of cases with ischemic
stroke and the sensitivity was 99% but the specificity was 40%, which of the following
would best describe your resulting case group? (choose one best answer) (2 pts)
A. The case group would be highly homogeneous with respect to
pathophysiology of stroke.
B. The case group would be highly heterogeneous with respect to
pathophysiology of stroke.
C. The case group would have many false negative ischemic strokes.
D. The case group would represent the source population of cases.
f. What two factors influence the positive predictive value of a screening test in
most situations? (2 pts)
9. Suppose that a study was conducted to compare the rates of automobile collisions in
two cities. The researchers were impressed with studies that suggest that the use of
cell phones and pagers contributes to auto collisions. They wanted to adjust
(standardize) the rates of auto collisions in the two cities for cell phone and pager
use. Data on cell phone use and auto collisions in the two cities were collected and
are presented in the table below.
a. Calculate the crude total and cell phone/pager use specific rates for Corona
del Mar and Boulder. How do these two cities compare in crude prevalence
of auto accidents? (2 pts)
10. In a community intervention study, like the Minnesota Heart Health Program, the
effectiveness of an educational intervention program was evaluated. Which of the
following best describes the unit of assignment, the unit of observation, and the unit
of analysis in these types of studies (in this order)? (2 pts)
11. Indicate next to each statement below whether you consider it to be TRUE, FALSE,
or if you are NOT SURE. A correct answer receives 2 points, an incorrect one
zero.
12. Attributable measures are used by researchers to assess the public health impact of a
detrimental exposure, assuming causality. Given data from a cohort study on the
incidence of stroke (see below), estimate the attributable risk proportion among the
exposed (physically inactive). Explain your answer in one sentence. Assume that
physical activity is causally related to stroke risk.
Physical activity level | Did develop a stroke | Did not develop a stroke | Person years (PY) | Incidence per 1,000 PY
b. Additional data from the National Health and Nutrition Examination Survey
(NHANES) suggest the prevalence of a physically active lifestyle (at least 30 minutes
of moderate activity 3 days per week) is 27%. Using this information and your
answer to part (A), estimate what we can hope to accomplish with programs to get
people to be physically active in the total population. In one sentence explain your
answer. (3 pts)
Explain:
13. Suppose that in 1998 researchers hypothesized that communication ability and skill
in young adulthood was related to Alzheimer’s Disease. To test this they evaluated
hand written essays completed by a group of 350 nuns joining a single religious sect
in 1930. By careful review of these writing samples, the researchers categorized all
350 as either having a high error profile (N=150) or a low error profile (N=200).
Using surveillance of death certificates and other methods the researchers verified
vital status of each nun through 1998. An accounting of all deaths produced the table
below.
High error profile | Low error profile
Cause of Death, # of Deaths, Year of Death | Cause of Death, # of Deaths, Year of Death
b. Compute the incidence density rate of Alzheimer’s disease death for those with a
high error profile and for those with a low error profile. (3 pts) Show your work.
c. Compute the incidence density ratio for the risk of Alzheimer’s disease death
associated with a high error communication profile. Explain, in two sentences or
less, what this value means. (3 pts)
d. Using data from this study compute an odds ratio for the association of a high error
communication profile with death from Alzheimer’s disease. Show a clearly labeled
2x2 table. (2 pts)
e. Compare the odds ratio with the incidence density ratio computed in part c and
explain why they are similar or different.
Causal criteria: disease definition and classification based on the cause of the condition,
Causal criteria: microbial diseases for which the pathogen has been identified (syphilis, TB,
malaria, yellow fever, influenza, etc.), lead poisoning, birth trauma,
2. (C)- Other choices are incorrect because controls in case-cohort studies are not matched to
cases (A), controls are selected at random with both designs (B), and cases must be selected
without regard to exposure (D).
3. New cases or events, population at risk or source population, passage of time
4. The size of the population may have grown (number increases even though rate does not);
the age distribution of the population may have changed (e.g., influx of families with small
children, outmigration of families with older children), so that age-standardized rate may not
change but a greater proportion of the population may be in the higher risk age range
(assuming that younger children have higher injury rates).
5. (D)- All of the above - use of prevalent cases requires that duration is not related to
exposure, controls should provide estimate of exposure in study base, and rare disease
assumption is required for OR to estimate RR (though not for OR to estimate IDR).
6. (B)- In a prospective cohort study, information on exposure is obtained before the outcome
(breast cancer, in this case) has occurred. Therefore recall bias - different recall by cases and
noncases - is not an issue. In a case-control study, cases and noncases may recall and report
exposure with different degrees of accuracy.
7. a. A (retrospective) cohort study.
c. SMRs are an indirect method of standardization, since they are based on weighted
averages for which the weights are taken from the population whose SMR is being
computed rather than from a "standard" population. Unless the age (and in this case, age-
calendar year interval) distributions for the populations whose SMR's are being computed
are the same, then the weighted averages that make up the SMR's are based on different sets
of weights and are not strictly comparable. Since age-interval distributions of exposed and
unexposed workers may differ, their SMR's are not strictly comparable.
d. An ROC curve plots the value of sensitivity and specificity for each case definition or
cutpoint. Examining the ROC curve shows the trade-off between sensitivity and specificity
that is available for the diagnostic test or measurement method. [The area between the
identity diagonal (slope = 1.0) and the ROC curve serves as a measure of accuracy that takes
into account both sensitivity and specificity, with the assumption that the costs of false
negatives and false positives are the same.]
e. (B) - Due to the low specificity (50%), half of hemorrhagic strokes in the patient group
will be classified as ischemic strokes.
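The validity measures this item deals with can be worked out from the background information in the question (525 patients, 325 with code 434, 85 of those judged not ischemic, and 20 ischemic strokes among the other codes). The following is a minimal sketch that treats the expert panel as the reference standard; the variable names are illustrative, not taken from the exam.

```python
# Counts derived from the background information, with the expert panel's
# classification taken as the reference ("gold") standard.
total_patients = 525
code_434 = 325                           # discharge code for ischemic stroke
false_pos = 85                           # coded 434, panel: not ischemic
true_pos = code_434 - false_pos          # coded 434, panel: ischemic
other_codes = total_patients - code_434  # any discharge code other than 434
false_neg = 20                           # other code, panel: ischemic
true_neg = other_codes - false_neg       # other code, panel: not ischemic

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)

print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
print(f"PPV         = {ppv:.3f}")
```

The positive predictive value depends on how many of the coded patients are truly ischemic, so it would change if the same code were applied in a population with a different mix of stroke types.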
9. a. Corona del Mar has a 2.9 times higher crude accident rate than Boulder.
b. Adjusted rates -
Corona del Mar: [(4579 x .0654) + (1274 x .0277) + (9399 x .0136)] / 15,252 = 29.9/1000
The cell phone/pager adjusted auto accident rate for Corona del Mar was 1.6 times that of
Boulder. A portion of the difference seen in the crude rates was due to differences in the
distribution of use of cell phones and pagers between the two cities.
The standard weights are the sum of the population sizes for the two cities. The weighted
rates are the rates for each city, weighted (multiplied) by the standard weights. The total of
the weighted rates is the directly standardized rate. A problem in using the directly
standardized rates is that there are small numbers of cellular phone and pager users in
Boulder.
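The direct standardization described above can be sketched in a few lines. The weights and stratum-specific rates below are the Corona del Mar figures quoted in 9b; the function name is an illustrative assumption, and the strata are left unlabeled because the original table is not reproduced here. Because the stratum rates shown are rounded, the result comes out at about 30 per 1,000, slightly different from the 29.9/1000 reported, which was presumably computed from unrounded rates.

```python
def directly_standardized_rate(standard_weights, stratum_rates):
    """Weighted average of stratum-specific rates using standard weights."""
    total_weight = sum(standard_weights)
    weighted_sum = sum(w * r for w, r in zip(standard_weights, stratum_rates))
    return weighted_sum / total_weight


# Standard weights: combined population of the two cities in each stratum of
# cell phone/pager use (figures as quoted in the answer to 9b).
standard_weights = [4579, 1274, 9399]
# Stratum-specific collision rates for Corona del Mar (rounded, as quoted).
corona_del_mar_rates = [0.0654, 0.0277, 0.0136]

adjusted = directly_standardized_rate(standard_weights, corona_del_mar_rates)
print(f"adjusted rate = {adjusted * 1000:.1f} per 1,000")
```

Applying the same weights to Boulder's stratum-specific rates would give the second adjusted rate, and the ratio of the two is the standardized comparison.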
The higher crude rate in Corona del Mar reflects the much higher use of cellular phones and
pagers, which is associated with a much higher accident rate. The difference is reduced for
the standardized rates, since these control for the different distributions of cellular phones
and pagers between the two cities. However, this is a situation where it is essential to
examine the specific rates, since Boulder has lower accident rates among cellular phone and
pager users but a higher rate among never-users.
Since the rates in never users are quite similar, Corona del Mar is likely to make its greatest
impact on accident rates by getting motorists to reduce cellular phone and pager use while
driving or finding some way to make such use safer (promote the use of "designated drivers"!?).
10. (A) Community intervention trials of this type assign groups to treatments and collect
measurements from individuals. The unit of analysis must be the same as the unit of
assignment (GROUP) or both (i.e., using mixed models).
11. a. T – a cohort study enrolls people who are free of the outcome and monitors them for the
development of the outcome, so the cohort design can be used to estimate risk of the event;
b. Not sure – the temporal sequence of exposure and disease typically cannot be addressed
in a case-control study, though in some cases it can (e.g., a genetic characteristic or other
"exposure" that can be definitively assigned to a time prior to disease onset);
c. F – a cohort design can readily be used to study multiple outcomes; a case-control design
can readily be used to study multiple exposures;
d. T – a randomized clinical trial often enrolls participants over a period of time, with
follow-up time measured from the time of randomization;
e. T – a cohort study begins with disease-free subjects and monitors them for development
of the outcome; if the outcome is rare, many subjects must be followed to obtain an
adequate number of cases;
f. F – ecological studies use group-level variables (e.g., per capita meat consumption) and
relate them to disease rates; direct assessment at the individual level is NOT made, which is
the basis for the ecological fallacy (where the group data are used to infer a link at the
individual level);
g. T – correlational studies (another term for ecological studies) are often used to compare
disease rates across geopolitical entities using available data;
i. F – cross-sectional studies measure prevalence, not risk (of a future event); they are the
most statistically generalizable type of study when, as is often the case, the study population
is obtained through population-sampling;
j. F – the natural history of a disease is the process by which it develops over time;
descriptive information relating to person, place, and time can at best provide only indirect
information;
k. F – as used in class, the term "attributable risk" refers to the risk difference;
m. T – for a rare outcome, the odds ratio (OR) closely approximates the cumulative
incidence ratio (CIR) and incidence density ratio (IDR), so it indicates strength of association
in the epidemiologic sense; when the outcome is not rare, the OR does not approximate but
does vary with the CIR and IDR, so the OR still gives an indication of the strength of association;
n. T – an attributable risk proportion estimates the proportion of risk that is associated with
an exposure in people who are exposed; attributable risk (as used in this course) is the risk
difference, which indicates the amount of risk associated with an exposure in people who are
exposed; attributable risk must be adjusted for the prevalence of the exposure in order to
estimate the amount of risk associated with exposure in the population as a whole;
o. F – since case-control studies begin with people who are already cases, they avoid having
to study a large number of people for a long time in order to accumulate enough cases; they
can also compare cases and controls in respect to many exposures; HOWEVER, they
cannot readily study many outcomes, since to do so requires enrolling cases for each of the
outcomes to be studied (i.e., equivalent to conducting several case-control studies that share
the same control group);
r. F – comparability of standardized rates and ratios across study populations requires that
the standardized measures be constructed using the same set of weights; indirect
standardization (e.g., via a SMR) employs the weights (the number of people in each
stratum) from the study population, so measures standardized using this method are, strictly
speaking, useful only for comparing a study population with the standard population used in
the standardization;
s. F – typically, general population controls will be less motivated than cases and sources of
medical information for them will not be comparable to those for cases.
12. a. ARP = (I1 - I0) / I1 = (RR-1) / RR = (1.34-1.04) / 1.34 = 0.30 / 1.34 = 22% (after
rounding)
Interpretation: Based on these data, 22% (about one in five) of strokes in people who are
physically inactive can be attributed to their physical inactivity; in other words, if physically
inactive people became active early enough in their lives, their stroke incidence would
decrease by 22%.
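The attributable risk proportion in 12a can be reproduced as follows. This is a minimal sketch using the two incidence values from the answer (1.34 and 1.04 per 1,000 person-years); the function name is illustrative.

```python
def attributable_risk_proportion(incidence_exposed, incidence_unexposed):
    """ARP = (I1 - I0) / I1, the share of risk in the exposed due to exposure."""
    return (incidence_exposed - incidence_unexposed) / incidence_exposed


# Incidence per 1,000 person-years, as used in the answer to 12a.
i_inactive = 1.34  # exposed: physically inactive
i_active = 1.04    # unexposed: physically active

arp = attributable_risk_proportion(i_inactive, i_active)
print(f"ARP = {arp:.0%}")  # prints "ARP = 22%"
```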
b. A key point here is that 27% is the prevalence of physically active people, whereas the
exposure is physical inactivity, whose prevalence is therefore 100% - 27% = 73%
(The formula PARP = (I - I0) / I can also be used by first estimating the crude population
incidence, I, as a weighted average of the incidences in exposed and unexposed, weighting by
the prevalence of exposure, e.g.: I = (0.73)(1.34) + (0.27)(1.04) = 1.259, so PARP = (1.259 -
1.04) / 1.259 = 17%.)
Attributable cases are (1.34-1.04) x number of exposed person-years. Since we do not know
the population size, represent it by n. Based on the NHANES data, 27% of people are
physically active, so there are 0.73n physically inactive people (in one year, 0.73 person-
years). So: Attributable cases = (1.34-1.04)(0.73) = 0.219.
All cases are exposed cases + unexposed cases. Since we do not know the population size,
let it be represented by n. Based on the prevalence of physically active people, there are
0.73n physically inactive and 0.27n physically active people (or person-years, if we assume a
one-year period). So the total number of cases = exposed cases + unexposed cases =
0.73(1.34) + 0.27(1.04) = 1.259
Note that these measures can be computed more precisely by using the original number of
cases and person-years and not rounding intermediate results, but two significant figures is
adequate for the actual result, and in this case the answer does not change.
Explanation: Seventeen percent of all strokes in the population are attributable to physical
inactivity; if everyone were physically active, there would be 17% fewer strokes.
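The population attributable risk proportion in 12b can be checked two ways: from the crude population incidence, as in the answer, and from the commonly used prevalence-based form p(RR - 1) / [1 + p(RR - 1)]. The sketch below uses the incidence values from 12a and the 73% prevalence of inactivity; the variable names are illustrative, and the second formula is offered as a cross-check rather than as the method the answer key used.

```python
# Incidence per 1,000 person-years (from 12a) and prevalence of the exposure.
i_inactive = 1.34       # exposed: physically inactive
i_active = 1.04         # unexposed: physically active
p_inactive = 1 - 0.27   # 27% are physically active, so 73% are exposed

# Method 1: crude population incidence as a prevalence-weighted average.
i_population = p_inactive * i_inactive + (1 - p_inactive) * i_active
parp_from_incidence = (i_population - i_active) / i_population

# Method 2: prevalence-based formula using the rate ratio.
rate_ratio = i_inactive / i_active
excess = p_inactive * (rate_ratio - 1)
parp_from_ratio = excess / (1 + excess)

print(f"population incidence = {i_population:.3f} per 1,000 PY")
print(f"PARP (incidence form) = {parp_from_incidence:.0%}")
print(f"PARP (ratio form)     = {parp_from_ratio:.0%}")
```

Both forms give about 17%, matching the explanation above.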
c. Attributable risk measures assume that the relationship is causal (i.e., that physical
inactivity does in fact cause an increase in stroke risk). Some of the above interpretations may
also require that the process be reversible, so that changing to a physically active lifestyle
brings risk down to the level of someone who was not inactive. Another assumption is that
the rates and rate ratio observed in the cohort study hold for the entire population. Also, we
have ignored the effects of other factors, most notably age.
13. a. This is a retrospective cohort study (researchers developed the hypothesis in 1998).
c. IDR= ID High / ID low = 2.24/0.651 = 3.4. Nuns with a high error communications
profile are 3.4 times more likely to die from Alzheimer's Disease than nuns with a low error
profile.
d.
Alzheimer’s Disease
Most of the questions on this examination relate to the article "Individual risk factors for hip
osteoarthritis: obesity, hip injury, and physical activity" (Cyrus Cooper, Hazel Inskip, Peter Croft,
Lesley Campbell, Gillian Smith, Magnus McLaren, and David Coggon. Am J Epidemiol 1998;
147:516-22). You may refer to this article during the examination.
1. Briefly list two reasons why a case control study is (or is not) appropriate to examine
individual risk factors for hip osteoarthritis. (2 pts)
2. The authors state that their cases come from a defined population. List four features of
the population or the study design that support this statement or helped the authors to
achieve it? (4 pts)
3. Considering the study population, study design, and other information in the article,
which of the following statements is (are) TRUE and which is (are) FALSE. (2 pts each)
b. If about 12% of the population was age 65 years or older, then about 12,000
people age 65 years or older in the two districts have radiographic evidence of hip
osteoarthritis.
c. The data in Table 1 demonstrate that women are 1.9 times as likely to develop
severe symptomatic hip osteoarthritis as are men.
d. The data in Table 2 indicate that female gender is not a risk factor for hip
osteoarthritis.
e. In this study, matching the control group to the cases on age, as opposed to a
random sample of the general adult population, probably resulted in greater
statistical power and precision.
4. The case identification process was based on a register in each district made up of
persons on a waiting list for a total hip arthroplasty (surgical reformation of the hip joint).
Waiting lists for procedures are common in societies with a national or social medicine
system. In the United States, a region wide waiting list for a hip arthroplasty is unlikely, as
the availability of receiving this procedure would be more related to insurance status or
ability to afford such a procedure. Explain how using the register system in the United
Kingdom to select cases either increases or decreases the possibility of selection bias as
compared to a study conducted in the United States. (4 pts)
5. How was the diagnosis of hip osteoarthritis made in this study? Was this based on
manifestational or causal criteria? Explain your answer. (3 pts)
6. According to the authors: "For each case, a control of the same sex and age was
selected from the list of the same general practice held by the county Family Health
Service Association". State in one sentence the rationale for using a list from general
practitioners. (3 pts)
7. Eighty-four percent of the patients listed for total hip arthroplasty fulfilled the criteria
for entry into the study as cases. Which of the following best describes the criteria: (3 pts)
a. age > 45 years, being on the waiting list for hip arthroplasty, and the presence
of Heberden’s nodes.
b. age > 45 years, pain duration at least for 36 months, and presence of
Heberden’s nodes.
c. history of hip fracture within the past year, being on the waiting list for hip
arthroplasty and reside in the study area.
d. presence of Heberden’s nodes, history of hip fracture within the past year, and
reside in the study area.
e. being on the waiting list for hip arthroplasty, reside in the study area, and age
> 45 years
8. The authors report that 89% of the eligible cases agreed to participate and 60% of the
1060 controls approached agreed to participate. Which of the following best states a
condition regarding the non-responders that could lead to an odds ratio reported for the
risk of osteoarthritis associated with previous hip injury that is biased away from the null
(>1). Choose one best answer. (3 pts)
a. control non-responders are more likely to have a history of hip injury compared
to case non-responders.
b. control non-responders are less likely to have a history of hip injury compared
to case non-responders.
9. What was accomplished by replacing controls who refused to participate? (Choose one
best answer) (3 pts)
b. the control group would have been less representative of the study base;
f. it would have been necessary to control for age and sex in the analysis.
10. The authors selected controls who were individually matched to cases by age, gender,
and family practitioner. Matching in the design stage is usually considered only for those
variables that are known to be confounders. Under which of the following circumstances
could gender be a confounder of the association between a risk factor (obesity) and the
outcome (hip osteoarthritis)? Circle all that apply. (4 pts)
a. the prevalence of obesity and the prevalence of hip osteoarthritis are both
higher in men than in women
b. the prevalence of obesity is lower in men than women, but the prevalence of
hip osteoarthritis is higher in men than women.
c. the prevalence of obesity is higher in men than women, but the prevalence of
hip osteoarthritis is the same in men and women.
d. the prevalence of obesity is the same in men and women, but the prevalence of
hip osteoarthritis is higher in men than women.
11. The odds ratios in Table 2 are "mutually adjusted for the other two variables" by
logistic regression. The following questions concern the models used to estimate the odds
ratios in the table (ignore the fact that it was "conditional" logistic regression and ignore
the middle categories for body mass index and presence of Heberden’s nodes) (2 pts
each):
a. How many logistic models were necessary to estimate the odds ratios for body
mass index >28.0, definite Heberden’s nodes, and previous hip injury among
women?
b. The odds ratio estimate for hip injury in women was 2.8. What must the logistic
coefficient have been?
c. From this table, estimate the odds ratio for women who had both definite
Heberden’s nodes and previous hip injury compared to women who had neither.
12. In this study, information on medical history, life style, and leisure time physical
activities was obtained through a "structured interviewer-administered questionnaire".
(page 517). It is possible that persons on a waiting list for a hip arthroplasty would be
more keenly aware of hip injuries they may have had in the past than controls. If true, this
is an example of which of the following? Choose one best answer. (3 pts)
14. Which of the following conclusions can be made from the above results? (choose one
best answer) (3 pts)
a. the unadjusted (crude) association between hip injury and hip osteoarthritis in
women is completely confounded by body mass index and Heberden’s nodes.
b. since the unadjusted and adjusted odds ratios are similar, the risk factor (hip
injury) must not be associated with the adjustment variables (body mass index and
Heberden’s nodes)
c. since the unadjusted and adjusted odds ratios are similar, there is no effect-
measure modification of the association between hip injury and hip osteoarthritis.
15. The odds ratios presented in Table 5 are adjusted for previous hip injury. Why might
they still be confounded by hip injury? (3 pts)
16. In Table 6, is the crude association between previous hip injury and risk of unilateral
hip osteoarthritis biased towards the null or away from the null? (2 pts)
17. Based on the data in Table 3, what is the odds ratio for Heberden's nodes (definite
versus none) for persons in the Upper tertile of body mass index? (3 pts)
18. Rothman has proposed that "public health synergism" is present when an observed
joint effect exceeds that expected under the additive model. Do the odds ratios in Table 3
indicate the presence of "public health synergism" for effect of Heberden's nodes and
elevated body mass index on hip osteoarthritis? If not, do the odds ratios conform to a
multiplicative model? Include in your answer a 1-2 sentence assessment of whether these
data indicate "public health synergism". (For this question, ignore the row for "Possible"
Heberden's nodes and the column for the middle tertile of body mass index, and assume
that both Heberden’s nodes and elevated BMI reflect causal risk factors for hip
osteoarthritis. Note: do not necessarily rely on the authors' description of this table.)
(6 pts)
19. The authors investigated the association of specific sporting activities with risk of hip
osteoarthritis. Their data are presented in Table 5. Using their data, compute separately
the unadjusted (crude) risk of osteoarthritis associated with playing golf and for
swimming in men and women combined. Consider those who do not participate in any
sport as the reference group and assume no missing data. Show two appropriate 2x2 tables
and your calculations. (4 pts)
19a. Compare these unadjusted (crude) odds ratios with the ones presented in Table 3.
Briefly describe and explain the comparison. (3 pts)
19b. Consider the possibility that golfers who have hip osteoarthritis are reluctant to seek
medical attention for their condition for fear it will mean the end of their ability to play
golf. Therefore, cases who golf are less likely to be selected for this study than cases
who do not golf. If the true OR associated with golf is 2.0, then which of the following
best describes the selection bias and its impact on the odds ratio you computed. (3 pts)
a. non-differential selection bias resulting in an odds ratio biased toward the null.
b. non-differential selection bias resulting in an odds ratio biased away from the
null.
c. differential selection bias resulting in an odds ratio biased away from the null.
d. differential selection bias resulting in an odds ratio biased toward the null.
19c. The authors state that "...the association with swimming may have arisen because
patients with hip osteoarthritis were advised to swim..." (page 521). Suppose that 25% of
the cases had been incorrectly classified as swimmers and assume that the misclassified
cases had not participated in any other sporting activity, either. Re-compute the odds ratio
for the association of hip osteoarthritis and swimming, after re-classifying these
individuals, using the number from the 2x2 table in question 19 above. Briefly discuss
how your conclusion about the role of swimming does (or does not) change. In what
direction did misclassification bias the study OR? (3 pts)
20. The odds ratio (95% confidence interval) estimating the risk of osteoarthritis
associated with a previous hip injury was 24.8 (3.1-199.3) in men and 2.8 (1.4-5.8) in
women (see Table 2).
c. Which estimate is more compatible with a population odds ratio of 4.0? (2 pts)
21. Which one of the statements best interprets the following passage? (3 pts)
"In a previous case-control study (17) of men aged 60-76 years, we observed a
doubling of risk for hip osteoarthritis among those in the highest third of body
mass index distribution, as compared with those in the lowest third, although the
increased risk was not statistically significant." (p519 bottom of right column)
c. The doubling of risk was not statistically significant because a p-value was not
computed, so it is not possible for the authors to know whether the increased risk
was due to chance.
d. If 1,000 independent random samples the same size as that study population
were drawn from a population with no increased risk of hip osteoarthritis, fewer
than 950 would have an OR between 0.5 and 2.0.
e. If 1,000 independent random samples the same size as that study population
were drawn from a population with a doubling of risk of hip osteoarthritis for the
highest third of the body mass distribution, as compared with the lowest third,
more than 5% of the samples would display no elevation in risk.
f. If 1,000 independent random samples the same size as that study population
were drawn from a population with a doubling of risk of hip osteoarthritis for the
highest third of the body mass distribution, as compared with the lowest third,
fewer than 80% would display an association of that magnitude.
22. A medical journalist, confused by the thrust of this article, comes to you and says:
"I've read this article several times, but I can't figure out what it shows about the
relationship of body mass index, Heberden's nodes, and hip osteoarthritis. The authors
explain that 'two broad mechanisms are believed to underlie the pathogenesis of
osteoarthritis at any joint site: mechanical stress and a generalized predisposition to the
disorder' as indexed by Heberden’s nodes [p519 right column]. That seems
straightforward enough, and they later conclude that the analysis 'supports the notion that
this condition arises through an interaction between a generalized predisposition to the
disorder and specific mechanical insults to the hip' [p521]. Yet on page 518 [right
column], the authors state that there was 'no statistically significant interaction' between
body mass index and Heberden's nodes, and on page 519 [left column] they refer to
obesity and a tendency to polyarticular involvement as 'independent risk factors for hip
osteoarthritis'. Would you please assess for me what this article shows about the
relationship among body mass index, Heberden's nodes, and hip osteoarthritis? I have
room for 40-60 words. Thanks!" (6 pts)
23. Write a brief statement for or against a causal relationship between hip injury and risk
of osteoarthritis. Comment specifically on at least two of Bradford Hill’s criteria for
causal inference. Support your conclusion with data or statements from the article. (4
pts)
1. Briefly list two reasons why a case control study is (or is not) appropriate to examine individual
risk factors for hip osteoarthritis. (2 pts)
Condition rare, faster to complete than cohort study, wide range of exposures of interest.
2. The authors state that their cases come from a defined population. List four features of the
population or the study design that support this statement or helped the authors to achieve it? (4
pts)
1. The two health districts had a centralized orthopedic facility for assessment and treatment of hip
osteoarthritis;
2. Local orthopedic surgeons were willing to enter all patients into the study;
3. All men and women 45 years and older who were placed on the waiting list for primary total hip
arthroplasty were considered for the study;
5. The study excluded patients who lived outside the two districts.
The diverse socioeconomic profile was an advantage for generalizability but does not make this a defined
population.
3. Considering the study population, study design, and other information in the article, which of
the following statements is TRUE and which are FALSE. (2 pts each)
a. In these two health districts, the incidence density of symptomatic hip osteoarthritis of
sufficient severity to warrant hip arthroplasty exceeds 40 per 100,000 person-years.
[TRUE - 726 eligible cases / 1 million population over 18 months = 48.4 per 100,000]
b. If about 12% of the population was age 65 years or older, then about 12,000 people age
65 years or older in the two districts have radiographic evidence of hip osteoarthritis.
[TRUE - 10% population prevalence in age 65 years and older * 12% of one million]
c. The data in Table 1 demonstrate that women are 1.9 times as likely to develop severe
symptomatic hip osteoarthritis as are men.
[FALSE - the data in Table 1 cannot demonstrate this female excess, since there is no information
about the sex ratio in the older population; this ratio may well reflect a greater incidence of severe
symptomatic hip osteoarthritis in women, but some of the excess presumably derives from greater
mortality among men.]
d. The data in Table 2 indicate that female gender is not a risk factor for hip osteoarthritis.
[FALSE - controls were matched to cases on gender (and age), so the sex ratio in the controls must
match that in the cases]
e. In this study, matching the control group to the cases on age, as opposed to a random
sample of the general adult population, probably resulted in greater statistical power and
precision.
[TRUE - the mean age of the cases is 70 years old, with the majority older than 60; thus, the use of
general population controls without regard to age would result in relatively little overlap between the
age distributions of cases and controls on this very important variable.]
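The arithmetic behind answers 3a and 3b can be checked with a short sketch. It assumes, as the answer key does, a population of about one million in the two districts, 726 eligible cases over 18 months, and a 10% radiographic prevalence among those aged 65 years and older. The function and variable names are illustrative only.

```python
def incidence_density_per_100k(cases, population, years):
    """Cases per 100,000 person-years, treating the population as stable."""
    person_years = population * years
    return cases / person_years * 100_000

# 3a: 726 eligible cases, about 1 million people, 18 months of case finding
rate = incidence_density_per_100k(cases=726, population=1_000_000, years=1.5)

# 3b: 12% of the population is 65 or older; 10% of them have radiographic OA
prevalent_cases = 0.10 * 0.12 * 1_000_000

print(f"{rate:.1f} per 100,000 person-years")
print(f"{prevalent_cases:,.0f} prevalent cases")
```

The rate treats the one million residents as contributing person-time for the full 18 months, which is the same approximation the answer key makes.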
4. The case identification process was based on a register in each district made up of persons on a
waiting list for a total hip arthroplasty (surgical reformation of the hip joint). Waiting lists for
procedures are common in societies with a national or social medicine system. In the United States,
a region-wide waiting list for a hip arthroplasty is unlikely, as the availability of receiving this
procedure would be more related to insurance status or ability to afford such a procedure. Explain
how using the register system in the United Kingdom to select cases either increases or decreases the
possibility of selection bias as compared to a study conducted in the United States. (4 pts)
Using the registry may reduce selection bias if affluence or ability to pay for a hip replacement is associated
with exposures like BMI, physical activity, Heberden’s nodes. Cases selected from surgery lists in the United
States system may have a differential association with a risk factor as compared with cases not receiving this
procedure, so measures of association may be more biased in a U.S. study.
5. How was the diagnosis of hip osteoarthritis made in this study? Was this based on manifestional
or causal criteria? Explain your answer. (3 pts)
(page 517, left column, 2nd paragraph): Diagnosis of hip osteoarthritis in this study was based on pelvic
radiographs. This is based on manifestional criteria.
6. According to the authors: "For each case, a control of the same sex and age was selected from the
list of the same general practice held by the county Family Health Service Association". State in one
sentence the rationale for using a list from general practitioners? (3 pts)
(page 517, left column, 3rd paragraph): In England and Wales, almost everyone is registered with a general
practitioner so that these lists essentially provide an enumeration of the general population.
7. Eighty-four percent of the patients listed for total hip arthroplasty fulfilled the criteria for entry
into the study as cases. Which of the following best describes the criteria: (3 pts)
a. age > 45 years, being on the waiting list for hip arthroplasty, and the presence of
Heberden’s nodes.
b. age > 45 years, pain duration at least for 36 months, and presence of Heberden’s nodes.
c. history of hip fracture within the past year, being on the waiting list for hip arthroplasty
and reside in the study area.
d. presence of Heberden’s nodes, history of hip fracture within the past year, and reside in
the study area.
e. being on the waiting list for hip arthroplasty, reside in the study area, and age > 45 years (answer)
8. The authors report that 89% of the eligible cases agreed to participate and 60% of the 1060
controls approached agreed to participate. Which of the following best states a condition regarding
the non-responders that could lead to an odds ratio reported for the risk of osteoarthritis associated
with previous hip injury that is biased away from the null (>1). Choose one best answer. (3 pts)
a. control non-responders are more likely to have a history of hip injury compared to case non-responders.
(answer)
b. control non-responders are less likely to have a history of hip injury compared to case
non-responders.
9. What was accomplished by replacing controls who refused to participate? (Choose one best
answer) (3 pts) If controls who refused had not been replaced:
b. the control group would have been less representative of the study base;
f. it would have been necessary to control for age and sex in the analysis.
Answer: d. Failure to replace controls who refused would have reduced both the number of controls and of
cases (due to the matching), with a loss of statistical power and increase in the probability of a type II error.
10. The authors selected controls who were individually matched to cases by age, gender, and family
practitioner. Matching in the design stage is usually considered only for those variables that are
known to be confounders. Under which of the following circumstances could gender be a
confounder of the association between a risk factor (obesity) and the outcome (hip osteoarthritis)?
Circle all that apply. (4 pts)
a. the prevalence of obesity and the prevalence of hip osteoarthritis are both higher in men than in women
(true)
b. the prevalence of obesity is lower in men than women, but the prevalence of hip osteoarthritis is higher in
men than women. (true)
c. the prevalence of obesity is higher in men than women, but the prevalence of hip
osteoarthritis is the same in men and women.
d. the prevalence of obesity is the same in men and women, but the prevalence of hip
osteoarthritis is higher in men than women.
11. The odds ratios in Table 2 are "mutually adjusted for the other two variables" by logistic
regression. The following questions concern the models used to estimate the odds ratios in the table
(ignore the fact that it was "conditional" logistic regression and ignore the middle categories for body
mass index and presence of Heberden’s nodes) (2 pts each):
a. How many logistic models were necessary to estimate the odds ratios for body mass index
>28.0, definite Heberden’s nodes, and previous hip injury among women.
"Mutually adjusted" means that each odds ratio comes from a model that includes the other two
factors, which therefore means that all three factors are included in the same model. So one model
yields an adjusted odds ratio for each variable. So one model was used.
b. The odds ratio estimate for hip injury in women was 2.8. What must the logistic
coefficient have been?
The OR for a dichotomous or indicator variable is exp(beta), where beta is the logistic
coefficient. Therefore the coefficient was ln(2.8) = 1.0296.
c. From this table, estimate the odds ratio for women who had both definite
Heberden’s nodes and previous hip injury compared to women who had
neither.
The logistic model is based on additivity of the logit or multiplicativity of the odds.
Therefore the odds ratio for the double exposure is the product of the odds ratios for
each of the risk factors: 1.5*2.8=4.2.
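The two calculations in 11b and 11c can be verified with a minimal sketch. It uses only the odds ratios quoted in the answers above (2.8 for previous hip injury and 1.5 for definite Heberden's nodes in women) and assumes a model with no interaction term; the variable names are illustrative.

```python
import math

or_hip_injury = 2.8   # adjusted OR for previous hip injury in women (Table 2)
or_heberden = 1.5     # OR for definite Heberden's nodes used in answer 11c

# 11b: the logistic coefficient is the natural log of the odds ratio
beta_hip_injury = math.log(or_hip_injury)

# 11c: log-odds add, so odds ratios multiply (no interaction term assumed)
beta_heberden = math.log(or_heberden)
joint_or = math.exp(beta_hip_injury + beta_heberden)

print(round(beta_hip_injury, 4))
print(round(joint_or, 1))
```

Adding the coefficients on the logit scale and exponentiating gives the same result as multiplying the two odds ratios directly.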
12. In this study, information on medical history, life style, and leisure time physical
activities was obtained through a "structured interviewer-administered
questionnaire". (page 517). It is possible that persons on a waiting list for a hip
arthroplasty would be more keenly aware of hip injuries they may have had in the past
than controls. If true, this is an example of which of the following? Choose one best
answer. (3 pts)
13. Among women, the odds of previous hip injury is higher among cases than
controls (Table 2; OR=2.8). As indicated in the footnotes for Table 2, the odds ratio
for previous hip injury is adjusted or controlled for the other two variables in the
Table (body mass index and Heberden’s nodes). Using the counts shown in Table 2,
calculate an unadjusted (crude) odds ratio for previous hip injury in women. (3 pts)
14. Which of the following conclusions can be made from the above results? (choose
one best answer) (3 pts)
a. the unadjusted (crude) association between hip injury and hip osteoarthritis
in women is completely confounded by body mass index and Heberden’s
nodes.
b. since the unadjusted and adjusted odds ratios are similar, the risk factor
(hip injury) must not be associated with the adjustment variables (body mass
index and Heberden’s nodes)
c. since the unadjusted and adjusted odds ratios are similar, there is no effect-
measure modification of the association between hip injury and hip
osteoarthritis.
There may be residual confounding by type of hip injury or by how long ago the hip injury
occurred, or imperfect recall of hip injury (non-differential misclassification).
16. In Table 6, is the crude association between previous hip injury and risk of
unilateral hip osteoarthritis biased towards the null or away from the null? (2 pts)
17. Based on the data in Table 3, what is the odds ratio for Heberden's nodes
(definite versus none) for persons in the Upper tertile of body mass index? (3 pts)
18. Rothman has proposed that "public health synergism" is present when an
observed joint effect exceeds that expected under the additive model. Do the odds
ratios in Table 3 indicate the presence of "public health synergism" for effect of
Heberden's nodes and elevated body mass index on hip osteoarthritis? If not, do the
odds ratios conform to a multiplicative model? Include in your answer a 1-2 sentence
assessment of whether these data indicate "public health synergism". (For this
question, ignore the row for "Possible" Heberden's nodes and the column for the
middle tertile of body mass index, and assume that both Heberden’s nodes and
elevated BMI reflect causal risk factors for hip osteoarthritis. Note: do not
necessarily rely on the authors' description of this table.) (6 pts)
Ignoring the intermediate categories for Heberden's nodes and body mass
index gives the following expression for the additive model:
Expected joint excess risk = excess risk for factor 1 + excess risk for factor 2
= excess risk for Heberden's nodes + excess risk for Body mass index
Since hip osteoarthritis of this severity is rare, the following approximate
expressions are appropriate:
Expected excess risk = (OR for Heberden's nodes - 1) + (OR for Body mass index -
1)
The substantial difference between 2.2 and 1.0 indicates that the odds ratios
in this table do not conform to an additive model for expected joint effect.
Expected joint OR = (OR for Heberden's nodes) * (OR for Body mass index )
Since these odds ratios indicate a joint effect greater than that expected under
an additive model, "public health synergism" is present, to a moderate degree
(we expect a 100% increase in risk but observe a 220% increase in risk)
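The comparison of additive and multiplicative expectations described above can be sketched as follows. The input odds ratios in this example are hypothetical placeholders chosen for illustration, not the values from Table 3, and the function name is an assumption of this sketch.

```python
def expected_joint_or(or_a, or_b):
    """Expected joint odds ratio under additive and multiplicative models."""
    additive = 1 + (or_a - 1) + (or_b - 1)   # excess risks add
    multiplicative = or_a * or_b             # odds ratios multiply
    return additive, multiplicative

# Hypothetical single-factor odds ratios, NOT taken from Table 3
additive, multiplicative = expected_joint_or(1.5, 2.0)
observed_joint_or = 4.0                      # hypothetical observed joint OR

print(additive, multiplicative)
print("exceeds additive expectation:", observed_joint_or > additive)
print("exceeds multiplicative expectation:", observed_joint_or > multiplicative)
```

An observed joint odds ratio above the additive expectation corresponds to Rothman's "public health synergism"; whether it also exceeds the multiplicative expectation is a separate question.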
19. The authors investigated the association of specific sporting activities with risk of
hip osteoarthritis. Their data are presented in Table 5. Using their data, compute
separately the unadjusted (crude) risk of osteoarthritis associated with playing golf
and for swimming in men and women combined. Consider those who do not
participate in any sport as the reference group and assume no missing data. Show
two appropriate 2x2 tables and your calculations. (4 pts)
YES 51 34
NO 140 162
OR = 1.7
NO 140 162
OR = 1.6
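The crude odds ratio for the first table above can be reproduced with a small sketch. It assumes the two columns are cases and controls, in that order, which is the ordering consistent with the reported OR of 1.7; the function name is illustrative.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Crude odds ratio from a 2x2 table (cross-product ratio)."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)

# First table above: YES row 51 cases / 34 controls, NO row 140 cases / 162 controls
crude_or = odds_ratio(51, 34, 140, 162)
print(round(crude_or, 1))
```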
19a. Compare these unadjusted (crude) odds ratios with the ones presented in Table
3. Briefly describe and explain the comparison. (3 pts)
Table shows 1.4 and 1.5, respectively. This suggests that BMI, nodes, and hip injury
explain very little of the association of these two sports with hip osteoarthritis.
19b. Consider the possibility that golfers who have hip osteoarthritis are reluctant to
seek medical attention for their condition for fear it will mean the end of their ability
to play golf. Therefore, cases who golf are less likely to be selected for this study than
cases who do not golf. If the true OR associated with golf is 2.0, then which of the
following best describes the selection bias and its impact on the odds ratio you
computed. (3 pts)
c. differential selection bias resulting in an odds ratio biased away from the
null.
d. differential selection bias resulting in an odds ratio biased toward the null. (answer)
19c. The authors state that "...the association with swimming may have arisen
because patients with hip osteoarthritis were advised to swim..." (page 521). Suppose
that 25% of the cases had been incorrectly classified as swimmers and assume that
the misclassified cases had not participated in any other sporting activity, either. Re-
compute the odds ratio for the association of hip osteoarthritis and swimming, after
re-classifying these individuals, using the number from the 2x2 table in question 19
above. Briefly discuss how your conclusion about the role of swimming does (or
does not) change. In what direction did misclassification bias the study OR? (3 pts)
OR = 0.96: The misclassification was differential and biased the odds ratio
upward.
20. The odds ratio (95% confidence interval) estimating the risk of osteoarthritis
associated with a previous hip injury was 24.8 (3.1-199.3) in men and 2.8 (1.4-5.8) in
women (see Table 2).
21. Which one of the statements best interprets the following passage? (3 pts)
"In a previous case-control study (17) of men aged 60-76 years, we observed
a doubling of risk for hip osteoarthritis among those in the highest third of
body mass index distribution, as compared with those in the lowest third,
although the increased risk was not statistically significant." (p519 bottom of
right column)
c. The doubling of risk was not statistically significant because a p-value was
not computed, so it is not possible for the authors to know whether the
increased risk was due to chance.
d. If 1,000 independent random samples the same size as that study population were
drawn from a population with no increased risk of hip osteoarthritis, fewer than 950 would
have an OR between 0.5 and 2.0. (answer)
22. A medical journalist, confused by the thrust of this article, comes to you and says:
"I've read this article several times, but I can't figure out what it shows about the
relationship of body mass index, Heberden's nodes, and hip osteoarthritis. The
authors explain that 'two broad mechanisms are believed to underlie the
pathogenesis of osteoarthritis at any joint site: mechanical stress and a generalized
predisposition to the disorder' as indexed by Heberden’s nodes [p519 right column].
That seems straightforward enough, and they later conclude that the analysis
'supports the notion that this condition arises through an interaction between a
generalized predisposition to the disorder and specific mechanical insults to the hip'
[p521]. Yet on page 518 [right column], the authors state that there was 'no
statistically significant interaction' between body mass index and Heberden's nodes,
and on page 519 [left column] they refer to obesity and a tendency to polyarticular
involvement as 'independent risk factors for hip osteoarthritis'. Would you please
assess for me what this article shows about the relationship among body mass index,
Heberden's nodes, and hip osteoarthritis? I have room for 40-60 words. Thanks!" (6
pts)
Points to include:
1. Both body mass index and presence of Heberden's nodes were associated with greater
risk of hip osteoarthritis, even when the other is absent.
2. People with both elevated BMI and Heberden's nodes have a greater risk for hip
osteoarthritis than people with only one of these risk factors and even greater than would be
expected from adding or multiplying their individual effects (i.e., greater than expected by
both additive or multiplicative models).
3. The authors seem to believe and the study does not show otherwise that most cases of hip
osteoarthritis in their study result from a combination of mechanical stress (which could be
something other than obesity) and biologic predisposition (which might not yet have
manifested in other joints).
Grading: 6 points for 3 of these, 5 points for two of them, 3 points for one. If none was
mentioned then 1-2 points awarded depending upon the relevance and accuracy of what was
written.
23. Write a brief statement for or against a causal relationship between hip injury and
risk of osteoarthritis. Comment specifically on at least two of Bradford Hill’s criteria
for causal inference. Support your conclusion with data or statements from the
article. (4 pts)
1. Match the term from column A with the most appropriate topic or
concept from column B (use each term only once and each topic only
once). (1 pt each = 12 pts)
3. In the Minnesota Heart Health Program (as described in class) and many
other community intervention studies, the effectiveness of an
educational intervention program is evaluated. Which of the following
selections best describes the unit of assignment, the unit of
observation, and the unit of analysis (in this order) in studies of
these types? (Choose one best answer) (4 pts)
____ b. prevent bias introduced when the patients know what type of
treatment they are receiving
____ c. prevent bias introduced when the investigators know what type of
treatment the patients are receiving
____ d. b and c
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
___________________________________________________________________
HIV-infected 9 31 40
___________________________________________________________________
7A. Which one answer best describes the transmission rate in the table?
(4 pts)
____ a. proportion
____ d. odds
7B. Using the data in the table, estimate the relative risk of HIV
infection for infants whose mothers took zidovudine relative to
infants of mothers who took placebo. Show formula and calculations.
(4 pts)
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
7C. Based on the data in the above table, estimate the proportion of
potential cases of perinatal HIV transmission that could be prevented
by providing zidovudine to HIV-positive, 2nd trimester pregnant women
who would otherwise not receive the drug. (Assume all women take the
medication and consider only singleton births.) Show formula or
diagram and calculations. (4 pts)
____ c. Cases are HIV-infected infants; controls are infants whose mothers
should have received zidovudine but did not.
Methods: Data were obtained from interview, exam, and lab tests.
Results:
SD = standard deviation
a. mean
b. SD
c. range
d. median
8B. Of the four variables in Table 1, which has the most symmetrical
(normal-like) distribution? (Choose one best answer.) (4 pts)
Syphilis 7/930
Gonorrhea 42/940
Chlamydia 66/957
_______________________________________________________
8C. Based on the above data and assuming that the two diseases have
the same average duration, how do their incidence rates compare in
this population? (Choose the one correct answer.) (3 pts)
8D. Based on the above data but this time assuming that the two diseases
have the same incidence, how do their average durations compare in
this population? (Choose the one correct answer.) (2 pts)
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
9C. What measure would you use to quantify the strength of association
between cigarette smoking and AA-10? Show the formula for this
measure, substitute the appropriate numbers for that formula, compute
the result, and state its meaning in one sentence. (4 pts)
a. Formula
b. Substitution
c. Result
d. Meaning ____________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
9D. Assuming that cigarette smoking is responsible for the observed excess
in AA-10, how many cases of AA-10 during the quarter are attributable
to cigarette smoking? Show a relevant formula or diagram,
intermediate computation, and result, and give a sentence stating the
meaning of the result. (4 pts)
a. Formula or diagram
b. Substitution
c. Result
d. Meaning ____________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
10. Suppose that 900 of the subjects in question #8 consent to regular STD
screening following release from detention. Subjects are counseled
about preventive measures and screened every three months for two
years. All cases are treated and cured.
Number tested 890 870 850 810 780 760 710 630
____________________________________________________________________
(Subjects can become infected with the same organism more than once
and/or become co-infected with more than one organism.)
10B. What is the average incidence density (per 100 person months or per
100 person years) of chlamydia for the two years of follow up? Assume
that: dropouts contribute no time to follow up after the last time
they are tested; subjects remain at risk even while infected. (3 pts)
10C. Give two reasons for preferring incidence density over cumulative
incidence for assessing frequency of infection in this cohort. (6 pts)
i. ___________________________________________________________
_______________________________________________________________
ii. ___________________________________________________________
_______________________________________________________________
11. A study of alcoholism and major depressive disorder recruited 100
consecutive patients in a Veterans Administration hospital in Urbana,
Illinois. All patients had been diagnosed as being alcohol abusers.
An equal number of non-abusers were selected randomly from the same VA
hospital. 76 of the participants identified as being abusers
fulfilled criteria for major depression, as did 20 of the non-abusers.
Evaluate the evidence provided by this study for the inference that
alcohol abuse causes depression in relation to the following aspects:
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
11B. Many of the criteria for causal inference pertain to the evaluation of
evidence from multiple studies, but several can also apply to a single
study. Name two (2) such criteria and use them to evaluate
(quantitatively where possible) the evidence from the above study.
(6 pts)
i. ___________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
ii. ___________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
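For the quantitative part of 11B, one way to express strength of association from the counts given in question 11 (76 of 100 abusers and 20 of 100 non-abusers meeting criteria for major depression) is sketched below. The choice of a prevalence ratio and a prevalence odds ratio is an assumption of this sketch, not a statement of the expected answer.

```python
depressed_abusers, n_abusers = 76, 100
depressed_nonabusers, n_nonabusers = 20, 100

p_abusers = depressed_abusers / n_abusers
p_nonabusers = depressed_nonabusers / n_nonabusers

# Ratio of the two prevalences of major depression
prevalence_ratio = p_abusers / p_nonabusers

# Cross-product ratio from the same 2x2 table
prevalence_odds_ratio = (
    depressed_abusers * (n_nonabusers - depressed_nonabusers)
) / (depressed_nonabusers * (n_abusers - depressed_abusers))

print(round(prevalence_ratio, 1), round(prevalence_odds_ratio, 1))
```

Both measures describe depression status at the time of the study, so by themselves they say nothing about whether alcohol abuse preceded depression.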
Congratulations!
Department of Epidemiology
1. Matching (1 pt each):
Column A - Terms Column B - Topics
7 cumulative incidence (11 is ok) 1. Case-control studies
12 incidence density 2. Causal inference
11 prevalence (7 is ok) 3. Confounds cross-sectional data
2 dose response 4. Death certificate
9 induction period 5. Descriptive epidemiology
1 odds ratio 6. Diagnostic tests
8 preventive fraction in the exposed 7. Estimates risk
4 underlying cause of death 8. Measures impact
6 positive predictive value 9. Natural history of disease
10 detectable, pre-clinical phase 10. Population screening
5 migrant studies 11. Proportion
3 cohort effect 12. Relative rate
(Credit was also given for some other pairings.)
By diagram:
8B. a. Age at first coitus -- its mean and median are close together
and not very far from the middle of the range. The mean and median
are also close together for the number of partners in the past
4 months, but they are nowhere near the middle of the range. (4 pts)
8C. a. Incidence of gonorrhea is lower than that of chlamydia -- if
duration is the same for both diseases, the prevalence odds are
proportional to the incidence density, so gonorrhea's smaller
prevalence (42/940 vs. 66/957) implies a lower incidence. (3 pts)
Since both diseases have the same incidence, the ratio of their
durations equals the ratio of their prevalence odds:
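A minimal sketch of that ratio, assuming that 8D compares the same two diseases as 8C (gonorrhea, 42/940, and chlamydia, 66/957) and that the population is in a steady state so that prevalence odds equal incidence times average duration. The function name is illustrative.

```python
def prevalence_odds(cases, tested):
    """Prevalence odds: cases divided by non-cases among those tested."""
    return cases / (tested - cases)

gonorrhea_odds = prevalence_odds(42, 940)
chlamydia_odds = prevalence_odds(66, 957)

# With equal incidence, the duration ratio equals the prevalence odds ratio
duration_ratio = gonorrhea_odds / chlamydia_odds
print(round(duration_ratio, 2))
```

A ratio below 1 indicates that, under these assumptions, gonorrhea would have the shorter average duration.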
9A. School absence from acute asthma and cigarette smoking (4 pts):
                             CI in smokers       8.3%
Cumulative incidence ratio = ----------------- = ------ = 1.89
                             CI in nonsmokers    4.4%
9D. Number of cases of excessive absence due to acute asthma (AA-10) that
(assuming causation) are attributable to smoking.
This question asks for the size of the shaded box in the diagram in
the "evolving text". That diagram, with numbers instead of variables
is:
            |
       8.3% |                                     8.3% = incidence
            |                |XXXXXXXXXXXXXXX|           in exposed
 Incidence  |                |               |           persons
            |                | 3.9% x 1,200  |
            |                |     = 47      |    3.9% = "attributable
       4.4% |                |XXX        XXXX|           risk"
            |                |\\\\\\\\\\\\\\\|
            |      300       | 4.4% x 1,200  |    4.4% = incidence
           0|                |\\   = 53    \\|           in unexposed
                  6,800         1,200 (15%)              persons
               Nonsmokers         Smokers
All these methods come up with approximately the same answer, the
differences being due to the rounding of intermediate results in
obtaining some of the incidences and the CIR. When the numbers
from the table are used and intermediate results not rounded, the
number of cases attributable to smoking is 47.0588
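The unrounded figure can be reproduced with a short sketch. The group sizes (1,200 smokers, 6,800 nonsmokers) and the 300 cases among nonsmokers come from the diagram; the 100 cases among smokers is an assumption inferred from the 8.3% incidence, since the original table is not reproduced here.

```python
smokers, nonsmokers = 1_200, 6_800
cases_nonsmokers = 300   # from the diagram
cases_smokers = 100      # assumption: inferred from 8.3% of 1,200

ci_smokers = cases_smokers / smokers
ci_nonsmokers = cases_nonsmokers / nonsmokers

cir = ci_smokers / ci_nonsmokers                 # cumulative incidence ratio
attributable_risk = ci_smokers - ci_nonsmokers   # incidence difference
attributable_cases = attributable_risk * smokers

print(round(cir, 2))
print(round(attributable_cases, 4))
```

Carrying full precision through the incidences is what removes the small discrepancies between the rounded methods mentioned above.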
                       (Total) Cases
Incidence density = ---------------------
                     (Total) person-time
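The person-time denominator for question 10B can be sketched from the "Number tested" row in question 10. This assumes each test represents three months of follow-up for each person tested and that subjects who drop out do not return. The case count below is a hypothetical placeholder, because the number of chlamydia cases is not reproduced in this text.

```python
tested_per_quarter = [890, 870, 850, 810, 780, 760, 710, 630]
months_per_interval = 3

# Each person tested at a visit contributes the preceding three months
person_months = sum(tested_per_quarter) * months_per_interval

def cases_per_100_person_months(cases, person_months):
    """Cases divided by person-time, scaled to 100 person-months."""
    return cases / person_months * 100

hypothetical_cases = 189   # placeholder, NOT from the exam table
rate = cases_per_100_person_months(hypothetical_cases, person_months)

print(person_months)
print(round(rate, 2))
```

Multiplying the result by 12 converts it to cases per 100 person-years.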
These diseases have an extended risk period (i.e., one longer than the
period of observation)
The following exam questions relate to the article: Freudenheim J et al. Exposure to
breastmilk in infancy and the risk of breast cancer. Epidemiology 1994;5:324-331. You
may refer to this article during the examination.
NOTE:
"I have neither given nor received help from others in completing this examination."
______________________________________________________________________________
________
1. Which of the following best characterizes the present study as presented in the article
(2 pts):
2. Find an example from the paper for each of the following (give the page number and
quote enough of the words to identify the point or passage; the same point or phrase
cannot be used more than once) (2 pts each)
A. Age is causally related to breast cancer risk and an infant’s age is related to
her exposure to breastmilk.
B. Age is causally related to breast cancer risk and infant feeding practices have
changed over time.
C. Age is causally related to breast cancer risk but not associated with breast
feeding practices.
D. Age is causally related to breast cancer risk but is causally related to breast
feeding practices.
4. The authors describe their study as a case-control study of dietary and reproductive
factors for breast cancer (p. 324). Which of the following best describes the type of
situation for which case-control studies are most advantageous compared to other
designs. (choose one best answer). (2 pts)
5. The authors used the term "cohort effects" in regard to results from previously
reported studies. Which of the following best describes what is meant by cohort
effects in this context? (choose one best answer). (2 pts)
6. Cases in this study were incident cases of confirmed cancer of the breast (p. 325).
Which of the following best describes the advantage of selecting incident cases over
prevalent cases (choose one best answer) (2 pts)
A. selecting from a pool of prevalent cases would make separation of factors
associated with risk and those with survival more difficult.
B. selecting from a pool of prevalent cases would make exposure assessment
more difficult because of pre-existing disease status.
C. selecting from a pool of incident cases creates a more homogenous case
group with regard to unknown confounding factors.
D. selecting from a pool of incident cases reduces misclassification bias.
a. primary
b. histologically-confirmed
8. In this study, controls were selected by a random process from residents of the two
counties and were frequency age matched to cases (p. 325). Which of the following
best describes a reason for preferring community controls over hospital-based
controls for this study? (choose one best answer). (2 pts)
A. kappa coefficient
B. correlation coefficient of reproducibility
C. intraclass correlation coefficient
D. product-moment correlation
E. A or B
F. A, B, or C
12. For each of the following statements, indicate if it is TRUE OR FALSE: (1 pt each)
a. By matching the controls to the cases on age, the authors have ensured that
age will not be a confounder.
b. The procedure for identifying cases is essentially one of active surveillance.
c. The difference between the proportion of cases interviewed and the
proportion of controls interviewed will cause selection bias.
d. The fact that premenopausal controls who had been breastfed were somewhat
older than controls who had not (page 325, bottom of col. 2) indicates
frequency matching by age did not "work."
e. The absence of an association between age and breast cancer in tables 1 and 2
is likely to be a reflection of selection bias from the low response rates for
cases and controls.
f. In postmenopausal women there appears to be a "dose response"
relationship between body mass index and the association between having
been breastfed.
g. A case-control study design is often the design of choice in outbreak
investigations.
h. For a factor under study to be considered an effect modifier it must be an
independent risk factor for the outcome of interest.
13. A list of control variables for use in the logistic regression models appears on page
325, middle of column 2. These variables have been chosen because they (choose one
best answer): (2 pts)
A. are likely to be associated with breast cancer risk in the bottle-fed women.
B. are known or suspected risk factors for breast cancer, or at least proxies for
such factors
C. are likely to be associated with infant feeding history in the controls
D. are likely to be associated with infant feeding history in the cases
a. An association between breast cancer risk and having zero pregnancies. Use
> 3 pregnancies as a reference.
b. An association between having been breastfed and being over 165 cm in
height. Use <160 cm as a reference.
c. An association between breast cancer and having been breastfed, overall.
15. On page 326, 2nd column, the authors state "As shown in Table 3, the risk of breast
cancer associated with having been breastfed, was about 0.7 for both pre- and
postmenopausal women." In this context, to which of the following epidemiologic
measures does the term "risk" refer? Choose one best answer. (2 pts)
A. Cumulative incidence
B. Incidence density
C. Attributable risk
D. Odds ratio
16. Using the data in Table 3, estimate AND state the meaning of the following
measures (for this question you may ignore the possibility of selection bias in cases
and controls):
a. Attributable Risk Proportion (ARP) for NOT having been breastfed for all
breast cancer (both premenopausal and postmenopausal breast cancer,
combined). Note that an ARP is also known as the etiologic fraction in the
exposed. (3 pts)
b. Population Attributable Risk Proportion (PARP) for NOT having been
breastfed for premenopausal and for postmenopausal breast cancer,
separately (i.e., 2 PARP's). Note that the PARP is also known as the etiologic
fraction. (4 pts)
c. Why would you or would you not expect the PARP to be different for
premenopausal breast cancer compared to the PARP for postmenopausal
breast cancer case in this investigation (part b)? (2 pts)
17. In the multiple logistic model referred to as Model 2 in Table 3, what was the
coefficient for the variable not-having-been-breastfed among all breast cancer cases?
(2 pts)
Which of the following assumptions is involved in that model? Indicate True or False
for each assumption. (1 pt each)
a. The odds of breast cancer vary as the product of the odds for age and the
odds for education.
b. The odds of breast cancer vary as the sum of the odds for age and the odds for
education.
c. Age, education, and not having been breastfed were independent of (i.e.,
uncorrelated with) each other.
d. Breast cancer is a rare disease.
18. Suppose that cases who refused to participate in this study were less likely to have
been breastfed as infants than those who participated in the study. Which of the
following best describes what this fact would imply for the observed relative risk
associated with being breastfed compared with what would have been observed had
all persons participated in the study? (choose one best answer). (2 pts)
A. the observed relative risk would be biased away from the null.
B. the observed relative risk would be subject to selection bias and the direction
of the bias can not be estimated.
C. the observed relative risk would be biased toward the null.
D. the observed relative risk would be subject to misclassification bias and the
direction of the bias can not be estimated.
19. In table 3, the confidence intervals for the OR's for all women do not include the
value 1.0, whereas all but one of the OR's for premenopausal breast cancer and
postmenopausal breast cancer do. Mathematically, what does this pattern reflect? (2
pts)
20. On page 324, 2nd column, the authors offer a possible explanation of why two
previous studies of breastfeeding and breast cancer found little crude association,
observing that the result may have been "confounded by a failure to adjust for age,
because of cohort effects with regard to breastfeeding frequency". The following
stratified analysis has been constructed to illustrate a situation where cohort effects
with regard to breastfeeding completely obscure a true protective association seen
when age is controlled.
Breastfed 24 67
Bottlefed 81 36
22. Use the data from Table 2 (Distribution of Characteristics of Postmenopausal Cases
and Controls) to draw separate 2 x 2 tables for women who have had: 0 pregnancies,
1-2 pregnancies, and >=3 pregnancies. (5 pts)
23. A hypothetical cross-sectional ancillary study to this report was conducted. In that
study a survey of breast cancer annual incidence rates in geographically distinct
areas was completed. Region A was in the upper Midwest, where breast cancer mortality is
high, and Region B was in the Southeast, where mortality from breast cancer is low. The
following data were obtained.
Region A Region B
Crude 2.9
Compute the following (for adjusted rates use the direct method and the total
population as a standard):
24. Write a brief statement for or against a causal relationship between breastfeeding in
infancy and risk of breast cancer as an adult. Comment specifically on at least two of
Bradford Hill's criteria for causal inference. Include in your comments data or
statements from the article. (5 pts)
25. Assuming that this relationship is causal, why might a similar study, 50 years from
now, fail to find as strong a relationship? (2 pts)
Format 8/4/2000 vs
9. A. Kappa coefficient
10. Table:
Biomarker validation of women's self-report of having been breastfed
                               Yes     No    Total
                            ------------------------
Self-report  Breastfed          70     26      96
             Not breastfed      80     28     108
                            ------------------------
             Total             150     54     204
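As an illustration of the kappa coefficient named in answer 9, the sketch below computes Cohen's kappa for the 2 x 2 counts in the table above. It assumes the unlabeled Yes/No columns are the biomarker result and the rows are the self-report; the function name `cohen_kappa` is illustrative and not part of the exam.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    a = row yes / column yes, b = row yes / column no,
    c = row no / column yes,  d = row no / column no.
    """
    n = a + b + c + d
    observed = (a + d) / n  # proportion of concordant pairs
    # Agreement expected by chance from the row and column margins
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


# Counts from the table: self-report (rows) by assumed biomarker result (columns)
kappa = cohen_kappa(70, 26, 80, 28)
print(f"kappa = {kappa:.3f}")
```

With these counts the observed agreement (98 / 204) is slightly below the chance-expected agreement, so kappa comes out close to zero.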
11. a. Table:
Adult breast cancer by having been breastfed as an infant,
among premenopausal women with education beyond high school
b. Table:
d. False - The matching caused cases and controls to have the same
age distribution, so it did "work"; matching would not be expected
to eliminate an association between age and the exposure, since
exposure status was not known when controls were being selected and
in any case would not have been used in the matching procedure.
From Table 2:
Body mass index           Cases                       Controls
(kg/m sq.)       Breastfed  Not breastfed    Breastfed  Not breastfed
---------------  ---------  -------------    ---------  -------------
16-22                48          15              89          19
23-27               103          26             125          16
>27                  90          17              91          16
To show the details, here is a table for estimating OR's for body mass index and breast
cancer:
and the resulting OR's are [e.g., (90 * 89) / (48 * 91) = 1.83]:
The OR's in the total column are shown to illustrate that in this
case there is some confounding by breastfeeding history, at body
mass index level 23-27 kg/m sq. Within either breastfed or not
breastfed group there is no "dose-response" relationship.
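The odds ratio arithmetic for body mass index can be sketched as follows, using the Table 2 counts listed above and the lowest category (16-22) as the reference, as in the worked example (90 * 89) / (48 * 91). The dictionary layout and function name are assumptions made for illustration.

```python
def odds_ratio(cases_level, controls_level, cases_ref, controls_ref):
    """OR for one exposure level versus the reference level."""
    return (cases_level * controls_ref) / (cases_ref * controls_level)


# (cases, controls) by body mass index level, within each breastfeeding group
table2 = {
    "breastfed": {"16-22": (48, 89), "23-27": (103, 125), ">27": (90, 91)},
    "not breastfed": {"16-22": (15, 19), "23-27": (26, 16), ">27": (17, 16)},
}

bmi_or = {}
for group, levels in table2.items():
    ref_cases, ref_controls = levels["16-22"]  # reference category
    for level, (cases, controls) in levels.items():
        bmi_or[(group, level)] = odds_ratio(cases, controls, ref_cases, ref_controls)

for key, value in bmi_or.items():
    print(key, round(value, 2))
```

Comparing the stratum-specific values across the three levels is one way to check for a "dose-response" pattern within each group.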
13. Potential confounders are factors that are known or suspected risk
factors for breast cancer or its detection, or at least proxies for
such factors.
OR = (50 x 216) / (38 x 167) = 1.7 (for zero vs. >= 3 pregnancies)
Not breastfed 41 25 66
----------------------------------
Total 189 208 397
> 165 vs. all others: OR = (148 x 68) / (396 x 41) = 0.62
16. a. Estimate RR for Not breastfed as 1/OR for Breastfed: 1 / 0.69 = 1.45
b. If you know the formula (or can derive it from the diagram and the
"grand synthesis"):
PARP = P(E|D) x (RR - 1) / RR, and since breast cancer is rare, use OR.
Premenopausal: [117 / (117 + 112)] x (1.47 - 1) / 1.47 = (0.51)(0.47) / 1.47 = 0.16
AND
Postmenopausal: [58 / (58 + 241)] x (1.45 - 1) / 1.45 = (0.19)(0.45) / 1.45 = 0.06
Proportion of exposed (Not breastfed) cases that are attributable to not having been
breastfed is:
ARP = (RR-1)/RR
Since breast cancer is rare, we can estimate with
(OR-1)/OR = (1.47-1) / 1.47 = 0.3197 for postmenopausal.
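The ARP and PARP formulas above can be checked with a short sketch. It uses the odds ratio in place of the relative risk, as the answer does, and the counts and OR values are those quoted in the answer; the function names are illustrative only.

```python
def arp(rr):
    """Attributable risk proportion among the exposed: (RR - 1) / RR."""
    return (rr - 1) / rr


def parp(p_exposed_among_cases, rr):
    """Population attributable risk proportion: P(E|D) x (RR - 1) / RR."""
    return p_exposed_among_cases * (rr - 1) / rr


# Proportion of cases not breastfed, from the counts quoted in the answer
parp_pre = parp(117 / (117 + 112), 1.47)   # premenopausal
parp_post = parp(58 / (58 + 241), 1.45)    # postmenopausal
arp_value = arp(1.47)

print(round(parp_pre, 2), round(parp_post, 2), round(arp_value, 4))
```

The two PARP values differ mainly because the proportion of cases who were not breastfed differs between the groups, while the odds ratios are nearly the same.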
17. Logistic model coefficients for risk factor variables are natural
logarithms of odds ratios per one unit change in the variable.
So the coefficient was ln(0.70) = -0.3567
Assumptions:
a. True - The odds of breast cancer vary as the product of the odds
for age and the odds for education.
b. False - Only in a few special cases will the product of two odds
equal their sum (e.g., both odds equal zero or both odds equal two).
The logistic model is additive in the logit (logarithm of odds),
multiplicative in the odds.
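A minimal sketch of the coefficient-to-odds-ratio conversion and of the "additive in the logit, multiplicative in the odds" property follows. The OR of 0.70 is taken from the answer; the second odds ratio (1.5) is a made-up value used only to show the property.

```python
import math

# Coefficient for the breastfeeding variable: natural log of the odds ratio
or_breastfed = 0.70
coef_breastfed = math.log(or_breastfed)
print(round(coef_breastfed, 4))

# Converting back: OR = exp(coefficient)
assert math.isclose(math.exp(coef_breastfed), or_breastfed)

# Hypothetical second factor, for illustration only
or_other = 1.5
coef_other = math.log(or_other)

# Adding coefficients on the logit scale multiplies the odds ratios
joint_or = math.exp(coef_breastfed + coef_other)
print(round(joint_or, 2), round(or_breastfed * or_other, 2))
```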
18. C. The observed relative risk would be biased toward the null.
20.
AGE < 60 AGE > 60 TOTAL
----------------------------------------------------
Breast Bottle Breast Bottle Breast Bottle
------ ------ ------ ------ ------ ------
Cases 24 40 256 100 280 140
Region A Region B
Cases Population Rate/1000 Cases Population Rate/1000
< High School Education
Age
40-50 10 7,000 1.4 10 15,000 0.7
51-60 15 10,000 1.5 20 5,000 4.0
61-65 30 3,000 10 600 55,000 10.9
Crude 2.9
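The direct method asked for in question 23 can be sketched as follows. This uses only the three age rows that survive in this excerpt (the "< High School Education" stratum), so the results are illustrative and are not expected to reproduce the crude rate of 2.9 or the full answer key. The standard population is the two regions combined, as the question specifies.

```python
# (cases, population) by age group for each region, from the rows above
strata = {
    "40-50": {"A": (10, 7_000), "B": (10, 15_000)},
    "51-60": {"A": (15, 10_000), "B": (20, 5_000)},
    "61-65": {"A": (30, 3_000), "B": (600, 55_000)},
}


def directly_standardized_rate(region, per=1000):
    """Weight each age-specific rate by the combined (A + B) population."""
    expected_cases = 0.0
    standard_total = 0
    for counts in strata.values():
        standard_pop = counts["A"][1] + counts["B"][1]
        cases, population = counts[region]
        expected_cases += (cases / population) * standard_pop
        standard_total += standard_pop
    return per * expected_cases / standard_total


adj_a = directly_standardized_rate("A")
adj_b = directly_standardized_rate("B")
print(round(adj_a, 2), round(adj_b, 2))
```

Because both regions are weighted by the same age distribution, any remaining difference between the adjusted rates is not due to age structure.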
25. Assuming that this relationship is causal, why might a similar study,
50 years from now, fail to find as strong a relationship? (2 pts)
_____________________________________________
1. Briefly state the primary study question of this report. Identify the
main exposure and outcome of interest. (3 pts)
___________________________________________________________
A. Active surveillance
B. Ongoing cross-sectional survey
C. Passive surveillance
D. Follow up study of dynamic population
5. This study determined exposure and outcomes using data from "a list of
all members of the agricultural community who were certified to apply
restricted-use pesticides in 1991" (p. 394-methods) and from "all in-
wedlock live births recorded in the state for the years 1989 through
1992" (p. 394-methods). Briefly assess the strength of these data
sources in establishing the temporal sequence of pesticide exposure
and birth defects and provide support for your assessment. (4 pts)
7. The use of the term "rate" is not an infallible guide to the specific
epidemiologic measure being presented. Which one of the following
epidemiologic measures best characterizes the measure that the authors
refer to as the "rate of anomalies per 1000 live births" (Table 2 -
footnote)? Choose one best answer. (4 pts)
A. pesticide appliers had 1.37 times more births with anomalies than
did the general population.
B. pesticide appliers had more children with birth anomalies than did
the general population.
___________________________________________________________________
11. Using the data presented in Table 1, recalculate the crude odds ratio
for all births with anomalies assuming that all musculoskeletal birth
anomalies occurring among those with maternal age greater than 30 and
the "other" anomalies among maternal age > 35 were later found to
actually have occurred among persons incorrectly classified as
appliers. Explain what implications this new calculation would have
on the conclusions of the study. (3 pts)
___________________________________________________________________
12. It is possible that the pesticides examined in this study might have
reduced fecundity or increased the proportion of conceptions not
resulting in live births. Assume that both of these effects (lower
fecundity, more spontaneous abortions, and more still births) have in
fact occurred in the pesticide applier population studied here, so
that the number of live births to pesticide applier fathers is smaller
than it would have been in the absence of pesticide exposure. Which of
the following statements is (are) TRUE and which is (are) FALSE? (2
pts each)
TRUE FALSE
____ ____ A. Since all births would be affected equally, effects on
fecundity and spontaneous abortion WOULD NOT have influenced
the size of the odds ratio presented in this study. [This
question is problematic.]
____ ____ B. If pesticides were equally likely to cause fetal loss and birth
anomalies, then the odds ratios would strongly understate the
harmful effects of pesticides.
13. Table 4 shows the frequency per 1000 births of major anomalies for the
general population by region. Which of the following best describes
the study design from which these data were obtained. (4 pts)
A. ecologic study
B. prospective cohort study
C. retrospective cohort study
D. region-specific case control study
14. The authors begin their discussion section by stating that this report
"is an initial step in the evaluation of the possible relationships
between the frequency of birth anomalies and pesticide use". They
conclude, however, by saying that these data "signify a clear-cut need
for comprehensive examination of the health issues involved". This
latter statement seems to indicate that the authors suspect a causal
relationship. Identify and describe three criteria for causal
inference for which at least some information is present in the
article. Give specific examples from the article to support your
selection. (9 pts)
___________________________________________________________________
15. Suppose that after this publication came out, another study was
conducted in Illinois to investigate the hypothesis that birth defects
occurred more often in Illinois as compared to Minnesota. However,
in this new study the authors thought that the type of water consumed
could be related to birth defects. They wanted to adjust
(standardize) the rates of defects in the two states for water type.
Data from the two studies are compared as below.
a. Calculate the crude rate and the water-type specific rates for
Illinois. Briefly describe how these two states compare in crude
rates of birth anomalies. (4 pts)
17. Which of the following statements about the present study are (is)
TRUE and which are (is) FALSE. Indicate TRUE or FALSE for each
statement. (2 pts each)
TRUE FALSE
____ ____ A. Subjects used in the analyses for Table 1 of this study were
selected on the basis of their exposure status.
____ ____ C. The age-adjusted odds ratio for all birth anomalies of 1.41 is
considered a modest association.
____ ____ D. Since birth defects of these types are rare in the general
population, a cohort study could be designed to efficiently
examine further the relationship of pesticides and birth
anomalies.
4. C. Passive surveillance
10A. Since the question does not specify absolute or relative impact, either
attributable risk (AR) or attributable risk proportion (ARP) is correct
(actually, attributable prevalence, but the term attributable risk is
typically applied to rates and prevalences as well as risks).
Meaning: 7.2 births with anomalies per 1000 live births fathered by
pesticide appliers are attributable to pesticide exposure.
(Note: small differences among the results from the various methods are
primarily due to the fact that the OR of 1.37 has been rounded to fewer
significant digits than are the prevalences computed above.)
12. A. False - there is no basis for assuming that all births would be
affected equally.
15. This question underwent a revision to simplify it, but unfortunately some
parts of the previous version remained. The columns labelled
"# live births" should have included the qualifier "Normal", and the
rates for Minnesota needed to be re-computed accordingly. Due to this
problem, two alternate solutions are completely acceptable, one in which
the denominators are the numbers in the "# live births" column and one in
which the denominators equal the sum of these numbers plus the numbers of
births with anomalies. In addition, full credit is given if the rates
for Minnesota were recomputed. Here is the version in which the stated
rates were used and the # of live births column was treated as if it
meant "Total live births":
16. Yes - it is not clear from these data whether birth anomalies occurred
in people with or without exposure because exposure information was
based on group data.
17. A. False - subjects were selected from birth records for live births
B. False
C. True
D. False
E. False
F. True - (however, a correlation coefficient indicates the extent of
association in the sense of two variables moving in tandem; it does
not indicate the strength of association in the epidemiologic sense
of how great a change occurs in the response variable for a change
of a given size in the exposure variable)
G. True
19. Points in favor of action at this time are the evidence that the
relationship is causal (biological plausibility, consistency between
results of ecologic [by crop-region] and individual-based [pesticide
applier] analyses, pattern of findings [season of conception], and
consistency across several epidemiologic studies) and the high
attributable risk percent (27%) among babies with birth anomalies born
to pesticide applier couples. In addition, the substantially
increased prevalences of birth anomalies among all live births in
county clusters with high use of chlorophenoxy herbicides/fungicides
(Table 4), consistent across the four regions, suggest that anomalies
due to pesticides (assuming that the relationship is causal) occur
throughout areas where these pesticides are used. Even though the
population attributable risk proportion is very small (about 1%) for
exposure due to being a pesticide applier, the proportion of all
Minnesota birth anomalies potentially attributable to residence in a
county cluster with high pesticide use is 27% [overall prevalence of
birth anomalies for all Minnesota in-wedlock births was 3791 / 183,721
= 20.63 per 1000 live births (Table 1), prevalence of birth anomalies
in low-pesticide county clusters ("unexposed") was 15 per 1000 (Table
4), so PARP = (PCrude - P0) / PCrude = (20.63 - 15) / 20.63 = 0.27].
The effects seem to be strongest for chlorophenoxy pesticides,
suggesting that at least this category should be restricted.
Moreover, there are powerful arguments for reducing pesticide use for
environmental reasons as well.
Against taking action other than continuing research are that the
evidence is still not very strong (biological mechanisms not yet
elucidated, relationship is not highly specific, epidemiologic studies
limited and not entirely consistent, experimental evidence not
available), the potential impact on agriculture and therefore food
prices is considerable, and the costs to industry and commerce from
restrictions on a major product are substantial. Moreover, the
relative weakness of the odds ratios (below 2.0) indicates a
significant possibility that other factors could be responsible for
the increase in birth anomaly prevalence seen in association with
pesticide exposure, a possibility whose investigation requires better
data on exposure and other factors that may lead to birth anomalies.
Grading of this question is based on the clarity and support for your
evaluation and recommendation.
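The population attributable risk proportion quoted in the answer to question 19 can be reproduced with a short sketch. The counts and the 15 per 1000 "unexposed" prevalence are those stated in the answer; the variable names are illustrative.

```python
births_with_anomalies = 3791
live_births = 183_721

# Overall (crude) prevalence per 1000 live births (Table 1 figures in the answer)
p_crude = 1000 * births_with_anomalies / live_births

# Prevalence in low-pesticide county clusters, per 1000 (Table 4 figure in the answer)
p_unexposed = 15.0

# PARP = (PCrude - P0) / PCrude
parp = (p_crude - p_unexposed) / p_crude
print(round(p_crude, 2), round(parp, 2))
```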
10/21/96, 10/7/97 - wr:eml/vs \ mepid168\ exams 1996 Midterm exam - answers rev.
University of North Carolina
School of Public Health
Department of Epidemiology
Fundamentals of Epidemiology (EPID 168)
Victor J. Schoenbach and Wayne D. Rosamond
NOTE: For simplicity, ignore the requirement that this study was
restricted to those persons with a telephone number.
A. Manifestational criteria
B. Causal criteria
D. Neither
A. Selection bias
B. Prevalence-incidence bias
C. Information bias
D. Surveillance bias
A. Without the exclusion the odds ratio would be closer to the null.
8. This study uses a case control design with a population based control
group. Which of the following, in general, is a strength of this
design. (Choose one best answer) (3 pts)
A. Nominal
B. Ordinal
C. Interval
D. Ratio
10. Control for age in the analyses presented in Table 2 was accomplished
through which of the following methods? (Choose one best answer)
(3 pts)
12. The authors state on page 49 that after controlling for smoking, the
relative risk for Crohn's disease among men was 1.9 for a high
consumption of sucrose and 0.7 for a high consumption of fiber. Briefly
explain why based on these data the authors state that smoking did not
confound these associations. (3 pts)
High Low
Fast foods
1+ times/wk 12 10 8 14
14. In the discussion (page 50), the authors state that "if the change in
diet is the same in cases as in controls, then the relative risk
estimates would be biased toward unity". This is an example of which of
the following? (Choose one best answer) (3 pts)
A. Non differential misclassification bias
B. Non differential selection bias
C. Differential information bias
D. Differential misclassification bias
15. This article does not present p-values, yet it reports 95% confidence
intervals for all odds ratios. Which of the following best describes
what information a confidence interval conveys that a p-value does not.
(Choose one best answer) (3 pts)
17. Briefly present the evidence for or against the role of fiber as a
confounder of the association of sucrose intake and Crohn's disease. (3
pts)
18. Suppose a follow-up to this study was done to estimate the rate (per
10,000 person years) of ulcerative colitis among a large sample in the
Swedish population. The table below summarizes the results.
19. This study did not differentiate between caffeinated and decaffeinated
coffee. Using the data presented in Table 4 and applying the
assumptions below, calculate the odds ratio (heavy versus no use)
associated with caffeinated coffee consumption and determine if it is
protective against ulcerative colitis. Describe in 2 sentences or less
the interpretation of this new odds ratio, ignoring issues of random
error. (4 pts)
Assumptions:
1. 20% of the heavy coffee drinkers (≥3 cups per day) among cases drink
only decaffeinated coffee.
20. Which of the following variables was NOT in the multiple logistic model
that was used to estimate the relative risk for sucrose intake in
relation to ulcerative colitis in women? (Choose best answer) (3 pts)
A. Age
B. Gender
D. Ulcerative colitis
21. In the multiple logistic model that yielded the relative risk estimate
of 0.7 for Ulcerative colitis in relation to daily vegetable consumption
(Table 4), what was the value of the coefficient for the vegetable
consumption variable assuming that it was coded as 1=daily, 0=less
frequently? Write the conversion equation of coefficient to relative
risk estimate. (3 pts)
22. Assume that the population of Stockholm County in the age range covered
by this study was 1,000,000 in 1980 and remained constant throughout the
decade. What was the average annual incidence of hospital-diagnosed
Crohn's disease during that period regardless of when their medical
record became available? (3 pts)
23. Using the data in Table 2, for which of the following two associations
is there more of an indication of confounding by age and total energy
intake in WOMEN? Support your answer with relevant data and/or
computations. (3 pts)
24. Briefly state one major strength and one major limitation of this study
(2 pts)
25. List two Bradford Hill criteria for evaluating whether dietary sucrose
intake is causally related to inflammatory bowel disease. Evaluate each
using specific facts from the article. (4 pts)
26. Which of the following statements about the data in Tables 1 and 2 are
TRUE and which are FALSE (answer TRUE or FALSE for each statement). (2
pts each)
d. The proportion of controls with high dietary fat intake was higher
for men than for women.
27. A Swedish friend of yours who lives in Stockholm has an identical twin
sister who is anything but identical in terms of her diet. Your friend,
as other health conscious Swedes, avoids fast foods and soft drinks, and
eats whole grain bread and muesli-type cereals daily. Her twin sister,
like many Swedes, often consumes fast foods and soft drinks, but never
touches whole grain bread or muesli.
Your friend comes to visit with you over the holidays, and while you are
sleeping late one morning she comes across your class notes from EPID
168. At breakfast, where she has been busily scribbling on her napkin,
she asks you this question.
"Suppose that fast foods, soft drinks, whole grain bread, and muesli-
type cereal affect Crohn's disease risk independently, and that I can
ignore other risk factors. Suppose also that the excess risks are
additive. Is my twin sister's risk of Crohn's disease 10 times my own?"
She shows you how she used the information in Table 3 to obtain that
estimate:
She goes on to explain "(3.4 -1) is the excess risk from fast foods, and
((1/0.4) - 1) is the excess risk from eating bread that is not whole
grain."
Even though you're not quite fully awake, you feel justifiable pride in
your command of epidemiologic concepts and explain to her the one big
mistake she has made. You say, " . . . ". Write a brief statement of
what you would say. (4 pts)
2. A. Manifestational criteria
3. C. Information bias
9. D. Ratio (The response scale for each item was ordinal, but in order to
create the total energy variable the authors had to convert each
response into calories.)
11. The odds ratio for 80 to 104 grams per day was 1.4 and for intakes of
greater than 105 grams per day the odds ratio was 1.3. This suggests a
tendency for cases to have a greater proportion of high fat eaters than
controls. However, the confidence intervals are broad, extending as low
as 0.4 and 0.6. Furthermore there is no suggestion of a dose response.
This is at most weak evidence of a relationship between fat intake and
ulcerative colitis.
12. a. The crude (with respect to smoking) and adjusted odds ratios are the
same. If smoking had been a confounder in the relationship between
sucrose and Crohn's disease or between fiber and Crohn's disease the
adjusted odds ratio would have been meaningfully different from the
values in Table 2.
13. a. Odds ratios: Crude = (24 x 285) / (20 x 128) = 6840 / 2560 = 2.7
among High education = (10 x 150) / (12 x 100) = 1.3
among Low education = (14 x 135) / (28 x 8) = 8.4
b. The stratum-specific odds ratios are quite different from each other,
suggesting some degree of effect modification. The crude odds ratio
is within the range of the two stratum-specific odds ratios, which
suggests that education is not so much a confounder as an effect
modifier.
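The crude and stratum-specific odds ratios in answer 13 can be recomputed with the sketch below. The cell counts are those appearing in the answer's own expressions; the function name and the a/b/c/d cell labels are assumptions for illustration.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio (a x d) / (b x c) for a 2x2 table."""
    return (a * d) / (b * c)


# Cells as used in the answer: OR = (a x d) / (b x c)
high_education = dict(a=10, b=12, c=100, d=150)
low_education = dict(a=14, b=8, c=28, d=135)

# The crude table is the cell-by-cell sum of the two strata
crude = {k: high_education[k] + low_education[k] for k in "abcd"}

or_crude = odds_ratio(**crude)
or_high = odds_ratio(**high_education)
or_low = odds_ratio(**low_education)
print(or_crude, or_high, or_low)
```

A crude value that lies between two very different stratum-specific values is the pattern the answer reads as effect modification rather than confounding.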
17. The authors state that sucrose and fiber intake could be associated with
one another as well as with Crohn's disease and thus each factor might be
a confounder of the associations between Crohn's disease and the other
("mutual confounding"). The odds ratio was 2.6 for a high sucrose intake
(bottom page 48). When adjusted for fiber the sucrose odds ratio changed
only slightly to 2.5. Therefore, fiber was at most a slight confounder of
the sucrose and Crohn's disease relationship.
18. a. Under the additive model, we expect the joint excess rate of the two
factors will be equal to the sum of the excess rate from each factor
separately. The additive model can also be written in terms of rates:
expected rate of ulcerative colitis with both daily soft drink and ≥2
fast foods per week = rate (daily soft drinks, without fast food) +
rate (less freq. soft drink, ≥2 fast food per week) - rate (neither).
Under the multiplicative model, we expect the joint rate ratio of the
two factors to be equal to the product of the rate ratios for each
factor separately. In the above notation, the model can be expressed
as: R1,1 = (R1,0 x R0,1)/R0,0. This equation expressed with numbers
from the tables is: (9.1 x 6.8) / 3.7 = 16.7. The observed rate is
18.0. The close agreement for the observed joint rate and that
expected under the multiplicative model suggests that the relationship
among daily soft drink consumption, frequent fast food exposure, and
ulcerative colitis is closer to multiplicative than to additive.
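The comparison of the two joint-effect models in answer 18 can be sketched as follows, using the rates per 10,000 person-years quoted in the answer (9.1, 6.8, 3.7 and the observed joint rate of 18.0). Which single-exposure rate belongs to soft drinks and which to fast food is not recoverable from this excerpt, so the variable names below simply follow the R1,0 / R0,1 notation.

```python
r_00 = 3.7    # rate with neither exposure
r_10 = 9.1    # rate with the first exposure only
r_01 = 6.8    # rate with the second exposure only
r_11_observed = 18.0

# Additive model: excess rates add, so R11 = R10 + R01 - R00
r_11_additive = r_10 + r_01 - r_00

# Multiplicative model: rate ratios multiply, so R11 = (R10 x R01) / R00
r_11_multiplicative = (r_10 * r_01) / r_00

print(round(r_11_additive, 1), round(r_11_multiplicative, 1), r_11_observed)

# The model whose expected joint rate is nearer the observed rate fits better
closer = min(
    ("additive", r_11_additive),
    ("multiplicative", r_11_multiplicative),
    key=lambda item: abs(item[1] - r_11_observed),
)[0]
print(closer)
```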
19. odds ratio for ≥3 cups caffeinated coffee = (56 x 36) / (18 x 36) = 3.1
Heavy caffeinated coffee drinking now appears to be a risk factor for
Ulcerative colitis where before coffee drinking appeared to be
protective. An alternative approach would be to include the
decaffeinated coffee drinkers in the "No" (caffeinated) coffee group.
Under this model the odds ratio for ≥3 cups caffeinated coffee, relative
to none or only decaffeinated = (56 x 201) / (50 x 18) = 12.5
22. 236 cases / 5,000,000 person years = 4.72 cases/100,000 person years.
Full credit was given for 236 cases / 4,000,000 person years = 5.9 cases
/ 100,000 per year. Note that the incidence is obtained from all cases
(or at least all confirmed cases), rather than from only consenting
cases.
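The incidence calculation in answer 22 is cases divided by person-years. A minimal sketch follows, using the question's assumed constant population of 1,000,000 and the two person-year denominators given in the answer (5,000,000 and 4,000,000, i.e. 5 or 4 years of observation); the function name is illustrative.

```python
def incidence_per_100k(cases, population, years):
    """Average annual incidence per 100,000 person-years."""
    person_years = population * years
    return 100_000 * cases / person_years


rate_5yr = incidence_per_100k(236, 1_000_000, 5)
rate_4yr = incidence_per_100k(236, 1_000_000, 4)
print(round(rate_5yr, 2), round(rate_4yr, 2))
```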
For disaccharides:
Crude OR = (30 x 66) / (35 x 45) = 1.26, versus adjusted OR of 1.2
26. a. F
b. F
c. T
d. F
27. Models of joint effects combine effects of "pure" exposures, i.e., in the
absence of other exposures. But the excess risk for each food item in
Table 2 is estimated without controlling for the effects of others. For
example, since people who eat fast foods are also likely to take soft
drinks and not to eat whole grain bread, the relative risk estimates for
fast food 2+ times/week probably already reflect frequent soft drink
consumption and low whole grain bread consumption. In order to add up
the excess risk for each food item, we need to know the excess risks for
exposure to that item in the absence of the others.