
Computerized Medical Imaging and Graphics 31 (2007) 224–235

Current status and future directions of computer-aided diagnosis in mammography


Robert M. Nishikawa
Carl J. Vyborny Translational Laboratory for Breast Imaging Research, Department of Radiology and Committee on Medical Physics, The University of Chicago, 5841 S. Maryland Avenue, MC-2026, Chicago, IL 60637-1463, United States

Abstract

The concept of computer-aided detection (CADe) was introduced more than 50 years ago; however, only in the last 20 years have there been serious and successful attempts at developing CADe for mammography. CADe schemes have high sensitivity but poor specificity compared to radiologists. CADe has been shown to help radiologists find more cancers, both in observer studies and in clinical evaluations. Clinically, CADe increases the number of cancers detected by approximately 10%, which is comparable to double reading by two radiologists.

© 2007 Elsevier Ltd. All rights reserved.
Keywords: Breast cancer; Mammography; Computer-aided diagnosis; Computer-aided detection; Observer studies; Clinical evaluation; ROC analysis

1. Introduction

Breast cancer is a major killer of women in the United States and in many other parts of the world. Each year approximately 41,000 women die from breast cancer in the United States, and 213,000 women are diagnosed with breast cancer [1]. Screening of asymptomatic women by mammography has led to a reduction in breast cancer mortality. Several randomized, controlled screening studies have shown an overall decrease in breast cancer mortality of up to 30% [2–4]. Further, using mathematical modeling, Berry et al. have shown that the recent decrease in breast cancer mortality in the United States is due equally to screening with mammography and to better treatment [5]. The detection and diagnosis of breast cancer with mammography comprise two steps. The first is asymptomatic screening, in which suspicious areas in a mammogram are identified. The second is diagnostic mammography, in which women with an abnormal screening mammogram or some physical or clinical abnormality (e.g., a palpable lump) receive special-view mammograms (e.g., magnification views or spot-compression views) and possibly ultrasound and MRI. The goal

of obtaining a diagnostic mammogram is to determine whether a woman should have a biopsy.

1.1. Screening mammography

Mammography, although effective as a screening tool, has limitations. On a screening mammogram, cancers can be missed (a false-negative mammogram), and non-cancerous lesions can be mistaken for cancer, leading to a false-positive mammogram. Depending on how the true cancer status of a woman is determined, the miss rate in mammography can be nearly 50% [6]. Retrospective analyses of missed cancers [7–13] indicate that approximately 60% are visible in retrospect, although in some cases the cancer may be very subtle [12]. These studies also show that approximately 30% of cancers are not visible in retrospect. In many of these cases, the cancer is not visible because normal tissue above and below it camouflages the cancer: a mammogram is a 2D image of the 3D breast, so the superposition of tissue can hide cancers. The superposition of tissues can also produce patterns in the mammogram that look suspicious to a radiologist. As a result, between 5 and 15% of screening mammograms are read as abnormal [14], even though the prevalence of cancer in the screening population is typically 0.5%. To address the superposition problem, two new 3D X-ray imaging techniques for the breast are being developed: computed tomography (CT) [15–17] and digital breast tomosynthesis [18]

Financial disclosure: Robert M. Nishikawa has a research agreement with Eastman Kodak Company and is a shareholder in Hologic, Inc. Both he and the University of Chicago receive research funding and royalties from Hologic, Inc. Tel.: +1 773 702 9047; fax: +1 773 702 0371. E-mail address: r-nishikawa@uchicago.edu. doi:10.1016/j.compmedimag.2007.02.009



(DBT). These techniques produce slices, typically 1 mm or less in thickness, that can be stacked to produce a 3D image of the breast. Compared to CT, which can have isotropic resolution, DBT has superior resolution within a slice but much poorer resolution in the direction perpendicular to the slice. Whereas CT collects images over at least a complete 360° arc, DBT collects images over only 60° or less, leading to a loss of spatial resolution in the direction perpendicular to the detector. One drawback of both techniques is the large number of images that a radiologist must review. For example, in DBT there can be as many as 80 slices, each having roughly the same information content as a standard mammogram. In this situation, CADe may be useful in helping radiologists handle the large amount of data [19,20].

1.2. Diagnostic mammography

When a suspicious lesion is found on a screening mammogram, or the patient has some physical symptom (e.g., a palpable lump), diagnostic mammography is performed. On diagnostic mammograms, benign lesions are often difficult to distinguish from cancers, and thus a cancer can be misinterpreted as a benign lesion. Clinically, differentiating benign from malignant lesions is a difficult task. In the USA, the positive predictive value (PPV) for diagnostic breast imaging is generally less than 50%. The PPV measures the percentage of all breast biopsies that are positive for cancer. Using data from the Breast Cancer Surveillance Consortium, Barlow et al. determined that the PPV based on 41,427 diagnostic mammograms was 21.8% [21]. Elmore et al., examining the results from eight large mammography registries (containing follow-up information on more than 300,000 screening mammograms), found that the PPV ranged from 16.9 to 51.8%, with a median value of 27.5% [22]. Thus, approximately three biopsies of benign lesions are performed for every biopsy of a malignant lesion. Unnecessary biopsies are physically and emotionally traumatic for the patient, costly to the health care system, and an unnecessary addition to the workload of radiologists, pathologists, and surgeons. Improving radiologists' PPV can therefore have a substantial positive effect on patient care and on the healthcare system. In addition, the interpretation of a mammogram is inherently variable because mammograms are read by human beings. There is both inter- and intra-reader variability among radiologists [23,24]. Furthermore, there are substantial differences between the performance of radiologists in Europe and that of radiologists in North America [14,22]. Computer-aided diagnosis (CAD) is being developed to address some of these limitations of mammography. Two different types of CAD systems are being developed: computer-aided detection (CADe), which can be used to help radiologists find breast cancer on screening mammograms, and computer-aided diagnosis (CADx), which can be used to help radiologists decide whether a known lesion on a diagnostic mammogram is benign or malignant. Note that here CAD refers to the whole field and comprises both CADe and CADx. There is good evidence that CADx systems may be useful for improving radiologists' PPV [25–28]. Nevertheless, in this paper, I will discuss only

CADe, giving a description of its current status and possible future directions. I will start with a brief description of the historical development of CAD in mammography.

2. Historical development

As early as 1955, Lee Lusted discussed automated diagnosis of radiographs by computers. In 1967, Fred Winsberg et al. published a paper in Radiology describing a CADx system in which the computer determined whether a lesion on a mammogram was malignant or benign [29]. By today's standards, the film digitization, computer power, and computer vision techniques of that time were very crude, and Winsberg's method was not successful. During the next few years, there were several unsuccessful attempts at automating both detection and diagnosis. Through most of the late seventies to the mid eighties, there was a period of inactivity, at least as reflected in publications. In the mid-eighties at the University of Chicago, Doi, Chan, Giger, MacMahon, and Vyborny began to investigate the concept called computer-aided diagnosis, which is different from the automated diagnosis of many earlier attempts. Their goal was not to replace radiologists, but to develop systems that might help radiologists render better clinical decisions. A breakthrough came with two studies. In the first, Getty et al. showed that a CADx system, whose input was a checklist that a radiologist used to characterize the features of a lesion, could improve radiologists' ability to predict whether a lesion was benign or malignant [25]. Theirs was not an automated system. The second was an observer study conducted by Chan et al. in which 15 radiologists read 60 mammograms, half of them containing a cluster of microcalcifications [30]. They showed that, by using a computer-aided detection scheme, which was completely automated, radiologists could find additional calcification clusters in a mammogram. These two studies opened the field to many new investigators and approaches for developing CADe and CADx algorithms. This has led to several observer studies that have shown the potential for computer-aided diagnosis to help radiologists not only in mammography, but in chest radiography and thoracic CT as well [31–35]. In 1998, the first commercial system received FDA approval. Another important milestone in terms of clinical implementation was approval for reimbursement in 2000 by Medicare and other health care payers. A timeline of these developments is shown in Fig. 1.

2.1. Computer-aided detection (CADe) algorithms

Many different techniques are used for developing a CADe scheme; they have been summarized in several review papers [36–41]. In addition, much of the CADe research has been presented at three main conferences, all of which publish proceedings: SPIE Medical Imaging, Computer-Assisted Radiology and Surgery (CARS), and the International Workshop on Digital Mammography. A digital image is the starting point for all techniques, although an optical computing method was proposed more than 10 years ago. The digital image may come from a full-field digital



Fig. 1. Timeline of CAD development.

mammography (FFDM) system, or it may be obtained by digitizing a screen-film mammogram. An FFDM image has properties that differ from those of a digitized screen-film mammogram (dSFM) in terms of response to X-ray exposure, contrast, spatial resolution, and noise. These differences, which are discussed below, are important when CAD algorithms are designed.

2.1.1. Linearity

An FFDM image is either linearly or logarithmically related to the exposure to the X-ray detector. A dSFM image has a sigmoidal relationship to the exposure to the X-ray detector, even though film digitizers are inherently linear. Screen-film (SF) systems are relatively insensitive at low X-ray exposures and saturate at high exposures, as shown in Fig. 2. A curve of pixel value versus log exposure, or versus exposure, to the detector is called a characteristic curve.

2.1.2. Contrast

Because of the non-linear response of the SF system, contrast is reduced at high and low exposures. The slope of the characteristic curve shown in Fig. 2 is proportional to the contrast in the image. For FFDM images that are linear, the inherent contrast of the system is constant at all exposures.

2.1.3. Spatial resolution

The spatial resolution of a digital image depends on two factors: the inherent resolution of the X-ray detector and the size of the pixels in the image. SF systems have higher spatial resolution than FFDM systems do. However, once an image is

digitized, the resolution difference can disappear. FFDM systems have pixel sizes between 0.05 and 0.1 mm. Commercial CADe systems use 0.05 mm pixels; however, in many of the systems reported in the literature, 0.1 mm pixels are used. For detection of clustered microcalcifications, the pixel size of the image affects performance. Chan et al. showed that, as the pixel size decreased from 0.105 mm down to 0.035 mm, the performance of their CADe scheme improved [42]. For detection of masses, pixel size is less important, because masses are typically 5 mm or larger in diameter. Therefore, for mass detection the images are usually downsampled to a pixel size of approximately 0.4 mm, which reduces memory requirements and computation time.

2.1.4. Noise

In FFDM, the image noise is proportional to the square root of the X-ray exposure to the detector. At low exposures, however, the electronic noise of the detector can be significant. This is true for a linear system. If the FFDM system records the log of the measured exposure, then the noise is proportional to the inverse of the square root of the X-ray exposure to the detector. In an SF system, the noise is proportional to the inverse of the square root of the detector exposure, but it is modified by the slope of the characteristic curve (shown in Fig. 2), so that it is decreased at both high and low exposures. In addition, the film digitizer adds noise to the digitized image, principally at high exposures. The film is dark at high exposures, so the amount of light transmitted through the film is low. As a result, the electronic noise of the film digitizer becomes significant, and the total noise in the image increases.
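To make the relationships in Sections 2.1.1–2.1.4 concrete, the following minimal Python sketch models the two detector responses and their exposure-dependent noise. The functional forms follow the text, but every constant (gain, density range, curve slope, electronic-noise level) is an illustrative assumption rather than a measured value.

```python
import numpy as np

def ffdm_pixel_value(exposure, gain=1000.0):
    """Linear FFDM response: pixel value proportional to detector exposure."""
    return gain * np.asarray(exposure, dtype=float)

def sf_pixel_value(exposure, d_min=0.2, d_max=3.5, gamma=3.0, e_mid=1.0):
    """Sigmoidal screen-film characteristic curve: optical density vs. log exposure.
    Insensitive at low exposure, saturating at high exposure (cf. Fig. 2)."""
    log_e = np.log10(np.asarray(exposure, dtype=float))
    return d_min + (d_max - d_min) / (1.0 + np.exp(-gamma * (log_e - np.log10(e_mid))))

def ffdm_noise(exposure, electronic=5.0):
    """Quantum noise grows as sqrt(exposure); electronic noise dominates at low exposure."""
    return np.sqrt(np.asarray(exposure, dtype=float) + electronic**2)

def sf_image_noise(exposure, delta=1e-3):
    """Relative quantum noise (~1/sqrt(exposure)) scaled by the local slope of the
    characteristic curve, so it falls off at both exposure extremes."""
    e = np.asarray(exposure, dtype=float)
    slope = (sf_pixel_value(e * (1 + delta)) - sf_pixel_value(e)) / np.log10(1 + delta)
    return slope / np.sqrt(e)

exposures = np.logspace(-2, 1, 7)          # arbitrary exposure units
print(np.round(sf_pixel_value(exposures), 2))
print(np.round(sf_image_noise(exposures), 2))
```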

Fig. 2. Characteristic curve for a full-field digital mammography system (left) and a digitized screen-film system (right).



Fig. 4. An illustration of the utility of FROC curves. The two points, indicated by the circle and the triangle, represent the performance of two hypothetical CADe schemes. If the two points lie on the same FROC curve (broken line), the two CADe schemes have the same performance. If the two points lie on different curves (solid lines), the curve closer to the upper left corner of the graph represents the better performance.

Fig. 3. Flowchart of a generic CADe scheme.

Two general approaches are used for the automated detection of cancer on mammograms. The first approach is to apply statistical classifiers, such as artificial neural networks [43] and support vector machines [44,45], directly to the image data. The image is divided into small regions of interest (ROIs), typically 32 × 32 pixels. This produces approximately 50,000 non-overlapping ROIs per image with 100-μm pixels. Therefore, to reduce the number of false ROIs down to even five per image, the classifier must be able to eliminate 99.99% of the false ROIs without appreciably eliminating ROIs containing malignant lesions. This is extremely difficult to achieve, and this approach to automated detection has not yet been successful. The second approach is outlined in Fig. 3. After a digital mammogram is obtained, potential signals are identified. This is usually accomplished by transforming the image using linear filters, morphologic operators, wavelets, or other means. Next, thresholding is applied. The goal is to identify as many true signals as possible without an excessive number of false signals. For detection of calcifications, the ratio of false to true signals can be 100:1 or higher. Once the signals have been identified, they are segmented from the image. Many different techniques have been developed; most rely on thresholding of the image either in the transformed space or in the acquired pixel-value space. More sophisticated methods have also been developed, such as a Markov random field model [46]. In this approach, pixels in the image are modeled as belonging to one of four classes: background, calcification, line/edge, and film emulsion errors. Three different features are used in the model: local contrast at two different spatial resolutions and the output of a line/edge detector. Once signals have been segmented, features of the signals are extracted and used in statistical classifiers to distinguish true from false signals.
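The sketch below illustrates this second, generic pipeline (Fig. 3) in Python. It is a toy stand-in rather than any published scheme: a difference-of-Gaussians filter represents the signal-enhancement step, a global threshold and connected-component labeling produce and segment candidate signals, and a linear discriminant (one of the many classifiers listed in the next paragraph) separates true from false signals. All parameter values are assumptions.

```python
import numpy as np
from scipy import ndimage
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def find_candidate_signals(image, sigma_small=1.0, sigma_large=4.0, k=3.0):
    """Enhance small bright structures with a difference-of-Gaussians filter,
    then threshold and label the surviving connected components."""
    enhanced = (ndimage.gaussian_filter(image, sigma_small)
                - ndimage.gaussian_filter(image, sigma_large))
    mask = enhanced > k * enhanced.std()
    labels, n_candidates = ndimage.label(mask)
    return enhanced, labels, n_candidates

def candidate_features(image, enhanced, labels, n_candidates):
    """Per-candidate features: area, peak filter response, mean pixel value."""
    feats = []
    for idx in range(1, n_candidates + 1):
        region = labels == idx
        feats.append([region.sum(), enhanced[region].max(), image[region].mean()])
    return np.asarray(feats)

def train_signal_classifier(features, is_true_signal):
    """Fit a linear discriminant to separate true signals from false detections,
    given a labeled training set of candidate features."""
    return LinearDiscriminantAnalysis().fit(features, is_true_signal)

# Typical use, assuming a training set of candidates already scored against truth:
#   enhanced, labels, n = find_candidate_signals(mammogram)
#   feats = candidate_features(mammogram, enhanced, labels, n)
#   keep = train_signal_classifier(train_x, train_y).predict(feats) == 1
```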

Many different types of classifiers have been employed. A partial list includes simple thresholds [47], artificial neural networks [48], nearest-neighbor methods [49], fuzzy logic [50], linear discriminant analysis [51], quadratic classifiers [52], Bayesian classifiers [53], genetic algorithms [54], multiobjective genetic algorithms [55], and support vector machines [44,45].

2.2. Evaluation of CADe schemes

CADe schemes are typically evaluated by use of free-response receiver operating characteristic (FROC) curves (see Fig. 4). These are plots of sensitivity versus the average number of false detections per image. Sensitivity is calculated in two ways. The first method is calculation by case. A case consists of two views of each breast, or four mammograms. Here, if a cancer is detected by the computer in at least one view, it is considered detected. The second method is calculation by image. Here, sensitivity is calculated based on each image; that is, if the computer detects a cancer in only one of two views, the sensitivity is only 50%. The sensitivity by case is almost always higher than the sensitivity by image. Sensitivity by case is often reported when CADe is evaluated clinically, because it is assumed that, if the computer detects a cancer in at least one view, the radiologist will be able to locate the cancer in the other view, if necessary. However, there is evidence that, if an overlooked cancer is detected by the computer in only one view, it is likely that the radiologist will not recognize the correct computer prompt [56].



Fig. 5. Effect of CADe scoring criteria on measured performance, by use of FROC curves. The circle method scores a true positive if two computer-detected signals are within the smallest-diameter circle that encloses all actual microcalcifications. The centroid method scores a true positive if the centroid of the computer-detected cluster is within 6 mm of the centroid of the actual cluster and at least two actual microcalcifications are detected by the computer. The bounding-box method scores a true positive as follows. The smallest box that completely encloses all actual microcalcifications is drawn, and then the smallest box that completely encloses all computer-detected signals is drawn. The computer-detected cluster is scored as a true positive if any of the following conditions is true: (i) the computer-detected bounding box is entirely within the truth bounding box; (ii) the truth bounding box is completely within the computer-detected bounding box and the area of the computer-detected bounding box is no larger than twice that of the truth bounding box; or (iii) the center of the truth bounding box lies in the computer-detected bounding box, the center of the computer-detected bounding box lies within the truth bounding box, and the area of the computer-detected bounding box is no larger than twice that of the truth bounding box.
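As one concrete example of these scoring rules, the following sketch implements the centroid criterion from the Fig. 5 caption. Only the 6-mm centroid tolerance comes from the text; the pixel spacing and the radius used to decide that an individual calcification was hit by a computer detection are assumed values.

```python
import numpy as np

def centroid_criterion(detected_xy, truth_xy, pixel_mm=0.1,
                       centroid_tol_mm=6.0, hit_radius_mm=0.5):
    """Score a computer-detected cluster against the true cluster.
    detected_xy, truth_xy: (N, 2) arrays of pixel coordinates of the
    computer-detected signals and of the actual microcalcifications."""
    detected_xy = np.asarray(detected_xy, dtype=float)
    truth_xy = np.asarray(truth_xy, dtype=float)

    # Condition 1: cluster centroids agree to within the tolerance.
    centroid_dist_mm = np.linalg.norm(detected_xy.mean(axis=0)
                                      - truth_xy.mean(axis=0)) * pixel_mm
    # Condition 2: at least two actual calcifications have a detection nearby.
    pairwise_mm = np.linalg.norm(truth_xy[:, None, :] - detected_xy[None, :, :],
                                 axis=2) * pixel_mm
    n_hits = int((pairwise_mm.min(axis=1) <= hit_radius_mm).sum())

    return centroid_dist_mm <= centroid_tol_mm and n_hits >= 2
```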

Fig. 6. Effect of database on performance of CADe algorithms. The performance of the computer detection scheme was tested with three different databases: easy, altered-easy, and difficult. Each is a subset of 50 pairs of mammograms from a larger database of 90 pairs. Whereas the easy and difficult databases have only 10 pairs of images in common, the easy and altered-easy databases are identical except for 10 pairs.

FROC curves are more useful than a single sensitivity and false-detection rate, which is essentially a single point on the FROC curve. If one is comparing two different CADe schemes, and one has a sensitivity of 80% with 0.1 false detections per image while the other has a sensitivity of 85% with 0.5 false detections per image, it is not clear which system is better (see Fig. 4). The two points could belong to the same FROC curve, in which case the schemes have the same performance; that is, by tuning a CADe scheme, it is possible to obtain any sensitivity/false-detection-rate combination on the curve. Alternatively, the two points could belong to different curves, in which case the scheme with the higher curve has the better performance. For comparing two FROC curves, a statistical technique called JAFROC (jackknife FROC) analysis can be used [57]. Free software for performing such an analysis is available at http://www.devchakraborty.com/. Comparing published results of different CADe schemes is problematic. Differences in the criteria used for scoring whether the computer detected a cancer, in the database used for the evaluation, and in the way the CADe scheme was trained can all affect the measured performance. Fig. 5 shows the measured FROC curves for one CADe scheme evaluated by use of various scoring criteria, all of which have been used in published studies. Fig. 6 shows the effect of the database on the measured FROC curve. As expected, the easiest cases produce the highest performance. Finally, bias can arise in the measured performance depending on how the algorithm is trained and tested. If the same cases are used for training and testing, there is a positive bias, which can be very large. To avoid this, researchers often use either bootstrapping or some type of jackknifing to train and test. Recent studies show that the bootstrap method has advantages over jackknife-style cross-validation [58]. However, it has been shown that, if the same cases are used for selecting features and for training a classifier with those selected features, there will again be a positive bias.
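The following sketch shows how FROC operating points (sensitivity versus mean false detections per image) can be computed from per-image detection output by sweeping the detection-score threshold; sensitivity here is computed by image. The data layout is a hypothetical toy format, not the output of any particular CADe system.

```python
import numpy as np

def froc_points(images, thresholds):
    """Return (mean false detections per image, sensitivity) for each threshold.
    `images` is a list of dicts with:
      'scores'     : detection scores for every candidate in that image,
      'is_tp'      : bool array, True where the candidate overlaps a cancer,
      'has_cancer' : whether the image actually contains a cancer."""
    n_cancer_images = sum(im["has_cancer"] for im in images)
    points = []
    for t in thresholds:
        hit_images, false_marks = 0, 0
        for im in images:
            keep = im["scores"] >= t
            if im["has_cancer"] and np.any(keep & im["is_tp"]):
                hit_images += 1
            false_marks += int(np.count_nonzero(keep & ~im["is_tp"]))
        points.append((false_marks / len(images), hit_images / n_cancer_images))
    return points

# Example with two toy images:
images = [
    {"scores": np.array([0.9, 0.4]), "is_tp": np.array([True, False]), "has_cancer": True},
    {"scores": np.array([0.3]),      "is_tp": np.array([False]),       "has_cancer": False},
]
print(froc_points(images, thresholds=[0.2, 0.5, 0.8]))
```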

Whereas an FROC curve characterizes the performance of a CADe scheme, when the scheme is used clinically a single operating point on the curve must be chosen. There is no accepted method for choosing the operating point. The choice is generally based on the perceived tradeoff between sensitivity and the false-detection rate. Usually, the operating point is chosen to give the highest clinically acceptable false-detection rate. Because the highest clinically acceptable false-detection rate is not known, the selected operating point can differ depending on who is making the choice. In general, clinical CADe systems have high sensitivity. Commercial systems have reported sensitivities of 98% for clustered microcalcifications and 85% for masses, which are comparable to or exceed the sensitivity of most radiologists. Therefore, it is possible for CADe systems to detect cancers. The difficulty is to achieve this at a low false-detection rate. False detections reduce radiologists' productivity because the radiologists must spend time reviewing all computer detections. If the false-detection rate is high (greater than approximately one per image for an exam with four images), then a radiologist who is trying to read as efficiently as possible may choose to ignore all of the computer prompts rather than spend the time to review multiple prompts that are most likely all false, since in screening the cancer prevalence is typically 0.5%. Radiologists typically recall between 5 and 10% of women screened, which corresponds to approximately 0.012–0.05 false detections per image. Current CADe schemes have between 0.1 and 0.5 false detections per image, an order of magnitude higher than radiologists. In the detection of clustered microcalcifications, most false detections are caused by benign calcifications, most often calcified vessels (see Fig. 7). It can be difficult to distinguish calcified vessels from malignant calcifications that appear in a linear pattern. In the detection of masses, most false detections are caused by superposition of tissue and by benign lesions, the same causes of false detections by radiologists. However, even though the causes are the same, the same false detections are not found in each image by the computer and the



radiologist. Thus, if the radiologist cannot reliably distinguish computer false detections from computer-prompted cancers, CADe could preferentially increase the radiologist's recall rate (the fraction of women considered to have an abnormal screening mammogram). The difficulty radiologists have in distinguishing actual cancers from false lesions implies that actual cancers and false lesions appear visually similar. Therefore, it is also difficult for the computer to separate cancers accurately from false detections. As a result, the false-detection rate for mass detection is higher than that for detection of calcifications.

2.3. Clinical effectiveness of CADe

The goal of CADe is not the detection of cancer. The goal is to help radiologists avoid overlooking a cancer that is visible in a mammogram. Thus, whereas high CADe performance is good, in theory it is neither a necessary nor a sufficient condition for CADe to be successful clinically. Therefore, it is possible for a CADe scheme to have a sensitivity of less than 50% and still be a useful aid. In theory, the computer only needs to prompt those cancers that the radiologist missed, because the radiologist gains no advantage from the computer prompting cancers that he or she has already detected.

Fig. 7. A mammogram with calcified vessels, which lead to multiple false detections by the computer.

However, from a practical viewpoint, if the computer misses too many cancers that the radiologist has detected, the radiologist will lose confidence in the ability of the computer to detect cancers, and CADe will not be an effective aid. The two necessary conditions for CADe to be successful are: (1) the computer must be able to detect cancers that the radiologist misses; and (2) the radiologist must be able to recognize when the computer has detected a missed cancer. There is good evidence in the literature that CADe can detect clinically missed cancers. Several studies have shown that between 50 and 77% of missed cancers can be detected by CADe. In these studies, previous mammograms from women with a screen-detected cancer were reviewed for signs that the cancer was visible in an earlier mammogram; these missed cancers were collected and subjected to CADe. Whereas detecting 77% of the missed cancers is a large fraction of the misses, not all computer prompts are actionable. In women with mammograms that appear lumpy, there are many areas that resemble cancer. A small and subtle cancer cannot be reliably detected by a radiologist in the presence of multiple similar lumps. Therefore, a computer prompt in this situation is unlikely to prevent a radiologist from missing a cancer. Thus, it is important to determine radiologists' ability to distinguish computer prompts for cancer from computer prompts for false lesions. Four observer studies have been performed to measure the benefits of CADe. The first two studies showed a statistically significant improvement in radiologists' performance when they used CADe [30,59–61]. These were small studies and were conducted in such a manner as to produce a bias in favor of using CADe. In the study by Chan et al., a time limit was imposed on reading the images, and radiologists were shown only a single image instead of the four images that are standard in most screening exams. Nevertheless, this is the seminal paper in the field, and it launched renewed interest in computer analysis of mammograms [30]. The study by Kegelmeyer et al. looked only at spiculated lesions, and the CADe scheme had 100% sensitivity [61]. The two more recent studies were much larger than the first. The study by Taylor et al. did not show a statistically significant improvement in radiologists' performance as measured in terms of sensitivity and specificity [59]. The sensitivity increased from 0.78 to 0.81, with a 95% CI for the difference of [−0.003, 0.064], and the specificity increased from 0.86 to 0.87, with a 95% CI for the difference of [−0.003, 0.034]. These values are close to significant, and one can speculate whether, had data been collected to allow an ROC analysis to be performed, there would have been a statistically significant increase in the area under the ROC curves, since ROC experiments have substantially higher statistical power than sensitivity and specificity calculations. The fourth observer study, performed by Gilbert et al., was the CADET study (computer-aided detection evaluation trial) [60]. This was a very large study (10,267 cases containing 236 cancers) with eight readers. They obtained several important results. First, radiologists need to be trained with a large number



of cases, at least 400 in their study, before they use CADe consistently [62]. That is, the readers' recall rate when CADe was used decreased as training increased, up to 400 cases. They were also able to show that radiologists using CADe detected 49.1% of cancer cases, whereas double reading by two radiologists found only 42.6%. The sensitivity was low in this study because, in addition to the 236 cancers detected in the time frame from which the cases were collected, there were an additional 85 patients who developed breast cancer after the study period.

2.4. Clinical requirements

There is good evidence that CADe can detect cancers missed by radiologists and that radiologists can use CADe to find more cancers, at least as noted in observer studies. These results can be extrapolated to clinical effectiveness, but there are limitations. In observer studies, the goal is to simulate clinical reading conditions, and yet there are differences. The greatest difference is that, in an observer study, the radiologist's interpretation has no effect on patient management. Thus, radiologists are not under the same clinical pressure in an observer study. Also, the cancer prevalence is usually significantly higher in an observer study, typically 25–50% compared to 0.5% in clinical practice. Therefore, clinical studies must be performed to assess the clinical effectiveness of CADe. Seven such studies have been published; they are summarized in Table 1. Overall, the average increase in cancers detected when using CADe is approximately 10%. This is comparable to the increase in the cancer detection rate from double reading by two radiologists [63–65]. The first published clinical evaluation of CADe was done by Freer and Ulissey [66], who found a 19.6% increase in the number of cancers detected when CADe was used. The second published clinical evaluation was performed by Gur et al. [67], who found that the cancer detection rate increased only from 3.49 to 3.55 per 1000 women screened. Feig et al. did a reanalysis of the Gur data and found that the low-volume readers had a 19.7% increase in the cancer detection rate, but the high-volume readers had a 3.2% decrease [68]. These two studies are important because they represent the two different methods used for measuring the effectiveness of CADe in screening mammography.
Table 1
Summary of seven clinical studies of CADe

Study | Total number screened (unaided / aided) | Cancers detected (unaided / aided) | % change

Longitudinal studies
Gur et al. [67] | 56,432 / 59,139 | 197 / 210 | +1.7
Feig et al. (high volume) [68] | 44,629 / 37,500 | 161 / 131 | −3.2
Feig et al. (low volume) [68] | 11,803 / 21,639 | 36 / 79 | +19.7
Cupples et al. [70] | 7,872 / 19,402 | 29 / 83 | +16.1

Cross-sectional studies (same women read without, then with, CADe)
Freer and Ulissey [66] | 12,860 | 41 / 49 | +19.5
Birdwell et al. [73] | 8,692 | 27 / 29 | +7.4
Helvie et al. [74] | 2,389 | 10 / 11 | +10.0
Khoo et al. [75] | 6,111 | 116 / 118 | +1.7

The Freer study was a cross-sectional study; that is, the data were collected sequentially. The radiologist first reads the images without CADe and renders an opinion. He or she then examines the computer results and renders a new opinion if necessary. As a result, the effectiveness of CADe is determined patient by patient, and the number of extra cancers detected because CADe was used can be computed. The Gur study, on the other hand, was a longitudinal study; that is, historical or temporal comparisons were made. The cancer detection rate is compared between two time periods, one before CADe was implemented clinically and the other after. In this method, the effectiveness of CADe is determined by the change in the cancer detection rate. Overall, there is a large range in the measured increase in cancers detected by use of CADe, in part for two reasons. The first is that two different methodologies were used for measuring the clinical effectiveness of CADe. The second is that, although a large number of women may have been screened in a given study, the number of cancers in the population is small, typically 5 per 1000 women screened; therefore, the statistical uncertainty in the cancer detection rate is large, large enough to account for the apparent variation. To examine these two effects, I previously developed a Monte Carlo simulation of CADe in screening mammography [69]. The flowchart for the simulation model is shown in Fig. 8. Cancers are assumed to grow exponentially, with a volume doubling time of 157 days. Once the cancer is larger than the detection threshold, assumed to be 0.5 cm, it can be detected in one of three ways. First, it can be detected by non-mammographic means, such as palpation; these are considered interval cancers, which are assumed to occur in 15% of cancers. Second, it can be detected by the radiologist without the help of CADe; these are assumed to constitute 85% of non-interval cancers. Third, it can be detected by the radiologist with the help of CADe; the computer prompts are assumed to rescue 75% of the cancers missed by the radiologist. Two different conditions were simulated. The first was an idealized situation in which all cancers grow at the same rate, and 125 women developed cancer each year.



Fig. 8. Flowchart of simulation model.

We then repeated the simulation 100 times and averaged the results. This produces outcomes with very little statistical variation. The second was a more realistic situation in which 125 women developed cancer each year, there was a log-normal distribution of growth rates with a median doubling time of 157 days and a standard deviation of 90 days, and there was only a single run (i.e., no averaging over repeated simulations). When the growth rate is the same for each cancer, the number of detectable cancers in the patient population is the same each year. With a spectrum of growth rates, there is a large variation in the number of detectable cancers present each year. As a result, there is large variability in the number of screen-detected cancers from year to year. This can be seen by comparing Fig. 9a and b. If the cross-sectional method were used to assess the benefits of CADe, the result would depend upon the year the data were collected, because there is variation in the lower curve of Fig. 9a. If we were to use the longitudinal method, it is likely that we would measure only a very small change in the cancer detection rate because, as shown in Fig. 9b, the actual difference is very small. If we were to repeat the realistic simulation, we would get a different result.
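A deliberately simplified sketch of this kind of simulation is given below. It keeps only the branching probabilities described above (15% interval cancers, 85% radiologist detection of the remainder, and CADe prompting 75% of the radiologist's misses) and omits the tumor-growth model, detection threshold, and multi-year follow-up of the published model, so it illustrates only the statistical year-to-year variation in detected cancers.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def simulate_screening_year(n_cancers=125, p_interval=0.15,
                            p_radiologist=0.85, p_cad_rescue=0.75):
    """Return (cancers detected without CADe, cancers detected with CADe)
    for one screening year, using only the branching probabilities of Fig. 8."""
    unaided = aided = 0
    for _ in range(n_cancers):
        if rng.random() < p_interval:
            continue                      # interval cancer, found clinically
        if rng.random() < p_radiologist:
            unaided += 1                  # radiologist finds it without help
            aided += 1
        elif rng.random() < p_cad_rescue:
            aided += 1                    # missed by reader, prompted by CADe
    return unaided, aided

# Ten simulated years: note the spread even though the underlying rates are fixed.
for year in range(10):
    print(year, simulate_screening_year())
```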

Fig. 9. Results of simulation of CADe in screening mammography: number of cancers detected per year in a screening population as a function of time. CADe is introduced in year 20. (a) All cancers grow with the same doubling time, and the curves are averaged over 100 trials. (b) A single trial result is shown, and the cancers have a distribution of doubling times. The horizontal lines with arrowheads indicate two time periods used to compare the benefits of CADe by historical comparison (longitudinal method), and the vertical line indicates the difference in the radiologist's cancer detection with and without computer aid (cross-sectional method).

This result can vary greatly from the one shown, because it is possible to measure a decrease in the cancer detection rate when using the longitudinal method [69]. Therefore, the large variation in the results of the clinical studies is not unexpected. These simulation results may suggest that the cross-sectional method is the better method to use. However, both methods have strengths and weaknesses, as listed in Table 2.

Table 2
Strengths and weaknesses of two different methods for measuring the clinical effectiveness of CADe

Cross-sectional study
  Method: sequential reading of each patient without and then with CADe
  Outcome: change in the number of cancers detected
  Strengths: straightforward to implement; not subject to the variations apparent in the longitudinal method
  Weaknesses: subject to potentially large positive and negative biases; not possible to determine which type of bias is present

Longitudinal study
  Method: temporal comparison of two groups of patients read without and with CADe
  Outcome: change in the cancer detection rate
  Strengths: possible to conduct large studies; not subject to biases that may be present in the cross-sectional method
  Weaknesses: subject to variation in the number of prevalence screens, which is difficult to control for; subject to variation in radiologists' ability to read mammograms; subject to variation in the number of cancers in the screening population from year to year; the CADe effect on the cancer detection rate is small



Although the list of weaknesses for the longitudinal method is longer, the potential for large positive or negative biases in the cross-sectional method makes it difficult to interpret the results from such studies. The three sources of variation listed in Table 2 for the longitudinal method imply that there can be large variations in the results between studies (compare the Gur and Cupples studies). This makes it extremely difficult to measure the small difference in the cancer detection rate that actually exists. Therefore, clinically, the longitudinal method will not be effective in measuring the benefits of using CADe. As discussed in another paper, the change in the size of the detected cancers may be a better endpoint [69]. A change in cancer stage when CADe was used was found in one of the clinical studies [70].

2.5. Improving CADe performance

There is good evidence that radiologists often ignore CADe prompts on actual cancers. The reason for this is not well understood. One possibility is that the false-detection rate is too high, so radiologists tend to dismiss computer prompts or pay less attention when an image has many prompts. Generally, radiologists like using CADe for clustered calcifications but are less enthusiastic about CADe for masses. The obvious difference is that clustered-calcification algorithms perform better than those for masses. However, there are less obvious reasons. There is very little structure in the breast that can mimic clustered calcifications. Therefore, false calcification prompts are either due to benign calcifications or obviously not calcifications at all. Radiologists, in general, can evaluate calcification prompts quickly. On the other hand, the superposition of overlapping normal tissues can produce a pseudo-lesion in the mammogram. This is a very common occurrence, and it is sometimes difficult for radiologists to determine whether a lesion is real or is merely overlapping tissue. Approximately 30% of all screening mammograms classified as abnormal are abnormal because of the superposition of tissue. One technique that radiologists use to determine whether an apparent lesion is real is to check whether the lesion is visible in both views. If it is, then it is highly likely to be an actual lesion; if it is not, it may or may not be. Radiologists will then compare the area containing the apparent lesion with other regions within the mammogram. If the pattern where the apparent lesion is located is similar to patterns elsewhere in the mammogram, then it is likely that the apparent lesion is not real. The radiologist also compares the current films to previous films to see whether the apparent lesion is new or has changed over time. CADe schemes need to use this approach to reduce the false-detection rate. It has been shown that observers improved their performance when using context information for classification of false-positive and true-positive regions; that is, better discrimination was achieved when radiologists looked at a whole image rather than a small ROI around the lesion [71]. Furthermore, radiologists are better than computers at this task. There are several approaches to correlating views, either within the same examination or between examinations done at different times; two such approaches are described after the following sketch.
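Before describing those approaches, the sketch below gives a minimal illustration of the general idea of feature-based correspondence between detections in two views of the same breast. The feature vectors, the nipple-distance constraint, and the threshold are illustrative assumptions, not a published matching algorithm.

```python
import numpy as np

def match_view_detections(feats_cc, feats_mlo, nipple_dist_cc, nipple_dist_mlo,
                          max_dist_diff_mm=15.0):
    """Propose correspondences between candidate detections in the CC and MLO
    views. feats_*: (N, F) feature arrays; nipple_dist_*: radial distance of each
    detection from the nipple (mm), used as a crude geometric constraint.
    Returns candidate pairs sorted so the most similar pair comes first."""
    pairs = []
    for i, fi in enumerate(feats_cc):
        for j, fj in enumerate(feats_mlo):
            if abs(nipple_dist_cc[i] - nipple_dist_mlo[j]) > max_dist_diff_mm:
                continue                       # geometrically implausible pair
            similarity = -float(np.linalg.norm(fi - fj))
            pairs.append((similarity, i, j))
    pairs.sort(reverse=True)
    return pairs

# A detection with no plausible partner in the other view is a candidate for
# demotion, which is one way such matching can lower the false-detection rate.
```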

One approach is to transform one image to match the corresponding image taken at a previous time, or an image of the opposite breast. Using geometry based on patient positioning, possible matched pairs of detections are determined [72]. Then, using features of the matched pairs, a matching-pair score is computed; this score allows corresponding pairs to be identified. Another approach is to extract features of CADe detections and compare detections from different views of the same breast, or from the same view taken at different times, to match the detections between views.

2.6. CADe as a first reader

One of the most demanding and time-consuming aspects of reading a screening mammogram is finding clustered microcalcifications, because microcalcifications can be very small, a few tenths of a millimeter. Radiologists typically use a magnifying glass and carefully examine all areas of each of the four film mammograms; on a digital system, electronic zoom is used. Given that CADe for clustered calcifications has a high sensitivity, approximately 98%, as radiologists gain confidence in the computer's ability to find clustered calcifications, the need to search the image with a magnifying glass may be reduced to the point where the radiologist relies on the computer to detect these calcifications. This would allow radiologists simply to check the computer-detected clusters of calcifications and then read the mammograms for mass lesions. This should improve radiologists' productivity and reduce reading fatigue.

2.7. CADe and picture archiving and communication systems (PACS)

As mammography migrates from film to digital acquisition, it becomes important for CADe to be integrated with PACS. Proper integration of CADe and PACS is critical if CADe is to be used as a tool to increase productivity. Digital images need to be sent to the CADe server, where the actual CADe algorithms are run; then the output of the CADe schemes needs to be stored with the images so that it is available for review. If the digital images are printed and viewed on light boxes, then the method used for reviewing CADe output with film mammography can still be used. If the digital images are viewed on soft-copy monitors, the CADe prompts must be displayed as an overlay on the digital mammogram. In either case, a mechanism is needed for storing, transmitting, and displaying the CADe marks. This can be done by use of the structured report feature of DICOM. DICOM stands for Digital Imaging and Communications in Medicine (http://medical.nema.org/); it is a set of standards for handling, storing, printing, and transmitting information in medical imaging, and it is a global standard used by virtually all medical imaging enterprises. Although CAD companies have embraced DICOM structured reports as a method for storing and retrieving information about the output of CAD analyses, not all PACS vendors are currently able to use structured reports. As a result, on those systems it is not possible to store and retrieve the CAD output.
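As a rough illustration of what storing CADe marks with DICOM structured reporting can look like, the sketch below builds a bare-bones SR-style dataset with pydicom. It is schematic only: the SOP Class UID shown is the one commonly cited for Mammography CAD SR but should be verified, the content tree does not follow the full Mammography CAD SR template, and all mark coordinates are hypothetical.

```python
from pydicom.dataset import Dataset
from pydicom.uid import generate_uid

MAMMO_CAD_SR_SOP_CLASS = "1.2.840.10008.5.1.4.1.1.88.50"  # assumed UID; verify against the standard

def cad_marks_to_sr(marks_xy, referenced_image_uid):
    """Wrap a list of (column, row) CADe mark coordinates in a minimal SR-like
    dataset. Real systems must follow the Mammography CAD SR template."""
    sr = Dataset()
    sr.SOPClassUID = MAMMO_CAD_SR_SOP_CLASS
    sr.SOPInstanceUID = generate_uid()
    sr.Modality = "SR"
    sr.ContentSequence = []
    for col, row in marks_xy:
        item = Dataset()
        item.ValueType = "SCOORD"          # spatial-coordinate content item
        item.GraphicType = "POINT"
        item.GraphicData = [float(col), float(row)]
        item.ReferencedSOPInstanceUID = referenced_image_uid
        sr.ContentSequence.append(item)
    return sr

# Example: two hypothetical marks on one digital mammogram.
sr = cad_marks_to_sr([(1021, 733), (2450, 1180)], referenced_image_uid=generate_uid())
print(len(sr.ContentSequence), "CAD marks stored")
```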



Further, one of the strong features of DICOM is its flexibility, but this flexibility is also a drawback in terms of integrating different systems. For example, a woman may have a digital mammogram taken on system A, but her previous mammograms were taken on system B. If the review workstation being used cannot display mammograms from both system A and system B, there will be a problem for the radiologist. This can occur even though both systems use DICOM because, in addition to the standard features that all systems use, each system can specify added features, and these may differ from one system to another. Even if both images can be displayed, they may be displayed differently (e.g., they may have different image-processing techniques applied to them). This is very problematic. It is not a problem for CADe display per se, but integration of CADe must occur in this environment. Fortunately, a consortium of industry, radiologists, and informaticists is developing a superset of standards for DICOM that will allow full integration of all of the systems involved in digital mammography (PACS, acquisition hardware, display hardware, and CAD). This consortium is called IHE (Integrating the Healthcare Enterprise: http://www.ihe.net/). The goal of IHE is to standardize many of the optional features of standard DICOM. This should allow true compatibility among all of the components necessary for digital mammography to work seamlessly in the clinic, allowing radiologists to work productively and to use CADe routinely.

3. Summary

The concept of computer-aided detection (CADe) was introduced more than 50 years ago; however, only in the last 20 years have there been serious and successful attempts at developing CADe for mammography. CADe schemes have high sensitivity but poor specificity compared to radiologists. CADe has been shown, both in observer studies and in clinical evaluations, to help radiologists find more cancers. Recent clinical studies indicate that CADe increases the number of cancers detected by approximately 10%, which is comparable to double reading by two radiologists. However, it is difficult to measure the clinical benefits of CADe because of variability in the number of cancers present in the screened population from year to year. Furthermore, the actual increase in the cancer detection rate is very small, and yet a radiologist can reduce his or her missed-cancer rate by using CADe. Finally, one important goal of CADe is to improve radiologists' productivity. To accomplish this, it is important to incorporate CADe seamlessly into the clinical workflow, which can be achieved by careful integration of CADe into the clinical PACS.

References
[1] American Cancer Society. Cancer facts and figures 2006. Atlanta, GA: American Cancer Society; 2006.
[2] Anderson I, Aspegren K, Janzon L, et al. Mammographic screening and mortality from breast cancer: the Malmo mammographic screening trial. Br Med J 1988;297:943–8.
[3] Shapiro S, Venet W, Strax PH, Venet L, Roeset R. Ten to fourteen-year effect of screening on breast cancer mortality. J Natl Cancer Inst 1982;69:349–55.
[4] Tabar L, Yen MF, Vitak B, Tony Chen HH, Smith RA, Duffy SW. Mammography service screening and mortality in breast cancer patients: 20-year follow-up before and after introduction of screening. Lancet 2003;361(9367):1405–10.
[5] Berry DA, Cronin KA, Plevritis SK, et al. Effect of screening and adjuvant therapy on mortality from breast cancer. N Engl J Med 2005;353(17):1784–92.
[6] Pisano ED, Gatsonis C, Hendrick E, et al. Diagnostic performance of digital versus film mammography for breast-cancer screening. N Engl J Med 2005;353(17):1773–83.
[7] Andersson I. What can we learn from interval carcinomas? Recent Results Cancer Res 1984;90:161–3.
[8] Frisell J, Eklund G, Hellstrom L, Somell A. Analysis of interval breast carcinomas in a randomized screening trial in Stockholm. Breast Cancer Res Treat 1987;9:219–25.
[9] Harvey JA, Fajardo LL, Innis CA. Previous mammograms in patients with impalpable breast carcinoma: retrospective versus blinded interpretation. Am J Roentgenol 1993;161:1167–72.
[10] Holland T, Mrvunac M, Hendriks JHCL, Bekker BV. So-called interval cancers of the breast. Pathologic and radiographic analysis. Cancer 1982;49:2527–33.
[11] Ma L, Fishell E, Wright B, Hanna W, Allen S, Boyd NF. A controlled study of the factors associated with failure to detect breast cancer by mammography. J Natl Cancer Inst 1992;84:781–5.
[12] Martin JE, Moskowitz M, Milbrath JR. Breast cancers missed by mammography. Am J Roentgenol 1979;132:737–9.
[13] Peeters PHM, Verbeek ALM, Hendriks JHCL, Holland R, Mrvunac M, Vooijs GP. The occurrence of interval cancers in the Nijmegen screening programme. Br J Cancer 1989;59:929–32.
[14] Smith-Bindman R, Chu PW, Miglioretti DL, et al. Comparison of screening mammography in the United States and the United Kingdom. J Am Med Assoc 2003;290:2129–37.
[15] Boone JM, Kwan AL, Yang K, Burkett GW, Lindfors KK, Nelson TR. Computed tomography for imaging the breast. J Mammary Gland Biol Neoplasia; 2006.
[16] Boone JM, Nelson TR, Lindfors KK, Seibert JA. Dedicated breast CT: radiation dose and image quality evaluation. Radiology 2001;221(3):657–67.
[17] Chen B, Ning R. Cone-beam volume CT breast imaging: feasibility study. Med Phys 2002;29(5):755–70.
[18] Niklason LT, Christian BT, Niklason LE, et al. Digital tomosynthesis in breast imaging. Radiology 1997;205(2):399–406.
[19] Chan HP, Wei J, Sahiner B, et al. Computer-aided detection system for breast masses on digital tomosynthesis mammograms: preliminary experience. Radiology 2005;237(3):1075–80.
[20] Reiser I, Nishikawa RM, Giger ML, et al. Computerized mass detection for digital breast tomosynthesis directly from the projection images. Med Phys 2006;33(2):482–91.
[21] Barlow WE, Chi C, Carney PA, et al. Accuracy of screening mammography interpretation by characteristics of radiologists. J Natl Cancer Inst 2004;96(24):1840–50.
[22] Elmore JG, Nakano CY, Koepsell TD, Desnick LM, D'Orsi CJ, Ransohoff DF. International variation in screening mammography interpretations in community-based programs. J Natl Cancer Inst 2003;95(18):1384–93.
[23] Beam CA, Layde PM, Sullivan DC. Variability in the interpretation of screening mammograms by US radiologists. Findings from a national sample. Arch Intern Med 1996;156(2):209–13.
[24] Elmore JG, Wells CK, Lee CH, Howard DH, Feinstein AR. Variability in radiologists' interpretations of mammograms. N Engl J Med 1994;331(22):1493–9.
[25] Getty DJ, Pickett RM, D'Orsi CJ, Swets JA. Enhanced interpretation of diagnostic images. Invest Radiol 1988;23:240–52.
[26] Horsch K, Giger ML, Vyborny CJ, Lan L, Mendelson EB, Hendrick RE. Classification of breast lesions with multimodality computer-aided diagnosis: observer study results on an independent clinical data set. Radiology 2006;240(2):357–68.
[27] Huo Z, Giger ML, Vyborny CJ, Metz CE. Effectiveness of computer-aided diagnosis – observer study with independent database of mammograms. Radiology 2002;224:560–8.


[28] Jiang Y, Nishikawa RM, Schmidt RA, Metz CE, Giger ML, Doi K. Improving breast cancer diagnosis with computer-aided diagnosis. Acad Radiol 1999;6(1):22–33.
[29] Winsberg F, Elkin M, Macy J, Bordaz V, Weymouth W. Detection of radiographic abnormalities in mammograms by means of optical scanning and computer analysis. Radiology 1967;89:211–5.
[30] Chan H-P, Doi K, Vyborny CJ, et al. Improvement in radiologists' detection of clustered microcalcifications on mammograms: the potential of computer-aided diagnosis. Invest Radiol 1990;25(10):1102–10.
[31] Kobayashi T, Xu X-W, MacMahon H, Metz CE, Doi K. Effect of a computer-aided diagnosis scheme on radiologists' performance in detection of lung nodules on radiographs. Radiology 1996;199:843–8.
[32] Abe K, Doi K, MacMahon H, et al. Computer-aided diagnosis in chest radiography: analysis of results in a large clinical series. Invest Radiol 1993;28:987–93.
[33] Abe H, MacMahon H, Engelmann R, et al. Computer-aided diagnosis in chest radiography: results of large-scale observer tests at the 1996–2001 RSNA scientific assemblies. Radiographics 2003;23(1):255–65.
[34] Abe H, Ashizawa K, Li F, et al. Artificial neural networks (ANNs) for differential diagnosis of interstitial lung disease: results of a simulation test with actual clinical cases. Acad Radiol 2004;11(1):29–37.
[35] Shiraishi J, Abe H, Engelmann R, Aoyama M, MacMahon H, Doi K. Computer-aided diagnosis for distinction between benign and malignant solitary pulmonary nodules in chest radiographs: ROC analysis of radiologists' performance. Radiology 2003;227:469–74.
[36] Malich A, Fischer DR, Bottcher J. CAD for mammography: the technique, results, current role and further developments. Eur Radiol 2006;16:1449–60.
[37] Karssemeijer N. Detection of masses in mammograms. In: Strickland RN, editor. Image-processing techniques in tumor detection. New York, NY: Marcel Dekker Inc.; 2002. p. 187–212.
[38] Nishikawa RM. Detection of microcalcifications. In: Strickland RN, editor. Image-processing techniques in tumor detection. New York, NY: Marcel Dekker Inc.; 2002. p. 131–53.
[39] Giger ML, Huo Z, Kupinski MA, Vyborny CJ. Computer-aided diagnosis in mammography. In: Sonka M, Fitzpatrick JM, editors. Handbook of medical imaging, vol. 2. Bellingham, WA: The Society of Photo-Optical Instrumentation Engineers; 2000. p. 915–1004.
[40] Sampat MP, Markey MK, Bovik AC. Computer-aided detection and diagnosis in mammography. In: Bovik AC, editor. The handbook of image and video processing. 2nd ed. New York: Elsevier; 2005. p. 1195–217.
[41] Nishikawa RM. Computer-assisted detection and diagnosis. Wiley; 2005.
[42] Chan H-P, Niklason LT, Ikeda DM, Lam KL, Adler DD. Digitization requirements in mammography: effects on computer-aided detection of microcalcifications. Med Phys 1994;21(7):1203–11.
[43] Stafford RG, Beutel J, Mickewich DJ. Application of neural networks to computer-aided pathology detection in mammography. Proc SPIE 1993;1898.
[44] El-Naqa I, Yang Y, Wernick MN, Galatsanos NP, Nishikawa RM. A support vector machine approach for detection of microcalcifications. IEEE Trans Med Imag 2002;21(12):1552–63.
[45] Campanini R, Dongiovanni D, Iampieri E, et al. A novel featureless approach to mass detection in digital mammograms based on support vector machines. Phys Med Biol 2004;49(6):961–75.
[46] Veldkamp WJH, Karssemeijer N. Improved correction for signal dependent noise applied to automatic detection of microcalcifications. In: Karssemeijer N, Thijssen M, Hendriks J, van Erning L, editors. Digital mammography Nijmegen 98. Amsterdam: Kluwer Academic Publishers; 1998. p. 160–76.
[47] Chan HP, Doi K, Galhotra S, Vyborny CJ, MacMahon H, Jokich PM. Image feature analysis and computer-aided diagnosis in digital radiography I. Automated detection of microcalcifications in mammography. Med Phys 1987;14(4):538–48.
[48] Nagel RH, Nishikawa RM, Doi K. Analysis of methods for reducing false positives in the automated detection of clustered microcalcifications in mammograms. Med Phys 1998;25(8):1502–6.
[49] Davies DH, Dance DR. Automatic computer detection of clustered calcifications in digital mammograms. Phys Med Biol 1990;35:1111–8.
[50] Cheng H-D, Lui YM, Freimanis RI. A novel approach to microcalcification detection using fuzzy logic technology. IEEE Trans Med Imag 1998;17(3):442–50.
[51] Cernadas E, Zwiggelaar R, Veldkamp W, et al. Detection of mammographic microcalcifications using a statistical model. In: Karssemeijer N, Thijssen M, Hendriks J, van Erning L, editors. Digital mammography. Amsterdam: Kluwer; 1998. p. 205–8.
[52] Brown S, Li R, Brandt L, Wilson L, Kossoff G, Kossoff M. Development of a multi-feature CAD system for mammography. In: Karssemeijer N, Thijssen M, Hendriks J, van Erning L, editors. Digital mammography. Amsterdam: Kluwer; 1998. p. 189–96.
[53] Bankman IN, Christens-Barry WA, Kim DW, Weinberg IN, Gatewood OB, Brody WR. Automated recognition of microcalcification clusters in mammograms. Proc SPIE 1993;1905:731–8.
[54] Anastasio MA, Yoshida H, Nagel R, Nishikawa RM, Doi K. A genetic algorithm-based method for optimizing the performance of a computer-aided diagnosis scheme for detection of clustered microcalcifications in mammograms. Med Phys 1998;25(9):1613–20.
[55] Anastasio MA, Kupinski MA, Nishikawa RM. Optimization and FROC analysis of rule-based detection schemes using a multiobjective approach. IEEE Trans Med Imag 1998;17(6):1089–93.
[56] Nishikawa R, Edwards A, Schmidt R, Papaioannou J, Linver M. Can radiologists recognize that a computer has identified cancers that they have overlooked? Proc SPIE 2006;6146:1–8.
[57] Chakraborty DP, Berbaum KS. Observer studies involving detection and localization: modeling, analysis, and validation. Med Phys 2004;31(8):2313–30.
[58] Efron B, Tibshirani R. Improvements on cross-validation: the .632+ bootstrap method. J Am Stat Assoc 1997;92(438):548–60.
[59] Taylor P, Champness J, Given-Wilson R, Johnston K, Potts H. Impact of computer-aided detection prompts on the sensitivity and specificity of screening mammography. Health Technol Assess 2005;9(6):1–70.
[60] Gilbert FJ, Astley SM, McGee MA, et al. Single reading with computer-aided detection and double reading of screening mammograms in the United Kingdom National Breast Screening Program. Radiology 2006;241(1):47–53.
[61] Kegelmeyer Jr WP, Pruneda JM, Bourland PD, Hillis A, Riggs MW, Nipper ML. Computer-aided mammographic screening for spiculated lesions. Radiology 1994;191(2):331–7.
[62] Astley S, Quarterman C, Al Nuaimi Y, et al. Computer-aided detection in screening mammography: the impact of training on reader performance. In: Pisano E, editor. Digital mammography 2004. Chapel Hill; 2004.
[63] Anderson ED, Muir BB, Walsh JS, Kirkpatrick AE. The efficacy of double reading mammograms in breast screening. Clin Radiol 1994;49(4):248–51.
[64] Harvey SC, Geller B, Oppenheimer RG, Pinet M, Riddell L, Garra B. Increase in cancer detection and recall rates with independent double interpretation of screening mammography. Am J Roentgenol 2003;180(5):1461–7.
[65] Thurfjell EL, Lernevall KA, Taube AA. Benefit of independent double reading in a population-based mammography screening program. Radiology 1994;191(1):241–4.
[66] Freer TW, Ulissey MJ. Screening mammography with computer-aided detection: prospective study of 12,860 patients in a community breast center. Radiology 2001;220(3):781–6.
[67] Gur D, Sumkin JH, Rockette HE, et al. Changes in breast cancer detection and mammography recall rates after the introduction of a computer-aided detection system. J Natl Cancer Inst 2004;96(3):185–90.
[68] Feig SA, Sickles EA, Evans WP, Linver MN. Re: changes in breast cancer detection and mammography recall rates after the introduction of a computer-aided detection system. J Natl Cancer Inst 2004;96(16):1260–1, author reply 1261.
[69] Nishikawa RM. Modeling the effect of computer-aided detection on the sensitivity of screening mammography. In: Astley SM, editor. Digital mammography 2006. Berlin: Springer-Verlag; 2006. p. 136–42.

[70] Cupples TE, Cunningham JE, Reynolds JC. Impact of computer-aided detection in a regional screening mammography program. AJR Am J Roentgenol 2005;185(4):944–50.
[71] van Engeland S, Varela C, Timp S, Snoeren PR, Karssemeijer N. Using context for mass detection and classification in mammograms. Proc SPIE 2006;5749:94–102.
[72] Paquerault S, Petrick N, Chan HP, Sahiner B, Helvie MA. Improvement of computerized mass detection on mammograms: fusion of two-view information. Med Phys 2002;29(2):238–47.
[73] Birdwell RL, Bandodkar P, Ikeda DM. Computer-aided detection with screening mammography in a university hospital setting. Radiology 2005;236:451–7.
[74] Helvie MA, Hadjiiski L, Makariou E, et al. Sensitivity of noncommercial computer-aided detection system for mammographic breast cancer detection: pilot clinical trial. Radiology 2004;231(1):208–14.
[75] Khoo LA, Taylor P, Given-Wilson RM. Computer-aided detection in the United Kingdom National Breast Screening Programme: prospective study. Radiology 2005;237(2):444–9.


Robert M. Nishikawa received his B.Sc. in Physics in 1981 and his M.Sc. and Ph.D. in Medical Biophysics in 1984 and 1990, respectively, all from the University of Toronto. He is currently an Associate Professor in the Department of Radiology and the Committee on Medical Physics at the University of Chicago. He is director of the Carl J. Vyborny Translational Laboratory for Breast Imaging Research. He is also a fellow of the American Association of Physicists in Medicine (AAPM). His research has three intertwining themes. The first is the development of computer-aided diagnosis (CAD) techniques for x-ray imaging of the breast, in particular for digital breast tomosynthesis and full-field digital mammography (FFDM). The second is the evaluation of CAD, principally its clinical effectiveness. The evaluations include Monte Carlo modeling of the use of computer-aided detection in screening mammography and observer studies to understand how effectively radiologists can use computers as aids when interpreting mammograms. The third is the investigation of the performance of new breast x-ray imaging systems. These studies include the evaluation of new clinical systems, such as FFDM and phase contrast mammography, and the optimization of digital breast tomosynthesis.
