
Journal of Neuroscience Methods 221 (2014) 22–31

Contents lists available at ScienceDirect

Journal of Neuroscience Methods


journal homepage: www.elsevier.com/locate/jneumeth

Computational Neuroscience

Detecting brain structural changes as biomarker from magnetic resonance images using a local feature based SVM approach
Ye Chen a,c, Judd Storrs b, Lirong Tan a,d, Lawrence J. Mazlack c, Jing-Huei Lee b,e,f, Long J. Lu a,d,g,*

a Division of Biomedical Informatics, Cincinnati Children's Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, OH 45229-3026, United States ¹
b Center for Imaging Research, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, OH 45267, United States
c School of Electronics and Computing Systems, University of Cincinnati, 497 Rhodes Hall, Cincinnati, OH 45221, United States
d School of Computing Sciences and Informatics, University of Cincinnati, 810 Old Chemistry, Cincinnati, OH 45221-0008, United States
e School of Energy, Environmental, Biological, and Medical Engineering, University of Cincinnati, 601 Engineering Research Center, Cincinnati, OH 45221, United States
f Department of Psychiatry and Behavioral Neuroscience, University of Cincinnati, 260 Stetson Street, Cincinnati, OH 45219, United States
g Department of Environmental Health, College of Medicine, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, OH 45267-0524, United States

Highlights

• Identify noise local image features by comparing features from the disease group and the healthy control group.
• Support vector machine is used to classify image features.
• Disease-related regions are identified from three different diseases (Alzheimer's disease, Parkinson's disease and bipolar disorder).
• The algorithm can be used to analyze MR images from heterogeneous datasets.

Article info

Article history:
Received 13 February 2013
Received in revised form 1 September 2013
Accepted 2 September 2013

Keywords: Local features; Brain; Neurological diseases; Psychiatric diseases; MRI images; SVM; Biomarker

Abstract

Detecting brain structural changes from magnetic resonance (MR) images can facilitate early diagnosis and treatment of neurological and psychiatric diseases. Many existing methods require an accurate deformation registration, which is difficult to achieve and therefore prevents them from obtaining high accuracy. We develop a novel local feature based support vector machine (SVM) approach to detect brain structural changes as potential biomarkers. This approach does not require deformation registration and thus is less influenced by artifacts such as image distortion. We represent the anatomical structures based on scale invariant feature transform (SIFT). Likelihood scores calculated using feature-based morphometry are used as the criterion to categorize image features into three classes (healthy, patient and noise). Regional SVMs are trained to classify the three types of image features in different brain regions. Only healthy and patient features are used to predict the disease status of new brain images. An ensemble classifier is built from the regional SVMs to obtain better prediction accuracy. We apply this approach to 3D MR images of Alzheimer's disease, Parkinson's disease and bipolar disorder. The classification accuracy ranges between 70% and 87%. The highly predictive disease-related regions, which represent significant

* Corresponding author at: Division of Biomedical Informatics, MLC 7024, Cincinnati Children's Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, OH 45229, United States. Tel.: +1 513 636 8720; fax: +1 513 636 2056.
E-mail address: long.lu@cchmc.org (L.J. Lu).
¹ http://dragon.cchmc.org.

0165-0270/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jneumeth.2013.09.001

anatomical differences between the healthy and diseased, are shown in heat maps. The common and disease-specific brain regions are identified by comparing the highly predictive regions in each disease. All of the top-ranked regions are supported by literature. Thus, this approach will be a promising tool for assisting automatic diagnosis and advancing mechanism studies of neurological and psychiatric diseases.
© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Many neurological and psychiatric diseases have been found to harbor structural changes in patients' brains. Examples include Alzheimer's disease (AD), Parkinson's disease (PD) and bipolar disorder (BPD), each of which affects millions of individuals throughout the world. Early and accurate diagnosis is crucial to improving the treatment of these diseases. Magnetic resonance (MR) brain images are frequently used to assist diagnosis. However, compared to objective measures, visual examination often depends on the expert's experience. By contrast, machine learning based algorithms for analyzing MR images can provide a more objective assessment of disease status and are not biased by experts' experience. They can potentially identify the characteristics of structural changes in a patient's brain as disease biomarkers. These biomarkers may improve disease diagnosis, patient treatment and the understanding of the disease mechanisms. As such, there is an urgent need to develop robust machine learning based algorithms for identifying such disease biomarkers.

Traditional methods for analyzing MRI brain images include voxel-based morphometry (VBM) (Ashburner and Friston, 2000), deformation-based morphometry (DBM) (Ashburner et al., 1998), and tensor-based morphometry (TBM) (Bossa et al., 2010; Hua et al., 2008).

VBM is based on a voxel-wise comparison of brain tissue density generated by segmentation. Brain tissue density changes caused by a disease can be captured (Ashburner and Friston, 2001). Deformation registrations are used to align the voxels, which ensures that the voxels at the same location in different brains represent the same brain region and thus that a voxel-wise comparison is meaningful.

DBM compares the deformation fields that are used to register the brain images (Ashburner et al., 1998; Chung et al., 2001). A deformation field describes the moving direction of every voxel in an alignment. It is optimized by an algorithm so that the voxels in the brain image and the template are aligned. The moving directions of some voxels differ between healthy control and diseased brains and therefore can be used as biomarkers. TBM examines the determinant of the Jacobian matrix, which contains the derivatives of the deformation field (Hua et al., 2008). If a determinant is larger than one, the brain region is expanded; otherwise, the brain region is contracted. The expansion and contraction of the brain regions can be used as biomarkers of the diseases. Both DBM and TBM derive biomarkers from the deformation field. As such, an accurate deformation registration is critical.

The biomarkers can be further analyzed by machine learning algorithms, such as support vector machine (SVM) (Focke et al., 2011), ensemble sparse classification (Liu et al., 2012) and multivariate searchlight classification (Uddin et al., 2011). SVM is one of the most widely used classifiers (Meyer et al., 2003). In SVM, the samples are mapped into a high-dimensional space and then a classification boundary is found that maximizes the margin between two classes. SVM thus has excellent generalizability and has been applied to brain image classification for various diseases, e.g., schizophrenia (Nieuwenhuis et al., 2012), attention-deficit/hyperactivity disorder (Seidman et al., 2011), autism spectrum disorder (Calderoni et al., 2012), Parkinson's disease (Focke et al., 2011) and Alzheimer's disease (Cuingnet et al., 2011; Klöppel et al., 2008; Vemuri et al., 2008). Several approaches were proposed to further increase the classification accuracy, e.g., multi-scale analysis based on wavelet transform (Hackmack et al., 2012), incorporating histogram distribution of gray matter density (Li et al., 2010), combining different classification algorithms (Wolz et al., 2011) and incorporating multi-source heterogeneous data (Yuan et al., 2012).

Because the previous approaches rely on deformation registration, their performance is highly sensitive to the registration accuracy (Cuingnet et al., 2011). Unfortunately, a good deformation registration is often difficult to achieve. The alignment algorithms try to minimize the differences between brain images; however, it is difficult to determine whether an observed brain anatomical difference is attributable to normal inter-subject variance or to disease status. The differences caused by disease may be removed in the registration process and cause an over-alignment problem (Toews et al., 2010). On the other hand, if the differences caused by normal inter-subject variance are not removed, the voxels at the same position in different brain images may represent different anatomical structures and cause problems in the subsequent analysis.

These difficulties associated with performing deformation registration often prevent the previous approaches from achieving high classification performance. By contrast, methods using local image features, such as features generated by scale invariant feature transform (SIFT), are robust to noise and distortions. SIFT is widely used in computer vision algorithms (Lowe, 2004). Toews et al. (2010) first proposed feature-based morphometry (FBM), which represents brain images by SIFT features. In FBM, the comparison of brain images is based on image features instead of voxels. Therefore voxel-level alignments are not required. The anatomical structures related to a disease could thus be preserved. The image features are clustered based on Euclidean distances between the vector representations of the features. Likelihood scores are calculated for these features to measure the correlation between the features and the disease statuses of the MR images. Highly correlated features are considered as biomarkers. Although FBM has been successfully applied to classify AD, it still suffers from the following drawbacks:

(i) The accuracy for evaluating new SIFT features can be further improved. In FBM, a nearest neighbor approach is used, which requires a large number of samples, while brain image datasets are usually small. Because SVM is able to infer the classification boundary that minimizes generalization error, and it usually works better than the nearest neighbor approach when the number of samples is small, we propose to identify the SIFT features using SVM.
(ii) Feature selection is not performed in the original FBM approach. The classification accuracy may be adversely affected by the noise features. In our approach, we propose to use the original FBM approach as an image feature preprocessing method. The likelihood score generated by FBM is used as a feature selection criterion.

In this paper, we develop a novel MRI analysis method based on SIFT features and SVM. In this method, the FBM approach is used in the image feature preprocessing step to remove noise features and identify disease-related and healthy-related features. SVMs are used to classify image features extracted from the testing brains. We obtain convincing results to demonstrate the feasibility of combining SVM with FBM, which has not been attempted before. In addition, we

demonstrate the wide applicability of our approach to neurological and psychiatric diseases by applying it to three different diseases: AD, PD and BPD.

2. Materials

2.1. Datasets

The proposed algorithm is tested on datasets for three diseases. We use the public datasets in OASIS (Marcus et al., 2007) for Alzheimer's disease, the public datasets in the Parkinson's Progression Markers Initiative (PPMI) (PPMI, 2011) for Parkinson's disease, and the dataset obtained by our group for bipolar disorder.

OASIS contains 416 subjects aged 18–96. We use only the subjects with age ≥ 60 (average ages for the healthy subjects and the patients are 76 and 77, respectively). Three to four images are available for each subject. All images are acquired on a 1.5-T Vision scanner (manufactured by Siemens) using the T1-weighted magnetization prepared rapid gradient-echo (MPRAGE) technique. Dementia statuses measured by Clinical Dementia Rating (CDR) scales are available. For the purpose of comparison, the brain images are organized into two subsets:

(1) AD-86: 86 subjects aged 60–80 years are chosen, including 20 patients with mild AD (CDR = 1) and 66 healthy subjects (CDR = 0);
(2) AD-126: subjects aged 60–96 years with CDR = 1 (patients) and CDR = 0 (healthy subjects) are chosen. There are 126 subjects, including 28 patients and 98 healthy subjects.

Besides the two subsets, Toews et al. (2010) used a third subset with 135 subjects that contains both very mild and mild AD. However, we found a different number of subjects with the same criterion. Therefore we will not compare the results on this subset.

PPMI contains brain images for PD and healthy controls that are acquired from different locations with different scanner parameters. We use subsets from PPMI with the subject ID ranges of 3600–3616, 3462–3450 and 3405–3429. We choose these subsets because all three subsets contain 3D T1-weighted images. The images are combined and divided into 2 subsets: (i) PPMI-15: 9 patients ranging from 46 to 78 years old (mean 63.7y, stdev 11.1y) and 6 healthy controls ranging from 42 to 66 years old (mean 56.8y, stdev 8.3y); (ii) PPMI-37: 18 patients ranging from 35 to 78 years old (mean 58.7y, stdev 11.5y) and 21 healthy controls ranging from 37 to 66 years old (mean 54.1y, stdev 9.6y). All images in PPMI-15 are acquired through the MPRAGE technique, while PPMI-37 contains images acquired through either FSPGR or MPRAGE. There should be some differences in the images acquired by different techniques. PPMI-37 can thus be used to test the robustness of the proposed approach.

The bipolar disorder dataset consists of 14 bipolar depressed patients and 9 demographically matched comparison subjects. Patients met DSM-IV criteria for type I bipolar depression and had Hamilton Depression Rating Scale (HDRS) total scores ≥ 20 and Young Mania Rating Scale (YMRS) total scores ≤ 12. Comparison subjects had no history of any Axis I psychiatric disorder and no first- or second-degree relatives with affective or psychotic disorders. The age of bipolar depressed subjects ranges from 20.4 to 33.8 years (mean 25.5y, stdev 4.3y). The age of healthy subjects ranges from 20.3 to 33.8 years (mean 26.5y, stdev 4.1y). All subjects were imaged on a 4 T Varian Unity INOVA whole-body MRI (Varian Inc., Palo Alto, CA) using protocols approved by the local Institutional Review Board. T1-weighted anatomical images were obtained using the Modified Driven Equilibrium Fourier Transform (MDEFT) technique (Lee et al., 1995).

2.2. Preprocessing

The preprocessed images are downloaded from OASIS. The preprocessing steps include averaging 3–4 scans of the same subjects, removing skulls, registering to Talairach space and bias field correction. Please refer to Marcus et al. (2007) for a detailed description of the preprocessing steps for this dataset. PPMI images are aligned to the ICBM452 template using SPM8 with default parameters. BPD images are aligned to the ICBM452 template using SPM5 with default parameters.

3. Methods

The framework of our approach is outlined in Fig. 1. Before applying the proposed image analysis algorithm, 3D brain images are first transformed into a set of 2D images. Then five major steps are performed to analyze the brain images. These steps include feature extraction, feature evaluation, feature labeling and selection, SVM training and finally classification. Before we introduce these steps, we will discuss a concern with respect to the huge number of SIFT features.

3.1. Dividing human brains into regions

A typical human brain contains about 50,000 SIFT features according to our experiment results as shown in Fig. 3. A good training set should contain at least a dozen brains. Therefore, more than a million SIFT features could be involved in the training step. The huge number of SIFT features poses calculation efficiency problems in two computation steps. First, the feature evaluation step will be affected because pair-wise distances between SIFT features will be calculated in this step. There are about 10^12 pairs for a million SIFT features, which require excessive computation time to calculate directly. Second, the SVM training step will be affected. If all SIFT features are used to train the SVM, it is likely that there are many highly similar SIFT features coming from different regions of the brain. These SIFT features are not easily distinguishable and will pose difficulty to the SVM training algorithm. Therefore it is necessary to divide the whole brain into several small regions and apply the algorithm to each region individually. This will limit the number of SIFT features in both the feature evaluation and SVM training steps. The classification results for the individual regions will be combined to obtain the final result. We realize that the division of brains might slightly affect the classification result because some of the features may be located on the boundary of the regions. We thus divide the brains into different numbers of regions and evaluate the classification result of each in order to determine the best division method. The effect of the brain division should be minimized.

3.2. Algorithm steps

3.2.1. Step 1: feature extraction

We use a 2D implementation of the SIFT algorithm to extract features from the brain images (Lowe, 1999). SIFT is widely used to identify salient features from images (Lowe, 2004). Fig. 2 shows an example of a brain slice with identified SIFT features. A 2D SIFT feature is described by 132 numbers: 1 number for feature scale, 2 numbers for center location, 1 number for orientation and 128 numbers for the appearance matrix, which characterizes the image appearance around the center of the feature

Fig. 1. Schematic diagram of the proposed approach. Our approach consists of a training process and a testing process. The training process contains the following steps: (1) the training MR images are aligned and smoothed; (2) the preprocessed images are sliced along three orientations to obtain 2D images; (3) the 2D SIFT algorithm is used to extract image features. The SIFT features are shown as circles on the right; (4) the features are evaluated using feature based morphometry (FBM). The right side shows an example. Three clusters of features (shown in different radii and colors) are discovered. They are assigned likelihood scores based on their numbers of appearances in patient and healthy brains; (5) the SIFT features are labeled as patient, healthy and noise features. The noise features are not used in the further analysis, except to train the SVMs to identify noise features in the testing images; (6) SVMs are trained for every brain region to classify SIFT features as patient, healthy or noise features; (7) the difference between the number of patient features and the number of healthy features is calculated; and finally (8) the subject is classified based on the difference. The testing process is similar to the training process but (9) the SVM classification step is used instead of the training steps. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
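Steps (4), (5), (7) and (8) of Fig. 1 can be illustrated with a minimal sketch. It follows our reading of the likelihood score in Section 3.2.2.2 and the labeling rule in Section 3.2.3, and it is not the authors' implementation. The function names, the handling of zero counts, and the default decision threshold of 0 are assumptions made for illustration; the likelihood threshold 0.8 is the value reported in the text.

```python
import math

def likelihood_score(n_patient_hits, n_healthy_hits, n_patients, n_controls):
    """Log ratio of normalized patient vs. healthy occurrences in a similar feature set.

    Returns 0 when the set holds fewer than one feature per brain on average.
    """
    if n_patient_hits + n_healthy_hits < n_patients + n_controls:
        return 0.0
    p = n_patient_hits / n_patients
    c = n_healthy_hits / n_controls
    # Assumption: the text does not say how empty groups are handled,
    # so an infinite score is used here for illustration only.
    if c == 0:
        return math.inf
    if p == 0:
        return -math.inf
    return math.log(p / c)

def label_feature(score, threshold=0.8):
    """Map a likelihood score to 1 (patient), -1 (healthy) or 0 (noise)."""
    if score > threshold:
        return 1
    if score < -threshold:
        return -1
    return 0

def classify_subject(feature_labels, decision_threshold=0.0):
    """Sum the feature labels of one brain and compare with a decision threshold.

    The default of 0 is a placeholder; the text derives the threshold
    from the training brains.
    """
    s_sum = sum(feature_labels)
    return "patient" if s_sum > decision_threshold else "healthy"

# Example with 20 patients and 20 controls (hypothetical counts).
score = likelihood_score(30, 10, 20, 20)
label = label_feature(score)
```

In this example the similar feature set contains 40 features, which meets the one-feature-per-brain requirement, and the patient features are three times as frequent as the healthy ones, so the feature is labeled as a patient feature.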

in more detail. In order to represent 3D volumes by 2D features, we add one more number to represent the slice number of the 2D slice from which the 2D features are extracted, i.e., the location of the SIFT feature is represented by 3 numbers. Therefore a SIFT feature in our paper is represented by 133 numbers.

We use a 2D SIFT algorithm mainly because 2D SIFT algorithms are more readily available, while there are very few 3D SIFT implementations and the available ones do not provide all the properties of the 2D counterparts. For example, the 3DSIFT Matlab package (Scovanner et al., 2007) does not provide rotational invariance and is intended for spatial–temporal 2D image stream analysis, and the n-SIFT C++ package (Cheung and Hamarneh, 2009) does not provide steps to remove the unstable key points and does not provide scale and orientation information in the output feature descriptor. The VLFeat package (Vedaldi and Fulkerson, 2010) is used to extract SIFT features from the 2D slices of MR images in this study.

When 2D SIFT features are being extracted, the 3D brain images are sliced along three orientations (coronal, axial and sagittal) to generate a series of 2D images. In order to retain the 3D location information, the locations of the SIFT features are recorded in the 3D format. For example, if we extract a feature at location (x, y) from the third image slice along the first orientation, the location of the feature will be (3, x, y). After all of the SIFT features are extracted, they are organized according to their locations and slice orientations. The SIFT features generated from different slice orientations are processed separately. The SIFT features with the same slice orientation are further divided into small regions according to their locations. The size of the regions is determined

Fig. 2. Illustration of SIFT features. The SIFT features are shown as circles in panel A. The centers of the circles are the center locations of the SIFT features. The radii represent the feature scales. The lines starting from the centers show the directions of the SIFT features. The appearance matrices are shown in panel B. The image around the SIFT feature is divided into 4 × 4 squares. The lines starting from the center of the squares show the gradient directions. Their lengths are proportional to the number of pixels with the same gradient directions. Take the lowest square as an example. The longest line starting from the center points in the upward direction. It means most of the pixels have the upward direction, i.e., the image intensities increase when moving upwards.
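The appearance matrix described in this caption can be sketched as a grid of gradient-direction histograms. The code below is a simplified illustration only and is not the VLFeat implementation used in the study: it counts pixels per direction as the caption describes, whereas full SIFT also weights votes by gradient magnitude and a Gaussian window and rotates the patch to the feature orientation. The function name, the 16 × 16 patch size and the use of 8 direction bins per square (which yields the 128 numbers mentioned in Section 3.2.1) are assumptions made for the example.

```python
import math

def appearance_histogram(patch, cells=4, bins=8):
    """Count gradient directions per square of a square image patch.

    patch is a list of rows (row 0 is the top of the image).
    Returns cells * cells * bins counts, ordered square by square.
    """
    size = len(patch)
    cell_size = size // cells
    hist = [0] * (cells * cells * bins)
    bin_width = 2 * math.pi / bins
    for r in range(size):
        for c in range(size):
            # Finite differences with edge replication at the patch border.
            up = patch[max(r - 1, 0)][c]
            down = patch[min(r + 1, size - 1)][c]
            left = patch[r][max(c - 1, 0)]
            right = patch[r][min(c + 1, size - 1)]
            dx = right - left
            dy = up - down  # positive when intensity increases upwards
            if dx == 0 and dy == 0:
                continue  # flat pixel, no direction to vote for
            angle = math.atan2(dy, dx) % (2 * math.pi)
            # Bins are centered on 0, 45, 90, ... degrees; bin 2 means "up".
            b = int(((angle + bin_width / 2) % (2 * math.pi)) / bin_width) % bins
            cell_row = min(r // cell_size, cells - 1)
            cell_col = min(c // cell_size, cells - 1)
            hist[(cell_row * cells + cell_col) * bins + b] += 1
    return hist

# A 16 x 16 patch whose intensity increases when moving upwards.
patch = [[15 - r for _ in range(16)] for r in range(16)]
descriptor = appearance_histogram(patch)
```

For this patch every pixel votes for the upward bin of its square, which mirrors the caption's example of a square whose longest line points upwards.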

through experiments and it is set to 20 × 20 × 20 in this study. For brain images with the size of 176 × 208 × 176, there are 9 × 11 × 9 regions for every slicing orientation. The total number of regions is 3 × 9 × 11 × 9 = 2673.

3.2.2. Step 2: feature evaluation

The SIFT features are evaluated using FBM. There are two steps to evaluating the SIFT features. In the first step we find the features similar to the feature that is being evaluated. In the second step likelihood scores are assigned to the SIFT features.

3.2.2.1. Building similar feature sets. Four measures are used to represent the similarity between two SIFT features $f_i$ and $f_j$: the distance between the center locations of the two SIFT features $\Delta_x(i, j)$, the scale difference $\Delta_s(i, j)$, the orientation difference $\Delta_o(i, j)$ and the difference between their appearance matrices $\Delta_a(i, j)$. They are defined as follows:

\[
\Delta_x(i, j) = \frac{\| x_i - x_j \|_2}{s_i}
\]

\[
\Delta_s(i, j) = \left| \ln \frac{s_j}{s_i} \right|
\]

\[
\Delta_o(i, j) = \min\left( |o_i - o_j|,\; 2\pi - |o_i - o_j| \right)
\]

\[
\Delta_a(i, j) = \| a_i - a_j \|_2
\]

where $x_i$, $s_i$, $o_i$ and $a_i$ are the center location, scale, orientation and appearance matrix of SIFT feature i, respectively. $\|x\|_2$ is the Euclidean norm of vector x. As $a_i$ and $a_j$ are matrices, they are vectorized before calculating the Euclidean norm.

A SIFT feature is similar to another one if all four similarity measures are below their corresponding thresholds. The similar feature set of SIFT feature i consists of all the SIFT features that are similar to SIFT feature i. It is defined as follows:

\[
S_i = \{ f_j : \Delta_x(i, j) < \varepsilon_x \,\wedge\, \Delta_s(i, j) < \varepsilon_s \,\wedge\, \Delta_o(i, j) < \varepsilon_o \,\wedge\, \Delta_a(i, j) < \varepsilon_a \}
\]

where $\varepsilon_x$, $\varepsilon_s$, $\varepsilon_o$ and $\varepsilon_a$ are similarity thresholds for center locations, scales, orientations and appearance matrices, respectively.

3.2.2.2. Assigning likelihood scores. The likelihood score of a SIFT feature is assigned according to the following equation:

\[
L_i =
\begin{cases}
\ln \dfrac{|S_i \cap P| / N_P}{|S_i \cap C| / N_C}, & |S_i| \ge N_P + N_C \\
0, & \text{otherwise}
\end{cases}
\]

where $S_i$ is the similar feature set for SIFT feature i, P is the patient feature set, C is the healthy feature set, $N_P$ is the number of patients, and $N_C$ is the number of healthy controls.

According to this equation, the likelihood score $L_i$ is non-zero if there is on average at least one SIFT feature per brain in the similar feature set and the number of patient features per patient differs from the number of healthy features per healthy control. The likelihood score is positive if there are more patient features than healthy features after this normalization. The likelihood score is negative if there are fewer patient features than healthy features.

3.2.3. Step 3: feature labeling and selection

We propose to classify the SIFT features into three categories. The first category of SIFT features appears more frequently in patient brains than healthy brains. We denote these SIFT features as patient features (likelihood score larger than a small threshold). Similarly, healthy features are the SIFT features that appear more frequently in healthy brains (likelihood score smaller than the negative of the threshold). The third category of SIFT features is noise features. They appear with almost equal frequency in both healthy brains and patient brains (the absolute value of the likelihood score does not exceed the threshold). According to the definition, noise features are not indicative of diseased or healthy status. Therefore noise features are not used in the classification process or in generating the heat-maps. However, we train SVMs to identify noise features so that the new image features extracted from testing brain images can be recognized and excluded from the classification process.

We assign a score of −1, 0 or 1 (representing a healthy feature, noise feature or patient feature, respectively) to every SIFT feature based on its likelihood score:

\[
S_i =
\begin{cases}
1, & L_i > \varepsilon_l \\
0, & |L_i| \le \varepsilon_l \\
-1, & L_i < -\varepsilon_l
\end{cases}
\]

where $\varepsilon_l$ is the threshold for the likelihood score. Because of random variations, the likelihood scores for noise features can be slightly larger or smaller than 0. The threshold $\varepsilon_l$ is used in order to avoid

misclassifying noise features as patient or healthy features. This threshold is chosen based on experiments. The threshold with the highest Equal Error Rate (EER) classification accuracy, i.e., 0.8, is used in this study. The number of non-noise features depends on the distribution of likelihood scores and the threshold. Usually fewer than 50,000 out of around 1,000,000 SIFT features are not classified as noise features.

3.2.4. Step 4: training SVM classifiers

The input samples to the SVM are a number of image features (133-dimensional vectors). The class labels of the image features are −1, 0 and 1 for healthy features, noise features and patient features, respectively. A 3-class SVM implementation from libSVM (Chang and Lin, 2011) is used. The training is performed for every brain region, as described in Section 3.1 (Dividing human brains into regions).

3.2.5. Step 5: predicting brain images

For a brain image to be classified, we first perform feature extraction and then the SIFT features are classified by the trained SVMs from the previous step. Generally speaking, if there are more patient features than healthy features, the brain is classified as patient, and vice versa. The features that are predicted as noise features will not affect the final prediction result. However, as we have divided the brain into small regions, the SIFT features in each region are classified by the corresponding SVM.

The final classification result should be the sum of the scores $S_{sum} = \sum_i S_i$ of all features (because $S_i = 0$ for the noise features, they will not affect the final score $S_{sum}$). Ideally the brain is classified as a patient brain if $S_{sum}$ is positive and is classified as a healthy control otherwise. However, this simple approach will cause problems. Some patient brains may contain only a small number of patient features and therefore $S_{sum}$ may not be positive. We thus need to set a threshold $\varepsilon_c$ and determine the final classification result as follows:

\[
\text{Class label} =
\begin{cases}
\text{Patient}, & \text{if } S_{sum} > \varepsilon_c \\
\text{Healthy}, & \text{otherwise}
\end{cases}
\]

We obtain this threshold by classifying the training brains and then finding the threshold that minimizes the difference between false negatives and false positives. We can then predict new brains using this threshold.

3.3. Result measures

Heat-maps are drawn to show the predictive power of every brain region. The predictive power is measured by the number of group-related features. If the total number of patient features and healthy features in a brain region is large, the region has a high predictive power. We may identify disease-related brain regions from the heat-maps.

The performance of the algorithm is assessed by comparing the Receiver Operating Characteristic (ROC) curves and two numerical measures of the ROC curves. An ROC curve describes how the true positive rate and false positive rate change as the threshold of the classifier changes. As it is hard to compare two ROC curves, we use two measures, i.e., Equal Error Rate (EER) classification rate and Area Under ROC Curve (AUC). The EER classification rate is calculated by first choosing a threshold so that the false positive rate is equal to the false negative rate and then calculating the classification accuracy with the chosen threshold. We use the EER classification rate partly because we would like to compare the proposed algorithm to Toews et al.'s approach. In practice, we may choose a threshold to minimize the risk of misclassification.

As the calculation of the EER classification rate uses the class labels of the MR images, it is not a good measure for evaluating the performance of the classifier on MR images whose class labels are unknown. The class labels of the MR images are unknown in computer-aided diagnosis. In order to measure the performance of the classifier on such images, we use accuracy on unknown data as another performance measure. The threshold for the final classifier is determined by first using the classifier to classify the training MR images and then choosing a threshold so that the false positive rate for the training MR images is equal to the false negative rate for the training MR images. The threshold is applied in the classification of the testing MR images. The proportion of correctly classified testing MR images is the accuracy on unknown data. This accuracy should be the real accuracy that we can expect when the classification algorithm is used in real-world scenarios.

3.4. Experiment evaluation method

Leave-one-out cross-validation is performed for all datasets. Every image is chosen as a testing image once and the remaining images are used for training. The heat-map is drawn based on the training results on all images in the dataset. The EER classification rate, AUC and accuracy on unknown data are calculated based on the method introduced in Section 3.3.

For the purpose of comparison, we have also implemented VBM and DBM approaches using SPM8 and SVM. We apply both the unified segmentation tool and the segmentation tool named New Segmentation in SPM8 to generate gray matter tissue density maps and the deformation fields. The methods using unified segmentation are denoted as VBM1 and DBM1, respectively, while the methods using New Segmentation are denoted as VBM2 and DBM2, respectively. Please refer to the supplementary material for the details of the implementations.

3.5. Parameter settings

The size of the divided small regions is 20 × 20 × 20. The size of the regions affects the accuracy and speed of feature evaluation. Similar SIFT features that are located near the region boundaries may be divided into two regions by the boundaries. As likelihood scores are calculated within individual regions, the likelihood scores for these near-boundary SIFT features are not accurate. The speed is also affected by the size of the regions. The number of SIFT features increases and the speed of the algorithm decreases as the size of the regions increases. The size of 20 × 20 × 20 is chosen in order to balance the speed and the accuracy of the algorithm.

There are several thresholds including $\varepsilon_x$, $\varepsilon_s$, $\varepsilon_o$, $\varepsilon_a$ and $\varepsilon_l$. The first two thresholds are set generously as $\varepsilon_x = 0.5$ and $\varepsilon_s = 2/3$. The last three thresholds are set based on a grid search on different parameter values. The parameter values are finally chosen as: $\varepsilon_o = \pi/2$, $\varepsilon_a = 0.45$, and $\varepsilon_l = 0.8$.

4. Results

4.1. Assessing prediction performance

To demonstrate the broad applicability of our method to neurological and psychiatric diseases, we have applied the proposed approach to three diseases: AD, PD and BPD. Leave-one-out experiments are performed. The performance of the algorithms is measured by EER accuracy, Area Under ROC Curve (AUC) and real accuracy (defined in Section 3.3).

Our approach outperforms VBM and DBM in all the datasets except the EER accuracy of VBM1 on AD-126. VBM1 outperforms

Table 1
Comparison of approaches. The EER classification rate is the percentage of correctly classified subjects at the threshold that makes the false positive rate equal to the false negative rate. The best EER accuracy for every dataset is shown in bold. Accuracy is the percentage of correctly classified subjects at the threshold obtained based on training subjects only. AUC is the area under the receiver operating characteristic curve.

Data sets   Toews et al.'s Naïve Bayes approach   VBM 1     VBM 2     DBM 1     DBM 2     Our approach
            EER accuracy (%)                      EER (%)   EER (%)   EER (%)   EER (%)   EER (%)   Accuracy (%)   AUC

AD-86       80                                    79        79        70        64        83        80             0.919
AD-126      70                                    77        71        62        56        71        74             0.803
PPMI-15     -                                     67        67        73        73        87        80             0.907
PPMI-37     -                                     51        -         46        51        73        68             0.780
BPD         -                                     57        -         48        57        70        57             0.806
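
The measures reported in Table 1 can be illustrated with a small sketch. It assumes each subject has one summed score, that a subject is called a patient when the score exceeds the threshold, and that candidate thresholds are the observed score values. The function names and the toy data are hypothetical, and this is not the authors' implementation.

```python
def confusion_counts(scores, labels, c):
    """Count (TP, FP, FN, TN) when a subject is called a patient if score > c."""
    tp = fp = fn = tn = 0
    for s, is_patient in zip(scores, labels):
        predicted_patient = s > c
        if predicted_patient and is_patient:
            tp += 1
        elif predicted_patient:
            fp += 1
        elif is_patient:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def candidate_thresholds(scores):
    # One threshold below every score, then each distinct score value.
    return [float("-inf")] + sorted(set(scores))


def pick_threshold(scores, labels):
    """Training-set threshold that minimises |FN - FP| (assumed reading of Section 3.2)."""
    def gap(c):
        _, fp, fn, _ = confusion_counts(scores, labels, c)
        return abs(fn - fp)
    return min(candidate_thresholds(scores), key=gap)


def eer_accuracy(scores, labels):
    """Accuracy at the threshold where false positive and false negative rates are closest."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos

    def rate_gap(c):
        _, fp, fn, _ = confusion_counts(scores, labels, c)
        return abs(fp / n_neg - fn / n_pos)

    c = min(candidate_thresholds(scores), key=rate_gap)
    tp, _, _, tn = confusion_counts(scores, labels, c)
    return (tp + tn) / len(labels)


def auc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for c in candidate_thresholds(scores):
        tp, fp, _, _ = confusion_counts(scores, labels, c)
        points.append((fp / n_neg, tp / n_pos))  # (FPR, TPR)
    points.sort()
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: True marks a patient, False a healthy control.
toy_scores = [2.0, 1.5, 0.3, -0.2, -1.0, -1.4]
toy_labels = [True, True, False, True, False, False]
```

Note that the EER rate is computed on the evaluated subjects themselves, whereas the "Accuracy" column uses a threshold chosen on training subjects only, which is why the two can differ.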

VBM2 in AD-126 and BPD. VBM2 is not able to classify PPMI-37 and BPD. An investigation into the execution process of the algorithm shows that the SVM uses all the training samples as support vectors, which suggests the trained SVM is unable to learn a generalized model. In other words, the classification model is specifically fitted to the training sample and therefore is unable to classify test subjects correctly.
Our algorithm outperforms Toews et al.'s Naïve Bayes approach for both AD subsets (Table 1). The classification accuracies for AD-86 are higher than for AD-126, which is consistent with the results in Toews et al. (2010). This is because AD-86 contains younger subjects than AD-126. The older subjects are considered more difficult to classify because some of the anatomical changes that occur in normal aging are similar to the anatomical abnormalities that exist in patients with AD.
The classification accuracies for PPMI-37 using different approaches are lower than for PPMI-15. This is likely because PPMI-37 contains MR images from different MRI scanners, making PPMI-37 more difficult to classify. However, there is also a possibility that PPMI-15 is easier to classify because the algorithm detects the age-related differences (the mean age of the patient subjects is 6 years younger than healthy subjects for PPMI-15). The EER accuracy for PPMI-37 using VBM and DBM approaches is around 50%, which suggests that these methods are not sufficiently robust to be able to classify images obtained using different scanning techniques. The accuracy for PPMI-37 using our approach is more than 20 percentage points higher than with the VBM and DBM approaches, which suggests our approach is more suitable to PPMI-37.
The EER classification accuracy for BPD is slightly worse than for AD-126 but the AUC for BPD is higher. It is observed that the real accuracy for BPD is low. This may be partly due to the smaller number of training MR images for BPD.

4.2. Feature evaluation results

We hypothesize that there are a large number of noise features that would affect the accuracy of image analysis methods. We show the distribution of likelihood ratios for all of the three diseases in Fig. 3 to verify this hypothesis. There are a small number of SIFT features with large positive (or negative) scores. These are patient (or healthy) features. The features with near-zero scores are noise features. The number of noise features is much larger than the number of patient features and healthy features. The most informative features have absolute scores close to 3, which means the number of occurrences of the similar features in healthy control (or patient) brains is e^3 ≈ 20.1 times the number of occurrences in patient (or healthy) brains. The distributions of likelihood ratios are similar for the three diseases, except there is a difference in the number of SIFT features.

4.3. Identification of disease-related regions

We show the spatial distribution of the likelihood ratios for representative slices of each disease in Fig. 4 and the other slices in Supplementary Fig. 1. The color of the squares shows the sum of the absolute likelihood scores in the corresponding brain region. Of note, several regions with high absolute likelihood scores (red squares in the graph) are known brain regions that are believed to be associated with the diseases.
To verify the discriminative brain regions identified by our algorithm, we map each brain region to one of the 12 brain lobes by locating the center coordinates using the Talairach software (http://www.talairach.org/). We consider only the top 20 brain regions ranked in descending order according to their absolute sum of likelihood scores. These identified highly discriminative regions

Fig. 3. Histograms of the likelihood scores for different diseases. A large number of SIFT features have likelihood scores close to 0. Only a small number of SIFT features have
large likelihood scores and are useful in the analysis process. The SIFT features with large positive scores are the features that appear frequently in patient brains; while the
SIFT features with large negative scores are the features that appear frequently in healthy brains.
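
The reading of the scores in Fig. 3 can be sketched as follows. The text implies that a score is a natural-log ratio of occurrence counts, so a score of 3 corresponds to a ratio of e^3 ≈ 20.1. The additive smoothing term and the noise cutoff below are assumptions introduced only for illustration; neither value comes from the paper, and the function names are hypothetical.

```python
import math


def likelihood_score(n_patient, n_healthy, smoothing=1.0):
    """Natural-log ratio of occurrence counts (patient over healthy).

    The additive smoothing is an assumption to avoid division by zero.
    """
    return math.log((n_patient + smoothing) / (n_healthy + smoothing))


def label_feature(score, noise_cutoff=1.0):
    """Three-way label; the cutoff value is hypothetical."""
    if score > noise_cutoff:
        return "patient"   # appears far more often in patient brains
    if score < -noise_cutoff:
        return "healthy"   # appears far more often in healthy brains
    return "noise"         # appears about equally often in both groups


# A score of 3 corresponds to an occurrence ratio of exp(3).
ratio_at_score_3 = math.exp(3)
```

With these assumptions, a feature seen 20 times in patients and never in controls scores about 3.04 and is labeled a patient feature, while a feature seen equally often in both groups scores 0 and is labeled noise.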

Fig. 4. Spatial distribution of the SIFT features for representative slices of each disease. The figure shows the spatial distribution of the sum of absolute likelihood scores of
the SIFT features in every brain region in the representative slices of each disease. Regions with higher absolute likelihood scores are considered highly predictive of disease
status. (For interpretation of the references to color in the text, the reader is referred to the web version of the article.)
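
The per-region quantity plotted in Fig. 4, and the top-20 ranking used to pick highly predictive regions, can be sketched as below. The sketch assumes feature positions are given in voxel coordinates with the origin at a corner of the volume and that regions are 20 × 20 × 20 cubes on a regular grid; the function names and toy features are hypothetical.

```python
from collections import defaultdict

REGION_SIZE = 20  # side length of a cubic region, per Section 3.5


def region_of(position, size=REGION_SIZE):
    """Grid index of the cube that contains a feature position (x, y, z)."""
    return tuple(int(c) // size for c in position)


def region_heat(features, size=REGION_SIZE):
    """Sum of absolute likelihood scores per region.

    `features` is an iterable of ((x, y, z), score) pairs.
    """
    heat = defaultdict(float)
    for position, score in features:
        heat[region_of(position, size)] += abs(score)
    return dict(heat)


def top_regions(heat, k=20):
    """Regions ranked in descending order of summed absolute score."""
    return sorted(heat, key=heat.get, reverse=True)[:k]


# Toy features: positions with signed likelihood scores.
toy_features = [
    ((5, 5, 5), 2.5),
    ((12, 3, 19), -1.5),
    ((25, 5, 5), 0.2),
    ((45, 45, 45), -3.0),
]
toy_heat = region_heat(toy_features)
```

Taking the absolute value means that regions rich in either patient or healthy features rank highly, which matches the paper's notion of predictive power as the amount of group-related features in a region.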

Table 2
Highly predictive brain regions for three neurological and psychiatric diseases. The highly predictive brain regions are the ones with high absolute sum of likelihood scores.
These regions are small cubes in the 3D brain volumes. The brain regions containing these cubes are listed in this table.

Disease   Highly predictive brain regions

AD        Limbic lobe, Frontal lobe, Sub-lobar, Temporal lobe
BPD       Limbic lobe, Frontal lobe, Sub-lobar, Posterior lobe, Parietal lobe
PD        Limbic lobe, Frontal lobe, Sub-lobar, Midbrain, Pons, Posterior lobe, Occipital lobe

may indicate that these brain regions are affected by the diseases (Table 2). We find that the limbic lobe, frontal lobe and sub-lobar region are affected by all three diseases. The limbic lobe includes regions, such as the hippocampus and amygdala, that are affected by a variety of neurodegenerative disorders. The frontal lobe is also a very reasonable region considering its important roles in activities such as voluntary movement, short-term memory and consciousness, as well as the symptoms of the disorders. Furthermore, a number of papers support these three regions (Cvetkovic-Dozic et al., 2001; Davie, 2008; Strakowski et al., 2004). Aside from the three common regions, AD is found to be associated with the temporal lobe, which is consistent with the findings reviewed in Cvetkovic-Dozic et al. (2001). For BPD, our algorithm has identified the uvula, which forms a considerable portion of the inferior vermis supported by Strakowski's group (Strakowski et al., 2004) and is mapped to the posterior lobe based on the anatomical hierarchy defined in the Talairach. In addition, midbrain and pons are recognized as two unique regions for PD. Support for these two regions can be found in Davie (2008). In conclusion, our algorithm may have captured the unique signature of each disease, and the identified regions are consistent with the findings reported in existing literature.

5. Discussion

This paper introduces a novel MRI analysis method based on SIFT features and support vector machine (SVM). The proposed algorithm includes two novel steps, i.e., the image feature preprocessing using feature-based morphometry (FBM) and the classification of local image features based on ensemble SVMs.
In the proposed approach, FBM is used as an image feature preprocessing method. The likelihood scores calculated based on FBM are used to identify noise features, which are not used in further analysis, and also to identify patient features and healthy features. Based on the image feature preprocessing method, it is possible for many different advanced machine learning techniques to be applied in combination with the FBM approach. In this paper, we explored the use of ensemble SVM to classify the image features. The good generalization ability of SVM enables the accurate classification of image features for a wide range of training sample sizes.
We also demonstrate in our study that our approach is capable of utilizing MR images from multiple sources and is widely applicable to various neurological and psychiatric diseases. In the following sections, we will elaborate each of these aspects in greater detail.

5.1. Support vector machine

The state-of-the-art machine learning approach SVM is used to classify the local image features in our approach. We have two goals. The first goal is to establish the applicability of SVM for the classification of local image features. The second goal is to obtain better performance.
SVMs have been successfully applied to various areas. The question here is whether SVMs are suitable for MR image classification based on local image features. We tried different ways of using SVMs, different SVM kernels and different parameter settings to search for a best performing algorithm. The final experimental result of our proposed approach enables us to provide a positive answer to that question: SVMs are suitable for MR image classification based on local image features with the elimination of noise features. The proposed approach is applied to three diseases: AD, PD and BPD. The worst EER accuracy for all the diseases is 70%. For the AD-86 and the PPMI 15-brain data set, the EER classification accuracies are higher than 80%. These results suggest that the SVM is indeed applicable to both small and large datasets.
Comparison with Toews et al.'s approach on the two AD datasets (Toews et al., 2010) shows that the SVM approach outperforms Toews et al.'s approach in terms of EER accuracy.

5.2. Classifying features into three categories

We develop a novel framework of analyzing MR images based on three-class classifiers and the FBM approach. A straightforward way of classifying the image features is to classify them into healthy features (i.e., the features that appear more frequently in healthy subjects) and patient features (i.e., the features that appear more frequently in patient subjects). However, there is a difficulty caused

by the existence of a large number of features that appear almost equally frequently in both patient brains and healthy brains. Random variations may cause these features to appear slightly more frequently in patient or healthy brains. It is difficult to correctly classify the noise features as either healthy features or patient features. To address this problem, we propose to label these SIFT features as noise features and train three-class SVMs to identify all three categories of features, i.e., noise features, healthy features and patient features. In the testing steps, the noise features can be identified by the three-class SVMs and removed from the following classification process. The proposed approach of classifying the SIFT features into three categories enables us to apply other classifiers to the classifying of MR images. Future work should be done to obtain a better classification performance by applying different classification algorithms.

5.3. Ensemble approach

We use an ensemble approach to predict the final class label. The idea of the ensemble approach is to build a high-accuracy predictive model by combining the results of multiple low-accuracy models (Rokach, 2010). Ensemble classifiers have been used in several VBM approaches (Hinrichs et al., 2009; Liu et al., 2012). The classifiers are built based on gray matter densities of each voxel. We cannot apply this method to local image features in a straightforward way, because the brain images are not aligned using deformation registration. In our approach, we consider one image feature as an instance to be classified. The classification results for all image features in one region are used to vote for the classification result for the region. As the brains are coarsely aligned, the same region in different brain images should represent the same anatomical structures. The ensemble classification can be performed at brain region level, i.e., the classifier for every brain region is a predictive model and the top-performing predictive models are chosen to vote for the final classification result.
We have tried one of the ensemble classification approaches in our research. Other ensemble approaches are worth studying in the future. Because the classification is based on local image features, which are independent observations of the brain images, it is possible to use a local image feature based classifier as a component classifier in combination with other component classifiers built from gray matter densities to form a more predictive classifier.

5.4. Analysis of MR images from multiple sources

Because the SVM approach uses SIFT to extract image features, the image features are expected to be invariant across images acquired from different scanners with different acquisition techniques and parameters. We use PPMI-37, which contains images from different scanners with different parameter settings, to test the robustness of our approach. The performance measure for PPMI-37 is lower than for PPMI-15, which contains images from only one scanner with the same parameter settings. This result suggests that our approach is able to analyze MR images from different scanners and parameter settings. The accuracy obtained by VBM and DBM on PPMI-37 is only around 50%, which suggests VBM and DBM are not suitable to be used to classify such highly heterogeneous brain images.

5.5. Identification of disease related brain regions

We show the spatial distribution of likelihood ratio as a heat-map. The overall distribution of disease related SIFT features can be easily observed from the heat-map. We show the heat-maps for the three diseases and compare the disease-related regions. The known brain regions that are affected by the diseases can be identified in the heat-maps. The comparison of the disease-affected brain regions among the three diseases shows several common regions. These common regions can be explained because these diseases have several similar syndromes, and these syndromes are caused by the common brain regions that are affected by the diseases. The identified brain regions may help us understand the disease pathology. They can also be used as biomarkers for guiding disease treatment. The analysis of disease-affected brain regions provides us a consistent way of comparing the mechanism and effect of different diseases.
In summary, our approach represents a significant advancement in analyzing brain images. It will be a promising tool for assisting automatic diagnosis and advancing mechanism studies of neurological and psychiatric diseases.

Acknowledgments

YC and LJL conceived the idea and designed the experiments. YC designed and implemented the proposed algorithm. JS and JHL provided imaging data for bipolar disorder. JS performed the image registration for bipolar disorder. LT performed the image registration for PPMI brain images and analyzed the disease related brain regions for the three diseases. LJM offered critical suggestions on the classification algorithms. All the authors have reviewed and contributed to the text writing. We would also like to thank Dr. Scott K. Holland for critiquing the manuscript. This work is supported by CCHMC CCTST Methodology grant awarded to LJL as part of an Institutional Clinical and Translational Science Award (NIH/NCRR 8UL1TR000077-04).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jneumeth.2013.09.001.

References

Ashburner J, Friston KJ. Voxel-based morphometry – the methods. Neuroimage 2000;11:805–21.
Ashburner J, Friston KJ. Why voxel-based morphometry should be used. Neuroimage 2001;14:1238–43.
Ashburner J, Hutton C, Frackowiak R, Johnsrude I, Price C, Friston K. Identifying global anatomical differences: deformation-based morphometry. Human Brain Mapping 1998;6:348–57.
Bossa M, Zacur E, Olmos S. Tensor-based morphometry with stationary velocity field diffeomorphic registration: application to ADNI. Neuroimage 2010;51:956–69.
Calderoni S, Retico A, Biagi L, Tancredi R, Muratori F, Tosetti M. Female children with autism spectrum disorder: an insight from mass-univariate and pattern classification analyses. Neuroimage 2012;59:1013–22.
Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2011;2:1–27.
Cheung W, Hamarneh G. n-SIFT: n-dimensional scale invariant feature transform. Transactions on Image Processing 2009;18:2012–21.
Chung MK, Worsley KJ, Paus T, Cherif C, Collins DL, Giedd JN, et al. A unified statistical approach to deformation-based morphometry. Neuroimage 2001;14:595–606.
Cuingnet R, Gerardin E, Tessieras J, Auzias G, Lehéricy S, Habert M-O, et al. Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database. Neuroimage 2011;56:766–81.
Cvetkovic-Dozic D, Skender-Gazibara M, Dozic S. Neuropathological hallmarks of Alzheimer's disease. Archive of Oncology 2001;9:195–9.
Davie CA. A review of Parkinson's disease. British Medical Bulletin 2008;86:109–27.
Focke NK, Helms G, Scheewe S, Pantel PM, Bachmann CG, Dechent P, et al. Individual voxel-based subtype prediction can differentiate progressive supranuclear palsy from idiopathic parkinson syndrome and healthy controls. Human Brain Mapping 2011;32:1905–15.
Hackmack K, Paul F, Weygandt M, Allefeld C, Haynes J-D. Multi-scale classification of disease using structural MRI and wavelet transform. Neuroimage 2012;62:48–58.
Hinrichs C, Singh V, Mukherjee L, Xu G, Chung MK, Johnson SC. Spatially augmented LPboosting for AD classification with evaluations on the ADNI dataset. Neuroimage 2009;48:138–49.

Hua X, Leow AD, Parikshak N, Lee S, Chiang MC, Toga AW, et al. Tensor-based morphometry as a neuroimaging biomarker for Alzheimer's disease: an MRI study of 676 AD, MCI, and normal subjects. Neuroimage 2008;43:458–69.
Klöppel S, Stonnington CM, Chu C, Draganski B, Scahill RI, Rohrer JD, et al. Automatic classification of MR scans in Alzheimer's disease. Brain 2008;131:681–9.
Lee J-H, Garwood M, Menon R, Adriany G, Andersen P, Truwit CL, et al. High contrast and fast three-dimensional magnetic resonance imaging at high fields. Magnetic Resonance in Medicine 1995;34:308–12.
Li X, Messé A, Marrelec G, Pélégrini-Issac M, Benali H. An enhanced voxel-based morphometry method to investigate structural changes: application to Alzheimer's disease. Neuroradiology 2010;52:203–13.
Liu M, Zhang D, Shen D. Ensemble sparse classification of Alzheimer's disease. Neuroimage 2012;60:1106–16.
Lowe DG. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 2004;60:91.
Lowe DG. Object recognition from local scale-invariant features. In: International conference on computer vision (ICCV). Corfu, Greece: IEEE Computer Society; 1999. p. 1150–7.
Marcus DS, Wang TH, Parker J, Csernansky JG, Morris JC, Buckner RL. Open access series of imaging studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. Journal of Cognitive Neuroscience 2007;19:1498–507.
Meyer D, Leisch F, Hornik K. The support vector machine under test. Neurocomputing 2003;55:169–86.
Nieuwenhuis M, van Haren NEM, Hulshoff Pol HE, Cahn W, Kahn RS, Schnack HG. Classification of schizophrenia patients and healthy controls from structural MRI scans in two large independent samples. Neuroimage 2012;61:606–12.
PPMI. The Parkinson Progression Marker Initiative (PPMI). Progress in Neurobiology 2011;95:629–35.
Rokach L. Ensemble-based classifiers. Artificial Intelligence Review 2010;33:1–39.
Scovanner P, Ali S, Shah M. A 3-dimensional sift descriptor and its application to action recognition. In: Proceedings of the 15th international conference on multimedia. Augsburg, Germany: ACM; 2007. p. 357–60.
Seidman LJ, Biederman J, Liang L, Valera EM, Monuteaux MC, Brown A, et al. Gray matter alterations in adults with attention-deficit/hyperactivity disorder identified by voxel based morphometry. Biological Psychiatry 2011;69:857–66.
Strakowski SM, DelBello MP, Adler CM. The functional neuroanatomy of bipolar disorder: a review of neuroimaging findings. Molecular Psychiatry 2004;10:105–16.
Toews M, Wells W III, Collins DL, Arbel T. Feature-based morphometry: discovering group-related anatomical patterns. Neuroimage 2010;49:2318–27.
Uddin LQ, Menon V, Young CB, Ryali S, Chen T, Khouzam A, et al. Multivariate searchlight classification of structural magnetic resonance imaging in children and adolescents with autism. Biological Psychiatry 2011;70:833–41.
Vedaldi A, Fulkerson B. Vlfeat: an open and portable library of computer vision algorithms. In: Proceedings of the international conference on multimedia. Firenze, Italy: ACM; 2010. p. 1469–72.
Vemuri P, Gunter JL, Senjem ML, Whitwell JL, Kantarci K, Knopman DS, et al. Alzheimer's disease diagnosis in individual subjects using structural MR images: validation studies. Neuroimage 2008;39:1186–97.
Wolz R, Julkunen V, Koikkalainen J, Niskanen E, Zhang DP, Rueckert D, et al. The Alzheimer's disease neuroimaging I. Multi-method analysis of mri images in early diagnostics of alzheimer's disease. PLOS ONE 2011;6:e25446.
Yuan L, Wang Y, Thompson PM, Narayan VA, Ye J. Multi-source feature learning for joint analysis of incomplete multiple heterogeneous neuroimaging data. Neuroimage 2012;61:622–32.
