
Accepted Manuscript

Semi-automatic liver tumor segmentation with hidden Markov measure field model and non-parametric distribution estimation
Yrjö Häme, Mika Pollari
PII: S1361-8415(11)00093-4
DOI: 10.1016/j.media.2011.06.006
Reference: MEDIMA 621
To appear in: Medical Image Analysis
Received Date: 13 August 2010
Revised Date: 13 June 2011
Accepted Date: 16 June 2011
Please cite this article as: Häme, Y., Pollari, M., Semi-automatic liver tumor segmentation with hidden Markov measure field model and non-parametric distribution estimation, Medical Image Analysis (2011), doi: 10.1016/j.media.2011.06.006
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers
we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting proof before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Semi-automatic liver tumor segmentation with hidden Markov measure field model and non-parametric distribution estimation
Yrjö Häme^{1,*}, Mika Pollari
Department of Biomedical Engineering and Computational Science, Aalto University School of Science
P.O. Box 12200, FI-00076 AALTO, Finland
Abstract
A novel liver tumor segmentation method for CT images is presented. The aim of this work was to reduce the manual labor and time required in the treatment planning of radiofrequency ablation (RFA), by providing accurate and automated tumor segmentations reliably. The developed method is semi-automatic, requiring only minimal user interaction. The segmentation is based on non-parametric intensity distribution estimation and a hidden Markov measure field model, with application of a spherical shape prior. A post-processing operation is also presented to remove the overflow to adjacent tissue. In addition to the conventional approach of using a single image as input data, an approach using images from multiple contrast phases was developed. The accuracy of the method was validated with two sets of patient data, and artificially generated samples. The patient data included preoperative RFA images and a public data set from 3D Liver Tumor Segmentation Challenge 2008. The method achieved very high accuracy with the RFA data, and outperformed other methods evaluated with the public data set, receiving an average overlap error of 30.3%, which represents an improvement of 2.3 percentage points over the previously best performing semi-automatic method. The average volume difference was 23.5%, and the average, the RMS, and the maximum surface distance errors were 1.87, 2.43, and 8.09 mm, respectively. The method produced good results even for tumors with very low contrast and ambiguous borders, and the performance remained high with noisy image data.
Keywords:
Liver tumor segmentation, Semi-automatic segmentation, Hidden Markov measure field model
1. Introduction
Liver tumor segmentation has several applications, such as treatment planning and evaluation, and computer-assisted surgery. Manual delineation of tumors is time-consuming and laborious, and the results depend on the observer. For these reasons, there has been increasing research interest directed at segmentation methods that take advantage of existing computing capabilities. The need for method development is underlined by the fact that liver cancer is among the five cancers causing the most deaths worldwide, and metastatic lesions are also common in the liver (Friedman et al., 2003).

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 223877.
* Corresponding author
Email addresses: yh2475@columbia.edu (Yrjö Häme), mika.pollari@tkk.fi (Mika Pollari)
URL: http://users.tkk.fi/yhame/ (Yrjö Häme)
1 Present address: Dept. of Biomedical Engineering, Columbia University, New York, NY, USA

Contrast-enhanced computed tomography (CECT) is most commonly used for liver lesion evaluation and staging after initial ultrasound imaging (Hann et al., 2000). The imaging is commonly performed in two or three phases that correspond to the different times at which the contrast agent arrives at the liver through the dual blood supply of the organ (Baron, 1994). In addition, native computed tomography (CT) imaging is commonly performed.
The correct timing of the CECT imaging phases is difficult due to variability between patients, which often makes the image data sub-optimal. Typical CT data also has a relatively high level of noise, and as the contrast between the tumor and parenchyma is often low, the tumor may be difficult to detect, and even more so to reliably delineate. In addition to the limitations of the imaging method, liver tumor segmentation is also complicated by the variability of tumors in size and structure, and by the fact that they may appear practically anywhere within the organ.
State-of-the-art segmentation methods offer reductions in the amount of required user interaction, repeatable results and accuracy comparable with manual segmentations. For the overall treatment process, these traits reduce expenses and increase the process reliability. Segmentation methods that require only minimal initial user interaction, i.e. semi-automatic methods, have been the recent focus of research. They have proved to be able to provide reliable results with accuracy similar to interactive methods (Deng and Du, 2008). Fully automatic methods generally suffer from lower accuracy and robustness, as well as a significantly higher computational cost.
The semi-automatic method by Smeets et al. (2009) is based on a level set method fitted on a fuzzy classification of the image data. The method performed well in the 3D Liver Tumor Segmentation Challenge 2008 (LTS08) (Deng and Du, 2008) but its accuracy declines if the tumor has a low-contrast edge. In addition, since the classification assumes normal distributions for the classes, it does not perform as well if the tumor is adjacent to structures other than healthy liver tissue.
Another semi-automatic method by Moltz et al. (2008) estimates typical tumor and parenchyma intensities based on input from the user, and defines thresholds for region growing based on these estimates. The result is post-processed with morphological operations. The method also performed well with the LTS08 data, but it encounters difficulties with tumors that have inhomogeneous intensity distributions.
A more general approach to lesion segmentation presented by Jolly and Grady (2008) also estimates the intensity distribution based on user-given points, and the segmentation is based on a fuzzy connectedness algorithm that finds a cost value for every image point. The method is able to segment various kinds of tumors, as shown by an extensive evaluation. However, the segmentation accuracy leaves room for improvement and it does not perform well with heterogeneous tumors.
Other related work includes the semi-automatic method by Li et al. (2006), where tumor boundaries are located with a machine learning-based classifier, and the liver structure segmentation method by Freiman et al. (2008), which uses a multi-class Bayesian classifier and morphological operations for adjustments. In addition, a recent publication includes a benchmark study of three semi-automatic methods (Zhou et al., 2010).
Tumors with low contrast are challenging for these tumor segmentation methods, especially if the image has a high level of noise. Ambiguous borders cause difficulties in particular for boundary-based segmentation methods. This has created a need for a method that is able to perform reliably and accurately with these characteristics present, without increasing the amount of required user interaction.
The target application of this work was radiofre-
quency ablation (RFA) treatment planning, where reli-
able tumor segmentations are needed for accurate nee-
dle placement. Tumors treated with RFA are typically
relatively small in size, with diameters of less than 5 cm,
and they have a generally spherical shape (Gazelle et al.,
2000). A spherical shape is typical also for metastatic
lesions in the liver (Halvorsen et al., 1982) and for single
nodular hepatocellular carcinoma (Kanai et al., 1987).
A novel semi-automatic method was developed for segmenting liver tumors from low-quality CT data. The method is based on non-parametric intensity distribution estimation and the hidden Markov measure field (HMMF) model (Marroquin et al., 2003). The HMMF model adds a continuous-valued measure field estimation step to the classical Markov Random Field (MRF). The application of the measure field provides a smooth cost function that is simple and efficient to optimize, improving the MRF by removing difficulties with local minima and oscillating behavior. Also, the measure field captures the classification uncertainty by reducing the weight for points with uncertain classifications in the field cliques.
The method assumes a roughly spherical shape for tumors, and that in general, the intensity distributions of tumors and the adjacent tissue do not necessarily follow any particular statistical distribution. A multivolume approach is also presented for using all the available image data.
The developed method was evaluated using two sets of patient data: the publicly available data set of LTS08 and a data set that consisted of pre-operative images of patients treated with RFA. Also, a novel framework for creating artificial data with ground truth segmentations was developed. The artificial data is used for analyzing performance with different levels of contrast.
Following this introduction, the developed method
and the data used for training and evaluation are de-
scribed in Section 2. The evaluation results are reported
in Section 3, and Section 4 concludes the paper with a
discussion.

2. Methods
2.1. Segmentation task formulation
This general formulation follows the outline of the original HMMF model (Marroquin et al., 2003). Some significant modifications have also been introduced, most importantly in the probability distribution $P(q)$ of the measure field $q$, and in the non-parametric intensity distribution estimates that are kept static in the segmentation process.
Let the observed image be $I$. The segmentation task consists of finding the label field $f$ that maximizes the posterior probability $P(f \mid I)$. Let $\Omega$ represent the image domain, with $r \in \Omega$ representing image points (voxels). Then $f(r) \in Z_M = \{1, \ldots, M\}$, where $M$ is the number of different classes for the segmentation task. For the purposes of liver tumor segmentation, $M = 2$ (see discussion in Section 2.2).
The label field $f$ is found in two steps. The first step consists of generating a Markov random vector field $q$ with distribution
$$P(q) = \frac{Q(q)}{K} \exp\left[ S_D(q) - \sum_{C} W_C(q) \right], \tag{1}$$
where $Q(q)$ is a class-dependent prior probability function, $K$ is a positive normalizing constant, $S_D(q)$ is a shape prior dependent on the input data $D$, $C$ are the cliques of a given neighborhood system, and $W_C$ are potential functions. The shape prior $S_D$ used here is similar to the approach introduced in Flach and Schlesinger (2008). The $M$-dimensional vector $q(r)$ is also constrained by
$$\sum_{k=1}^{M} q_k(r) = 1, \quad q_k \geq 0, \tag{2}$$
where $q_k(r)$ is the $k$th component of $q(r)$. Here the constraint becomes $q_1(r) + q_2(r) = 1$; $q_1, q_2 \geq 0$.
The label field $f$ is generated from $q$ in the second step, each $f(r)$ being an independent sample from the distribution $q(r)$, with:
$$P(f \mid q) = \prod_{r} q_{f(r)}(r),$$
where the component $q_{f(r)}(r)$ of vector $q(r)$ corresponds to class $f(r)$.
For finding the optimal estimator $q^*$ for the vector field in the first step, the MAP estimator is computed by
$$q^* = \arg\max_{q} P(q \mid I),$$
with the constraint (2) applied. Using the Bayes rule, the posterior distribution $P(q \mid I)$ is defined as:
$$P(q \mid I) = \frac{1}{R} P(I \mid q) P(q), \tag{3}$$
where $R$ is a positive normalizing constant. The conditional distribution is defined as (see Marroquin et al. (2003) for proof):
$$P(I \mid q) = \prod_{r} \sum_{k=1}^{M} P(I(r) \mid f(r) = k)\, q_k(r). \tag{4}$$
For brevity, we denote the observation likelihood functions as $P(I(r) \mid f(r) = k) = v_k(r)$ and $P(I(r) \mid f(r)) = v(r)$. The sum term in (4) can then be expressed as $v(r) \cdot q(r)$, or $v_1(r) q_1(r) + v_2(r) q_2(r)$ for $M = 2$.
Combining (1), (3), and (4) results in
$$P(q \mid I) = \frac{1}{KR} \exp\left[ -U(q) \right],$$
where
$$U(q) = -\sum_{r} \log\left( v(r) \cdot q(r) \right) - \log\left( Q(q) \right) - S_D(q) + \sum_{C} W_C(q). \tag{5}$$
As $1/KR > 0$, $q^*$ is found simply by computing the minimum of $U(q)$. The details of this process are presented in Section 2.5.
After obtaining the MAP estimator $q^*$, the optimal estimator $f^*$ for the label field $f$ is found by maximizing $P(f \mid q = q^*, I)$. This is done by finding the mode for each $q^*(r)$:
$$f^*(r) = \arg\max_{k} q^*_k(r). \tag{6}$$
With $M = 2$, this amounts to comparing $q^*_1(r)$ and $q^*_2(r)$ and selecting the class with the larger value.
2.2. Method overview
The segmentation is performed in four stages:
1. Preprocessing and user input
2. Estimation of observation likelihood functions
3. HMMF segmentation
4. Post-processing
The first stage involves input from the user, after which all the subsequent stages are performed automatically. Here, the stages are described briefly, with a more thorough presentation in the following sections. The process is illustrated in Fig. 1.
Given an image or multiple images of a patient as input to the method, the user selects two points indicating the location of the tumor. The method then performs the preprocessing steps based on the user input.

Figure 1: Stages of the segmentation method

The second stage uses the image data and regions defined in the previous stage to estimate intensity distributions for the segmentation classes. Using these estimates, each image point $r$ is then assigned a likelihood value $v_k(r)$, indicating how probable such an observation $I(r)$ is for each class $k$. The result is passed on to the third stage of the method.
The third stage performs the actual segmentation us-
ing the formulation presented in Section 2.1.
The final post-processing stage modifies the segmentation objects by removing overflown sections. This is done by comparing the tumor object shape with the center of the tumor as defined by user input. The output of the post-processing stage is the final segmentation.
A segmentation in two classes was selected in the im-
plementation, since the number of actual tissue types
around the tumor is unknown in general. Two classes
simply indicate whether a point is part of the tumor or
not. The used nonparametric intensity distribution esti-
mates are very useful in cases with several tissue types
around the tumor.
In the following, k = 1 represents the class corre-
sponding to the tumor. The stages of the method are
illustrated with an example in Fig. 2.
All of the parameter values presented here were se-
lected based on results from training data and used for
all evaluation data. The used training data was provided
by the LTS08 competition.
2.3. Preprocessing and user input
From image $I$, the user selects the axial slice view where the tumor appears the largest. Then, the user selects two points on opposite edges of the tumor, so that if a line was drawn between the points, it would pass approximately through the center of the tumor as observed in the slice view. Let the selected points be $l_1$ and $l_2$.
The next step is to construct a ROI $\Omega_R \subset \Omega$. The segmentation is performed only for points $r \in \Omega_R$. In addition, training samples $T_k \subset \Omega_R$ need to be determined for each class $k$ for estimating the observation likelihood functions in the next step.
Using the user-defined points, the following variables are determined:
1. ROI center $r_c = \frac{1}{2}(l_1 + l_2)$
2. tumor radius $d_T = \frac{1}{2}|l_1 - l_2|$
3. ROI radius $d_{ROI} = \max(1.5\, d_T, d_{min})$
The ROI radius is limited to at least $d_{min} = 8$ mm, to prevent it from becoming too small for very small tumors. In addition, the width of the ROI edge $d_e$ is assigned a value of 1.5 mm.
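The three quantities above follow directly from the two selected points. A minimal sketch in Python, assuming the points are given as (x1, x2, x3) coordinates in millimetres; the function name `roi_geometry` and its parameter names are illustrative and not taken from the original implementation:

```python
import math

def roi_geometry(l1, l2, d_min=8.0, roi_factor=1.5):
    """Return (r_c, d_T, d_ROI) from two user-selected edge points (in mm)."""
    # ROI center: midpoint of the two selected points.
    r_c = tuple((a + b) / 2.0 for a, b in zip(l1, l2))
    # Tumor radius: half the distance between the two points.
    d_T = 0.5 * math.dist(l1, l2)
    # ROI radius: 1.5 times the tumor radius, but never below d_min.
    d_ROI = max(roi_factor * d_T, d_min)
    return r_c, d_T, d_ROI

# Hypothetical input: a tumor 20 mm across, and a very small one 4 mm across.
r_c, d_T, d_ROI = roi_geometry((10.0, 20.0, 30.0), (30.0, 20.0, 30.0))
_, d_T_small, d_ROI_small = roi_geometry((0.0, 0.0, 0.0), (4.0, 0.0, 0.0))
```

For the small tumor the lower bound takes effect, so the ROI radius stays at 8 mm even though 1.5 times the tumor radius would only be 3 mm.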
To simplify notation, let $x = \{x_1, x_2, x_3\}$ represent coordinate points with origin at $r_c$, so that $x = r - r_c$. The terms $x_1$, $x_2$, and $x_3$ are coordinates along the sagittal, coronal, and axial axis, respectively. Then $\Omega_R$ is defined as a sphere, with center at $r_c$ and radius $d_{ROI}$:
$$\Omega_R = \left\{ x \mid x_1^2 + x_2^2 + x_3^2 \leq d_{ROI}^2 \right\}.$$
The training samples $T_1$ are determined as the set of points within an ellipsoidal area centered at $r_c$:
$$T_1 = \left\{ x \in \Omega_R \;\middle|\; \frac{x_1'^2}{a_1^2} + \frac{x_2'^2}{a_2^2} + \frac{x_3^2}{a_2^2} \leq 1 \right\},$$
where the coordinates $x_1'$ and $x_2'$ are obtained by rotating $x_1$ and $x_2$ around the $x_3$ axis with center of rotation at $r_c$, so that the long axis of the ellipsoid passes through the input points $l_1$ and $l_2$. The variables $a_1$ and $a_2$ are assigned values depending on the tumor radius: $a_1 = 2d_T$ and $a_2 = 0.8\, d_T$.
The training samples $T_2$ are determined as the set of points at the ROI edge as defined by $d_e$:
$$T_2 = \left\{ x \in \Omega_R \mid d(x, r_c) \geq d_{ROI} - d_e \right\},$$
where $d(x, r_c)$ is the Euclidean distance between $x$ and $r_c$.

Figure 2: Main stages of the method illustrated with an example: a) user input and ROI construction, with the following markers: outer ring for ROI border, ellipsoid for sampling area of tumor training data, x-markers for input points, small circle for $r_c$, b) observation likelihood function for class 1, c) observation likelihood function for class 2, d) measure field MAP estimate $q^*$, e) axial slice visualization of the segmentation result, f) 3D visualization of the segmentation result

An example of the selected points, the resulting ROI
and the training sample regions are visualized in Fig.
2(a).
The chosen approach for user interaction and inten-
sity distribution estimation is similar to the solutions
used by Smeets et al. (2009) and Moltz et al. (2008).
In all of these methods, the intensity distributions are
estimated directly from the image using the location in-
formation provided by the user. Also, all of the methods
use a region of interest (ROI) or maximal radius to re-
strict the segmentation.
2.4. Estimation of observation likelihood functions
As the class intensity distributions do not necessarily follow a specific statistical distribution, a non-parametric estimation method is used. An estimate $\hat{v}_k(r)$ for $v_k(r)$ is obtained separately for each class $k$ using the Parzen windows method (Parzen, 1962). Given the training data $T_k$ for class $k$, there are $n_k$ training samples $T_k(t_i)$, $i = 1, \ldots, n_k$ at points $t \in T_k$. Then, $\hat{v}_k(r)$ is defined as:
$$\hat{v}_k(r) = \frac{1}{h n_k} \sum_{i=1}^{n_k} K\left( \frac{I(r) - T_k(t_i)}{h} \right), \tag{7}$$
where $h$ is a free parameter controlling the smoothness of the estimate, and $K$ is a Gaussian kernel:
$$K(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right].$$
The variables $\sigma^2$ and $\mu$ are the variance and the mean of the kernel, respectively. Here we use values $\sigma^2 = 1$ and $\mu = 0$, so (7) can be written as
$$\hat{v}_k(r) = \frac{1}{h n_k \sqrt{2\pi}} \sum_{i=1}^{n_k} \exp\left[ -\frac{(I(r) - T_k(t_i))^2}{2h^2} \right].$$
An empirically chosen value $h = 2.2$ is used here.
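The Parzen window estimate with a unit-variance, zero-mean Gaussian kernel can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the original implementation: training samples are passed as a plain list of intensity values, the name `parzen_likelihood` is hypothetical, and the sample intensities below are made up.

```python
import math

def parzen_likelihood(intensity, samples, h=2.2):
    """Gaussian Parzen window estimate of the likelihood of one intensity value."""
    n = len(samples)
    # Normalization 1 / (h * n_k * sqrt(2*pi)), as in the simplified form of (7).
    norm = 1.0 / (h * n * math.sqrt(2.0 * math.pi))
    # Each training intensity contributes one Gaussian bump of width h.
    return norm * sum(
        math.exp(-((intensity - t) ** 2) / (2.0 * h * h)) for t in samples
    )

# Made-up training intensities for the tumor class and the background class.
tumor_samples = [52.0, 55.0, 58.0]
background_samples = [98.0, 101.0, 104.0]

v1 = parzen_likelihood(55.0, tumor_samples)       # likelihood under the tumor class
v2 = parzen_likelihood(55.0, background_samples)  # likelihood under the background class
```

Because every training sample adds its own kernel, the estimate can follow multi-modal intensity distributions without assuming a parametric form; the parameter `h` alone controls how smooth the result is.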
The above likelihood estimation is used only for points having intensity values higher than 0, since lower values do not usually occur in tumors in CT images (Dowsett et al., 1998). Points $r$ with intensities of 0 or less, i.e. $I(r) \leq 0$, are assigned likelihood values of $v_1(r) = 0$ (tumor class) and $v_2(r) = v_m$, where $v_m$ is the highest likelihood value assigned to any point for class 2.
2.5. Segmentation with HMMF model
To find the MAP estimate $q^*$, the minimum of $U(q)$ (5) is computed. Here the terms of $U(q)$ are defined and the process is described in detail.
The class-dependent prior probability $Q(q)$ is used to add sensitivity to the segmentation method by giving a lower prior probability to class 2 (not tumor). This way, in uncertain cases a point is more likely to be classified as part of the tumor than as background. However, it was noted that if the tumor contrast is very low, adding sensitivity may cause poor results. For this reason, an adaptive $Q$ was used, based on the overlap of the intensity distribution estimates.
The function is defined as:
$$Q(q) = \prod_{r \in \Omega_R} \sum_{k=1}^{M} \beta_k\, q_k(r),$$
where the weights $\beta_k$ control the prior probability for each class $k$. The values used were $\beta_1 = 1$ and $\beta_2 = 1.08 - 0.55\rho$, where $\rho$ is a measure of separation of the two intensity distribution estimates:
$$\rho = \frac{1}{2} \int_{r} \left| \hat{v}_1(r) - \hat{v}_2(r) \right|. \tag{8}$$
The value of $\rho$ varies from 0 for identical distributions, to 1 for completely separated distributions.
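A minimal sketch of the adaptive prior weight, under two assumptions that should be checked against the published article: that the separation measure (8) is half the integrated absolute difference between the two estimated class densities over intensity (approximated here on a uniform intensity grid), and that the class-2 weight decreases linearly as 1.08 minus 0.55 times that measure. The function names are illustrative.

```python
def separation(density_1, density_2, step):
    """Half the integrated absolute difference of two densities on a uniform grid."""
    return 0.5 * step * sum(abs(a - b) for a, b in zip(density_1, density_2))

def class2_weight(sep):
    """Assumed adaptive prior weight for the non-tumor class."""
    return 1.08 - 0.55 * sep

# Two box-shaped densities on a grid with spacing 0.5; each integrates to 1.
identical = separation([1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0], 0.5)
disjoint = separation([1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0], 0.5)
```

With well-separated distributions the non-tumor weight drops to about 0.53, which favors the tumor class in uncertain cases; with overlapping distributions it stays near 1.08, so no extra sensitivity is added when contrast is poor.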
Next, the shape prior $S_D$ is defined. It takes advantage of the location information provided by the user, essentially indicating that an area around the center $r_c$ should be segmented as the tumor and the region at the ROI edge should not be a part of the tumor. The shape prior is defined as:
$$S_D(q) = \lambda_s \sum_{r \in \Omega_R} \sum_{k=1}^{M} s_k(r, D)\, q_k(r),$$
where $\lambda_s = 3.0$ is a weighting constant. The function $s$ for class $k = 1$ is defined inside the ROI as:
$$s_1(r, D) = \begin{cases} 1 - \left(1 + a_D(r)\right)^{-1}, & \text{if } d(r, r_c) < d_{ROI} - d_e \\ -1, & \text{otherwise} \end{cases}$$
with
$$a_D(r) = \exp\left[ -\gamma \left( \frac{d(r, r_c)}{d_T} - d_s \right) \right],$$
where $\gamma = 20$ controls the slope of the function and $d_s$ is a modifier parameter controlling the size of the center region with respect to tumor radius $d_T$, chosen here as $d_s = 0.55$. For class 2, $s_2(r) = -s_1(r)$.

Figure 3: Example of shape prior $s_1$, with input points shown with x-markers

The function $s$ takes the form of a logistic function scaled with the tumor radius $d_T$. For class $k = 1$, $s$ is close to a value of one at the ROI center $r_c$, and approaches zero when the distance to $r_c$ is larger than $d_T d_s$. At the ROI edge, it is given a value of $-1$. An example of the shape prior is shown in Fig. 3.
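The described behavior of the tumor-class shape term (close to one at the ROI center, near zero once the distance exceeds roughly 0.55 times the tumor radius, and a negative value at the ROI edge) can be sketched as follows. The sketch assumes a logistic form with slope 20 and an edge value of -1; the function and parameter names are illustrative, and the example radii are made up.

```python
import math

def shape_prior_s1(dist, d_T, d_ROI, d_e=1.5, slope=20.0, d_s=0.55):
    """Assumed tumor-class shape term as a function of distance to the ROI center (mm)."""
    # Points in the ROI edge band are penalized for the tumor class.
    if dist >= d_ROI - d_e:
        return -1.0
    # Logistic fall-off centered at the distance d_s * d_T.
    a = math.exp(-slope * (dist / d_T - d_s))
    return 1.0 - 1.0 / (1.0 + a)

def shape_prior_s2(dist, d_T, d_ROI):
    """The non-tumor class gets the negated tumor-class value."""
    return -shape_prior_s1(dist, d_T, d_ROI)

# Example with a 10 mm tumor radius and a 15 mm ROI radius.
center_value = shape_prior_s1(0.0, 10.0, 15.0)
midpoint_value = shape_prior_s1(5.5, 10.0, 15.0)
outer_value = shape_prior_s1(10.0, 10.0, 15.0)
edge_value = shape_prior_s1(14.0, 10.0, 15.0)
```

The steep slope makes the prior act almost like a soft mask: it supports the tumor class strongly only in a core region well inside the user-indicated radius and stays neutral between that core and the ROI edge.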
The potential functions $W_C$ enforce the smoothness of $q$. Here, pairwise cliques $C$ in the 26-neighborhood of each point $r$ are used to compute the potentials. The potential functions are defined as:
$$W_{r_1 r_2} = \lambda_W \exp\left[ -\frac{d(r_1, r_2)^2}{2\sigma_W^2} \right] \sum_{k=1}^{M} \left( q_k(r_1) - q_k(r_2) \right)^2,$$
where $d(r_1, r_2)$ is the Euclidean distance between the two points, $\sigma_W = 1.5$ is the standard deviation of the exponential term used to modify the weights of the neighboring points and $\lambda_W = 20$ is a weighting constant.
The function $U$ is minimized using the gradient descent optimization method. After this, the label field $f$ is found as described in (6).
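As an illustration of the smoothness term that enters $U$, the pairwise potential for one pair of neighboring points can be sketched as below. It assumes a Gaussian distance weighting with standard deviation 1.5 and a weighting constant of 20, with the distance expressed in the same units as the standard deviation; the function name is illustrative.

```python
import math

def pair_potential(q_a, q_b, dist, weight=20.0, sigma=1.5):
    """Assumed pairwise smoothness potential between two measure-field vectors."""
    # Neighbors that are further apart contribute less.
    spatial = math.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    # Squared difference of the class-membership vectors.
    difference = sum((a - b) ** 2 for a, b in zip(q_a, q_b))
    return weight * spatial * difference

same = pair_potential((0.7, 0.3), (0.7, 0.3), 1.0)
opposite = pair_potential((1.0, 0.0), (0.0, 1.0), 1.0)
soft = pair_potential((0.6, 0.4), (0.4, 0.6), 1.0)
```

The quadratic dependence on the continuous values of $q$ means that two uncertain neighbors (values near 0.5) are penalized far less than two confident but contradicting ones, which is what keeps the cost function smooth.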
2.6. Post-processing
Structures adjacent to the tumor with similar intensity distributions may cause the segmentation to overflow outside the tumor. This often happens through a narrow passage, and results in a segmentation object that has a "handle" attached to the spherical tumor mass. These handles are removed in the post-processing stage, along with any objects that were classified as tumor, but not attached to the actual tumor object.
The handle removal is based on comparing two different distance values of points from the ROI center $r_c$. The first value is the Euclidean distance $d_E(r, r_c)$, and the second one is a weighted distance value $d_W(r, r_c)$, which approximates the distance to be traveled inside the segmentation object to connect the two points. After obtaining the two values, the difference is observed. If the difference is large, it can be deduced that $r$ belongs to a handle, since there is no direct path from $r_c$ to $r$ when advancing inside the segmentation object.
The implementation of this is done using the Fast Marching Method (Sethian, 1996), so that a front with a spatially varying speed function $F(r)$ is advanced starting from the center point $r_c$. The passing time $t_r$ of the front at each point $r$ is assigned as its distance value $d(r, r_c)$.
To do this, the Eikonal equation is solved: $F(r)\,|\nabla d(r, r_c)| = 1$, where $\nabla d(r, r_c)$ is the gradient of the distance function. To compute the Euclidean distance, $F(r) = 1, \ \forall r \in \Omega_R$. For computing the weighted distance, $F(r) = 1$ if $f(r) = 1$ and $F(r) = 0.1$ otherwise. In the case of the weighted distance, the advance of the front is significantly slower outside the segmentation object. Using the distance values, a probability volume is then computed: $\phi(r) = \exp\left[ -\left( d_E(r, r_c) - d_W(r, r_c) \right)^2 \right]$ if $f(r) = 1$, and $\phi(r) = 0$ otherwise. The final segmentation $\hat{f}$ is then found by thresholding, so that if $\phi(r) > 0.8$, then $\hat{f}(r) = 1$, otherwise $\hat{f}(r) = 0$.
The operations of the post-processing stage are illustrated in Fig. 4 using an artificial example. The figure shows that the handles are removed in the process, while leaving the spherical object and the protrusions in the bottom part intact.
Fig. 5 illustrates the post-processing stage using a real CT volume example. The figure shows a large tumor, where the segmentation has slightly overflown. The overflown region and a disconnected object are clearly seen in the 3D object of Fig. 5(b). The post-processing operation removes the handle and the disconnected object while keeping the tumor segmentation intact.
2.7. Multiphase segmentation
To incorporate information from multiple CECT
phase images, all of the available images were regis-
tered to a common coordinate system. The registration
was performed with an in-house registration program,
which was similar to the IRTK software (Rueckert et al.,
1999) with a few modications.
The native CT image of each patient was selected as a common target and all CECT images of the same patient were registered pair-wise with the target. The positional differences were corrected with a rigid transformation model and local deformations with a B-spline model (Rueckert et al., 1999; Rohlfing et al., 2003). During the non-rigid registration, the control point distance was hierarchically refined from isotropic 40 mm to 20 mm. These values were chosen based on a previous study (Rohlfing et al., 2004), where similar transformation models were successfully used to estimate respiratory motion in the liver.
As the energy function, the inverse of normalized mutual information (Studholme et al., 1999) was used:
$$E = -\frac{H(I_S) + H(I_T)}{H(I_S, I_T)},$$
where $H(I)$ is the marginal entropy and $H(I_S, I_T)$ the joint entropy for source $I_S$ and target $I_T$ images. For non-rigid registration, smoothness (Wahba, 1990; Rueckert et al., 1999) and incompressibility (Rohlfing et al., 2003) constraints were tested but not used, since they did not result in any increase of registration accuracy.
The energy in rigid registration was minimized with the downhill simplex method (Nelder and Mead, 1965), and in the non-rigid case with the conjugate gradient method (Press, 2007). For both optimization methods an implementation from Press (2007) was used. To speed up the computation, two-level multiresolution optimization was used. We used isotropic voxel dimensions of 1.0 and 2.0 mm$^3$ for the fine and coarse resolution levels, respectively. After registration, the CECT images were transferred and resampled to the target domain with the computed transformations. Resampling was performed with trilinear interpolation.
The segmentation process mostly remains the same as in the single-volume case, with the only difference being how the observation likelihood functions are estimated. The user can use any image for providing input, but this should be done using the one with the highest tumor contrast. The subsequent steps in the preprocessing and user input stage remain the same as in the single-volume case.
The estimation of observation likelihood functions is initially done separately for each image. Then, a separation measure of the intensity distribution estimates (8) for each image is computed. The separation measure is used as a weighting factor for a joint estimate. With this weighting factor, images with high contrast are weighted more and subsequently given more influence on the resulting segmentation.

Figure 4: Post-processing operation illustrated with an artificial example, where the tumor center $r_c$ is indicated with an x-marker: a) spherical binary object with protrusions at the bottom and an overflown section at the top, b) weighted distance $d_W(r, r_c)$, c) probability volume $\phi(r)$, and d) final segmentation $\hat{f}$

Figure 5: Example of a post-processing result: a) segmentation overlaid on slice image before post-processing, b) 3D view of segmentation object before post-processing, c) slice view after post-processing and d) 3D view after post-processing

Let $N$ be the number of available images and $\hat{v}^{I_i}_k(r)$ the intensity distribution estimate generated from image $I_i$ for class $k$. Then the joint estimate can be expressed as:
$$\hat{v}_k(r) = \frac{\sum_{i=1}^{N} \rho_i\, \hat{v}^{I_i}_k(r)}{\sum_{i=1}^{N} \rho_i},$$
where $\rho_i$ is the separation measure (8) for image $I_i$.
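The joint estimate is a weighted average of the per-image likelihood values at one point, with the per-image separation measures as weights. A minimal sketch, with an illustrative function name and made-up numbers:

```python
def joint_likelihood(per_image_values, separations):
    """Separation-weighted average of per-image likelihood values at one point."""
    total = sum(separations)
    # Assumes at least one image has a non-zero separation measure.
    return sum(w * v for w, v in zip(separations, per_image_values)) / total

# Equal separations give a plain average; a high-contrast image dominates otherwise.
equal = joint_likelihood([0.2, 0.4], [1.0, 1.0])
skewed = joint_likelihood([0.2, 0.4], [0.9, 0.1])
```

In the second call the first image has much better class separation, so the joint value stays close to that image's likelihood.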
Using all three images adds robustness to the result and may prevent overflow occurring with the single phase result. An example of the effect is displayed in Fig. 6, showing a single phase segmentation with slight overflow that is corrected with the multiphase segmentation. In this example, the multiphase segmentation result is less sensitive than the single phase alternative.
The multiphase segmentation requires only slightly more computation than the single-phase approach, since the most computationally expensive operation of estimating the MAP field remains the same. However, the registration step involves a high computational cost, making the multiphase segmentation impractical unless the registered image volumes are already available. In our target application of RFA treatment planning, the images are registered for other purposes of the treatment planning system.
2.8. Evaluation measures and data
In the evaluation of the method, the following five measures were computed by comparing each segmentation with its reference segmentation (see Deng and Du (2008)):
I) Volumetric overlap error [%] (OE): 100% minus the ratio of the number of points in the intersection of the two segmentations to the number of points in their union
II) Relative absolute volume difference [%] (VD)
III) Average symmetric surface distance [mm] (SD)
IV) Root mean square (RMS) symmetric surface distance [mm] (RD)
V) Maximum symmetric surface distance [mm] (MD)
For each measure, a value of 0 corresponds to an exact match with the reference segmentation, and all measures are larger than or equal to zero. The average human rater variability has been reported for the LTS08 data as (Deng and Du, 2008): I) 12.94%, II) 9.64%, III) 0.40 mm, IV) 0.72 mm, and V) 4.0 mm.

Figure 6: Example of multiphase segmentation effect: a) slightly overflown segmentation result using single portal vein phase image, b)-d) multiphase segmentation result shown on the portal vein, arterial and native images, respectively.

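The two volumetric measures can be sketched for segmentations stored as sets of voxel indices. The sketch assumes the commonly used definitions (overlap error as 100% minus intersection over union, and volume difference relative to the reference volume); the surface distance measures are omitted, and the function names are illustrative.

```python
def overlap_error(segmentation, reference):
    """Volumetric overlap error in percent: 100 * (1 - |A and B| / |A or B|)."""
    seg, ref = set(segmentation), set(reference)
    return 100.0 * (1.0 - len(seg & ref) / len(seg | ref))

def volume_difference(segmentation, reference):
    """Relative absolute volume difference in percent, relative to the reference."""
    return 100.0 * abs(len(set(segmentation)) - len(set(reference))) / len(set(reference))

# Toy voxel sets: three voxels each, two of them shared.
seg = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
ref = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
oe = overlap_error(seg, ref)
vd = volume_difference(seg, ref)
```

The toy example also shows why the two measures are reported together: the volumes are identical, so the volume difference is zero, even though a quarter of the combined region is mismatched on each side and the overlap error is 50%.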
The first evaluation was made with artificial data, which provided a ground truth segmentation for reference. In addition, it also enabled controlling the tumor contrast.
Due to the autocorrelation of noise in CT images, it is difficult to construct an artificial image that resembles a real CT image of a liver with a tumor. For this reason, artificial data was created by altering the intensities of a region of the parenchyma from a real native CT image, to imitate the appearance of a tumor.
An artificial tumor object was created by constructing a small volume including a spherical region in the middle with value 1, and value 0 outside. All the generated artificial tumors were of the same size with a diameter of 2.0 cm, resembling the size of a small tumor, typical for RFA treatment. Gaussian filtering with standard deviation 1.5 was performed on this object to simulate the point spread of the imaging device. The artificial tumor object is shown in Fig. 7(b).
Two regions from separate CT images were then extracted (see example in Fig. 7(a)). Five locations representing tumor center points were selected from both regions, bringing the total to ten locations. A single artificial tumor sample was generated by multiplying the artificial tumor object with the desired contrast value and then adding the object to one of the ten locations. An example result is shown in Fig. 7(c). This way, ten artificial tumors were generated for each contrast level, one at each location. Four different contrast levels were used: 20, 15, 10, and 7.5 Hounsfield units.
Figure 7: Example of generation of artificial evaluation data: a) a sec-
tion of healthy liver tissue from CT data, b) a spherical tumor object,
and c) sum of image and tumor object with contrast of 15 Hounsfield
units.
The second data set used for evaluation was the pub-
licly available LTS08 competition data, which has already
been used for evaluation of several other methods
(Deng and Du, 2008). The data set included training
data of four images with 10 reference tumor segmenta-
tions, and test data with a total of 13 images. A total
of 20 tumors were segmented from the test data. The
reference segmentations for the test data were not avail-
able to the authors, and the evaluation was conducted by
sending the final segmentations to the competition orga-
nizer, who provided the evaluation measures and scores.
The segmentations were given scores on a scale from 0
to 100, with 100 corresponding to an exact match, and
90 points corresponding to a segmentation with error
values equal to interobserver variability in manual de-
lineations.
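The relation between error values and scores can be illustrated with a short Python sketch. The text fixes only two anchor points: a score of 100 for an exact match and 90 for an error equal to the interobserver variability. The linear mapping between and beyond these points, the clamping at 0, and the function names are assumptions of this example, not a statement of the organizer's exact procedure. The reference values are the interobserver figures quoted above for the five measures.

```python
# Interobserver variability reported for the LTS08 data (Deng and Du, 2008).
REFERENCE_ERRORS = {"OE": 12.94, "VD": 9.64, "SD": 0.40, "RD": 0.72, "MD": 4.0}


def error_to_score(error, interobserver_error):
    """Map an error value to a score on the scale from 0 to 100.

    Assumed linear mapping: error 0 gives 100, an error equal to the
    interobserver variability gives 90, and scores are clamped at 0.
    """
    if interobserver_error <= 0:
        raise ValueError("interobserver_error must be positive")
    score = 100.0 - 10.0 * abs(error) / interobserver_error
    return max(0.0, score)


def total_score(errors):
    """Average of the five per-measure scores for one segmentation."""
    scores = [error_to_score(errors[name], ref)
              for name, ref in REFERENCE_ERRORS.items()]
    return sum(scores) / len(scores)
```

Under this assumption, a segmentation whose five error values all equal the interobserver variability receives a total score of 90, and any error ten times the reference value or larger scores 0 for that measure.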
To set the conditions similar to other evaluations with
the LTS08 data, the user was allowed to modify the seg-
mentation input if the output of the method was not sat-
isfactory. The modification was done simply by choos-
ing input points from another location. The modified in-
put provided the method with a different set of training
data, giving an alternate result. Only the final segmen-
tation was compared with the reference segmentation.
The third data set was provided by University of
Leipzig, Department of Diagnostic and Interventional
Radiology. The images were preoperative images of
nine patients undergoing RFA treatment. Three phase
images were available for all but one patient, for which
only two images were acquired. Since the RFA treat-
ment is generally used only for relatively small tumors,
all of the tumors in this data set were smaller than 5 cm
in diameter. Patient 1 had three tumors, and the rest had
a single tumor, bringing the total to 11 tumors. Three
of the tumors had previously been treated with tran-
scatheter arterial chemoembolization (TACE). The vol-
umes had axial dimensions of 512 × 512 points with an
in-plane resolution between 0.68 mm and 0.89 mm. The
resolution between the axial image slices was between
2 mm and 3 mm. The error values were computed by
comparing the segmentations produced by the method
to manual segmentations.
The average tumor contrasts for the data sets were
measured by computing the absolute difference of the
medians of the two training sets for each tumor and tak-
ing the average of the difference values over the data
set.
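As a small worked example, the contrast measure can be written in Python as follows. The function names are hypothetical, and the example assumes that the two training sets of a tumor are the intensity samples collected for the tumor and for the surrounding liver tissue from the user input.

```python
from statistics import mean, median


def tumor_contrast(tumor_samples, liver_samples):
    """Absolute difference of the medians of the two training sets (Hounsfield units)."""
    return abs(median(tumor_samples) - median(liver_samples))


def average_contrast(training_sets):
    """Average contrast over a data set given as (tumor_samples, liver_samples) pairs."""
    return mean(tumor_contrast(tumor, liver) for tumor, liver in training_sets)
```

Using medians rather than means keeps the measure insensitive to a few outlying intensity values in either training set.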
3. Results
3.1. Artificial data
The average evaluation results for the segmentations
of ten tumors with different contrast levels are listed
in Table 1. The results show that the method gener-
ated very accurate segmentations at the highest contrast
level (20). The accuracy clearly dropped with decreas-
ing contrast. With a contrast of 10 Hounsfield units, the
accuracy was approximately the same as for the RFA
patient data (Table 2).
These results can be used for estimating the reliability
of the segmentation result of real tumors. For example,
if an average overlap error of less than 30% is desired,
the tumor contrast should be more than 10 Hounsfield
units. However, it should be noted that the artificial
samples did not include any heterogeneity that is often
present in patient data, and that the used shape is espe-
cially suitable for the segmentation method. For these
reasons, the results could be interpreted as an upper-
bound estimate for the method performance, rather than
a measure of the average accuracy.
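One way to use the artificial-data results for such an estimate is to interpolate between the measured contrast levels. The sketch below is only an illustration: the linear interpolation and the function name are assumptions, and the values are the average overlap errors of Table 1. As noted above, the outcome should be read as an optimistic estimate.

```python
# (contrast in Hounsfield units, average overlap error in %) from Table 1.
TABLE1_OVERLAP_ERROR = [(7.5, 38.65), (10.0, 31.47), (15.0, 21.13), (20.0, 15.12)]


def contrast_for_overlap_error(target_oe, table=TABLE1_OVERLAP_ERROR):
    """Contrast at which the interpolated overlap error equals target_oe.

    Linear interpolation between neighboring contrast levels is an assumption.
    Returns None when target_oe lies outside the range covered by the table.
    """
    points = sorted(table)
    for (c0, e0), (c1, e1) in zip(points, points[1:]):
        # Overlap error decreases with increasing contrast, so e0 >= e1.
        if e1 <= target_oe <= e0:
            return c0 + (e0 - target_oe) * (c1 - c0) / (e0 - e1)
    return None
```

For a target overlap error of 30%, the interpolation gives a contrast slightly below 11 Hounsfield units, which agrees with the statement that the contrast should be more than 10 Hounsfield units.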
3.2. LTS08 data
A summary of the LTS08 data evaluation results is
shown together with the Leipzig data results in Table
2. The scores received in the evaluation are illustrated
with a boxplot in Fig. 8. The evaluation included all 20
tumors of the test data. The training data evaluation is
not included in the results.
Table 1: Evaluation result averages for artificial data, using four dif-
ferent contrast levels for 10 artificial tumors. For descriptions of error
measures, see Section 2.8.
Cont. (HU)   OE (%)   VD (%)   SD (mm)   RD (mm)   MD (mm)
20           15.12    12.33    0.44      0.62      2.13
15           21.13    16.50    0.64      0.84      2.99
10           31.47    26.09    1.02      1.22      3.24
7.5          38.65    35.53    1.32      1.50      3.33
Table 2: Evaluation results for the LTS08 and Leipzig data sets, in-
cluding multiphase (Leipzig M.) and single-phase (Leipzig S.) seg-
mentation results. For descriptions of error measures, see Section 2.8.
             OE (%)   VD (%)   SD (mm)   RD (mm)   MD (mm)
LTS08
  Mean       30.35    23.53    1.87      2.43      8.09
  SD         11.03    13.97    1.17      1.41      4.49
  Worst      53.62    51.69    4.75      5.87      19.39
  Best       15.23    0.67     0.43      0.71      2.60
Leipzig M.
  Mean       28.59    17.85    0.86      1.20      4.65
  SD         5.94     10.60    0.33      0.45      2.41
  Worst      35.83    30.48    1.54      2.15      9.52
  Best       15.52    2.00     0.61      0.85      2.32
Leipzig S.
  Mean       29.60    17.75    0.89      1.24      5.12
  SD         5.61     11.40    0.31      0.42      2.75
  Worst      36.66    37.18    1.56      2.14      11.06
  Best       17.72    2.05     0.58      0.78      2.20
The average tumor contrast for the test data set was
39 Hounsfield units. The post-processing stage left five
of the segmentations unaltered and only four of the seg-
mentations were altered by more than 1% in volume,
with the maximum change being 8.12%. These four
tumors were among the five largest tumors of the data
set. In six cases, the user chose to have the tumor re-
segmented. Four of the tumors were resegmented once
and two of them twice. The resegmentation was done
by choosing the input points at another location. Only
the final segmentation was evaluated with the reference
segmentation.
Figure 8: Scores of the LTS08 data set presented as boxplots for overlap
error, volume difference, average surface distance, RMS surface distance,
maximum surface distance, and total (score axis from 0 to 100). Score
of average interobserver variability (90) is shown with dashed line for
reference.
The average total score ± standard deviation in the
evaluation was 70.3 ± 14.3 points, with a median of 73
points. The segmentation results for very large tumors
were significantly poorer than the average results. The
data set included three very large tumors, for which the
mean score was 52.7. For these three tumors, the rel-
ative error measures (overlap error and volume differ-
ence) were close to the average, but the surface distance
errors were very high, the mean average symmetric sur-
face distance being 3.51 mm, for example. This was
partly caused by the heterogeneous appearance of the
largest tumors. In addition, the largest tumors often also oc-
cupy regions close to the liver border, making overflow
to adjacent structures more likely.
The computation time depended greatly on the size of
the tumor. The smallest tumors were computed in less
than 30 seconds, but the larger ones required up to 15
minutes with the current implementation. However, the
program code used for evaluation was non-optimized
and used only a single processor core. The computa-
tionally most intensive part was the iterative process of
finding the MAP estimate q, which is also possible to
compute in parallel.
In Table 3 the evaluation results are compared with
the results of the previously best-performing semi-
automatic method for the LTS08 data set, and the in-
terobserver variability. The presented method receives
a higher score and lower values for all of the evalua-
tion measures, with the exception of the relative vol-
ume difference. The previously highest score for the
LTS08 data was by an interactive method (Stawiaski
et al., 2008), with 70.0 points.
Table 3: Comparison of validation results and respective scores with
previously best-performing semi-automatic method (Smeets et al.,
2009) and manual delineations evaluated on the LTS08 data set (Deng
and Du, 2008). For descriptions of error measures, see Section 2.8.
             OE      VD      SD      RD      MD      Total
Meas.
  Häme       30.3    23.5    1.9     2.4     8.1
  Smeets     32.6    17.9    2.0     2.6     10.1
  Manual     12.9    9.6     0.4     0.7     4.0
Scores
  Häme       76.6    75.6    53.8    66.2    79.7    70.3
  Smeets     74.8    81.5    52.6    63.3    74.6    69.4
  Manual     90      90      90      90      90      90
3.3. Leipzig data
The segmentation was performed using first a single
image and then multiple images from different phases.
Both the multiphase and single phase segmentations
were generated using the same user input data for each
patient data set. The single phase segmentation was per-
formed on the image observed as having the best tumor
contrast. In the multiphase evaluation, the used patient
data included native, arterial and portal vein phase im-
ages, except for Patient 1, for which the portal vein im-
age was not available.
Tumor contrast was evaluated in the single phase ver-
sion, and the average value for non-TACE tumors was
21 Hounsfield units. Only one of the segmentations was
altered in the post-processing stage, with a volume re-
duction of 1.31%.
The volumes were aligned with the registration
method described in Section 2.7. The registration accu-
racy was visually evaluated by overlaying the resampled
source image on the target image. A transparency of 40-
60% and an intensity window from 100 to 150 Hounsfield
units were used. For one subject (Patient 5, arterial
phase) a registration error of a few millimeters at the
top of the liver was detected. All the other registrations
were visually evaluated as successful.
The evaluation results are listed in Table 2. The sur-
face distance error values are notably lower than for the
LTS08 data set. Most of the average error values for
the multiphase segmentation results are slightly lower
than for the single-phase results. The worst-case results
show that the multiphase version is more robust, reduc-
ing some of the highest error values. Examples of multi-
phase segmentation results are visualized in Fig. 9.
Figure 9: Examples of multiphase segmentation results for Leipzig data (panels: Patient 1c, Patient 3, Patient 4, Patient 9).
The average computational time for the single vol-
ume segmentation after initial user input was 33 seconds
per tumor, of which 7.8 seconds were taken by the itera-
tion of the HMMF model, on average. The large differ-
ence between the total processing time and the computa-
tionally most expensive step of the HMMF model is due
to disk operations that could be removed for optimized
program code. The equivalent average time taken for
manual contouring was 254 seconds per tumor, or 7.7
times more than the automated method. The registra-
tion step of the multiphase version is computationally
very costly and would dominate the processing time es-
timates. As noted above, the multiphase segmentation
is impractical unless the images are registered for other
purposes.
4. Discussion
The developed method was shown to provide a suc-
cessful framework for liver tumor segmentation. The
capabilities of the developed method were validated
with a varied collection of tumors. For the LTS08 data,
the method outperformed all other methods that have
previously been tested on the same data set. The average
overlap error was improved by 2.3 percentage points.
The evaluation with the Leipzig data set showed that
the method produced excellent results even for tumors
with very low contrast and ambiguous borders, and the
performance remained high with noisy image data.
The HMMF model enabled an effective inclusion of
prior information, a spatially smooth segmentation and
a computationally efficient way to find the optimal so-
lution for the cost function. Learning intensity distri-
butions directly from the available image data and ad-
justing the model accordingly proved to be a good ap-
proach, providing adaptivity and robustness.
A framework for creating artificial evaluation data
was also presented. The samples were made to resem-
ble tumors treated with RFA. Artificial data with ground
truth segmentations provided a reliable estimate of the
method upper-bound performance with different con-
trast levels. With a contrast of at least 20 Hounsfield
units, the average overlap error was 15.12%. Contrast
should be at least 10 Hounsfield units in order to achieve
an overlap error of 31.47%.
Extremely good results were received for the Leipzig
data set, which included only relatively small tumors.
The average contrast of non-TACE tumors was only
21 Hounsfield units, about half of the LTS08 data set
value. The surface distance error measures for the
Leipzig data set were significantly lower than for the
LTS08 data. For relative measures, the error values
were similar between the data sets, since relative mea-
sures are sensitive for small objects. It was noted that
the method performance deteriorates with larger tumors
and high levels of heterogeneity. The training data used
for the non-parametric intensity distribution estimation
may not represent the heterogeneous tumors sufficiently
in all cases.
The multiphase segmentation results with the RFA
patient data were slightly better on average than the sin-
gle volume results. However, such small differences in
the average errors did not conclusively indicate the su-
periority of the multiphase approach, even though there
were notable differences in the individual results. In this
work, the method parameter values were selected based
on the single volume training data, and the multiphase
version might have performed better if the parameter
values were optimized for it. The multiphase method
seemed to add some robustness to the process.
The developed method has many parameter vari-
ables, but only a few of them are significant for opti-
mizing method performance. The important ones are
the weights and the parameters controlling the adaptive
prior in the cost function used for MAP estimation. The
remainder of the parameters control basic functionali-
ties, and the method is relatively insensitive to their val-
ues. A drawback of the method is its iterative nature,
that causes a relatively high computational cost for large
tumors.
The edge of the ROI poses a hard limit for the seg-
mentation area. This may cause an undersegmentation
if the shape of the tumor deviates significantly from
a sphere and its longest axis is close to perpendicular
to the image plane used for selecting the input points.
In the conducted evaluation, the ROIs were sufficiently
large to include the shape variation present in the data
set. However, for ellipsoidal tumors the input method
should be modified to allow more freedom in deter-
mining the two input points in 3D space. This would
not require any modification to the actual segmentation
method.
The introduced post-processing method provided an
effective approach for removing overflown regions, but
was not able to entirely eliminate the erroneous area
in all cases. In most cases, the post-processing had
very little effect on the segmentation result; in partic-
ular, only one of the Leipzig data segmentations was al-
tered. This indicates that the main segmentation method
is robust, especially for relatively small tumors. In the
presented framework, it could be possible to include
the shape analysis of the post-processing stage in the
HMMF model. This way, the post-processing stage
would be unnecessary and the whole process would be
included in the optimized cost function of the model.
The presented method is best suited for small and
medium-sized liver tumors, for which the segmentation
accuracy is high and computational cost remains mod-
est. The method performs reliably even for tumors with
low contrast, high levels of noise and ambiguous bor-
ders. These traits and the reduction in expensive manual
labor make the method ideal for RFA treatment plan-
ning.
Acknowledgements
We would like to thank Dr. Xiang Deng of Siemens
Ltd. China for data evaluations, our research partners
Dr. Daniel Seider and Dr. Michael Moche of Univer-
sity of Leipzig, Department of Diagnostic and Interven-
tional Radiology for patient data, Bernhard Kainz and
Judith Muehl of Graz University of Technology, Insti-
tute for Computer Graphics and Vision for technical as-
sistance, Mikko Lilja of Aalto University for proofread-
ing, Prof. Dr. Tuomas Häme of VTT for comments, as
well as our other partners: Medical University of Graz;
Fraunhofer Gesellschaft, Institute for Applied Informa-
tion Technology FIT; University of Oxford, Institute of
Biomedical Engineering; and NUMA Engineering Ser-
vices Ltd.
References
Baron, R., 1994. Understanding and optimizing use of contrast material for CT of the liver. American Journal of Roentgenology 163, 323-331.
Deng, X., Du, G., 2008. Editorial: 3D Segmentation in the Clinic: A Grand Challenge II - Liver Tumor Segmentation. MICCAI Workshop Proceedings.
Dowsett, D., Kenny, P., Johnston, R., 1998. The Physics of Diagnostic Imaging. Chapman & Hall Medical London.
Flach, B., Schlesinger, D., 2008. Combining shape priors and MRF-segmentation. Structural, Syntactic, and Statistical Pattern Recognition, 177-186.
Freiman, M., Eliassaf, O., Taieb, Y., Joskowicz, L., Sosna, J., 2008. A Bayesian approach for liver analysis: Algorithm and validation study. Medical Image Computing and Computer-Assisted Intervention - MICCAI 2008, 85-92.
Friedman, S., Grendell, J., McQuaid, K., 2003. Current Diagnosis & Treatment in Gastroenterology. McGraw-Hill Medical.
Gazelle, G., Goldberg, S., Solbiati, L., Livraghi, T., 2000. Tumor Ablation with Radio-frequency Energy. Radiology 217, 633.
Halvorsen, R., Korobkin, M., Ram, P., Thompson, W., 1982. CT appearance of focal fatty infiltration of the liver. American Journal of Roentgenology 139, 277.
Hann, L., Winston, C., Brown, K., Akhurst, T., 2000. Diagnostic imaging approaches and relationship to hepatobiliary cancer staging and therapy. Journal of Surgical Oncology 19, 94-115.
Jolly, M., Grady, L., 2008. 3D general lesion segmentation in CT, in: IEEE ISBI, pp. 796-799.
Kanai, T., Hirohashi, S., Upton, M., Noguchi, M., Kishi, K., Makuuchi, M., Yamasaki, S., Hasegawa, H., Takayasu, K., Moriyama, N., et al., 1987. Pathology of small hepatocellular carcinoma. A proposal for a new gross classification. Cancer 60, 810-819.
Li, Y., Hara, S., Shimura, K., 2006. A machine learning approach for locating boundaries of liver tumors in CT images, in: Proc. ICPR, pp. 400-403.
Marroquin, J., Santana, E., Botello, S., 2003. Hidden Markov measure field models for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 25, 1380-1387.
Moltz, J., Bornemann, L., Dicken, V., Peitgen, H., 2008. Segmentation of liver metastases in CT scans by adaptive thresholding and morphological processing, in: Workshop on 3D Segmentation in the Clinic: A Grand Challenge II. Liver Tumor Segmentation Challenge. MICCAI, New York, USA.
Nelder, J., Mead, R., 1965. The downhill simplex method. Computer Journal 7, 308.
Parzen, E., 1962. On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33, 1065-1076.
Press, W., 2007. Numerical recipes: the art of scientific computing. Cambridge University Press.
Rohlfing, T., Maurer Jr, C., Bluemke, D., Jacobs, M., 2003. Volume-preserving nonrigid registration of MR breast images using free-form deformation with an incompressibility constraint. IEEE Transactions on Medical Imaging 22.
Rohlfing, T., Maurer Jr, C., O'Dell, W., Zhong, J., 2004. Modeling liver motion and deformation during the respiratory cycle using intensity-based nonrigid registration of gated MR images. Medical Physics 31, 427.
Rueckert, D., Sonoda, L., Hayes, C., Hill, D., Leach, M., Hawkes, D., 1999. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Transactions on Medical Imaging 18.
Sethian, J., 1996. A fast marching level set method for monotonically advancing fronts. Proceedings of the National Academy of Sciences of the United States of America 93, 1591.
Smeets, D., Loeckx, D., Stijnen, B., De Dobbelaer, B., Vandermeulen, D., Suetens, P., 2009. Semi-Automatic Level Set Segmentation of Liver Tumors combining a Spiral Scanning Technique with Supervised Fuzzy Pixel Classification. Medical Image Analysis.
Stawiaski, J., Decencière, E., Bidault, F., 2008. Interactive liver tumor segmentation using graph cuts and watershed, in: Workshop on 3D Segmentation in the Clinic: A Grand Challenge II. Liver Tumor Segmentation Challenge. MICCAI, New York, USA.
Studholme, C., Hill, D., Hawkes, D., 1999. An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition 32, 71-86.
Wahba, G., 1990. Spline models for observational data. Society for Industrial Mathematics.
Zhou, J., Wong, D., Ding, F., Venkatesh, S., Tian, Q., Qi, Y., Xiong, W., Liu, J., Leow, W., 2010. Liver tumour segmentation using contrast-enhanced multi-detector CT data: performance benchmarking of three semiautomated methods. European Radiology, 1-11.

- Method produces accurate segmentations of liver tumors from low-quality data

- Only minimal user interaction is required

- Highest score for benchmark data set, with average overlap error of 30.35%

- Multiphase segmentation uses several images and adds robustness to the method

- Novel post-processing method removes extraneous regions
