Abstract—Segmentation of volumetric data, the form in which most medical data is
represented, is an active area of research that is challenged by the lack of sufficient
training data. The purpose of this project is to implement and characterize deep learning
neural network architectures for three-dimensional image segmentation. Artificial data was
used to circumvent the scarcity of training examples when characterizing the neural
networks for volumetric semantic segmentation. Additionally, Gaussian noise at varying SNR
was introduced to the images to determine to what extent noise affects the performance of
deep learning networks. Three architectures were successfully implemented: Fully
Convolutional[1], Residual Blocks[2], and Inception Blocks[3]. The performance of the
neural networks was measured by computing the accuracy and visualizing the segmentation
results. Among the three implemented architectures, Inception was found to be the most
robust: under certain conditions it could overcome the class imbalance issue that was found
during the preliminary work of the project. Moreover, the Inception network was also found
to perform relatively well on noisy images. Although the performance of the network is not
expected to generalize well to real data, the project provides some insight into the
performance of deep learning neural networks for volumetric semantic segmentation.

Keywords—Deep Learning; Semantic Segmentation;

I. INTRODUCTION
Machine Learning is one of the technologies with the potential to revolutionize the way
information is utilized in various fields. At their core, Machine Learning algorithms are
data-driven decision models requiring minimal human intervention. With the advent of the
Age of Big Data and innovations in hardware technology, a subset of machine learning called
Deep Learning has risen to significant popularity due to its accuracy, efficiency, efficacy
and reliability. Deep learning is an advanced and powerful class of neural network
algorithms that are able to learn complex mappings between input and output. The learning
occurs during the training phase in the development lifecycle of a Deep Learning neural
network. These mappings are learnt directly from the data used to train the neural network,
in contrast to classical Machine Learning algorithms, which require manual feature
extraction and feature engineering.

As of today, Deep Learning underlies many advanced pattern recognition features such as
naming objects in pictures, segmentation, and natural language processing. Although it
originated in applied artificial intelligence research, Deep Learning has found many
applications in various fields including, but not limited to, computer vision[4], control
engineering[5], medicine[6], the Internet of Things and the automotive industry[7].

Deep learning neural networks have been shown to be able to address the more complex tasks
in image processing, such as semantic segmentation. The challenge with training a deep
learning neural network end-to-end is the scarcity of labeled data. Although there has been
research addressing some issues of volumetric segmentation[8]–[12], there is not yet an
approach that provides quantitative measurements of the segmentation results. Additionally,
there is no clear relationship between the performance of the neural network and the choice
of architecture, the effect of noise and resource constraints. As such, the application of
deep learning neural networks to volumetric semantic segmentation is still an active area
of research. Therefore, there is a need to implement and explore architectural choices for
deep learning neural networks that can perform efficient semantic segmentation on
volumetric data.

II. BACKGROUND
The project focuses on implementing and characterizing three deep learning convolutional
neural network architectures for volumetric segmentation.

A. Convolutional Neural Network
The convolutional neural network (CNN)[13] is a class of artificial neural network inspired
by the vision process in the visual cortex. The CNN has been the architecture of choice for
many deep learning applications that involve computer vision tasks, such as classification
and segmentation. The basic structure of a convolutional neural network is a succession of
convolutions of the input layers, in which the filters of each convolutional layer extract
features and produce feature maps. After the convolution, the feature maps are passed
through an activation function, usually a ReLU, to introduce non-linearity[14]. Afterwards,
these activated feature maps are either downsampled through max pooling, which increases
abstraction, or passed through another convolution. After several convolutions and max
pooling steps, the extracted features are fed to fully-connected layers, where high-level
reasoning occurs and the outputs of the previous layer are connected to every input of the
next layer. The outputs of the fully-connected layers are then connected to a softmax
layer, which produces a probability distribution over N classes. Fig. 1 shows an example of
a convolutional neural network architecture.

[...] convolutional neural network for automatic 3D segmentation of brain MRI images, where
segmentation is done by taking patches of an image and predicting the center voxel along
with its close neighbors. DeepNAT[17] used two hierarchically arranged networks: one to
separate the background from the foreground, and the other, applied to the foreground, to
anatomically classify 25 brain structures by implementing multi-task learning. The
hierarchical arrangement of the networks was used to counter the class imbalance problem of
the patch-based segmentation method. Additionally, the approach tried to reconcile the
issue of the depth dimension by augmenting the model with spectral coordinates.
Results of the different methods are compared based on their performance when tested on
the dataset of the MICCAI Multi-Atlas Labeling challenge[20]. The authors of [17] used the
mean Dice volume overlap score, also called the Sørensen–Dice coefficient, to measure the
accuracy of the neural network. Increased segmentation performance was observed when the
network implemented an efficient fully connected Conditional Random Field (CRF).
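As an illustration of the overlap score mentioned above, the Sørensen–Dice coefficient between a predicted region A and a ground-truth region B is 2|A ∩ B| / (|A| + |B|). The following is a minimal NumPy sketch, not the evaluation code of [17]. The function names, the toy label volumes and the convention of returning 1 when a class is absent from both volumes are assumptions made for illustration.

```python
import numpy as np

def dice_coefficient(pred, truth, label):
    """Sørensen-Dice overlap for one class label between two label volumes."""
    a = (pred == label)
    b = (truth == label)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Assumed convention: a class absent from both volumes counts as perfect overlap.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom

def mean_dice(pred, truth, labels):
    """Mean of the per-class Dice coefficients over the given labels."""
    return float(np.mean([dice_coefficient(pred, truth, l) for l in labels]))

# Toy 2 x 2 x 2 label volumes (hypothetical data); one voxel of class 1 is mislabeled as 2.
truth = np.array([[[0, 0], [1, 1]], [[1, 1], [2, 2]]])
pred = np.array([[[0, 0], [1, 1]], [[1, 2], [2, 2]]])

print(dice_coefficient(pred, truth, 1))  # 6/7, about 0.857
print(dice_coefficient(pred, truth, 2))  # 0.8
print(mean_dice(pred, truth, [0, 1, 2]))
```

Averaging per class, rather than over all voxels, keeps a dominant background class from hiding poor overlap on small structures.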
Fig. 1. Convolutional Neural Network, AlexNet[13]
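The pipeline described in Section II-A (convolution, ReLU, max pooling, fully-connected layer, softmax) can be sketched for a single volumetric input as follows. This is a hedged illustration of the classification-style structure only, not one of the three segmentation architectures implemented in the project. The input size, the number and size of filters, the random weights and all function names are assumptions.

```python
import numpy as np

def conv3d_valid(volume, filters):
    """Valid 3D cross-correlation of a (D, H, W) volume with (F, k, k, k) filters."""
    k = filters.shape[1]
    # Every k x k x k patch of the volume: shape (D-k+1, H-k+1, W-k+1, k, k, k).
    patches = np.lib.stride_tricks.sliding_window_view(volume, (k, k, k))
    # Dot product of each patch with each filter gives F feature maps.
    return np.einsum("dhwijk,fijk->fdhw", patches, filters)

def relu(x):
    """Rectified linear unit, the non-linearity applied to the feature maps."""
    return np.maximum(x, 0.0)

def max_pool3d(maps, size=2):
    """Non-overlapping max pooling over the spatial axes of (F, D, H, W) maps."""
    f, d, h, w = maps.shape
    d, h, w = d // size, h // size, w // size
    trimmed = maps[:, :d * size, :h * size, :w * size]
    blocks = trimmed.reshape(f, d, size, h, size, w, size)
    return blocks.max(axis=(2, 4, 6))

def softmax(z):
    """Numerically stable softmax over a 1D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 8, 8))            # hypothetical 8 x 8 x 8 input
filters = rng.standard_normal((4, 3, 3, 3)) * 0.1  # 4 untrained 3 x 3 x 3 filters
n_classes = 3                                      # assumed number of classes

features = max_pool3d(relu(conv3d_valid(volume, filters)))  # shape (4, 3, 3, 3)
weights = rng.standard_normal((n_classes, features.size)) * 0.1
probs = softmax(weights @ features.ravel())        # distribution over the classes

print(features.shape, probs.sum())
```

The weights here are random, so the output probabilities carry no meaning. The sketch only shows how the tensor shapes change through each stage.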
loss can only be used for binary classification, foreground or
9. Convolution 2 x 2 x 2, 4 filters, Stride: 2, 8x8x8x4
Fig. 9. Class-wise Validation Accuracy (-10dB Noise)
Fig. 12. Segmentation results on images with -10dB Noise
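The figures above refer to images corrupted with noise at -10 dB. The text does not state how the noise level was set, so the following is only a sketch under one common assumption: SNR is the ratio of mean signal power to noise variance, expressed in decibels, and the noise is zero-mean Gaussian. The function name and the synthetic cube volume are hypothetical.

```python
import numpy as np

def add_gaussian_noise(volume, snr_db, rng):
    """Add zero-mean Gaussian noise so that signal power / noise power matches snr_db."""
    signal_power = np.mean(volume.astype(float) ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=volume.shape)
    return volume + noise

rng = np.random.default_rng(0)
clean = np.zeros((32, 32, 32))
clean[8:24, 8:24, 8:24] = 1.0  # hypothetical artificial volume: a bright cube

noisy = add_gaussian_noise(clean, snr_db=-10.0, rng=rng)

# Measure the SNR actually obtained from the realized noise.
measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
print(round(measured, 2))
```

Under this definition, -10 dB corresponds to a noise power ten times the signal power, and 0 dB to equal signal and noise power.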
C. SNR Study
The effect of noise on the performance of the networks in volumetric segmentation can be
observed in Figs. 7 to 12. The amount of noise does not seem to adversely affect the
performance of the network with inception blocks. Architecture A6, which was shown to
overcome the class imbalance issue on noiseless images, seems to be affected quite heavily
by the noise introduced for the SNR study, in that it failed to learn to classify the
classes other than the background class, on which it was able to achieve more than 99%
accuracy. This might imply that the addition of noise to the images increases the severity
of the class imbalance, or that it shifts the mean and variance of the data samples,
thereby making it more difficult for the network to find the optimal mapping. Architecture
A7 shows a more interesting result: it achieves similar performance on very noisy images
(-10 dB) as on noiseless images, but fails to perform segmentation on images with 0 dB
noise. This failure might be due to the optimizer becoming stuck in a local minimum from
which it cannot escape, possibly caused by the addition of noise. For the classical
implementations of the fully convolutional neural network, it is more difficult to provide
a general interpretation of how the amount of noise in the images affects their
performance, due to the class imbalance issue.

V. CONCLUSION
In conclusion, the project has successfully implemented and characterized three fully
convolutional architectures, including implementations of state-of-the-art architectural
choices that were shown to allow deep neural networks to achieve significantly better
performance. Deeper neural networks are quite robust and can, under certain conditions,
overcome the issue of class imbalance while also performing remarkably well on images with
a signal-to-noise ratio of less than one. However, the performance shown in this paper is
limited to deep learning neural networks trained and evaluated on artificial data, with
access to an abundance of examples with perfect ground truth. Although similar performance
cannot be expected when dealing with real data with high variability, this project provides
an informative exercise in the characterization of deep learning neural networks for
volumetric semantic segmentation.

REFERENCES
[1] J. Long, E. Shelhamer, and T. Darrell, “Fully Convolutional Networks for Semantic
Segmentation,” IEEE Conf. Comput. Vis. Pattern Recognit., pp. 3431–3440, 2015.
[2] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,”
arXiv:1512.03385v1, pp. 1–12, 2015.
[3] C. Szegedy et al., “Going Deeper with Convolutions,” in Computer Vision and Pattern
Recognition (CVPR), 2015 IEEE Conference, pp. 1–9, 2015.
[4] D. K. Nithin and P. B. Sivakumar, “Generic Feature Learning in Computer Vision,”
Procedia Comput. Sci., vol. 58, pp. 202–209, 2015.
[5] K. Cheon, J. Kim, M. Hamadache, and D. Lee, “On Replacing PID Controller with Deep
Learning Controller for DC Motor System,” J. Autom. Control Eng., vol. 3, no. 6, pp.
452–456, 2015.
[6] J. Schmidhuber, “Deep Learning in neural networks: An overview,” Neural Networks, vol.
61, pp. 85–117, 2015.
[7] A. Luckow, M. Cook, N. Ashcraft, E. Weill, E. Djerekarov, and B. Vorster, “Deep
Learning in the Automotive Industry: Applications and Tools,” 2016 IEEE Int. Conf. Big Data
(Big Data), pp. 3759–3768, 2017.
[8] K. Kamnitsas et al., “Efficient multi-scale 3D CNN with fully connected CRF for
accurate brain lesion segmentation,” Med. Image Anal., vol. 36, pp. 61–78, Feb. 2017.
[9] J. Dolz, C. Desrosiers, and I. Ben Ayed, “3D fully convolutional networks for
subcortical segmentation in MRI: A large-scale study,” Neuroimage, Apr. 2017.
[10] Q. Dou et al., “3D deeply supervised network for automated segmentation of volumetric
medical images,” Med. Image Anal., vol. 41, no. 4, pp. 40–54, 2017.
[11] J. Kleesiek et al., “Deep MRI brain extraction: A 3D convolutional neural network for
skull stripping,” Neuroimage, vol. 129, pp. 460–469, 2016.
[12] F. Milletari, N. Navab, and S.-A. Ahmadi, “V-Net: Fully Convolutional Neural Networks
for Volumetric Medical Image Segmentation,” ArXiv, pp. 1–11, 2016.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep
Convolutional Neural Networks,” Adv. Neural Inf. Process. Syst., pp. 1–9, 2012.
[14] V. Nair and G. E. Hinton, “Rectified Linear Units Improve Restricted Boltzmann
Machines,” Proc. 27th Int. Conf. Mach. Learn., no. 3, pp. 807–814, 2010.
[15] D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,”
arXiv:1412.6980v9, pp. 1–15, 2014.
[16] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical
Image Segmentation,” in Medical Image Computing and Computer-Assisted Intervention --
MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015,
Proceedings, Part III, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi, Eds. Cham:
Springer International Publishing, pp. 234–241, 2015.
[17] C. Wachinger, M. Reuter, and T. Klein, “DeepNAT: Deep convolutional neural network for
segmenting neuroanatomy,” Neuroimage, vol. 170, pp. 434–445, 2017.
[18] B. Kayalibay, G. Jensen, and P. van der Smagt, “CNN-based Segmentation of Medical
Imaging Data,” arXiv:1701.03056v2, 2017.
[19] D. Ciresan, A. Giusti, L. Gambardella, and J. Schmidhuber, “Deep Neural Networks
Segment Neuronal Membranes in Electron Microscopy Images,” Adv. Neural Inf. Process. Syst.
25, pp. 2843–2851, 2012.
[20] A. J. Asman and B. A. Landman, “Formulating spatially varying performance in the
statistical fusion framework,” IEEE Trans. Med. Imaging, vol. 31, no. 6, pp. 1326–1336,
2012.
[21] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, “Striving for
Simplicity: The All Convolutional Net,” pp. 1–14, 2014.
[22] A. Garcia-Garcia, S. Orts-Escolano, S. Oprea, V. Villena-Martinez, and J.
Garcia-Rodriguez, “A Review on Deep Learning Techniques Applied to Semantic Segmentation,”
pp. 1–23, 2017.
[23] A. A. Taha and A. Hanbury, “Metrics for evaluating 3D medical image segmentation:
Analysis, selection, and tool,” BMC Med. Imaging, vol. 15, no. 1, 2015.