
A Brief Performance Evaluation of ECG Feature Extraction Techniques for Artificial Neural Network Based Classification

Rajesh Ghongade#, Dr. A. A. Ghatol*

#Vishwakarma Institute of Information Technology, Pune-411048, India
*Dr. Babasaheb Ambedkar Technological University, Lonere-402103, India

Abstract - The electrocardiogram is the most easily accessible bioelectric signal and provides doctors with reasonably accurate data regarding a patient's heart condition. Many cardiac problems are visible as distortions in the electrocardiogram (ECG). Normally, ECG-related diagnoses are carried out manually by medical practitioners. The major task in diagnosing the heart condition is analyzing each heartbeat and correlating the distortions found therein with various heart diseases. Since abnormal heartbeats can occur randomly, it becomes very tedious and time-consuming to analyze, say, a 24-hour ECG signal, as it may contain hundreds of thousands of heartbeats. Hence it is desirable to automate the entire process of heartbeat classification and, preferably, to diagnose it accurately. In this paper the authors focus on various schemes for extracting useful features of the ECG signal for use with artificial neural networks. Once feature extraction is done, ANNs can be trained to classify the patterns reasonably accurately. Arrhythmia is one such type of abnormality detectable from an ECG signal. The three classes of ECG signals considered are Normal, Fusion and Premature Ventricular Contraction (PVC). The task of an ANN-based system is to correctly identify the three classes, most importantly the PVC type, this being a fatal cardiac condition. Transform-based and morphological feature extraction schemes are the most widely preferred. Three transform schemes, namely the Discrete Fourier Transform, Principal Component Analysis and the Discrete Wavelet Transform, along with three morphological feature extraction schemes, are discussed and compared in this paper.

Keywords: ECG, MLP, feature extraction, DWT, DFT, PCA

I. INTRODUCTION

The heart is one of the most important organs of the human body and is hence termed a vital organ. Its major function is to deliver oxygen to all the cells of the body. If this function is impaired, the oxygen supply is hampered and the cells are deprived of this essential element. Irregularity of the heart function is termed arrhythmia and is the most commonly observed cardiac disorder. There are numerous types of arrhythmia; premature ventricular contraction, which is a pre-indication of lethal ventricular fibrillation [1][2], is considered here. The ECG is one of the most effective diagnostic tools available to medics. With many thousands of heartbeats to examine, it becomes tedious to locate the distorted heartbeats and then classify them [12]. Figure 1 depicts the three types of heartbeats. In this paper we discuss various schemes to extract useful information from the ECG signal and use these features to train an ANN for pattern recognition of these three types of heartbeats, classifying them correctly as Normal (N), Fusion (F) and Premature Ventricular Contraction (PVC). We have used the ECG data available from the MIT/BIH Arrhythmia Database, since it is considered the benchmark data.

Figure 1. Types of heartbeats

Figure 2. ECG morphology

II. ECG DATA PRE-PROCESSING

The data available from the MIT/BIH Arrhythmia Database [14] is the standard used by many researchers. This database contains 25 records of 30 minutes duration each. The ECG signal is sampled at 360 Hz with a resolution of 11 bits. Since we are interested in only certain types of arrhythmia, three records were used. Since each record is a continuous waveform, it is necessary to extract single heartbeats. This is done by locating the R peak and extracting 90 samples on either side of it. It is not mandatory to use all 180 samples for classification, but here we use all 180 samples [1]. Figure 2 shows the data selection method for one heartbeat. Since the MIT/BIH database comes with annotations for each heartbeat, it is necessary to associate the categories with the individual extracted beats. A total of 1827 beats were extracted, with 609 beats belonging to each category, viz. Normal, Fusion and PVC. We could use all 180 samples for training an ANN, but this may give rise to two problems, namely overtraining and excessive computational overhead. Overtraining affects the accuracy adversely, while computational overhead puts constraints on the speed and the required resources. Hence only useful features have to be identified, so that only these features are used to train an ANN and the above mentioned problems are avoided [12][13].

Figure 2. Selection of samples
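As an illustration of this pre-processing step, the following Python sketch extracts fixed-length beats of 180 samples centred on the annotated R peaks. It is a minimal sketch assuming the open-source wfdb package and locally available MIT/BIH record files; the record name, channel index and label mapping are illustrative choices, not values fixed by the paper.

    import numpy as np
    import wfdb  # assumed: the wfdb-python package for reading PhysioNet records

    HALF_WINDOW = 90                                     # 90 samples on either side of the R peak
    LABELS = {"N": "Normal", "F": "Fusion", "V": "PVC"}  # MIT/BIH annotation symbols of interest

    def extract_beats(record_name="100", ann_ext="atr"):
        """Return (beats, labels): beats is an (n, 180) array of single heartbeats."""
        record = wfdb.rdrecord(record_name)              # hypothetical local record files
        annotation = wfdb.rdann(record_name, ann_ext)
        signal = record.p_signal[:, 0]                   # first ECG channel

        beats, labels = [], []
        for sample, symbol in zip(annotation.sample, annotation.symbol):
            if symbol not in LABELS:
                continue                                 # keep only the three classes studied here
            start, stop = sample - HALF_WINDOW, sample + HALF_WINDOW
            if start < 0 or stop > len(signal):
                continue                                 # skip beats too close to the record edges
            beats.append(signal[start:stop])             # 180-sample window centred on the R peak
            labels.append(LABELS[symbol])
        return np.asarray(beats), labels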

III. TRANSFORM FEATURE EXTRACTION

A. DISCRETE FOURIER TRANSFORM

The Discrete Fourier Transform (DFT) of a sequence $\{u(n),\ n = 0, 1, \ldots, N-1\}$ is defined as

$v(k) = \sum_{n=0}^{N-1} u(n)\, W_N^{kn}, \qquad k = 0, 1, \ldots, N-1$   (1)

where

$W_N = e^{-j 2\pi / N}$   (2)

The resulting sequence consists of real and imaginary components [11]. Again, not all components are required as feature vectors, and we can select those components with a higher energy contribution. Furthermore, only the real part of the components can be used. We have selected the real part of only 20 components, which preserves the ECG waveform morphology. It was also found that using more than 20 components does not contribute much to the classifier accuracy. Once we select the 20 significant components, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
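The DFT-based feature selection described above can be sketched in Python as follows. This is a minimal illustration assuming NumPy; keeping the first 20 bins as the "significant" components is an assumption made for demonstration, since the paper does not list the exact indices retained.

    import numpy as np

    def dft_features(beat, n_components=20):
        """Real parts of the first n_components DFT coefficients of one 180-sample beat."""
        spectrum = np.fft.fft(beat)               # v(k) = sum_n u(n) * exp(-j*2*pi*k*n/N)
        return spectrum.real[:n_components]       # keep only the real part, as described above

    # Example: build the feature matrix for a set of extracted beats.
    # beats = np.asarray(...)                     # shape (n_beats, 180), from the pre-processing step
    # features = np.apply_along_axis(dft_features, 1, beats)   # shape (n_beats, 20)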

B. PRINCIPAL COMPONENT ANALYSIS

The aim of PCA, also known as the Karhunen-Loève Transform (KLT), is to reduce the dimensionality of the data set [1]. The importance of the variables is statistically evaluated, so this method is based on the significance of the information: the KLT identifies the directions of the signal with maximum energy, or variance. Below we present the algorithm to obtain the principal components of a vector set X, represented by an M × N matrix, where N represents the number of segments or vectors x_i in the vector set, i = 0, 1, ..., N-1, and M represents the number of samples per segment, i.e. the dimension of the vectors that constitute the vector set.

PCA ALGORITHM:

a) Obtain the mean vector and the mean-adjusted data:

$\bar{x} = \frac{1}{N} \sum_{i=0}^{N-1} x_i, \qquad \text{Mean Adjusted Data} = (X - \bar{x})$   (3)

b) Obtain the covariance matrix:

$C_x = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \bar{x})(x_i - \bar{x})^T$   (4)

c) Obtain the eigenvectors and eigenvalues:

$C_x e = \lambda e$   (5)

where e is an eigenvector and $\lambda$ is the corresponding eigenvalue.

d) Choose components and form a feature vector: we get 180 components, corresponding to the dimensionality of the input sequence. Components that are significant in terms of their contribution to the total energy of the signal are selected; together, the selected components must constitute about 99% of the total energy of the signal. This procedure decreases the data dimensionality without significant loss of information.

$\text{Feature Vector} = (eig_1\ eig_2\ eig_3\ \ldots\ eig_P)$   (6)

e) Create the new data set:

$\text{New Data} = [\text{Feature Vector}]^T \times \text{Mean Adjusted Data}$   (7)

Once we select the 14 significant components, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
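A compact Python sketch of this procedure is given below. It is an illustration assuming NumPy; the 99% energy threshold follows the description above, and the variable names are hypothetical.

    import numpy as np

    def pca_features(beats, energy_threshold=0.99):
        """Project mean-adjusted beats onto the principal components carrying ~99% of the energy."""
        mean_vec = beats.mean(axis=0)                          # step (a): mean vector
        adjusted = beats - mean_vec                            # mean-adjusted data
        cov = np.cov(adjusted, rowvar=False)                   # step (b): covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)                 # step (c): eigen decomposition
        order = np.argsort(eigvals)[::-1]                      # sort by decreasing eigenvalue
        eigvals, eigvecs = eigvals[order], eigvecs[:, order]
        energy = np.cumsum(eigvals) / eigvals.sum()            # step (d): cumulative energy
        n_keep = int(np.searchsorted(energy, energy_threshold)) + 1
        feature_vector = eigvecs[:, :n_keep]                   # matrix of selected eigenvectors
        return adjusted @ feature_vector                       # step (e): new, reduced data set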

C. DISCRETE WAVELET TRANSFORM

Wavelet analysis consists of decomposing a signal or an image into a hierarchical set of approximations and details. The levels in the hierarchy often correspond to those of a dyadic scale. The continuous wavelet transform is defined as

$CWT(a, b) = \frac{1}{\sqrt{|a|}} \int f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt$   (8)

where a and b are the dilation and translation parameters, respectively. A wide variety of functions can be chosen as the mother wavelet $\psi$, provided $\psi(t) \in L^2$ and

$\int \psi(t)\, dt = 0$   (9)

An elegant way to compute the Discrete Wavelet Transform (DWT) is to convolve the signal with a pair of appropriately designed quadrature mirror filters (QMF) and then downsample by a factor of two. The QMF pair which decomposes the signal consists of a low-pass filter H and a high-pass filter G, which split the signal bandwidth in half. The 23 coefficients corresponding to the approximation A3 are retained for training; the other components are discarded. Again, these 23 coefficients are selected on the basis of the morphology preservation criterion [8].

Once these 23 DWT coefficients are selected, the next task is to train an MLP with these coefficients as inputs and the category of the heartbeat as output.
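The level-3 decomposition described here can be sketched with the PyWavelets package. This is only an illustration: the Haar mother wavelet is chosen because, for a 180-sample beat, it yields 23 approximation coefficients at level 3, consistent with the count above; the paper itself does not name the wavelet used in this section.

    import pywt  # assumed: the PyWavelets package

    def dwt_features(beat, wavelet="haar", level=3):
        """Return the level-3 approximation coefficients (A3) of one 180-sample beat."""
        coeffs = pywt.wavedec(beat, wavelet, level=level)   # [A3, D3, D2, D1]
        return coeffs[0]                                    # keep A3, discard the detail coefficients
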
IV. MORPHOLOGICAL FEATURE EXTRACTION

A. R-peak

One very peculiar feature that distinguishes the Fusion type from the other types, i.e. the Normal and PVC types, is the amplitude of the R-peak. Figure 3 depicts the R-peak distribution for the three classes. The R-peak value is computed as the difference between the mean level of the ECG samples and the R-peak. We observe that the three classes of heartbeats are more or less distinctly separated.

Figure 3. R-peak value (mV) distribution over the dataset

B. QRS area

Another important observation is that the QRS area differs considerably among the three classes under consideration, as Figure 4 depicts. It is evident that this value can serve as an important feature for classification.

Figure 4. QRS area value (V-s) distribution over the dataset

C. Q-S distance

The time difference between the Q and S points is also found to be an important feature that differentiates the three classes. Figure 5 clearly shows the difference in the Q-S distance values.

Figure 5. Q-S distance (seconds) distribution over the entire dataset
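The three morphological features can be computed along the lines of the Python sketch below. It is an illustration assuming NumPy, the 360 Hz sampling rate of the database and a beat centred on its R peak; the simple minimum search used to locate the Q and S points is an assumed heuristic, not necessarily the detection method used by the authors.

    import numpy as np

    FS = 360.0          # sampling frequency of the MIT/BIH records (Hz)
    R_INDEX = 90        # the beat is 180 samples long and centred on the R peak

    def morphological_features(beat, search=25):
        """R-peak amplitude, QRS area and Q-S distance for one 180-sample beat."""
        r_amplitude = beat[R_INDEX] - beat.mean()     # R-peak relative to the mean level

        # Assumed heuristic: Q and S are the minima just before and after the R peak.
        q_index = R_INDEX - search + np.argmin(beat[R_INDEX - search:R_INDEX])
        s_index = R_INDEX + 1 + np.argmin(beat[R_INDEX + 1:R_INDEX + 1 + search])

        qrs_area = np.trapz(np.abs(beat[q_index:s_index + 1] - beat.mean()), dx=1.0 / FS)
        qs_distance = (s_index - q_index) / FS        # Q-S distance in seconds
        return r_amplitude, qrs_area, qs_distance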

V. ARTIFICIAL NEURAL NETWORKS

An artificial neural network is inspired by the biological neurons present in the human brain. Though the sheer number of biological neurons and their high interconnectivity are impossible to duplicate, scaled-down artificial neural network models show a similar capacity for learning and generalizing the intrinsic characteristics of the information presented to them.

A. Multilayer Perceptron (MLP)

This is one of the most popular and powerful ANN architectures. The MLP not only overcomes the severe limitations of the simple perceptron but also offers the most important feature of learning. It is thus a very powerful pattern classifier. Function approximation can also be done quite effectively by the MLP. Figure 6 depicts the MLP architecture used for experimentation.

Figure 6. Multilayer perceptron (input layer, hidden layer, output layer)

The entire process of classification consists of two phases: a training phase and a testing phase. MLP training is executed using the well-known backpropagation technique. After a satisfactory training error level is obtained, the training is terminated and new data are presented to the network to test its performance.

B. Backpropagation Algorithm

The following steps constitute the backpropagation algorithm:

a) Initialize the weights and biases with small random values.
b) Present a continuous-valued input vector x_0, x_1, ..., x_{N-1} and specify the desired outputs d_0, d_1, ..., d_{M-1}.
c) Compute the actual outputs y_0, y_1, ..., y_{M-1}.
d) Adapt the weights: use a recursive algorithm starting at the output nodes and working back to the first hidden layer. Adjust the weights by

$w_{ij}(t+1) = w_{ij}(t) + \eta\, \delta_j\, x_i'$

where $w_{ij}(t)$ is the weight from hidden node i (or from an input) to node j at time t, $x_i'$ is either the output of node i or an input, $\eta$ is a gain term and $\delta_j$ is an error term for node j. If node j is an output node, then

$\delta_j = (1 - y_j^2)(d_j - y_j)$

where $d_j$ is the desired output of node j and $y_j$ is the actual output. If node j is an internal (hidden) node, then

$\delta_j = (1 - x_j'^2) \sum_k \delta_k\, w_{jk}$

where the sum over k runs over all nodes in the layer above node j. Internal node thresholds are adapted in a similar manner by treating them as connection weights on links from auxiliary constant-valued inputs (called biases). Convergence is faster if a momentum term is added and the weight changes are smoothed by

$w_{ij}(t+1) = w_{ij}(t) + \eta\, \delta_j\, x_i' + \alpha\, [\, w_{ij}(t) - w_{ij}(t-1)\, ]$

where $0 < \alpha < 1$. Here we have used the hyperbolic tangent sigmoid function at the output of each neuron, given by

$y = \frac{e^x - e^{-x}}{e^x + e^{-x}}$

If the mean square error (MSE) is not acceptable, go back to step b) [3].
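For clarity, the update rules above can be expressed as the following Python sketch of one training step for a single-hidden-layer MLP with tanh activations and momentum. It is a minimal illustration of the stated equations, not the authors' actual implementation; the learning-rate and momentum values are placeholders.

    import numpy as np

    def backprop_step(x, d, W1, W2, V1, V2, eta=0.01, alpha=0.9):
        """One backpropagation update with momentum for a 1-hidden-layer tanh MLP.

        x: input vector, d: desired outputs, W1/W2: weight matrices (bias column
        included), V1/V2: previous weight changes used for the momentum term.
        """
        # Forward pass (a constant 1 is appended so the bias acts as an extra weight).
        x_aug = np.append(x, 1.0)
        h = np.tanh(W1 @ x_aug)              # hidden-layer outputs x'
        h_aug = np.append(h, 1.0)
        y = np.tanh(W2 @ h_aug)              # network outputs

        # Error terms: delta_j = (1 - y_j^2)(d_j - y_j) for output nodes,
        # delta_j = (1 - x_j'^2) * sum_k delta_k * w_jk for hidden nodes.
        delta_out = (1.0 - y**2) * (d - y)
        delta_hid = (1.0 - h**2) * (W2[:, :-1].T @ delta_out)

        # Weight changes with momentum: dW = eta * delta * input + alpha * previous change.
        dW2 = eta * np.outer(delta_out, h_aug) + alpha * V2
        dW1 = eta * np.outer(delta_hid, x_aug) + alpha * V1
        return W1 + dW1, W2 + dW2, dW1, dW2  # updated weights and new momentum terms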

VI. EXPERIMENTATION AND RESULTS

We used the transformed data obtained with the above-mentioned techniques to train an MLP. The MLP architecture consisted of one hidden layer with 15 neurons and three output neurons corresponding to the three classes under consideration. However, the number of input neurons differed across the four methods, since the number of features obtained varied with the extraction scheme. The training set consisted of equal numbers of samples of each type, to reflect equal-probability data. Multifold and differential training was employed to enable true network learning and to remove any dataset bias. Rigorous experimentation indicated that the momentum learning rule for the MLP and the tan-sigmoid activation function produced the best results. The optimum number of hidden-layer neurons and training iterations was explored experimentally; naturally, the four feature extraction techniques exhibited different optimum configurations. Memorizing due to overtraining was avoided by subjecting the networks to between 1000 and 40000 iterations and finding the iteration count beyond which the testing accuracy deteriorated. Once the training phase was complete, we tested the trained network with 609 samples (203 samples of each type), so the network was subjected to equal-sample test data. The comparative performance of the four ANN-based classifiers is presented in Table I.

TABLE I. PERFORMANCE OF CLASSIFIERS (% accuracy)

Sr No.  Method                          Normal   Fusion   PVC      Average
1       DFT + MLP                       100      95.07    93.59    96.22
2       PCA + MLP                       100      89.16    94.55    94.57
3       DWT + MLP                       100      87.68    95.566   94.42
4       Morphological features + MLP    98       86.5     96.5     93.67

VII. DISCUSSIONS

ANN-based classifiers are now an established paradigm for pattern classification, the MLP being a simple yet powerful architecture. The main power and robustness of the MLP can be harnessed by proper selection of the input features. All the ECG sample values of a heartbeat could be presented to the MLP, but this consumes a lot of computational resources and training time; it may also overtrain the network and prevent it from generalizing. Feature extraction therefore proves essential for drastically reducing the number of inputs to the ANN. In addition, the bare minimum of prominent markers is obtained, which enhances the performance of the ANN classifier and makes it more robust and fault tolerant. The MLP generalizes properly, as can be seen from the results (Table I); the accuracy shown is the testing accuracy, not the training accuracy. Table I also shows the comparative performance in terms of accuracy: it can be seen that the accuracy for the PVC type is good for all the classifiers. Transform-based feature extraction thus enables us to realize a reliable classifier. ANNs utilizing morphological features also show good performance and are computationally more efficient (Table II). Thus the feature extraction techniques play a crucial role in the performance of these classifiers. A definite improvement in accuracy can be achieved by selecting the optimum network topology, learning rule, number of training iterations and learning rate.

TABLE II. RESOURCES CONSUMED IN TERMS OF NEURONS

Sr No.  Method                          Input   Hidden layer   Total
1       DFT + MLP                       10      8              21
2       PCA + MLP                       15      10             28
3       DWT + MLP                       23      10             36
4       Morphological features + MLP    3       6              12

REFERENCES
[1] Fabian Vargas, Djones Lettin, Maria Cristina Felippetto de Castro and Marcello Macarthy, "Electrocardiogram Pattern Recognition by Means of MLP Network and PCA: A Case Study on Equal Amount of Input Signal Types", IEEE, 2002.
[2] G. Krishna Prasad and J. S. Shambi, "Classification of ECG Arrhythmias using Multi-Resolution Analysis and Neural Networks", IEEE, 2003.
[3] N. Maglaveras, T. Stampkopoulos, K. Diamantaras, C. Pappas and M. Strintzis, "ECG pattern recognition and classification using non-linear transformations and neural networks: A review", International Journal of Medical Informatics, vol. 52, nos. 1-3, pp. 191-208, 1998.
[4] N. Mahalingam and D. Kumar, "Neural Networks for signal processing applications: ECG classification", Australas. Phys. Eng. Sci. Med., vol. 20, no. 3, pp. 147-151, 1997.
[5] Rajesh Ghongade and A. A. Ghatol, "Electrocardiogram Pattern Classification: An Approach Employing DWT and Artificial Neural Networks", Proceedings of INCON 2004.
[6] Alexander D. Poularikas, The Transforms and Applications Handbook, Second Edition, CRC Press, 1999.
[7] Lin He, Wensheng Hou, Xiaolin Zhen and Chenglin Peng, "Recognition of ECG Patterns Using Artificial Neural Network", Proceedings of the Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), IEEE, 2006.
[8] H. Gholam Hosseini, D. Luo and K. J. Reynolds, "The comparison of different feed forward neural network architectures for ECG signal diagnosis", Medical Engineering & Physics, vol. 28, pp. 372-378, Elsevier, 2005.
[9] http://ecg.mit.edu
[10] MATLAB, MathWorks. http://www.mathworks.com/
[11] NeuroSolutions. http://www.nd.com/
