
J Med Syst (2010) 34:643–650 DOI 10.1007/s10916-009-9277-6

ORIGINAL PAPER

Recurrent Neural Networks for Diagnosis of Carpal Tunnel Syndrome Using Electrophysiologic Findings
Konuralp Ilbay & Elif Derya Übeyli & Gul Ilbay & Faik Budak

Received: 20 February 2009 / Accepted: 12 March 2009 / Published online: 1 April 2009
© Springer Science + Business Media, LLC 2009

Abstract This paper presents the use of recurrent neural networks (RNNs) for the diagnosis of carpal tunnel syndrome (CTS) in four classes (normal, right CTS, left CTS, bilateral CTS). The RNN is trained with the Levenberg-Marquardt algorithm on four features of CTS (right median motor latency, left median motor latency, right median sensory latency, left median sensory latency). A multilayer perceptron neural network (MLPNN) is also implemented to compare the performance of the classifiers on the same diagnosis problem. The total classification accuracy of the RNN is high (94.80%). The obtained results confirm the validity of RNNs as an aid to clinical decision-making.

Keywords Carpal tunnel syndrome · Median motor latency · Median sensory latency · Classification accuracy · Recurrent neural network

Introduction

Carpal tunnel syndrome (CTS) results from compromise of median nerve function at the wrist caused by increased pressure in the carpal tunnel, an anatomical compartment bounded by the bones of the carpus and the transverse carpal ligament [1]. CTS mainly affects middle-aged women. In the majority of patients the exact cause and pathogenesis of CTS are unclear [2]. Patients with CTS may present with a variety of symptoms and signs. Although CTS is usually bilateral both clinically and electrically, the dominant hand is usually more severely affected, especially in idiopathic cases. Patients complain of wrist and arm pain associated with paresthesias in the hand. The pain may be localized to the wrist or may radiate to the forearm, arm, or rarely the shoulder. Paresthesias are frequently present in the median nerve distribution (medial thumb, index, middle, and lateral ring finger). Although many patients report that the entire hand falls asleep, when asked directly about little finger involvement most will subsequently note that the little finger is spared. Symptoms are often provoked when either a flexed or extended wrist posture is assumed. Most commonly, this occurs during ordinary activities, such as driving or holding a phone, book, or newspaper. Nocturnal paresthesias are particularly common [3].

Artificial neural networks (ANNs) are computational modeling tools that have recently emerged and found

K. Ilbay
Department of Neurosurgery, Faculty of Medicine, Kocaeli University, Umut Tepe Campus, 31380 Kocaeli, Turkey
e-mail: konuralpilbay@yahoo.com

E. D. Übeyli (*)
Department of Electrical and Electronics Engineering, Faculty of Engineering, TOBB Ekonomi ve Teknoloji Üniversitesi, 06530 Söğütözü, Ankara, Turkey
e-mail: edubeyli@etu.edu.tr
URL: http://www.etu.edu.tr/~edubeyli/

G. Ilbay
Department of Physiology, Faculty of Medicine, Kocaeli University, Umut Tepe Campus, 31380 Kocaeli, Turkey
e-mail: gulilbay@yahoo.com

F. Budak
Department of Neurology, Faculty of Medicine, Kocaeli University, Umut Tepe Campus, 31380 Kocaeli, Turkey
e-mail: fbudak@kou.edu.tr


extensive acceptance in many disciplines for modeling complex real-world problems. ANNs produce complicated nonlinear models relating the inputs (the independent variables of a system) to the outputs (the dependent predictive variables). ANNs have been widely used for various tasks, such as pattern classification, time series prediction, nonlinear control, function approximation, and telecommunications. ANNs are desirable because (1) nonlinearity allows a better fit to the data, (2) noise insensitivity provides accurate prediction in the presence of uncertain data and measurement errors, (3) high parallelism implies fast processing and hardware failure tolerance, (4) learning and adaptivity allow the system to modify its internal structure in response to a changing environment, and (5) generalization enables application of the model to unlearned data. Neural networks can be trained to recognize patterns. Because ANNs generalize from what they learn during training, they can recognize patterns that were not previously encountered [4–6]. Automated diagnostic systems are important applications of pattern recognition, aiming at assisting doctors in making diagnostic decisions. Automated diagnostic systems have been applied to and are of interest for a variety of medical data, such as electrocardiograms (ECGs), electroencephalograms (EEGs), ultrasound signals/images, X-rays, and computed tomographic images [7–17]. Conventional methods of monitoring and diagnosing diseases rely on detecting the presence of particular signal features by a human observer. Due to the large number of patients in intensive care units and the need for continuous observation

of such conditions, several techniques for automated diagnostic systems have been developed in the past 10 years to attempt to solve this problem. Such techniques work by transforming the mostly qualitative diagnostic criteria into a more objective quantitative signal feature classification problem [7–17]. The choice of methods appropriate for a given pattern analysis task is rarely obvious. At each level (feature extraction, feature selection, classification) many methods exist [7–17]. Recurrent neural networks (RNNs) [18–23] have been studied extensively for classification, regression and density estimation. The results of the existing studies [18–23] showed that RNNs achieve high accuracy in the classification of biomedical data; we therefore used RNNs in the diagnosis of CTS. In this study, in order to diagnose CTS, RNNs and a multilayer perceptron neural network (MLPNN) trained with the Levenberg-Marquardt algorithm are implemented (Fig. 1). A significant contribution of the present work is to examine the performance of the RNNs on the diagnosis of CTS (normal, right CTS, left CTS, bilateral CTS).

Database description

We retrospectively considered 350 patients (289 females and 61 males) with various CTS symptoms and signs who underwent nerve conduction studies. Of these patients, 121 had no electrophysiologic evidence of CTS and were accepted as the normal group (103 females and 18 males). The remaining 229 patients suffered from

Fig. 1 A schematic representation of an Elman recurrent neural network. z⁻¹ represents a one-time-step delay unit
[Diagram: input layer (x1, x2, ..., xn), hidden layer, context layer fed through z⁻¹ delay units, output layer (y1, y2, ..., yn)]

Table 1 Values of median motor and sensory latency of the conduction study of the patient with bilateral CTS

| Nerve | Site | Latency (ms) | Amplitude | Area | Segment | Distance (mm) | Interval (ms) | NCV (m/s) |
|---|---|---|---|---|---|---|---|---|
| Motor nerve conduction study | | | | | | | | |
| Median, L | Wrist | 4.74 | 7.83 mV | 16.09 mV·ms | Wrist | | 4.74 | |
| Median, L | Elbow | 8.25 | 6.68 mV | 13.43 mV·ms | Wrist–elbow | 170 | 3.51 | 48.4 |
| Median, R | Wrist | 6.42 | 6.77 mV | 15.66 mV·ms | Wrist | | 6.42 | |
| Median, R | Elbow | 10.26 | 5.24 mV | 11.94 mV·ms | Wrist–elbow | 170 | 3.84 | 44.3 |
| Ulnar, R | Wrist | 2.85 | 18.22 mV | 29.78 mV·ms | Wrist | | 2.85 | |
| Ulnar, R | Elbow | 5.91 | 17.77 mV | 29.36 mV·ms | Wrist–elbow | 180 | 3.06 | 58.8 |
| Sensory nerve conduction study | | | | | | | | |
| Median, L | Wrist | 4.32 | 19.60 µV | 1.24 µV·ms | Wrist | | 4.32 | |
| Median, R | Wrist | 4.72 | 7.20 µV | 1.00 µV·ms | Wrist | | 4.72 | |
| Ulnar, R | Wrist | 2.58 | 31.60 µV | 1.28 µV·ms | Wrist | | 2.58 | |

right CTS (32 females and 15 males), left CTS (22 females and 14 males) or bilateral CTS (132 females and 14 males). Patients with generalized peripheral neuropathy caused by diabetes or another medical illness and those who had undergone prior carpal tunnel surgery were not included in the study. Each subject completed a self-administered questionnaire that focused on hand symptoms commonly associated with CTS. The study was approved by the Ethical Committee of Kocaeli University Hospital.

Method of electrophysiologic evaluation

All the studies were performed with the subjects in the supine position in a warm room with the temperature

Table 2 Values of median motor and sensory latency of the conduction study of the normal subject

| Nerve | Site | Latency (ms) | Amplitude | Area | Segment | Distance (mm) | Interval (ms) | NCV (m/s) |
|---|---|---|---|---|---|---|---|---|
| Motor nerve conduction study | | | | | | | | |
| Median, L | Wrist | 3.92 | 10.91 mV | 17.95 mV·ms | Wrist | | 3.92 | |
| Median, L | Elbow | 7.76 | 12.32 mV | 20.40 mV·ms | Wrist–elbow | 220 | 3.81 | 57.7 |
| Median, R | Wrist | 3.72 | 2.70 mV | 64.21 mV·ms | Wrist | | 3.72 | |
| Median, R | Elbow | 8.04 | 3.38 mV | 51.41 mV·ms | Wrist–elbow | 250 | 4.32 | 57.9 |
| Ulnar, R | Wrist | 2.58 | 4.07 mV | 3.51 mV·ms | Wrist | | 2.85 | |
| Ulnar, R | Elbow | 6.72 | 3.31 mV | 2.64 mV·ms | Wrist–elbow | 240 | 4.14 | 58.0 |
| Sensory nerve conduction study | | | | | | | | |
| Median, L | Wrist | 2.64 | 36.60 µV | 3.25 µV·ms | Wrist | | 2.64 | |
| Median, R | Wrist | 2.76 | 11.60 µV | 1.82 µV·ms | Wrist | | 2.66 | |
| Ulnar, R | Wrist | 2.58 | 25.90 µV | 1.28 µV·ms | Wrist | | 2.36 | |

Fig. 2 The samples of EMG records of the patient with bilateral CTS. (a) Image of motor nerve conduction study of right median nerve. (b) Image of motor nerve conduction study of left median nerve. (c) Image of sensory nerve conduction study of right median nerve. (d) Image of sensory nerve conduction study of left median nerve

maintained at 26 to 28 °C. Skin temperatures were checked over the forearm. Nerve conduction studies were performed on both hands of each subject using standard techniques of supramaximal percutaneous stimulation with a constant current stimulator and surface electrode recording. Sensory responses were obtained by antidromic stimulation at the wrist and recording from the index finger (median nerve) or little finger (ulnar nerve), with ring electrodes at a distance of 14 cm. Median motor responses were obtained by stimulating the median nerve at the wrist and elbow and recording over the abductor pollicis brevis muscle. Ulnar motor responses were obtained by stimulating the ulnar nerve at the wrist, below the elbow, and above the elbow and recording over the abductor digiti minimi muscle, with the arm flexed 135°. In the present study, the following median nerve and ulnar nerve measures were used: (1) distal onset latency of the sensory nerve action potential (DL-S); (2) distal onset latency of the compound muscle action potential (DL-M). A median sensory latency greater than 3.5 ms and a median motor latency greater than 4.2 ms were used as the criteria for abnormal median nerve conduction [24].
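The latency criteria stated above can be read as a simple labeling rule. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it assumes that a hand counts as abnormal when either its sensory or its motor latency exceeds the stated limit, which the text does not spell out. The group assignment in the study came from the electrophysiologic evaluation itself, not from code like this.

```python
# Limits taken from the text; everything else here is an illustrative assumption.
SENSORY_LIMIT_MS = 3.5  # median sensory latency limit
MOTOR_LIMIT_MS = 4.2    # median motor latency limit


def side_is_abnormal(motor_ms, sensory_ms):
    """Return True if one hand meets the abnormal-conduction criteria.

    Assumption: exceeding either limit is enough to flag the hand.
    """
    return motor_ms > MOTOR_LIMIT_MS or sensory_ms > SENSORY_LIMIT_MS


def label_subject(right_motor, left_motor, right_sensory, left_sensory):
    """Map the four latency features (in ms) to one of the four classes."""
    right = side_is_abnormal(right_motor, right_sensory)
    left = side_is_abnormal(left_motor, left_sensory)
    if right and left:
        return "bilateral CTS"
    if right:
        return "right CTS"
    if left:
        return "left CTS"
    return "normal"


# Wrist latencies of the two sample subjects reported in Tables 1 and 2.
print(label_subject(6.42, 4.74, 4.72, 4.32))
print(label_subject(3.72, 3.92, 2.76, 2.64))
```

Under this reading, the sample record of Table 1 is labeled bilateral CTS and the sample record of Table 2 is labeled normal, which agrees with how the two subjects are described.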

Recurrent neural networks

Elman RNNs were used in the diagnosis applications, so the Elman RNN is presented here. An Elman RNN is, in principle, set up as a regular feedforward network: all neurons in one layer are connected with all neurons in the next layer. An exception is the so-called context layer, which is a special case of a hidden layer. The neurons in the context layer (context neurons) hold a copy of the output of the hidden neurons: the output of each hidden neuron is copied into a specific neuron in the context layer. The value of the context neuron is used as an extra input signal for all the neurons in the hidden layer one time step later. Therefore, the Elman network has an explicit memory of one time lag [18]. The strength of each connection between neurons is indicated by a weight. Initially, all weight values are

[Fig. 2 image: traces labeled by stimulation site (wrist, elbow); scale bars 5 mV / 3 ms and 20 µV / 2 ms]

Fig. 3 The samples of EMG records of the normal subject. (e) Image of motor nerve conduction study of right median nerve. (f) Image of motor nerve conduction study of left median nerve. (g) Image of sensory nerve conduction study of right median nerve. (h) Image of sensory nerve conduction study of left median nerve
[Fig. 3 image: traces labeled by stimulation site (wrist, elbow); scale bars 5 mV / 3 ms and 20 µV / 2 ms]

Table 3 Confusion matrix (rows: output result; columns: desired result)

| Classifier | Output result | Normal | Right CTS | Left CTS | Bilateral CTS |
|---|---|---|---|---|---|
| RNN | Normal | 77 | 0 | 0 | 1 |
| RNN | Right CTS | 2 | 28 | 0 | 2 |
| RNN | Left CTS | 1 | 2 | 22 | 2 |
| RNN | Bilateral CTS | 1 | 0 | 2 | 110 |
| MLPNN | Normal | 71 | 0 | 0 | 3 |
| MLPNN | Right CTS | 5 | 26 | 1 | 4 |
| MLPNN | Left CTS | 3 | 3 | 21 | 5 |
| MLPNN | Bilateral CTS | 2 | 1 | 2 | 103 |

Table 4 The values of the statistical parameters (classification accuracies, %)

| Classifier | Specificity | Sensitivity (right CTS) | Sensitivity (left CTS) | Sensitivity (bilateral CTS) | Total classification accuracy |
|---|---|---|---|---|---|
| RNN | 95.06 | 93.33 | 91.67 | 95.65 | 94.80 |
| MLPNN | 87.65 | 86.67 | 87.50 | 89.57 | 88.40 |

chosen randomly and are optimized during the training stage. In an Elman network, the weights from the hidden layer to the context layer are set to one and are fixed because the values of the context neurons have to be copied exactly. Furthermore, the initial output weights of the context neurons are equal to half the output range of the other neurons in the network. The Elman network can be trained with gradient descent backpropagation and optimization methods, similar to regular feedforward neural networks [25]. Backpropagation, however, has problems in many applications. The algorithm is not guaranteed to find the global minimum of the error function, since gradient descent may get stuck in local minima, where it may remain indefinitely. Therefore, many variations to improve the convergence of backpropagation have been proposed [4]. Optimization methods such as second-order methods (conjugate gradient, quasi-Newton, Levenberg-Marquardt) have also been used for neural network training in recent years. The Levenberg-Marquardt algorithm combines the best features of the Gauss-Newton technique and the steepest-descent algorithm, but avoids many of their

limitations. In particular, it generally does not suffer from the problem of slow convergence [26, 27] and can yield a good cost function compared with the other training algorithms. The Levenberg-Marquardt algorithm is a least-squares estimation algorithm based on the maximum neighborhood idea. Let $E(w)$ be an objective error function made up of $m$ individual error terms $e_i^2(w)$ as follows:

$$E(w) = \sum_{i=1}^{m} e_i^2(w) = \| f(w) \|^2, \tag{1}$$

where $e_i^2(w) = (y_{di} - y_i)^2$, $y_{di}$ is the desired value of output neuron $i$, and $y_i$ is the actual output of that neuron. It is assumed that the function $f(\cdot)$ and its Jacobian $J$ are known at point $w$. The aim of the Levenberg-Marquardt algorithm is to compute the weight vector $w$ such that $E(w)$ is minimum. Using the Levenberg-Marquardt algorithm, a new weight vector $w_{k+1}$ can be obtained from the previous weight vector $w_k$ as follows:

$$w_{k+1} = w_k + \delta w_k, \tag{2}$$

where $\delta w_k$ is defined as

$$\delta w_k = -\left( J_k^T J_k + \lambda I \right)^{-1} J_k^T f(w_k). \tag{3}$$

In equation (3), $J_k$ is the Jacobian of $f$ evaluated at $w_k$, $\lambda$ is the Marquardt parameter, and $I$ is the identity matrix [26, 27]. The Levenberg-Marquardt algorithm may be summarized as follows:

1. compute $E(w_k)$,
2. start with a small value of $\lambda$ ($\lambda = 0.01$),
3. solve equation (3) for $\delta w_k$ and compute $E(w_k + \delta w_k)$,
4. if $E(w_k + \delta w_k) \ge E(w_k)$, increase $\lambda$ by a factor of 10 and go to (3),
5. if $E(w_k + \delta w_k) < E(w_k)$, decrease $\lambda$ by a factor of 10, update $w_k$: $w_k \leftarrow w_k + \delta w_k$ and go to (3).
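The Elman structure and the five Levenberg-Marquardt steps can be sketched together in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study (which relied on the MATLAB toolbox): all function names are hypothetical, the Jacobian is approximated by forward differences, the context neurons are assumed to start at 0.5, the bounds `lam_min` and `lam_max` are safeguards added here, and the data are synthetic.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def n_weights(n_in, n_hid, n_out):
    """Number of trainable weights in the toy Elman network below."""
    return n_hid * n_in + n_hid * n_hid + n_hid + n_out * n_hid + n_out


def unpack(w, n_in, n_hid, n_out):
    """Split the flat weight vector w into matrices and bias vectors."""
    shapes = [(n_hid, n_in), (n_hid, n_hid), (n_hid,), (n_out, n_hid), (n_out,)]
    parts, pos = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        parts.append(w[pos:pos + size].reshape(shape))
        pos += size
    return parts


def elman_forward(w, xs, n_in, n_hid, n_out):
    """Run the sequence xs (shape T x n_in) through the network."""
    W_in, W_ctx, b_hid, W_out, b_out = unpack(w, n_in, n_hid, n_out)
    context = np.full(n_hid, 0.5)  # assumed start value of the context neurons
    outputs = []
    for x in xs:
        hidden = sigmoid(W_in @ x + W_ctx @ context + b_hid)
        outputs.append(sigmoid(W_out @ hidden + b_out))
        context = hidden.copy()  # hidden -> context copy with fixed weight one
    return np.array(outputs)


def residuals(w, xs, targets, n_in, n_hid, n_out):
    """f(w): desired minus actual outputs, so that E(w) = ||f(w)||^2."""
    return (targets - elman_forward(w, xs, n_in, n_hid, n_out)).ravel()


def numerical_jacobian(f, w, eps=1e-6):
    """Forward-difference Jacobian of f at w (a choice made for this sketch)."""
    f0 = f(w)
    J = np.empty((f0.size, w.size))
    for j in range(w.size):
        w_step = w.copy()
        w_step[j] += eps
        J[:, j] = (f(w_step) - f0) / eps
    return J


def levenberg_marquardt(f, w, lam=0.01, n_iter=20, lam_min=1e-8, lam_max=1e10):
    """Minimize E(w) = ||f(w)||^2 following steps 1-5 of the summary."""
    E = float(np.sum(f(w) ** 2))                    # step 1
    for _ in range(n_iter):
        J, fk = numerical_jacobian(f, w), f(w)
        while lam <= lam_max:
            A = J.T @ J + lam * np.eye(w.size)
            dw = -np.linalg.solve(A, J.T @ fk)      # step 3, equation (3)
            E_new = float(np.sum(f(w + dw) ** 2))
            if E_new < E:                           # step 5: accept the step
                w, E = w + dw, E_new
                lam = max(lam / 10.0, lam_min)
                break
            lam *= 10.0                             # step 4: reject, damp more
        else:
            break  # no improving step found before lam exceeded lam_max
    return w, E


# Synthetic toy problem: 2 inputs, 3 hidden (recurrent) neurons, 1 output.
rng = np.random.default_rng(0)
xs = rng.normal(size=(30, 2))
targets = (xs[:, :1] > 0).astype(float)
w0 = rng.normal(scale=0.5, size=n_weights(2, 3, 1))
f = lambda w: residuals(w, xs, targets, 2, 3, 1)
w_fit, E_fit = levenberg_marquardt(f, w0)
print("E before:", float(np.sum(f(w0) ** 2)), "E after:", E_fit)
```

Because a step is accepted only when it lowers $E$, the error returned by this loop can never exceed the starting error. The Jacobian is recomputed after each accepted step, which is one reading of "$J_k$ evaluated at $w_k$" in equation (3).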

Results and discussion

The features of CTS (right median motor latency, left median motor latency, right median sensory latency, left median sensory latency) were used as the inputs of the RNNs. The values of these four features for sample records of two classes (bilateral CTS and normal), taken from the reports of the subjects, are presented in Tables 1 and 2. The samples of the electromyogram (EMG) records of the patient with bilateral CTS and the normal subject are shown in Figs. 2 and 3. The MATLAB software package (MATLAB version 7.0 with neural networks toolbox) was used for implementation of the RNN and the MLPNN. The key design decisions for the neural networks used in classification are the architecture and training. The adequate functioning of neural networks depends on the sizes of the training set and test set. Various experiments were performed to determine the sizes of the training and testing sets of the CTS database. In the developed classifiers, 100 of the 350 records were used for training and the rest for testing. The training set consisted of 40 normal, 17 right CTS, 12 left CTS and 31 bilateral CTS records. The testing set consisted of 81 normal, 30 right CTS, 24 left CTS and 115 bilateral CTS records. Experiments were done for different network architectures, and the results confirmed that networks with one hidden layer consisting of 25 recurrent neurons result in higher classification accuracy. In order to compare the performance of the different classifiers on the same classification problem, the MLPNN, which is the

Fig. 4 ROC curves


most commonly used feedforward neural network, was implemented. The single hidden layered (20 hidden neurons) MLPNN was used to classify the CTS. In the hidden layer and the output layer, the activation function was the sigmoidal function. Classification results of the classifiers are displayed by a confusion matrix. In a confusion matrix, each cell contains the raw number of exemplars classified for the corresponding combination of desired and actual network outputs. The confusion matrices showing the classification results of the classifiers used for classification of the CTS are given in Table 3. From these matrices one can tell the frequency with which a record of one class is misclassified as another. The test performance of the classifiers can be determined by the computation of sensitivity, specificity and total classification accuracy, which are defined as:

Sensitivity: number of true positive decisions / number of actually positive cases
Specificity: number of true negative decisions / number of actually negative cases
Total classification accuracy: number of correct decisions / total number of cases
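The three definitions can be checked against the RNN counts of the confusion matrix in Table 3. The Python sketch below is illustrative and its function names are hypothetical. It assumes, consistent with the values in Table 4, that specificity is the share of normal test records classified as normal and that each sensitivity is the share of records of that CTS class classified correctly.

```python
CLASSES = ["normal", "right CTS", "left CTS", "bilateral CTS"]

# RNN counts from Table 3: rows are output results, columns are desired results.
RNN_CONFUSION = [
    [77, 0, 0, 1],
    [2, 28, 0, 2],
    [1, 2, 22, 2],
    [1, 0, 2, 110],
]


def class_rates(matrix):
    """Percentage of each desired class (column) that was classified correctly."""
    n = len(matrix)
    rates = {}
    for j in range(n):
        column_total = sum(matrix[i][j] for i in range(n))
        rates[CLASSES[j]] = 100.0 * matrix[j][j] / column_total
    return rates


def total_accuracy(matrix):
    """Percentage of all test records that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


rates = class_rates(RNN_CONFUSION)
print(f"specificity (normal): {rates['normal']:.2f}")
for name in CLASSES[1:]:
    print(f"sensitivity ({name}): {rates[name]:.2f}")
print(f"total classification accuracy: {total_accuracy(RNN_CONFUSION):.2f}")
```

For the RNN counts this gives 95.06, 93.33, 91.67, 95.65 and 94.80, matching the RNN values in Table 4 (for example, 77 of 81 normal records and 237 of 250 test records overall).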

A true negative decision occurs when both the classifier and the physician suggest the absence of a positive detection. A true positive decision occurs when the positive detection of the classifier coincides with a positive detection of the physician. In order to show the performance of the classifiers used for classification of the CTS, the classification accuracies (specificity, sensitivity, total classification accuracy) on the test sets of the classifiers are presented in Table 4. Receiver operating characteristic (ROC) plots provide a view of the whole spectrum of sensitivities and specificities because all possible sensitivity/specificity pairs for a particular test are graphed. The performance of a test can be evaluated by plotting a ROC curve for the test; therefore, ROC curves were used to describe the performance of the classifiers. A good test is one for which sensitivity rises rapidly and 1−specificity hardly increases at all until sensitivity becomes high. The ROC curves shown in Fig. 4 demonstrate the performances of the classifiers on the test files. From the classification results presented in Table 4 and Fig. 4 (classification accuracies and ROC curves), one can see that the RNN trained on the features produces considerably higher performance than the MLPNN.

Conclusions

The accuracy of RNNs trained on the features of CTS (right median motor latency, left median motor latency, right median sensory latency, left median sensory latency) was analyzed. The performance of the RNNs on the diagnosis of CTS (normal, right CTS, left CTS, bilateral CTS) was presented. In order to evaluate the classifiers used, their classification accuracies and ROC curves were considered. The classification results and the values of the statistical parameters indicated that the RNN had considerable success in discriminating the CTS (total classification accuracy was 94.80%). The performance of the MLPNN was not as high as that of the RNN. This may be attributed to several factors, including the training algorithms, estimation of the network parameters and the scattered and mixed nature of the features.

References

1. Bland, J. D., Carpal tunnel syndrome. BMJ. 335:343–346, 2007. doi:10.1136/bmj.39282.623553.AD.
2. Aroori, S., and Spence, R. A., Carpal tunnel syndrome. Ulster Med. J. 77:6–17, 2008.
3. Preston, D. C., and Shapiro, B. E., Electromyography and neuromuscular disorders. Elsevier Science, Philadelphia, pp. 255–281, 2005.
4. Haykin, S., Neural networks: A comprehensive foundation. Macmillan, New York, 1994.
5. Basheer, I. A., and Hajmeer, M., Artificial neural networks: fundamentals, computing, design, and application. J. Microbiol. Methods. 43(1):3–31, 2000. doi:10.1016/S0167-7012(00)00201-3.
6. Chaudhuri, B. B., and Bhattacharya, U., Efficient training and improved performance of multilayer perceptron in pattern classification. Neurocomputing. 34:11–27, 2000. doi:10.1016/S0925-2312(00)00305-2.
7. Miller, A. S., Blott, B. H., and Hames, T. K., Review of neural network applications in medical imaging and signal processing. Med. Biol. Eng. Comput. 30:449–464, 1992. doi:10.1007/BF02457822.
8. Mobley, B. A., Schechter, E., Moore, W. E., McKee, P. A., and Eichner, J. E., Predictions of coronary artery stenosis by artificial neural network. Artif. Intell. Med. 18:187–203, 2000. doi:10.1016/S0933-3657(99)00040-8.
9. Übeyli, E. D., Comparison of different classification algorithms in clinical decision-making. Expert Syst. 24(1):17–31, 2007. doi:10.1111/j.1468-0394.2007.00418.x.
10. Übeyli, E. D., Analysis of EEG signals by combining eigenvector methods and multiclass support vector machines. Comput. Biol. Med. 38(1):14–22, 2008. doi:10.1016/j.compbiomed.2007.07.004.
11. Übeyli, E. D., Combining neural network models for automated diagnostic systems. J. Med. Syst. 30(6):483–488, 2006. doi:10.1007/s10916-006-9034-z.
12. Übeyli, E. D., A mixture of experts network structure for breast cancer diagnosis. J. Med. Syst. 29(5):569–579, 2005. doi:10.1007/s10916-005-6112-6.
13. Übeyli, E. D., Multiclass support vector machines for diagnosis of erythemato-squamous diseases. Expert Syst. Appl. 35(4):1733–1740, 2008. doi:10.1016/j.eswa.2007.08.067.
14. Übeyli, E. D., Modified mixture of experts for diabetes diagnosis. J. Med. Syst. 2009 (in press).
15. Übeyli, E. D., Adaptive neuro-fuzzy inference systems for automatic detection of breast cancer. J. Med. Syst. 2009 (in press).
16. Übeyli, E. D., and Doğdu, E., Automatic detection of erythemato-squamous diseases using k-means clustering. J. Med. Syst. 2009 (in press).
17. Übeyli, E. D., İlbay, K., İlbay, G., Sahin, D., and Akansel, G., Differentiation of two subtypes of adult hydrocephalus by mixture of experts. J. Med. Syst. 2009 (in press).
18. Elman, J. L., Finding structure in time. Cogn. Sci. 14(2):179–211, 1990.
19. Übeyli, E. D., Recurrent neural networks employing Lyapunov exponents for analysis of Doppler ultrasound signals. Expert Syst. Appl. 34(4):2538–2544, 2008. doi:10.1016/j.eswa.2007.04.002.
20. Übeyli, E. D., Recurrent neural networks with composite features for detection of electrocardiographic changes in partial epileptic patients. Comput. Biol. Med. 38(3):401–410, 2008. doi:10.1016/j.compbiomed.2008.01.002.
21. Übeyli, E. D., Analysis of EEG signals by implementing eigenvector methods/recurrent neural networks. Digit. Signal Process. 19(1):134–143, 2009. doi:10.1016/j.dsp.2008.07.007.
22. Übeyli, E. D., Combining recurrent neural networks with eigenvector methods for classification of ECG beats. Digit. Signal Process. 19(2):320–329, 2009. doi:10.1016/j.dsp.2008.09.002.
23. Übeyli, E. D., and Übeyli, M., Case studies for applications of Elman recurrent neural networks. In: Hu, X., and Balasubramaniam, P. (Eds.), Recurrent Neural Networks. I-Tech Education and Publishing, Chapter 17, pp. 357–376, 2008. ISBN 978-953-7619-08-4.
24. Budak, F., Yenigun, N., Ozbek, A., et al., Carpal tunnel syndrome in carpet weavers. Electromyogr. Clin. Neurophysiol. 41:29–32, 2001.
25. Pineda, F. J., Generalization of back-propagation to recurrent neural networks. Phys. Rev. Lett. 59(19):2229–2232, 1987. doi:10.1103/PhysRevLett.59.2229.
26. Battiti, R., First- and second-order methods for learning: between steepest descent and Newton's method. Neural Comput. 4:141–166, 1992. doi:10.1162/neco.1992.4.2.141.
27. Hagan, M. T., and Menhaj, M. B., Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Netw. 5(6):989–993, 1994. doi:10.1109/72.329697.
