
Emotion Recognition Using Physiological Signals

Lan Li1 and Ji-hua Chen2


1 School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, 212013, P.R. China
yaolan_us@ujs.edu.cn
2 Institute of Biomedical Engineering, Jiangsu University, Zhenjiang, 212013, P.R. China

Abstract. The ability to recognize emotion is one of the hallmarks of emotional intelligence. This paper proposes recognizing emotion from physiological signals that can be obtained from the body surface of multiple subjects without much discomfort. Film clips were used to elicit the target emotions, and an emotion elicitation protocol, verified to be effective in a preliminary study, is provided. Four physiological signals, electrocardiogram (ECG), skin temperature (SKT), skin conductance (SC) and respiration, were selected, and 22 features were extracted from them for recognition. We collected data from 60 female undergraduates while they were experiencing the target emotions. Canonical correlation analysis was adopted as the pattern classifier, and the correct-classification ratio is 85.3%. The results indicate the feasibility of user-independent emotion recognition using physiological signals. However, much work remains before emotion interpretation can occur at the level of human abilities.

1 Introduction
Affective computing has become an active topic in computer science. Recording and recognizing the physiological signatures of emotion has become an increasingly important field of research in affective computing and human-computer interfaces [1]. Traditional investigation, which has made considerable achievements, is based on the recording and statistical analysis of physiological signals from the autonomic nervous system [2]. Some researchers have worked on developing wearable devices, while others have devoted themselves to implementing physiological signal-based emotion recognition systems [3,4,5]. In 1999, researchers at IBM developed an emotion mouse that was about 75 percent successful in determining a user's emotional state [3]. In 2001, Picard and colleagues at the MIT Media Laboratory developed pattern recognition algorithms that attained 81% classification accuracy [4]. Because their data were acquired from only one subject, these methods can only recognize the emotions of that subject. In 2004, Kim and his group developed a multiple-user emotion recognition system using short-term monitoring of physiological signals. A support vector machine (SVM) was adopted as the pattern classifier, and the correct-classification ratio for 50 subjects was 78.4% [5].
This paper discusses how to recognize emotion using four physiological signals obtained from multiple subjects. It differs from previous research in several respects. Film clips were used to arouse the inner feelings of the subjects, and an emotion elicitation protocol, verified to be effective in a preliminary study, is provided. Twenty-two features were extracted from the four physiological signals. Canonical correlation analysis was adopted to find the relationship between the three emotions and the extracted features, and the recognition accuracy is 85.3%.

Z. Pan et al. (Eds.): ICAT 2006, LNCS 4282, pp. 437-446, 2006. © Springer-Verlag Berlin Heidelberg 2006

2 Method
2.1 Emotion Elicitation Protocol
Compared with other emotion elicitation techniques such as images, sounds, facial and body movement, scripted and unscripted social interactions, music, and so on, films are more reliable and more naturalistic for inducing the internal feelings of subjects [6]. In order to evoke the specific target emotions effectively, we chose more than three film clips (3~8 minutes in length) as stimuli for each target emotion. A total of 89 undergraduate students (36 male, 53 female), aged from 18 to 23, at Jiangsu University took part in the preliminary test. In a post-film questionnaire, the subjects were asked to report the emotion elicited by each film clip and to rate its intensity on a five-point scale (0-very low, 1-low, 2-high, 3-median high, 4-very high). Validity (the percentage of subjects who reported that the given stimulus properly induced the intended emotion) and the average intensity of the emotion were used to evaluate the quality of each stimulus.
From the preliminary test, we drew the following conclusions. The stimulus clips for fear, neutral and joy were chosen successfully, while those for anger, disgust, sadness, embarrassment, surprise, anguish, contempt, stress, interest and satisfaction were not. The most effective stimulus for each of the three emotions was picked out and is shown in Table 1. The validity for fear, neutral and joy is 86.67%, 93.33% and 90.0% respectively, while the average intensity is 2.85, 2.63 and 3.25 respectively. Given the same stimulus, female students reported much stronger feelings than male students. Showing the three stimuli in the order fear-joy-neutral proved more effective for moving the subjects from one emotion state to the next than the order fear-neutral-joy.
Table 1. Summary of emotion-elicitation protocol
Target emotion   Description           Film                         Clip length (min'sec")
Fear             Frightened            The Doll                     5'12"
Joy              Uplifting happiness   Tom and Jerry                6'45"
Neutral          Relaxation, vacancy   Noncommercial screen saver   3'26"
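
The two stimulus-quality measures defined above (validity and average intensity) can be computed directly from the questionnaire answers. The following is a minimal sketch rather than the authors' procedure: the function name `evaluate_stimulus`, the data layout and the sample answers are hypothetical, and it assumes that the average intensity is taken over the subjects who reported the intended emotion, which the text does not specify.

```python
from statistics import mean


def evaluate_stimulus(reports, target):
    """Return (validity in percent, average intensity) for one film clip.

    reports: one (reported_emotion, intensity) pair per subject, where
             intensity is the 0..4 rating from the post-film questionnaire.
    target:  the emotion the clip was meant to elicit.
    """
    # Ratings of the subjects who reported the intended emotion.
    hits = [intensity for emotion, intensity in reports if emotion == target]
    validity = 100.0 * len(hits) / len(reports)
    # Assumption: intensity is averaged over those subjects only.
    avg_intensity = mean(hits) if hits else 0.0
    return validity, avg_intensity


# Hypothetical answers from four subjects for a clip targeting fear.
sample_reports = [("fear", 3), ("fear", 4), ("joy", 1), ("fear", 2)]
validity, avg_intensity = evaluate_stimulus(sample_reports, "fear")
```

Ranking the candidate clips for one emotion by these two numbers gives one possible way to pick the most effective stimulus.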

2.2 Acquisition of Emotion-Specific Physiological Signals
Subjects. Since the target emotion states were easier to elicit from female students, our subjects were 60 female undergraduate students aged from 18 to 23. All subjects were healthy volunteers without any history of medical, neurological or psychiatric illness. They had not exercised heavily in the four hours before the test and had not taken any medicine in the preceding week. Apart from these criteria, the subjects were chosen randomly, and the order in which they were tested was also random.
Selection of the physiological signals. For a practical recognition algorithm, the choice of physiological signals is both important and tightly constrained. Although the electroencephalogram (EEG), facial electromyograms and blood pressure may be helpful for the research, attaching electrodes to the scalp or face does not seem tolerable in practical use, so they were left out of consideration. Almost all current wearable devices can measure four major physiological signals: skin temperature (SKT), skin conductance (SC), heart rate (HR) and electrocardiogram (ECG).
SKT is an important and effective indicator of emotion states, and it reflects autonomic nervous system activity. The variation in SKT due to emotional stimuli was studied by Shusterman and Barnea (1995) and Kataoka (1998) [7].
Skin conductance (SC), sometimes called electrodermal activity (EDA) or galvanic skin response (GSR), is another important signal representing the activity of the autonomic nervous system (ANS). It characterizes changes in the electrical properties of the skin due to the activity of the sweat glands and is physically interpreted as conductance. The sweat glands distributed over the skin receive input only from the sympathetic nervous system, so SC is a good indicator of the arousal level caused by a cognitive stimulus.
Heart rate (HR) is dually controlled by the sympathetic (increase) and parasympathetic (decrease) branches of the ANS, which may act independently [8].
In addition, time-domain features such as the mean and standard deviation (SD) of the heart rate variability (HRV) time series have been considered significant for exploring the autonomic nervous system in many previous studies of cardiac function assessment and psychophysiological investigation, and they have frequently been used as features [9]. Since the ECG signal can be obtained relatively easily, and both HR and the time-domain features of HRV can be computed from it, the ECG signal is very important in this study.
Respiration is also very important in emotion research. Specific emotional expressions, such as crying, laughing or shouting, have unique respiratory signatures. A detailed quantification of the volume, timing and shape parameters of the respiratory waveform can be mapped onto different emotional states along the dimensions of calm-excitement, relaxation-tenseness, and active vs. passive coping [10]. Initial evidence indicates that respiratory parameters also map onto the affective space dimensions of valence (aversive vs. appetitive stimulus quality) and arousal (activating vs. calming stimulus quality) [11].
Therefore, in this research four physiological signals, SKT, SC, ECG and respiration, were selected for extracting the features used in emotion recognition.
Experiment devices. Good devices are essential for gathering good data. The equipment used in this research was:


The PowerLab data acquisition system with Chart software, developed by ADInstruments (Australia), including:
   ML870 PowerLab 8-channel data acquisition system
   MLT409/A SKT probe
   ML309 Thermistor Pod
   MLT116F GSR finger electrodes
   ML116 GSR Amp
   MLT1132 piezo respiratory belt transducer
   MLA700 reusable ECG clamp electrodes
   ML132 Bio Amp
A USB camera (800 x 600 pixels).
Two computers. One was used to show the film clips that elicit the target emotion; the other, connected to the PowerLab ML870 and to the concealed USB camera, was used to record, preprocess and analyze the physiological data and to monitor the facial and body expressions of the subject.
Experimental method
Preparation. Before the experiment, the subject was asked to provide personal information and sign a volunteer agreement. Through conversation, we helped the subject relax, set aside any curiosity about the experiment and reach a calm state. The sensors were then attached. The MLT409/A SKT probe was placed on the tip of the left thumb. The bipolar MLT116F GSR finger electrodes measured SC from the middle segments of the index and middle fingers of the left hand. For lead I, the three reusable MLA700 ECG clamp electrodes were attached to the two wrists and the right ankle. The MLT1132 piezo respiratory belt transducer was placed around the body at the level of maximum respiratory expansion.
Instructions. "The test will begin soon. Please make yourself as relaxed and comfortable as possible. After 1 minute, the film clips will be shown. Please watch the films carefully and don't move unnecessarily. Let yourself experience whatever emotions you have as fully as you can, and don't try to hold back or hold in your feelings. Thank you for your cooperation!"
Data collection. Acquiring a high-quality database of physiological signals is vital for developing a successful emotion recognition algorithm. The emotion stimuli were shown in the order fear-joy-neutral.
One emotion session took approximately 5~8 minutes. The first minute was used to measure the baseline without any stimulus; the emotional stimulus was then applied. The physiological signals were recorded on all channels at a sampling rate of 400 Hz using the PowerLab 8-channel data acquisition system with Chart software. At the same time, a trained graduate student monitored the subject through the hidden PC camera. When there were sudden changes in facial or body expression, marks and annotations were added to the record to provide supplementary information for later data processing. After each session, the subject was asked to complete the post-film questionnaire, followed by a 5-minute interval for the subject to calm down before the next emotion session. Segments of the raw SC and respiratory signals under the three emotion states are shown in Fig. 1.
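
The timing described above fixes how many samples each channel contains and where the baseline ends. The following is a minimal sketch, assuming each channel has been exported as a plain sequence of samples; the function names and the export format are hypothetical, and Chart's own file handling is not shown.

```python
FS_HZ = 400        # sampling rate of every channel, as stated in the text
BASELINE_S = 60    # the first minute is recorded without any stimulus


def expected_samples(duration_min, fs_hz=FS_HZ):
    """Number of samples per channel for a session of the given length."""
    return int(duration_min * 60 * fs_hz)


def split_session(samples, fs_hz=FS_HZ, baseline_s=BASELINE_S):
    """Split one channel into its baseline part and its stimulus part."""
    n_baseline = baseline_s * fs_hz
    return samples[:n_baseline], samples[n_baseline:]


# Placeholder for a 5-minute recording of one channel (all zeros).
channel = [0.0] * expected_samples(5)
baseline, stimulus = split_session(channel)
```

For sessions of 5 to 8 minutes this gives between 120,000 and 192,000 samples per channel, of which the first 24,000 belong to the baseline.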

[Figure 1: panels (a)-(f); panel (f) plots RR intervals (ms) against interval number.]

Fig. 1. Examples of physiological signals measured from a user while experiencing the target emotions (Fear, Joy, Neutral). From top to bottom: (a) skin conductivity (microsiemens), (b) the difference of skin conductivity, (c) respiratory signal (raw digital signal obtained with the respiratory belt), (d) respiratory rate, (e) heart rate, (f) HRV time series. The sampling rate is 400 samples/s. The segments shown here are visibly different for the three emotions, which is not true in general.


2.3 Data Processing
Since each emotion session lasted around 5~8 minutes and the signals were sampled at 400 Hz, there were about 120 to 192 thousand samples per physiological signal. It was therefore necessary to extract a short, significant segment of each signal from the raw data. The only criterion for extraction was whether the signals had been sampled while the subject was experiencing strong feelings of the target emotion. According to emotion theory, SC is a good indicator of arousal level. Based on joint consideration of the SC differentiation signal and the marks and annotations made during the experiments, 1-minute data segments were taken from the raw SKT, ECG and respiration waveforms for feature extraction, for each of the three emotions.
SKT. No special signal processing was necessary for SKT. Although frequency-domain analysis of the time-varying SKT has been reported [7], here we used the mean and the difference between the maximum and the minimum within 1 min as the SKT features.
SC. The raw SC signal is shown in Fig. 1a, and its differentiation signal, computed by the Chart software, is shown in Fig. 1b. In this research, the SC differentiation signal served as an important basis for choosing segments from the raw waveforms, and its first difference was used as an important feature [4].
Respiration. Respiration parameters can be computed from the changes in thoracic or abdominal circumference during respiration, which are easily measured by the piezo-electric device in the MLT1132 respiratory belt transducer. Here, we took the respiratory rate (see Fig. 1d) and the peak inspiratory amplitude as features.
ECG. Heart rate can be computed by R-peak detection. Fig. 1e illustrates the heart rate waveform derived from the raw ECG signal. The mean of the heart rate and the difference between its maximum and minimum were used as features. The HRV features can be calculated conveniently and accurately by Chart.
These are:
HF power: high-frequency heart rate variability spectral power [0.15~0.4 Hz]
LF power: low-frequency heart rate variability spectral power [0.04~0.15 Hz]
LF/HF: ratio of low- to high-frequency power
Mean NN: the mean of the normal cardiac cycle
SDNN: the standard deviation of the normal cardiac cycle
SD Delta NN: the standard deviation of the delta of the normal cardiac cycles
Mean T: the mean of the T-wave amplitude
SDT: the standard deviation of the T-wave amplitude
SD Delta T: the standard deviation of the delta of the T-wave amplitude
Mean R: the mean of the R-wave amplitude
SDR: the standard deviation of the R-wave amplitude
SD Delta R: the standard deviation of the delta of the R-wave amplitude
PNN50: (NN50 count) / (total NN count), the fraction of consecutive NN intervals that differ by more than 50 ms
RMSSD: the square root of the mean of the squared deltas of the normal cardiac cycles
Ratio: SDNN / SD Delta NN
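
Several of the features above are simple statistics of a 1-minute segment. The following is a minimal sketch of how they could be computed, not a description of what the Chart software does: the function names and the sample values are hypothetical, the sample standard deviation (n - 1 denominator) is an assumption, and PNN50 follows the definition in the list (NN50 count divided by the total NN count), whereas some tools divide by the number of interval pairs instead.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_range(values):
    """Mean and (max - min) of a segment, as used for SKT and heart rate."""
    return mean(values), max(values) - min(values)


def heart_rate_bpm(rr_ms):
    """Instantaneous heart rate in beats/min from R-R intervals in ms."""
    return [60000.0 / rr for rr in rr_ms]


def hrv_time_domain(nn_ms):
    """Time-domain HRV features from normal-to-normal intervals in ms."""
    # Deltas between successive normal cardiac cycles.
    deltas = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    sdnn = stdev(nn_ms)            # assumption: sample SD
    sd_delta_nn = stdev(deltas)
    nn50 = sum(1 for d in deltas if abs(d) > 50)
    return {
        "mean_nn": mean(nn_ms),
        "sdnn": sdnn,
        "sd_delta_nn": sd_delta_nn,
        "rmssd": sqrt(mean([d * d for d in deltas])),
        "pnn50": nn50 / len(nn_ms),   # NN50 count / total NN count
        "ratio": sdnn / sd_delta_nn,
    }


nn = [800, 810, 790, 860, 850]          # hypothetical NN intervals (ms)
features = hrv_time_domain(nn)
skt_mean, skt_range = mean_and_range([33.0, 33.5, 34.0])  # hypothetical SKT (deg C)
```

The spectral features (HF power, LF power, LF/HF) and the T-wave and R-wave amplitude features need the resampled HRV series and the raw ECG waveform, so they are left out of this sketch.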


The features listed above were adopted in this research. Some of them are reported in the literature to correlate with emotion, and others we supposed to do so.

2.4 Pattern Classification Using Canonical Correlation Analysis
Feature vectors extracted from multiple subjects under the same emotion state form a distribution in a high-dimensional space. Duda projected them onto a two-dimensional space for visualization using a Fisher projection. That research showed that the projected feature vectors from the same emotion state formed a cluster with a large amount of variation, and that the clusters from different emotion states overlapped significantly. Kim proposed solving this difficult high-dimensional classification problem with a support vector machine (SVM) classifier [5]. SVM is based on the property that separation by a linear classifier becomes more promising after a nonlinear mapping onto a high-dimensional space; the linear classifier with maximum generalization performance is derived from the statistical learning theory of Vapnik.
Unlike SVM, canonical correlation analysis (CCA), which is closely related to multivariate multiple regression analysis, finds two sets of basis vectors such that the correlation matrix between the projected variables is diagonal and the correlations on the diagonal are maximized. Given two sets of random variables, $X = (X_1, \ldots, X_p)$ and $Y = (Y_1, \ldots, Y_m)$, the basic CCA model is:

$$CV_{X1} = a_1 X_1 + a_2 X_2 + \cdots + a_p X_p, \qquad CV_{Y1} = b_1 Y_1 + b_2 Y_2 + \cdots + b_m Y_m \tag{1}$$

The goal is to describe the relationships between the two sets of variables. The canonical weights (coefficients) $a_1, a_2, \ldots, a_p$ are applied to the p X variables and $b_1, b_2, \ldots, b_m$ to the m Y variables in such a way that the correlation between $CV_{X1}$ and $CV_{Y1}$ is maximized. CCA thus combines a predictor with a dimension reducer. More details can be found in [12]. In this research, considering its consistent success in previous evaluations of feature selection algorithms, CCA was adopted as the classifier.
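
The maximization described above can be written out explicitly. The following is the standard textbook formulation rather than anything specific to this study: the vectors $a = (a_1, \ldots, a_p)^\top$ and $b = (b_1, \ldots, b_m)^\top$ collect the canonical weights, and the covariance matrices $\Sigma_{XX}$, $\Sigma_{YY}$, $\Sigma_{XY}$ (with $\Sigma_{YX} = \Sigma_{XY}^\top$) and the symbol $\rho_1$ for the first canonical correlation are notation introduced here, not in the original text.

```latex
\begin{aligned}
\rho_1 &= \max_{a,\,b}\ \operatorname{corr}\!\left(CV_{X1},\, CV_{Y1}\right)
        = \max_{a,\,b}\ \frac{a^\top \Sigma_{XY}\, b}
                             {\sqrt{a^\top \Sigma_{XX}\, a}\ \sqrt{b^\top \Sigma_{YY}\, b}}, \\[4pt]
&\Sigma_{XX}^{-1}\,\Sigma_{XY}\,\Sigma_{YY}^{-1}\,\Sigma_{YX}\; a = \rho_1^{2}\, a,
\qquad
b \propto \Sigma_{YY}^{-1}\,\Sigma_{YX}\; a .
\end{aligned}
```

The maximizing $a$ is therefore the eigenvector belonging to the largest eigenvalue $\rho_1^2$, and further canonical pairs come from the remaining eigenvectors. If the Y variables are taken to be indicator variables for the emotion classes (an assumption; the text does not state how Y was coded), three emotions need only two indicators, so at most two canonical functions exist.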

3 Results
The classification results are shown in Table 2. 85.3% of the test cases were correctly classified. The classification rates for fear, neutral and joy were 76%, 94% and 86% respectively (38, 47 and 43 of the 50 cases per emotion in Table 2). A combined-group plot is shown in Fig. 2, which visualizes the territorial map of the three emotions. The figure shows some overlap between fear and joy, which may explain the lower classification rates for these two emotions.
Table 2. Classification rates for three emotions
Initial emotion   Predicted group membership    Total
                  fear    neutral    joy
fear              38      2          10         50
neutral           1       47         2          50
joy               6       1          43         50
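
The per-emotion and overall rates follow directly from the counts in Table 2. A minimal sketch of that arithmetic is given below; the dictionary layout and the function names are illustrative choices, while the counts are copied from the table.

```python
# Rows: initial (true) emotion; columns: predicted group, from Table 2.
confusion = {
    "fear":    {"fear": 38, "neutral": 2,  "joy": 10},
    "neutral": {"fear": 1,  "neutral": 47, "joy": 2},
    "joy":     {"fear": 6,  "neutral": 1,  "joy": 43},
}


def class_rates(matrix):
    """Percentage of correctly classified cases for each initial emotion."""
    return {
        emotion: 100 * row[emotion] / sum(row.values())
        for emotion, row in matrix.items()
    }


def overall_accuracy(matrix):
    """Percentage of all cases that fall on the diagonal of the matrix."""
    correct = sum(row[emotion] for emotion, row in matrix.items())
    total = sum(sum(row.values()) for row in matrix.values())
    return 100 * correct / total


rates = class_rates(confusion)
overall = round(overall_accuracy(confusion), 1)
```

The diagonal holds 38 + 47 + 43 = 128 of the 150 cases, which is where the overall figure of 85.3% comes from.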



[Figure 2: scatter plot of Function 1 (horizontal axis) against Function 2 (vertical axis); legend: Emotion (Fear, Neutral, Joy) and Group centroid.]

Fig. 2. Combined-group figure

4 Discussion
Like previous research, our study again demonstrates the feasibility of physiological signal-based emotion recognition. Although we do not know exactly what emotion is, how it arises or how it acts, there are clues from which a subject's emotion state can be estimated, such as facial or body expression and tone of speech; among these, physiological signals are the more natural and more believable (see Fig. 1). However, no one knows exactly which signals work best, so the signals adopted differ somewhat from study to study. Furthermore, there is no standard database. All this makes a comparative study of different methods impossible.
In the hope of sharing our view with other researchers in this field, we analyze why the classification accuracy of our method reaches 85.3%.
First, the data are good and effective. They were obtained under emotion states without any external disturbance or interference. To achieve this, the subject was left alone in an ordinary room that was electrically shielded and soundproofed, and mobile phones were turned off. Carefully selected film clips were then shown to arouse the inner feelings of the subject. In general the target emotion could be elicited, so the collected data correspond to real emotion states.
Second, the emotion categories are limited to three, which simplifies the classification. Since Picard obtained 88.3% accuracy for three-emotion recognition in 2001 [4], it seems that the fewer the kinds of emotion, the higher the classification accuracy.
Third, all 60 subjects were female undergraduates. There have been reports of sex differences in several aspects of emotional responding, and we observed such differences in the preliminary study, so we decided to study only female subjects to avoid them.
In addition, a good classifier, CCA, was adopted to find the relationship in this high-dimensional classification problem.
Our method has three shortcomings. First, our database is far from complete. Second, only three kinds of emotion are considered. Third, the signal processing for the SKT and respiration signals is relatively simple. For practical application, future work should address these three aspects.


5 Conclusion
Emotion recognition is one of the key skills of emotional intelligence for adaptive learning systems. The main difficulties in this field are how to gather data that correspond to real emotion states and how to find the relationship between emotion and physiological signals in a high-dimensional space. After two years of research, we have developed a novel method for user-independent emotion recognition based on the processing of physiological signals.
In our research, the database used for emotion recognition was obtained from multiple subjects while they were experiencing the specific feelings, so the bio-signal database represents natural emotion states. To arouse the inner feelings of the subjects, film clips, the more effective technique, were used as stimuli. This differs from most previous studies, in which the emotion was intentionally tried and felt [3] or acted out [4]. After the preliminary test, three clips that successfully elicit the target emotions were selected, which provides a good emotion elicitation protocol for emotion research on Chinese undergraduates.
For practical application, we selected four significant physiological signals that are relatively easy to obtain, ECG, SC, SKT and respiration, to recognize emotion. From the data processing, 22 features were obtained from the raw signals. To overcome the difficulty of high-dimensional classification, CCA, a combination of predictor and dimension-reducing technique, was adopted. The recognition accuracy is 85.3%, which is higher than the 78.4% reported for the multiple-user system in [5]. The classification results are encouraging and show the feasibility of user-independent emotion recognition based on physiological signals. We expect to develop a simple and accurate affect estimation system that lets a machine understand the user's feelings.
Future work will address two aspects:
Building a good database. This is the most arduous and difficult task, but also a foundational and significant one. Since there is no standard database, there is no standard by which to evaluate researchers' work. A standard database is urgently needed.
Feature extraction. What emotion is, how it arises and how it acts remain unknown, which makes it very difficult to select the most relevant physiological features for classification. Much foundational work should be done to explore the relationship between emotion and physiological signals, which is the key to emotion recognition.
In a word, it is feasible to classify emotion with physiological signals. However, much work remains before emotion interpretation can occur at the level of human abilities.
Acknowledgement. The work is supported by Academy Natural Science Foundation of Jiangsu Province (04KJB310171), and Advanced Technologist Research Start Foundation of Jiangsu University (05JDG029).


References
1. Picard, R. W.: Affective Computing, MIT Press, Cambridge, MA, (1997)
2. Andreassi, J. L.: Psychophysiology: human behavior and physiological response, Lawrence Erlbaum Associates, New Jersey, (2000)
3. Ark, W., Dryer, D. C., Lu, D. J.: The emotion mouse, 8th Int. Conf. Human-Computer Interaction, (1999) 453-458
4. Picard, R. W., Vyzas, E., Healey, J.: Toward machine emotional intelligence: analysis of affective physiological state, IEEE Transactions on Pattern Analysis and Machine Intelligence, (2001) 23(10): 1175-1191
5. Kim, K. H., Bang, S. W., et al.: Emotion recognition system using short-term monitoring of physiological signals, Med. Biol. Eng. Comput., (2004) 42: 419-427
6. Gross, J. J., Levenson, R. W.: Emotional suppression: Physiology, self-report, and expressive behavior, Journal of Personality & Social Psychology, (1993) 64: 970-986
7. Shusterman, V., Barnea, O.: Analysis of skin-temperature variability compared to variability of blood pressure and heart rate, IEEE Ann. Conf. Engineering Medicine Biology Society, (1995) 1027-1028
8. Berntson, G. G., Cacioppo, J. T., et al.: Autonomic determinism: the modes of autonomic control, the doctrine of autonomic space, and the laws of autonomic constraint, Psychological Review, (1991) 98: 459-487
9. McCraty, R., Atkinson, M., et al.: The effects of emotions on short term power spectrum analysis of heart rate variability, Am. J. Cardiol., (1995) 76: 1089-1093
10. Grossman, P., Wientjes, C. J.: How breathing adjusts to mental and physical demands, Respiration and Emotion, Springer, New York, (2001) 43-53
11. Ritz, J., Nixon, A., et al.: Airway response of healthy individuals to affective picture series, International Journal of Psychophysiology, (2002) 46(1): 67-75
12. Hotelling, H.: Relations between two sets of variates, Biometrika, (1936) 28: 321-377
