Professional Documents
Culture Documents
Abstract
In this paper, we introduce a hand gesture recognition
system to recognize real time gesture in unconstrained
environments. The system consists of three modules: real
time hand tracking, training gesture and gesture
recognition using pseudo two dimension hidden Markov
models (P2-DHMMs). We have used a Kalman filter and
hand blobs analysis for hand tracking to obtain motion
descriptors and hand region. It is fairy robust to
background cluster and uses skin color for hand gesture
tracking and recognition. Furthermore, there have been
proposed to improve the overall performance of the
approach: (1) Intelligent selection of training images and
(2) Adaptive threshold gesture to remove non-gesture
pattern that helps to qualify an input pattern as a gesture.
A gesture recognition system which can reliably
recognize single-hand gestures in real time on standard
hardware is developed. In the experiments, we have
tested our system to vocabulary of 36 gestures including
the America sign language (ASL) letter spelling alphabet
and digits, and results effectiveness of the approach.
Keywords: Hand gesture recognition; Hand tracking;
Kalman filter; Pseudo 2-D Hidden Markov models.
1. Introduction
Gesture recognition is an area of active current research
in computer vision. It brings visions of more accessible
computer system. In this paper we focus on the problem
of hand gesture recognition using a real time tracking
method with Pseudo two-dimensional hidden Markov
models (P2-DHMMs). We have considered singlehanded gestures, which are sequences of distinct hand
shapes and hand region. Gesture is defined as a motion
of the hand to communicate with a computer.
Many approaches to gesture recognition have been
developed. Pavlovic [1] present an extensive review of
existing technique for interpretation of hand gestures. A
large variety of techniques have been used for modeling
the hand. An approach based on the 2-D locations of
fingertips and palms was used by Davis and Shah [2].
362
Database of
predefined
gestures
Predefined gestures is
forward to the
recognition algorithms
Hand
Detected
&
Tracking
Hand
ROI
Hidden
Markov
Models
Gesture by
best match
Gestured
recognition
20
x
x
I ( x, y ) ,
M 02 = y 2 I ( x, y) .
then the hand orientation (major axis) is
arctan
Hand region &
filtered trajectory
M
2
M
M
20
00
11
00
x c2
x c y c
M 02
y c2
M 00
(a + c) + b 2 + (a c) 2 ,
w=
2
(a + c) b 2 + (a c) 2
2
363
I 2 x 2 represent a 2x2
x t = x t 1 + Gwt 1
(2)
y t = Hx t + v t
(3)
0 T
1 0
0
0
1
0
0
T
0
(4)
0 0 1 0
G=
0 0 0 1
1 0 0 0
H =
0 1 0 0
(5)
(6)
(7)
xt = {xt 1 + K t 1 ( yt 1 Hxt 1 )}
(8)
Pt = ( Pt 1 K t 1 HPt 1 )T +
w2
Qt 1
v2
(9)
Pt
equals t |t 1 / v2 ,
t |t 1
represents the
x t + m|t = m {x t + K t ( y t Hx t 1 )}
( )
Pt + m|t = m ( Pt Kt HPt ) T
m 1
w2
kQ
v2 k = 0
(10)
( )
T k
(11)
t + m|t
represents
K t = Pt H T ( HPt H T + I 2 x 2 ) 1
(a)
(b)
(c)
(d)
364
4. Gesture Recognition
Hidden Markov Models are finite non-deterministic state
machines, which have been successfully applied to
numerous applications. They consist of a fixed number
states with associated output probability density
functions (pdfs) as well as transition probabilities aij. For
r
a continuous HMM the pdf b j (o ) of state Sj is usually
given by a finite Gaussian mixture of the form:
r
b j (o ) =
m =1
r r
c jm N ( o , jm ,
jm
(12)
r r
and N (o , jm ,
jm
r
Pr(Ot , Qrt ) =
q1,q2 ,..,qT
allQ
q1bq1 (o1t )
s=2
(14)
P (O ) a
1
t =2
rt 1rt Prt
r
(Ot )
(15)
r
Pr(O ) =
likelihood. Note that both of the Eqs. (14) and (15) can
be effectively approximated by the Viterbi score. One
immediate goal of the Viterbi search is the calculation of
r
the matching likelihood score between O and HMM. The
objective function for an HMM is defined by the
maximum likelihood as
Pr(O,Q) =
allQ
q1 ,q2 ,..,qT
q1 bq1 (o)
t =2
(13)
1 i, j M
r
(Ot , k ) = max
Q
a
s =1
q s 1q s b q s
r
(o st )
(16)
r
D ( O , ) = max
k
a
t =1
rt 1 rt
r
( O t , rt )
(17)
365
thus far, a P2DHMM add only one parameter set, i.e., the
super-state transitions, to the conventional HMM
parameter sets. Therefore it is simple extension to
conventional HMM.
Design of hand gesture models. Although these paper
[11,12] introduced the P2-DHMMs and show promising
results, a formal definition of the model as well as details
of the algorithms used have been omitted. But this
approach is for the first time introduced to hand gesture
recognition framework. For each gesture there is a P2DHMMs.
Super-states
state
Di2, j =
( d ( n) d
i
j ( n))
(18)
n =1
Blocks extraction
Sampling windows
Figure 6: Sampling window
366
t1
t2
t3
tn
(e)"A" gesture
(20)
P(XTMG)(G)<(XTMTM)P(TM) (21)
where XG, XTM, G, TM denote a gesture pattern, a nongesture pattern, the target gesture model and the
threshold model, respectively.
It is imply that a gesture should best match with the
corresponding gesture model and that a non-gesture with
the threshold model, respectively. Inequality (20) say that
the likelihood of a gesture model should be greater than
that of the threshold model.
5. Experimental results
In this section we present the recognition results of 36
gestures, which includes ASL letter spelling alphabet and
digits (Fig. 9). We discuss results of each of the steps
involved before presenting the overall results.
5.1 Results of tracking
The complete system works at about 25 frames/sec. If
the hand moves too fast Kalman filter at this will not find
hand pixels correctly, the tracker starts going out of track
due to the translation and scale parameters get distorted
values. Beside this if initialization is not good, the tracker
can not archived the result of tracking and lost of track.
367
6. Conclusions
We have developed a gesture recognition system that is
shown to be robust for ASL gestures. The system is fully
automatic and it works in real-time. It is fairly robust to
background cluster. The advantage of the system lies in
the ease of its use. The users do not need to wear a glove,
neither is there need for a uniform background.
Experiments on a single hand database have been carried
out and recognition accuracy of up to 98% has been
achieved. We plan to extend our system into 3D tracking.
Currently, our tracking method is limited to 2D. We will
therefore investigate a practice 3D hand tracking
technique using multiple cameras. Focus will also be
given to further improve our system and the use of a
lager hand database to test system and recognition.
References
[1] V.I. Pavlovic, R. Sharma, T.S. Huang. Visual
interpretation of hand gestures for human-computer
interaction, A Review, IEEE Transactions on Pattern
Analysis and Machine Intelligence 19(7): 677-695, 1997.
[2] J.Davis, M.Shah. Recognizing hand gestures. In
Proceedings of European Conference on Computer
Vision, ECCV: 331-340, 1994.
[3] D.J.Turman, D. Zeltzer. Survey of glove-based input.
IEEE Computer Graphics and Application 14:30-39,
1994.
[4] Starner, T. and Pentland. Real-Time American Sign
Language Recognition from Video Using Hidden
Markov Models, TR-375, MIT Media Lab, 1995.
[5] R.Kjeldsen, J.Kender. Visual hand gesture
recognition for window system control, in IWAFGR:
184-188, 1995.
[6] M.Zhao, F.K.H. Quek, Xindong Wu. Recursive
induction learning in hand gesture recognition, IEEE
Trans. Pattern Anal. Mach. Intell. 20 (11): 1174-1185,
1998.
[7] Hyeon-Kyu Lee, Jin H. Kim. An HMM- based
threshold model approach for gesture recognition, IEEE
Trans. Pattern Anal. Mach. Intell. 201(10): 961-973,
1999.
[8] Ho-Sub Yoon, Jung Soh, Younglae J. Bae, Hyun
Seung Yang. Hand gesture recognition using combined
features of location, angle, velocity, Pattern Recognition
34 : 1491-1501, 2001.
[9] R. Lockton, A. W. Fitzgibbon. Real-time gesture
recognition using deterministic boosting, Preceedings of
British Machine Vision Conference, 2002.
[10] K. Oka, Y. Sato and H.Koike. Real-Time Fingertip
Tracking and Gesture Recognition. IEEE Computer
Graphics and Applications: 64-71, 2002.
[11] O.E. Agazzi and S.S.Kuo. Pseudo two-dimensional
hidden markov model for document recognition. AT&T
Technical Journal, 72(5): 60-72, Oct, 1993.
368