Decoupled EKF For Simultaneous Target Model and Relative Pose Estimation Using Feature Points

Proceedings of the TA6.
4
2005 IEEE Conference on Control Applications
Toronto, Canada, August 28-31, 2005
Decoupled EKF for Simultaneous Target Model and Relative

Pose Estimation Using Feature Points
Lingfeng Deng∗, W. J. Wilson† , F. Janabi-Sharifi‡
∗ MDA
Space Missions, Brampton, Ontario, Canada
† University
of Waterloo, Waterloo, Ontario, Canada
‡ Ryerson University, Toronto, Ontario, Canada
Email: {lfdeng,wwilson}@kingcong.uwaterloo.ca, fsharifi@ryerson.ca

Abstract— In this paper, the problem of combined target based features can be estimated without any knowledge of a
model and relative pose estimation for position-based visual complex scene, and therefore, more realistic visual positioning
servoing is addressed. The target object is assumed to be or tracking tasks (such as pedestrian tracking) can be achieved.
stationary with at least 3 distinguishable feature points. The
midpoint triangulation method and a rough estimation method In the image moment-based method [1], six different image
are developed for initial estimation of the target model and moment features were selected to maximize the decoupling of
relative pose of target in current robot end-effector frame. A each DOF motion control in IBVS. Well-behaved Cartesian
novel decoupled Extended Kalman Filter (EKF)-based online and image trajectories can be achieved simultaneously and
estimation algorithm is proposed to improve the target model the control design is free of sensory space singularities and
and relative pose estimation simultaneously under continuous
robot dynamic motion. This new method is robust to large initial local minima. PBVS control scheme was applied in [11] for
estimation errors and provides consistent and accurate target automatic disassembly of used car engine compartment. Based
model estimation for optimal pose estimation as required in on the edge element extraction and a polyhedral boundary
position-based visual servoing. Experimental results are given representation of each target object, the EKF was used for
to demonstrate the performance of the proposed method. full 3D online pose estimation.
I. I NTRODUCTION This work still focuses on the scenario of using distin-
guishable points of interest on the unknown target object
It is well known that an accurate target model is ex-
since it represents a wide range of real applications. The
plicitly required for proper position-based visual servoing
PBVS control scheme is chosen because it is more prefer-
(PBVS) [13]. On the other hand, in the image-based visual
able to IBVS when dealing with approaching and grasping
servoing (IBVS) the target model is also required to define
of an unknown target. The motion command and feedback
the desired image locations if not using an off-line teach-by-
are both directly available in 3D Cartesian space, which is
showing method. It has been shown in [4] that sensitivities
much more straightforward for implementation and graphical
of the two methods with respect to target modeling errors are
visualization. There are two main contributions of this paper.
similar when comparing the two methods on a common basis.
First, the midpoint triangulation and rough estimation meth-
To address the problem of visual servoing in the presence
ods are developed for initial target model and relative pose
of unknown target model, several model-free hybrid-based [7],
estimation, under the environment of robot visual servoing
[8] and position-based [10] visual servoing methods have been
with respect to a stationary target object. For both methods,
proposed to deal with an unknown target with distinguishable
only a minimum of 3 feature points are required compared
points of interest using the Euclidean reconstruction technique.
with the approaches in [7], [8], [10]. The second and the
This method necessitates decomposition of the homography
most important contribution is the development of a decoupled
for partial pose estimation, which is prone to image mea-
EKF-based algorithm for the simultaneous online estimation
surement noise. Furthermore, at least 4 and 8 feature points
of the unknown target model and relative pose, based on
are required for planar and non-coplanar targets respectively.
the previous midpoint or rough initial estimations. This new
Recently, several new strategies have been proposed to com-
method has good convergence performance and is robust to
pletely remove a prior knowledge of the observed scene. To
large initial estimation errors. Furthermore, compared with the
deal with the positioning task with respect to planar targets of
methods in [7], [8], [10] effects of the image measurement
unknown shape, a coupled image-based visual servoing and
noise and feature correspondence problem can be resolved
3D estimation method [2] was proposed, assuming that three
due to the continuous operation of the Kalman filtering and
points of interest can be extracted from the image. Although it
its image window prediction function. Finally, the method is
is assumed that the target is planar, the method can be applied
easy to implement, since again only a minimum of 3 feature
to positioning task with respect to complex non-coplanar target
points are required.
as long as three well separated points of interest can be
specified on the target object. The points of interest require- This paper is organized as follows. The midpoint and
ment was completely removed in [1], [3], [11]. In the image rough initial estimation methods are addressed in the next two
motion-based visual servoing approach [3] (also called d(2D)dt sections respectively. The decoupled EKF-based online target
visual servoing), motion in the image plane was used as the model and relative pose estimation algorithms are presented in
input of the image-based control scheme. The image motion- Section IV. The experimental results are shown in Section V.
0-7803-9354-6/05/$20.00 ©2005 IEEE 749
Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on December 14, 2008 at 00:36 from IEEE Xplore. Restrictions apply.
II. M IDPOINT I NITIAL E STIMATION space. sk = [(xI1 )k , (y1I )k , · · · , (xIp )k , (ypI )k ]T is the vector
O
In this section, the problem of finding the position of a of image feature locations, and WE k
= [X, Y, Z, φ, α, ψ]T
point in 3D Cartesian space given its position in two images, represents the relative pose (3D position and roll-pitch-yaw
taken by the same camera with known calibration and pose, orientation) of target object in robot end-effector frame. β c =
is considered. This process requires the intersection of two (f, Px , Py , RE E
C , dC ) is the camera model including the intrin-
known rays in space and is commonly known as triangula- sic (f, Px , Py ) and extrinsic (RE E E
C , dC ) parameters. RC and dC
E
tion [5]. In the absence of image measurement noise, this define the orientation and position of robot end-effector frame
problem is trivial. When noise is present, the two rays will in camera frame. β O = [X1O , Y1O , Z1O , · · · , XpO , YpO , ZpO ]T is
not generally meet in 3D Cartesian space, and therefore it is the target model.
necessary to find the best point of intersection. A commonly Consider two image points at the first and the second sample
suggested method for triangulation, the midpoint method [5], steps, ((xIj )1 , (yjI )1 ) and ((xIj )2 , (yjI )2 ), corresponding to the
is to find the midpoint of the common perpendicular to the two same feature point PO j on the target as shown in Fig. 1, where
rays corresponding to the matched points. It was shown in [5] k = 2 is assumed to be the current sample step. In the absence
that under the Euclidean reconstruction the midpoint method of image measurement noise, the two rays must meet in 3D
provides a more desirable solution than other triangulation Cartesian space, which leads to the following equality:
methods. In this section, application of the midpoint method
OC2 + DC2 t2 = TC
C2 (OC1 + DC1 t1 )
1
(3)
in a robot visual servoing environment will first be introduced.
Then, initial estimation of the target model and relative pose where TC 1
C2 defines the homogeneous transformation from the
based on this method will be described. camera frame C1 to the camera frame C2 . Since TC 1 C2
B and TB
A. Application of Midpoint Method in Visual Servoing can be solved from the robot forward kinematics and the robot
joint encoder measurements, the homogeneous transformation
As shown in Fig. 1, at each sample step k = 1, 2, · · · of
between the two camera frames can then be determined by
robot dynamic motion with respect to a stationary target object,
a ray defined by the j-th feature point PO j on the target object TC C1 −1 C2
C1 = (TB )
2
TB . (4)
and its projection ((xIj )k , (yjI )k ) onto the image plane I, can
be expressed in the camera frame Ck as: Equation (3) results in three equations in terms of two un-
⎡ ⎤ ⎡ ⎤ known parameters t1 and t2 , which can be solved using linear
(xIj )k Px −(xIj )k Px
I
⎢ (yj )k Py ⎥ I
⎢ −(yj )k Py ⎥ least-square methods. This minimizes the squared distance
rCk (tk ) = OCk + DCk tk = ⎣ ⎦+⎣ t ⎦ k (1) between the two rays. Finally, the 3D Cartesian coordinates of
−f f
1 0 the midpoint along the common perpendicular (the minimum
distance) between the two rays expressed in the current camera
where OCk represents the origin of the ray, DCk represents
frame C2 is given by
the direction of the ray, and tk is the ray parameter which
defines individual points along the ray. f is the camera focal OC2 + DC2 t1 + TCC2 (OC1 + DC1 t1 )
1
length in meters, Px and Py are the inter-pixel spacings in PC

j =
2
. (5)
2
m/pixel, along the image plane X and Y axes respectively.
It is clear that above calculations are sensitive to image
(( x j ) 2 , ( y j ) 2 )
I I
Y I measurement noise, especially when the two image feature
Image Plane measurements are very close to each other. Therefore, two
I XI X C2
Y I
Camera
well separated samples (image feature views) are required in
X Frame at k=2
(( x j )1 , ( y j )1 )
I I X C1
C 2 (current) the midpoint method. Furthermore, given the camera extrinsic
Y
d CC12 parameters (RE E
C , dC ), the feature coordinates in the current
d CE Y E2 end-effector frame E2 in Fig. 1 can also be solved, i.e.,
Camera Y C1
Z C2 End-effector
Frame at k=1
(previous)
X E2 Frame at k=2 PE 2 E −1 C2
j = (TC ) Pj . (6)
Ray 1
Ray 2 (current)
d BC1 Z C1 d EO2 Z E2 Next, the procedure for estimating the target model β o and
O
ZB
relative pose WE 2
of target in current end-effector frame will
d BC1
be presented.
PjO
YB XO
YO B. Initial Estimation of Target Model and Relative Pose
Stationary Target
ZO
X B For the initial estimation of the target model and relative
Robot Base Frame
pose based on PE j
2
from (6), the key step is to establish an
Fig. 1. Ray tracing for two image views of a stationary target. appropriate target coordinate frame. This is illustrated in Fig. 2
using a non-coplanar target with 5 feature points.
A key relationship embedded in Fig. 1 is [13]:
O
1) Label the feature points from 1 to 5.
sk = Ĝ(WE k
, βC , βO ) (2) 2) Pick feature 5 as the origin of the target frame.
i.e., the pin-hole camera mapping of the distinguishable tar- 3) The positive direction of target frame X O -axis is deter-
get feature points from the Cartesian space to the image mined by feature 1 and the origin.
750
4) The X O − Y O coordinate plane is defined by adding 3 in the target frame, which can be easily determined as
another feature, feature 4, and the perpendicular y-axis O −1 E2

can then be defined in either direction Y O or Y O . PO
j = (TE2 ) Pj , for j = 2, 3. (9)
5) Finally, the Z O -axis is defined to be perpendicular to Therefore, based on the midpoint method and appropriately
the X O − Y O coordinate plane by the right-hand rule. defined target frame, both the target model and initial relative
YI
pose of target in current end-effector frame can be estimated.
XI However, for practical consideration, it is also desirable to
End-effector
(( x5 ) 2 , ( y5 ) 2 )
I I
Frame at k=2 obtain a rough estimation of these parameters. This motivates
X C2 (current) the rough initial estimation method in the next section.
Camera Y C2 X E2 Y E2
Frame at k=2 d CE
(current) III. ROUGH I NITIAL E STIMATION
The Coordinate Plane
O′ O Determined by Features
Z Z C2 X Z E2
Only the current camera measurements and the robot joint
d CO2 5, 1, and 4
position feedback is required in this method. It is assumed that
Y O′ 1
d EO2
(a1) All the feature points belong to a plane where the robot
2
3
base is located.
β (a2) The end-effector is approximately located in a normal
5 position with respect to the unknown target object.
4
A. Rough Initial Estimation of Target Model
Table
Pick as Origin ZO YO This method is illustrated in Fig. 3 using the same target
Fig. 2. Establishment of the target coordinate frame
object as the previous section. Again, the first step is to estab-
lish an appropriate target coordinate frame X O − Y O − Z O
After establishing the target coordinate frame, the target using the same procedures as before.
model can then be determined as follows.
The first step is to determine the coordinates of feature 1 The Image Distance in X O ( X I′ )
and 4 in the target frame. Since the Euclidean distances d15 , Image Plane
d14 , and d45 can be solved based on PC j for j = 1, 4, and 5,
2
Focal
si − s j 1 2 XI
the β angle defined in Fig. 2 can then be solved as Length f Camera Focal
2 Point β1
d15 + d245 − d214 3 β2 β3 4
β = arccos . (7) Y O (Y I ′ )
2d15 d45
h γ
Since there are two solutions for above equation, the
coordinates of features 1 and 4 in the target frame
5
can be obtained in homogeneous form as PO 1 =
The Euclidean
d ij YI
[d15 , 0, 0, 1]T , PO
4 = [d45 cos β, d45 sin β, 0, 1] T
, or PO
4 = Distance between i-th
T and j-th feature
[d45 cos β, −d45 sin β, 0, 1] . It is obvious that the two alterna-
Fig. 3. Rough estimation of the target model
tives for feature 4 coordinates correspond to the two coordinate
frames shown in Fig. 2 and so we can choose either one. The second step is to estimate all the Euclidean distances
The second step is to determine the relative pose of target in dij between the i-th and j-th feature points on the unknown
O target. By assumptions (a1) and (a2), the relative depth of
current end-effector frame WE 2
. Note that only the relative
orientation needs to be solved since the relative position is each feature point in the camera frame can be approximated
already available from the coordinates of target origin in cur- by the end-effector height h with respect to robot base, which
rent end-effector frame PE 2
5 . The homogeneous transformation
can be solved from the robot kinematics and current joint
between the feature coordinates expressed in target frame and encoder readings. Therefore, based on the two similar triangles
current end-effector frame is in Fig. 3, dij can be estimated as
⎡ ⎤ ⎡ O⎤ ⎡ ⎤ ⎡ O⎤
XjE2
⎢ Y E2 ⎥
Xj R11 R12 R13 X5E2 Xj
⎢ Y O ⎥ ⎢R21 R22 R23 Y E2 ⎥ ⎢ Y O ⎥
h I
⎢ j ⎥ = TO ⎢ j ⎥ ⎢ ⎥ ⎢ j ⎥ dij = ((xi − xIj )Px )2 + ((yiI − yjI )Py )2 . (10)
E2 ⎣ O ⎦ = ⎣ . (8)
5
⎣ Z E2 ⎦ f
j
Zj R31 R32 R33 Z5E2 ⎦ ⎣ ZjO ⎦
1 1
The third step is to solve β1 , β2 , and β3 as defined in Fig. 3.
1 0 0 0 1
This can be done in a similar way to (7) using dij from above.
Based on PO O
1 and P4 obtained above, the first two columns A unique solution can be determined by establishing a new

of the rotation matrix r1 = [R11 R21 R31 ]T , r2 = image plane coordinate frame X I − Y I , which is coincident
[R12 R22 R32 ]T can be solved. Finally, according to the with the image of the target coordinate frame X O − Y O as
orthogonal property of rotation matrix, the last column of the shown in Fig. 3. Then, the homogeneous transformation from

rotation matrix is just the cross product of the first two columns the new to the original image frame TII can be solved through
r3 = [R13 R23 R33 ]T = r1 × r2 . The pose vector WE O
2
can the image location of feature 5 (xI5 , y5I ) and the angle γ =

O
then be extracted from TE2 using the method described in [9]. arctan(y1I − y5I /xI1 − xI5 ) defined in Fig. 3. By using TII and
The final step is to solve for the coordinates of features 2 and the image feature measurements in the original image plane
751
X I − Y I , the image feature locations in the newly defined EKF algorithm [12], using two separate Kalman filters for state

image frame X I − Y I can be calculated such that the signs and parameter estimation, provides much better convergence
of angles β1 to β3 can be uniquely determined. performance. The midpoint or rough estimation of the target
Finally, rough estimation of the target model β̂O can be model and initial relative pose of target in robot end-effector
obtained using the Euclidean distances dij and angles β1 to frame obtained from the previous two sections can be used to
β3 obtained through the previous steps. initialize the dual EKF.
However, tests have shown that even the dual EKF can-
B. Rough Initial Estimation of Relative Pose
not provide a consistent target model estimation for optimal
Based on rough initial estimation of the target model, the relative pose estimation. This is because the target model
relative position and roll angle of target in robot end-effector and relative pose estimates are highly coupled since the two
frame can be estimated using the ray-plane intersection theory Kalman filters share the same output measurement equa-
and the feature matching technique respectively. tion (2). The target model estimation errors can directly lead
For a rough estimation of the relative position, let rC (t) to very erroneous pose estimation and vice versa, especially
be the ray that passes through the target origin and the corre- when using the rough target model estimation to initialize the
sponding image feature location. This ray can be expressed in dual EKF. In addition, both the target model and relative pose
the current end-effector frame as need to be estimated in the dual EKF algorithm, while there is
⎡ ⎤ ⎡ ⎤
OEX DEX no additional measurement information available for the dual
⎢OEY ⎥ ⎢DEY ⎥ EKF. Lack of sufficient measurement information makes it
rE = TC E rC = OE + DE t = ⎣ O
⎢ ⎥+⎢ ⎥
EZ ⎦ ⎣ DEZ ⎦ t. (11) difficult to uniquely estimate the target model and relative pose
1 0 simultaneously. This has motivated the decoupled target model
By assumption (a2), the relative Z distance of the target and relative pose estimation algorithm.
frame origin with respect to the end-effector frame can be The novel feature of the decoupled EKF algorithm is the
approximated by the end-effector height h, i.e., use of both the image feature and robot joint position mea-
surements to uniquely determine the unknown target model
OEZ + DEZ t = h. (12) for optimal pose estimation under continuous robot dynamic
Therefore, the unknown parameter t can be solved as: motion. The combined measurement information can be con-
structed as follows.
h − O EZ
t= . (13) As shown by equations (3) to (6), the target feature coordi-
DEZ nates in the current end-effector frame PE k
j can be determined,
Finally, by substituting t into (11), a rough initial estimation given image feature and joint position measurements at the
of the relative position of target in end-effector frame d̂O E can
current (k) and previous (k − 1) sampled robot dynamic
be obtained. motion. Since we have

O
By assumptions (a1) and (a2), only the relative roll angle of REk dO
PEj
k
= Ek PO
j (14)
target with respect to end-effector frame needs to be estimated. 01×3 1
This can be done by minimizing the image plane feature
distance s0 −ŝj for (φ̂O where RO O T
Ek = RZ,φ RY,θ RX,ψ and dEk = [X, Y, Z] , a new
E )j ∈ [−180, 179] deg with a step size
j = 1 deg. s0 represents the vector of initial camera feature output equation for the target model estimation Kalman filter
measurements, and ŝj = Ĝ([d̂O O T can be defined as follows:
E , (φ̂E )j , 0, 0] , β C , β̂ O ) is the

vector of calculated image feature locations using (2). P Ek = RV1 + V2 = F(WE
O
k
, βO ), (15)
IV. D ECOUPLED EKF E STIMATION where
T
Although the midpoint method provides better estimation P Ek = [X1E , Y1E , Z1E , X2E , Y2E , Z2E , X3E , Y3E , Z3E , X4E , Y4E , Z4E ]k
accuracy than the rough estimation method, it still may not be
R = diag{RO O O O
Ek , R Ek , R Ek , R Ek }
satisfactory when the two camera views are not well separated.
T
Furthermore, it is desirable to have a continuous and optimal V1 = X1O , 0, 0, X2O , Y2O , Z2O , X3O , Y3O , Z3O , X4O , Y4O , 0
estimation of the target model and relative pose simultaneously T
for position-based visual servoing. The Kalman filter-based V2 = X, Y, Z, X, Y, Z, X, Y, Z, X, Y, Z
combined state and parameter estimation can be applied for From (15), some feature point coordinates have been explicitly
this purpose. Since the target feature coordinates are required defined as zero when establishing the target coordinate frame,
in the output equation (2) of the EKF-based pose estimation and therefore, the vector of unknown target model parameters
algorithm as described in [13], they can be treated as un- will only contain β O = [X1O , X2O , Y2O , Z2O , X3O , Y3O , Z3O , X4O , Y4O ]T .
known parameters that need to be estimated. The unknown Note that the measurement vector, which contains the target
relative pose can still be defined as the Kalman filter states feature coordinates in the current end-effector frame, is not
as in [13]. In the literature, there are two typical methods directly measurable and needs to be calculated using equations
used to implement this combined estimation. The first one (3) to (6). Since both the image feature and robot joint
treats the unknown target model parameters as augmented position measurements are embedded in this new measurement
states. However, this algorithm is known to have potential vector, implementation of the target model estimation EKF
convergence problems [6]. An alternative approach, the dual using above system output equation is only feasible during
752
continuous robot dynamic motion. The Kalman filter equations
can be implemented as follows:
YE
β̂ O(k,k−1) = A1 β̂ O(k−1,k−1) X
E
ZE
YO
Pk,k−1 = A1 Pk−1,k−1 AT1 XO ZO
O
∂F(WE , βO ) (a) (b) (c) (d)
Hk = k
∂β O O , β =β̂
O =W
Fig. 4. (a). Experimental system; (b). First view (k = 1); (c). Well-separated
WE E O O(k,k−1) second view (k = 2); (d) Not well-separated second view (k = 2).
k k
K= + Hk Pk,k−1 HTk )−1

Pk,k−1 HTk (Rk
β̂ O(k,k) O , β̂O(k,k−1) ))
= β̂O(k,k−1) + K(P Ek − F(W ous sections. The actual end-effector and target coordi-
Ek
nate frames are labeled in Fig. 4(a). When using the
Pk,k = Pk,k−1 − KHk Pk,k−1
newly established (X O , Y O , Z O ) target frame as shown
where A1 = I9×9 since the feature coordinates are constant in Fig. 2, the “ideal” target coordinates vector is β O =
with respect to the target coordinate frame. [(0.06334, 0, 0)T , (0.06017, 0.015, −0.01171)T , (0.03259,

Let x = [X, Ẋ, Y, Ẏ , Z, Ż, φ, φ̇, θ, θ̇, ψ, ψ̇]T , then the pose − 0.00206, 0.03084)T , (0.02508, 0.03757, 0)T , (0, 0, 0)T ]T . In
estimation Kalman filter can be implemented as follows using the first test, the performance of the midpoint and rough es-
the constant velocity dynamic model [13] timation methods are investigated. Then, the decoupled EKF-
based target model and pose estimation algorithm is tested for
x̂k,k−1 = A2 x̂k−1,k−1
pre-defined continuous robot dynamic motion.
Pk,k−1 = A2 Pk−1,k−1 AT2 + Qk−1
A. Test of Midpoint and Rough Estimation Methods
∂ Ĝ(x, β C , β O )
Hk = For the midpoint method, the first test is to use two well-
∂x x=x̂k,k−1 ,β O =β̂ O(k,k−1)
separated image views as shown in Figs. 4(b) and 4(c), where
K = Pk,k−1 HTk (Rk + Hk Pk,k−1 HTk )−1 a large translational motion is commanded from k = 1 to
x̂k,k = x̂k,k−1 + K(sk − Ĝ(x̂k,k−1 , βC , β̂ O(k,k−1) )) k = 2. While in the second test, two not well-separated image
Pk,k = Pk,k−1 − KHk Pk,k−1 views from Figs. 4(b) and 4(d) are chosen, where only a
small rotation is applied from k = 1 to k = 2 such that

A2is a block diagonal matrix with each block defined
where the 4-th features between the two views are very close to
1 T each other. The target model and relative pose estimates are
as , and T is the sample time.
0 1 compared with those obtained by the rough estimation method
To decouple the target model estimation from the relative
in Tables 1 and 2 respectively.
pose estimation for sensitivity consideration, it is desirable to
Table 1
have a separate source of pose estimation in above equations,
Midpoint method by Figs. 4(b) and 4(c); Rough estimation by Fig. 4(c).
instead of using the pose estimates from the pose estimation
Kalman filter. Recall that this can be easily implemented Error (mm) X̂1O X̂2O Ŷ2O Ẑ2O X̂3O Ŷ3O Ẑ3O X̂4O Ŷ4O
through equation (8) after the measurement vector in (15) is Midpoint 1.25 1.52 0.05 0.44 0.50 0.02 0.83 0.19 0.08
calculated. This guarantees that only the target model estima- Rough 8.63 7.54 4.85 11.71 2.81 18.99 30.84 5.86 13.03
tion will be fed into the pose estimation Kalman filter, i.e., the O
Error X̂E 2
(mm) ŶEO2 (mm) O
ẐE 2
(mm) φ̂O ◦
E2 ( )
O ◦
θ̂E 2
( ) O ◦
ψ̂E 2
( )
target model estimation EKF is completely decoupled from the
Midpoint 0.5 0.5 3.0 0.3 2.1 0.9
relative pose estimation. However, since implementation of the
Rough 11.1 4.6 25.8 1.3 8.5 36.7
target model estimation EKF implies continuous operation of
the midpoint method, it should operate at a slower sample Table 2
rate than the pose estimation EKF to avoid image views that Midpoint method by Figs. 4(b) and 4(d); Rough estimation by Fig. 4(d).
are too closely spaced. Note that feature correspondence is
not a problem in this decoupled estimation algorithm since Error (mm) X̂1O X̂2O Ŷ2O Ẑ2O X̂3O Ŷ3O Ẑ3O X̂4O Ŷ4O
continuous image window prediction and feature tracking can Midpoint 4.00 4.10 0.65 0.66 0.80 0.15 3.09 1.86 1.84
be achieved from the pose estimation EKF using (2). Rough 7.19 9.22 1.96 11.71 3.13 5.04 30.84 1.92 4.72
O
V. E XPERIMENTAL R ESULTS Error X̂E 2
(mm) ŶEO2 (mm) O
ẐE 2
(mm) φ̂O ◦
E2 ( )
O ◦
θ̂E 2
( ) O ◦
ψ̂E 2
( )
Midpoint 5.8 1.5 14.7 0.8 1.6 0.04
The experimental system shown in Fig. 4(a) consists of a
Rough 12.7 1.7 24.3 2.8 8.7 36.4
5-DOF CRS Plus SRS-M1A small robot, an Intel 8086 based
robot controller, a Pulnix TM-5LC miniature camera, a Dipix The above results show that, for the midpoint method, when
FPG-44 image processor, and an Intel Pentium II 400MHz the two image views are well-separated, a good initial estima-
computer running the QNX 4.0 real-time operation system. tion of the target model can be achieved and most feature
The robot system permits direct joint position commands and coordinates estimation errors are within 1 mm. Furthermore,
the reading of joint position encoder counts, and therefore, an a good initial relative pose estimation is also obtained. The
inner joint position servo control is adopted in the experiments. maximum relative position and orientation estimation errors
As shown in Fig. 4(b), the target object used for are about 3 mm in Z and 2◦ in the pitch angle respectively.
experiments is the same as the one used in previ- However, when the two image views are not well-separated,
753
the target model estimation errors are relatively large (up to 3 Furthermore, note that the robustness domain of closed-
to 4 mm) in some feature coordinates. Also, there exist large loop PBVS dynamic stability with respect to the target model
estimation errors in X and Z pose parameters. This is because estimation error could not be derived analytically due to the
the midpoint method is sensitive to the image measurement complex estimation algorithm. However, it was observed in
noise introduced from closer image views. On the other hand, both simulations and experiments that the system never went
the rough target model and pose estimation errors are much unstable due to good target model estimation accuracy.
larger since the target is assumed to be planar, and only the
VI. C ONCLUSION
relative position and roll rotation can be recovered.
In this paper, the problem of combined target model esti-
B. Test of Decoupled EKF Estimation mation and position-based visual servoing is addressed. First,
In order to investigate the robustness of the decoupled EKF the midpoint triangulation and rough estimation methods are
algorithm in the presence of large initial estimation errors, the proposed for initial target model and relative pose estimation.
rough target model and pose estimates obtained from Fig. 4(d) Then, a decoupled EKF-based online estimation algorithm is
are used to initialize the two Kalman filters. Sample rates of developed to improve above initial estimations while under
the pose and target model estimation Kalman filters are chosen continuous robot dynamic motion. This new method is very
as 0.0333 s (30 Hz) and 0.0333 × 60 ≈ 2 s respectively, i.e., robust to large initial estimation errors, and it can provide very
the target model is updated every 60 samples of image feature consistent target model estimation for optimal pose estimation
measurements. The robot end-effector is commanded to move in position-based visual servoing. Experimental results have
periodically within a plane that is almost parallel to the robot demonstrated the good performance of the method. Future
base frame. To compare the pose estimation results, a separate research will be devoted to improving the converging speed
pose estimation EKF using the constant “ideal” target model of the decoupled EKF algorithm and enhancing its robustness
operates simultaneously in the experiments. The differences with respect to camera and robot kinematic modeling errors.
between the pose estimates from the decoupled EKF algorithm R EFERENCES
and this separate pose estimation EKF are calculated as the [1] F. Chaumette, “Image moments: a general and useful set of features
pose estimation errors. for visual servoing,” IEEE Transactions on Robotics, vol. 20, no. 4, pp.
713–723, Aug. 2004.
Table 3 [2] C. Collewet and F. Chaumette, “Positioning a camera with respect to
Decoupled EKF estimation result planar objects of unknown shape by coupling 2-D visual servoing and
3-D estimations,” IEEE Trans. Robot. Automat., vol. 18, no. 3, pp. 322–
X̂1O X̂2O Ŷ2O Ẑ2O X̂3O Ŷ3O Ẑ3O X̂4O Ŷ4O 333, June 2002.
[3] A. Cretual and F. Chaumette, “Visual servoing based on image motion,”
Error (mm) 0.35 0.33 0.45 1.13 0.56 0.22 2.36 0.98 0.15 Int. J. Robot. Res, vol. 20, no. 11, pp. 857–877, 2001.
[4] L. Deng, W. J. Wilson, and F. Janabi-Sharifi, “Characteristics of robot
O (mm) Ŷ O (mm) Ẑ O (mm) φ̂O (◦ )
X̂E O (◦ )
θ̂E O (◦ )
ψ̂E visual servoing methods and target model estimation,” in Proc. IEEE
E E E
Int. Sympos. Intell. Contr., Oct. 2002, pp. 684–689.
Error 0.36 0.08 0.99 0.13 1.07 0.94 [5] R. I. Hartley and P. Sturm, “Triangulation,” Computer Vision and Image
Understanding, vol. 68, no. 2, pp. 146–157, 1997.
The target model and pose estimations converge very close [6] L. Ljung, “Asymptotic behavior of the extended Kalman filter as a
to the actual values within 25 s. This procedure is slightly slow parameter estimator for linear systems,” IEEE Trans. Automat. Contr.,
due to the requirement of well-separated image views during vol. AC-24, no. 1, pp. 36–50, 1979.
[7] E. Malis and F. Chaumette, “Theorectical improvements in the stability
continuous application of the midpoint method in decoupled analysis of a new class of model-free visual servoing methods,” IEEE
EKF algorithm. The accuracy results in Table 3 show that Z2o Trans. Robot. Automat., vol. 18, no. 2, pp. 176–186, Apr. 2002.
and Z3o estimation errors are slightly larger, but most feature [8] E. Malis, F. Chaumette, and S. Boudet, “2-1/2-D visual servoing,” IEEE
Trans. Robot. Automat., vol. 15, no. 2, pp. 238–250, Oct. 1999.
coordinates estimation errors are still within 1 mm. Similarly, [9] M. W. Spong and M. Vidyasagar, Robot Dynamics and Control. New
pose estimations from the decoupled estimation method are York: John Wiley and Sons, 1989.
very consistent with those estimated from the independent [10] C. J. Taylor and J. P. Ostrowski, “Robust vision-based pose control,”
in Proc. IEEE Int. Conf. Robot. Automat., vol. 3, Apr. 2000, pp. 2734–
pose estimation EKF. The results show that the decoupled 2740.
EKF-based target model and relative pose estimation method [11] M. Tonko and H.-H. Nagel, “Model-based stereo-tracking of non-
is robust to large initial estimation errors, and it can provide a polyhedral objects for automatic disassembly experiments,” Int. J. Com-
put. Vision, vol. 37, no. 1, pp. 99–118, 2000.
fairly consistent and accurate target model estimation for good [12] E. A. Wan and A. T. Nelson, “Dual Kalman filtering methods for
position-based visual servoing control. nonlinear prediction, smoothing, and estimation,” in Proceedings of the
1996 Conference on Advances in Neural Information Processing Systems
9, Dec. 1996, pp. 793–799.
C. Discussion [13] W. J. Wilson, C. C. W. Hulls, and G. S. Bell, “Relative end-effector
Although the target object is assumed to be stationary in control using Cartesian position-based visual servoing,” IEEE Trans.
Robot. Automat., vol. 12, no. 5, pp. 684–696, Oct. 1996.
the previous analysis, the proposed decoupled EKF estimation
algorithm can also be applied to track slowing moving target
feature points. Simulations were tested to track maximum
changing rate of 0.5 mm/s in all feature point coordinates.
The results show that the target feature coordinate estimation
errors are still mostly within 1 mm while the pose estimation
errors are within 1 mm in translation and 1 deg in orientation.
754

Decoupled EKF For Simultaneous Target Model and Relative Pose Estimation Using Feature Points

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Decoupled EKF For Simultaneous Target Model and Relative Pose Estimation Using Feature Points

Uploaded by

Copyright:

Available Formats

Proceedings of the TA6.

Decoupled EKF for Simultaneous Target Model and Relative

Email: {lfdeng,wwilson}@kingcong.uwaterloo.ca, fshariﬁ@ryerson.ca

length in meters, Px and Py are the inter-pixel spacings in PC

Pk,k−1 = A1 Pk−1,k−1 AT1 XO ZO

K= + Hk Pk,k−1 HTk )−1

You might also like