Professional Documents
Culture Documents
4
2005 IEEE Conference on Control Applications
Toronto, Canada, August 28-31, 2005
Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on December 14, 2008 at 00:36 from IEEE Xplore. Restrictions apply.
II. M IDPOINT I NITIAL E STIMATION space. sk = [(xI1 )k , (y1I )k , · · · , (xIp )k , (ypI )k ]T is the vector
O
In this section, the problem of finding the position of a of image feature locations, and WE k
= [X, Y, Z, φ, α, ψ]T
point in 3D Cartesian space given its position in two images, represents the relative pose (3D position and roll-pitch-yaw
taken by the same camera with known calibration and pose, orientation) of target object in robot end-effector frame. β c =
is considered. This process requires the intersection of two (f, Px , Py , RE E
C , dC ) is the camera model including the intrin-
known rays in space and is commonly known as triangula- sic (f, Px , Py ) and extrinsic (RE E E
C , dC ) parameters. RC and dC
E
tion [5]. In the absence of image measurement noise, this define the orientation and position of robot end-effector frame
problem is trivial. When noise is present, the two rays will in camera frame. β O = [X1O , Y1O , Z1O , · · · , XpO , YpO , ZpO ]T is
not generally meet in 3D Cartesian space, and therefore it is the target model.
necessary to find the best point of intersection. A commonly Consider two image points at the first and the second sample
suggested method for triangulation, the midpoint method [5], steps, ((xIj )1 , (yjI )1 ) and ((xIj )2 , (yjI )2 ), corresponding to the
is to find the midpoint of the common perpendicular to the two same feature point PO j on the target as shown in Fig. 1, where
rays corresponding to the matched points. It was shown in [5] k = 2 is assumed to be the current sample step. In the absence
that under the Euclidean reconstruction the midpoint method of image measurement noise, the two rays must meet in 3D
provides a more desirable solution than other triangulation Cartesian space, which leads to the following equality:
methods. In this section, application of the midpoint method
OC2 + DC2 t2 = TC
C2 (OC1 + DC1 t1 )
1
(3)
in a robot visual servoing environment will first be introduced.
Then, initial estimation of the target model and relative pose where TC 1
C2 defines the homogeneous transformation from the
based on this method will be described. camera frame C1 to the camera frame C2 . Since TC 1 C2
B and TB
A. Application of Midpoint Method in Visual Servoing can be solved from the robot forward kinematics and the robot
joint encoder measurements, the homogeneous transformation
As shown in Fig. 1, at each sample step k = 1, 2, · · · of
between the two camera frames can then be determined by
robot dynamic motion with respect to a stationary target object,
a ray defined by the j-th feature point PO j on the target object TC C1 −1 C2
C1 = (TB )
2
TB . (4)
and its projection ((xIj )k , (yjI )k ) onto the image plane I, can
be expressed in the camera frame Ck as: Equation (3) results in three equations in terms of two un-
⎡ ⎤ ⎡ ⎤ known parameters t1 and t2 , which can be solved using linear
(xIj )k Px −(xIj )k Px
I
⎢ (yj )k Py ⎥ I
⎢ −(yj )k Py ⎥ least-square methods. This minimizes the squared distance
rCk (tk ) = OCk + DCk tk = ⎣ ⎦+⎣ t ⎦ k (1) between the two rays. Finally, the 3D Cartesian coordinates of
−f f
1 0 the midpoint along the common perpendicular (the minimum
distance) between the two rays expressed in the current camera
where OCk represents the origin of the ray, DCk represents
frame C2 is given by
the direction of the ray, and tk is the ray parameter which
defines individual points along the ray. f is the camera focal OC2 + DC2 t1 + TCC2 (OC1 + DC1 t1 )
1
Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on December 14, 2008 at 00:36 from IEEE Xplore. Restrictions apply.
4) The X O − Y O coordinate plane is defined by adding 3 in the target frame, which can be easily determined as
another feature, feature 4, and the perpendicular y-axis O −1 E2
can then be defined in either direction Y O or Y O . PO
j = (TE2 ) Pj , for j = 2, 3. (9)
5) Finally, the Z O -axis is defined to be perpendicular to Therefore, based on the midpoint method and appropriately
the X O − Y O coordinate plane by the right-hand rule. defined target frame, both the target model and initial relative
YI
pose of target in current end-effector frame can be estimated.
XI However, for practical consideration, it is also desirable to
End-effector
(( x5 ) 2 , ( y5 ) 2 )
I I
Frame at k=2 obtain a rough estimation of these parameters. This motivates
X C2 (current) the rough initial estimation method in the next section.
Camera Y C2 X E2 Y E2
Frame at k=2 d CE
(current) III. ROUGH I NITIAL E STIMATION
The Coordinate Plane
O′ O Determined by Features
Z Z C2 X Z E2
Only the current camera measurements and the robot joint
d CO2 5, 1, and 4
position feedback is required in this method. It is assumed that
Y O′ 1
d EO2
(a1) All the feature points belong to a plane where the robot
2
3
base is located.
β (a2) The end-effector is approximately located in a normal
5 position with respect to the unknown target object.
4
A. Rough Initial Estimation of Target Model
Table
Pick as Origin ZO YO This method is illustrated in Fig. 3 using the same target
Fig. 2. Establishment of the target coordinate frame
object as the previous section. Again, the first step is to estab-
lish an appropriate target coordinate frame X O − Y O − Z O
After establishing the target coordinate frame, the target using the same procedures as before.
model can then be determined as follows.
The first step is to determine the coordinates of feature 1 The Image Distance in X O ( X I′ )
and 4 in the target frame. Since the Euclidean distances d15 , Image Plane
d14 , and d45 can be solved based on PC j for j = 1, 4, and 5,
2
Focal
si − s j 1 2 XI
the β angle defined in Fig. 2 can then be solved as Length f Camera Focal
2 Point β1
d15 + d245 − d214 3 β2 β3 4
β = arccos . (7) Y O (Y I ′ )
2d15 d45
h γ
Since there are two solutions for above equation, the
coordinates of features 1 and 4 in the target frame
5
can be obtained in homogeneous form as PO 1 =
The Euclidean
d ij YI
[d15 , 0, 0, 1]T , PO
4 = [d45 cos β, d45 sin β, 0, 1] T
, or PO
4 = Distance between i-th
T and j-th feature
[d45 cos β, −d45 sin β, 0, 1] . It is obvious that the two alterna-
Fig. 3. Rough estimation of the target model
tives for feature 4 coordinates correspond to the two coordinate
frames shown in Fig. 2 and so we can choose either one. The second step is to estimate all the Euclidean distances
The second step is to determine the relative pose of target in dij between the i-th and j-th feature points on the unknown
O target. By assumptions (a1) and (a2), the relative depth of
current end-effector frame WE 2
. Note that only the relative
orientation needs to be solved since the relative position is each feature point in the camera frame can be approximated
already available from the coordinates of target origin in cur- by the end-effector height h with respect to robot base, which
rent end-effector frame PE 2
5 . The homogeneous transformation
can be solved from the robot kinematics and current joint
between the feature coordinates expressed in target frame and encoder readings. Therefore, based on the two similar triangles
current end-effector frame is in Fig. 3, dij can be estimated as
⎡ ⎤ ⎡ O⎤ ⎡ ⎤ ⎡ O⎤
XjE2
⎢ Y E2 ⎥
Xj R11 R12 R13 X5E2 Xj
⎢ Y O ⎥ ⎢R21 R22 R23 Y E2 ⎥ ⎢ Y O ⎥
h I
⎢ j ⎥ = TO ⎢ j ⎥ ⎢ ⎥ ⎢ j ⎥ dij = ((xi − xIj )Px )2 + ((yiI − yjI )Py )2 . (10)
E2 ⎣ O ⎦ = ⎣ . (8)
5
⎣ Z E2 ⎦ f
j
Zj R31 R32 R33 Z5E2 ⎦ ⎣ ZjO ⎦
1 1
The third step is to solve β1 , β2 , and β3 as defined in Fig. 3.
1 0 0 0 1
This can be done in a similar way to (7) using dij from above.
Based on PO O
1 and P4 obtained above, the first two columns A unique solution can be determined by establishing a new
of the rotation matrix r1 = [R11 R21 R31 ]T , r2 = image plane coordinate frame X I − Y I , which is coincident
[R12 R22 R32 ]T can be solved. Finally, according to the with the image of the target coordinate frame X O − Y O as
orthogonal property of rotation matrix, the last column of the shown in Fig. 3. Then, the homogeneous transformation from
rotation matrix is just the cross product of the first two columns the new to the original image frame TII can be solved through
r3 = [R13 R23 R33 ]T = r1 × r2 . The pose vector WE O
2
can the image location of feature 5 (xI5 , y5I ) and the angle γ =
O
then be extracted from TE2 using the method described in [9]. arctan(y1I − y5I /xI1 − xI5 ) defined in Fig. 3. By using TII and
The final step is to solve for the coordinates of features 2 and the image feature measurements in the original image plane
751
Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on December 14, 2008 at 00:36 from IEEE Xplore. Restrictions apply.
X I − Y I , the image feature locations in the newly defined EKF algorithm [12], using two separate Kalman filters for state
image frame X I − Y I can be calculated such that the signs and parameter estimation, provides much better convergence
of angles β1 to β3 can be uniquely determined. performance. The midpoint or rough estimation of the target
Finally, rough estimation of the target model β̂O can be model and initial relative pose of target in robot end-effector
obtained using the Euclidean distances dij and angles β1 to frame obtained from the previous two sections can be used to
β3 obtained through the previous steps. initialize the dual EKF.
However, tests have shown that even the dual EKF can-
B. Rough Initial Estimation of Relative Pose
not provide a consistent target model estimation for optimal
Based on rough initial estimation of the target model, the relative pose estimation. This is because the target model
relative position and roll angle of target in robot end-effector and relative pose estimates are highly coupled since the two
frame can be estimated using the ray-plane intersection theory Kalman filters share the same output measurement equa-
and the feature matching technique respectively. tion (2). The target model estimation errors can directly lead
For a rough estimation of the relative position, let rC (t) to very erroneous pose estimation and vice versa, especially
be the ray that passes through the target origin and the corre- when using the rough target model estimation to initialize the
sponding image feature location. This ray can be expressed in dual EKF. In addition, both the target model and relative pose
the current end-effector frame as need to be estimated in the dual EKF algorithm, while there is
⎡ ⎤ ⎡ ⎤
OEX DEX no additional measurement information available for the dual
⎢OEY ⎥ ⎢DEY ⎥ EKF. Lack of sufficient measurement information makes it
rE = TC E rC = OE + DE t = ⎣ O
⎢ ⎥+⎢ ⎥
EZ ⎦ ⎣ DEZ ⎦ t. (11) difficult to uniquely estimate the target model and relative pose
1 0 simultaneously. This has motivated the decoupled target model
By assumption (a2), the relative Z distance of the target and relative pose estimation algorithm.
frame origin with respect to the end-effector frame can be The novel feature of the decoupled EKF algorithm is the
approximated by the end-effector height h, i.e., use of both the image feature and robot joint position mea-
surements to uniquely determine the unknown target model
OEZ + DEZ t = h. (12) for optimal pose estimation under continuous robot dynamic
Therefore, the unknown parameter t can be solved as: motion. The combined measurement information can be con-
structed as follows.
h − O EZ
t= . (13) As shown by equations (3) to (6), the target feature coordi-
DEZ nates in the current end-effector frame PE k
j can be determined,
Finally, by substituting t into (11), a rough initial estimation given image feature and joint position measurements at the
of the relative position of target in end-effector frame d̂O E can
current (k) and previous (k − 1) sampled robot dynamic
be obtained. motion. Since we have
O
By assumptions (a1) and (a2), only the relative roll angle of REk dO
PEj
k
= Ek PO
j (14)
target with respect to end-effector frame needs to be estimated. 01×3 1
This can be done by minimizing the image plane feature
distance s0 −ŝj for (φ̂O where RO O T
Ek = RZ,φ RY,θ RX,ψ and dEk = [X, Y, Z] , a new
E )j ∈ [−180, 179] deg with a step size
j = 1 deg. s0 represents the vector of initial camera feature output equation for the target model estimation Kalman filter
measurements, and ŝj = Ĝ([d̂O O T can be defined as follows:
E , (φ̂E )j , 0, 0] , β C , β̂ O ) is the
vector of calculated image feature locations using (2). P Ek = RV1 + V2 = F(WE
O
k
, βO ), (15)
IV. D ECOUPLED EKF E STIMATION where
T
Although the midpoint method provides better estimation P Ek = [X1E , Y1E , Z1E , X2E , Y2E , Z2E , X3E , Y3E , Z3E , X4E , Y4E , Z4E ]k
accuracy than the rough estimation method, it still may not be
R = diag{RO O O O
Ek , R Ek , R Ek , R Ek }
satisfactory when the two camera views are not well separated.
T
Furthermore, it is desirable to have a continuous and optimal V1 = X1O , 0, 0, X2O , Y2O , Z2O , X3O , Y3O , Z3O , X4O , Y4O , 0
estimation of the target model and relative pose simultaneously
T
for position-based visual servoing. The Kalman filter-based V2 = X, Y, Z, X, Y, Z, X, Y, Z, X, Y, Z
combined state and parameter estimation can be applied for From (15), some feature point coordinates have been explicitly
this purpose. Since the target feature coordinates are required defined as zero when establishing the target coordinate frame,
in the output equation (2) of the EKF-based pose estimation and therefore, the vector of unknown target model parameters
algorithm as described in [13], they can be treated as un- will only contain β O = [X1O , X2O , Y2O , Z2O , X3O , Y3O , Z3O , X4O , Y4O ]T .
known parameters that need to be estimated. The unknown Note that the measurement vector, which contains the target
relative pose can still be defined as the Kalman filter states feature coordinates in the current end-effector frame, is not
as in [13]. In the literature, there are two typical methods directly measurable and needs to be calculated using equations
used to implement this combined estimation. The first one (3) to (6). Since both the image feature and robot joint
treats the unknown target model parameters as augmented position measurements are embedded in this new measurement
states. However, this algorithm is known to have potential vector, implementation of the target model estimation EKF
convergence problems [6]. An alternative approach, the dual using above system output equation is only feasible during
752
Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on December 14, 2008 at 00:36 from IEEE Xplore. Restrictions apply.
continuous robot dynamic motion. The Kalman filter equations
can be implemented as follows:
YE
β̂ O(k,k−1) = A1 β̂ O(k−1,k−1) X
E
ZE
YO
O
∂F(WE , βO ) (a) (b) (c) (d)
Hk = k
∂β O O , β =β̂
O =W
Fig. 4. (a). Experimental system; (b). First view (k = 1); (c). Well-separated
WE E O O(k,k−1) second view (k = 2); (d) Not well-separated second view (k = 2).
k k
continuous image window prediction and feature tracking can Midpoint 4.00 4.10 0.65 0.66 0.80 0.15 3.09 1.86 1.84
be achieved from the pose estimation EKF using (2). Rough 7.19 9.22 1.96 11.71 3.13 5.04 30.84 1.92 4.72
O
V. E XPERIMENTAL R ESULTS Error X̂E 2
(mm) ŶEO2 (mm) O
ẐE 2
(mm) φ̂O ◦
E2 ( )
O ◦
θ̂E 2
( ) O ◦
ψ̂E 2
( )
Midpoint 5.8 1.5 14.7 0.8 1.6 0.04
The experimental system shown in Fig. 4(a) consists of a
Rough 12.7 1.7 24.3 2.8 8.7 36.4
5-DOF CRS Plus SRS-M1A small robot, an Intel 8086 based
robot controller, a Pulnix TM-5LC miniature camera, a Dipix The above results show that, for the midpoint method, when
FPG-44 image processor, and an Intel Pentium II 400MHz the two image views are well-separated, a good initial estima-
computer running the QNX 4.0 real-time operation system. tion of the target model can be achieved and most feature
The robot system permits direct joint position commands and coordinates estimation errors are within 1 mm. Furthermore,
the reading of joint position encoder counts, and therefore, an a good initial relative pose estimation is also obtained. The
inner joint position servo control is adopted in the experiments. maximum relative position and orientation estimation errors
As shown in Fig. 4(b), the target object used for are about 3 mm in Z and 2◦ in the pitch angle respectively.
experiments is the same as the one used in previ- However, when the two image views are not well-separated,
753
Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on December 14, 2008 at 00:36 from IEEE Xplore. Restrictions apply.
the target model estimation errors are relatively large (up to 3 Furthermore, note that the robustness domain of closed-
to 4 mm) in some feature coordinates. Also, there exist large loop PBVS dynamic stability with respect to the target model
estimation errors in X and Z pose parameters. This is because estimation error could not be derived analytically due to the
the midpoint method is sensitive to the image measurement complex estimation algorithm. However, it was observed in
noise introduced from closer image views. On the other hand, both simulations and experiments that the system never went
the rough target model and pose estimation errors are much unstable due to good target model estimation accuracy.
larger since the target is assumed to be planar, and only the
VI. C ONCLUSION
relative position and roll rotation can be recovered.
In this paper, the problem of combined target model esti-
B. Test of Decoupled EKF Estimation mation and position-based visual servoing is addressed. First,
In order to investigate the robustness of the decoupled EKF the midpoint triangulation and rough estimation methods are
algorithm in the presence of large initial estimation errors, the proposed for initial target model and relative pose estimation.
rough target model and pose estimates obtained from Fig. 4(d) Then, a decoupled EKF-based online estimation algorithm is
are used to initialize the two Kalman filters. Sample rates of developed to improve above initial estimations while under
the pose and target model estimation Kalman filters are chosen continuous robot dynamic motion. This new method is very
as 0.0333 s (30 Hz) and 0.0333 × 60 ≈ 2 s respectively, i.e., robust to large initial estimation errors, and it can provide very
the target model is updated every 60 samples of image feature consistent target model estimation for optimal pose estimation
measurements. The robot end-effector is commanded to move in position-based visual servoing. Experimental results have
periodically within a plane that is almost parallel to the robot demonstrated the good performance of the method. Future
base frame. To compare the pose estimation results, a separate research will be devoted to improving the converging speed
pose estimation EKF using the constant “ideal” target model of the decoupled EKF algorithm and enhancing its robustness
operates simultaneously in the experiments. The differences with respect to camera and robot kinematic modeling errors.
between the pose estimates from the decoupled EKF algorithm R EFERENCES
and this separate pose estimation EKF are calculated as the [1] F. Chaumette, “Image moments: a general and useful set of features
pose estimation errors. for visual servoing,” IEEE Transactions on Robotics, vol. 20, no. 4, pp.
713–723, Aug. 2004.
Table 3 [2] C. Collewet and F. Chaumette, “Positioning a camera with respect to
Decoupled EKF estimation result planar objects of unknown shape by coupling 2-D visual servoing and
3-D estimations,” IEEE Trans. Robot. Automat., vol. 18, no. 3, pp. 322–
X̂1O X̂2O Ŷ2O Ẑ2O X̂3O Ŷ3O Ẑ3O X̂4O Ŷ4O 333, June 2002.
[3] A. Cretual and F. Chaumette, “Visual servoing based on image motion,”
Error (mm) 0.35 0.33 0.45 1.13 0.56 0.22 2.36 0.98 0.15 Int. J. Robot. Res, vol. 20, no. 11, pp. 857–877, 2001.
[4] L. Deng, W. J. Wilson, and F. Janabi-Sharifi, “Characteristics of robot
O (mm) Ŷ O (mm) Ẑ O (mm) φ̂O (◦ )
X̂E O (◦ )
θ̂E O (◦ )
ψ̂E visual servoing methods and target model estimation,” in Proc. IEEE
E E E
Int. Sympos. Intell. Contr., Oct. 2002, pp. 684–689.
Error 0.36 0.08 0.99 0.13 1.07 0.94 [5] R. I. Hartley and P. Sturm, “Triangulation,” Computer Vision and Image
Understanding, vol. 68, no. 2, pp. 146–157, 1997.
The target model and pose estimations converge very close [6] L. Ljung, “Asymptotic behavior of the extended Kalman filter as a
to the actual values within 25 s. This procedure is slightly slow parameter estimator for linear systems,” IEEE Trans. Automat. Contr.,
due to the requirement of well-separated image views during vol. AC-24, no. 1, pp. 36–50, 1979.
[7] E. Malis and F. Chaumette, “Theorectical improvements in the stability
continuous application of the midpoint method in decoupled analysis of a new class of model-free visual servoing methods,” IEEE
EKF algorithm. The accuracy results in Table 3 show that Z2o Trans. Robot. Automat., vol. 18, no. 2, pp. 176–186, Apr. 2002.
and Z3o estimation errors are slightly larger, but most feature [8] E. Malis, F. Chaumette, and S. Boudet, “2-1/2-D visual servoing,” IEEE
Trans. Robot. Automat., vol. 15, no. 2, pp. 238–250, Oct. 1999.
coordinates estimation errors are still within 1 mm. Similarly, [9] M. W. Spong and M. Vidyasagar, Robot Dynamics and Control. New
pose estimations from the decoupled estimation method are York: John Wiley and Sons, 1989.
very consistent with those estimated from the independent [10] C. J. Taylor and J. P. Ostrowski, “Robust vision-based pose control,”
in Proc. IEEE Int. Conf. Robot. Automat., vol. 3, Apr. 2000, pp. 2734–
pose estimation EKF. The results show that the decoupled 2740.
EKF-based target model and relative pose estimation method [11] M. Tonko and H.-H. Nagel, “Model-based stereo-tracking of non-
is robust to large initial estimation errors, and it can provide a polyhedral objects for automatic disassembly experiments,” Int. J. Com-
put. Vision, vol. 37, no. 1, pp. 99–118, 2000.
fairly consistent and accurate target model estimation for good [12] E. A. Wan and A. T. Nelson, “Dual Kalman filtering methods for
position-based visual servoing control. nonlinear prediction, smoothing, and estimation,” in Proceedings of the
1996 Conference on Advances in Neural Information Processing Systems
9, Dec. 1996, pp. 793–799.
C. Discussion [13] W. J. Wilson, C. C. W. Hulls, and G. S. Bell, “Relative end-effector
Although the target object is assumed to be stationary in control using Cartesian position-based visual servoing,” IEEE Trans.
Robot. Automat., vol. 12, no. 5, pp. 684–696, Oct. 1996.
the previous analysis, the proposed decoupled EKF estimation
algorithm can also be applied to track slowing moving target
feature points. Simulations were tested to track maximum
changing rate of 0.5 mm/s in all feature point coordinates.
The results show that the target feature coordinate estimation
errors are still mostly within 1 mm while the pose estimation
errors are within 1 mm in translation and 1 deg in orientation.
754
Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on December 14, 2008 at 00:36 from IEEE Xplore. Restrictions apply.