
LEARNING-BASED HYBRID CONTROL

OF CLOSED-KINEMATIC CHAIN ROBOTIC END-EFFECTORS


Charles C. Nguyen and Farhad J. Pooran
Center for Artificial Intelligence and Robotics
School of Engineering and Architecture
Catholic University of America
Washington, DC 20064
Timothy Premack
Goddard Space Flight Center (NASA)
Greenbelt, Maryland 20771

Abstract: This paper proposes a control scheme that combines the concepts of hybrid control and learning control for controlling the force and position of a six-degree-of-freedom robotic end-effector with a closed-kinematic chain mechanism, which performs repeatable tasks. The control scheme consists of two control systems: the hybrid control system and the learning control system. The hybrid control system is composed of two feedback loops, a position loop and a force loop, which produce inputs to the end-effector actuators based on errors in position and contact forces of selected degrees of freedom. The learning control system, consisting of two PD-type learning controllers also arranged in a hybrid structure, provides additional inputs to the actuators to improve the end-effector performance after each trial. The paper shows that with proper learning controller gains the actual position and contact force can approach the desired values as the number of trials increases. Experimental studies performed on a two-degree-of-freedom end-effector show that the control scheme provides path tracking with satisfactory precision while maintaining contact forces with minimal errors after several trials.
1. INTRODUCTION
To perform repeatable tasks, a robot manipulator can be taught off-line by a so-called teaching and playback scheme, or it can learn on-line from its own experience via a self-learning control scheme without human supervision. Learning control theory has attracted the attention of control researchers for many years [5] and has recently been developed for the control of robot manipulators [1]-[4]. The first learning control scheme was proposed by Uchiyama [18] for controlling mechanical arms. Motivated by the idea that robots can learn autonomously from measurement data of previous operations to improve their performance, Arimoto and his research group [2] proposed a so-called Betterment Process whose operation is based on a simple iteration rule. The iteration rule autonomously generates a present actuator input better than the previous one by using stored data of errors in the previous operation.
Application of the proposed learning control scheme has been studied mostly for the problem of controlling robot position [1], and some attention has been paid to force* control [7]. Hara and others [6] considered the synthesis of repetitive control systems for a subclass of control systems whose outputs are controlled to follow periodic reference commands. Togai and Yamano [17] studied the synthesis of a discrete learning control scheme that relaxes the rank condition in the Arimoto learning control scheme. Craig [4] studied an adaptive scheme for robot manipulators for the case of repeated trials of a path. In [3], a learning algorithm based on explicit modeling of robot manipulators was developed.

In this paper, combining the concepts of learning control and hybrid position/force control [15], we propose a control scheme to control the position and contact force of a six-degree-of-freedom robot end-effector recently built at the Goddard Space Flight Center (NASA) to study potential applications of robotic assembly in space [14]. The proposed control scheme mainly consists of a feedback hybrid control system and a learning control system whose structure is also hybrid. Using linearization around a selected desired pattern, we will show that the controller gains of the hybrid control system and the learning control system can be designed such that the end-effector can autonomously improve its performance by repeating the assembly task. The developed control scheme is then tested on a two-degree-of-freedom robot end-effector. Experimental results will be given and the performance of the proposed control scheme will be discussed.

*) In this paper, position implies both position and orientation, and force implies both force and torque.


2. THE CKCM ROBOTIC END-EFFECTOR
The structure of the CKCM robotic end-effector is depicted in Figure 1. The end-effector is attached to the last link of a robot manipulator. In a typical assembly process, the manipulator provides gross motion and the end-effector fine and precise motion. A closed-kinematic chain mechanism was chosen for the design of the end-effector because it has several advantages over an open-kinematic chain mechanism, such as smaller positioning errors due to the lack of a cantilever-like configuration, and higher force/torque and greater payload handling capability. The end-effector has six degrees of freedom provided by six in-parallel actuators and mainly consists of a base platform and a payload platform coupled together through the actuators using pin joints and ball joints at both ends of the actuator rods. Position feedback is achieved by 6 linear variable differential transformers (LVDTs) mounted along the actuator lengths. Force sensing is provided by mounting a six-degree-of-freedom wrist sensor between the last link of the manipulator and the base platform.
For the above type of end-effector, the inverse kinematic problem possesses closed-form solutions. However, its forward kinematic problem must be solved by using an iterative method such as the Newton-Raphson algorithm; a sketch of such an iteration is given below. A complete development of the forward kinematics, inverse kinematics, Jacobian matrix and dynamics can be found in [12].
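As a rough illustration of such an iterative solution (a sketch only, not the development of [12]), the forward kinematics of a parallel mechanism can be computed by Newton-Raphson iteration on the inverse kinematic map; the functions inverse_kinematics and jacobian, the initial guess pose0, and the tolerances below are hypothetical placeholders.

```python
import numpy as np

def forward_kinematics(leg_lengths, pose0, inverse_kinematics, jacobian,
                       tol=1e-8, max_iter=50):
    """Newton-Raphson forward kinematics for a parallel (CKCM) mechanism.

    Iteratively finds the platform pose whose inverse kinematics reproduce
    the measured actuator (leg) lengths.  `inverse_kinematics(pose)` returns
    the six leg lengths for a pose, and `jacobian(pose)` returns the 6x6
    matrix of partial derivatives of leg lengths with respect to the pose.
    """
    pose = np.asarray(pose0, dtype=float)
    for _ in range(max_iter):
        residual = inverse_kinematics(pose) - leg_lengths  # leg-length error
        if np.linalg.norm(residual) < tol:
            break
        pose = pose - np.linalg.solve(jacobian(pose), residual)  # Newton step
    return pose
```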

3. THE LEARNING-BASED HYBRID CONTROL SCHEME

Figure 2 illustrates the structure of the learning-based hybrid control scheme proposed for controlling the position and force of the CKCM end-effector presented in Section 2. The control scheme consists mainly of two systems: the hybrid control system is aimed at the improvement of the end-effector dynamics in terms of stability and tracking capability of the desired pattern, while the learning control system is aimed at the control of the transient and steady-state errors of the end-effector responses using repeated trials. The hybrid control scheme [15] is composed of two loops: the upper for position and the lower for force. In the position loop, the actual Cartesian positions are computed by using the forward kinematic transformation and the actual joint positions measured by the six LVDTs. In the force loop, the actual Cartesian contact forces are obtained by transforming the force sensor signals into the base coordinate system by employing an appropriate coordinate transformation. Depending on the particular task to be performed during a robotic part assembly, the user may specify which degrees of freedom are to be position-controlled and which are to be force-controlled. The selection is done by using a Selection Matrix S, a (6x6) diagonal matrix whose main diagonal elements specify which degrees of freedom are to be position-controlled (s_ii = 0) and which are to be force-controlled (s_ii = 1), for i = 1, 2, ..., 6, by discarding the unwanted components of the positions and forces. In the position loop, the joint position errors corresponding to the Cartesian position errors of the degrees of freedom selected to be position-controlled are obtained through the inverse Jacobian matrix. Similarly, the joint force errors corresponding to the Cartesian force errors are computed through the transpose Jacobian matrix in the force loop. The selected error-driven control signals are thus transformed into the corresponding joint position errors and joint force errors via the inverse Jacobian matrix and its transpose, respectively.
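The selection logic described above can be sketched as follows (an illustrative fragment, not the authors' implementation); the Selection Matrix S, the Jacobian J, and the error vectors are assumed here to be 6-dimensional quantities already expressed in the base coordinate system, and J is assumed invertible.

```python
import numpy as np

def hybrid_joint_errors(S, J, x_err, f_err):
    """Map selected Cartesian errors into joint-space errors.

    S is the diagonal Selection Matrix (s_ii = 0: position-controlled DOF,
    s_ii = 1: force-controlled DOF).  Position errors of the selected DOFs
    are mapped through the inverse Jacobian, force errors of the selected
    DOFs through the Jacobian transpose.
    """
    I = np.eye(S.shape[0])
    dq = np.linalg.inv(J) @ ((I - S) @ x_err)   # joint position errors
    dtau = J.T @ (S @ f_err)                    # joint force errors
    return dq, dtau

# Example: force-control the vertical axis (index 2), position-control the rest.
S = np.diag([0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
```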
Using linearization about a desired pattern of position and force, the gains of the fixed PD position controller and the PD force controller are designed such that the end-effector tracks a Cartesian position trajectory with minimum steady-state errors and settling time while keeping a relatively constant pattern of desired contact forces on the environment. This hybrid control scheme has been shown to provide satisfactory performance by both simulation and experimental results, for fixed PD control in [11] and for adaptive control in [13]. However, a deviation still exists between the desired and actual patterns of force and position because of dynamic interferences and frictional disturbances resulting from the reaction force. Therefore, for robotic assembly tasks that are repeatable, we propose to add to the above system a learning control system that can "learn" from the errors of the force and position pattern during a trial and provide additional signals which improve the end-effector performance during the next trial. The design of the learning control system is also based on the hybrid structure. It consists of two PD-type learning controllers; one learns from the joint position errors corresponding to those degrees of freedom selected to be position-controlled, and the other learns from the joint force errors corresponding to those degrees of freedom selected to be force-controlled. After each trial, the signal provided by the learning control system is improved by using the data of the previous signal and the errors of joint positions and joint forces during the previous trial. The improvement process is described by the following learning scheme (Figure 3):

w_f(t) = Jᵀ S̄ [F_d(t) − F_c(t)]   (3)

where J is the Jacobian matrix, w_f(t) is the joint-space contact force error processed by the force learning controller, x_d(t) and x(t) denote the desired and actual Cartesian positions, respectively, F_d(t) and F_c(t) denote the desired and actual contact forces, respectively, Θ is a positive definite matrix, and α_f and β_f are non-negative constant scalars. The Selection Matrix S̄ of the learning control system is defined similarly to the Selection Matrix S of the hybrid control system. The Selection Matrix S̄ is introduced here, instead of using the same matrix S, so that the degrees of freedom that are to be force- or position-controlled in the learning process can be selected independently.
During the kth trial, the information of u_{k+1} is stored in the lower part of a memory (LSI RAM) as a set of densely sampled digital data. After the kth trial, the stored data are loaded into the upper part of the memory and are sent to the actuators during the (k+1)th trial. The lower part of the memory is then ready to store new data. During the kth trial the end-effector input is composed of the learning control signal u_k, the auxiliary signal τ_d, and the signals coming from the position loop and the force loop, as expressed in (4). The auxiliary signal τ_d is added to the end-effector input in order to compensate for the end-effector dynamics, as seen later in the linearization process discussed in Section 4.
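A minimal sketch of this trial-to-trial bookkeeping is given below, assuming densely sampled arrays; the PD-type correction shown is a generic stand-in for the learning law of Figure 3, and the gain names are illustrative, not the authors' notation.

```python
import numpy as np

class LearningMemory:
    """Two-buffer memory for trial-to-trial learning (LSI RAM analogue).

    `playback` holds the learning signal u_k applied during the current
    trial, while `record` accumulates u_{k+1} computed from the errors
    measured in that trial.  After the trial the buffers are swapped.
    """
    def __init__(self, n_samples, n_joints):
        self.playback = np.zeros((n_samples, n_joints))  # u_k, sent to actuators
        self.record = np.zeros((n_samples, n_joints))    # u_{k+1}, being built

    def update_sample(self, i, dq_err, dq_err_rate, theta, alpha, beta):
        # PD-type correction built from the joint errors measured at sample i
        self.record[i] = self.playback[i] + theta @ (alpha * dq_err_rate
                                                     + beta * dq_err)

    def end_of_trial(self):
        # Swap buffers: the newly recorded signal is played back next trial
        self.playback, self.record = self.record, self.playback
```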
We observe that the hybrid control acts here as a feedback controller that improves the end-effector dynamics, while the learning control system provides additional input to improve the transient and steady-state responses of the end-effector using repeated trials. The controller gains of the hybrid control scheme should be designed such that the closed-loop feedback system is asymptotically stable with minimized steady-state errors. The design of the learning control scheme should be performed such that the transient errors converge to zero as the number of trials increases.
4. DESIGN OF THE CONTROLLER GAINS
In this section we present the design of the controller gains of the hybrid control system and the learning control system. The design mainly consists of three steps: 1) linearizing the end-effector dynamics about a desired operating pattern; 2) selecting controller gains for the hybrid control system such that the linearized closed-loop control system is asymptotically stable; and 3) selecting the controller gains of the learning control system such that the error between the desired and actual pattern during the kth trial converges to zero as k increases to infinity.
The dynamics of the end-effector in an n-dimensional joint space is expressed in terms of the joint position q(t) as [13]
M(q)q̈(t) + N(q,q̇) + G(q) + Jᵀ F_c(t) = τ(t)   (5)

where M(q), the end-effector mass matrix, is an (n×n) positive definite matrix, N(q,q̇) represents the (n×1) centrifugal and Coriolis force vector, G(q) is the (n×1) gravitational force vector, and τ(t) is the (n×1) joint force vector. The term Jᵀ F_c(t) represents the joint forces corresponding to the Cartesian force F_c(t) exerted by the end-effector on the environment (which is equal to the reaction force) and can be rewritten as

Jᵀ F_c(t) = Jᵀ K x(t) = Jᵀ K T(q) = R(q)   (6)

where K is the (n×n) diagonal generalized stiffness matrix representing the stiffness of the force sensor and of the part of the environment which is in contact with the end-effector. The diagonal elements of K corresponding to those degrees of freedom being position-controlled are set to zero.

Linearizing (5) around the given desired joint position trajectory q_d(t), corresponding to the given desired Cartesian position trajectory x_d(t), by using a Taylor series expansion and neglecting higher order terms, we obtain

M(t)ω̈(t) + N(t)ω̇(t) + G(t)ω(t) + τ_d(t) = τ(t)   (7)

where

M(t) = M(q_d)   (8)

N(t) = ∂[N(q,q̇)]/∂q̇ |_{q_d}   (9)

G(t) = ∂[M(q)q̈_d + N(q,q̇) + R(q) + G(q)]/∂q |_{q_d}   (10)

τ_d(t) = M(q_d)q̈_d + N(q_d,q̇_d) + G(q_d) + R(q_d)   (11)

and

ω(t) = q(t) − q_d(t).   (12)

The end-effector input τ(t) can be expressed as

τ(t) = −A_1(t)ω(t) − A_2(t)ω̇(t) + u(t) + τ_d(t)   (13)

where

A_1(t) = K_pp A_3(t) + K_fp A_4(t)   (14)

A_2(t) = K_pd A_3(t) + K_fd A_4(t)   (15)

A_3(t) = J⁻¹(I − S)J |_{q_d},   A_4(t) = Jᵀ S K J |_{q_d}   (16)

and where K_pp and K_pd are the gains of the PD controller in the position loop and K_fp and K_fd are the PD controller gains in the force loop. Substituting (13) into (7) yields

M(t)ω̈(t) + [N(t) + A_2(t)]ω̇(t) + [G(t) + A_1(t)]ω(t) = u(t)   (17)

which represents a linear time-varying multivariable system. The controller gains K_pp, K_pd, K_fp and K_fd can be systematically designed such that system (17) is asymptotically stable by using, for example, the eigenvalue assignment method proposed in [9].

We proceed to rewrite the learning law (1) in terms of the joint deviation ω(t), obtaining (18). Since ω(t) represents the deviation between the actual joint position q(t) and the desired joint position q_d(t), the desired error ω_d(t) should be set to ω_d(t) = 0. Therefore (18) can be expressed as

u_{k+1}(t) = u_k(t) + Θ[Ā_1(t)(ω_d − ω_k) + Ā_2(t)(ω̇_d − ω̇_k)]   (23)

where Ā_1(t) and Ā_2(t) are defined analogously to A_1(t) and A_2(t) in (14)-(16), with the learning controller gains α_p, β_p, α_f, β_f and the Selection Matrix S̄ in place of the hybrid controller gains and S. Equations (23) and (19) represent the general learning scheme of the hybrid learning control system.

Before presenting the main results of this paper, we first state the following lemma.

Lemma 1: Consider a class of n-dimensional linear time-varying systems described by

R(t)z̈(t) + Q(t)ż(t) + P(t)z(t) = v(t)   (24)

where z(t) and v(t) are the (n×1) controlled variable and input vectors, respectively, and R(t), Q(t) and P(t) are time-varying square matrices that are continuously differentiable on [0,T]. In addition, R(t) is positive definite on [0,T]. The learning scheme is defined by

v_{k+1}(t) = v_k(t) + Γ[α(ż_d − ż_k) + β(z_d − z_k)]   (25)

z_k(0) = z_d(0),   ż_k(0) = ż_d(0)   (26)

where α and β are constant scalars, z_d(t) denotes the desired pattern for z(t), and Γ is a positive definite square matrix. If v_1(t) is continuous, z_d(t) is continuously differentiable on [0,T], and the constant scalars α and β are set such that 0 < α ≤ 1 and 0 ≤ β < 1, then z_k(t) converges to z_d(t) uniformly on [0,T] as k → ∞. The proof of Lemma 1 can be found in [7]. Now the main result of this paper is given below.

Main Theorem: Consider the end-effector described by (5) and adequately modelled by (7). If the desired Cartesian position x_d(t) and the desired Cartesian contact force F_d(t) are continuously differentiable on [0,T], and ẋ_k(0) = ẋ_d(0), x_k(0) = x_d(0), then the learning control scheme described by (23) and (19) can be designed such that x_k(t) converges to x_d(t) uniformly on [0,T] as k → ∞.

Proof: The iterative learning scheme expressed by (23) and (19) is intended for the general hybrid learning control system. To prove the above theorem, we consider the special case of the pure position learning scheme, which is realized by setting S̄ = 0. In this case, using (20)-(23), we obtain

u_{k+1}(t) = u_k(t) + Θ[β_p(ω_d − ω_k) + α_p(ω̇_d − ω̇_k)]   (27)

and, during the kth trial, the closed-loop system (17) becomes

M(t)ω̈_k(t) + [N(t) + A_2(t)]ω̇_k(t) + [G(t) + A_1(t)]ω_k(t) = u_k(t).   (28)

We note that M(t), N(t) + A_2(t) and G(t) + A_1(t) are continuously differentiable matrices on [0,T] because x_d(t) and F_d(t) are continuously differentiable on [0,T]. In addition,

ω̇_k(0) = q̇_k(0) − q̇_d(0) = 0 = ω̇_d(0)   (29)

ω_k(0) = q_k(0) − q_d(0) = 0 = ω_d(0)   (30)

since x_k(0) = x_d(0) and ẋ_k(0) = ẋ_d(0), as given in the hypothesis of the main theorem. Besides, M(t) is positive definite on [0,T] since M(q) is positive definite [see Eq. (5)]. We also observe that (27) and (28) assume the same form as (25) and (24), where in (28) ω_d = 0, Θ is positive definite, and u_1(t) is continuous on [0,T] because the closed-loop system (17) is asymptotically stable. Now, using Lemma 1, selecting α_p and β_p such that 0 < α_p ≤ 1 and 0 ≤ β_p < 1 will make ω_k(t) converge to ω_d(t) = 0 uniformly on [0,T] as k → ∞, meaning that x_k(t) converges to x_d(t) uniformly on [0,T] as k → ∞. We have just proved the main theorem.

REMARK: In the above proof, we considered the special case of designing the learning control system based on pure position control. In general, the above proof can be extended to the hybrid case, where concentration on certain degrees of freedom can be specified through the Selection Matrix S̄. It is expected that the hybrid learning control scheme performs better than the pure position learning control scheme, as we will see when implementing the hybrid learning control scheme in the next section.
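To make the convergence mechanism of Lemma 1 concrete, the short simulation below applies a PD-type learning law of the form of (25) to a simple one-degree-of-freedom system of the form (24); the plant coefficients, gains, and desired trajectory are illustrative choices only, not the end-effector model.

```python
import numpy as np

# Plant of the form (24), scalar case: m*z'' + c*z' + k*z = v
m, c, k = 1.0, 2.0, 5.0
dt = 0.01
t = np.arange(0.0, 5.0, dt)
z_d = (1.0 - np.exp(-t)) ** 2                    # desired pattern, z_d(0) = 0
zdot_d = 2.0 * (1.0 - np.exp(-t)) * np.exp(-t)   # its derivative, zero at t = 0

# Learning law of the form (25): v_{k+1} = v_k + gamma*(alpha*(zdot_d - zdot_k) + beta*(z_d - z_k))
alpha, beta, gamma = 1.0, 0.5, 2.0               # 0 < alpha <= 1, 0 <= beta < 1

v = np.zeros_like(t)                             # initial input v_1(t) = 0
for trial in range(1, 11):
    z = np.zeros_like(t)
    zdot = np.zeros_like(t)
    for i in range(1, len(t)):                   # explicit Euler integration of the plant
        zddot = (v[i - 1] - c * zdot[i - 1] - k * z[i - 1]) / m
        zdot[i] = zdot[i - 1] + dt * zddot
        z[i] = z[i - 1] + dt * zdot[i - 1]
    # Trial-to-trial update from the errors recorded during this trial
    v = v + gamma * (alpha * (zdot_d - zdot) + beta * (z_d - z))
    print(f"trial {trial}: max |z_d - z| = {np.max(np.abs(z_d - z)):.4f}")
```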

5. STUDY OF THE SPECIAL CASE

In this section we present the implementation of the proposed learning-based hybrid control scheme for controlling a two-degree-of-freedom robot end-effector resembling a part of the complete CKCM end-effector discussed in Section 2. An experimental study is performed on this robot end-effector.


THE "UO-DEGREE-OF-FREEIOM END-EFFM;TOR

5.1


Figure 4 illustrates the robotic end-effector built at CUA for the study of position/force control. Two links of the end-effector are made up of two ball-screw actuators, driven by two DC motors and hung below a fixed platform by means of pin joints. The other ends of the links are joined together, also by pin joints, on which a two-degree-of-freedom force sensor is mounted to provide contact force feedback. Position feedback is achieved by two LVDTs mounted along the link lengths. A reaction table with variable stiffness is placed under the end-effector. The reaction table is made of plexiglass, so that horizontal friction is minimized, and is held to a solid board by four springs at its corners.
The Cartesian positions x and y of the end-effector end-point, expressed with respect to a reference coordinate system attached to the fixed platform, are related to the link lengths l_1 and l_2 by

x = (d² + l_1² − l_2²)/(2d)

y = −[4d²l_1² − (d² + l_1² − l_2²)²]^(1/2)/(2d)

where d is the distance between the pin joints from which the actuators hang.
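Under the geometry just described (upper pin joints a distance d apart on the fixed platform, with the reference frame attached there and the end-point hanging below), the closed-form kinematics can be sketched as follows; the joint placement and sign conventions are assumptions for illustration and may differ from the authors' frame assignment.

```python
import numpy as np

def forward_kinematics_2dof(l1, l2, d):
    """End-point (x, y) from link lengths, with pin joints at (0,0) and (d,0).

    Assumed convention: the end-point hangs below the fixed platform, so y < 0.
    """
    x = (d**2 + l1**2 - l2**2) / (2.0 * d)
    y = -np.sqrt(max(l1**2 - x**2, 0.0))
    return x, y

def inverse_kinematics_2dof(x, y, d):
    """Link lengths (l1, l2) reaching the end-point (x, y)."""
    l1 = np.hypot(x, y)
    l2 = np.hypot(x - d, y)
    return l1, l2

# Example using the home position and joint spacing quoted in Section 5.2.
print(inverse_kinematics_2dof(13.564, -33.2, 29.0))
```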
Using the Lagrangian formulation, the dynamical equations of this end-effector are given by

M(q)q̈(t) + N(q,q̇)q̇(t) + G(q) + Jᵀ(q)F_c = τ(t)   (33)

where the joint position vector is defined by q = [l_1  l_2]ᵀ, and M(q), N(q,q̇) and G(q) = [G_1  G_2]ᵀ are the corresponding (2x2) inertia and centrifugal/Coriolis matrices and the (2x1) gravitational force vector obtained from the Lagrangian formulation; their explicit expressions are given by (34)-(39). In these expressions m_l is the mass of the moving part of a link, m is the total mass of a link, and l_a represents the fixed length of the actuators. In this experimental study, we are interested in controlling the horizontal position x of the end-effector and the vertical contact force F_y that the end-effector applies on the reaction table. Thus, for this particular case, the contact force vector is given by F_c = [0  F_cy]ᵀ and the end-effector Jacobian is obtained from the kinematic relations given above.

5.2 EXPERIMENTAL STUDY

In this experimental study, the end-effector shown in Figure 4 is controlled to follow a desired horizontal trajectory x_d(t) that rises exponentially from the home position, while keeping a desired vertical contact force F_dy = 3 lbf. In each trial, the end-effector tip starts at the same position, the home position indicated by (x_0 = 13.564 in., y_0 = −33.2 in.). The starting point is about 0.15 in. vertically above the reaction table, whose vertical position is −33.35 in. The end-effector parameters and controller gains used in this experiment are given below:

End-Effector Parameters: d = 29 in., l_a = 11 in., m_l = 0.59 kg, m = 4.5 kg.

Hybrid Control System (PD-controller for position, P-controller for force): gains of 0.18 V/in., 0.8 V/in. and 0.7 V·sec/in.

Learning Control System (PD-controllers for both position and force): α_p = α_f = 1/20; β_p = β_f = 19/20.

Figures 5 and 6 illustrate the position errors and the force errors of the experiment, respectively. In the first trial, when the end-effector started to move out of the home position, the force errors were very large because the end-effector was not yet in contact with the reaction table. Trying to compensate for the force error, the end-effector moved down toward the reaction table as quickly as possible and consequently disturbed the position. This created relatively large position errors at the beginning. After touching the table, the end-effector followed the desired position trajectory with some steady-state errors in force (about ±0.6 lbf) and in position (about 0.1 in.). The errors in force and position were produced because the hybrid control could not compensate for the unforeseen table surface friction and the changing stiffness of the table. By repeating the same task, the end-effector was able to learn from the errors of the previous operation through the learning control system and improved its performance as the number of trials increased. As seen in Figures 5 and 6, the end-effector performance at the 10th trial has improved drastically, with almost zero error in position and an error of less than 1 lbf in contact force.
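For this particular task the hybrid selection reduces to a 2x2 case: the x direction is position-controlled and the vertical direction is force-controlled. The fragment below merely sets up the corresponding Selection Matrix and reference signals; the sampling interval and the exact shape of the exponential trajectory are assumptions for illustration.

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 20.0, dt)

# Selection Matrix over the [x, y] directions: x position-controlled (0),
# y (vertical) force-controlled (1), as described in the text.
S = np.diag([0.0, 1.0])

x0 = 13.564                                  # home position, in.
x_d = x0 + 3.0 * (1.0 - np.exp(-t))          # assumed exponential-rise trajectory, in.
F_dy = 3.0 * np.ones_like(t)                 # desired vertical contact force, lbf
```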

6. CONCLUSION

In this paper, we proposed a learning-based hybrid control scheme for controlling the position and force of a CKCM end-effector. The proposed control scheme consists of a hybrid control system serving as a feedback controller and a learning control system that can "learn" to improve the performance of the end-effector. Employing the method of linearization around a desired force and position pattern, we showed that the controller gains of the hybrid control system can be selected to asymptotically stabilize the end-effector. After that, using the results in [7], we proved that the controller gains of the learning control system can be designed such that the end-effector responses approach the desired pattern uniformly as the number of trials increases. Experimental investigation performed on a real-life end-effector showed that the proposed control scheme provided excellent convergence of the errors in force and position. In order to further improve the end-effector performance, it is suggested that future research be extended to studying the implementation of the proposed learning control scheme in a robotic system whose feedback control system is hybrid and adaptive [13], [16].



ACKNOWLEDGMENTS

The research work presented in this paper has been sponsored by Goddard Space Flight Center (NASA) under research grant NAG 5-780. The authors would like to express their appreciation to NASA for the continuous support of the research project.

7. REFERENCES

1. Arimoto, S., et al., "Can Robots Learn by Themselves?", Robotics Research: The Third International Symposium, H. Hanafusa and H. Inoue, Eds., MIT Press, Massachusetts, 1985.
2. Arimoto, S., et al., "Bettering Operation of Dynamic Systems by Learning: A New Control Theory for Servomechanism or Mechatronics Systems", Proc. 23rd Conf. on Decision and Control, pp. 1064-1069, December 1984.
3. Atkeson, C. G., and McIntyre, J., "Robot Trajectory Learning Through Practice", Proc. IEEE Conf. on Robotics and Automation, pp. 1737-1742, San Francisco, 1986.
4. Craig, J. J., "Adaptive Control of Manipulators Through Repeated Trials", Proc. American Control Conf., pp. 1566-1573, June 1984.
5. Fu, K. S., "Learning Control Systems--Review and Outlook", IEEE Trans. Automatic Control, AC-15, pp. 210-221, April 1970.
6. Hara, S., Omata, T., and Nakano, M., "Synthesis of Repetitive Control Systems and Its Applications", Proc. 24th Conf. on Decision and Control, pp. 1387-1392, Dec. 1985.
7. Kawamura, S., Miyazaki, F., and Arimoto, S., "Applications of Learning Method for Dynamic Control of Robot Manipulators", Proc. 24th Conf. on Decision and Control, pp. 1381-1386, Dec. 1985.
8. Mita, T., and Kato, E., "Iterative Control and Its Application to Motion Control of Robot Arm: A Direct Approach to Servo Problems", Proc. 24th Conf. on Decision and Control, pp. 1393-1398, Dec. 1985.
9. Nguyen, C. C., "Arbitrary Eigenvalue Assignments for Linear Time-Varying Multivariable Control Systems", International Journal of Control, Vol. 45, No. 3, pp. 1051-1057, 1987.
10. Nguyen, C. C., Pooran, F. J., and Premack, T., "Control of Robot Manipulator Compliance", in Recent Trends in Robotics: Modeling, Control and Education, M. Jamshidi, J. Y. S. Luh, and M. Shahinpoor, Eds., North Holland, New York, pp. 237-242, 1986.
11. Nguyen, C. C., Pooran, F. J., and Premack, T., "Modified Hybrid Control of Robot Manipulators for High Precision Assembly Operations", Proc. ISMM Conference on Computer Applications in Design, Simulation and Analysis, Honolulu, Hawaii, pp. 191-195, Feb. 1988.
12. Nguyen, C. C., and Pooran, F. J., "Kinematics and Dynamics of a Six-Degree-of-Freedom Robot Manipulator with Closed-Kinematic Chain Mechanism", Proc. Second Int. Symp. on Robotics and Manufacturing Research, Education and Applications, New Mexico, November 1988.
13. Nguyen, C. C., and Pooran, F. J., "Adaptive Control of Robot Manipulators with Closed-Kinematic Chain Mechanism", Proc. Second Int. Symp. on Robotics and Manufacturing Research, Education and Applications, New Mexico, November 1988.
14. Premack, T., et al., "Design and Implementation of a Compliant Robot with Force Feedback and Strategy Planning Software", NASA Technical Memorandum 86111, 1984.
15. Raibert, M. H., and Craig, J. J., "Hybrid Position/Force Control of Manipulators", ASME Journal of Dynamic Systems, Measurement, and Control, Vol. 102, pp. 126-133, June 1981.
16. Seraji, H., "Adaptive Force and Position Control of Manipulators", Journal of Robotic Systems, 4(4), pp. 551-578, 1987.
17. Togai, M., and Yamano, O., "Analysis and Design of an Optimal Learning Control Scheme for Industrial Robots: A Discrete System Approach", Proc. 24th Conf. on Decision and Control, pp. 1399-1404, Dec. 1985.
18. Uchiyama, M., "Formation of High Speed Motion Pattern of Mechanical Arm by Trial", Transactions of the Society of Instrument and Control Engineers, Vol. 19, No. 5, pp. 706-712, 1978.


Fig. 2: The Learning-Based Hybrid Control System

Fig. 1: The CKCM Robotic End-Effector
Fig. 3: The Learning Controller
Fig. 4: The 2-DOF CKCM End-Effector

Fig. 5: Experimental Results of Position Errors
Fig. 6: Experimental Results of Force Errors