As the first steps are now being taken towards implementing imitation in autonomous robots, we considered it timely to prepare an introductory overview of the concept of imitation. Imitation is an established - and controversial - topic in psychology. The issues highlighted in the psychological literature are discussed in the next section, where we also propose a mechanistic definition of imitation. Subsequent sections of this paper review the potential benefits of imitation learning for robots, and outline the major issues involved in the implementation of imitation learning. A suggested framework for mapping progress in robot imitation is then introduced, and is used to briefly review the published work on robot imitation to date.

Finally, we describe an ongoing project at the Electrotechnical Laboratory in Japan, that aims to have a robot develop the ability to imitate.

Observation of animal behaviour seemed to confirm many clear cases of imitation. For example, chicks following their mother's example in avoiding roads, feeding only in certain areas, and eating certain species of plants; birds learning to puncture and feed from milk bottles; and kittens, exposed to adult cats that attained food by manipulating levers, learning to perform the same manipulations much faster than a control group (Galef, 1988).

The evidence seemed compelling, but it was eventually realized that the behaviours described above were not necessarily the result of 'witnessing' an act and learning it: they could be more parsimoniously explained by an interplay of simpler mechanisms such as reinforcement learning, following behaviour, social facilitation, matched dependent behaviour and stimulus enhancement.

[1] The answer to this question was considered at the time to be an important test of evolution theory - see Galef (1988).
To Appear in AISB'96 Workshop on Learning in Robots and Animals 3
In the cat experiment cited above, for example (Chesler, 1969), the kittens may have been more likely to manipulate the lever simply because the adult cat had left a scent on it. A chance pawing of the lever then led to reinforcement and subsequent repetition of the behaviour. Similarly, many of the apparent imitatory behaviours of birds can be explained by the innate following behaviour of chicks.

Such alternative learning mechanisms, which together have the same behavioural result as imitation (i.e., the spread of similar behaviours amongst animals) were unfortunately (and confusingly) labelled as special cases of imitation in the literature: instinctive imitation, pseudo-imitation, reflective imitation and so on (Galef, 1988). Imitation, as defined above, was relabelled 'true imitation'!

It is our submission that in robot imitation we are chiefly interested in 'true imitation', as described by Thorndike: a behaviour is observed, understood, and reproduced. Our goal is to investigate how robots can be endowed with this powerful learning mechanism. Any other form of 'imitation' that does not involve the adoption of a behaviour from the observation of that behaviour will be precluded from our discussion.[2]

This is not to say that the simple mechanisms noted above, which may be used to control the contagion of behaviours in societies of robots, are not worthy of study in their own right. Indeed, we are pursuing this direction in more detail elsewhere (Bakker, 1996b). It is simply proposed that they not be included under the banner of 'robot imitation', to avoid terminological confusion.

Our definition of imitation is therefore stated in terms of the mechanism involved:[3]

    Imitation takes place when an agent learns a behaviour from observing the execution of that behaviour by a teacher.

Note that the roles of 'teacher' and 'agent' are not fixed; they can be reversed from one encounter to the next, for example during bouts of mutual imitation (Piaget, 1962).

The Promise of Imitation

In this section we discuss the reasons why we should be interested in endowing robots with the ability to imitate. What specific advantages would such a learning mechanism give? As will become clear below, the advantages are varied and quite substantial, both for individual agents and for societies of interacting agents.

Adaptation

An agent with the ability to imitate has an excellent mechanism for adapting to its environment. By observing other agents' actions, the agent can quickly learn new behaviours that are likely to be useful; likely, because they are already being used by agents successfully operating in the same environment.

Imitation also acts as an ongoing means of adaptation, allowing the agent to induct new behaviours from fellow agents as the environment changes, as new skills are discovered by other agents, or as the agent moves to a new setting.

Efficient Communication

Imitation provides agents with an efficient non-verbal means of communication. Because it is non-verbal, it does not require the teacher and the agent to 'speak the same language'. This is also true at the somatic level: agents can learn from other agents that are of a different species or are built from different hardware.

[2] The term 'observation' here includes perception through any of the robot's available faculties: "To see or sense, esp. through directed careful analytic attention" (Webster Online Dictionary).

[3] Compare this to the more 'behaviouristic' definition offered in a psychological text: Imitation is the motoric or verbal performance of specific acts or sounds that are like those previously performed by a model (Yando, Seitz, & Zigler, 1978). Such a definition does not preclude the various types of pseudo-imitation outlined above.
The basic reason for this advantage is that communication via imitation takes place at a high level (i.e., in terms of actions) rather than at a lower level (such as motor commands).

Communication via imitation is also efficient because a large amount of important information can be transmitted simultaneously with each act - the context in which it occurs, the objects that are manipulated, the outcomes, and the tools involved.

Finally, imitation has the advantage of being an undemanding and unobtrusive communication method because a teacher does not have to go 'off-line' to transfer a behaviour to an agent: the agent (or multiple agents simultaneously) can learn by observing the teacher without interfering in the teacher's performance.

Compatibility with other Learning Mechanisms

Imitation can be used as a learning mechanism in conjunction with existing learning schemes for agents (such as reinforcement learning, trial-and-error learning, or symbolic induction schemes). While the current work examines imitation learning in isolation, it can naturally be used as a supplemental learning strategy, increasing the agents' learning capabilities overall.

Efficient Learning

The most significant advantage of robot learning by imitation promises to be the efficiency of the learning process, particularly in a society of agents. This follows from the communication and compatibility facets discussed above.

In the traditional learning paradigm for robots, each agent is 'alone' in the environment. All new behaviours must be discovered through personal learning experience. Imitation opens a rich new vein of information to the learning robot: the behaviours of other agents operating in the same environment.

When one agent gains a useful new behaviour - be it by trial and error, observing a human, or simply from being reprogrammed - imitation provides a mechanism for rapidly communicating the discovery of this behaviour through the whole society of agents. It provides a means of combining the power of all the agents' diverse learning schemes, to benefit the whole society.

Imitation thereby increases the adaptation and survivability of a society as a whole. It also ensures the survival of useful behaviours - such behaviours will rapidly spread, and may even be passed on to following generations.

'Good Company'

A final point is that providing robots with imitation ability gives them a skill that has thus far only been demonstrated in higher animals - primates, cetaceans and humans. These are also the only animals that we consider to exhibit advanced intelligence. Based on the advantages outlined above, learning by imitation would seem to be one of the major components of general intelligent behaviour. Robots with imitatory ability may hence find themselves in ethologically 'good company' for displaying truly intelligent behaviour.

Implementation Issues in Robot Imitation

After the discussion of the prospective benefits of imitation given above, one might well wonder what kind of oversight has led to its exclusion from robot learning thus far! The answer is, of course, that there is a good reason why imitation in nature is restricted to higher animals - imitation itself requires significant perceptual and cognitive abilities. Understanding (much less implementing) many of these abilities is still an open problem in psychology, artificial intelligence, and robotics.
Nevertheless, the first few tentative steps towards robot imitation have already been taken. Some of the recent work will be reviewed in the next section; the purpose of this section is to identify the substantive issues that must be addressed before robots can be considered able to imitate.

The following list is based on our own research, and on analyzing the problems addressed (and purposefully avoided) in recent work in robotics. While probably not an exhaustive list, it does outline what is at least required to implement imitation in an autonomous agent. The list also suggests a framework for reviewing future contributions to this area.

A Conceptual Framework for Robot Imitation

Imitation in robots (or in animals, or humans) would appear to be composed of three fundamental processes, described by Kuniyoshi et al. (1994) as "seeing, understanding and doing" (p. 800). In a reformulation of this statement, we will propose that for an agent to imitate an action by a teacher, it must at least:

1. observe the action,
2. represent the action,
3. and reproduce the action.

Each of these fundamental processes in turn involves some important problems:

Observation

- Motivate the agent to observe a teacher;
- Identify an appropriate teacher to observe;
- Identify when the teacher is performing an action that should be learned;
- Accurately observe the teacher's action (via vision or some other sense - includes tracking, attention and segmentation issues);
- Process the relevant environmental information accompanying the action - the context in which the action occurred, the participants, the tools or objects manipulated.

Representation

- Choose an appropriate representation for actions;
- Convert an observed action to the agent's internal representation (this subsumes an analogy mapping problem - mapping the teacher's actuators to the agent's actuators).

Reproduction

- Motivate the agent to consider executing an observed action;
- Choose the appropriate context in which to reproduce the action;
- Adapt the action to the current environment.

A Review of Recent Work in Robot Imitation

Here follows a brief review of recent work in robot imitation. We will discuss these papers in the context of the framework described above, highlighting for each paper the issues that were addressed, and those which were (purposefully) avoided.

Kuniyoshi et al. (1994)

In Kuniyoshi et al. (1994), a robot agent watches a human teacher performing a simple assembly task in a tabletop environment. The motor movements of the human are classified as actions known to the robot (pick, move, place etc.). When the assembly task is completed, the robot is commanded to reproduce the sequence of actions. It can successfully do this even if the initial position of the items to be manipulated has changed.
Observation. The authors chose to focus on the problem of perceiving the teacher's actions: determining the start and finish points of actions, and tracking the human hand. This problem is simplified by the fact that the robot is only required to recognize actions that it already knows.

The robot was not required to choose which teacher or which actions to observe: all actions had to be observed in 'seeing' mode, and then reproduced in 'doing' mode.

Representation. The problem of mapping action sequences to the robot's actuators was solved by using symbolic labels. Once the robot correctly interpreted a human action, the label could be mapped to a pre-programmed robot action sequence.

Reproduction. The robot was explicitly told when to reproduce the observed actions, by being switched to 'doing' mode. Kuniyoshi et al. (1994) focused on the problem of how to adapt the imitated action to the current environment. The initial state of the table was analyzed, and the parameters of the action sequence were changed to conform to this state (e.g., if the position of objects on the table had changed between the seeing and doing modes).

To sum up, Kuniyoshi et al. (1994) addressed two fundamental problems: how to accurately perceive the teacher's actions, and how to adapt an imitated action to the environment in which it is reproduced.

Hayes and Demiris (1994)

In Hayes and Demiris (1994), a robot agent is taught the skill of maze traversal by imitation. The agent follows a teacher robot through a maze, detecting significant teacher actions (such as turning), and physically copying those actions. The agent also learns to associate the environment - the position in the maze (in terms of local wall positions) - with the teacher's actions.

Observation. There is only one teacher to attend to, and it must be watched (and followed) at all times in training mode. The identification of teacher actions is simplified by restricting the teacher to two essential acts: turning by 90 degrees and moving straight ahead.

Representation. The problem of mapping actions from the teacher to the agent is elegantly solved by commanding the agent to always follow the teacher. Through the act of following, it must immediately imitate the teacher's locomotive action. This agent action (turning or moving straight ahead) is then stored with the environmental information.

Actions are represented as symbolic rules. The antecedent of such a rule is a description of the environment in which an action occurred, and the right-hand side (RHS) is the action to be executed in that environment. For example, right-hand turns become associated with an environment in which there is one wall to the left and one straight ahead.

Reproduction. When the agent attempts to navigate the maze by itself, it constantly tries to match the perceived environment to the antecedents of stored production rules. The issue of motivation is thus avoided: in this mode, the agent is always and only looking to reproduce observed actions.

In summary, Hayes and Demiris (1994) addressed two major issues: how to map a teacher action to an agent action, and how to learn when to reproduce an observed action. The solutions implemented by Hayes and Demiris (1994) are simple and elegant, but only work because of the simplicity of the maze environment and the restricted set of locomotive actions available.

Dautenhahn (1995)

In Dautenhahn (1995), agents traverse a 'hilly landscape', attaching themselves to teacher robots and imitating their trajectories. As in Hayes and Demiris (1994), imitation is so far limited to the act of following other agents.

Observation. Agents are explicitly programmed to seek out other agents and attach themselves. Eventually agents learn to recognize suitable teachers from positive (or negative) learning experiences in the past.
fact that the set of possible actions is very simple - movement in a given direction - and the agent robot can map these actions to its own body by following the teacher.

Reproduction. Learned movements are associated with the local gradient in the hilly landscape. This would allow the agent to generalize behaviours to other, similar areas in the hilly landscape.

[Figure: issues in imitation. Observation: who to observe; which actions to observe; perceiving teacher actions; perceiving the relevant context (environment, tools, objects). Representation: representing actions; mapping observed actions to the agent's actuators.]
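
The symbolic rules described for Hayes and Demiris (1994) earlier in this review can be illustrated with a short Python sketch. This is an assumption-laden toy, not their implementation: the `RuleStore` class, the encoding of the environment as a `(wall_left, wall_ahead, wall_right)` tuple, the action names, and the use of exact matching on the antecedent are all hypothetical choices made here for illustration.

```python
class RuleStore:
    """Toy production-rule memory: antecedent (wall pattern) -> action (RHS)."""

    def __init__(self):
        self.rules = {}

    def learn(self, walls, action):
        """Training mode: store the action performed while following the teacher."""
        self.rules[walls] = action

    def act(self, walls):
        """Solo mode: return the action of the matching rule, or None if none matches."""
        return self.rules.get(walls)


store = RuleStore()

# Hypothetical encoding: (wall_left, wall_ahead, wall_right).
# A wall to the left and one straight ahead becomes associated with a right turn.
store.learn((True, True, False), "turn_right")
# A corridor with walls on both sides becomes associated with moving ahead.
store.learn((True, False, True), "forward")

next_action = store.act((True, True, False))
```

In this sketch an unseen wall pattern simply yields `None`; how the original system handled partial or failed matches is not stated in the text above, so that behaviour should be read as a placeholder.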
Assimilation provides the child with a crude means of selecting an action to perform as a circular reaction. It is based purely on the observation of the outcome of an action, and not on any understanding of the process involved.

The mechanisms of exploration, circular reactions and assimilation are sufficient to allow the infant to reach stage 3 of development - where conservative imitation is performed with accuracy.

In stage 4, the step is made to 'true' imitation - the ability to learn novel behaviours through imitation. The trigger for this is the maturing mechanism of Accommodation: the ability to detect and correct the differences between an observed behaviour and one's own reproduction of it. For example, an adult says "barn" and the child responds with "bah". The development of a more discriminating representation of speech sounds allows the child to perceive the difference between these two utterances, and the mechanism of accommodation adjusts her speech act gradually until it fits the desired outcome. For example, the child might try "bah", "bag", "bahg", and finally "barn".

In the physical domain, a child might observe a teacher clicking his fingers. He tries tapping his fingers together - a known action - and experiments with adjusting his behaviour from there. Success is not guaranteed, and would depend on the level of skill so far attained, and the continued help and encouragement of the teacher.

Discussion

By pursuing the developmental route, we hope to solve one of the critical problems of imitation: how one maps a novel observed action to one's own body. It is assumed, in our model, that this cannot be done for a completely novel action: the observed action must first be mappable to one that the agent already knows, and then adapted stepwise from there. In other words, and conforming to the old adage, it is not possible to learn a new behaviour unless one almost knows it already.

Importantly, this teleological approach to behaviour induction is physically grounded in the first stage of learning, where the agent explores the space of possible behaviours allowed by its physical embodiment, and builds up a repertoire of behaviours.

A computational architecture for controlling the development of imitation has been specified along the lines sketched out above. As a first experiment, we are applying this control architecture to the learning of speech sounds by a robot. The agent is provided with hearing capabilities and an articulatory synthesizer (a model of the human vocal tract) with which to produce speech. It is allowed to explore its range of vocalizations, and store the resulting sounds. In interaction with a human teacher, the robot will engage in conservative imitation, eventually tailor its set of utterances (in the first instance, monosyllables) to the teacher's language, and by the fourth stage engage in true imitation of novel speech sounds. If successful, this will demonstrate for the first time a learning scheme that allows robots to learn truly novel behaviours (Brooks & Mataric, 1993).

Conclusion

Robot learning is a field full of promise, but fraught with intractable problems. Imitation learning offers a new approach that promises a richer environment for the learning robot, provides a simpler channel of communication between humans and robots, and may finally empower robots to induct novel behaviours. The aim of this paper has been to review both the promise and the problems of this burgeoning new field in robotics. While recent work is encouraging, there are still many deep problems in perception and representation to be addressed before general imitation learning will become a reality in robots. Recent work at ETL is drawing on insights from developmental psychology to address some of these problems in a practical way.