
Teaching procedural flow through dialog and demonstration

Kevin Yoon, Paul E. Rybski


School of Computer Science, Carnegie Mellon University
5000 Forbes Ave., Pittsburgh, PA, 15213
{kmy,prybski}@cs.cmu.edu

Abstract— In order for robots to act as valuable assistants for non-expert users, they need to be able to learn new abilities and do so through natural methods of communication. Furthermore, it is often desirable that tasks be learned quickly without having to provide multiple demonstrations. Training should also be conducted in such a way that the user has a clear understanding of the manner in which environmental features affect the behavior of the learned activity, so that execution behavior is predictable.

We present an interactive framework for teaching a robot the flow of an activity composed of elements from a set of primitive behaviors and previously trained activities. Conditional branching and looping, order-independent activity execution, and contingency (or interrupt) actions can all be captured by our activity structures. Additional convenience functionality to aid in the training process is also provided.

By providing a natural method of communicating production rules analogous to rigid programming structures, well-defined tasks can be trained easily. We demonstrate our task training procedure on a mobile robot.

I. INTRODUCTION

In the future, robots will inevitably be employed as assistants or team partners. However, if such robots are ever to gain widespread and long-term acceptance, they will need to be capable of not only learning new tasks, but also learning them from non-expert users.

We have previously introduced a method for task training via dialog and demonstration in [11]. Therein we described a collaborative natural language procedure for constructing tasks from a set of primitive behaviors and/or previously-trained tasks, which in turn could be used to build other tasks. This modular task architecture supports an expanding repertoire of abilities. Different training modes enable different features, such as the ability to attach locational context to a given command, reducing the explanatory responsibilities of the human trainer. Preconditions on task actions serve as a failure-handling mechanism that appropriately directs task flow should an action fail. The robot also engages the human in a verification dialog to resolve ambiguities in task flow and, in so doing, brings about mutual understanding of the task representation.

This understanding can be desirable, sometimes essential, in situations where the time or opportunity to provide multiple demonstrations and/or make corrections through practice trials is unavailable, and the chance of the robot exhibiting unexpected behavior due to conditions unencountered during training is unacceptable. The training dialog described herein enables the human to quickly construct rigidly-formulated tasks where the features that affect task flow must be explicitly conveyed and not inferred. Additionally, because tasks are symbolically referenced with natural language labels, they are transferable across heterogeneous robots that share the same or a similar primitive behavior set.

In this paper, we present some enhancements and modifications to this task training technique that include the ability to capture conditional looping so that repetitive, or cyclic, tasks can be created. Interrupt events that may occur at any point during a task can also be specified to trigger contingency actions. Additionally, a new construct called a todolist has been added, which permits order-independent activity execution. Moreover, tasks can now be trained "on the fly" — that is, while training another task that uses it — to support a top-down design approach while still permitting the bottom-up construction of tasks. Furthermore, locational context is no longer inferred automatically as this is not always desirable in some situations. However, location-specific actions can still be specified explicitly with a simple grounding utterance.

II. RELATED WORK

Robot task learning and programming-by-demonstration (PBD) has been explored by several groups. In [1], [2], and [12], robots learn actuator trajectories or control policies from user task demonstrations. In [13], a task is built using gestures by discerning which primitive actions, from a base set of capabilities, can be combined to conduct the task demonstrated.

Our method has the ability to discern, to a limited extent, which primitive actions should be combined to execute a given task by way of inferring locational context on actions. We note, however, that this is not the main focus of our work, nor is it meant for deriving low-level control strategies. It is primarily a method by which the control flow of a task, using primitive actions and previously learned tasks, can be communicated through a training procedure employing natural interaction, thereby converging to mutual task understanding for both robot and user.

Our work is largely inspired by [8] and [7]. In [8], a mobile robot is joysticked through multiple demonstrations of a task from which it generates a generalized task representation in the form of a directed acyclic graph (DAG). The task is then pruned down to a linear sequence through teacher feedback in the form of verbal cues over multiple practice trials. In [7], a stationary humanoid robot that understands some
speech, though it is unable to speak itself, learns tasks by communicating through gestures and facial expressions. Our approach employs a similar turn-taking framework for instruction and task refinement, but we endow the robot with the capability of speech, which we believe conveys more directly the robot's understanding of the task and guides the human more effectively in resolving ambiguities. In this way, we obviate the need to refine a learned task through practice.

Similar dialog-driven interaction mechanisms have been developed in the area of plan recognition, though primarily in the Human-Computer Interaction (HCI), as opposed to Human-Robot Interaction (HRI), domain. A plan recognition algorithm is introduced in [10] and [6] where characteristics of the collaborative setting are exploited to reduce the amount of input required of the user. This recognition strategy, however, requires some prior knowledge in the form of SharedPlans (or mutually-believed goals, actions, and intentions) and a set of recipes (or action plans for achieving goals). This work differs from ours in that the goal is to help the user accomplish tasks according to perceived intent, whereas we are striving to teach a robot new tasks. Our approach could potentially be used instead to build the recipes necessary for this plan recognition method to work.

In [9], an augmentation-based learning approach is described. The task structure, including conditional branching and looping, is inferred from user demonstration. Manual edits can also be made to fix incorrect task structures and constrain the induction procedure on subsequent demonstrations. Again, this approach is explored in the software application domain and there is no effort to conduct a collaborative discourse with the user for natural interaction. Additionally, in our work, branching and looping structures are explicitly and quickly communicated by the user, rather than being inferred over multiple demonstrations.

A multi-modal interface for programming tasks is described in [4] that additionally allows the user to control task priority during execution. Instruction-Based Learning [5] is similar to our work in that it uses a base set of behaviors that are associated with natural language symbolic labels and a modular architecture for symbolic tasks.

None of these works, however, describe the ability to convey branching or looping flow constructs within the task structure that are conditioned on explicitly-communicated features. Nor do they address the issue of structuring tasks for activities that need not be executed in the order in which they were communicated. This severely limits robustness and the types of tasks that can be trained. Through speech one can very compactly format instructions for execution based on detectable environmental states. No intention beliefs are maintained that may result in unexpected behavior during execution; rather, by engaging the user in a true spoken dialog, we can quickly train tasks with clearly defined execution flow that is necessarily understood by the user.

III. SYSTEM OVERVIEW

Figure 1 depicts a simple overview of the system architecture we employ. Within the top-level behavior is an Activity Selector that, upon parsing given speech commands, places the appropriate activities in the Activity Repertoire onto the Current Activity List for execution.

Fig. 1. CMAssist software architecture

An activity is the encompassing term for behaviors, tasks, and todolists, which are described in more detail in Section IV. The Activity Repertoire is the collective map that associates natural language symbolic labels to known activities. For example, "Go to" in the phrase "Go to the door" maps directly to the navigation behavior, which would be put onto the Current Activity List with the location parameter "door". An activity building behavior can also add new activities to the Activity Repertoire, as will be shown in Section V-A.
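To make this label-to-activity mapping concrete, the following minimal Python sketch shows how a repertoire lookup of this kind could resolve a spoken phrase to an activity and its parameter. The class and method names (ActivityRepertoire, register, resolve) are our own illustrative assumptions and are not part of the CMAssist implementation.

    # Hypothetical sketch of an Activity Repertoire lookup (names are ours, not the authors').
    class ActivityRepertoire:
        def __init__(self):
            # Maps natural-language labels to activity constructors.
            self.activities = {}

        def register(self, label, make_activity):
            self.activities[label.lower()] = make_activity

        def resolve(self, utterance):
            # Return an activity instance for a matching label, or None if unknown.
            text = utterance.lower()
            for label, make_activity in self.activities.items():
                if text.startswith(label):
                    parameter = text[len(label):].strip(" .")
                    return make_activity(parameter)
            return None

    # Example: "Go to" maps to the navigation behavior with parameter "the door".
    repertoire = ActivityRepertoire()
    repertoire.register("go to", lambda param: ("goto", param))
    print(repertoire.resolve("Go to the door"))   # ('goto', 'the door')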
Though the various activity types have differing internal structures, they are all executed by the same function form, where the inputs are the sensors and a command object, and the outputs are an integer status flag and a new command object:

(status, command) = Activity(sensors, command)

The sensors object gives an activity module access to sensory data, while command is an object that can be modified by an activity to store actuator commands, such as motor velocities or speech output. A single command object is passed through each of the activities in the Current Activity List so that commands requested by activities of lower priority are visible to higher priority activities. Activities can take this information into account when actuator commands need to be overridden. For example, when the obstacle avoidance behavior needs to decide whether to veer left or right to circumvent an obstacle, it can check the command object to see in which direction the navigation behavior was trying to drive the robot and choose to go in a similar direction.

The main execution loop then involves processing all of the activities in the Current Activity List with the given sensory data. When the last activity on the Current Activity List is completed, status is routed back to the Activity Selector, which determines if behaviors need to be removed from the Current Activity List. The command object is processed to drive the actuators.
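This calling convention and one pass of the main execution loop can be summarized in the Python sketch below. It is a simplified rendering under our own naming assumptions (Activity, run_cycle, handle_completion), not the authors' actual code.

    # Sketch of the common activity interface and one execution cycle (names are ours).
    RUNNING, SUCCESS, FAILURE = 0, 1, 2

    class Activity:
        def step(self, sensors, command):
            # Every activity is executed through the same form:
            # (status, command) = Activity(sensors, command)
            raise NotImplementedError

    def run_cycle(current_activity_list, sensors, command, activity_selector):
        # Lower-priority activities are processed first so that the commands they
        # request are visible to higher-priority activities that may override them.
        status = RUNNING
        for activity in current_activity_list:
            status, command = activity.step(sensors, command)
        # When the last activity completes, its status is routed back to the
        # Activity Selector, which may remove activities from the list.
        if status != RUNNING:
            activity_selector.handle_completion(current_activity_list, status)
        return command  # the command object is then processed to drive the actuators

In the real system, insertion, removal, and conflict handling on the Current Activity List are the responsibility of the Activity Selector, as described next.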
The Activity Selector is triggered on speech input and is responsible for inserting commanded activities, removing conflicting ones, and removing completed or failed activities.
IV. ACTIVITY STRUCTURES

A. Behaviors

A behavior maps low-level sensory data to actuator trajectories in order to accomplish some high-level goal(s). The robot is assumed to be preprogrammed with some basic set of behaviors. For a mobile platform, these primitive skills might include obstacle avoidance and high-level navigation capabilities.

B. Tasks

The basic building block of a task is the task item (Figure 2). A task item consists of three main components: a (potentially empty) precondition list, an activity and a list of execution parameters, and a pointer list to subsequent task items. The precondition list contains the conditions that must be satisfied before the action can be executed. There are two types of preconditions: enabling and permanent. Enabling preconditions are evaluated only once before the task item's activity is executed. Permanent preconditions are monitored continuously for as long as the activity is being executed. As previously mentioned, an activity can refer to a behavior, a previously-trained task, or a todolist. Depending on the completion status of the activity (i.e., success or fail), the associated link is followed to the next task item to be executed.

Fig. 2. Task item

A task, then, is a temporally ordered sequence of task items captured in a directed graph structure. Tasks can represent simple linear sequences such as in Figure 3(a). Here, the robot executes Task items 1 through N in order. Tasks can also represent conditional branching as shown in Figure 3(b). Depending on the evaluation of <condition>, either Task item 2a or Task item 2b will be executed, followed by whichever task items follow it until the branches reconnect at Task item N. Cyclic tasks can be represented by loops as shown in Figure 3(c). For as long as <condition> is true, Task item 2 and the subsequent task items inside the loop are executed. This is made possible by applying the while-condition as a permanent precondition on all task items inside the loop.

Fig. 3. Task flow structures: (a) Linear; (b) Conditional branching; (c) Conditional looping

For some tasks it may be necessary to execute contingency activities, such as when some event occurs requiring special action and the current task must be put on hold. Rather than inserting if and while statements throughout the task, the user can optionally specify contingency event-action pairs that are checked for the duration of the task execution. Unlike the previous conditional constructs, a contingency plan is not represented within the directed graph itself but is an attribute of the task structure. Each task has an associative structure that maps an interrupt event k to an action tuple (a, r), where a is the activity to execute when k is true and r is a boolean value determining whether or not the original task should be resumed when either k is no longer true or a has completed.
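A plausible encoding of these structures is sketched below in Python. The field names (enabling_preconditions, permanent_preconditions, on_success, on_failure, contingencies) are our own shorthand for the components named in the text; the paper does not specify an implementation.

    # Hypothetical encoding of a task item and a task graph (field names are ours).
    from dataclasses import dataclass, field

    @dataclass
    class TaskItem:
        activity: object                     # behavior, previously trained task, or todolist
        parameters: list = field(default_factory=list)
        enabling_preconditions: list = field(default_factory=list)   # checked once, before execution
        permanent_preconditions: list = field(default_factory=list)  # monitored while executing
        on_success: "TaskItem" = None        # link followed if the activity succeeds
        on_failure: "TaskItem" = None        # link followed if the activity fails

    @dataclass
    class Task:
        first_item: TaskItem
        # Interrupt event k -> (contingency activity a, resume-original-task flag r)
        contingencies: dict = field(default_factory=dict)

Under this encoding, a while loop is simply a cycle in the graph whose body items all carry the loop condition as a permanent precondition, and the contingency map is stored on the task itself rather than in the graph.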
C. Todolists

Todolists are a special type of activity that allows the user to specify a list of items that are to be executed in no particular order. These todolist items, as with task items, can refer to any activity: behaviors, tasks, and other todolists. There is nothing unique about the structure of a todolist. It is simply a list of disconnected activities that, unlike tasks, cannot capture conditional branching and looping. It is rather the manner in which a todolist is executed that distinguishes it from the other activities, enabling it to accomplish unordered tasks as people do on a daily basis.

We currently employ a round-robin execution scheme where we iteratively loop through the list and attempt each item until it has either completed successfully or failed maxNumTries times, where maxNumTries is specified during training.
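A minimal sketch of this round-robin scheme, with hypothetical function names (run_todolist, execute), might look as follows.

    # Sketch of round-robin todolist execution (function names are ours).
    def run_todolist(items, execute, max_num_tries):
        # Loop over the items repeatedly, attempting each one until it either
        # succeeds or has failed max_num_tries times.
        failures = {id(item): 0 for item in items}
        pending = list(items)
        while pending:
            still_pending = []
            for item in pending:
                if execute(item):                # True on success, False on failure
                    continue
                failures[id(item)] += 1
                if failures[id(item)] < max_num_tries:
                    still_pending.append(item)   # retry on a later pass
            pending = still_pending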
Clearly, some optimal scheduling strategy to minimize failed attempts could be applied here when taking into account information like estimated todolist item durations and reasons for past failures. Item priority could be an additional constraint that such a strategy might take into account. This is beyond the scope of this work, where we simply provide a construct in which order-independent execution of activities is made possible.

V. TRAINING

The basic idea behind the training approach we employ is to allow the user to convey production rules primarily through speech. Each recognized user utterance is mapped to one of three things: (1) an activity in the Activity Repertoire that is to be appended to the current activity structure, (2) a control structure that affects where and how subsequent activities are appended to the current activity structure, or (3) a "special" command, such as a question that the user might ask during the training procedure.

Throughout the training procedure, the robot responds with an affirmative "ok" after every user utterance to indicate understanding. The robot will also ask the user questions about parameters that were not defined when the user has finished training, thus guiding the user through dialog towards a well-defined activity structure.
A. Training Tasks

Task training is itself a behavior that can be invoked in one of two modes: dialog-only and dialog-and-observation. The former is invoked with the keyphrase "When I say T" and the latter with "Let me show you what to do when I say T", where T is the name of the task to be trained and is typically an imperative statement. In dialog-only mode, all commands must be issued to the robot verbally. In dialog-and-observation mode, the robot invokes its following behavior such that it is always in the vicinity of the human trainer as he moves around the environment. In this manner, the robot can interpret deictic utterances like "come here". In the previous work [11], this mode was used to automatically attach locational context to each command given by the user. In an effort to provide a framework for the training of more general tasks — where it is not necessarily appropriate to assume that actions should be executed where they were demonstrated — locational contexts are no longer assumed but can be easily and naturally anchored to subsequent commands with the "come here" phrase.

Task flow control is communicated by the keyphrases summarized in Table I. An example of a user utterance that creates a conditional branching structure (Figure 3(b)) is "If you see Kevin, say 'Hi Kevin'. Otherwise, say 'Where is Kevin?' before looking for Paul". The resulting task would cause the robot to say either "Hi Kevin" or "Where is Kevin?" depending on whether Kevin was detected. It would then begin the activity called looking for Paul.

TABLE I
TASK FLOW COMMANDS

"If <condition>": Appends a conditional node to the task graph. Subsequent commands are added to the True branch.
"Otherwise": Causes subsequent commands to be added to the False branch of the current if node.
"before": Connects the True and False branches of the current if node with the following command. (Ends the if block.)
"While <condition>": Appends a conditional node to the task graph. Subsequent commands are added to the True branch and preconditioned on <condition>.
"After that": Routes execution flow back to the current while node and appends subsequent commands to the False branch. (Ends the while loop.)
"Meanwhile if <condition>": Adds a contingency event to the task object and maps it to the next activity command. If <condition> becomes true at any point during task execution, the specified activity is executed.
"Exit Task": Appends a node that exits the task with a success flag.
"The task has failed": Appends a node that exits the task with a fail flag.
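As an illustration only, the sketch below shows one way the "If ... Otherwise ... before" keyphrases from Table I could drive the construction of the branching structure of Figure 3(b) for the example utterance above. The BranchBuilder class and its method names are our own invention, not part of the described system.

    # Hypothetical builder for the if/otherwise/before keyphrases (names are ours).
    class BranchBuilder:
        def __init__(self, condition):
            self.condition = condition
            self.true_branch, self.false_branch = [], []
            self.target = self.true_branch       # "If <condition>": append to the True branch

        def otherwise(self):
            self.target = self.false_branch      # "Otherwise": append to the False branch

        def add(self, activity):
            self.target.append(activity)

        def before(self, activity):
            # "before": both branches reconnect at the following command.
            return {"condition": self.condition,
                    "true": self.true_branch,
                    "false": self.false_branch,
                    "join": activity}

    b = BranchBuilder("you see Kevin")
    b.add("say Hi Kevin")
    b.otherwise()
    b.add("say Where is Kevin?")
    graph = b.before("looking for Paul")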

Cyclic constructs (Figure 3(c)) can be specified with a phrase like "While Kevin is around, do a dance. After that, charge batteries". Executing the resulting task would make the robot conduct the do a dance activity for as long as it sees Kevin. If Kevin leaves, the loop is exited and the robot begins the charge batteries activity.

Contingency event-action pairs are specified with the "meanwhile" keyphrase. If we appended "Meanwhile if you see Paul, sing a song" to the previous example, the robot would begin the sing a song task if Paul was detected at any time during the task (i.e., while dancing or charging batteries) and would continue to do so until the sing a song task completed or Paul was no longer visible. During training, the robot also asks the user if it should resume the original task after executing the contingency action.

Special utterances can be used to indicate that the task should be exited. "Exit task" and "The task has failed" create task items that, when executed, will terminate the task, the first with a success flag and the latter with a failure flag. (The task exits with a success flag by default even when "Exit task" is not said.) This is particularly useful when tasks are used in a todolist, where the return status indicates whether a todolist item should be reattempted or not.

As can be seen, this approach to task training places more of the design burden on the user than some of the PBD techniques mentioned in Section II, but it comes with the added benefit of increased mutual task understanding between the user and robot and consequently more predictable execution behavior. Also, tasks cannot be overfit to training set conditions because task flow depends on explicitly specified features. Moreover, the natural interaction framework allows for quick and easy construction of tasks.

Figure 4 shows a simple schematic for this task-building behavior, where we can see the speech input being processed by the Speech Parser. Therein, we first check if the utterance is a special command, such as those shown in Table II. If it is not, then we check if it is a flow control command and add nodes or update pointers in the task under construction as appropriate. If it is not that, then we check if it corresponds to an activity that already exists in the Activity Repertoire. If so, then we add a task item containing the activity to the task under construction. Finally, if the user has ended the task training sequence, the robot engages the human in a verification dialog to confirm the task description by reading it back to the human and to acquire any additional information that might be necessary, such as what to do when an if condition does not hold and the otherwise case was not specified, before saving the task to the Activity Repertoire. The command object passed out of the Speech Parser contains speech output commands as well as any motor commands set by the Follow behavior.

Fig. 4. Task builder
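The dispatch order just described might be summarized by the following Python sketch. The function and parameter names (handle_training_utterance, is_special, is_flow_control, and so on) are hypothetical stand-ins for whatever the Speech Parser actually uses, and the verification dialog at the end of training is omitted.

    # Sketch of the Speech Parser dispatch order during task training (names are ours).
    def handle_training_utterance(utterance, task, repertoire,
                                  is_special, handle_special,
                                  is_flow_control, apply_flow_control):
        # 1. Special commands (Table II) are answered immediately.
        if is_special(utterance):
            return handle_special(utterance, task)
        # 2. Flow-control keyphrases (Table I) add nodes or update pointers.
        if is_flow_control(utterance):
            return apply_flow_control(utterance, task)
        # 3. Known activities become new task items appended to the task graph.
        activity = repertoire.resolve(utterance)
        if activity is not None:
            return task.append_item(activity)
        # 4. Anything else: offer to train the named task "on the fly".
        return "unrecognized"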
The constructs in Figure 3 can be combined and nested to create activities that richly capture task flow. It can also be seen that activities can become arbitrarily complex. While our task training approach is well-suited for composing complex tasks from simpler subtasks, the robot can provide descriptive feedback and verify with the user the flow of the trained task to minimize errors during a long and potentially confusing training sequence. Table II lists some phrases that can be understood by the robot to aid the user during the training process.

TABLE II
TRAINING HELPER FUNCTIONS
(T = Training mode, E = Execution mode)

"Describe T": (T, E) Describes the task T.
"What did you say?" / "Can you repeat that?": (T, E) Repeats the last thing it said.
"Where was I?": (T) Repeats the last two task items in the current task.
Unrecognized/misheard utterance: (T) Asks if the user was referring to the name of a new task to train, and starts a new task training process if this is so. (E) The robot says that it did not understand and asks the user to repeat himself.

Note that, in training mode, if the user says a phrase that is unrecognized, the robot will give the user the option of training a new task under the assumption that he may have been referring to a task that has not yet been created. In this way, the user can follow an ad-hoc, top-down approach and train tasks "on the fly" as they are required, without needing to plan out all the required low-level subtasks ahead of time.

The user can say "Thank you" to simply end the training process, and the learned task is saved to the Activity Repertoire as is. Or, by asking "Is that understood?", the robot will dictate the task description and await confirmation from the user. If the task is correct, the robot then attempts to clarify ambiguities. Currently this involves asking the user for instructions for unspecified "otherwise" cases. If the task is incorrect, the task training procedure is restarted.

B. Training Todolists

The todolist training behavior is invoked with the keyphrase "Let's make a todolist called L", where L is the name of the todolist to be trained. The user then simply lists the activities that are to be added. When finished, the user says "Thank you" to end training or asks "Is that understood?" to have the robot repeat the todolist items dictated. If the user confirms that the todolist is correct, the robot then asks the user for the number of times it should attempt to repeat failed tasks.

Unlike tasks, todolists are learned through dictation only, since todolist items themselves are typically high-level actions that can be trained as tasks.

VI. SYSTEM IMPLEMENTATION

The task training procedure was evaluated on our CMAssist¹ robot, pictured in Figure 5, that was expressly developed as our research platform for human-robot interaction. An earlier version of this task training work was demonstrated at the RoboCup@Home competition in June 2006, where our team placed 2nd out of 11 teams.

¹ http://www.cs.cmu.edu/∼coral/cmassist/

Fig. 5. The CMAssist robot interacts with a user. (Hardware labeled in the figure: omnicamera, speaker, stereo cameras, two computers.)

The robot has modular hardware and software architectures to enable rapid prototyping and integration of new sensory, actuator, and computational components. An omnidirectional camera and stereo camera allow it to sense the presence of people wearing color-coded shirts, while the stereo camera and laser range finder together are used for navigation and obstacle avoidance. The robot can also recognize a subset of natural English language speech and speak through a Text-To-Speech (TTS) engine. These capabilities equip the robot with sufficient spatial and environmental information to execute our interactive training algorithm.

A list of relevant behaviors used by our robot is as follows:

• Goto(x,y)/Goto(name): Drives the robot to a location specified either by global coordinates or a location label.
• Say(s)/Ask(s,p): Generates speech output from the TTS engine. Say(s) causes the robot to speak an utterance s. Ask(s,p) requires that the robot identify and speak the utterance to a particular person p if present and wait for a response.
• Follow(p): Causes the robot to follow person p while maintaining a fixed distance of approximately 1 m.
• StateChecker(a): Unique in that the useful output is the status flag rather than the command object, which is not modified at all. Uses sensors to calculate a status flag indicating whether or not an assertion a is true or false. Used by task items containing conditional statements such as if and while nodes.
• TaskTrain(f): Invokes the task training procedure. The Follow behavior is simultaneously executed if f is true, causing the robot to follow the teacher and learn the task based on both the spoken utterances and the locations of the teacher.
• TodolistTrain(): Invokes the todolist training procedure.
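To illustrate how a primitive behavior of this kind can conform to the common activity interface of Section III, a StateChecker-like behavior is sketched below. This is our own simplified rendering, assuming a dictionary-like sensors object with a hypothetical "people" field; it is not the robot's actual code.

    # Hypothetical StateChecker-style behavior under the common activity interface.
    RUNNING, SUCCESS, FAILURE = 0, 1, 2

    class StateChecker:
        # Reports whether an assertion about the world currently holds.
        # Unlike most behaviors, its useful output is the status flag;
        # the command object is passed through unmodified.
        def __init__(self, assertion):
            self.assertion = assertion           # e.g. a predicate over sensor data

        def step(self, sensors, command):
            status = SUCCESS if self.assertion(sensors) else FAILURE
            return status, command               # command is not modified

    # Usage: an if or while node could wrap a person-detection assertion.
    kevin_is_here = StateChecker(lambda sensors: "Kevin" in sensors.get("people", []))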
In order for the locations in the environment to be semantically meaningful as part of the training process, a map of the environment is provided to the robot which contains semantic information in the form of location labels.
For instance, the locations of named objects such as couch, table, and television can be added to the map, as well as general locations of rooms such as lab or living room. This a priori information is used to ground locations that are either mentioned in the human's speech or are visited as the human walks about the environment.

VII. EXPERIMENTAL RESULTS

The robot was trained to conduct a series of tasks that highlight the expanded capability of this task training framework. We focus on capabilities not already described in [11]. The first task is a security activity called Patrol the lab. This example illustrates both conditional branching and looping, makes use of an interrupt event, and demonstrates the training of tasks "on the fly".

The transcripts for the training procedure are shown below. First, a task called Sound the alert (Figure 6) is trained, which is then used as the contingency action triggered when someone is detected by the robot in the Patrol the lab task (Figure 7).

For brevity, the "ok" feedback from the robot after every user utterance is omitted. Quoted phrases are those uttered by the user while phrases in <> are those uttered by the robot. Unquoted phrases describe what is physically happening in the scene. The numbers on the left in the transcripts are simply timestamps that denote where actions were executed on the robot's path depicted in the corresponding scenario visualizations. Figure 8 shows the visualization for the training of Patrol the lab.

A. Patrol the lab

The training process for the Patrol the lab task is initiated in dialog-and-observation mode. Kevin says "Drive around the lab", which is not understood by the robot, so it begins a new training process. Kevin then proceeds to teach it how to drive around the lab by leading it to different locations and saying "come here", which the robot automatically converts into a Go to X command, where X is the symbolic label for the current location of the human as determined from a given map. Kevin then asks "Is that understood?" to begin a verification process, after which he resumes training the original Patrol the lab task. Having forgotten his place in the task, he asks "Where was I?" and the robot reports the last two task items that were added. Finally, Kevin adds a contingency action, Sound the alert, in the event that the robot sees someone.

    "When I say sound the alert"
    "If you see Kevin"
    "say hi Kevin"
    "Otherwise"
    "say intruder alert"
    "Thank you"
    <Task training is complete. Thank you.>

Fig. 6. Training transcript for the Sound the alert task

    0   "Let me show you what to do when I say patrol the lab"
        <Ok, let's begin>
    6   "While Kevin is not here"
        "Drive around the lab"
        <I don't understand. Did you want to teach me Drive around the lab?>
        "Yes"
        <Ok, please begin teaching me Drive around the lab>
    41  Kevin walks to the door
        "Come here"
    63  Kevin walks to the QRIOs
        "Come here"
    81  Kevin walks to the AIBOs
        "Come here"
        "Is that understood?"
        <Yes. You said go to the door. Go to the QRIOs. Go to the AIBOs. Is this correct?>
        "Yes"
        <Thank you. Let's continue training patrol the lab>
        "Where was I?"
        <You previously said while Kevin is not here, drive around the lab. What shall I do next?>
        "Meanwhile if you see someone"
        "sound the alert"
        <Ok. When I'm finished with that shall I resume patrol the lab?>
        "Yes"

Fig. 7. Training transcript for the Patrol the lab task

Fig. 8. Patrol the lab task training. Door, QRIOs, and AIBOs denote map locations.

By saying "Patrol the lab", the task is invoked. Figure 9 shows the transcript of what happened, and the scene is depicted in Figure 10. The robot first enters the while loop after determining that Kevin is not there (the robot cannot see people when an obstacle obstructs its view) and travels from the door, to the QRIOs, to the AIBOs, and back to the door again (Figure 10(a)). As the robot travels from the QRIOs to the AIBOs, Kevin walks in, triggering the interrupt condition "meanwhile if you see someone". The robot stops and executes Sound the alert, causing the robot to say <Hi Kevin> (Figure 10(b)). Kevin then leaves and the robot continues with the patrol task and travels to the AIBOs. As it travels towards the door, Paul steps into view, triggering the interrupt event once more. Paul is considered to be an intruder (by virtue of not being Kevin) and the robot says <Intruder alert> (Figure 10(c)). Paul then leaves and the robot again continues with the patrol task (Figure 10(d)).
    0   "Patrol the lab"
        Goto door
    13  Goto QRIOs
    35  Goto AIBOs
    52  Goto door
    79  Goto QRIOs
    103 Goto AIBOs
    114 Kevin walks in. Robot stops.
        <Hi Kevin>
        Kevin leaves
    122 Goto door
    136 Paul walks in. Robot stops.
        <Intruder alert>
        Paul leaves
    152 Goto QRIOs
        Continue drive around the lab task

Fig. 9. Execution transcript for the Patrol the lab task

Fig. 10. Patrol the lab task execution: (a) Drive around lab; (b) Interrupt (Kevin); (c) Interrupt (Intruder); (d) Drive around lab (cont'd).

    "Let me show you what to do when I say give the QRIO tour"
    <Ok, let's begin>
    Kevin goes to the QRIOs
    "Come here"
    "Say these are the QRIOs"
    "If Kevin is there"
    "ask Kevin to talk about the QRIOs"
    "Otherwise"
    "say unfortunately Kevin is not around to talk about the QRIOs"
    "The task has failed"
    "before"
    "say that concludes the QRIOs tour"

Fig. 11. Training transcript for the Give the QRIO tour task

    "Let's make a todolist called talk about the robots"
    <Ok, let's begin>
    "Give the QRIO tour"
    "Give the AIBO tour"
    "Is that understood?"
    <Yes, you said give the QRIO tour. Give the AIBO tour. Is this correct?>
    "Yes"
    <How many times shall I attempt each item before giving up?>
    "Two"
    <Ok. Todolist training is complete. Thank you.>

Fig. 12. Training transcript for the Talk about the robots task

    0   "Let me show you what to do when I say give the lab tour"
        <Ok, let's begin>
    16  Kevin goes to the door
        "Come here"
    38  "Say welcome to the multi-robot lab"
        "Talk about the robots"
    50  Kevin goes to the lab center
        "Come here"
        "Say thank you for visiting"

Fig. 13. Training transcript for the Give the lab tour task

Fig. 14. Give the lab tour task training

B. Give the lab tour

The second task is Give the lab tour, where the robot acts as a tour guide by taking the visitors to each exhibit and then asking a human expert to talk about it in more detail. The task involves greeting visitors at the door, executing its subtasks Give the QRIO tour and Give the AIBO tour, and then saying goodbye at the center of the lab. The task is similar to the one described in [11] but includes two main differences: (1) the phrase "The task has failed" is used in the subtasks and (2) the subtasks are contained in a todolist. This enables the robot to recognize when a task on the todolist has failed and to reattempt it at a later time. The transcript for the training of the task Give the QRIO tour is shown in Figure 11. The task Give the AIBO tour is trained similarly except that it invokes the help of Paul, instead of Kevin, to talk about the AIBOs. The training transcripts for the todolist Talk about the robots and the top-level task Give the lab tour are shown in Figures 12 and 13, respectively. The visualization of the Give the lab tour training is shown in Figure 14.

The execution of the Give the lab tour task is transcribed in Figure 15 and visualized in Figure 16. After greeting the visitors at the door, the robot begins the Talk about the robots todolist and goes first to the QRIOs, where it finds that Kevin is not there and consequently the Give the QRIO tour task fails (Figure 16(a)). It then travels to the AIBOs, where it sees Paul and requests that he speak about them (Figure 16(b)). When done, Paul tells the robot he is finished and the robot
decides to retry the failed items in the todolist, i.e., Give the QRIO tour. This time Kevin is there and the task is completed successfully (Figure 16(c)). The tour is completed when the robot goes to the lab center and thanks the visitors for coming (Figure 16(d)).

    0   "Give the lab tour"
        Goto the door
        <Welcome to the multi-robot lab>
    11  Goto the QRIOs
        <These are the QRIOs. Unfortunately, Kevin is not around to talk about them.>
    33  Goto the AIBOs
    49  <These are the AIBOs. Paul, could you please talk about the AIBOs?>
        Paul talks about the AIBOs and then tells the robot that he is finished.
        Goto the QRIOs
    78  <These are the QRIOs. Kevin, could you please talk about the QRIOs?>
        Kevin talks about the QRIOs and then tells the robot that he is finished.
        Goto the lab center
    106 <Thank you for visiting>

Fig. 15. Execution transcript for the Give the lab tour task

Fig. 16. Give the lab tour task execution: (a) Give QRIO tour failed (no Kevin); (b) Giving AIBO tour; (c) Giving QRIO tour (Kevin is there now); (d) "Thank you for visiting".

VIII. SUMMARY AND FUTURE WORK

We have presented an enhanced task training procedure that permits the user to easily communicate a rich set of task flow structures. Through dialog and observation of the user as he moves around, this framework allows for natural methods of conveying rigid production rules to construct these flow structures when training a task.

There still remain some avenues to explore in giving our robot system true utility as a personal assistant. Indeed, our work is complementary to much of the work described in Section II and could potentially be augmented with the trajectory learning and task generalization techniques described therein. Learning of new objects, people, and locations, which was investigated to some extent in [3], would be a capability worth integrating into our system so that tasks can be conditioned on new features. This would involve improved spatial reasoning and deictic expression comprehension, which would be useful in enhancing the dialog-and-observation mode of training. It would also be appropriate to symbolically parameterize tasks so that they are more generalized. The task Give the AIBO tour could then use the same code as the Give the QRIO tour task, only it would use and be conditioned on different feature parameters. This would not only decrease training times, but would also require fewer resources due to code sharing.

REFERENCES

[1] D. Bentivegna, C. Atkeson, and G. Cheng. Learning from observation and practice at the action generation level. In IEEE International Conference on Humanoid Robots, Karlsruhe and Munich, Germany, September/October 2003.
[2] S. Calinon and A. Billard. Incremental learning of gestures by imitation in a humanoid robot. In Proceedings of the 2007 ACM/IEEE International Conference on Human-Robot Interaction, Washington, D.C., March 2007.
[3] A. Haasch, S. Hohenner, S. Huewel, M. Kleinehagenbrock, S. Lang, I. Toptsis, G. A. Fink, J. Fritsch, B. Wrede, and G. Sagerer. Biron — the Bielefeld robot companion. In Proceedings of the International Workshop on Advances in Service Robotics, pages 27–32, Stuttgart, Germany, May 2004.
[4] S. Iba, C. J. J. Paredis, and P. K. Khosla. Interactive multi-modal robot programming. In Proceedings of the IEEE International Conference on Robotics and Automation, Washington, D.C., May 2002.
[5] S. Lauria, G. Bugmann, T. Kyriacou, and E. Klein. Mobile robot programming using natural language. Robotics and Autonomous Systems, 38(3–4):171–181, 2002.
[6] N. Lesh, C. Rich, and C. Sidner. Using plan recognition in human-computer collaboration. In Proceedings of the Seventh International Conference on User Modelling, Banff, Canada, June 1999.
[7] A. Lockerd and C. Breazeal. Tutelage and socially guided robot learning. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, September 2004.
[8] M. Nicolescu and M. Matarić. Natural methods for robot task learning: Instructive demonstration, generalization and practice. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multi-Agent Systems, Melbourne, Australia, July 2003.
[9] D. Oblinger, V. Castelli, and L. Bergman. Augmentation-based learning: combining observations and user edits for programming by demonstration. In Proceedings of the International Conference on Intelligent User Interfaces, pages 202–209, Sydney, Australia, January–February 2006.
[10] C. Rich, C. Sidner, and N. Lesh. Collagen: Applying collaborative discourse theory to human-computer interaction. AI Magazine, Special Issue on Intelligent User Interfaces, November 2001.
[11] P. E. Rybski, K. Yoon, J. Stolarz, and M. Veloso. Interactive robot task training through dialog and demonstration. In Proceedings of the 2007 ACM/IEEE International Conference on Human-Robot Interaction, Washington, D.C., March 2007.
[12] J. Saunders, C. L. Nehaniv, and K. Dautenhahn. Teaching robots by moulding behavior and scaffolding the environment. In Human-Robot Interaction, Salt Lake City, Utah, March 2006.
[13] R. M. Voyles, J. D. Morrow, and P. K. Khosla. Towards gesture-based programming: Shape from motion primordial learning of sensorimotor primitives. Robotics and Autonomous Systems, 22:361–375, November 1997.
