Visnav Lecture Notes
Visual Navigation for Flying Robots Lecture Notes Summer Term 2012
Lecturer: Dr. Jürgen Sturm Teaching Assistant: Nikolas Engelhard
http://vision.in.tum.de/teaching/ss2012/visnav2012
Acknowledgements
This slide set would not have been possible without the help and support of many other people. In particular, I would like to thank all my great colleagues who made their lecture slides available for everybody on the internet or sent them to me personally. My thanks go to (in alphabetical order) Alexander Kleiner, Andrew Davison, Andrew Zisserman, Antonio Torralba, Chad Jenkins, Cyrill Stachniss, Daniel Cremers, Giorgio Grisetti, Jan Peters, Jana Kosecka, Jörg Müller, Jürgen Hess, Kai Arras, Kai Wurm, Kurt Konolige, Li Fei-Fei, Maxim Likhachev, Margarita Chli, Nicholas Roy, Paul Newman, Richard Newcombe, Richard Szeliski, Roland Siegwart, Sebastian Thrun, Steve Seitz, Steven LaValle, Szymon Rusinkiewicz, Volker Grabe, Vijay Kumar, and Wolfram Burgard.
Table of Contents
Introduction
3D Geometry and Sensors
Probabilistic Models and State Estimation
Robot Control
Visual Motion Estimation
Simultaneous Localization and Mapping (SLAM)
Bundle Adjustment and Stereo Correspondence
Place Recognition, ICP, and Dense Reconstruction
Motion Planning
Planning under Uncertainty, Exploration and Coordination
Experimentation, Evaluation and Benchmarking
Organization
Tue 10:15-11:45: Lectures, discussions. Lecturer: Jürgen Sturm
Thu 14:15-15:45: Lab course, homework & programming exercises. Teaching assistant: Nikolas Engelhard
Course website: dates, additional material, exercises, deadlines
http://cvpr.in.tum.de/teaching/ss2012/visnav2012
Course Material
Probabilistic Robotics. Sebastian Thrun, Wolfram Burgard and Dieter Fox. MIT Press, 2005. Computer Vision: Algorithms and Applications. Richard Szeliski. Springer, 2010.
http://szeliski.org/Book/
Lecture Plan
1. Introduction 2. Robots, sensor and motion models 3. State estimation and control 4. Guest talks 5. Feature detection and matching 6. Motion estimation 7. Simultaneous localization and mapping 8. Stereo correspondence 9. 3D reconstruction 10. Navigation and path planning 11. Exploration 12. Evaluation and Benchmarking
Basics on mobile robotics
Advanced topics
Lab Course
Thu 14:15-15:45, given by Nikolas Engelhard
Exercises: room 02.09.23 (6 sessions, mandatory, homework discussion). Robot lab: room 02.09.34/36 (in weeks without exercises, in case you need help; recommended!)
Exercises Plan
Exercise sheets contain both theoretical and programming problems. 3 exercise sheets + 1 mini-project. Deadline: before the lecture (Tue 10:15). Hand in by email (visnav2012@cvpr.in.tum.de).
Safety Warning
Quadrocopters are dangerous objects. Read the manual carefully before you start. Always use the protective hull. If somebody gets injured, report to us so that we can improve the safety guidelines. If something gets damaged, report it to us so that we can fix it. NEVER TOUCH THE PROPELLERS. DO NOT TRY TO CATCH THE QUADROCOPTER WHEN IT FAILS: LET IT FALL/CRASH!
General background
Autonomous, automaton
self-willed (Greek, auto + matos)
Robot
coined by Karel Čapek in the 1923 play R.U.R. (Rossum's Universal Robots); from labor (Czech or Polish, robota) and workman (Czech or Polish, robotnik)
History
In 1966, Marvin Minsky at MIT asked his undergraduate student Gerald Jay Sussman to spend the summer linking a camera to a computer and getting the computer to describe what it saw. We now know that the problem is slightly more difficult than that. (Szeliski 2009, Computer Vision)
Roomba (2002)
Sensor: one contact sensor. Control: random movements. Over 5 million units sold.
Quadrocopters (2001-)
Flying Robots
Recently increased interest in flying robots
Shift of focus to different problems (control is much more difficult for flying robots, path planning is simpler, ...)
Quadrocopter Platforms
Commercial platforms
Ascending Technologies, Height Tech, Parrot Ardrone (used in the lab course)
Flying Principles
Fixed-wing airplanes
generate lift through forward airspeed and the shape of the wings; controlled by flaps
Helicopters/rotorcraft
main rotor for lift, tail rotor to compensate for torque; controlled by adjusting rotor pitch
Community/open-source projects
Mikrokopter, Paparazzi
For more, see http://multicopter.org/wiki/Multicopter_Table
Quadrocopter/quadrotor
four rotors generate lift; controlled by changing the speeds of rotation
Helicopter
A swash plate adjusts the pitch of the propellers cyclically, controlling pitch and roll; yaw is controlled by the tail rotor
Quadrocopter
Keep position: torques of all four rotors sum to zero; thrust compensates for earth's gravity
Maneuvers (by changing individual rotor speeds): ascend, descend, turn left, turn right, accelerate forward, accelerate backward
Autonomous Flight
Low-level control (not covered in this course): maintain attitude, stabilize, compensate for disturbances
Challenges
Limited payload: limited computational power, limited sensors, limited battery life
Fast dynamics, needs electronic stabilization; the quadrocopter is always in motion
Safety considerations
Robot Ethics
Where does the responsibility for a robot lie? How are robots motivated? Where are humans in the control loop? How might society change with robotics? Should robots be programmed to follow a code of ethics, if this is even possible?
Robot Ethics
Three Laws of Robotics (Asimov, 1942): A robot may not injure a human being or, through inaction, allow a human being to come to harm. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
Robot Design
Imagine that we want to build a robot that has to perform navigation tasks
Robot Hardware/Components
Sensors Actuators Control Unit/Software
How would you tackle this? What hardware would you choose? What software architecture would you choose?
Deliberative Paradigm (Sense-Plan-Act)
Inspired by methods from Artificial Intelligence (70s). Focus on automated reasoning and knowledge representation. STRIPS (Stanford Research Institute Problem Solver): perfect world model, closed-world assumption. Shakey: find boxes and move them to designated positions.
Reactive Paradigm
Sense-act type of organization. Multiple instances of stimulus-response loops (called behaviors). Each behavior uses local sensing to generate the next action. Combine several behaviors to solve complex tasks. Run behaviors in parallel; a behavior can override (subsume) the output of other behaviors.
Subsumption Architecture
Introduced by Rodney Brooks in 1986 Behaviors are networks of sensing and acting modules (augmented finite state machines) Modules are grouped into layers of competence Layers can subsume lower layers
[Diagram: subsumption architecture. Level 1 (Avoid): sonar sensors feed a feel-force module; force triggers runaway, which sets a heading and turns; collide triggers halt; otherwise move forward. Level 2 (Wander): adds look / heading-to-middle / stop-motion modules on top of the unchanged Level 1 network.]
Roomba Robot
Exercise: Model the behavior of a Roomba robot.
[Figure: potential field primitives: uniform, perpendicular, attractive, repulsive, tangential; combined in a sense-act loop driving the robot to the goal]
Robotic Middleware
Provides infrastructure: communication between modules, data logging facilities, tools for visualization. Several systems available:
Open source: ROS (Robot Operating System), Player/Stage, CARMEN, YARP, OROCOS. Closed source: Microsoft Robotics Studio.
[Diagram: example software architecture of an autonomous car. PERCEPTION: sensor interfaces and drivers (lasers 2-5, camera, radar, GPS), road finder (laser map, road center), localization module (GPS position, vehicle state), surface assessment (velocity limit). PLANNING & CONTROL: top-level control, global path planning, local path planning + collision avoidance, path planner (trajectory), steering control, actuator interfaces and drivers. VEHICLE INTERFACE: Touareg interface. USER INTERFACE: touch screen UI, wireless E-stop, pause/disable command. GLOBAL SERVICES: process controller (heart beats, power on/off), health monitor, data logger (data, file system), communication requests and channels, time server (clocks). All of this runs on the robot hardware.]
Communication Paradigms
Message-based communication: module A sends a message (e.g., containing variables x and y) to module B.
Forms of Communication
Push, pull, publisher/subscriber, publish to blackboard, remote procedure calls / service calls, preemptive tasks / actions
Push
Broadcast; one-way communication. Data is sent as it is generated by the producer P.
Pull
Data is delivered upon request by the consumer C (e.g., a map of the building). Useful if the consumer C controls the process and the data is not required (or available) at high frequency.
Publisher/Subscriber
The consumer C requests a subscription for the data from the producer P (e.g., a camera or GPS). The producer P sends the subscribed data as it is generated to C. Data is generated according to a trigger (e.g., sensor data, computations, other messages, ...).
Publish to Blackboard
The producer P sends data to the blackboard (e.g., a parameter server). A consumer C pulls data from the blackboard B. Only the last instance of the data is stored in the blackboard B.
Service Calls
The client C sends a request + input data to the server S. The server returns the result. The client waits for the result (synchronous communication). Also called: Remote Procedure Call.
Concepts in ROS
Nodes: programs that communicate with each other Messages: data structure (e.g., Image) Topics: typed message channels to which nodes can publish/subscribe (e.g., /camera1/image_color) Parameters: stored in a blackboard
Example: a camera_driver node publishes Image messages on a topic to which a face_detector node subscribes.
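As an illustration of the publisher/subscriber concept, here is a minimal rospy sketch (it assumes a ROS installation; everything except the topic name /camera1/image_color from the slide above is an invented example):

    # Minimal rospy subscriber sketch; node name is made up for illustration.
    import rospy
    from sensor_msgs.msg import Image

    def callback(msg):
        rospy.loginfo("received %dx%d image", msg.width, msg.height)

    rospy.init_node("face_detector")
    sub = rospy.Subscriber("/camera1/image_color", Image, callback)
    rospy.spin()  # process incoming messages until shutdown

The corresponding publisher side would create a rospy.Publisher on the same topic and call its publish() method whenever the camera driver produces a frame.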
Software Management
Package: atomic unit of building, contains one or more nodes and/or message definitions Stack: atomic unit of releasing, contains several packages with a common theme Repository: contains several stacks, typically one repository per institution
Useful Tools
roscreate-pkg, rosmake, roscore, rosnode list/info, rostopic list/echo, rosbag record/play, rosrun
Tutorials in ROS
Exercise Sheet 1
On the course website Solutions are due in 2 weeks (May 1st)
Summary
History of mobile robotics Brief intro on quadrocopters Paradigms in robotics Architectures and middleware
Exercise Sheet 1: theory part: define the motion model of a quadrocopter (will be covered next week); practical part: play back a bag file with data from the quadrocopter & plot the trajectory
Questions?
See you next week!
Organization: Lecture
Student request to change the lecture time to Tuesday afternoon due to conflicts with another course. Problem: at least 3 students who are enrolled in this lecture have time on Tuesday morning but not on Tuesday afternoon. Therefore: no change. Lectures are important; please choose which course to follow. Note: there are still students on the waiting list.
Today's Agenda
Linear algebra, 2D and 3D geometry, sensors
Vectors
Vector and its coordinates
Vector Operations
Scalar multiplication Addition/subtraction Length Normalized vector Dot product Cross product
17
Dot product: x^T y = |x| |y| cos θ; the dot product is zero if the vectors are orthogonal
Cross Product
a × b = [a]_x b, where [a]_x denotes the skew-symmetric matrix built from a. Verify that this matrix form reproduces the component-wise definition.
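To make these operations concrete, a small numpy sketch (the vectors are arbitrary examples):

    import numpy as np

    a = np.array([1.0, 0.0, 0.0])
    b = np.array([0.0, 1.0, 0.0])

    np.dot(a, b)        # dot product: 0 (vectors are orthogonal)
    np.linalg.norm(a)   # length: 1.0
    a / np.linalg.norm(a)  # normalized vector
    np.cross(a, b)      # cross product: [0, 0, 1]

    def skew(v):
        # skew-symmetric matrix [v]_x with skew(v) @ b == np.cross(v, b)
        return np.array([[0, -v[2], v[1]],
                         [v[2], 0, -v[0]],
                         [-v[1], v[0], 0]])

    assert np.allclose(skew(a) @ b, np.cross(a, b))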
Matrices
Rectangular array of numbers, with m rows and n columns
Matrices
Column vectors of a matrix
Geometric interpretation: for example, column vectors can form basis of a coordinate system
Matrices
Row vectors of a matrix
Matrices
Square matrix Diagonal matrix Upper and lower triangular matrix Symmetric matrix Skew-symmetric matrix (Semi-)positive definite matrix Invertible matrix Orthonormal matrix Matrix rank
Matrix Operations
Scalar multiplication, addition/subtraction, transposition, matrix-vector multiplication, matrix-matrix multiplication, inversion
Matrix-Vector Multiplication
Definition: Ab = b_1 a_1 + ... + b_n a_n, where the a_i are the column vectors of A
Geometric interpretation: a linear combination of the columns of A scaled by the coefficients of b → coordinate transformation
Matrix-Matrix Multiplication
Definition: C = AB with c_ij = Σ_k a_ik b_kj
Not commutative (in general): AB ≠ BA
Associative: (AB)C = A(BC)
Transpose: (AB)^T = B^T A^T
Matrix Inversion
If A is a square matrix of full rank, then there is a unique matrix B = A^{-1} such that AB = I holds. There are different ways to compute it, e.g., Gauss-Jordan elimination, LU decomposition, ... When A is orthonormal, then A^{-1} = A^T.
Geometric Primitives in 2D
2D point x = (x, y); augmented vector x̄ = (x, y, 1). Homogeneous vectors that differ only by scale represent the same 2D point; convert back to inhomogeneous coordinates by dividing through the last element.
2D line l = (a, b, c); 2D line equation: ax + by + c = 0 (i.e., x̄ · l = 0)
Normalized line equation vector l = (n_x, n_y, d) with ||(n_x, n_y)|| = 1, where d is the distance of the line to the origin.
Polar coordinates of a line: (θ, d) (e.g., used in the Hough transform for finding lines).
Line joining two points: l = x̃_1 × x̃_2. Intersection point of two lines: x̃ = l_1 × l_2.
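Both constructions come down to a single cross product, as this small numpy sketch shows (points chosen arbitrarily):

    import numpy as np

    p1 = np.array([0.0, 0.0, 1.0])   # point (0, 0) in homogeneous coords
    p2 = np.array([1.0, 1.0, 1.0])   # point (1, 1)
    l = np.cross(p1, p2)             # line through both points: y = x

    # line through (0, 1) and (1, 0): x + y - 1 = 0
    m = np.cross(np.array([0.0, 1.0, 1.0]), np.array([1.0, 0.0, 1.0]))

    x = np.cross(l, m)               # intersection point (homogeneous)
    x = x / x[2]                     # normalize: (0.5, 0.5, 1)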
Geometric Primitives in 3D
3D point (same as before): x = (x, y, z); augmented vector x̄ = (x, y, z, 1); homogeneous coordinates.
3D plane m = (a, b, c, d); 3D plane equation: ax + by + cz + d = 0; normalized plane with unit normal vector n̂ (||n̂|| = 1) and distance d.
3D line through the points p and q: infinite line: r = (1 − λ) p + λ q with λ ∈ ℝ; line segment joining p, q: λ ∈ [0, 1].
2D Planar Transformations
Translation: x' = x + t
Rotation + translation (2D rigid body motion, or 2D Euclidean transformation): x' = R x + t, where R is an orthonormal rotation matrix (R R^T = I, det R = 1); distances (and angles) are preserved
Scaled rotation/similarity transform: x' = s R x + t; angles are preserved
Affine transform: x' = A x̄ with an arbitrary 2×3 matrix A; parallel lines remain parallel
Projective/perspective transform: x̃' = H̃ x̃; note that H̃ is homogeneous (only defined up to scale), and the resulting coordinates are homogeneous and must be normalized; straight lines remain straight
2D Transformations
[Overview table: hierarchy of 2D transformations with their degrees of freedom and invariants]
3D Transformations
Translation: x' = x + t
Euclidean transform (translation + rotation; also called the Special Euclidean group SE(3)): x' = R x + t
3D Rotations
Representations: rotation matrix, Euler angles, axis/angle, unit quaternion
Rotation Matrix
Orthonormal 3×3 matrix: R R^T = I, det R = 1 (the special orthogonal group SO(3)); column vectors correspond to coordinate axes. Main disadvantage: over-parameterized (9 parameters instead of 3).
Euler Angles
Product of 3 consecutive rotations; the roll-pitch-yaw convention is very common in aerial navigation (DIN 9300). Yaw ψ, pitch θ, roll φ to rotation matrix: R = R_z(ψ) R_y(θ) R_x(φ). Advantages: minimal representation (3 parameters), easy interpretation.
Gimbal Lock
When two of the rotation axes align, one degree of freedom (DOF) is lost.
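As a concrete reference for the yaw-pitch-roll composition above, a small numpy sketch (angles in radians; the function name is mine):

    import numpy as np

    def rotation_from_euler(yaw, pitch, roll):
        # R = Rz(yaw) @ Ry(pitch) @ Rx(roll), roll-pitch-yaw convention
        cy, sy = np.cos(yaw), np.sin(yaw)
        cp, sp = np.cos(pitch), np.sin(pitch)
        cr, sr = np.cos(roll), np.sin(roll)
        Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
        Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
        Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
        return Rz @ Ry @ Rx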
Axis/Angle
Represent the rotation by a rotation axis n̂ and a rotation angle θ (4 parameters), or by the rotation vector ω = θ n̂ (3 parameters; its length is the rotation angle; also called the angular velocity when interpreted over time). Minimal but not unique (why? because (n̂, θ) and (−n̂, −θ) describe the same rotation).
Conversion
Rodrigues formula: R = I + sin θ [n̂]_x + (1 − cos θ) [n̂]_x²
Inverse: θ = arccos((tr(R) − 1) / 2); the axis follows from the skew-symmetric part of R.
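A direct numpy sketch of the Rodrigues formula and the angle recovery (function names are mine):

    import numpy as np

    def rodrigues(axis, angle):
        # R = I + sin(t) [n]_x + (1 - cos(t)) [n]_x^2, n = unit axis
        n = axis / np.linalg.norm(axis)
        K = np.array([[0, -n[2], n[1]],
                      [n[2], 0, -n[0]],
                      [-n[1], n[0], 0]])
        return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

    def angle_from_rotation(R):
        # inverse: rotation angle from the trace of R
        return np.arccos((np.trace(R) - 1.0) / 2.0)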
Exponential Twist
The exponential map can be generalized to Euclidean transformations (incl. translations): from the tangent space se(3) to the (special) Euclidean group SE(3), the group of all Euclidean transforms; the tangent element encodes the rigid body velocity (twist). Convert to homogeneous coordinates; the exponential map converts between se(3) and SE(3). There are also direct formulas (similar to Rodrigues).
Unit Quaternions
Quaternion q = (q_x, q_y, q_z, q_w); unit quaternions have ||q|| = 1. Opposite-sign quaternions (q and −q) represent the same rotation; otherwise the representation is unique.
Advantage: multiplication and inversion operations are really fast. Quaternion-quaternion multiplication: the Hamilton product.
Quaternion-vector multiplication (rotate point p with rotation q): p' = q ⊗ (0, p) ⊗ q̄, where q̄ is the conjugate (equal to the inverse for unit quaternions).
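A minimal sketch of both operations; note it uses the (w, x, y, z) ordering rather than the (x, y, z, w) ordering of the slides, and the function names are mine:

    import numpy as np

    def qmul(q, r):
        # Hamilton product of quaternions in (w, x, y, z) order
        w1, x1, y1, z1 = q
        w2, x2, y2, z2 = r
        return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                         w1*x2 + x1*w2 + y1*z2 - z1*y2,
                         w1*y2 - x1*z2 + y1*w2 + z1*x2,
                         w1*z2 + x1*y2 - y1*x2 + z1*w2])

    def rotate(q, p):
        # rotate point p by unit quaternion q: p' = q * (0, p) * conj(q)
        q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
        return qmul(qmul(q, np.concatenate([[0.0], p])), q_conj)[1:]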
3D to 2D Projections
Orthographic projections Perspective projections
3D to 2D Perspective Projection
3D point x = (X, Y, Z)^T (in the camera frame); 2D point (on the image plane). Pin-hole camera model: the point projects to (X/Z, Y/Z). Remember: the projected point is homogeneous and needs to be normalized.
Camera Intrinsics
So far, the 2D point is given in meters on the image plane; but we want the 2D point measured in pixels (as the sensor does). Need to apply a scaling/offset: multiply by the calibration matrix K (focal lengths f_x, f_y, principal point c_x, c_y).
Camera Extrinsics
Assume the 3D point is given in world coordinates; transform it from the world to the camera frame (this transformation is called the camera extrinsics) before projecting.
You really have to know 2D/3D transformations by heart (read Szeliski, Chapter 2).
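Putting the whole chain together, a minimal numpy sketch of the world-to-pixel projection (the intrinsic values are made-up examples):

    import numpy as np

    def project(X_world, R, t, K):
        # world point -> pixel: extrinsics, intrinsics, de-homogenization
        X_cam = R @ X_world + t      # world frame -> camera frame
        x = K @ X_cam                # apply intrinsics
        return x[:2] / x[2]          # normalize homogeneous coordinates

    K = np.array([[500.0, 0.0, 320.0],   # assumed example intrinsics
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    u, v = project(np.array([0.1, 0.0, 2.0]), np.eye(3), np.zeros(3), K)
    # u = 500 * 0.1 / 2 + 320 = 345, v = 240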
[Figure: example transform tree of coordinate frames: map, person, camera, rotor1, rotor2]
Sensors
Classification of Sensors
What they measure:
Proprioceptive sensors measure values internal to the system (robot); examples: battery status, motor speed, accelerations, ...
Exteroceptive sensors provide information about the environment; examples: compass, distance to objects, ...
How they measure:
Passive sensors measure energy coming from the environment.
Active sensors emit their proper energy and measure the reaction; better performance, but influence on the environment.
By category: tactile sensors (contact switches, bumpers, proximity sensors, pressure); wheel/motor sensors (potentiometers, brush/optical/magnetic/inductive/capacitive encoders, current sensors); heading sensors (compass, infrared, inclinometers, gyroscopes, accelerometers); ground-based beacons (GPS, optical or RF beacons, reflective beacons); active ranging (ultrasonic sensor, laser rangefinder, optical triangulation, structured light); motion/speed sensors (Doppler radar, Doppler sound); vision-based sensors (CCD/CMOS cameras, visual servoing packages, object tracking packages)
Sensors
Motor/wheel encoders, compass, gyroscope, accelerometers, GPS, range sensors, cameras
Motor/wheel encoders
Device for measuring angular motion; often used in (wheeled) robots. Output: position, speed (possibly integrate speed to get odometry).
Working principle: regular encoders count the number of transitions but cannot tell the direction; quadrature encoders use two sensors in quadrature phase shift, and the ordering of the rising edges tells the direction. Sometimes there is a reference pulse (or zero switch).
Magnetic Compass
Measures the earth's magnetic field. Inclination angle approx. 60° (Germany). Does not work indoors / is affected by metal. Alternative: gyro compass (spinning wheel that aligns with the earth's rotational poles; used on ships).
Sensing principle: Hall sensor; construction: 3 orthogonal sensors.
Magnetic Declination
Angle between magnetic north and true north; varies over time. Good news ;-): by 2050, the magnetic declination in central Europe will be zero.
Mechanical Gyroscope
Measures orientation (standard gyro) or angular velocity (rate gyro, needs integration to obtain an angle). A spinning wheel is mounted in a gimbal device (can move freely in 3 dimensions); the wheel keeps its orientation due to angular momentum (standard gyro).
Modern Gyroscopes
Vibrating structure gyroscopes (MEMS): based on the Coriolis effect; the vibration keeps its direction under rotation. Implementations: tuning fork, vibrating wheels, ...
Accelerometer
Measures all external forces acting upon it (including gravity); acts like a spring-damper system. To obtain the inertial acceleration (due to motion alone), gravity must be subtracted.
MEMS Accelerometers
Micro Electro-Mechanical Systems (MEMS): spring-like structure with a proof mass; damping results from residual gas. Implementations: capacitive, piezoelectric, ...
GPS
24+ satellites on 12-hour orbits at 20,190 km altitude; 6 orbital planes, 4+ satellites per orbit, 60° apart
Position from pseudoranges: requires measurements from 4 different satellites; low accuracy (3-15 m), but absolute
Each satellite transmits its orbital location + time at 50 bit/s; a message frame has 1,500 bits, and the complete navigation message takes 12.5 minutes
Range Sensors
Sonar, laser range finder
Emit a signal and determine the distance along a ray, making use of the propagation speed of ultrasound/light; the traveled distance is given by d = c · t / 2 (the signal travels to the obstacle and back). Sound speed: 340 m/s; light speed: 300,000 km/s.
Laser Scanner
Measures the phase shift of the returned signal. Pro: high precision, wide field of view, safety-approved for collision detection. Con: relatively expensive + heavy. Variants: 2D scanners, 3D scanners.
Camera
Let's design a camera. Idea 1: put a piece of film in front of an object. Do we get a reasonable image?
Add a barrier to block off most of the rays. This reduces blurring; the opening is known as the aperture. How does this transform the image?
Camera Lens
A lens focuses light onto the film. Rays passing through the optical center are not deviated; all rays parallel to the optical axis converge at the focal point.
There is a specific distance at which objects are in focus; other points project to a blur circle in the image.
Lens Distortions
Radial distortion of the image: caused by imperfect lenses; deviations are most noticeable for rays that pass through the edge of the lens.
Digital Cameras
Vignetting, de-bayering, rolling shutter and motion blur, compression (JPG), noise
Exercise Sheet 1
The odometry sensor on the Ardrone is an integrated package. Sensors: down-looking camera to estimate motion, ultrasonic sensor to get the height, 3-axis gyroscope, 3-axis accelerometer. IMU readings: horizontal speed (vx/vy), height (z), roll, pitch, yaw.
Summary
Linear algebra, 2D/3D geometry, sensors
Organization
Next week: three scientific guest talks presenting recent research results from our group (2011/12)
[Diagram: research results are first published as conference papers (ICRA, IROS, CVPR, ICCV, NIPS, ...) and later consolidated into journal articles]
Guest Talks
An Evaluation of the RGB-D SLAM System (F. Endres, J. Hess, N. Engelhard, J. Sturm, D. Cremers, W. Burgard), in Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA), 2012.
Real-Time Visual Odometry from Dense RGB-D Images (F. Steinbruecker, J. Sturm, D. Cremers), in Workshop on Live Dense Reconstruction with Moving Cameras at the Intl. Conf. on Computer Vision (ICCV), 2011.
Camera-Based Navigation of a Low-Cost Quadrocopter (J. Engel, J. Sturm, D. Cremers), submitted to the International Conference on Intelligent Robots and Systems (IROS), under review.
Perception
Perception and models are strongly linked.
Example: human perception (optical illusions); more at http://michaelbach.de/ot/index.html
State Estimation
Cannot observe world state directly Need to estimate the world state Robot maintains belief about world state Update belief according to observations and actions using models Sensor observations + sensor model Executed actions + action/motion model
State Estimation
What parts of the world state are (most) relevant for a flying robot? Position, velocity, obstacles, the map, positions and intentions of other robots/humans.
[Diagram: the sense-plan-act loop: perception → plan → execution → acting, interacting with the physical world]
Probabilistic Robotics
Sensor observations are noisy, partial, potentially missing (why?). All models are partially wrong and incomplete (why?). Usually we have prior knowledge (why?).
[Figure: graphical model relating the previous state, the current state, actions, and observations]
Probabilistic Robotics
Probabilistic sensor and motion models; integrate information from multiple sensors (multi-modal); integrate information over time (filtering).
Example
Discrete case: the probabilities of all states sum to one, Σ_x p(x) = 1; continuous case: the probability density integrates to one, ∫ p(x) dx = 1.
Conditional Independence
Definition of conditional independence: p(x, y | z) = p(x | z) p(y | z); equivalent to p(x | z) = p(x | y, z) and p(y | z) = p(y | x, z).
Marginalization
Discrete case: p(x) = Σ_y p(x, y)
Example: Marginalization
Continuous case: p(x) = ∫ p(x, y) dy
Expectation
The expected value E[X] = Σ_x x p(x) is the weighted average of all values a random variable can take on. Expectation is a linear operator: E[aX + b] = a E[X] + b.
Bayes Formula
p(x | y) = p(y | x) p(x) / p(y)
Normalization
Direct computation of p(y) can be difficult. Idea: compute the improper (unnormalized) distribution, normalize afterwards.
Step 1: compute the unnormalized posterior p(y | x) p(x) for every x
Step 2: compute the normalizer η = 1 / Σ_x p(y | x) p(x) (law of total probability)
Step 3: the posterior is p(x | y) = η p(y | x) p(x)
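A tiny numerical sketch of these three steps (the prior and likelihood values are made-up examples, not from the lecture):

    import numpy as np

    prior = np.array([0.2, 0.8])        # p(x): above landing zone / not
    likelihood = np.array([0.9, 0.3])   # p(z | x) for the observed z

    unnormalized = likelihood * prior   # step 1: improper distribution
    eta = 1.0 / unnormalized.sum()      # step 2: normalizer
    posterior = eta * unnormalized      # step 3: p(x | z), sums to 1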
Prior on the world state: p(x = above landing zone). Assume the robot observes light, i.e., z = light. What is the probability that the robot is above the landing zone? Apply the Bayes formula with the sensor model p(z | x).
Combining Evidence
Suppose our robot obtains another observation (either from the same or a different sensor). How can we integrate this new information? More generally, how can we estimate p(x | z_1, ..., z_n)?
Bayes formula gives us: p(x | z_1, ..., z_n) = η p(z_n | x, z_1, ..., z_{n−1}) p(x | z_1, ..., z_{n−1}).
If we know that z_n is conditionally independent of the earlier observations given x, this simplifies to the recursive update p(x | z_1, ..., z_n) = η p(z_n | x) p(x | z_1, ..., z_{n−1}).
In the example, the second observation lowers the probability that the robot is above the landing zone!
Actions (Motions)
Often the world is dynamic, since actions carried out by the robot, actions carried out by other agents, or just time passing by change the world.
Typical Actions
The quadrocopter accelerates by changing the speed of its motors; the position also changes when the quadrocopter does nothing (and drifts...). Actions are never carried out with absolute certainty. In contrast to measurements, actions generally increase the uncertainty of the state estimate.
Action Models
To incorporate the outcome of an action u into the current state estimate (belief), we use the conditional pdf p(x' | u, x). This term specifies the probability that executing the action u in state x will lead to state x'.
Example: Take-Off
Action u = take-off; world state x ∈ {ground, air}
[State transition diagram: from ground, take-off leads to air with probability 0.99 and fails (stays on the ground) with probability 0.01; from air, the robot stays in the air with probability 0.9 and ends up on the ground with probability 0.1]
Example: Take-Off
Prior belief on the robot state: P(x = ground) = 1 (the robot is located on the ground). The robot executes the take-off action. What is the robot's belief after one time step? Apply the action model: P(x') = Σ_x p(x' | u, x) P(x) (continuous case: integral instead of sum).
Markov Chain
A Markov chain is a stochastic process where, given the present state, the past and the future states are independent.
Markov Assumption
Observations depend only on the current state; the current state depends only on the previous state and the current action. Underlying assumptions: static world, independent noise, perfect model, no approximation errors.
Bayes Filter
Given: a stream of observations z and actions u, the sensor model p(z | x), the action model p(x' | u, x), and the prior probability of the system state p(x).
Wanted: an estimate of the state x of the dynamic system. The posterior of the state is also called the belief: Bel(x_t) = p(x_t | u_1, z_1, ..., u_t, z_t).
For each time step, do:
1. Apply the motion model: Bel'(x_t) = Σ_{x_{t−1}} p(x_t | u_t, x_{t−1}) Bel(x_{t−1})
2. Apply the sensor model: Bel(x_t) = η p(z_t | x_t) Bel'(x_t)
Note: Bayes filters also work on continuous state spaces (replace the sum by an integral).
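A minimal histogram (discrete) Bayes filter sketch for a 1D corridor; the motion and sensor model values are made-up examples in the spirit of the localization example below, not the lecture's exact numbers:

    import numpy as np

    def bayes_filter_step(belief, motion_kernel, likelihood):
        # 1. motion model: convolve belief with p(x_t | u_t, x_{t-1});
        #    with np.convolve, the last kernel entry shifts mass one cell east
        predicted = np.convolve(belief, motion_kernel, mode="same")
        # 2. sensor model: multiply by p(z_t | x_t) and normalize
        posterior = likelihood * predicted
        return posterior / posterior.sum()

    belief = np.zeros(10); belief[0] = 1.0         # known start cell
    motion = np.array([0.1, 0.3, 0.6])             # p(move -1), p(stay), p(move +1)
    z_likelihood = np.full(10, 0.1); z_likelihood[1] = 0.6  # marker at cell 1
    belief = bayes_filter_step(belief, motion, z_likelihood)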
Example: Localization
Discrete state; the belief distribution can be represented as a grid. This is also called a histogram filter.
Action: the robot can move one cell in each time step; actions are not perfectly executed. Example: move east (60% success rate, 10% each to stay / move too far / move one up / move one down).
Observation: one (special) location has a marker; the marker is sometimes also detected in the neighboring cells.
Let's start a simulation run (shades are hand-drawn, not exact!).
t=0: prior distribution (initial belief); assume we know the initial location (if not, we could initialize with a uniform prior).
t=1, u=east, z=no-marker: Bayes filter step 1: apply the motion model; step 2: apply the observation model.
t=2, u=east, z=marker: step 1: apply the motion model; step 2: apply the observation model.
Kalman Filter
Bayes filter with continuous states; the state is represented with a normal distribution. Developed in the late 1950s. The Kalman filter is very efficient (only requires a few matrix operations per time step). Applications range from economics, weather forecasting, and satellite navigation to robotics and many more. It is the most relevant Bayes filter variant in practice → exercise sheet 2.
Normal Distribution
Univariate: N(x; μ, σ²) = (1 / √(2π σ²)) exp(−(x − μ)² / (2σ²))
Multivariate: N(x; μ, Σ) = det(2π Σ)^{−1/2} exp(−(x − μ)^T Σ^{−1} (x − μ) / 2)
Linear Process Model
Assume the state evolves according to a linear difference equation x_t = A x_{t−1} + B u_t + ε_t, with zero-mean, normally distributed process noise ε_t ~ N(0, Q).
Linear Observations
Further, assume we make observations that depend linearly on the state and that are perturbed by zero-mean, normally distributed observation noise: z_t = C x_t + δ_t, with δ_t ~ N(0, R).
Kalman Filter
Estimates the state of a discrete-time controlled process that is governed by the linear stochastic difference equation x_t = A x_{t−1} + B u_t + ε_t, with the measurement equation z_t = C x_t + δ_t, where ε_t ~ N(0, Q) and δ_t ~ N(0, R). The initial belief is Gaussian: Bel(x_0) = N(x_0; μ_0, Σ_0).
For each time step, do:
1. Apply the motion model (prediction): μ̄_t = A μ_{t−1} + B u_t, Σ̄_t = A Σ_{t−1} A^T + Q
2. Apply the sensor model (correction): μ_t = μ̄_t + K_t (z_t − C μ̄_t), Σ_t = (I − K_t C) Σ̄_t, with the Kalman gain K_t = Σ̄_t C^T (C Σ̄_t C^T + R)^{−1}
For the interested reader: see Probabilistic Robotics (Chapter 3) for the full derivation.
Highly efficient: polynomial in the measurement dimensionality k and the state dimensionality n. Optimal for linear Gaussian systems! But most robotics systems are nonlinear!
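The two steps translate almost literally into code; a minimal sketch of one filter cycle with the linear models defined above:

    import numpy as np

    def kalman_step(mu, Sigma, u, z, A, B, C, Q, R):
        # 1. prediction (motion model)
        mu_bar = A @ mu + B @ u
        Sigma_bar = A @ Sigma @ A.T + Q
        # 2. correction (sensor model)
        K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + R)
        mu_new = mu_bar + K @ (z - C @ mu_bar)
        Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_bar
        return mu_new, Sigma_new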
Taylor Expansion
Solution for nonlinear systems: linearize both functions. Motion function: g(u_t, x_{t−1}) ≈ g(u_t, μ_{t−1}) + G_t (x_{t−1} − μ_{t−1}), with the Jacobian G_t. Observation function: h(x_t) ≈ h(μ̄_t) + H_t (x_t − μ̄_t), with the Jacobian H_t.
Example
2D case: state x = (x, y, ψ); odometry as control input; observations of a visual marker (relative to the robot pose). Motion function g and its derivative; observation function h (→ Sheet 2).
Example
Dead reckoning (no observations): large process noise in x+y; even larger drift with process noise in x+y+yaw.
Now with observations (limited visibility), assuming the robot knows its correct starting pose. What if the initial pose (x+y) is wrong? What if the initial pose (x+y+yaw) is wrong? If we are aware of a bad initial guess, we set the initial sigma to a large value (large uncertainty).
Summary
Observations and actions are inherently noisy; knowledge about the state is inherently uncertain. Probability theory; probabilistic sensor and motion models; Bayes filter, histogram filter, Kalman filter; examples.
Organization - Exam
Oral exams in teams (2-3 students); at least 15 minutes per student, individual grades. Questions will address: material from the lecture, material from the exercise sheets, your mini-project.
Control Architecture
[Diagram: cascaded loops from trajectory via position control, attitude control, and motor speed control to the actuators; localization, attitude estimation, and RPM estimation provide the feedback; the robot interacts with the physical world through its kinematics and dynamics]
DC Motors
Maybe you built one in school: a stationary permanent magnet, an electromagnet induces torque, and a split ring switches the direction of the current.
Brushless Motors
Used in most quadrocopters. Permanent magnets on the axis, electromagnets on the outside. Requires a motor controller to switch the currents. Does not require brushes (less maintenance).
I2C Protocol
Serial data line (SDA) + serial clock line (SCL); all devices connected in parallel; 7-10 bit addresses, 100-3400 kbit/s; used by the Mikrokopter for motor control.
Control Architecture
[Diagram: the controllers command forces and torques to the actuators; sensors observe position, velocity, and acceleration of the robot in the physical world]
Dynamics
Actuators induce forces and torques; forces induce linear acceleration (a = F/m); torques induce angular acceleration. What types of forces do you know? What types of torques do you know?
Example: 1D Kinematics
State x = (p, v); action u (acceleration); process model: p' = p + v Δt, v' = v + u Δt → Kalman filter. How many states do we need for 3D?
Forces
Gravity; friction: stiction (static friction) and damping (viscous friction).
Torques
Torque (Drehmoment): τ = r × F. Torques sum up. A torque results in angular acceleration: τ = I ω̇ (with I the moment of inertia). Friction: same as before.
Dynamics of a Quadrocopter
Each propeller induces a force and a torque by accelerating air; gravity pulls the quadrocopter downwards.
Vertical Acceleration
The total thrust is the sum of the four rotor forces: m z̈ = F_1 + F_2 + F_3 + F_4 − m g
Attitude (Roll/Pitch)
A thrust difference between opposite rotors induces a torque around the corresponding horizontal body axis and thus changes the attitude.
Yaw
Each propeller also induces a torque due to its rotation and the interaction with the air; unbalancing the induced torques of the two counter-rotating rotor pairs yields an angular acceleration around the vertical axis.
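A minimal sketch of how a desired total thrust and the three body torques map to the four squared rotor speeds on a '+'-configured quadrocopter; the thrust/drag coefficients, arm length, and rotor layout are assumed placeholders, not values from the lecture:

    import numpy as np

    # '+' configuration: rotors 0/2 on the x-axis, 1/3 on the y-axis;
    # each rotor i produces force f_i = k_f * w_i^2 and a yaw torque
    # +-k_m * w_i^2 whose sign depends on its spin direction.
    k_f, k_m, l = 1e-5, 1e-7, 0.25   # assumed coefficients and arm length

    M = np.array([
        [ k_f,    k_f,    k_f,    k_f  ],  # total thrust
        [ 0,      l*k_f,  0,     -l*k_f],  # roll torque
        [-l*k_f,  0,      l*k_f,  0    ],  # pitch torque
        [ k_m,   -k_m,    k_m,   -k_m  ],  # yaw torque (alternating spin)
    ])

    def rotor_speeds(thrust, tau):
        # squared rotor speeds realizing the desired thrust and torques
        w_sq = np.linalg.solve(M, np.concatenate([[thrust], tau]))
        return np.sqrt(np.clip(w_sq, 0.0, None))  # speeds cannot be negative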
Cascaded Control
[Diagram: trajectory → position control → attitude control → motor speed control → actuators; localization, attitude estimation, and RPM estimation close the loops at the level of position, velocity, acceleration, forces, and torques; the robot interacts with the physical world through its kinematics and dynamics]
Example: Temperature Control
[Figure: a controller drives the measured temperature (starting at 25°) towards the desired value of 35°; a biased sensor (reading 45° instead of 35°) makes the loop settle at the wrong temperature]
Measurement Noise
What effect does noise in the measurements have on the control output?
Saturation
In practice, the set of admissible controls u is often bounded; this is called (control) saturation.
Block Diagram
[Diagram: feedback loop: controller → plant → measurement, fed back to the controller]
Delays
In practice, most systems have delays, which can lead to overshoots/oscillations/destabilization.
Delays
What is the total dead time of this system? E.g., a 100 ms delay in the water pipe, plus the delays of the controller, the plant, and the measurement.
Smith Predictor
Allows for higher gains; requires an (accurate) model of the plant.
[Diagram: the controller runs against a delay-free plant model plus a delay model, in parallel to the real plant with delay]
If the plant model is available and the delay (e.g., 5 seconds) is known exactly, this results in perfect compensation. Why is this unrealistic in practice? The time delay (and the plant model) is often not known accurately (or changes over time). What happens if the time delay is overestimated? What happens if it is underestimated?
Position Control
[Diagram: the position controller receives the next waypoint and the localization estimate and commands forces/torques to the actuators]
P Control
What happens for the control law u = K_p (x_des − x)? This is called proportional control.
PD Control
What happens for the control law u = K_p e + K_d ė, with error e = x_des − x? This is called proportional-derivative control. What if we set higher gains? What if we set lower gains? What happens when we add gravity?
Gravity Compensation
Add gravity as an additional term in the control law; any known (inverse) dynamics can be included.
PD Control
What happens when we have systematic errors (noise with non-zero mean)? Example: unbalanced quadrocopter, wind, ... Does the robot ever reach its desired location?
PID Control
Idea: estimate the system error (bias) by integrating the error. Proportional + derivative + integral control: u = K_p e + K_d ė + K_i ∫ e dt. For steady-state systems, this can be reasonable; otherwise, it may create havoc or even disaster (wind-up effect).
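To make the three terms concrete, a minimal, self-contained PID sketch; the gains and the anti-windup clamp are illustrative choices, not values from the lecture:

    class PID:
        # u = Kp*e + Kd*de/dt + Ki*integral(e), with a simple clamp on the
        # integral term to limit the wind-up effect mentioned above
        def __init__(self, kp, kd, ki, i_max=1.0):
            self.kp, self.kd, self.ki = kp, kd, ki
            self.i_max = i_max
            self.e_prev, self.e_int = 0.0, 0.0

        def update(self, desired, measured, dt):
            e = desired - measured
            self.e_int = max(-self.i_max, min(self.i_max, self.e_int + e * dt))
            de = (e - self.e_prev) / dt
            self.e_prev = e
            return self.kp * e + self.kd * de + self.ki * self.e_int

Clamping the integral state is one simple guard against wind-up; another common one is to stop integrating while the output is saturated.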
De-coupled Control
So far, we considered only single-input, single-output (SISO) systems; real systems have multiple inputs + outputs: MIMO (multiple-input, multiple-output). In practice, control is often de-coupled: one controller per output, all acting on the same plant.
Example: Ardrone
Cascaded control: the inner loop runs on the embedded PC and stabilizes the flight; the outer loop runs externally and implements position control.
[Diagram: the outer loop on the laptop talks to the Ardrone wirelessly at approx. 15 Hz; the inner loop runs onboard at 1000 Hz; the Ardrone with its inner loop is seen as the plant by the outer loop]
Mechanical Equivalent
PD control is equivalent to adding spring-dampers between the desired values and the current position.
Optimal Control
What other control techniques exist? Linear-quadratic regulator (LQR), reinforcement learning, inverse reinforcement learning, ... and many more.
Find the controller that provides the best performance. This requires a measure of performance: what would be a good one? Minimize the error? Minimize the controls? A combination of both? Goal: find the controller with the lowest cost → LQR control.
Reinforcement Learning
In principle, any measure can be used: define a reward for each state-action pair and find the policy (controller) that maximizes the expected future reward. Compute the expected future reward based on a known process model or a process model learned from demonstrations.
Outlook: map a previously unknown building and find good exploration frontiers in the partial map; move in formation (e.g., to traverse a window), avoid collisions, dynamic role switching.
Guest Talks
Versatile Distributed Pose Estimation and Sensor Self-Calibration for an Autonomous MAV (Stephan Weiss, Markus W. Achtelik, Margarita Chli, Roland Siegwart): IMU + camera; EKF for pose, velocity, sensor bias, scale, and inter-sensor calibration.
On-board Velocity Estimation and Closed-loop Control of a Quadrotor UAV based on Optical Flow (Volker Grabe, Heinrich H. Bülthoff, and Paolo Robuffo Giordano): ego-motion from optical flow using the homography constraint; used for velocity control.
Autonomous Landing of a VTOL UAV on a Moving Platform Using Image-based Visual Servoing (Daewon Lee, Tyler Ryan and H. Jin Kim): tracking and landing on a moving platform; switching between tracking and landing behavior.
ICRA Papers
Will put them in our paper repository; remember the password (or ask by mail); see the course website.
Combine PTAM with the Kinect: monocular SLAM suffers from scale drift; the Kinect has a small maximum range.
Organization: Exam
Registration deadline: June 30. Course ends: July 19. Examination dates: t.b.a. (mid August). Oral team exam; sign up for a time slot starting from mid July; the list will be placed on the blackboard in front of our secretary's office.
Motivation
[Figure series: rigid body motion of the camera, illustrated with x/z coordinate axes]
3D to 2D Perspective Projection
3D point (in the camera frame), 2D point (on the image plane): pin-hole camera model. Remember that the projected point is homogeneous and needs to be normalized (recap of the earlier lecture).
Camera Intrinsics
So far, the 2D point is given in meters on the image plane; but we want the 2D point measured in pixels (as the sensor does). Need to apply a scaling/offset (the calibration matrix).
Image Plane
Pixel coordinates on the image plane.
Image Functions
We can think of an image as a function I(x, y) that gives the intensity at position (x, y). Color images are vector-valued functions. Discrete case (default in this course) vs. continuous case.
Realistically, the image function is only defined on a rectangle and has a finite range, e.g., I: [0, W] × [0, H] → [0, 255].
Example
[Figure: grids of pixel intensity values of two example image patches]
Digital Images
Light intensity is sampled by a CCD/CMOS sensor on a regular grid. The electric charge of each cell is quantized and gamma-compressed (for historical reasons; CRT monitors apply the inverse). Almost all images are gamma-compressed: double brightness results in only a 37% higher intensity value (!)
Aliasing
High frequencies in the scene and a small fill factor on the chip can lead to (visually) unpleasant effects.
Rolling Shutter
Most CMOS sensors have a rolling shutter Rows are read out sequentially Sensitive to camera and object motion Can we correct for this?
Image Filtering
We want to remove unwanted sources of variation, and keep the information relevant for whatever task we need to solve
Linear Filtering
Each output pixel is a linear combination of the input values; in matrix form: g = H f
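Written out per pixel, this is a cross-correlation of the image with a kernel; a minimal sketch (zero padding at the borders; function name is mine):

    import numpy as np

    def correlate2d(F, H):
        # each output pixel is a linear combination of input pixels,
        # weighted by the kernel H
        kh, kw = H.shape
        ph, pw = kh // 2, kw // 2
        P = np.pad(F.astype(float), ((ph, ph), (pw, pw)))  # zero padding
        G = np.zeros(F.shape, dtype=float)
        for i in range(F.shape[0]):
            for j in range(F.shape[1]):
                G[i, j] = np.sum(H * P[i:i + kh, j:j + kw])
        return G

In practice one would use an optimized routine (e.g., scipy.ndimage.correlate), but the loop makes the "linear combination" definition explicit.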
Example
[Figure: a 3×3 filter with rows (−1 −1 −1), (2 2 2), (−1 −1 −1) applied to a patch of pixel intensity values]
Important Filters
Edges: finite difference filter, derivative filter, oriented filters, Gabor filter
[Figure: the horizontal bar filter (−1 −1 −1; 2 2 2; −1 −1 −1) applied to an example patch; responses are large near horizontal structures, boundary pixels are undefined]
Impulse
[Figure: convolving an image with an impulse kernel reproduces the image; a shifted impulse translates the image (here by 2 pixels)]
Image Rotation
Image rotation is a linear operation (why?), but not a spatially invariant operation (why?). There is no convolution that implements it.
Rectangular (Box) Filter
[Figure: smoothing with rectangular filters]
Gaussian Blur
Gaussian kernel: G_σ(x, y) = (1 / (2π σ²)) exp(−(x² + y²) / (2σ²))
[Figure: blur results for σ = 1, 2, 4]
Image Gradient
The image gradient ∇I = (∂I/∂x, ∂I/∂y) points in the direction of increasing intensity (steepest ascent).
Image Gradient
Gradient direction θ = atan2(∂I/∂y, ∂I/∂x) (related to the edge orientation).
How can we differentiate a digital image ? Option 1: Reconstruct a continuous image, then take gradient Option 2: Take discrete derivative (finite difference filter) Option 3: Convolve with derived Gaussian (derivative filter)
Finite difference
First-order central difference
-.5
.5
39
40
Finite difference
First-order central difference (half pixel)
Second-order Derivative
Differentiate again to get second-order central difference
-1
-2
41
42
75
Example
[Figure: first-order derivative filters (−1, 1) applied to an example image]
Problem Statement
Given: two camera images. Goal: estimate the camera motion.
For the moment, let's assume that the camera only moves in the xy-plane, i.e., a 2D translation; the extension to 3D motion follows.
General Idea
1. Define an error metric that measures how well the two images match given a motion vector. 2. Find the motion vector with the lowest error.
A common choice is the sum of squared differences (SSD): E(u) = Σ_i (I_1(x_i + u) − I_0(x_i))²
Windowed SSD
Images (and image patches) have finite size. Standard SSD has a bias towards smaller overlaps (fewer error terms). Solution: divide by the overlap area → root mean square error.
Exposure Differences
Images might be taken with different exposure (auto shutter, white balance, ...). Bias and gain model: I_1(x + u) = α I_0(x) + β; with SSD, exposure differences then show up as errors even for the correct motion.
Cross-Correlation
Maximize the product instead of minimizing the differences: CC(u) = Σ_i I_0(x_i) I_1(x_i + u)
Gradient Descent
Perform gradient descent on the SSD energy function (Lucas and Kanade, 1981): take a Taylor expansion of the energy function around the current estimate and solve the resulting linear system for the motion update.
Coarse-to-fine: estimate the motion on a coarse pyramid level, then use it as initialization for the next finer level.
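A minimal sketch of one such Lucas-Kanade (Gauss-Newton) step for a single patch; in practice this is iterated with warping and embedded in the coarse-to-fine pyramid, and the variable names are mine:

    import numpy as np

    def lk_step(I0, I1, Ix, Iy):
        # Linearize I1(x + u) ~ I1(x) + Ix*u_x + Iy*u_y and minimize the
        # SSD, giving the 2x2 normal equations (A^T A) u = A^T b with
        # b = I0 - I1. Requires a textured patch so that A^T A is
        # well-conditioned (cf. corner detection below).
        A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
        b = (I0 - I1).ravel()
        return np.linalg.solve(A.T @ A, A.T @ b)  # motion update (u_x, u_y)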
The same linear system also yields an uncertainty estimate of the motion in the form of a covariance (e.g., useful for a Kalman filter).
Image Patches
Sometimes we are interested in the motion of small image patches. Problem: some patches are easier to track than others. Which patches are easy/difficult to track? How can we recognize good patches?
Example
Let's look at the shape of the energy functional.
Corner Detection
Idea: inspect the eigenvalues of the Hessian (second-moment matrix) of the energy function; a patch is good to track if both eigenvalues are large.
Corner Detection
1. For all pixels, compute the corner strength. 2. Non-maximal suppression (e.g., sort by strength; a strong corner suppresses weaker corners within a circle of radius r).
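A compact sketch of computing such a corner-strength map (the Harris response; the constant k = 0.04 and the Gaussian window width are conventional choices, not values from the slides):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def harris_response(I, sigma=1.0, k=0.04):
        # second-moment matrix M sums Ix^2, Iy^2, Ix*Iy over a Gaussian
        # window; R = det(M) - k * trace(M)^2 is large at corners
        Iy, Ix = np.gradient(I.astype(float))
        Ixx = gaussian_filter(Ix * Ix, sigma)
        Iyy = gaussian_filter(Iy * Iy, sigma)
        Ixy = gaussian_filter(Ix * Iy, sigma)
        det = Ixx * Iyy - Ixy ** 2
        trace = Ixx + Iyy
        return det - k * trace ** 2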
Other Detectors
Förstner detector (localizes corners with sub-pixel accuracy); FAST corners (learned decision tree minimizes the number of tests → super fast); Difference of Gaussians / DoG (scale-invariant detector)
[Figure: strongest responses before and after non-maximal suppression]
Example
The KLT tracker is highly efficient (real-time on the CPU) but provides only sparse motion vectors; dense optical flow methods require a GPU.
3D Motion Estimation
(How) Can we recover the camera motion from the estimated flow field?
Assumptions
1. The quadrocopter moves slowly relative to the sampling rate → limited search radius. 2. The observed scene is planar, with plane normal n and distance d.
The matrix H = [ω]_x + (1/d) v n^T is called the continuous homography matrix. Note: H contains both the linear/angular velocity and the scene structure.
The KLT tracker estimates the motion of each feature track in the image. Constraint: the measured image motion of every tracked feature must be consistent with the continuous homography.
Approach
Result: for all observed motions in the image, the continuous homography constraint holds. How can we use this to estimate the camera motion?
1. Estimate H. 2. Recover the motion (v, ω) and the scene structure (n, d) from H.
Step 1: Estimate H
The continuous homography constraint is linear in the entries of H: stack H into a vector h, collect the constraints from several feature tracks, and solve the resulting linear system of equations for h (least squares).
Evaluation
Comparison of the estimated velocities with ground truth from a motion capture system:

Algorithm         Norm error   Std. deviation
Pure vision       0.134        0.094
Ang. vel. known   0.117        0.093
Normal known      0.113        0.088
Commercial Solutions
Helicommand 3D from Robbe: 2(?) cameras, IMU, air pressure sensor, 450 EUR. Parrot mainboard + navigation board: 1 camera, IMU, ultrasonic sensor, 210 USD.
Joggobot
Follows a person wearing a visual marker
[http://exertiongameslab.org/projects/joggobot]
Visual Navigation for Flying Robots: Simultaneous Localization and Mapping (SLAM)
Dr. Jürgen Sturm
SLAM Applications
SLAM is central to a range of indoor, outdoor, in-air and underwater applications for both unmanned and autonomous vehicles. Examples: at home (vacuum cleaner, lawn mower), air (inspection, transportation, surveillance), underwater (reef/environmental monitoring), underground (search and rescue), space (terrain mapping, navigation).
In computer vision, the problem is known as Structure from Motion (SfM), sometimes: Structure and Motion; monocular/stereo camera, sometimes uncalibrated sensors (e.g., Flickr images).
Next week: focus on optimization (bundle adjustment), stereo cameras, Kinect. In two weeks: map representations, mapping and (dense) 3D reconstruction.
[Figure: two views with too little overlap: no chance to match!]
Harris Detector
Rotation invariance? The ellipse of the second-moment matrix rotates, but its shape (i.e., its eigenvalues) remains the same → the corner response R is invariant to rotation.
Invariance to intensity change? Partial invariance to additive and multiplicative intensity changes: only derivatives are used → invariance to an intensity shift I → I + b; for an intensity scale I → a·I, the fixed intensity threshold on the local maxima of R means only partial invariance.
Invariant to scaling? No: the Harris detector is not invariant to image scale.
SIFT Detector
Search for a local maximum in both space and scale.
[Figure: a feature matched across two images at different scales (scale 1 vs. scale 1/2)]
SIFT Detector
1. Detect maxima in scale-space (built by repeated resample, blur, subtract steps → Difference of Gaussians). 2. Non-maximum suppression. 3. Eliminate edge points (check the ratio of eigenvalues). 4. For each maximum, fit a quadratic function and compute the center at sub-pixel accuracy.
Example
1. Input image, 233×189 pixels. 2. 832 candidate DoG minima/maxima (visualization indicates scale, orientation, and location). 3. 536 keypoints remain after thresholding on minimum contrast and principal curvature.
Feature Matching
Now we know how to find repeatable corners; next question: how can we match them?
Template Convolution
Extract a small patch around the corner as a template and compare it against the other image.
Invariances: scaling: no; rotation: no (maybe rotate the template?); illumination: no (use a bias/gain model?); perspective projection: not really.
SIFT Features
[Figure: SIFT feature matching examples]
SIFT Descriptor
Compute image gradients over a 16×16 window (green), weighted with a Gaussian kernel (blue). Create a 4×4 array of orientation histograms, each consisting of 8 bins. In total, the SIFT descriptor has 4 × 4 × 8 = 128 dimensions.
Feature Matching
Given features in I_1, how do we find the best match in I_2? Define a distance function that compares two features; test all the features in I_2 and find the one with the minimal distance.
Feature Distance
How to define the difference between two features? A simple approach is the Euclidean distance (or SSD). Problem: it can give good scores to ambiguous (bad) matches.
Feature Distance
A better approach is the ratio distance ||f_1 − f_2|| / ||f_1 − f_2'||, where f_2 is the best and f_2' the second-best match of f_1 in I_2. Ambiguous matches get values close to 1 and can be rejected, while distinctive matches get small values.
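A small sketch of this ratio test (the 0.8 threshold is a commonly used value, not one given in the slides):

    import numpy as np

    def ratio_test_match(desc1, desc2, max_ratio=0.8):
        # keep a match only if the best distance is clearly smaller
        # than the second best; ambiguous matches (ratio near 1) drop out
        matches = []
        for i, f in enumerate(desc1):
            d = np.linalg.norm(desc2 - f, axis=1)  # distances to all features
            j, j2 = np.argsort(d)[:2]              # best and second best
            if d[j] < max_ratio * d[j2]:
                matches.append((i, j))
        return matches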
Efficient Matching
For feature matching, we need to answer a large number of nearest-neighbor queries. Strategies:
- Exhaustive search: compare the query against every feature
- Indexing (k-d tree): localize the query in the tree, then search nearby leaves until the nearest neighbor is guaranteed to be found; best-bin-first uses a priority queue for the unchecked leaves
- Approximate search: locality-sensitive hashing, approximate nearest neighbor
- Vocabulary trees
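A sketch of k-d-tree-based matching using SciPy (an assumed library choice; note that exact k-d trees degrade in 128 dimensions, which is one reason approximate methods and vocabulary trees are used in practice):

import numpy as np
from scipy.spatial import cKDTree

# Stand-in descriptor sets; in practice these come from the detector.
desc1 = np.random.rand(500, 128).astype(np.float32)
desc2 = np.random.rand(600, 128).astype(np.float32)

tree = cKDTree(desc2)
# Query the two nearest neighbors for every descriptor in desc1.
dists, idxs = tree.query(desc1, k=2)

# Keep only matches passing the ratio test.
good = dists[:, 0] < 0.8 * dists[:, 1]
matches = list(zip(np.nonzero(good)[0], idxs[good, 0]))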
Other Descriptors
- SIFT (Scale-Invariant Feature Transform) [Lowe, 2004]
- SURF (Speeded-Up Robust Features) [Bay et al., 2008]
- BRIEF (Binary Robust Independent Elementary Features) [Calonder et al., 2010]
- ORB (Oriented FAST and Rotated BRIEF) [Rublee et al., 2011]
Point Triangulation
Known camera poses; observe 2D point correspondences; compute the 3D point.
Camera Calibration
Given: 2D/3D correspondences (x_i, X_i). Wanted: a projection matrix M such that x_i = M X_i (up to scale).

Step 1: Estimate M
Each correspondence generates two equations (with m_k denoting the k-th row of M):
x_i (m_3 . X_i) = m_1 . X_i
y_i (m_3 . X_i) = m_2 . X_i
Re-arranged in matrix form, this gives a linear system A m = 0 in the entries of M.
Triangulation
Given: two cameras and a point correspondence. Wanted: the corresponding 3D point. The final error between measured and projected points is typically less than 0.02 pixels.
Triangulation
Where do we expect to see the corresponding point in the second image?
Epipolar Geometry
Consider two cameras that observe a 3D world point
Epipolar Geometry
The line connecting both camera centers is called the baseline. Given the image of a point in one view, what can we say about its position in the other? A point in one image generates a line in the other image, called the epipolar line of x.
Epipolar Geometry
Left ray in the left camera frame: X_L = lambda_L x_L. Right ray in the right camera frame: X_R = lambda_R x_R, where x_L and x_R are the (local) ray directions. Expressing the left ray in the right camera frame via X_R = R X_L + t and intersecting both rays yields the epipolar constraint
x_R^T [t]x R x_L = 0,
with the essential matrix E = [t]x R.
Epipolar Geometry
Note: the epipolar constraint x'^T E x = 0 holds for every pair of corresponding points x and x'.
Step 1: Estimate E
Epipolar constraint x'^T E x = 0. Written out (with x = (x, y, 1)^T and x' = (x', y', 1)^T), the constraint is linear in the nine entries of E, so each correspondence gives us one equation. Stacking n correspondences yields a linear system Z e = 0: the vector e of the entries of E is in the null space of Z. Solve using SVD (assuming n >= 8).
The estimation is sensitive to scaling, especially with noisy data: normalize all points to have zero mean and unit variance. Note that we can only recover the translation up to scale.
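A sketch of the normalized eight-point estimate of E under these assumptions (calibrated correspondences; the helper names are ours):

import numpy as np

def normalize(pts):
    """Shift to zero mean and unit standard deviation (conditioning)."""
    mean = pts.mean(axis=0)
    std = pts.std() + 1e-12
    T = np.array([[1/std, 0, -mean[0]/std],
                  [0, 1/std, -mean[1]/std],
                  [0, 0, 1]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def estimate_E(x1, x2):
    """Eight-point estimate of E from n >= 8 correspondences.
    x1, x2: (n x 2) calibrated image coordinates."""
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each correspondence gives one row of Z (constraint x2^T E x1 = 0).
    Z = np.column_stack([
        p2[:, 0]*p1[:, 0], p2[:, 0]*p1[:, 1], p2[:, 0],
        p2[:, 1]*p1[:, 0], p2[:, 1]*p1[:, 1], p2[:, 1],
        p1[:, 0],          p1[:, 1],          np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(Z)
    E = Vt[-1].reshape(3, 3)            # null-space vector of Z
    # Enforce the essential-matrix property: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    E = U @ np.diag([1, 1, 0]) @ Vt
    return T2.T @ E @ T1                # undo the normalization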
Recover R and t
Decompose the essential matrix: compute the SVD E = U diag(1, 1, 0) V^T. By identifying [t]x and R in E = [t]x R, we obtain t as the third column of U (up to sign and scale) and R = U W V^T or R = U W^T V^T with W = ((0, -1, 0), (1, 0, 0), (0, 0, 1)). Of the four resulting solutions, the correct one is found by requiring triangulated points to lie in front of both cameras.
Find: the camera motion R, t (up to scale). Steps: 1. compute correspondences; 2. compute the essential matrix; 3. extract the camera motion.
Robust Estimation
Problem: no matter how good the feature descriptor and matcher are, there is always a chance of bad point correspondences (outliers). Example: fit a line to 2D data containing outliers. There are two coupled problems: 1. fit the line to the data; 2. classify the data into inliers (valid points) and outliers (using some threshold).
RANSAC
Goal: robustly fit a model to a data set that contains outliers.
Algorithm:
1. Randomly select a (minimal) subset of the data points and instantiate the model from it
2. Using this model, classify all data points as inliers or outliers
3. Repeat steps 1 and 2 for N iterations
4. Select the largest inlier set and re-estimate the model from all points in this set
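A minimal sketch of RANSAC for the line-fitting example above (the iteration count and inlier threshold are arbitrary choices):

import numpy as np

def ransac_line(points, n_iters=100, thresh=0.05):
    """Robustly fit a 2D line to points of shape (N x 2)."""
    rng = np.random.default_rng(0)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # 1. minimal sample: two points define a line
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        n = np.array([-d[1], d[0]])          # line normal
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue
        n = n / norm
        # 2. classify: point-to-line distance against the threshold
        dist = np.abs((points - p) @ n)
        inliers = dist < thresh
        # keep the largest inlier set seen so far
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # 4. re-estimate from all inliers via total least squares
    P = points[best_inliers]
    c = P.mean(axis=0)
    _, _, Vt = np.linalg.svd(P - c)
    direction = Vt[0]
    return c, direction, best_inliers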
Two Examples
G. Klein and D. Murray, Parallel Tracking and Mapping for Small AR Workspaces, International Symposium on Mixed and Augmented Reality (ISMAR), 2007
http://www.robots.ox.ac.uk/~gk/publications/KleinMurray2007ISMAR.pdf
Photo Tourism
N. Snavely, S. M. Seitz, R. Szeliski, Photo tourism: Exploring photo collections in 3D, ACM Transactions on Graphics (SIGGRAPH), 2006
http://phototour.cs.washington.edu/Photo_Tourism.pdf
PTAM (2007)
Architecture optimized for dual-core CPUs: thread 1 (tracking) tracks the camera and draws the graphics for every incoming image; thread 2 (mapping) runs on selected keyframes.
PTAM Video
The mapping thread operates on keyframes. Bundle adjustment timings as the number of keyframes grows:

Keyframes | Local BA | Global BA
2-49      | 170 ms   | 380 ms
50-99     | 270 ms   | 1.7 s
100-149   | 440 ms   | 6.9 s
Automatically estimate:
- the position, orientation, and focal length of all cameras
- the 3D positions of the point features
Visual Navigation for Flying Robots
Bundle Adjustment and Stereo Correspondence
Dr. Jürgen Sturm
Remember: 3D Transformations
Representation as a homogeneous matrix T = (R t; 0 1). Pro: easy to concatenate and invert. Con: not minimal (12 parameters for 6 DOF).
Map optimization: graph SLAM, bundle adjustment.
Depth reconstruction: laser triangulation, structured light (Kinect), stereo cameras.
Remember: 3D Transformations
From twist coordinates xi = (v, w) to the twist matrix via the hat operator; the exponential map then yields the homogeneous transformation.
Notation
Camera poses in a minimal representation (e.g., twists).
Loop Closures
Idea: estimate the camera motion from frame to frame. Problem: the estimates are inherently noisy, and the error accumulates over time (drift).
Loop Closures
Solution: use loop closures to minimize the drift, i.e., minimize the error over all constraints.
Graph SLAM [Thrun and Montemerlo, 2006; Olson et al., 2006]
Use a graph to represent the model:
- Every node in the graph corresponds to a pose of the robot during mapping
- Every edge between two nodes corresponds to a spatial constraint between them
Graph-based SLAM: build the graph and find the robot poses that minimize the error introduced by the constraints.
Front-end and back-end are interleaved: a consistent map helps the front-end determine new constraints by reducing the search space.
Problem Definition
Given: a set of observations (constraints) z_ij between poses. Wanted: the set of camera poses x_1, ..., x_n; state vector x = (x_1^T, ..., x_n^T)^T.
Map Error
Real observation z_ij; expected observation zhat_ij(x_i, x_j); the difference between observation and expectation is e_ij = z_ij - zhat_ij(x_i, x_j). Given the correct map, this difference is purely the result of sensor noise.
Error Function
Assumption: the sensor noise is normally distributed. The error term for one observation (proportional to its negative log-likelihood) is e_ij^T Omega_ij e_ij, with information matrix Omega_ij. The map error over all observations is
F(x) = sum_ij e_ij(x)^T Omega_ij e_ij(x),
and we seek x* = argmin_x F(x).
Gauss-Newton Method
1. Linearize the error function
2. Compute its derivative
3. Set the derivative to zero
4. Solve the linear system
5. Iterate this procedure until convergence
Linearization: e_ij(x + dx) ~ e_ij(x) + J_ij(x) dx, with the Jacobian J_ij = de_ij/dx. Since e_ij depends only on x_i and x_j, all other blocks of J_ij(x) are zero. Substituting into F(x + dx) and setting the derivative to zero leads to the linear system
H dx = -b, with H = sum_ij J_ij^T Omega_ij J_ij and b = sum_ij J_ij^T Omega_ij e_ij.
H is sparse: non-zero on the main diagonal (at the blocks ii and jj of each constraint) and at the off-diagonal blocks ij and ji.
Gauss-Newton Method
Problem: the error function is non-linear! Algorithm, repeated until convergence:
1. Compute the terms H and b of the linear system
2. Solve the linear system H dx = -b to get the new increment dx
3. Update the previous estimate: x <- x + dx
How to solve the linear system? Sparse Cholesky decomposition (up to ~100M matrix elements), preconditioned conjugate gradients (up to ~1000M matrix elements), and many others.
Example in 1D
Two camera poses x1 and x2; state vector x = (x1, x2)^T; one (distance) observation z12 with expected observation x2 - x1, sensor noise Omega, and error e12 = z12 - (x2 - x1). Note that the system is singular unless one pose is fixed (gauge freedom): the observation only constrains the relative pose.
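A minimal numeric sketch of this 1D example; the concrete values (z12 = 1, Omega = 1, x1 fixed at 0) are our own assumptions for illustration:

# Assumed toy values: pose x1 fixed at 0 (removes the gauge freedom),
# x2 initialized at 0, one distance observation z12 = 1, Omega = 1.
x1, x2 = 0.0, 0.0
z12, omega = 1.0, 1.0

for it in range(3):
    e = z12 - (x2 - x1)      # error: observation minus expectation
    J = -1.0                 # de/dx2
    H = J * omega * J        # Gauss-Newton system H dx = -b
    b = J * omega * e
    dx = -b / H
    x2 += dx                 # update the estimate
    print(f"iter {it}: e={e:+.3f}  x2={x2:.3f}")
# The problem is linear, so Gauss-Newton converges in one iteration.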
Levenberg-Marquardt Algorithm
Idea: add a damping factor lambda to the linear system, (H + lambda I) dx = -b. Algorithm: compute the step; if the error decreases, accept the step (and decrease lambda); if the error increases, reject the step (and increase lambda).
Non-Linear Minimization
One of the state-of-the-art solutions to compute the maximum-likelihood estimate. Various open-source implementations are available:
- g2o [Kuemmerle et al., 2011]
- sba [Lourakis and Argyros, 2009]
- iSAM [Kaess et al., 2008]
Bundle Adjustment
Graph SLAM optimizes (only) the camera poses. Bundle adjustment optimizes both the 6-DOF camera poses and the 3D (feature) points. Other extensions: robust error functions, alternative parameterizations. Typically there are many more feature points than camera poses (why?).
Error Function
Camera pose c_i; feature point p_j; observed feature location z_ij; expected feature location zhat(c_i, p_j), i.e., the projection of p_j into camera i. The difference between observation and expectation is e_ij = z_ij - zhat(c_i, p_j), and the error function is F = sum_ij e_ij^T Omega_ij e_ij.
Primary Structure
The system matrix has a characteristic structure: H = (H_cc H_cp; H_cp^T H_pp), splitting into a camera block and a point block. Insight: H_cc and H_pp are block-diagonal, because each constraint depends only on one camera and one point.
Schur Complement
Given the linear system
(H_cc H_cp; H_cp^T H_pp) (dc; dp) = (b_c; b_p),
eliminate the point block first:
(H_cc - H_cp H_pp^-1 H_cp^T) dc = b_c - H_cp H_pp^-1 b_p,
then back-substitute dp = H_pp^-1 (b_p - H_cp^T dc). Since H_pp is block-diagonal, inverting it is cheap.
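A dense sketch of this elimination (real BA solvers exploit the block-diagonal structure of H_pp instead of forming a dense inverse):

import numpy as np

def solve_schur(Hcc, Hcp, Hpp, bc, bp):
    """Solve [[Hcc, Hcp], [Hcp^T, Hpp]] [dc; dp] = [bc; bp]
    by eliminating the point block first."""
    Hpp_inv = np.linalg.inv(Hpp)           # cheap if block-diagonal
    # Reduced camera system (Schur complement of Hpp)
    S = Hcc - Hcp @ Hpp_inv @ Hcp.T
    rhs = bc - Hcp @ Hpp_inv @ bp
    dc = np.linalg.solve(S, rhs)
    # Back-substitute for the points
    dp = Hpp_inv @ (bp - Hcp.T @ dc)
    return dc, dp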
Epipolar Plane
All epipolar lines intersect at the epipoles, which lie on the baseline. An epipolar plane intersects the left and right image planes in epipolar lines.

Epipolar Constraint
This is useful because it reduces the correspondence problem to a 1D search along the epipolar line.
Rectification
In practice, it is convenient if the image scanlines (rows) coincide with the epipolar lines. Rectification reprojects both image planes onto a common plane parallel to the baseline (two 3x3 homographies); afterwards, pixel motion between the images is purely horizontal.
Example: rectification of a stereo pair (left and right images).
Example: input images and output disparity maps computed with a 3x3 and a 20x20 matching window.
Laser Triangulation
Idea: a well-defined light pattern (e.g., a point or a line) is projected onto the scene and observed by a line/matrix camera or a position-sensitive device (PSD); the distance is then computed by simple triangulation.
Function principle: with baseline b between laser and pin-hole, focal length f of the CCD camera, and measured disparity d on the sensor, the depth is z = f * b / d.
Laser Triangulation
Stripe laser + 2D camera; often used on conveyor belts (volume sensing). A large baseline gives better depth resolution but more occlusions; a remedy is to use two cameras.
Structured Light
Multiple stripes or a full 2D pattern make data association more difficult. Coding schemes resolve the ambiguity:
- Temporal: coded light
- Wavelength: color
- Spatial: pattern (e.g., diffraction patterns)
Example Data
The Kinect provides color (RGB) and depth (D) video, which enables novel approaches to (robot) perception. The infrared image shows the projected pattern, distorted by the scene geometry.
Technical Specs
- Infrared camera: 640x480 @ 30 Hz
- Depth correlation runs on an FPGA; 11-bit depth image; 0.8 m - 5 m range
- Depth sensing does not work in direct sunlight (why?)
History
- 2005: Developed by PrimeSense (Israel)
- 2006: Offered to Nintendo and Microsoft; both companies declined
- 2007: Alex Kipman becomes new incubation director at Microsoft and decides to explore the PrimeSense device; Johnny Lee assembles a team to investigate the technology and develop game concepts
- 2008: The group around Prof. Andrew Blake and Jamie Shotton (Microsoft Research) develops pose recognition
- 2009: The group around Prof. Dieter Fox (Intel Labs / Univ. of Washington) works on RGB-D mapping and RGB-D object recognition
- Nov 4, 2010: Official market launch
- Nov 10, 2010: First open-source driver available
- 2011: First programming competitions (ROS 3D, PrimeSense); first workshops (RSS, Euron)
- 2012: First special issues (JVCI, T-SMC)
More specs:
- Four 16-bit microphones with DSP for beam forming @ 16 kHz
- Requires 12 V (for the motor); weighs 500 grams
- Human pose recognition runs on the Xbox CPU and uses only 10-15% of its processing power @ 30 Hz
(Paper: http://research.microsoft.com/apps/pubs/default.aspx?id=145347)
Kinect: Applications
Exercise Sheet 5
Prepare the mid-term presentation. Proposed structure: 3 slides
1. Remind people who you are and what you are doing (can be the same slide as last time)
2. Your work/achievements so far (a video is a plus)
3. Your plans for the next two weeks
Visual Navigation for Flying Robots
Place Recognition, ICP, and Dense Reconstruction
Dr. Jürgen Sturm
Loop Closures
How can we detect loop closures efficiently?
1. Compare with all previous images (not efficient)
2. Use the motion model and covariance to limit the search radius (metric approach)
3. Appearance-based place recognition (using bag of words)
Example document: "China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value." Highlighted words: China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, increase, foreign, value.
Object/Scene Recognition
Analogy to text documents: the content of an image can be inferred from the frequency of visual words (features).
Overview
1. Detect and extract features (e.g., SIFT) from example patches
2. Build a dictionary of codewords from the descriptors
3. Represent each image by the frequency histogram of its codewords
Object/Scene Recognition
Compare the histogram of the new scene with those of known scenes, e.g., using:
- simple histogram intersection
- naïve Bayes
- more advanced statistical methods
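A minimal sketch of place recognition by histogram intersection (the normalization choice is ours):

import numpy as np

def hist_intersection(h1, h2):
    """Similarity of two bag-of-words histograms, in [0, 1]."""
    h1 = h1 / (h1.sum() + 1e-12)
    h2 = h2 / (h2.sum() + 1e-12)
    return np.minimum(h1, h2).sum()

# usage: compare the query histogram against all known places
# best = max(range(len(db)), key=lambda i: hist_intersection(query, db[i]))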
Example: FAB-MAP [Cummins and Newman, 2008], appearance-based place recognition (e.g., recognizing scenes such as a parking lot or a highway from their word histograms).
Summary: Bag of Words
- Compact representation of image content
- Highly efficient and scalable (good timing performance)
- Requires training of a dictionary
- Insensitive to viewpoint changes/image deformations (inherited from the feature descriptor)
Laser Scanner
Measures angles and distances to the closest obstacles. Alternative representation: a 2D point set (cloud). Probabilistic sensor model: the measurement likelihood peaks at the measured distance, with additional probability mass at the maximum range.
Exhaustive Search (scan matching)
Convolve the first scan with the sensor model to obtain a likelihood map. Then sweep the second scan over the likelihood map, compute the correlation for each candidate pose, and select the best pose.
Wanted: the translation t and rotation R that minimize the sum of squared errors
E(R, t) = sum_i || x_i - (R p_i + t) ||^2,
where x_i and p_i are corresponding points.
Known Correspondences
Note: if the correct correspondences are known, both rotation and translation can be calculated in closed form.
Idea: the centers of mass of both point sets have to match. Subtract the corresponding center of mass from every point; afterwards, the point sets are zero-centered, and we only need to recover the rotation.
Decompose the cross-covariance matrix W = sum_i y_i x_i^T of the centered point sets using singular value decomposition (SVD), W = U S V^T. Theorem: if rank(W) = 3, the optimal solution of E(R, t) is unique and given by R = U V^T and t = mu_y - R mu_x.

Unknown Correspondences
If the correct correspondences are not known, it is generally impossible to determine the optimal transformation in one step.
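A sketch of this closed-form alignment, including the standard determinant correction that guards against reflections (not spelled out on the slide):

import numpy as np

def align_svd(P, Q):
    """Find R, t minimizing sum ||(R p_i + t) - q_i||^2 for
    corresponding point sets P, Q of shape (N x d)."""
    # 1. subtract the centers of mass
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    X, Y = P - mu_p, Q - mu_q
    # 2. decompose the cross-covariance W = sum y_i x_i^T via SVD
    W = Y.T @ X
    U, _, Vt = np.linalg.svd(W)
    # 3. optimal rotation (with reflection fix), then translation
    D = np.eye(P.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    R = U @ D @ Vt
    t = mu_q - R @ mu_p
    return R, t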
ICP Algorithm [Besl & McKay, 92]
Iterate until convergence: 1. find correspondences (e.g., closest points); 2. compute the optimal R, t in closed form as above; 3. apply the transformation to the source point set.

Example: ICP
ICP Variants
Many variants on all stages of ICP have been proposed:
- Selecting and weighting source points
- Finding corresponding points
- Rejecting certain (outlier) correspondences
- Choosing an error metric
- Minimization

Performance Criteria
Various aspects of performance:
- Speed
- Stability (local minima)
- Tolerance w.r.t. noise and/or outliers
- Basin of convergence (maximum initial misalignment)
Feature-based Sampling
Detect interest points (as with images). This decreases the number of correspondences and increases efficiency and accuracy, but requires pre-processing.

Normal-Space Sampling
Instead of sampling source points uniformly or at random, sample them so that the distribution of their normals is as uniform as possible.
Finding Correspondences
This stage has the greatest effect on convergence and speed. Strategies: closest point, normal shooting, closest compatible point, projection. Speed-up using k-d trees (or oct-trees).
Normal Shooting
Project along the source normal and intersect the other mesh. Slightly better than closest-point for smooth meshes, worse for noisy or complex meshes.
Projection-based Matching
Slightly worse performance per iteration, but each iteration is one to two orders of magnitude faster than closest-point matching. Requires a point-to-plane error metric.
Error Metrics
Point-to-point: minimize the distance between corresponding points. Point-to-plane: minimize the distance along the destination normal; this lets flat regions slide along each other. Generalized ICP: assign an individual covariance to each data point [Segal, 2009].

Minimization
Only the point-to-point metric has closed-form solution(s); the other error metrics require non-linear minimization methods. (Which non-linear minimization methods do you remember? Which robust error metrics do you remember?)
Example: real-time scan alignment using range images from a structured-light system (projector and camera, temporal coding).
ICP: Summary
ICP is a powerful algorithm for calculating the displacement between point clouds. The overall speed depends mostly on the choice of the matching algorithm. ICP is (in general) only locally optimal and can get stuck in local minima.
Occupancy Grid
Idea: represent the map using a grid. Each cell is either free or occupied, and the robot maintains a belief over the map state.
Mapping
Goal: estimate the posterior over maps given the observations and poses. How can this be computed? Assuming the cells are independent, the posterior factorizes over the individual cells, and each cell can be updated with a binary Bayes filter.
Sensor Model
For the Bayes filter, we need the inverse sensor model p(m_i | z_t, x_t): cells along the beam in front of the measured distance are likely free; the cell at the measured distance is likely occupied.
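A minimal sketch of the resulting log-odds occupancy update along a single beam; the log-odds constants and grid size are arbitrary assumptions, and the ray traversal uses Bresenham's algorithm:

import numpy as np

L_OCC, L_FREE, L0 = 0.85, -0.4, 0.0   # assumed inverse-model constants
grid = np.zeros((100, 100))           # log-odds, 0 = unknown (p = 0.5)

def bresenham(r0, c0, r1, c1):
    """Integer cells on the line from (r0, c0) to (r1, c1)."""
    cells = []
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    sr, sc = (1 if r1 > r0 else -1), (1 if c1 > c0 else -1)
    err = dr - dc
    r, c = r0, c0
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            break
        e2 = 2 * err
        if e2 > -dc:
            err -= dc
            r += sr
        if e2 < dr:
            err += dr
            c += sc
    return cells

def update_beam(grid, robot, hit):
    cells = bresenham(*robot, *hit)
    for cell in cells[:-1]:
        grid[cell] += L_FREE - L0     # cells before the hit: free
    grid[cells[-1]] += L_OCC - L0     # end point: occupied

update_beam(grid, (50, 50), (50, 80))
prob = 1 - 1 / (1 + np.exp(grid))     # recover occupancy probabilities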
Resulting Map
Note: the maximum-likelihood map is obtained by clipping the occupancy grid map at a threshold of 0.5.

Memory Consumption
Consider mapping a building of 40x40 m at a resolution of 0.05 m. How much memory do we need? 40 m / 0.05 m = 800 cells per side, i.e., 800 x 800 = 640,000 cells for a single 2D slice (e.g., about 2.5 MB with 4-byte floats); a 3D grid at the same resolution grows cubically.
Example: OctoMap [Wurm et al., 2011]
Octree-based 3D occupancy map. Freiburg computer science campus: 292 x 167 x 28 m³ at 0.2 m resolution, 2 MB on disk.
Signed Distance Field (SDF)
Idea: instead of representing the cell occupancy explicitly (occupancy grid: zero = free space, one = occupied), represent the distance of each cell to the surface (signed: e.g., negative outside the object; the values along a ray cross zero at the surface, e.g., -1.3, -0.3, 0.7, 1.7).
Algorithm:
1. Estimate the signed distance field
2. Extract the surface using interpolation (the surface is located at the zero crossing)
Weighting Function
Weight each observation according to its confidence; the weight (= confidence) is a function of the signed distance to the surface.
Data Fusion
Each voxel cell in the SDF stores two values: the weighted sum of signed distances D and the sum of all weights W. When a new range image arrives, every voxel cell is updated according to
D <- (W * D + w * d) / (W + w),  W <- W + w,
where d is the signed distance between the measured depth and the voxel's distance along the ray, and w its weight.
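A minimal sketch of this running weighted average for a single voxel, with an assumed truncation distance and constant weight:

import numpy as np

TAU = 0.05   # assumed truncation distance in meters

def update_voxel(D, W, voxel_depth, measured_depth, w_new=1.0):
    """D: current weighted signed distance; W: accumulated weight.
    voxel_depth: depth of the voxel along the camera ray;
    measured_depth: depth measured at the corresponding pixel."""
    d = np.clip(measured_depth - voxel_depth, -TAU, TAU)  # truncated SDF
    D = (W * D + w_new * d) / (W + w_new)   # running weighted average
    W = W + w_new
    return D, W

# usage: fuse three (noisy) depth readings into one voxel
D, W = 0.0, 0.0
for z in [2.01, 1.99, 2.00]:
    D, W = update_voxel(D, W, voxel_depth=1.98, measured_depth=z)
print(D)   # ~0.02: the voxel lies about 2 cm from the fused surface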
SDF Example
A cross-section through a 3D signed distance function of a real scene (brightness encodes the SDF value).

SDF Fusion
Ray Casting
For each camera pixel, shoot a ray and search for the first zero crossing in the SDF. The SDF value can be used to skip along the ray while far from the surface. Interpolation reduces artifacts; close to the surface, the gradient of the SDF represents the surface normal.
Marching Cubes
First in 2D (marching squares): evaluate each cell separately; check which edges are inside/outside; generate triangles according to a lookup table; locate the vertices using least squares. Marching cubes applies the same idea to 3D cells.
KinectFusion [Newcombe et al., 2011]
Combines the components above: projective ICP with a point-to-plane metric, a truncated signed distance function (TSDF), and ray casting.
The Department of Informatics would like to invite its students and employees to its summer party and career forum.
July 4, 2012
- 3 pm - 6 pm Career Forum: presentations given by Google, Capgemini etc., stands, panel discussion: TUM alumni talk about their career paths in informatics
- 3 pm - 6 pm Foosball Tournament
- Starting at 5 pm Summer Party: BBQ, live band and lots of fun!
www.in.tum.de/2012summerparty
Robot Architecture
Components (block diagram): executive, path planner (using the global map from SLAM), path tracking, collision avoidance (using the local obstacle map and localization), position control; sensors and actuators interface with the physical world.
Configuration Space
Work space: typically the 3D pose (position + orientation), 6 DOF. Configuration space: depends on robot and task, e.g., reduced pose (position + yaw, 4 DOF), full pose (6 DOF), pose + velocity (12 DOF), or the joint angles of a manipulation robot.
The configuration space (C-space) is the space of all possible configurations. The C-space topology is usually not Cartesian (e.g., orientations wrap around); the C-space is described as a topological manifold.

Notation
Configuration space C; configuration q; free space C_free; obstacle space C_obs.
Example
What are the admissible configurations for the robot? Equivalently: what is its free space?
- Point robot: the free space is simply the work space minus the obstacles.
- Circular robot: the robot footprint is a disk in the work space but a point in the configuration space; the configuration-space obstacles are the work-space obstacles inflated by the robot's radius.
- Large circular robot: the same construction with a larger radius; narrow passages may disappear from the free space.
Example
Polygonal robot, translation only: choose a reference point on the robot; the configuration-space obstacle is obtained by sliding the robot along the edges of the obstacle regions.
C-Space Discretizations
Two competing paradigms: combinatorial planning (exact planning) and sampling-based planning (probabilistic/randomized planning).

Combinatorial Methods
Mostly developed in the 1980s. Extremely efficient for low-dimensional problems, but sometimes difficult to implement. They usually produce a roadmap in the free space and assume polygonal environments.
Roadmaps
A roadmap is a graph in the free space where each vertex is a configuration and each edge is a collision-free path whose endpoints are vertices.
Roadmap Construction
We consider three combinatorial methods: trapezoidal decomposition, the shortest-path roadmap, and regular grids (but there are many more!). Afterwards, we consider two sampling-based methods: probabilistic roadmaps (PRMs) and rapidly exploring random trees (RRTs).

Trapezoidal decomposition: decompose the free space horizontally into convex regions using a plane sweep; sort the vertices in x direction and iterate over them while maintaining a vertically sorted list of edges.
Place vertices in the center of each trapezoid and on the edge between two neighboring trapezoids.

Example Query
Compute a path from q_I to q_G: identify the start and goal trapezoids, connect the start and goal locations to the corresponding center vertices, and run a search algorithm (e.g., Dijkstra).
Shortest-Path Roadmap
Contains all vertices and edges that optimal paths follow when obstructed. Intuition: imagine pulling a tight string between q_I and q_G.
Roadmap Construction
Vertices: all sharp corners (interior angle > 180°). Edges:
1. Two consecutive sharp corners on the same obstacle
2. Bitangent edges (when the line connecting two vertices extends into free space)

Example Query
Compute a path from q_I to q_G: connect the start and goal locations to all visible roadmap vertices and run a search algorithm (e.g., Dijkstra).
Properties: + easy to construct in 2D; + generates shortest paths; - optimal planning in 3D or more dimensions is NP-hard.

Approximate Decompositions
Construct a regular grid. Downside: high memory consumption (and number of collision tests). Any ideas?
Use a quadtree/octree to save memory; however, it is sometimes difficult to determine the status of a cell. Properties: + easy to construct; + most used in practice; - high number of collision tests.
Sampling-based Methods
Abandon the concept of explicitly characterizing the free and obstacle space, and leave the algorithm "in the dark" when exploring; the only light is provided by a collision-detection algorithm that probes whether some configuration lies in the free space. We will have a look at probabilistic roadmaps (PRMs) and rapidly exploring random trees (RRTs).
PRM Example
1. Sample a vertex: take a random sample from the configuration space and check whether it lies in the free space
2. Find its neighbors: the k-nearest neighbors, or all neighbors within a specified radius
3. Add edges: check whether the line of sight between two nearby vertices is collision-free, e.g., using a discretized line search
Add vertices and edges until the roadmap is dense enough; a minimal sketch follows below.
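The sketch below builds such a roadmap in a 2D unit square; the collision checker, sample count, and connection radius are arbitrary assumptions:

import numpy as np

def build_prm(in_free, n_samples=200, radius=0.2):
    """in_free(q) -> bool is the collision checker, the planner's only
    'light source'. Returns vertices and an edge list."""
    rng = np.random.default_rng(1)
    V = []
    while len(V) < n_samples:                  # 1. sample free vertices
        q = rng.uniform(0.0, 1.0, size=2)
        if in_free(q):
            V.append(q)
    V = np.array(V)
    E = []
    for i in range(len(V)):                    # 2. connect nearby vertices
        for j in range(i + 1, len(V)):
            if np.linalg.norm(V[i] - V[j]) < radius:
                # 3. discretized line search along the candidate edge
                ts = np.linspace(0.0, 1.0, 20)
                if all(in_free(V[i] + t * (V[j] - V[i])) for t in ts):
                    E.append((i, j))
    return V, E

# usage: unit square with a circular obstacle at the center
V, E = build_prm(lambda q: np.linalg.norm(q - 0.5) > 0.2)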
Probabilistic Roadmaps
+ Probabilistically complete
+ Scale well to higher-dimensional C-spaces
+ Very popular, many extensions
- Do not work well for some problems (e.g., narrow passages)
- Not optimal, not complete

Rapidly Exploring Random Trees (RRTs)
Grow a tree from the start configuration by repeatedly sampling a random configuration and extending the nearest tree node towards it. Why not pick the goal configuration every time? This would fail and run into the obstacle space instead of exploring.
RRT Examples
2-DOF example. Bidirectional variant: grow trees from the start and the goal location towards each other, and stop when they connect.
Non-Holonomic Robots
Some robots cannot move freely on the configuration-space manifold. Example: a car cannot move sideways; it has 2-DOF controls (speed and steering) but a 3-DOF configuration space (2D translation + rotation). RRTs can naturally consider such constraints during tree construction, e.g., for a car-like robot.
Search Algorithms
Given: a graph G consisting of vertices and edges (with associated costs). Wanted: the best (shortest) path between two vertices.
Uninformed Search
Breadth-first: complete; optimal if all action costs are equal; time and space O(b^d) (branching factor b, solution depth d).
Depth-first: not complete in infinite spaces; not optimal; time O(b^m); space O(bm), since it can forget explored subtrees (maximum depth m).
Informed Search
Idea: select nodes for further expansion based on an evaluation function, and first explore the node with the lowest value.
Greedy best-first search: simply expand the node that appears closest to the goal, i.e., evaluate nodes by the heuristic h(n) alone.
A* search: combines the path cost so far with the estimated goal distance, f(n) = g(n) + h(n). A* is optimal if the heuristic is admissible, i.e., h never overestimates the true cost to the goal.
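A minimal, generic A* sketch; the caller supplies the graph via callbacks (the function and parameter names are ours):

import heapq
import itertools

def astar(start, goal, neighbors, cost, heuristic):
    """neighbors(n) -> successor nodes; cost(a, b) -> edge cost;
    heuristic(n) -> admissible estimate of the cost to the goal."""
    counter = itertools.count()   # tie-breaker so nodes are never compared
    open_set = [(heuristic(start), next(counter), start)]
    g = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_set:
        _, _, n = heapq.heappop(open_set)
        if n == goal:                       # reconstruct the path
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        if n in closed:
            continue
        closed.add(n)
        for m in neighbors(n):
            g_m = g[n] + cost(n, m)
            if g_m < g.get(m, float("inf")):
                g[m] = g_m
                parent[m] = n
                heapq.heappush(open_set, (g_m + heuristic(m), next(counter), m))
    return None                             # no path exists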
Problems of A* on Grids
1. The shortest path is often very close to obstacles (cutting corners). Uncertain path execution increases the risk of collisions; uncertainty can come from a delocalized robot, an imperfect map, or poorly modeled dynamic constraints.
2. The resulting paths are aligned to the grid structure and therefore sub-optimal (addressed by path smoothing below).
3. When the path turns out to be blocked during traversal, it needs to be re-planned from scratch. In unknown or dynamic environments, this can occur very often, and replanning in large state spaces is costly. Can we re-use (repair) the initial plan?
Map Smoothing
Problem: the path gets close to obstacles. Solution: convolve the map with a kernel (e.g., a Gaussian).

Path Smoothing
Problem: the paths are aligned to the grid structure (because they have to lie in the roadmap), so they look unnatural and are sub-optimal. Solution: smooth the path after generation. Traverse the path and find pairs of nodes with direct line of sight; replace the segment between them by a straight line. Alternatively, refine the initial path using non-linear minimization (e.g., optimizing for continuity, energy, or execution time).
D* Search
Problem: in unknown, partially known, or dynamic environments, the planned path may become blocked, and we need to replan. Can this be done efficiently, avoiding replanning the entire path?
D* Search
Idea: incrementally repair the path, keeping the modifications local around the robot pose. Many variants:
- D* (Dynamic A*) [Stentz, ICRA 94] [Stentz, IJCAI 95]
- D* Lite [Koenig and Likhachev, AAAI 02]
- Field D* [Ferguson and Stentz, JFR 06]
Main concepts: invert the search direction (search from goal to start), since the goal does not move but the robot does, and map changes (new obstacles) have only local influence close to the current robot pose. Mark the changed node and all dependent nodes as unclean (= to be re-evaluated), then find the shortest path to the start (using A*) while reusing the previous solution.
D* Example
Comparison of breadth-first search, A*, and D* Lite: at the start, and after discovery of a blocked cell. With D* Lite, only the nodes affected by the change are re-evaluated; all other nodes remain unaltered, and the shortest path can reuse them.
D* Search
D* is as optimal and complete as A*. D* and its variants are widely used in practice; Field D* was running on the Mars rovers Spirit and Opportunity.
What if the robot has to react quickly to unforeseen, fast-moving objects? We need a collision avoidance algorithm that runs in constant time!
Robot Architecture
An approximate global planner computes paths ignoring the kinematic and dynamic vehicle constraints (not real-time). An accurate local planner accounts for these constraints and generates feasible local trajectories in real time (collision avoidance).
Local Planner
Given: the path to the goal (sequence of via points), a range scan of the local vicinity, and the dynamic constraints. Wanted: collision-free, safe, and fast motion towards the goal (or the next via point). Typical approaches: potential fields and the dynamic window approach. Cons of potential fields: they suffer from local minima and do not consider the dynamic constraints.
80
obstacle-free area
-90deg/s
Visual Navigation for Flying Robots 81
+90deg/s
angular velocity
-90deg/s
Visual Navigation for Flying Robots 82
+90deg/s
angular velocity
NF f f vel n n go
dynamic window (speeds reachable in one time frame)
Admissible space
forward velocity
0.9m/s
obstacle-free area
-90deg/s
Visual Navigation for Flying Robots 83
+90deg/s
angular velocity
Path from A*
-90deg/s +90deg/s angular velocity
84
146
0.9m/s
0.9m/s
Path from A*
-90deg/s +90deg/s angular velocity
Path from A*
-90deg/s +90deg/s angular velocity
85
86
Discretize dynamic window and evaluate navigation function (note: window has fixed size = real-time!) Find the maximum and execute motion
87
88
Problems of DWAs
DWAs suffer from local minima (and need tuning); for example, the robot may not slow down early enough to enter a doorway. Can you think of a solution? Note: the general case requires global planning.
Visual Navigation for Flying Robots
Planning under Uncertainty, Exploration and Coordination
Dr. Jürgen Sturm
Solution 3: POMDPs
A partially observable Markov decision process (POMDP) considers the uncertainty of both the motion model and the sensor model, over a finite or infinite time horizon; the resulting policy is optimal. One solution technique: value iteration. Problem: in general (and in practice) computationally intractable (PSPACE-hard).
POMDPs are robust but intractable; in practice, we resort to approximations.

Remember: Probabilistic Roadmaps
Goal: the shortest path, subject to kinematic and environmental constraints. 1. Add vertices (sampled in free space); 2. add edges between neighboring vertices (when the line of sight is not obstructed); 3. find the shortest path (Dijkstra, ...).
Line of sight alone is not enough: the robot might get lost or run into an obstacle. Idea:
1. Sample vertices and localization distributions where p(x in C_obs) < epsilon
2. Add edges between points where p(x in C_obs) < epsilon along the path
3. Perform graph search
Belief Roadmap [He et al., 2008]
1. Sample vertices from the free space, build the graph, and estimate belief-distribution transfer functions
2. Propagate the covariances while performing graph search from start to goal
Given: a roadmap. Goal: find the path from the start to the goal node that results in minimum uncertainty at the goal. Problem: how can we estimate the belief distribution at the goal (efficiently)? How can we propagate the belief distribution along an edge?
1. Sample waypoints and use forward simulation to compute the full posterior
2. Linearize the model and use a Kalman filter
Belief Propagation [He et al., 2008]
Propagate the belief along an edge given the initial conditions and the controls and observations u_0:T, z_0:T.
Problem: the posterior distribution at a vertex depends on the prior distribution (and thus on the path to the vertex). Forward simulation (and belief prediction) must be performed along each edge for every start state; computing a minimum-cost path of 30 edges takes about 100 seconds.
Mission Planning
Goal: generate and execute a plan to accomplish a certain (navigation) task. Example tasks: exploration, coverage, surveillance, tracking. In the robot architecture, the mission planner receives tasks from the user and sits above the global planner, local planner, localization, and position control.
Task Planning
Goal: generate and execute a high-level plan to accomplish a certain task. Often symbolic reasoning (or hard-coded): propositional or first-order logic, automated reasoning systems. Common programming languages: Prolog, LISP.
Exploration
By reasoning about control, the mapping process can be made much more effective. Question: where to move next? Choose the action that maximizes utility.
Example
Where should the robot go next? (Map with unexplored, unknown, occupied, and empty cells.)
Information Theory
Entropy is a general measure for the uncertainty of a probability distribution; it equals the expected amount of information needed to encode an outcome:
H(p) = -sum_x p(x) log p(x).
A peaked distribution over {occupied, free} has low entropy; a uniform distribution has high entropy.
Information Theory
Information gain = uncertainty reduction: the expected information gain of an action is the current entropy minus the conditional entropy after taking the action and observing the result.
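A minimal sketch with per-cell Bernoulli occupancy beliefs; the occupancy values and the assumed post-action beliefs are illustrative only:

import numpy as np

def entropy(p):
    """Entropy of a Bernoulli occupancy belief (in bits)."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

grid = np.array([0.5, 0.5, 0.9, 0.1])     # cell occupancy beliefs
H = entropy(grid).sum()                   # map entropy: sum over cells

# Expected information gain of an action = H(map) - E[H(map | action)];
# here we assume sensing would sharpen the two unknown cells to 0.9/0.1.
grid_after = np.array([0.9, 0.1, 0.9, 0.1])
info_gain = H - entropy(grid_after).sum()
print(info_gain)   # > 0: the action reduces map uncertainty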
Exploration Costs
So far, we did not consider the cost of executing an action (e.g., time, energy, ...). Utility = uncertainty reduction - cost; select the action with the highest expected utility.
Exploration
For each candidate location (x, y): estimate the number of cells the robot could sense from there (e.g., by simulating laser beams in the current map) and the cost of getting there. Greedy strategy: select the candidate location with the highest utility, then repeat.
Exploration Actions
So far, we only considered the reduction of map uncertainty. In general, there are many sources of uncertainty that can be reduced by exploration:
- Map uncertainty (visit unexplored areas)
- Trajectory uncertainty (loop closing)
- Localization uncertainty (active re-localization by re-visiting known locations)
(Figure: entropy evolution during exploration.)
Example: Corridor Exploration [Stachniss et al., 2005]
Reduce the uncertainty in map, path, and pose when selecting target locations. The decision-theoretic approach leads to intuitive behaviors: re-localize before getting lost. Some animals show a similar behavior (dogs marooned in the tundra of northern Russia).
Multi-Robot Exploration
Given: a team of robots with communication. Goal: explore the environment as fast as possible.

Complexity
Single-robot exploration in known, graph-like environments is in general NP-hard (proof: reduce the traveling salesman problem to exploration). The complexity of multi-robot exploration is exponential in the number of robots.
Levels of Coordination
1. No exchange of information: without coordination, two robots might choose the same exploration frontier
2. Implicit coordination: sharing a joint map (communication of the individual maps and poses; central mapping system)
3. Explicit coordination: a central planner assigns target points, minimizing expected path cost / information gain / ...
Typical Trajectories and Exploration Time [Stachniss et al., 2006]
(Figure: trajectories and exploration times under implicit vs. explicit coordination.)
Coordination Algorithm
In each time step: determine the set of exploration targets; compute for each robot and each target the expected cost/utility; assign robots to targets using the Hungarian algorithm.
Hungarian Algorithm [Kuhn, 1955]
A combinatorial optimization algorithm that solves the assignment problem in polynomial time. General idea: the algorithm modifies the cost matrix until a zero-cost assignment exists.
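SciPy ships an implementation of this assignment step; a minimal sketch with an assumed cost matrix:

import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j]: expected path cost for robot i to reach target j
# (assumed example values)
cost = np.array([[4.0, 1.0, 3.0],
                 [2.0, 0.0, 5.0],
                 [3.0, 2.0, 2.0]])

robots, targets = linear_sum_assignment(cost)   # Hungarian method
print(list(zip(robots, targets)), cost[robots, targets].sum())
# assigns each robot a distinct target with minimum total cost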
Summary: Exploration
Exploration aims at generating robot motions so that an optimal map is obtained. Coordination reduces the exploration time; the Hungarian algorithm efficiently solves the assignment problem (centralized, 1-step lookahead). A two-layer hierarchical role assignment using the Hungarian algorithm (1: rooms, 2: targets within a room) further reduces exploration time and the risk of interference. Challenges (active research): limited bandwidth and unreliable communication; decentralized planning and task assignment.
Idea [Mannadiar and Rekleitis, ICRA 2011]: coverage path planning.
Approach:
1. Decompose the map into simple cells
2. Compute the connectivity between cells and build a graph (vertices = critical points that triggered the split; edges = connectivity between critical points)
3. Solve the coverage problem on the reduced graph: extend the graph so that all vertices have even order, compute an Euler tour (linear time), follow the Euler tour, and use a simple coverage strategy within each cell (note: cells are visited once or twice)
Goal: cover the entire surface while minimizing the trajectory length in configuration space. Approach: discretize the 3D environment into patches, build a neighborhood graph, and formulate the problem as a generalized traveling salesman problem (GTSP).
Course Evaluation
Much positive feedback, thank you! We are also very happy with you as a group; everybody seemed to be highly motivated. Suggestions for improvements (from the course evaluation forms):
- The workload was considered a bit too high: the ECTS have been adjusted to 6 credits
- A ROS introduction in the lab course would be helpful: will do this next time
Browse through (recent) textbooks; ask your professor and colleagues. It's very unlikely that somebody else has already perfectly solved exactly your problem, so don't worry! Technology evolves very fast.
Theorist
Formalize the problem; find a suitable method; (theoretically) prove that it is right; (if needed) implement a proof of concept.
Step 3: Validation
What are your claims, and how can you prove them? Theoretical proof (for a mathematical problem) or experimental validation: qualitative (e.g., video) or quantitative (e.g., many trials, statistical significance).
Step 4: Dissemination
A good solution or expertise alone is not enough; you need to convince other people in the field. Usual procedure:
1. Write a research paper (usually 6-8 pages): 3-6 months
2. Submit the PDF to an international conference or journal
3. The paper will be peer-reviewed: 3-6 months
4. Improve the paper (if necessary)
5. Give a talk or poster presentation at the conference: 15 min.
6. Optionally: repeat steps 1-5 until PhD: 3-5 years
Step 5: Refinement
Discuss your work with your colleagues, your professor, and other colleagues at conferences. Simplify and generalize your approach; collaborate with other people (also in other fields).
Scientific Research
This was the big picture; today's focus is on best practices in experimentation. What do you think are the (desired) properties of a good scientific experiment? Experiments should allow you to validate or falsify competing hypotheses. Current trends: make the data available for review and criticism, and the same for software (open source).
Challenges
Reproducibility is sometimes not easy to guarantee. Any ideas why?
- Randomized components/noise (counter with the law of large numbers and statistical tests)
- Experiments require special hardware (self-built, unique robots; expensive lab equipment)
- Experiments cost time ((video) demonstrations will suffice)
- Technology changes fast
Benchmarks
An effective and affordable way of conducting experiments: a sample of a task domain with well-defined performance measurements. Widely used in computer vision and robotics. Which benchmark problems do you know? (A classic example from image processing is the Lenna test image: http://www.cs.cmu.edu/~chuck/lennapg/)
RoboCup Initiative
Evaluation of full-system performance, including perception, planning, and control. Easy to understand, high publicity. Goal: "By mid-21st century, a team of fully autonomous humanoid robot soccer players shall win the soccer game, complying with the official rule of the FIFA, against the winner of the most recent World Cup."
SLAM Evaluation
- Intel dataset: laser + odometry [Haehnel, 2004]
- New College dataset: stereo + omni-directional vision + laser + IMU [Smith et al., 2009]
- TUM RGB-D dataset [Sturm et al., 2011/12]: RGB-D dataset with ground truth for SLAM evaluation; two error metrics proposed (relative and absolute error); online + offline evaluation tools; training datasets (fully available) and validation datasets (ground truth not publicly available, to avoid overfitting)
Recorded Scenes
Various scenes (handheld/robot-mounted, office, industrial hall, dynamic objects, ...), with large variations in camera speed, camera motion, illumination, and environment size.
Dataset Acquisition
- Motion capture system: camera pose (100 Hz)
- Microsoft Kinect: color images (30 Hz), depth maps (30 Hz), IMU (500 Hz)
Calibration
Calibration of the overall system is not trivial:
1. Mocap calibration
2. Kinect-mocap calibration
3. Time synchronization
Calibration: Validation
Intrinsic calibration; extrinsic calibration (color + depth); time synchronization (color + depth). The mocap system slowly drifts (it needs re-calibration every hour). Validation experiments check the quality of the calibration: 2 mm length error on a 2 m rod across the mocap volume; 4 mm RMSE on a checkerboard sequence.
Sequence description (on the website): "For this sequence, the Kinect was pointed at a typical desk in an office environment. This sequence contains only translatory motions along the principal axes of the Kinect, while the orientation was kept (mostly) fixed. This sequence is well suited for debugging purposes, as it is very simple."
Dataset Website
In total: 39 sequences (19 with ground truth). One ZIP archive per sequence, containing:
- Color and depth images (PNG)
- Accelerometer data (timestamp ax ay az)
- Trajectory file (timestamp tx ty tz qx qy qz qw)
Evaluation Metrics
Compare the ground-truth trajectory with the (pre-aligned) estimated trajectory and average the error over all time steps. Reference implementations for both evaluation metrics are available; output: RMSE, mean, and median (as text) and an optional plot (png/pdf).
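A minimal sketch of the absolute-error metric under the assumption that the estimated trajectory has already been aligned to the ground truth (as stated above); this is not the benchmark's reference implementation:

import numpy as np

def ate_rmse(gt, est):
    """Absolute trajectory error (RMSE) between ground-truth and
    pre-aligned estimated positions (both N x 3), averaged over
    all time steps."""
    diff = gt - est
    return np.sqrt((diff ** 2).sum(axis=1).mean())

# usage with assumed toy trajectories
gt = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0]], dtype=float)
est = np.array([[0, 0.1, 0], [1, -0.1, 0], [2, 0.1, 0]], dtype=float)
print(ate_rmse(gt, est))   # 0.1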
Discussion on Benchmarks
Pro: provide an objective measure; simplify empirical evaluation; stimulate comparison.
Con: introduce a bias towards approaches that perform well on the benchmark (overfitting); evaluation metrics are not unique (many alternative metrics exist, and the choice is subjective).
Final Exam
Oral exam in teams (2-3 students), at least 15 minutes per student, with individual grades. Questions will address your project, material from the exercise sheets, and material from the lecture.
Consolidation: the problem is formalized; alternative approaches appear; need for quantitative evaluation and comparison.
Settled: benchmarks appear; solid scientific analysis, textbooks, ...
Exercise Sheet 6
Prepare the final presentation. Proposed structure: 4-5 slides
1. Title slide with names + motivating picture
2. Approach
3. Results (a video is a plus)
4. Conclusions (what did you learn in the project?)
5. Optional: future work, possible extensions