Visnav Lecture Notes
Visual Navigation for Flying Robots Lecture Notes Summer Term 2012
Lecturer: Dr. Jürgen Sturm Teaching Assistant: Nikolas Engelhard
http://vision.in.tum.de/teaching/ss2012/visnav2012
Acknowledgements
This slide set would not have been possible without the help and support of many other people. In particular, I would like to thank all my great colleagues who made their lecture slides available for everybody on the internet or sent them to me personally. My thanks go to (in alphabetical order) Alexander Kleiner, Andrew Davison, Andrew Zisserman, Antonio Torralba, Chad Jenkins, Cyrill Stachniss, Daniel Cremers, Giorgio Grisetti, Jan Peters, Jana Kosecka, Jörg Müller, Jürgen Hess, Kai Arras, Kai Wurm, Kurt Konolige, Li Fei-Fei, Maxim Likhachev, Margarita Chli, Nicholas Roy, Paul Newman, Richard Newcombe, Richard Szeliski, Roland Siegwart, Sebastian Thrun, Steve Seitz, Steven LaValle, Szymon Rusinkiewicz, Volker Grabe, Vijay Kumar, and Wolfram Burgard.
Table of Contents
Introduction
3D Geometry and Sensors
Probabilistic Models and State Estimation
Robot Control
Visual Motion Estimation
Simultaneous Localization and Mapping (SLAM)
Bundle Adjustment and Stereo Correspondence
Place Recognition, ICP, and Dense Reconstruction
Motion Planning
Planning under Uncertainty, Exploration and Coordination
Experimentation, Evaluation and Benchmarking
Organization
Tue 10:15-11:45: Lectures, discussions. Lecturer: Jürgen Sturm
Thu 14:15-15:45: Lab course, homework & programming exercises. Teaching assistant: Nikolas Engelhard
Course website: dates, additional material, exercises, deadlines
http://cvpr.in.tum.de/teaching/ss2012/visnav2012
Course Material
Probabilistic Robotics. Sebastian Thrun, Wolfram Burgard and Dieter Fox. MIT Press, 2005. Computer Vision: Algorithms and Applications. Richard Szeliski. Springer, 2010.
http://szeliski.org/Book/
Lecture Plan
1. Introduction 2. Robots, sensor and motion models 3. State estimation and control 4. Guest talks 5. Feature detection and matching 6. Motion estimation 7. Simultaneous localization and mapping 8. Stereo correspondence 9. 3D reconstruction 10. Navigation and path planning 11. Exploration 12. Evaluation and Benchmarking
Basics on mobile robotics
Advanced topics
Lab Course
Thu 14:15-15:45, given by Nikolas Engelhard
Exercises: room 02.09.23 (6 sessions, mandatory, homework discussion). Robot lab: room 02.09.34/36 (in weeks without exercises, in case you need help; recommended!)
Exercises Plan
Exercise sheets contain both theoretical and programming problems. 3 exercise sheets + 1 mini-project. Deadline: before the lecture (Tue 10:15). Hand in by email (visnav2012@cvpr.in.tum.de).
Safety Warning
Quadrocopters are dangerous objects. Read the manual carefully before you start. Always use the protective hull. If somebody gets injured, report to us so that we can improve the safety guidelines. If something gets damaged, report it to us so that we can fix it. NEVER TOUCH THE PROPELLERS. DO NOT TRY TO CATCH THE QUADROCOPTER WHEN IT FAILS: LET IT FALL/CRASH!
General background
Autonomous, automaton
self-willed (Greek, auto + matos)
Robot
coined by Karel Čapek in the 1923 play R.U.R. (Rossum's Universal Robots); from labor (Czech or Polish, robota) and workman (Czech or Polish, robotnik)
History
In 1966, Marvin Minsky at MIT asked his undergraduate student Gerald Jay Sussman to spend the summer linking a camera to a computer and getting the computer to describe what it saw. We now know that the problem is slightly more difficult than that. (Szeliski 2009, Computer Vision)
Roomba (2002)
Sensor: one contact sensor. Control: random movements. Over 5 million units sold.
Quadrocopters (2001-)
Flying Robots
Recently increased interest in flying robots
Shift of focus to different problems (control is much more difficult for flying robots, path planning is simpler, ...)
Quadrocopter Platforms
Commercial platforms
Ascending Technologies, Height Tech, Parrot Ardrone (used in the lab course)
Flying Principles
Fixed-wing airplanes
generate lift through forward airspeed and the shape of the wings; controlled by flaps
Helicopters/rotorcraft
main rotor for lift, tail rotor to compensate for torque; controlled by adjusting rotor pitch
Community/open-source projects
Mikrokopter, Paparazzi
For more, see http://multicopter.org/wiki/Multicopter_Table
Quadrocopter/quadrotor
four rotors generate lift; controlled by changing the speeds of rotation
Helicopter
A swash plate adjusts the pitch of the propellers cyclically, controlling pitch and roll; yaw is controlled by the tail rotor
Quadrocopter
Keep position: torques of all four rotors sum to zero; thrust compensates for earth's gravity
Maneuvers (by changing individual rotor speeds): ascend, descend, turn left, turn right, accelerate forward, accelerate backward
Autonomous Flight
Low-level control (not covered in this course): maintain attitude, stabilize, compensate for disturbances
Challenges
Limited payload: limited computational power, limited sensors, limited battery life
Fast dynamics, needs electronic stabilization; the quadrocopter is always in motion
Safety considerations
Robot Ethics
Where does the responsibility for a robot lie? How are robots motivated? Where are humans in the control loop? How might society change with robotics? Should robots be programmed to follow a code of ethics, if this is even possible?
Robot Ethics
Three Laws of Robotics (Asimov, 1942): A robot may not injure a human being or, through inaction, allow a human being to come to harm. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
Robot Design
Imagine that we want to build a robot that has to perform navigation tasks
Robot Hardware/Components
Sensors Actuators Control Unit/Software
How would you tackle this? What hardware would you choose? What software architecture would you choose?
Deliberative Paradigm (Sense-Plan-Act)
Inspired by methods from Artificial Intelligence (70s). Focus on automated reasoning and knowledge representation. STRIPS (Stanford Research Institute Problem Solver): perfect world model, closed-world assumption. Shakey: find boxes and move them to designated positions.
Reactive Paradigm
Sense-act type of organization. Multiple instances of stimulus-response loops (called behaviors). Each behavior uses local sensing to generate the next action. Combine several behaviors to solve complex tasks. Run behaviors in parallel; a behavior can override (subsume) the output of other behaviors.
Subsumption Architecture
Introduced by Rodney Brooks in 1986 Behaviors are networks of sensing and acting modules (augmented finite state machines) Modules are grouped into layers of competence Layers can subsume lower layers
[Diagram: subsumption architecture. Level 1 (Avoid): sonar sensors feed a feel-force module; force triggers runaway, which sets a heading and turns; collide triggers halt; otherwise move forward. Level 2 (Wander): adds look / heading-to-middle / stop-motion modules on top of the unchanged Level 1 network.]
Roomba Robot
Exercise: Model the behavior of a Roomba robot.
[Figure: potential field primitives: uniform, perpendicular, attractive, repulsive, tangential; combined in a sense-act loop driving the robot to the goal]
Robotic Middleware
Provides infrastructure: communication between modules, data logging facilities, tools for visualization. Several systems available:
Open source: ROS (Robot Operating System), Player/Stage, CARMEN, YARP, OROCOS. Closed source: Microsoft Robotics Studio.
[Diagram: example software architecture of an autonomous car. PERCEPTION: sensor interfaces and drivers (lasers 2-5, camera, radar, GPS), road finder (laser map, road center), localization module (GPS position, vehicle state), surface assessment (velocity limit). PLANNING & CONTROL: top-level control, global path planning, local path planning + collision avoidance, path planner (trajectory), steering control, actuator interfaces and drivers. VEHICLE INTERFACE: Touareg interface. USER INTERFACE: touch screen UI, wireless E-stop, pause/disable command. GLOBAL SERVICES: process controller (heart beats, power on/off), health monitor, data logger (data, file system), communication requests and channels, time server (clocks). All of this runs on the robot hardware.]
Communication Paradigms
Message-based communication: module A sends a message (e.g., containing variables x and y) to module B.
Forms of Communication
Push, pull, publisher/subscriber, publish to blackboard, remote procedure calls / service calls, preemptive tasks / actions
Push
Broadcast; one-way communication. Data is sent as it is generated by the producer P.
Pull
Data is delivered upon request by the consumer C (e.g., a map of the building). Useful if the consumer C controls the process and the data is not required (or available) at high frequency.
Publisher/Subscriber
The consumer C requests a subscription for the data from the producer P (e.g., a camera or GPS). The producer P sends the subscribed data as it is generated to C. Data is generated according to a trigger (e.g., sensor data, computations, other messages, ...).
Publish to Blackboard
The producer P sends data to the blackboard (e.g., a parameter server). A consumer C pulls data from the blackboard B. Only the last instance of the data is stored in the blackboard B.
Service Calls
The client C sends a request + input data to the server S. The server returns the result. The client waits for the result (synchronous communication). Also called: Remote Procedure Call.
Concepts in ROS
Nodes: programs that communicate with each other Messages: data structure (e.g., Image) Topics: typed message channels to which nodes can publish/subscribe (e.g., /camera1/image_color) Parameters: stored in a blackboard
Example: a camera_driver node publishes Image messages on a topic to which a face_detector node subscribes.
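As an illustration of the publisher/subscriber concept, here is a minimal rospy sketch (it assumes a ROS installation; everything except the topic name /camera1/image_color from the slide above is an invented example):

    # Minimal rospy subscriber sketch; node name is made up for illustration.
    import rospy
    from sensor_msgs.msg import Image

    def callback(msg):
        rospy.loginfo("received %dx%d image", msg.width, msg.height)

    rospy.init_node("face_detector")
    sub = rospy.Subscriber("/camera1/image_color", Image, callback)
    rospy.spin()  # process incoming messages until shutdown

The corresponding publisher side would create a rospy.Publisher on the same topic and call its publish() method whenever the camera driver produces a frame.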
Software Management
Package: atomic unit of building, contains one or more nodes and/or message definitions Stack: atomic unit of releasing, contains several packages with a common theme Repository: contains several stacks, typically one repository per institution
Useful Tools
roscreate-pkg, rosmake, roscore, rosnode list/info, rostopic list/echo, rosbag record/play, rosrun
Tutorials in ROS
Exercise Sheet 1
On the course website Solutions are due in 2 weeks (May 1st)
Summary
History of mobile robotics Brief intro on quadrocopters Paradigms in robotics Architectures and middleware
Exercise Sheet 1: theory part: define the motion model of a quadrocopter (will be covered next week); practical part: play back a bag file with data from the quadrocopter & plot the trajectory
Questions?
See you next week!
Organization: Lecture
Student request to change the lecture time to Tuesday afternoon due to conflicts with another course. Problem: at least 3 students who are enrolled in this lecture have time on Tuesday morning but not on Tuesday afternoon. Therefore: no change. Lectures are important; please choose which course to follow. Note: there are still students on the waiting list.
Today's Agenda
Linear algebra, 2D and 3D geometry, sensors
Vectors
Vector and its coordinates
Vector Operations
Scalar multiplication Addition/subtraction Length Normalized vector Dot product Cross product
17
Dot product: x^T y = |x| |y| cos θ; the dot product is zero if the vectors are orthogonal
Cross Product
a × b = [a]_x b, where [a]_x denotes the skew-symmetric matrix built from a. Verify that this matrix form reproduces the component-wise definition.
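To make these operations concrete, a small numpy sketch (the vectors are arbitrary examples):

    import numpy as np

    a = np.array([1.0, 0.0, 0.0])
    b = np.array([0.0, 1.0, 0.0])

    np.dot(a, b)        # dot product: 0 (vectors are orthogonal)
    np.linalg.norm(a)   # length: 1.0
    a / np.linalg.norm(a)  # normalized vector
    np.cross(a, b)      # cross product: [0, 0, 1]

    def skew(v):
        # skew-symmetric matrix [v]_x with skew(v) @ b == np.cross(v, b)
        return np.array([[0, -v[2], v[1]],
                         [v[2], 0, -v[0]],
                         [-v[1], v[0], 0]])

    assert np.allclose(skew(a) @ b, np.cross(a, b))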
Matrices
Rectangular array of numbers, with m rows and n columns
Matrices
Column vectors of a matrix
Geometric interpretation: for example, column vectors can form basis of a coordinate system
Matrices
Row vectors of a matrix
Matrices
Square matrix Diagonal matrix Upper and lower triangular matrix Symmetric matrix Skew-symmetric matrix (Semi-)positive definite matrix Invertible matrix Orthonormal matrix Matrix rank
Matrix Operations
Scalar multiplication, addition/subtraction, transposition, matrix-vector multiplication, matrix-matrix multiplication, inversion
Matrix-Vector Multiplication
Definition: Ab = b_1 a_1 + ... + b_n a_n, where the a_i are the column vectors of A
Geometric interpretation: a linear combination of the columns of A scaled by the coefficients of b → coordinate transformation
Matrix-Matrix Multiplication
Definition: C = AB with c_ij = Σ_k a_ik b_kj
Not commutative (in general): AB ≠ BA
Associative: (AB)C = A(BC)
Transpose: (AB)^T = B^T A^T
Matrix Inversion
If A is a square matrix of full rank, then there is a unique matrix B = A^{-1} such that AB = I holds. There are different ways to compute it, e.g., Gauss-Jordan elimination, LU decomposition, ... When A is orthonormal, then A^{-1} = A^T.
Geometric Primitives in 2D
2D point x = (x, y); augmented vector x̄ = (x, y, 1). Homogeneous vectors that differ only by scale represent the same 2D point; convert back to inhomogeneous coordinates by dividing through the last element.
2D line l = (a, b, c); 2D line equation: ax + by + c = 0 (i.e., x̄ · l = 0)
Normalized line equation vector l = (n_x, n_y, d) with ||(n_x, n_y)|| = 1, where d is the distance of the line to the origin.
Polar coordinates of a line: (θ, d) (e.g., used in the Hough transform for finding lines).
Line joining two points: l = x̃_1 × x̃_2. Intersection point of two lines: x̃ = l_1 × l_2.
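Both constructions come down to a single cross product, as this small numpy sketch shows (points chosen arbitrarily):

    import numpy as np

    p1 = np.array([0.0, 0.0, 1.0])   # point (0, 0) in homogeneous coords
    p2 = np.array([1.0, 1.0, 1.0])   # point (1, 1)
    l = np.cross(p1, p2)             # line through both points: y = x

    # line through (0, 1) and (1, 0): x + y - 1 = 0
    m = np.cross(np.array([0.0, 1.0, 1.0]), np.array([1.0, 0.0, 1.0]))

    x = np.cross(l, m)               # intersection point (homogeneous)
    x = x / x[2]                     # normalize: (0.5, 0.5, 1)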
Geometric Primitives in 3D
3D point (same as before): x = (x, y, z); augmented vector x̄ = (x, y, z, 1); homogeneous coordinates.
3D plane m = (a, b, c, d); 3D plane equation: ax + by + cz + d = 0; normalized plane with unit normal vector n̂ (||n̂|| = 1) and distance d.
3D line through the points p and q: infinite line: r = (1 − λ) p + λ q with λ ∈ ℝ; line segment joining p, q: λ ∈ [0, 1].
2D Planar Transformations
Translation: x' = x + t
Rotation + translation (2D rigid body motion, or 2D Euclidean transformation): x' = R x + t, where R is an orthonormal rotation matrix (R R^T = I, det R = 1); distances (and angles) are preserved
Scaled rotation/similarity transform: x' = s R x + t; angles are preserved
Affine transform: x' = A x̄ with an arbitrary 2×3 matrix A; parallel lines remain parallel
Projective/perspective transform: x̃' = H̃ x̃; note that H̃ is homogeneous (only defined up to scale), and the resulting coordinates are homogeneous and must be normalized; straight lines remain straight
2D Transformations
[Overview table: hierarchy of 2D transformations with their degrees of freedom and invariants]
3D Transformations
Translation: x' = x + t
Euclidean transform (translation + rotation; also called the Special Euclidean group SE(3)): x' = R x + t
3D Rotations
Representations: rotation matrix, Euler angles, axis/angle, unit quaternion
Rotation Matrix
Orthonormal 3×3 matrix: R R^T = I, det R = 1 (the special orthogonal group SO(3)); column vectors correspond to coordinate axes. Main disadvantage: over-parameterized (9 parameters instead of 3).
Euler Angles
Product of 3 consecutive rotations; the roll-pitch-yaw convention is very common in aerial navigation (DIN 9300). Yaw ψ, pitch θ, roll φ to rotation matrix: R = R_z(ψ) R_y(θ) R_x(φ). Advantages: minimal representation (3 parameters), easy interpretation.
Gimbal Lock
When two of the rotation axes align, one degree of freedom (DOF) is lost.
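As a concrete reference for the yaw-pitch-roll composition above, a small numpy sketch (angles in radians; the function name is mine):

    import numpy as np

    def rotation_from_euler(yaw, pitch, roll):
        # R = Rz(yaw) @ Ry(pitch) @ Rx(roll), roll-pitch-yaw convention
        cy, sy = np.cos(yaw), np.sin(yaw)
        cp, sp = np.cos(pitch), np.sin(pitch)
        cr, sr = np.cos(roll), np.sin(roll)
        Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
        Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
        Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
        return Rz @ Ry @ Rx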
Axis/Angle
Represent the rotation by a rotation axis n̂ and a rotation angle θ (4 parameters), or by the rotation vector ω = θ n̂ (3 parameters; its length is the rotation angle; also called the angular velocity when interpreted over time). Minimal but not unique (why? because (n̂, θ) and (−n̂, −θ) describe the same rotation).
Conversion
Rodrigues formula: R = I + sin θ [n̂]_x + (1 − cos θ) [n̂]_x²
Inverse: θ = arccos((tr(R) − 1) / 2); the axis follows from the skew-symmetric part of R.
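A direct numpy sketch of the Rodrigues formula and the angle recovery (function names are mine):

    import numpy as np

    def rodrigues(axis, angle):
        # R = I + sin(t) [n]_x + (1 - cos(t)) [n]_x^2, n = unit axis
        n = axis / np.linalg.norm(axis)
        K = np.array([[0, -n[2], n[1]],
                      [n[2], 0, -n[0]],
                      [-n[1], n[0], 0]])
        return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

    def angle_from_rotation(R):
        # inverse: rotation angle from the trace of R
        return np.arccos((np.trace(R) - 1.0) / 2.0)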
Exponential Twist
The exponential map can be generalized to Euclidean transformations (incl. translations): from the tangent space se(3) to the (special) Euclidean group SE(3), the group of all Euclidean transforms; the tangent element encodes the rigid body velocity (twist). Convert to homogeneous coordinates; the exponential map converts between se(3) and SE(3). There are also direct formulas (similar to Rodrigues).
Unit Quaternions
Quaternion q = (q_x, q_y, q_z, q_w); unit quaternions have ||q|| = 1. Opposite-sign quaternions (q and −q) represent the same rotation; otherwise the representation is unique.
Advantage: multiplication and inversion operations are really fast. Quaternion-quaternion multiplication: the Hamilton product.
Quaternion-vector multiplication (rotate point p with rotation q): p' = q ⊗ (0, p) ⊗ q̄, where q̄ is the conjugate (equal to the inverse for unit quaternions).
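A minimal sketch of both operations; note it uses the (w, x, y, z) ordering rather than the (x, y, z, w) ordering of the slides, and the function names are mine:

    import numpy as np

    def qmul(q, r):
        # Hamilton product of quaternions in (w, x, y, z) order
        w1, x1, y1, z1 = q
        w2, x2, y2, z2 = r
        return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                         w1*x2 + x1*w2 + y1*z2 - z1*y2,
                         w1*y2 - x1*z2 + y1*w2 + z1*x2,
                         w1*z2 + x1*y2 - y1*x2 + z1*w2])

    def rotate(q, p):
        # rotate point p by unit quaternion q: p' = q * (0, p) * conj(q)
        q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
        return qmul(qmul(q, np.concatenate([[0.0], p])), q_conj)[1:]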
3D to 2D Projections
Orthographic projections Perspective projections
3D to 2D Perspective Projection
3D point x = (X, Y, Z)^T (in the camera frame); 2D point (on the image plane). Pin-hole camera model: the point projects to (X/Z, Y/Z). Remember: the projected point is homogeneous and needs to be normalized.
Camera Intrinsics
So far, the 2D point is given in meters on the image plane; but we want the 2D point measured in pixels (as the sensor does). Need to apply a scaling/offset: multiply by the calibration matrix K (focal lengths f_x, f_y, principal point c_x, c_y).
Camera Extrinsics
Assume the 3D point is given in world coordinates; transform it from the world to the camera frame (this transformation is called the camera extrinsics) before projecting.
You really have to know 2D/3D transformations by heart (read Szeliski, Chapter 2).
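Putting the whole chain together, a minimal numpy sketch of the world-to-pixel projection (the intrinsic values are made-up examples):

    import numpy as np

    def project(X_world, R, t, K):
        # world point -> pixel: extrinsics, intrinsics, de-homogenization
        X_cam = R @ X_world + t      # world frame -> camera frame
        x = K @ X_cam                # apply intrinsics
        return x[:2] / x[2]          # normalize homogeneous coordinates

    K = np.array([[500.0, 0.0, 320.0],   # assumed example intrinsics
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    u, v = project(np.array([0.1, 0.0, 2.0]), np.eye(3), np.zeros(3), K)
    # u = 500 * 0.1 / 2 + 320 = 345, v = 240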
[Figure: example transform tree of coordinate frames: map, person, camera, rotor1, rotor2]
Sensors
Classification of Sensors
What they measure:
Proprioceptive sensors measure values internal to the system (robot); examples: battery status, motor speed, accelerations, ...
Exteroceptive sensors provide information about the environment; examples: compass, distance to objects, ...
How they measure:
Passive sensors measure energy coming from the environment.
Active sensors emit their proper energy and measure the reaction; better performance, but influence on the environment.
By category: tactile sensors (contact switches, bumpers, proximity sensors, pressure); wheel/motor sensors (potentiometers, brush/optical/magnetic/inductive/capacitive encoders, current sensors); heading sensors (compass, infrared, inclinometers, gyroscopes, accelerometers); ground-based beacons (GPS, optical or RF beacons, reflective beacons); active ranging (ultrasonic sensor, laser rangefinder, optical triangulation, structured light); motion/speed sensors (Doppler radar, Doppler sound); vision-based sensors (CCD/CMOS cameras, visual servoing packages, object tracking packages)
Sensors
Motor/wheel encoders, compass, gyroscope, accelerometers, GPS, range sensors, cameras
Motor/wheel encoders
Device for measuring angular motion; often used in (wheeled) robots. Output: position, speed (possibly integrate speed to get odometry).
Working principle: regular encoders count the number of transitions but cannot tell the direction; quadrature encoders use two sensors in quadrature phase shift, and the ordering of the rising edges tells the direction. Sometimes there is a reference pulse (or zero switch).
Magnetic Compass
Measures the earth's magnetic field. Inclination angle approx. 60° (Germany). Does not work indoors / is affected by metal. Alternative: gyro compass (spinning wheel that aligns with the earth's rotational poles; used on ships).
Sensing principle: Hall sensor; construction: 3 orthogonal sensors.
Magnetic Declination
Angle between magnetic north and true north; varies over time. Good news ;-): by 2050, the magnetic declination in central Europe will be zero.
Mechanical Gyroscope
Measures orientation (standard gyro) or angular velocity (rate gyro, needs integration to obtain an angle). A spinning wheel is mounted in a gimbal device (can move freely in 3 dimensions); the wheel keeps its orientation due to angular momentum (standard gyro).
Modern Gyroscopes
Vibrating structure gyroscopes (MEMS): based on the Coriolis effect; the vibration keeps its direction under rotation. Implementations: tuning fork, vibrating wheels, ...
Accelerometer
Measures all external forces acting upon it (including gravity); acts like a spring-damper system. To obtain the inertial acceleration (due to motion alone), gravity must be subtracted.
MEMS Accelerometers
Micro Electro-Mechanical Systems (MEMS): spring-like structure with a proof mass; damping results from residual gas. Implementations: capacitive, piezoelectric, ...
GPS
24+ satellites on 12-hour orbits at 20,190 km altitude; 6 orbital planes, 4+ satellites per orbit, 60° apart
Position from pseudoranges: requires measurements from 4 different satellites; low accuracy (3-15 m), but absolute
Each satellite transmits its orbital location + time at 50 bit/s; a message frame has 1,500 bits, and the complete navigation message takes 12.5 minutes
Range Sensors
Sonar, laser range finder
Emit a signal and determine the distance along a ray, making use of the propagation speed of ultrasound/light; the traveled distance is given by d = c · t / 2 (the signal travels to the obstacle and back). Sound speed: 340 m/s; light speed: 300,000 km/s.
Laser Scanner
Measures the phase shift of the returned signal. Pro: high precision, wide field of view, safety-approved for collision detection. Con: relatively expensive + heavy. Variants: 2D scanners, 3D scanners.
Camera
Let's design a camera. Idea 1: put a piece of film in front of an object. Do we get a reasonable image?
Add a barrier to block off most of the rays. This reduces blurring; the opening is known as the aperture. How does this transform the image?
Camera Lens
A lens focuses light onto the film. Rays passing through the optical center are not deviated; all rays parallel to the optical axis converge at the focal point.
There is a specific distance at which objects are in focus; other points project to a blur circle in the image.
Lens Distortions
Radial distortion of the image: caused by imperfect lenses; deviations are most noticeable for rays that pass through the edge of the lens.
Digital Cameras
Vignetting, de-bayering, rolling shutter and motion blur, compression (JPG), noise
Exercise Sheet 1
The odometry sensor on the Ardrone is an integrated package. Sensors: down-looking camera to estimate motion, ultrasonic sensor to get the height, 3-axis gyroscope, 3-axis accelerometer. IMU readings: horizontal speed (vx/vy), height (z), roll, pitch, yaw.
Summary
Linear algebra, 2D/3D geometry, sensors
Organization
Next week: three scientific guest talks presenting recent research results from our group (2011/12)
[Diagram: research results are first published as conference papers (ICRA, IROS, CVPR, ICCV, NIPS, ...) and later consolidated into journal articles]
Guest Talks
An Evaluation of the RGB-D SLAM System (F. Endres, J. Hess, N. Engelhard, J. Sturm, D. Cremers, W. Burgard), in Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA), 2012.
Real-Time Visual Odometry from Dense RGB-D Images (F. Steinbruecker, J. Sturm, D. Cremers), in Workshop on Live Dense Reconstruction with Moving Cameras at the Intl. Conf. on Computer Vision (ICCV), 2011.
Camera-Based Navigation of a Low-Cost Quadrocopter (J. Engel, J. Sturm, D. Cremers), submitted to the International Conference on Intelligent Robots and Systems (IROS), under review.
Perception
Perception and models are strongly linked.
Example: human perception (optical illusions); more at http://michaelbach.de/ot/index.html
State Estimation
Cannot observe world state directly Need to estimate the world state Robot maintains belief about world state Update belief according to observations and actions using models Sensor observations + sensor model Executed actions + action/motion model
State Estimation
What parts of the world state are (most) relevant for a flying robot? Position, velocity, obstacles, the map, positions and intentions of other robots/humans.
[Diagram: the sense-plan-act loop: perception → plan → execution → acting, interacting with the physical world]
Probabilistic Robotics
Sensor observations are noisy, partial, potentially missing (why?). All models are partially wrong and incomplete (why?). Usually we have prior knowledge (why?).
[Figure: graphical model relating the previous state, the current state, actions, and observations]
Probabilistic Robotics
Probabilistic sensor and motion models; integrate information from multiple sensors (multi-modal); integrate information over time (filtering).
Example
Discrete case: the probabilities of all states sum to one, Σ_x p(x) = 1; continuous case: the probability density integrates to one, ∫ p(x) dx = 1.
Conditional Independence
Definition of conditional independence: p(x, y | z) = p(x | z) p(y | z); equivalent to p(x | z) = p(x | y, z) and p(y | z) = p(y | x, z).
Marginalization
Discrete case: p(x) = Σ_y p(x, y)
Example: Marginalization
Continuous case: p(x) = ∫ p(x, y) dy
Expectation
The expected value E[X] = Σ_x x p(x) is the weighted average of all values a random variable can take on. Expectation is a linear operator: E[aX + b] = a E[X] + b.
Bayes Formula
p(x | y) = p(y | x) p(x) / p(y)
Normalization
Direct computation of p(y) can be difficult. Idea: compute the improper (unnormalized) distribution, normalize afterwards.
Step 1: compute the unnormalized posterior p(y | x) p(x) for every x
Step 2: compute the normalizer η = 1 / Σ_x p(y | x) p(x) (law of total probability)
Step 3: the posterior is p(x | y) = η p(y | x) p(x)
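A tiny numerical sketch of these three steps (the prior and likelihood values are made-up examples, not from the lecture):

    import numpy as np

    prior = np.array([0.2, 0.8])        # p(x): above landing zone / not
    likelihood = np.array([0.9, 0.3])   # p(z | x) for the observed z

    unnormalized = likelihood * prior   # step 1: improper distribution
    eta = 1.0 / unnormalized.sum()      # step 2: normalizer
    posterior = eta * unnormalized      # step 3: p(x | z), sums to 1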
Prior on the world state: p(x = above landing zone). Assume the robot observes light, i.e., z = light. What is the probability that the robot is above the landing zone? Apply the Bayes formula with the sensor model p(z | x).
Combining Evidence
Suppose our robot obtains another observation (either from the same or a different sensor). How can we integrate this new information? More generally, how can we estimate p(x | z_1, ..., z_n)?
Bayes formula gives us: p(x | z_1, ..., z_n) = η p(z_n | x, z_1, ..., z_{n−1}) p(x | z_1, ..., z_{n−1}).
If we know that z_n is conditionally independent of the earlier observations given x, this simplifies to the recursive update p(x | z_1, ..., z_n) = η p(z_n | x) p(x | z_1, ..., z_{n−1}).
In the example, the second observation lowers the probability that the robot is above the landing zone!
Actions (Motions)
Often the world is dynamic, since actions carried out by the robot, actions carried out by other agents, or just time passing by change the world.
Typical Actions
The quadrocopter accelerates by changing the speed of its motors; the position also changes when the quadrocopter does nothing (and drifts...). Actions are never carried out with absolute certainty. In contrast to measurements, actions generally increase the uncertainty of the state estimate.
Action Models
To incorporate the outcome of an action u into the current state estimate (belief), we use the conditional pdf p(x' | u, x). This term specifies the probability that executing the action u in state x will lead to state x'.
Example: Take-Off
Action u = take-off; world state x ∈ {ground, air}
[State transition diagram: from ground, take-off leads to air with probability 0.99 and fails (stays on the ground) with probability 0.01; from air, the robot stays in the air with probability 0.9 and ends up on the ground with probability 0.1]
Example: Take-Off
Prior belief on the robot state: P(x = ground) = 1 (the robot is located on the ground). The robot executes the take-off action. What is the robot's belief after one time step? Apply the action model: P(x') = Σ_x p(x' | u, x) P(x) (continuous case: integral instead of sum).
Markov Chain
A Markov chain is a stochastic process where, given the present state, the past and the future states are independent.
Markov Assumption
Observations depend only on the current state; the current state depends only on the previous state and the current action. Underlying assumptions: static world, independent noise, perfect model, no approximation errors.
Bayes Filter
Given: a stream of observations z and actions u, the sensor model p(z | x), the action model p(x' | u, x), and the prior probability of the system state p(x).
Wanted: an estimate of the state x of the dynamic system. The posterior of the state is also called the belief: Bel(x_t) = p(x_t | u_1, z_1, ..., u_t, z_t).
For each time step, do:
1. Apply the motion model: Bel'(x_t) = Σ_{x_{t−1}} p(x_t | u_t, x_{t−1}) Bel(x_{t−1})
2. Apply the sensor model: Bel(x_t) = η p(z_t | x_t) Bel'(x_t)
Note: Bayes filters also work on continuous state spaces (replace the sum by an integral).
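A minimal histogram (discrete) Bayes filter sketch for a 1D corridor; the motion and sensor model values are made-up examples in the spirit of the localization example below, not the lecture's exact numbers:

    import numpy as np

    def bayes_filter_step(belief, motion_kernel, likelihood):
        # 1. motion model: convolve belief with p(x_t | u_t, x_{t-1});
        #    with np.convolve, the last kernel entry shifts mass one cell east
        predicted = np.convolve(belief, motion_kernel, mode="same")
        # 2. sensor model: multiply by p(z_t | x_t) and normalize
        posterior = likelihood * predicted
        return posterior / posterior.sum()

    belief = np.zeros(10); belief[0] = 1.0         # known start cell
    motion = np.array([0.1, 0.3, 0.6])             # p(move -1), p(stay), p(move +1)
    z_likelihood = np.full(10, 0.1); z_likelihood[1] = 0.6  # marker at cell 1
    belief = bayes_filter_step(belief, motion, z_likelihood)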
Example: Localization
Discrete state; the belief distribution can be represented as a grid. This is also called a histogram filter.
Action: the robot can move one cell in each time step; actions are not perfectly executed. Example: move east (60% success rate, 10% each to stay / move too far / move one up / move one down).
Observation: one (special) location has a marker; the marker is sometimes also detected in the neighboring cells.
Let's start a simulation run (shades are hand-drawn, not exact!).
t=0: prior distribution (initial belief); assume we know the initial location (if not, we could initialize with a uniform prior).
t=1, u=east, z=no-marker: Bayes filter step 1: apply the motion model; step 2: apply the observation model.
t=2, u=east, z=marker: step 1: apply the motion model; step 2: apply the observation model.
Kalman Filter
Bayes filter with continuous states; the state is represented with a normal distribution. Developed in the late 1950s. The Kalman filter is very efficient (only requires a few matrix operations per time step). Applications range from economics, weather forecasting, and satellite navigation to robotics and many more. It is the most relevant Bayes filter variant in practice → exercise sheet 2.
Normal Distribution
Univariate: N(x; μ, σ²) = (1 / √(2π σ²)) exp(−(x − μ)² / (2σ²))
Multivariate: N(x; μ, Σ) = det(2π Σ)^{−1/2} exp(−(x − μ)^T Σ^{−1} (x − μ) / 2)
Linear Process Model
Assume the state evolves according to a linear difference equation x_t = A x_{t−1} + B u_t + ε_t, with zero-mean, normally distributed process noise ε_t ~ N(0, Q).
Linear Observations
Further, assume we make observations that depend linearly on the state and that are perturbed by zero-mean, normally distributed observation noise: z_t = C x_t + δ_t, with δ_t ~ N(0, R).
Kalman Filter
Estimates the state of a discrete-time controlled process that is governed by the linear stochastic difference equation x_t = A x_{t−1} + B u_t + ε_t, with the measurement equation z_t = C x_t + δ_t, where ε_t ~ N(0, Q) and δ_t ~ N(0, R). The initial belief is Gaussian: Bel(x_0) = N(x_0; μ_0, Σ_0).
For each time step, do:
1. Apply the motion model (prediction): μ̄_t = A μ_{t−1} + B u_t, Σ̄_t = A Σ_{t−1} A^T + Q
2. Apply the sensor model (correction): μ_t = μ̄_t + K_t (z_t − C μ̄_t), Σ_t = (I − K_t C) Σ̄_t, with the Kalman gain K_t = Σ̄_t C^T (C Σ̄_t C^T + R)^{−1}
For the interested reader: see Probabilistic Robotics (Chapter 3) for the full derivation.
Highly efficient: polynomial in the measurement dimensionality k and the state dimensionality n. Optimal for linear Gaussian systems! But most robotics systems are nonlinear!
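The two steps translate almost literally into code; a minimal sketch of one filter cycle with the linear models defined above:

    import numpy as np

    def kalman_step(mu, Sigma, u, z, A, B, C, Q, R):
        # 1. prediction (motion model)
        mu_bar = A @ mu + B @ u
        Sigma_bar = A @ Sigma @ A.T + Q
        # 2. correction (sensor model)
        K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + R)
        mu_new = mu_bar + K @ (z - C @ mu_bar)
        Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_bar
        return mu_new, Sigma_new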
Taylor Expansion
Solution for nonlinear systems: linearize both functions. Motion function: g(u_t, x_{t−1}) ≈ g(u_t, μ_{t−1}) + G_t (x_{t−1} − μ_{t−1}), with the Jacobian G_t. Observation function: h(x_t) ≈ h(μ̄_t) + H_t (x_t − μ̄_t), with the Jacobian H_t.
Example
2D case: state x = (x, y, ψ); odometry as control input; observations of a visual marker (relative to the robot pose). Motion function g and its derivative; observation function h (→ Sheet 2).
Example
Dead reckoning (no observations): large process noise in x+y; even larger drift with process noise in x+y+yaw.
Now with observations (limited visibility), assuming the robot knows its correct starting pose. What if the initial pose (x+y) is wrong? What if the initial pose (x+y+yaw) is wrong? If we are aware of a bad initial guess, we set the initial sigma to a large value (large uncertainty).
Summary
Observations and actions are inherently noisy; knowledge about the state is inherently uncertain. Probability theory; probabilistic sensor and motion models; Bayes filter, histogram filter, Kalman filter; examples.
Organization - Exam
Oral exams in teams (2-3 students); at least 15 minutes per student, individual grades. Questions will address: material from the lecture, material from the exercise sheets, your mini-project.
Control Architecture
[Diagram: cascaded loops from trajectory via position control, attitude control, and motor speed control to the actuators; localization, attitude estimation, and RPM estimation provide the feedback; the robot interacts with the physical world through its kinematics and dynamics]
DC Motors
Maybe you built one in school: a stationary permanent magnet, an electromagnet induces torque, and a split ring switches the direction of the current.
Brushless Motors
Used in most quadrocopters. Permanent magnets on the axis, electromagnets on the outside. Requires a motor controller to switch the currents. Does not require brushes (less maintenance).
I2C Protocol
Serial data line (SDA) + serial clock line (SCL); all devices connected in parallel; 7-10 bit addresses, 100-3400 kbit/s; used by the Mikrokopter for motor control.
Control Architecture
[Diagram: the controllers command forces and torques to the actuators; sensors observe position, velocity, and acceleration of the robot in the physical world]
Dynamics
Actuators induce forces and torques; forces induce linear acceleration (a = F/m); torques induce angular acceleration. What types of forces do you know? What types of torques do you know?
Example: 1D Kinematics
State x = (p, v); action u (acceleration); process model: p' = p + v Δt, v' = v + u Δt → Kalman filter. How many states do we need for 3D?
Forces
Gravity; friction: stiction (static friction) and damping (viscous friction).
Torques
Torque (Drehmoment): τ = r × F. Torques sum up. A torque results in angular acceleration: τ = I ω̇ (with I the moment of inertia). Friction: same as before.
Dynamics of a Quadrocopter
Each propeller induces a force and a torque by accelerating air; gravity pulls the quadrocopter downwards.
Vertical Acceleration
The total thrust is the sum of the four rotor forces: m z̈ = F_1 + F_2 + F_3 + F_4 − m g
Attitude (Roll/Pitch)
A thrust difference between opposite rotors induces a torque around the corresponding horizontal body axis and thus changes the attitude.
Yaw
Each propeller also induces a torque due to its rotation and the interaction with the air; unbalancing the induced torques of the two counter-rotating rotor pairs yields an angular acceleration around the vertical axis.
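A minimal sketch of how a desired total thrust and the three body torques map to the four squared rotor speeds on a '+'-configured quadrocopter; the thrust/drag coefficients, arm length, and rotor layout are assumed placeholders, not values from the lecture:

    import numpy as np

    # '+' configuration: rotors 0/2 on the x-axis, 1/3 on the y-axis;
    # each rotor i produces force f_i = k_f * w_i^2 and a yaw torque
    # +-k_m * w_i^2 whose sign depends on its spin direction.
    k_f, k_m, l = 1e-5, 1e-7, 0.25   # assumed coefficients and arm length

    M = np.array([
        [ k_f,    k_f,    k_f,    k_f  ],  # total thrust
        [ 0,      l*k_f,  0,     -l*k_f],  # roll torque
        [-l*k_f,  0,      l*k_f,  0    ],  # pitch torque
        [ k_m,   -k_m,    k_m,   -k_m  ],  # yaw torque (alternating spin)
    ])

    def rotor_speeds(thrust, tau):
        # squared rotor speeds realizing the desired thrust and torques
        w_sq = np.linalg.solve(M, np.concatenate([[thrust], tau]))
        return np.sqrt(np.clip(w_sq, 0.0, None))  # speeds cannot be negative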
Cascaded Control
[Diagram: trajectory → position control → attitude control → motor speed control → actuators; localization, attitude estimation, and RPM estimation close the loops at the level of position, velocity, acceleration, forces, and torques; the robot interacts with the physical world through its kinematics and dynamics]
Example: Temperature Control
[Figure: a controller drives the measured temperature (starting at 25°) towards the desired value of 35°; a biased sensor (reading 45° instead of 35°) makes the loop settle at the wrong temperature]
Measurement Noise
What effect does noise in the measurements have on the control output?
Saturation
In practice, the set of admissible controls u is often bounded; this is called (control) saturation.
Block Diagram
[Diagram: feedback loop: controller → plant → measurement, fed back to the controller]
Delays
In practice, most systems have delays, which can lead to overshoots/oscillations/destabilization.
Delays
What is the total dead time of this system? E.g., a 100 ms delay in the water pipe, plus the delays of the controller, the plant, and the measurement.
Smith Predictor
Allows for higher gains; requires an (accurate) model of the plant.
[Diagram: the controller runs against a delay-free plant model plus a delay model, in parallel to the real plant with delay]
If the plant model is available and the delay (e.g., 5 seconds) is known exactly, this results in perfect compensation. Why is this unrealistic in practice? The time delay (and the plant model) is often not known accurately (or changes over time). What happens if the time delay is overestimated? What happens if it is underestimated?
Position Control
[Diagram: the position controller receives the next waypoint and the localization estimate and commands forces/torques to the actuators]
P Control
What happens for the control law u = K_p (x_des − x)? This is called proportional control.
PD Control
What happens for the control law u = K_p e + K_d ė, with error e = x_des − x? This is called proportional-derivative control. What if we set higher gains? What if we set lower gains? What happens when we add gravity?
Gravity Compensation
Add gravity as an additional term in the control law; any known (inverse) dynamics can be included.
PD Control
What happens when we have systematic errors (noise with non-zero mean)? Example: unbalanced quadrocopter, wind, ... Does the robot ever reach its desired location?
PID Control
Idea: estimate the system error (bias) by integrating the error. Proportional + derivative + integral control: u = K_p e + K_d ė + K_i ∫ e dt. For steady-state systems, this can be reasonable; otherwise, it may create havoc or even disaster (wind-up effect).
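To make the three terms concrete, a minimal, self-contained PID sketch; the gains and the anti-windup clamp are illustrative choices, not values from the lecture:

    class PID:
        # u = Kp*e + Kd*de/dt + Ki*integral(e), with a simple clamp on the
        # integral term to limit the wind-up effect mentioned above
        def __init__(self, kp, kd, ki, i_max=1.0):
            self.kp, self.kd, self.ki = kp, kd, ki
            self.i_max = i_max
            self.e_prev, self.e_int = 0.0, 0.0

        def update(self, desired, measured, dt):
            e = desired - measured
            self.e_int = max(-self.i_max, min(self.i_max, self.e_int + e * dt))
            de = (e - self.e_prev) / dt
            self.e_prev = e
            return self.kp * e + self.kd * de + self.ki * self.e_int

Clamping the integral state is one simple guard against wind-up; another common one is to stop integrating while the output is saturated.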
De-coupled Control
So far, we considered only single-input, single-output (SISO) systems; real systems have multiple inputs + outputs: MIMO (multiple-input, multiple-output). In practice, control is often de-coupled: one controller per output, all acting on the same plant.
Example: Ardrone
Cascaded control: the inner loop runs on the embedded PC and stabilizes the flight; the outer loop runs externally and implements position control.
[Diagram: the outer loop on the laptop talks to the Ardrone wirelessly at approx. 15 Hz; the inner loop runs onboard at 1000 Hz; the Ardrone with its inner loop is seen as the plant by the outer loop]
Mechanical Equivalent
PD control is equivalent to adding spring-dampers between the desired values and the current position.
Optimal Control
What other control techniques exist? Linear-quadratic regulator (LQR), reinforcement learning, inverse reinforcement learning, ... and many more.
Find the controller that provides the best performance. This requires a measure of performance: what would be a good one? Minimize the error? Minimize the controls? A combination of both? Goal: find the controller with the lowest cost → LQR control.
Reinforcement Learning
In principle, any measure can be used: define a reward for each state-action pair and find the policy (controller) that maximizes the expected future reward. Compute the expected future reward based on a known process model or a process model learned from demonstrations.
Outlook: map a previously unknown building and find good exploration frontiers in the partial map; move in formation (e.g., to traverse a window), avoid collisions, dynamic role switching.
Guest Talks
Versatile Distributed Pose Estimation and Sensor Self-Calibration for an Autonomous MAV (Stephan Weiss, Markus W. Achtelik, Margarita Chli, Roland Siegwart): IMU + camera; EKF for pose, velocity, sensor bias, scale, and inter-sensor calibration.
On-board Velocity Estimation and Closed-loop Control of a Quadrotor UAV based on Optical Flow (Volker Grabe, Heinrich H. Bülthoff, and Paolo Robuffo Giordano): ego-motion from optical flow using the homography constraint; used for velocity control.
Autonomous Landing of a VTOL UAV on a Moving Platform Using Image-based Visual Servoing (Daewon Lee, Tyler Ryan and H. Jin Kim): tracking and landing on a moving platform; switching between tracking and landing behavior.
ICRA Papers
Will put them in our paper repository; remember the password (or ask by mail); see the course website.
Combine PTAM with the Kinect: monocular SLAM suffers from scale drift; the Kinect has a small maximum range.
Organization: Exam
Registration deadline: June 30. Course ends: July 19. Examination dates: t.b.a. (mid August). Oral team exam; sign up for a time slot starting from mid July; the list will be placed on the blackboard in front of our secretary's office.
Motivation
[Figure series: rigid body motion of the camera, illustrated with x/z coordinate axes]
3D to 2D Perspective Projection
3D point (in the camera frame), 2D point (on the image plane): pin-hole camera model. Remember that the projected point is homogeneous and needs to be normalized (recap of the earlier lecture).
Camera Intrinsics
So far, the 2D point is given in meters on the image plane; but we want the 2D point measured in pixels (as the sensor does). Need to apply a scaling/offset (the calibration matrix).
Image Plane
Pixel coordinates on the image plane.
Image Functions
We can think of an image as a function I(x, y) that gives the intensity at position (x, y). Color images are vector-valued functions. Discrete case (default in this course) vs. continuous case.
Realistically, the image function is only defined on a rectangle and has a finite range, e.g., I: [0, W] × [0, H] → [0, 255].
Example
[Figure: grids of pixel intensity values of two example image patches]
Digital Images
Light intensity is sampled by a CCD/CMOS sensor on a regular grid. The electric charge of each cell is quantized and gamma-compressed (for historical reasons; CRT monitors apply the inverse). Almost all images are gamma-compressed: double brightness results in only a 37% higher intensity value (!)
Aliasing
High frequencies in the scene and a small fill factor on the chip can lead to (visually) unpleasant effects.
Rolling Shutter
Most CMOS sensors have a rolling shutter Rows are read out sequentially Sensitive to camera and object motion Can we correct for this?
Image Filtering
We want to remove unwanted sources of variation, and keep the information relevant for whatever task we need to solve
Linear Filtering
Each output pixel is a linear combination of the input values; in matrix form: g = H f
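Written out per pixel, this is a cross-correlation of the image with a kernel; a minimal sketch (zero padding at the borders; function name is mine):

    import numpy as np

    def correlate2d(F, H):
        # each output pixel is a linear combination of input pixels,
        # weighted by the kernel H
        kh, kw = H.shape
        ph, pw = kh // 2, kw // 2
        P = np.pad(F.astype(float), ((ph, ph), (pw, pw)))  # zero padding
        G = np.zeros(F.shape, dtype=float)
        for i in range(F.shape[0]):
            for j in range(F.shape[1]):
                G[i, j] = np.sum(H * P[i:i + kh, j:j + kw])
        return G

In practice one would use an optimized routine (e.g., scipy.ndimage.correlate), but the loop makes the "linear combination" definition explicit.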
Example
[Figure: a 3×3 filter with rows (−1 −1 −1), (2 2 2), (−1 −1 −1) applied to a patch of pixel intensity values]
Important Filters
Edges: finite difference filter, derivative filter, oriented filters, Gabor filter
[Figure: the horizontal bar filter (−1 −1 −1; 2 2 2; −1 −1 −1) applied to an example patch; responses are large near horizontal structures, boundary pixels are undefined]
Impulse
[Figure: convolving an image with an impulse kernel reproduces the image; a shifted impulse translates the image (here by 2 pixels)]
Image Rotation
Image rotation is a linear operation (why?), but not a spatially invariant operation (why?). There is no convolution that implements it.
Rectangular (Box) Filter
[Figure: smoothing with rectangular filters]
Gaussian Blur
Gaussian kernel: G_σ(x, y) = (1 / (2π σ²)) exp(−(x² + y²) / (2σ²))
[Figure: blur results for σ = 1, 2, 4]
Image Gradient
The image gradient ∇I = (∂I/∂x, ∂I/∂y) points in the direction of increasing intensity (steepest ascent).
Image Gradient
Gradient direction θ = atan2(∂I/∂y, ∂I/∂x) (related to the edge orientation).
How can we differentiate a digital image ? Option 1: Reconstruct a continuous image, then take gradient Option 2: Take discrete derivative (finite difference filter) Option 3: Convolve with derived Gaussian (derivative filter)
Finite difference
First-order central difference
-.5
.5
39
40
Finite difference
First-order central difference (half pixel)
Second-order Derivative
Differentiate again to get second-order central difference
-1
-2
41
42
75
Example
[Figure: first-order derivative filters (−1, 1) applied to an example image]
Problem Statement
Given: two camera images. Goal: estimate the camera motion.
For the moment, let's assume that the camera only moves in the xy-plane, i.e., a 2D translation; the extension to 3D motion follows.
General Idea
1. Define an error metric that measures how well the two images match given a motion vector. 2. Find the motion vector with the lowest error.
A common choice is the sum of squared differences (SSD): E(u) = Σ_i (I_1(x_i + u) − I_0(x_i))²
Windowed SSD
Images (and image patches) have finite size. Standard SSD has a bias towards smaller overlaps (fewer error terms). Solution: divide by the overlap area → root mean square error.
Exposure Differences
Images might be taken with different exposure (auto shutter, white balance, ...). Bias and gain model: I_1(x + u) = α I_0(x) + β; with SSD, exposure differences then show up as errors even for the correct motion.
Cross-Correlation
Maximize the product instead of minimizing the differences: CC(u) = Σ_i I_0(x_i) I_1(x_i + u)
Gradient Descent
Perform gradient descent on the SSD energy function (Lucas and Kanade, 1981): take a Taylor expansion of the energy function around the current estimate and solve the resulting linear system for the motion update.
Coarse-to-fine: estimate the motion on a coarse pyramid level, then use it as initialization for the next finer level.
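A minimal sketch of one such Lucas-Kanade (Gauss-Newton) step for a single patch; in practice this is iterated with warping and embedded in the coarse-to-fine pyramid, and the variable names are mine:

    import numpy as np

    def lk_step(I0, I1, Ix, Iy):
        # Linearize I1(x + u) ~ I1(x) + Ix*u_x + Iy*u_y and minimize the
        # SSD, giving the 2x2 normal equations (A^T A) u = A^T b with
        # b = I0 - I1. Requires a textured patch so that A^T A is
        # well-conditioned (cf. corner detection below).
        A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
        b = (I0 - I1).ravel()
        return np.linalg.solve(A.T @ A, A.T @ b)  # motion update (u_x, u_y)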
The same linear system also yields an uncertainty estimate of the motion in the form of a covariance (e.g., useful for a Kalman filter).
Image Patches
Sometimes we are interested in the motion of small image patches. Problem: some patches are easier to track than others. Which patches are easy/difficult to track? How can we recognize good patches?
Example
Let's look at the shape of the energy functional.
Corner Detection
Idea: inspect the eigenvalues of the Hessian (second-moment matrix) of the energy function; a patch is good to track if both eigenvalues are large.
Corner Detection
1. For all pixels, compute the corner strength. 2. Non-maximal suppression (e.g., sort by strength; a strong corner suppresses weaker corners within a circle of radius r).
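A compact sketch of computing such a corner-strength map (the Harris response; the constant k = 0.04 and the Gaussian window width are conventional choices, not values from the slides):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def harris_response(I, sigma=1.0, k=0.04):
        # second-moment matrix M sums Ix^2, Iy^2, Ix*Iy over a Gaussian
        # window; R = det(M) - k * trace(M)^2 is large at corners
        Iy, Ix = np.gradient(I.astype(float))
        Ixx = gaussian_filter(Ix * Ix, sigma)
        Iyy = gaussian_filter(Iy * Iy, sigma)
        Ixy = gaussian_filter(Ix * Iy, sigma)
        det = Ixx * Iyy - Ixy ** 2
        trace = Ixx + Iyy
        return det - k * trace ** 2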
Other Detectors
Förstner detector (localizes corners with sub-pixel accuracy); FAST corners (learned decision tree minimizes the number of tests → super fast); Difference of Gaussians / DoG (scale-invariant detector)
[Figure: strongest responses before and after non-maximal suppression]
Example
The KLT tracker is highly efficient (real-time on the CPU) but provides only sparse motion vectors; dense optical flow methods require a GPU.
3D Motion Estimation
(How) Can we recover the camera motion from the estimated flow field?
Assumptions
1. The quadrocopter moves slowly relative to the sampling rate → limited search radius. 2. The observed scene is planar, with plane normal n and distance d.
The matrix H = [ω]_x + (1/d) v n^T is called the continuous homography matrix. Note: H contains both the linear/angular velocity and the scene structure.
The KLT tracker estimates the motion of each feature track in the image. Constraint: the measured image motion of every tracked feature must be consistent with the continuous homography.
Approach
Result: for all observed motions in the image, the continuous homography constraint holds. How can we use this to estimate the camera motion?
1. Estimate H. 2. Recover the motion (v, ω) and the scene structure (n, d) from H.
Step 1: Estimate H
The continuous homography constraint is linear in the entries of H: stack H into a vector h, collect the constraints from several feature tracks, and solve the resulting linear system of equations for h (least squares).
Evaluation
Comparison of the estimated velocities with ground truth from a motion capture system:

Algorithm         Norm error   Std. deviation
Pure vision       0.134        0.094
Ang. vel. known   0.117        0.093
Normal known      0.113        0.088
Commercial Solutions
Helicommand 3D from Robbe: 2(?) cameras, IMU, air pressure sensor, 450 EUR. Parrot mainboard + navigation board: 1 camera, IMU, ultrasonic sensor, 210 USD.
Joggobot
Follows a person wearing a visual marker
[http://exertiongameslab.org/projects/joggobot]
Visual Navigation for Flying Robots: Simultaneous Localization and Mapping (SLAM)
Dr. Jürgen Sturm
SLAM Applications
SLAM is central to a range of indoor, outdoor, in-air and underwater applications for both unmanned and autonomous vehicles. Examples: at home (vacuum cleaner, lawn mower), air (inspection, transportation, surveillance), underwater (reef/environmental monitoring), underground (search and rescue), space (terrain mapping, navigation).
In computer vision, the problem is known as Structure from Motion (SfM), sometimes: Structure and Motion; monocular/stereo camera, sometimes uncalibrated sensors (e.g., Flickr images).
Next week: focus on optimization (bundle adjustment), stereo cameras, Kinect. In two weeks: map representations, mapping and (dense) 3D reconstruction.
[Figure: two views with too little overlap: no chance to match!]
Harris Detector
Rotation invariance? The ellipse of the second-moment matrix rotates, but its shape (i.e., its eigenvalues) remains the same → the corner response R is invariant to rotation.
Invariance to intensity change? Partial invariance to additive and multiplicative intensity changes: only derivatives are used → invariance to an intensity shift I → I + b; for an intensity scale I → a·I, the fixed intensity threshold on the local maxima of R means only partial invariance.
Invariant to scaling? No: the Harris detector is not invariant to image scale.
SIFT Detector
Search for a local maximum in both space and scale.
[Figure: a feature matched across two images at different scales (scale 1 vs. scale 1/2)]
SIFT Detector
1. Detect maxima in scale-space (built by repeated resample, blur, subtract steps → Difference of Gaussians). 2. Non-maximum suppression. 3. Eliminate edge points (check the ratio of eigenvalues). 4. For each maximum, fit a quadratic function and compute the center at sub-pixel accuracy.
Example
1. Input image, 233×189 pixels. 2. 832 candidate DoG minima/maxima (visualization indicates scale, orientation, and location). 3. 536 keypoints remain after thresholding on minimum contrast and principal curvature.
Feature Matching
Now we know how to find repeatable corners; next question: how can we match them?
Template Convolution
Extract a small patch around the corner as a template and compare it against the other image.
Invariances: scaling: no; rotation: no (maybe rotate the template?); illumination: no (use a bias/gain model?); perspective projection: not really.
SIFT Features
[Figure: SIFT feature matching examples]
SIFT Descriptor
Compute image gradients over a 16×16 window (green), weighted with a Gaussian kernel (blue). Create a 4×4 array of orientation histograms, each consisting of 8 bins. In total, the SIFT descriptor has 4 × 4 × 8 = 128 dimensions.
Feature Matching
Given features in I_1, how do we find the best match in I_2? Define a distance function that compares two features; test all the features in I_2 and find the one with the minimal distance.
Feature Distance
How to define the difference between two features? A simple approach is the Euclidean distance (or SSD). Problem: it can give good scores to ambiguous (bad) matches.
Feature Distance
A better approach is the ratio distance ||f_1 − f_2|| / ||f_1 − f_2'||, where f_2 is the best and f_2' the second-best match of f_1 in I_2. Ambiguous matches get values close to 1 and can be rejected, while distinctive matches get small values.
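A small sketch of this ratio test (the 0.8 threshold is a commonly used value, not one given in the slides):

    import numpy as np

    def ratio_test_match(desc1, desc2, max_ratio=0.8):
        # keep a match only if the best distance is clearly smaller
        # than the second best; ambiguous matches (ratio near 1) drop out
        matches = []
        for i, f in enumerate(desc1):
            d = np.linalg.norm(desc2 - f, axis=1)  # distances to all features
            j, j2 = np.argsort(d)[:2]              # best and second best
            if d[j] < max_ratio * d[j2]:
                matches.append((i, j))
        return matches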
Efficient Matching
For feature matching, we need to answer a large number of nearest-neighbor queries. Strategies:
- Exhaustive search: compare the query against every feature
- Indexing (k-d tree): localize the query in the tree, then search nearby leaves until the nearest neighbor is guaranteed to be found; best-bin-first uses a priority queue for the unchecked leaves
- Approximate search: locality-sensitive hashing, approximate nearest neighbor
- Vocabulary trees
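A sketch of k-d-tree-based matching using SciPy (an assumed library choice; note that exact k-d trees degrade in 128 dimensions, which is one reason approximate methods and vocabulary trees are used in practice):

import numpy as np
from scipy.spatial import cKDTree

# Stand-in descriptor sets; in practice these come from the detector.
desc1 = np.random.rand(500, 128).astype(np.float32)
desc2 = np.random.rand(600, 128).astype(np.float32)

tree = cKDTree(desc2)
# Query the two nearest neighbors for every descriptor in desc1.
dists, idxs = tree.query(desc1, k=2)

# Keep only matches passing the ratio test.
good = dists[:, 0] < 0.8 * dists[:, 1]
matches = list(zip(np.nonzero(good)[0], idxs[good, 0]))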
Other Descriptors
- SIFT (Scale-Invariant Feature Transform) [Lowe, 2004]
- SURF (Speeded-Up Robust Features) [Bay et al., 2008]
- BRIEF (Binary Robust Independent Elementary Features) [Calonder et al., 2010]
- ORB (Oriented FAST and Rotated BRIEF) [Rublee et al., 2011]
Point Triangulation
Known camera poses; observe 2D point correspondences; compute the 3D point.
Camera Calibration
Given: 2D/3D correspondences (x_i, X_i). Wanted: a projection matrix M such that x_i = M X_i (up to scale).

Step 1: Estimate M
Each correspondence generates two equations (with m_k denoting the k-th row of M):
x_i (m_3 . X_i) = m_1 . X_i
y_i (m_3 . X_i) = m_2 . X_i
Re-arranged in matrix form, this gives a linear system A m = 0 in the entries of M.
Triangulation
Given: two cameras and a point correspondence. Wanted: the corresponding 3D point. The final error between measured and projected points is typically less than 0.02 pixels.
Triangulation
Where do we expect to see the corresponding point in the second image?
Epipolar Geometry
Consider two cameras that observe a 3D world point
Epipolar Geometry
The line connecting both camera centers is called the baseline. Given the image of a point in one view, what can we say about its position in the other? A point in one image generates a line in the other image, called the epipolar line of x.
Epipolar Geometry
Left ray in the left camera frame: X_L = lambda_L x_L. Right ray in the right camera frame: X_R = lambda_R x_R, where x_L and x_R are the (local) ray directions. Expressing the left ray in the right camera frame via X_R = R X_L + t and intersecting both rays yields the epipolar constraint
x_R^T [t]x R x_L = 0,
with the essential matrix E = [t]x R.
Epipolar Geometry
Note: the epipolar constraint x'^T E x = 0 holds for every pair of corresponding points x and x'.
Step 1: Estimate E
Epipolar constraint x'^T E x = 0. Written out (with x = (x, y, 1)^T and x' = (x', y', 1)^T), the constraint is linear in the nine entries of E, so each correspondence gives us one equation. Stacking n correspondences yields a linear system Z e = 0: the vector e of the entries of E is in the null space of Z. Solve using SVD (assuming n >= 8).
The estimation is sensitive to scaling, especially with noisy data: normalize all points to have zero mean and unit variance. Note that we can only recover the translation up to scale.
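A sketch of the normalized eight-point estimate of E under these assumptions (calibrated correspondences; the helper names are ours):

import numpy as np

def normalize(pts):
    """Shift to zero mean and unit standard deviation (conditioning)."""
    mean = pts.mean(axis=0)
    std = pts.std() + 1e-12
    T = np.array([[1/std, 0, -mean[0]/std],
                  [0, 1/std, -mean[1]/std],
                  [0, 0, 1]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def estimate_E(x1, x2):
    """Eight-point estimate of E from n >= 8 correspondences.
    x1, x2: (n x 2) calibrated image coordinates."""
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each correspondence gives one row of Z (constraint x2^T E x1 = 0).
    Z = np.column_stack([
        p2[:, 0]*p1[:, 0], p2[:, 0]*p1[:, 1], p2[:, 0],
        p2[:, 1]*p1[:, 0], p2[:, 1]*p1[:, 1], p2[:, 1],
        p1[:, 0],          p1[:, 1],          np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(Z)
    E = Vt[-1].reshape(3, 3)            # null-space vector of Z
    # Enforce the essential-matrix property: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    E = U @ np.diag([1, 1, 0]) @ Vt
    return T2.T @ E @ T1                # undo the normalization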
Recover R and t
Decompose the essential matrix: compute the SVD E = U diag(1, 1, 0) V^T. By identifying [t]x and R in E = [t]x R, we obtain t as the third column of U (up to sign and scale) and R = U W V^T or R = U W^T V^T with W = ((0, -1, 0), (1, 0, 0), (0, 0, 1)). Of the four resulting solutions, the correct one is found by requiring triangulated points to lie in front of both cameras.
Find: the camera motion R, t (up to scale). Steps: 1. compute correspondences; 2. compute the essential matrix; 3. extract the camera motion.
Robust Estimation
Problem: no matter how good the feature descriptor and matcher are, there is always a chance of bad point correspondences (outliers). Example: fit a line to 2D data containing outliers. There are two coupled problems: 1. fit the line to the data; 2. classify the data into inliers (valid points) and outliers (using some threshold).
RANSAC
Goal: robustly fit a model to a data set that contains outliers.
Algorithm:
1. Randomly select a (minimal) subset of the data points and instantiate the model from it
2. Using this model, classify all data points as inliers or outliers
3. Repeat steps 1 and 2 for N iterations
4. Select the largest inlier set and re-estimate the model from all points in this set
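A minimal sketch of RANSAC for the line-fitting example above (the iteration count and inlier threshold are arbitrary choices):

import numpy as np

def ransac_line(points, n_iters=100, thresh=0.05):
    """Robustly fit a 2D line to points of shape (N x 2)."""
    rng = np.random.default_rng(0)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # 1. minimal sample: two points define a line
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        n = np.array([-d[1], d[0]])          # line normal
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue
        n = n / norm
        # 2. classify: point-to-line distance against the threshold
        dist = np.abs((points - p) @ n)
        inliers = dist < thresh
        # keep the largest inlier set seen so far
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # 4. re-estimate from all inliers via total least squares
    P = points[best_inliers]
    c = P.mean(axis=0)
    _, _, Vt = np.linalg.svd(P - c)
    direction = Vt[0]
    return c, direction, best_inliers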
Two Examples
G. Klein and D. Murray, Parallel Tracking and Mapping for Small AR Workspaces, International Symposium on Mixed and Augmented Reality (ISMAR), 2007
http://www.robots.ox.ac.uk/~gk/publications/KleinMurray2007ISMAR.pdf
Photo Tourism
N. Snavely, S. M. Seitz, R. Szeliski, Photo tourism: Exploring photo collections in 3D, ACM Transactions on Graphics (SIGGRAPH), 2006
http://phototour.cs.washington.edu/Photo_Tourism.pdf
PTAM (2007)
Architecture optimized for dual-core CPUs: thread 1 (tracking) tracks the camera and draws the graphics for every incoming image; thread 2 (mapping) runs on selected keyframes.
PTAM Video
The mapping thread operates on keyframes. Bundle adjustment timings as the number of keyframes grows:

Keyframes | Local BA | Global BA
2-49      | 170 ms   | 380 ms
50-99     | 270 ms   | 1.7 s
100-149   | 440 ms   | 6.9 s
Automatically estimate:
- the position, orientation, and focal length of all cameras
- the 3D positions of the point features
Visual Navigation for Flying Robots
Bundle Adjustment and Stereo Correspondence
Dr. Jürgen Sturm
Remember: 3D Transformations
Representation as a homogeneous matrix T = (R t; 0 1). Pro: easy to concatenate and invert. Con: not minimal (12 parameters for 6 DOF).
Map optimization: graph SLAM, bundle adjustment.
Depth reconstruction: laser triangulation, structured light (Kinect), stereo cameras.
Remember: 3D Transformations
From twist coordinates xi = (v, w) to the twist matrix via the hat operator; the exponential map then yields the homogeneous transformation.
Notation
Camera poses in a minimal representation (e.g., twists).
Loop Closures
Idea: estimate the camera motion from frame to frame. Problem: the estimates are inherently noisy, and the error accumulates over time (drift).
Loop Closures
Solution: use loop closures to minimize the drift, i.e., minimize the error over all constraints.
Graph SLAM [Thrun and Montemerlo, 2006; Olson et al., 2006]
Use a graph to represent the model:
- Every node in the graph corresponds to a pose of the robot during mapping
- Every edge between two nodes corresponds to a spatial constraint between them
Graph-based SLAM: build the graph and find the robot poses that minimize the error introduced by the constraints.
Front-end and back-end are interleaved: a consistent map helps the front-end determine new constraints by reducing the search space.
Problem Definition
Given: a set of observations (constraints) z_ij between poses. Wanted: the set of camera poses x_1, ..., x_n; state vector x = (x_1^T, ..., x_n^T)^T.
Map Error
Real observation z_ij; expected observation zhat_ij(x_i, x_j); the difference between observation and expectation is e_ij = z_ij - zhat_ij(x_i, x_j). Given the correct map, this difference is purely the result of sensor noise.
Error Function
Assumption: the sensor noise is normally distributed. The error term for one observation (proportional to its negative log-likelihood) is e_ij^T Omega_ij e_ij, with information matrix Omega_ij. The map error over all observations is
F(x) = sum_ij e_ij(x)^T Omega_ij e_ij(x),
and we seek x* = argmin_x F(x).
Gauss-Newton Method
1. Linearize the error function
2. Compute its derivative
3. Set the derivative to zero
4. Solve the linear system
5. Iterate this procedure until convergence
Linearization: e_ij(x + dx) ~ e_ij(x) + J_ij(x) dx, with the Jacobian J_ij = de_ij/dx. Since e_ij depends only on x_i and x_j, all other blocks of J_ij(x) are zero. Substituting into F(x + dx) and setting the derivative to zero leads to the linear system
H dx = -b, with H = sum_ij J_ij^T Omega_ij J_ij and b = sum_ij J_ij^T Omega_ij e_ij.
H is sparse: non-zero on the main diagonal (at the blocks ii and jj of each constraint) and at the off-diagonal blocks ij and ji.
Gauss-Newton Method
Problem: the error function is non-linear! Algorithm, repeated until convergence:
1. Compute the terms H and b of the linear system
2. Solve the linear system H dx = -b to get the new increment dx
3. Update the previous estimate: x <- x + dx
How to solve the linear system? Sparse Cholesky decomposition (up to ~100M matrix elements), preconditioned conjugate gradients (up to ~1000M matrix elements), and many others.
Example in 1D
Two camera poses x1 and x2; state vector x = (x1, x2)^T; one (distance) observation z12 with expected observation x2 - x1, sensor noise Omega, and error e12 = z12 - (x2 - x1). Note that the system is singular unless one pose is fixed (gauge freedom): the observation only constrains the relative pose.
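A minimal numeric sketch of this 1D example; the concrete values (z12 = 1, Omega = 1, x1 fixed at 0) are our own assumptions for illustration:

# Assumed toy values: pose x1 fixed at 0 (removes the gauge freedom),
# x2 initialized at 0, one distance observation z12 = 1, Omega = 1.
x1, x2 = 0.0, 0.0
z12, omega = 1.0, 1.0

for it in range(3):
    e = z12 - (x2 - x1)      # error: observation minus expectation
    J = -1.0                 # de/dx2
    H = J * omega * J        # Gauss-Newton system H dx = -b
    b = J * omega * e
    dx = -b / H
    x2 += dx                 # update the estimate
    print(f"iter {it}: e={e:+.3f}  x2={x2:.3f}")
# The problem is linear, so Gauss-Newton converges in one iteration.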
Levenberg-Marquardt Algorithm
Idea: add a damping factor lambda to the linear system, (H + lambda I) dx = -b. Algorithm: compute the step; if the error decreases, accept the step (and decrease lambda); if the error increases, reject the step (and increase lambda).
Non-Linear Minimization
One of the state-of-the-art solutions to compute the maximum-likelihood estimate. Various open-source implementations are available:
- g2o [Kuemmerle et al., 2011]
- sba [Lourakis and Argyros, 2009]
- iSAM [Kaess et al., 2008]
Bundle Adjustment
Graph SLAM optimizes (only) the camera poses. Bundle adjustment optimizes both the 6-DOF camera poses and the 3D (feature) points. Other extensions: robust error functions, alternative parameterizations. Typically there are many more feature points than camera poses (why?).
Error Function
Camera pose c_i; feature point p_j; observed feature location z_ij; expected feature location zhat(c_i, p_j), i.e., the projection of p_j into camera i. The difference between observation and expectation is e_ij = z_ij - zhat(c_i, p_j), and the error function is F = sum_ij e_ij^T Omega_ij e_ij.
Primary Structure
The system matrix has a characteristic structure: H = (H_cc H_cp; H_cp^T H_pp), splitting into a camera block and a point block. Insight: H_cc and H_pp are block-diagonal, because each constraint depends only on one camera and one point.
Schur Complement
Given the linear system
(H_cc H_cp; H_cp^T H_pp) (dc; dp) = (b_c; b_p),
eliminate the point block first:
(H_cc - H_cp H_pp^-1 H_cp^T) dc = b_c - H_cp H_pp^-1 b_p,
then back-substitute dp = H_pp^-1 (b_p - H_cp^T dc). Since H_pp is block-diagonal, inverting it is cheap.
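A dense sketch of this elimination (real BA solvers exploit the block-diagonal structure of H_pp instead of forming a dense inverse):

import numpy as np

def solve_schur(Hcc, Hcp, Hpp, bc, bp):
    """Solve [[Hcc, Hcp], [Hcp^T, Hpp]] [dc; dp] = [bc; bp]
    by eliminating the point block first."""
    Hpp_inv = np.linalg.inv(Hpp)           # cheap if block-diagonal
    # Reduced camera system (Schur complement of Hpp)
    S = Hcc - Hcp @ Hpp_inv @ Hcp.T
    rhs = bc - Hcp @ Hpp_inv @ bp
    dc = np.linalg.solve(S, rhs)
    # Back-substitute for the points
    dp = Hpp_inv @ (bp - Hcp.T @ dc)
    return dc, dp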
Epipolar Plane
All epipolar lines intersect at the epipoles, which lie on the baseline. An epipolar plane intersects the left and right image planes in epipolar lines.

Epipolar Constraint
This is useful because it reduces the correspondence problem to a 1D search along the epipolar line.
Rectification
In practice, it is convenient if the image scanlines (rows) coincide with the epipolar lines. Rectification reprojects both image planes onto a common plane parallel to the baseline (two 3x3 homographies); afterwards, pixel motion between the images is purely horizontal.
Example: rectification of a stereo pair (left and right images).
Example: input images and output disparity maps computed with a 3x3 and a 20x20 matching window.
Laser Triangulation
Idea: a well-defined light pattern (e.g., a point or a line) is projected onto the scene and observed by a line/matrix camera or a position-sensitive device (PSD); the distance is then computed by simple triangulation.
Function principle: with baseline b between laser and pin-hole, focal length f of the CCD camera, and measured disparity d on the sensor, the depth is z = f * b / d.
Laser Triangulation
Stripe laser + 2D camera; often used on conveyor belts (volume sensing). A large baseline gives better depth resolution but more occlusions; a remedy is to use two cameras.
Structured Light
Multiple stripes or a full 2D pattern make data association more difficult. Coding schemes resolve the ambiguity:
- Temporal: coded light
- Wavelength: color
- Spatial: pattern (e.g., diffraction patterns)
Example Data
The Kinect provides color (RGB) and depth (D) video, which enables novel approaches to (robot) perception. The infrared image shows the projected pattern, distorted by the scene geometry.
Technical Specs
- Infrared camera: 640x480 @ 30 Hz
- Depth correlation runs on an FPGA; 11-bit depth image; 0.8 m - 5 m range
- Depth sensing does not work in direct sunlight (why?)
History
- 2005: Developed by PrimeSense (Israel)
- 2006: Offered to Nintendo and Microsoft; both companies declined
- 2007: Alex Kipman becomes new incubation director at Microsoft and decides to explore the PrimeSense device; Johnny Lee assembles a team to investigate the technology and develop game concepts
- 2008: The group around Prof. Andrew Blake and Jamie Shotton (Microsoft Research) develops pose recognition
- 2009: The group around Prof. Dieter Fox (Intel Labs / Univ. of Washington) works on RGB-D mapping and RGB-D object recognition
- Nov 4, 2010: Official market launch
- Nov 10, 2010: First open-source driver available
- 2011: First programming competitions (ROS 3D, PrimeSense); first workshops (RSS, Euron)
- 2012: First special issues (JVCI, T-SMC)
More specs:
- Four 16-bit microphones with DSP for beam forming @ 16 kHz
- Requires 12 V (for the motor); weighs 500 grams
- Human pose recognition runs on the Xbox CPU and uses only 10-15% of its processing power @ 30 Hz
(Paper: http://research.microsoft.com/apps/pubs/default.aspx?id=145347)
Kinect: Applications
Exercise Sheet 5
Prepare the mid-term presentation. Proposed structure: 3 slides
1. Remind people who you are and what you are doing (can be the same slide as last time)
2. Your work/achievements so far (a video is a plus)
3. Your plans for the next two weeks
Visual Navigation for Flying Robots
Place Recognition, ICP, and Dense Reconstruction
Dr. Jürgen Sturm
Loop Closures
How can we detect loop closures efficiently?
1. Compare with all previous images (not efficient)
2. Use the motion model and covariance to limit the search radius (metric approach)
3. Appearance-based place recognition (using bag of words)
Example document: "China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value." Highlighted words: China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, increase, foreign, value.
Object/Scene Recognition
Analogy to text documents: the content of an image can be inferred from the frequency of visual words (features).
Overview
1. Detect and extract features (e.g., SIFT) from example patches
2. Build a dictionary of codewords from the descriptors
3. Represent each image by the frequency histogram of its codewords
Object/Scene Recognition
Compare the histogram of the new scene with those of known scenes, e.g., using:
- simple histogram intersection
- naïve Bayes
- more advanced statistical methods
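A minimal sketch of place recognition by histogram intersection (the normalization choice is ours):

import numpy as np

def hist_intersection(h1, h2):
    """Similarity of two bag-of-words histograms, in [0, 1]."""
    h1 = h1 / (h1.sum() + 1e-12)
    h2 = h2 / (h2.sum() + 1e-12)
    return np.minimum(h1, h2).sum()

# usage: compare the query histogram against all known places
# best = max(range(len(db)), key=lambda i: hist_intersection(query, db[i]))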
Example: FAB-MAP [Cummins and Newman, 2008], appearance-based place recognition (e.g., recognizing scenes such as a parking lot or a highway from their word histograms).
Summary: Bag of Words
- Compact representation of image content
- Highly efficient and scalable (good timing performance)
- Requires training of a dictionary
- Insensitive to viewpoint changes/image deformations (inherited from the feature descriptor)
Laser Scanner
Measures angles and distances to the closest obstacles. Alternative representation: a 2D point set (cloud). Probabilistic sensor model: the measurement likelihood peaks at the measured distance, with additional probability mass at the maximum range.
Exhaustive Search (scan matching)
Convolve the first scan with the sensor model to obtain a likelihood map. Then sweep the second scan over the likelihood map, compute the correlation for each candidate pose, and select the best pose.
Wanted: the translation t and rotation R that minimize the sum of squared errors
E(R, t) = sum_i || x_i - (R p_i + t) ||^2,
where x_i and p_i are corresponding points.
Known Correspondences
Note: if the correct correspondences are known, both rotation and translation can be calculated in closed form.
Idea: the centers of mass of both point sets have to match. Subtract the corresponding center of mass from every point; afterwards, the point sets are zero-centered, and we only need to recover the rotation.
Decompose the cross-covariance matrix W = sum_i y_i x_i^T of the centered point sets using singular value decomposition (SVD), W = U S V^T. Theorem: if rank(W) = 3, the optimal solution of E(R, t) is unique and given by R = U V^T and t = mu_y - R mu_x.

Unknown Correspondences
If the correct correspondences are not known, it is generally impossible to determine the optimal transformation in one step.
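A sketch of this closed-form alignment, including the standard determinant correction that guards against reflections (not spelled out on the slide):

import numpy as np

def align_svd(P, Q):
    """Find R, t minimizing sum ||(R p_i + t) - q_i||^2 for
    corresponding point sets P, Q of shape (N x d)."""
    # 1. subtract the centers of mass
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    X, Y = P - mu_p, Q - mu_q
    # 2. decompose the cross-covariance W = sum y_i x_i^T via SVD
    W = Y.T @ X
    U, _, Vt = np.linalg.svd(W)
    # 3. optimal rotation (with reflection fix), then translation
    D = np.eye(P.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    R = U @ D @ Vt
    t = mu_q - R @ mu_p
    return R, t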
ICP Algorithm [Besl & McKay, 92]
Iterate until convergence: 1. find correspondences (e.g., closest points); 2. compute the optimal R, t in closed form as above; 3. apply the transformation to the source point set.

Example: ICP
ICP Variants
Many variants on all stages of ICP have been proposed:
- Selecting and weighting source points
- Finding corresponding points
- Rejecting certain (outlier) correspondences
- Choosing an error metric
- Minimization

Performance Criteria
Various aspects of performance:
- Speed
- Stability (local minima)
- Tolerance w.r.t. noise and/or outliers
- Basin of convergence (maximum initial misalignment)
Feature-based Sampling
Detect interest points (as with images). This decreases the number of correspondences and increases efficiency and accuracy, but requires pre-processing.

Normal-Space Sampling
Instead of sampling source points uniformly or at random, sample them so that the distribution of their normals is as uniform as possible.
Finding Correspondences
This stage has the greatest effect on convergence and speed. Strategies: closest point, normal shooting, closest compatible point, projection. Speed-up using k-d trees (or oct-trees).
Normal Shooting
Project along the source normal and intersect the other mesh. Slightly better than closest-point for smooth meshes, worse for noisy or complex meshes.
Projection-based Matching
Slightly worse performance per iteration, but each iteration is one to two orders of magnitude faster than closest-point matching. Requires a point-to-plane error metric.
Error Metrics
Point-to-point: minimize the distance between corresponding points. Point-to-plane: minimize the distance along the destination normal; this lets flat regions slide along each other. Generalized ICP: assign an individual covariance to each data point [Segal, 2009].

Minimization
Only the point-to-point metric has closed-form solution(s); the other error metrics require non-linear minimization methods. (Which non-linear minimization methods do you remember? Which robust error metrics do you remember?)
Example: real-time scan alignment using range images from a structured-light system (projector and camera, temporal coding).
ICP: Summary
ICP is a powerful algorithm for calculating the displacement between point clouds. The overall speed depends mostly on the choice of the matching algorithm. ICP is (in general) only locally optimal and can get stuck in local minima.
Occupancy Grid
Idea: represent the map using a grid. Each cell is either free or occupied, and the robot maintains a belief over the map state.
Mapping
Goal: estimate the posterior over maps given the observations and poses. How can this be computed? Assuming the cells are independent, the posterior factorizes over the individual cells, and each cell can be updated with a binary Bayes filter.
Sensor Model
For the Bayes filter, we need the inverse sensor model p(m_i | z_t, x_t): cells along the beam in front of the measured distance are likely free; the cell at the measured distance is likely occupied.
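A minimal sketch of the resulting log-odds occupancy update along a single beam; the log-odds constants and grid size are arbitrary assumptions, and the ray traversal uses Bresenham's algorithm:

import numpy as np

L_OCC, L_FREE, L0 = 0.85, -0.4, 0.0   # assumed inverse-model constants
grid = np.zeros((100, 100))           # log-odds, 0 = unknown (p = 0.5)

def bresenham(r0, c0, r1, c1):
    """Integer cells on the line from (r0, c0) to (r1, c1)."""
    cells = []
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    sr, sc = (1 if r1 > r0 else -1), (1 if c1 > c0 else -1)
    err = dr - dc
    r, c = r0, c0
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            break
        e2 = 2 * err
        if e2 > -dc:
            err -= dc
            r += sr
        if e2 < dr:
            err += dr
            c += sc
    return cells

def update_beam(grid, robot, hit):
    cells = bresenham(*robot, *hit)
    for cell in cells[:-1]:
        grid[cell] += L_FREE - L0     # cells before the hit: free
    grid[cells[-1]] += L_OCC - L0     # end point: occupied

update_beam(grid, (50, 50), (50, 80))
prob = 1 - 1 / (1 + np.exp(grid))     # recover occupancy probabilities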
Resulting Map
Note: the maximum-likelihood map is obtained by clipping the occupancy grid map at a threshold of 0.5.

Memory Consumption
Consider mapping a building of 40x40 m at a resolution of 0.05 m. How much memory do we need? 40 m / 0.05 m = 800 cells per side, i.e., 800 x 800 = 640,000 cells for a single 2D slice (e.g., about 2.5 MB with 4-byte floats); a 3D grid at the same resolution grows cubically.
Example: OctoMap [Wurm et al., 2011]
Octree-based 3D occupancy map. Freiburg computer science campus: 292 x 167 x 28 m³ at 0.2 m resolution, 2 MB on disk.
Signed Distance Field (SDF)
Idea: instead of representing the cell occupancy explicitly (occupancy grid: zero = free space, one = occupied), represent the distance of each cell to the surface (signed: e.g., negative outside the object; the values along a ray cross zero at the surface, e.g., -1.3, -0.3, 0.7, 1.7).
Algorithm:
1. Estimate the signed distance field
2. Extract the surface using interpolation (the surface is located at the zero crossing)
Weighting Function
Weight each observation according to its confidence; the weight (= confidence) is a function of the signed distance to the surface.
Data Fusion
Each voxel cell in the SDF stores two values: the weighted sum of signed distances D and the sum of all weights W. When a new range image arrives, every voxel cell is updated according to
D <- (W * D + w * d) / (W + w),  W <- W + w,
where d is the signed distance between the measured depth and the voxel's distance along the ray, and w its weight.
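A minimal sketch of this running weighted average for a single voxel, with an assumed truncation distance and constant weight:

import numpy as np

TAU = 0.05   # assumed truncation distance in meters

def update_voxel(D, W, voxel_depth, measured_depth, w_new=1.0):
    """D: current weighted signed distance; W: accumulated weight.
    voxel_depth: depth of the voxel along the camera ray;
    measured_depth: depth measured at the corresponding pixel."""
    d = np.clip(measured_depth - voxel_depth, -TAU, TAU)  # truncated SDF
    D = (W * D + w_new * d) / (W + w_new)   # running weighted average
    W = W + w_new
    return D, W

# usage: fuse three (noisy) depth readings into one voxel
D, W = 0.0, 0.0
for z in [2.01, 1.99, 2.00]:
    D, W = update_voxel(D, W, voxel_depth=1.98, measured_depth=z)
print(D)   # ~0.02: the voxel lies about 2 cm from the fused surface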
SDF Example
A cross-section through a 3D signed distance function of a real scene (brightness encodes the SDF value).

SDF Fusion
Ray Casting
For each camera pixel, shoot a ray and search for the first zero crossing in the SDF. The SDF value can be used to skip along the ray while far from the surface. Interpolation reduces artifacts; close to the surface, the gradient of the SDF represents the surface normal.
Marching Cubes
First in 2D (marching squares): evaluate each cell separately; check which edges are inside/outside; generate triangles according to a lookup table; locate the vertices using least squares. Marching cubes applies the same idea to 3D cells.
KinectFusion [Newcombe et al., 2011]
Combines the components above: projective ICP with a point-to-plane metric, a truncated signed distance function (TSDF), and ray casting.
The Department of Informatics would like to invite its students and employees to its summer party and career forum.
July 4, 2012
- 3 pm - 6 pm Career Forum: presentations given by Google, Capgemini etc., stands, panel discussion: TUM alumni talk about their career paths in informatics
- 3 pm - 6 pm Foosball Tournament
- Starting at 5 pm Summer Party: BBQ, live band and lots of fun!
www.in.tum.de/2012summerparty
Robot Architecture
Components (block diagram): executive, path planner (using the global map from SLAM), path tracking, collision avoidance (using the local obstacle map and localization), position control; sensors and actuators interface with the physical world.
Configuration Space
Work space: typically the 3D pose (position + orientation), 6 DOF. Configuration space: depends on robot and task, e.g., reduced pose (position + yaw, 4 DOF), full pose (6 DOF), pose + velocity (12 DOF), or the joint angles of a manipulation robot.
The configuration space (C-space) is the space of all possible configurations. The C-space topology is usually not Cartesian (e.g., orientations wrap around); the C-space is described as a topological manifold.

Notation
Configuration space C; configuration q; free space C_free; obstacle space C_obs.
Example
What are the admissible configurations for the robot? Equivalently: what is its free space?
- Point robot: the free space is simply the work space minus the obstacles.
- Circular robot: the robot footprint is a disk in the work space but a point in the configuration space; the configuration-space obstacles are the work-space obstacles inflated by the robot's radius.
- Large circular robot: the same construction with a larger radius; narrow passages may disappear from the free space.
Example
Polygonal robot, translation only: choose a reference point on the robot; the configuration-space obstacle is obtained by sliding the robot along the edges of the obstacle regions.
C-Space Discretizations
Two competing paradigms: combinatorial planning (exact planning) and sampling-based planning (probabilistic/randomized planning).

Combinatorial Methods
Mostly developed in the 1980s. Extremely efficient for low-dimensional problems, but sometimes difficult to implement. They usually produce a roadmap in the free space and assume polygonal environments.
Roadmaps
A roadmap is a graph in the free space where each vertex is a configuration and each edge is a collision-free path whose endpoints are vertices.
Roadmap Construction
We consider three combinatorial methods: trapezoidal decomposition, the shortest-path roadmap, and regular grids (but there are many more!). Afterwards, we consider two sampling-based methods: probabilistic roadmaps (PRMs) and rapidly exploring random trees (RRTs).

Trapezoidal decomposition: decompose the free space horizontally into convex regions using a plane sweep; sort the vertices in x direction and iterate over them while maintaining a vertically sorted list of edges.
Place vertices in the center of each trapezoid and on the edge between two neighboring trapezoids.

Example Query
Compute a path from q_I to q_G: identify the start and goal trapezoids, connect the start and goal locations to the corresponding center vertices, and run a search algorithm (e.g., Dijkstra).
Shortest-Path Roadmap
Contains all vertices and edges that optimal paths follow when obstructed. Intuition: imagine pulling a tight string between q_I and q_G.
Roadmap Construction
Vertices: all sharp corners (interior angle > 180°). Edges:
1. Two consecutive sharp corners on the same obstacle
2. Bitangent edges (when the line connecting two vertices extends into free space)

Example Query
Compute a path from q_I to q_G: connect the start and goal locations to all visible roadmap vertices and run a search algorithm (e.g., Dijkstra).
Properties: + easy to construct in 2D; + generates shortest paths; - optimal planning in 3D or more dimensions is NP-hard.

Approximate Decompositions
Construct a regular grid. Downside: high memory consumption (and number of collision tests). Any ideas?
Use a quadtree/octree to save memory; however, it is sometimes difficult to determine the status of a cell. Properties: + easy to construct; + most used in practice; - high number of collision tests.
Sampling-based Methods
Abandon the concept of explicitly characterizing the free and obstacle space, and leave the algorithm "in the dark" when exploring; the only light is provided by a collision-detection algorithm that probes whether some configuration lies in the free space. We will have a look at probabilistic roadmaps (PRMs) and rapidly exploring random trees (RRTs).
PRM Example
1. Sample a vertex: take a random sample from the configuration space and check whether it lies in the free space
2. Find its neighbors: the k-nearest neighbors, or all neighbors within a specified radius
3. Add edges: check whether the line of sight between two nearby vertices is collision-free, e.g., using a discretized line search
Add vertices and edges until the roadmap is dense enough; a minimal sketch follows below.
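The sketch below builds such a roadmap in a 2D unit square; the collision checker, sample count, and connection radius are arbitrary assumptions:

import numpy as np

def build_prm(in_free, n_samples=200, radius=0.2):
    """in_free(q) -> bool is the collision checker, the planner's only
    'light source'. Returns vertices and an edge list."""
    rng = np.random.default_rng(1)
    V = []
    while len(V) < n_samples:                  # 1. sample free vertices
        q = rng.uniform(0.0, 1.0, size=2)
        if in_free(q):
            V.append(q)
    V = np.array(V)
    E = []
    for i in range(len(V)):                    # 2. connect nearby vertices
        for j in range(i + 1, len(V)):
            if np.linalg.norm(V[i] - V[j]) < radius:
                # 3. discretized line search along the candidate edge
                ts = np.linspace(0.0, 1.0, 20)
                if all(in_free(V[i] + t * (V[j] - V[i])) for t in ts):
                    E.append((i, j))
    return V, E

# usage: unit square with a circular obstacle at the center
V, E = build_prm(lambda q: np.linalg.norm(q - 0.5) > 0.2)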
Probabilistic Roadmaps
+ Probabilistically complete
+ Scale well to higher-dimensional C-spaces
+ Very popular, many extensions
- Do not work well for some problems (e.g., narrow passages)
- Not optimal, not complete

Rapidly Exploring Random Trees (RRTs)
Grow a tree from the start configuration by repeatedly sampling a random configuration and extending the nearest tree node towards it. Why not pick the goal configuration every time? This would fail and run into the obstacle space instead of exploring.
RRT Examples
2-DOF example. Bidirectional variant: grow trees from the start and the goal location towards each other, and stop when they connect.
Non-Holonomic Robots
Some robots cannot move freely on the configuration-space manifold. Example: a car cannot move sideways; it has 2-DOF controls (speed and steering) but a 3-DOF configuration space (2D translation + rotation). RRTs can naturally consider such constraints during tree construction, e.g., for a car-like robot.
Search Algorithms
Given: a graph G consisting of vertices and edges (with associated costs). Wanted: the best (shortest) path between two vertices.
Uninformed Search
Breadth-first: complete; optimal if all action costs are equal; time and space O(b^d) (branching factor b, solution depth d).
Depth-first: not complete in infinite spaces; not optimal; time O(b^m); space O(bm), since it can forget explored subtrees (maximum depth m).
Informed Search
Idea: select nodes for further expansion based on an evaluation function, and first explore the node with the lowest value.
Greedy best-first search: simply expand the node that appears closest to the goal, i.e., evaluate nodes by the heuristic h(n) alone.
A* search: combines the path cost so far with the estimated goal distance, f(n) = g(n) + h(n). A* is optimal if the heuristic is admissible, i.e., h never overestimates the true cost to the goal.
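A minimal, generic A* sketch; the caller supplies the graph via callbacks (the function and parameter names are ours):

import heapq
import itertools

def astar(start, goal, neighbors, cost, heuristic):
    """neighbors(n) -> successor nodes; cost(a, b) -> edge cost;
    heuristic(n) -> admissible estimate of the cost to the goal."""
    counter = itertools.count()   # tie-breaker so nodes are never compared
    open_set = [(heuristic(start), next(counter), start)]
    g = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_set:
        _, _, n = heapq.heappop(open_set)
        if n == goal:                       # reconstruct the path
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        if n in closed:
            continue
        closed.add(n)
        for m in neighbors(n):
            g_m = g[n] + cost(n, m)
            if g_m < g.get(m, float("inf")):
                g[m] = g_m
                parent[m] = n
                heapq.heappush(open_set, (g_m + heuristic(m), next(counter), m))
    return None                             # no path exists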
Problems of A* on Grids
1. The shortest path is often very close to obstacles (cutting corners). Uncertain path execution increases the risk of collisions; uncertainty can come from a delocalized robot, an imperfect map, or poorly modeled dynamic constraints.
2. The resulting paths are aligned to the grid structure and therefore sub-optimal (addressed by path smoothing below).
3. When the path turns out to be blocked during traversal, it needs to be re-planned from scratch. In unknown or dynamic environments, this can occur very often, and replanning in large state spaces is costly. Can we re-use (repair) the initial plan?
Map Smoothing
Problem: the path gets close to obstacles. Solution: convolve the map with a kernel (e.g., a Gaussian).

Path Smoothing
Problem: the paths are aligned to the grid structure (because they have to lie in the roadmap), so they look unnatural and are sub-optimal. Solution: smooth the path after generation. Traverse the path and find pairs of nodes with direct line of sight; replace the segment between them by a straight line. Alternatively, refine the initial path using non-linear minimization (e.g., optimizing for continuity, energy, or execution time).
D* Search
Problem: in unknown, partially known, or dynamic environments, the planned path may become blocked, and we need to replan. Can this be done efficiently, avoiding replanning the entire path?
D* Search
Idea: incrementally repair the path, keeping the modifications local around the robot pose. Many variants:
- D* (Dynamic A*) [Stentz, ICRA 94] [Stentz, IJCAI 95]
- D* Lite [Koenig and Likhachev, AAAI 02]
- Field D* [Ferguson and Stentz, JFR 06]
Main concepts: invert the search direction (search from goal to start), since the goal does not move but the robot does, and map changes (new obstacles) have only local influence close to the current robot pose. Mark the changed node and all dependent nodes as unclean (= to be re-evaluated), then find the shortest path to the start (using A*) while reusing the previous solution.
D* Example
Comparison of breadth-first search, A*, and D* Lite: at the start, and after discovery of a blocked cell. With D* Lite, only the nodes affected by the change are re-evaluated; all other nodes remain unaltered, and the shortest path can reuse them.
D* Search
D* is as optimal and complete as A*. D* and its variants are widely used in practice; Field D* was running on the Mars rovers Spirit and Opportunity.
What if the robot has to react quickly to unforeseen, fast-moving objects? We need a collision avoidance algorithm that runs in constant time!
Robot Architecture
An approximate global planner computes paths ignoring the kinematic and dynamic vehicle constraints (not real-time). An accurate local planner accounts for these constraints and generates feasible local trajectories in real time (collision avoidance).
Local Planner
Given: the path to the goal (sequence of via points), a range scan of the local vicinity, and the dynamic constraints. Wanted: collision-free, safe, and fast motion towards the goal (or the next via point). Typical approaches: potential fields and the dynamic window approach. Cons of potential fields: they suffer from local minima and do not consider the dynamic constraints.
80
obstacle-free area
-90deg/s
Visual Navigation for Flying Robots 81
+90deg/s
angular velocity
-90deg/s
Visual Navigation for Flying Robots 82
+90deg/s
angular velocity
NF f f vel n n go
dynamic window (speeds reachable in one time frame)
Admissible space
forward velocity
0.9m/s
obstacle-free area
-90deg/s
Visual Navigation for Flying Robots 83
+90deg/s
angular velocity
Path from A*
-90deg/s +90deg/s angular velocity
84
146
0.9m/s
0.9m/s
Path from A*
-90deg/s +90deg/s angular velocity
Path from A*
-90deg/s +90deg/s angular velocity
85
86
Discretize dynamic window and evaluate navigation function (note: window has fixed size = real-time!) Find the maximum and execute motion
87
88
Problems of DWAs
DWAs suffer from local minima (and need tuning); for example, the robot may not slow down early enough to enter a doorway. Can you think of a solution? Note: the general case requires global planning.
Visual Navigation for Flying Robots
Planning under Uncertainty, Exploration and Coordination
Dr. Jürgen Sturm
Solution 3: POMDPs
A partially observable Markov decision process (POMDP) considers the uncertainty of both the motion model and the sensor model, over a finite or infinite time horizon; the resulting policy is optimal. One solution technique: value iteration. Problem: in general (and in practice) computationally intractable (PSPACE-hard).
POMDPs are robust but intractable; in practice, we resort to approximations.

Remember: Probabilistic Roadmaps
Goal: the shortest path, subject to kinematic and environmental constraints. 1. Add vertices (sampled in free space); 2. add edges between neighboring vertices (when the line of sight is not obstructed); 3. find the shortest path (Dijkstra, ...).
Line of sight alone is not enough: the robot might get lost or run into an obstacle. Idea:
1. Sample vertices and localization distributions where p(x in C_obs) < epsilon
2. Add edges between points where p(x in C_obs) < epsilon along the path
3. Perform graph search
Belief Roadmap [He et al., 2008]
1. Sample vertices from the free space, build the graph, and estimate belief-distribution transfer functions
2. Propagate the covariances while performing graph search from start to goal
Given: a roadmap. Goal: find the path from the start to the goal node that results in minimum uncertainty at the goal. Problem: how can we estimate the belief distribution at the goal (efficiently)? How can we propagate the belief distribution along an edge?
1. Sample waypoints and use forward simulation to compute the full posterior
2. Linearize the model and use a Kalman filter
Belief Propagation [He et al., 2008]
Propagate the belief along an edge given the initial conditions and the controls and observations u_0:T, z_0:T.
Problem: the posterior distribution at a vertex depends on the prior distribution (and thus on the path to the vertex). Forward simulation (and belief prediction) must be performed along each edge for every start state; computing a minimum-cost path of 30 edges takes about 100 seconds.
Mission Planning
Goal: generate and execute a plan to accomplish a certain (navigation) task. Example tasks: exploration, coverage, surveillance, tracking. In the robot architecture, the mission planner receives tasks from the user and sits above the global planner, local planner, localization, and position control.
Task Planning
Goal: generate and execute a high-level plan to accomplish a certain task. Often symbolic reasoning (or hard-coded): propositional or first-order logic, automated reasoning systems. Common programming languages: Prolog, LISP.
Exploration
By reasoning about control, the mapping process can be made much more effective. Question: where to move next? Choose the action that maximizes utility.
Example
Where should the robot go next? (Map with unexplored, unknown, occupied, and empty cells.)
Information Theory
Entropy is a general measure for the uncertainty of a probability distribution; it equals the expected amount of information needed to encode an outcome:
H(p) = -sum_x p(x) log p(x).
A peaked distribution over {occupied, free} has low entropy; a uniform distribution has high entropy.
Information Theory
Information gain = uncertainty reduction: the expected information gain of an action is the current entropy minus the conditional entropy after taking the action and observing the result.
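A minimal sketch with per-cell Bernoulli occupancy beliefs; the occupancy values and the assumed post-action beliefs are illustrative only:

import numpy as np

def entropy(p):
    """Entropy of a Bernoulli occupancy belief (in bits)."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

grid = np.array([0.5, 0.5, 0.9, 0.1])     # cell occupancy beliefs
H = entropy(grid).sum()                   # map entropy: sum over cells

# Expected information gain of an action = H(map) - E[H(map | action)];
# here we assume sensing would sharpen the two unknown cells to 0.9/0.1.
grid_after = np.array([0.9, 0.1, 0.9, 0.1])
info_gain = H - entropy(grid_after).sum()
print(info_gain)   # > 0: the action reduces map uncertainty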
Exploration Costs
So far, we did not consider the cost of executing an action (e.g., time, energy, ...). Utility = uncertainty reduction - cost; select the action with the highest expected utility.
Exploration
For each candidate location (x, y): estimate the number of cells the robot could sense from there (e.g., by simulating laser beams in the current map) and the cost of getting there. Greedy strategy: select the candidate location with the highest utility, then repeat.
Exploration Actions
So far, we only considered the reduction of map uncertainty. In general, there are many sources of uncertainty that can be reduced by exploration:
- Map uncertainty (visit unexplored areas)
- Trajectory uncertainty (loop closing)
- Localization uncertainty (active re-localization by re-visiting known locations)
(Figure: entropy evolution during exploration.)
Example: Corridor Exploration [Stachniss et al., 2005]
Reduce the uncertainty in map, path, and pose when selecting target locations. The decision-theoretic approach leads to intuitive behaviors: re-localize before getting lost. Some animals show a similar behavior (dogs marooned in the tundra of northern Russia).
Multi-Robot Exploration
Given: a team of robots with communication. Goal: explore the environment as fast as possible.

Complexity
Single-robot exploration in known, graph-like environments is in general NP-hard (proof: reduce the traveling salesman problem to exploration). The complexity of multi-robot exploration is exponential in the number of robots.
Levels of Coordination
1. No exchange of information: without coordination, two robots might choose the same exploration frontier
2. Implicit coordination: sharing a joint map (communication of the individual maps and poses; central mapping system)
3. Explicit coordination: a central planner assigns target points, minimizing expected path cost / information gain / ...
Typical Trajectories and Exploration Time [Stachniss et al., 2006]
(Figure: trajectories and exploration times under implicit vs. explicit coordination.)
Coordination Algorithm
In each time step: determine the set of exploration targets; compute for each robot and each target the expected cost/utility; assign robots to targets using the Hungarian algorithm.
Hungarian Algorithm [Kuhn, 1955]
A combinatorial optimization algorithm that solves the assignment problem in polynomial time. General idea: the algorithm modifies the cost matrix until a zero-cost assignment exists.
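SciPy ships an implementation of this assignment step; a minimal sketch with an assumed cost matrix:

import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j]: expected path cost for robot i to reach target j
# (assumed example values)
cost = np.array([[4.0, 1.0, 3.0],
                 [2.0, 0.0, 5.0],
                 [3.0, 2.0, 2.0]])

robots, targets = linear_sum_assignment(cost)   # Hungarian method
print(list(zip(robots, targets)), cost[robots, targets].sum())
# assigns each robot a distinct target with minimum total cost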
Summary: Exploration
Exploration aims at generating robot motions so that an optimal map is obtained. Coordination reduces the exploration time; the Hungarian algorithm efficiently solves the assignment problem (centralized, 1-step lookahead). A two-layer hierarchical role assignment using the Hungarian algorithm (1: rooms, 2: targets within a room) further reduces exploration time and the risk of interference. Challenges (active research): limited bandwidth and unreliable communication; decentralized planning and task assignment.
Idea [Mannadiar and Rekleitis, ICRA 2011]: coverage path planning.
Approach:
1. Decompose the map into simple cells
2. Compute the connectivity between cells and build a graph (vertices = critical points that triggered the split; edges = connectivity between critical points)
3. Solve the coverage problem on the reduced graph: extend the graph so that all vertices have even order, compute an Euler tour (linear time), follow the Euler tour, and use a simple coverage strategy within each cell (note: cells are visited once or twice)
Goal: cover the entire surface while minimizing the trajectory length in configuration space. Approach: discretize the 3D environment into patches, build a neighborhood graph, and formulate the problem as a generalized traveling salesman problem (GTSP).
Course Evaluation
Much positive feedback, thank you! We are also very happy with you as a group; everybody seemed to be highly motivated. Suggestions for improvements (from the course evaluation forms):
- The workload was considered a bit too high: the ECTS have been adjusted to 6 credits
- A ROS introduction in the lab course would be helpful: will do this next time
Browse through (recent) textbooks; ask your professor and colleagues. It's very unlikely that somebody else has already perfectly solved exactly your problem, so don't worry! Technology evolves very fast.
Theorist
Formalize the problem; find a suitable method; (theoretically) prove that it is right; (if needed) implement a proof of concept.
Step 3: Validation
What are your claims, and how can you prove them? Theoretical proof (for a mathematical problem) or experimental validation: qualitative (e.g., video) or quantitative (e.g., many trials, statistical significance).
Step 4: Dissemination
A good solution or expertise alone is not enough; you need to convince other people in the field. Usual procedure:
1. Write a research paper (usually 6-8 pages): 3-6 months
2. Submit the PDF to an international conference or journal
3. The paper will be peer-reviewed: 3-6 months
4. Improve the paper (if necessary)
5. Give a talk or poster presentation at the conference: 15 min.
6. Optionally: repeat steps 1-5 until PhD: 3-5 years
Step 5: Refinement
Discuss your work with your colleagues, your professor, and other colleagues at conferences. Simplify and generalize your approach; collaborate with other people (also in other fields).
Scientific Research
This was the big picture; today's focus is on best practices in experimentation. What do you think are the (desired) properties of a good scientific experiment? Experiments should allow you to validate or falsify competing hypotheses. Current trends: make the data available for review and criticism, and the same for software (open source).
Challenges
Reproducibility is sometimes not easy to guarantee. Any ideas why?
- Randomized components/noise (counter with the law of large numbers and statistical tests)
- Experiments require special hardware (self-built, unique robots; expensive lab equipment)
- Experiments cost time ((video) demonstrations will suffice)
- Technology changes fast
Benchmarks
An effective and affordable way of conducting experiments: a sample of a task domain with well-defined performance measurements. Widely used in computer vision and robotics. Which benchmark problems do you know? (A classic example from image processing is the Lenna test image: http://www.cs.cmu.edu/~chuck/lennapg/)
RoboCup Initiative
Evaluation of full-system performance, including perception, planning, and control. Easy to understand, high publicity. Goal: "By mid-21st century, a team of fully autonomous humanoid robot soccer players shall win the soccer game, complying with the official rule of the FIFA, against the winner of the most recent World Cup."
SLAM Evaluation
- Intel dataset: laser + odometry [Haehnel, 2004]
- New College dataset: stereo + omni-directional vision + laser + IMU [Smith et al., 2009]
- TUM RGB-D dataset [Sturm et al., 2011/12]: RGB-D dataset with ground truth for SLAM evaluation; two error metrics proposed (relative and absolute error); online + offline evaluation tools; training datasets (fully available) and validation datasets (ground truth not publicly available, to avoid overfitting)
Recorded Scenes
Various scenes (handheld/robot-mounted, office, industrial hall, dynamic objects, ...), with large variations in camera speed, camera motion, illumination, and environment size.
Dataset Acquisition
- Motion capture system: camera pose (100 Hz)
- Microsoft Kinect: color images (30 Hz), depth maps (30 Hz), IMU (500 Hz)
Calibration
Calibration of the overall system is not trivial:
1. Mocap calibration
2. Kinect-mocap calibration
3. Time synchronization
Calibration: Validation
Intrinsic calibration; extrinsic calibration (color + depth); time synchronization (color + depth). The mocap system slowly drifts (it needs re-calibration every hour). Validation experiments check the quality of the calibration: 2 mm length error on a 2 m rod across the mocap volume; 4 mm RMSE on a checkerboard sequence.
Sequence description (on the website): "For this sequence, the Kinect was pointed at a typical desk in an office environment. This sequence contains only translatory motions along the principal axes of the Kinect, while the orientation was kept (mostly) fixed. This sequence is well suited for debugging purposes, as it is very simple."
Dataset Website
In total: 39 sequences (19 with ground truth). One ZIP archive per sequence, containing:
- Color and depth images (PNG)
- Accelerometer data (timestamp ax ay az)
- Trajectory file (timestamp tx ty tz qx qy qz qw)
Evaluation Metrics
Compare the ground-truth trajectory with the (pre-aligned) estimated trajectory and average the error over all time steps. Reference implementations for both evaluation metrics are available; output: RMSE, mean, and median (as text) and an optional plot (png/pdf).
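A minimal sketch of the absolute-error metric under the assumption that the estimated trajectory has already been aligned to the ground truth (as stated above); this is not the benchmark's reference implementation:

import numpy as np

def ate_rmse(gt, est):
    """Absolute trajectory error (RMSE) between ground-truth and
    pre-aligned estimated positions (both N x 3), averaged over
    all time steps."""
    diff = gt - est
    return np.sqrt((diff ** 2).sum(axis=1).mean())

# usage with assumed toy trajectories
gt = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0]], dtype=float)
est = np.array([[0, 0.1, 0], [1, -0.1, 0], [2, 0.1, 0]], dtype=float)
print(ate_rmse(gt, est))   # 0.1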
Discussion on Benchmarks
Pro: provide an objective measure; simplify empirical evaluation; stimulate comparison.
Con: introduce a bias towards approaches that perform well on the benchmark (overfitting); evaluation metrics are not unique (many alternative metrics exist, and the choice is subjective).
Final Exam
Oral exam in teams (2-3 students), at least 15 minutes per student, with individual grades. Questions will address your project, material from the exercise sheets, and material from the lecture.
Consolidation: the problem is formalized; alternative approaches appear; need for quantitative evaluation and comparison.
Settled: benchmarks appear; solid scientific analysis, textbooks, ...
Exercise Sheet 6
Prepare the final presentation. Proposed structure: 4-5 slides
1. Title slide with names + motivating picture
2. Approach
3. Results (a video is a plus)
4. Conclusions (what did you learn in the project?)
5. Optional: future work, possible extensions