One can recover the motion of an observer in a static environment directly from first derivatives (spatial and temporal) of image brightness, accumulated over a large part of the image. There is no need to extract features and to determine correspondences between features in successive frames. This is fortunate, since the correspondence problem in the case of motion vision is even harder than it is for binocular stereo, where matching features have to lie on corresponding epipolar lines. Motion vision—at least short-range motion vision—should be easier than binocular stereo, not harder. In the case of rigid body motion there is also no need to first estimate the optical flow, since it is so highly constrained.
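As an illustration of recovering motion directly from brightness derivatives, here is a minimal sketch for one special case only: pure rotation of the observer, where scene depth drops out. It is not the paper's general method. It assumes perspective projection with unit focal length, the usual brightness change constraint E_x u + E_y v + E_t = 0, and one particular sign convention for the rotational motion field. Under those assumptions each pixel gives one linear equation E_t + v · ω = 0 in the rotation ω = (A, B, C), and summing over the image gives a 3x3 least-squares system. The function names (`rotation_from_derivatives`, `solve3`, `synthetic_samples`) are hypothetical, and the data are synthetic.

```python
import random


def constraint_vector(x, y, Ex, Ey):
    """Vector v such that E_t + v . omega = 0 under the assumed convention."""
    r = x * Ex + y * Ey
    return (Ey + y * r, -Ex - x * r, y * Ex - x * Ey)


def solve3(M, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(M, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("system is singular: too little brightness variation")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return tuple(x)


def rotation_from_derivatives(samples):
    """Least-squares rotation (A, B, C) from (x, y, Ex, Ey, Et) samples.

    Accumulates the normal equations (sum v v^T) omega = -(sum Et v)
    over all samples; no features or correspondences are involved.
    """
    M = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, y, Ex, Ey, Et in samples:
        v = constraint_vector(x, y, Ex, Ey)
        for i in range(3):
            b[i] -= Et * v[i]
            for j in range(3):
                M[i][j] += v[i] * v[j]
    return solve3(M, b)


def synthetic_samples(omega, n=500, seed=0):
    """Random gradients with Et chosen to satisfy the constraint exactly."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x, y = rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)
        Ex, Ey = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        v = constraint_vector(x, y, Ex, Ey)
        Et = -sum(vi * wi for vi, wi in zip(v, omega))
        samples.append((x, y, Ex, Ey, Et))
    return samples


if __name__ == "__main__":
    true_omega = (0.01, -0.02, 0.005)
    print(rotation_from_derivatives(synthetic_samples(true_omega)))
```

Because the synthetic temporal derivatives are noise-free, the estimate should match the chosen rotation up to rounding error. With real images the derivatives would have to be estimated from the frames, and translation would add a depth-dependent term that this sketch ignores.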