Papers by Christoph Bregler
Image-based modeling and rendering differs from traditional graphics in that both the geometry and appearance of the scene are derived from real photographs. The techniques often allow for shorter modeling times, faster rendering speeds, and unprecedented levels of photorealism. In this course we will explain and demonstrate a variety of ways of turning images into models and then back into renderings, including movie maps, panoramas, image warping, photogrammetry, light fields, and 3D scanning. The course surveys the relevant topics in computer vision and shows how these methods relate to image-based rendering techniques. It shows ways of applying the techniques to animation as well as to 3D navigation, and to both real and synthetic scenes. One underlying theme is that the various modeling techniques make tradeoffs between navigability, geometric accuracy, manipulability, ease of acquisition, and level of photorealism; another theme is the close connection between image-based modeling and rendering and global illumination. The course shows how image-based lighting techniques allow photorealistic additions and modifications to be made to image-based models. The described techniques are illustrated with results from recent research, pioneering projects, and creative applications in art and cinema.
Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997
This paper describes a probabilistic decomposition of human dynamics at multiple abstractions, and shows how to propagate hypotheses across space, time, and abstraction levels. Recognition in this framework is the succession of very general low-level grouping mechanisms to increasingly specific and learned model-based grouping techniques at higher levels. Hard decision thresholds are delayed and resolved by higher-level statistical models and temporal context. Low-level primitives are areas of coherent motion found by EM clustering, mid-level categories are simple movements represented by dynamical systems, and high-level complex gestures are represented by Hidden Markov Models as successive phases of simple movements. We show how such a representation can be learned from training data, and apply it to the example of human gait recognition.
Lecture Notes in Computer Science, 2002
This paper demonstrates a new visual motion estimation technique that is able to recover high degree-of-freedom articulated human body configurations in complex video sequences. We introduce the use and integration of a mathematical technique, the product of exponential maps and twist motions, into differential motion estimation. This results in solving simple linear systems, and enables us to robustly recover the kinematic degrees of freedom under noise and in complex self-occluded configurations. A new factorization technique also lets us recover the kinematic chain model itself. We are able to track several human walk cycles, several wallaby hop cycles, and two walk cycles of the famous movements of Eadweard Muybridge's motion studies from the last century. To the best of our knowledge, this is the first computer-vision-based system that is able to process such challenging footage.
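The product of exponential maps named in the abstract above can be illustrated with a small forward-kinematics sketch. This is not the paper's tracker: it only shows how the pose of a point on a kinematic chain is obtained by composing one rotation per revolute joint, each computed with Rodrigues' formula. The function names and the two-joint planar arm are assumptions chosen for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def rotate_about_axis(p, omega, q, theta):
    """Apply exp(xi * theta) to point p for a revolute twist.

    omega is the unit direction of the joint axis and q is any point on it
    (the twist is xi = (-omega x q, omega)).
    """
    v = tuple(pi - qi for pi, qi in zip(p, q))
    c, s = math.cos(theta), math.sin(theta)
    wxv = cross(omega, v)
    wdv = dot(omega, v)
    # Rodrigues' rotation formula applied to the offset from the axis point.
    r = tuple(v[i] * c + wxv[i] * s + omega[i] * wdv * (1.0 - c)
              for i in range(3))
    return tuple(ri + qi for ri, qi in zip(r, q))

def forward_kinematics(p, joints, thetas):
    """Product of exponentials: exp(xi_1 th_1) ... exp(xi_n th_n) applied to p.

    joints is a list of (omega, q) pairs ordered from base to tip, given in
    the rest pose. The tip-most joint is applied to the point first.
    """
    for (omega, q), theta in zip(reversed(joints), reversed(thetas)):
        p = rotate_about_axis(p, omega, q, theta)
    return p

# Hypothetical planar arm: two unit-length links, both joints about the z-axis.
z_axis = (0.0, 0.0, 1.0)
joints = [(z_axis, (0.0, 0.0, 0.0)), (z_axis, (1.0, 0.0, 0.0))]
tip = forward_kinematics((2.0, 0.0, 0.0), joints, [math.pi / 2, math.pi / 2])
print(tip)  # approximately (-1, 1, 0)
```

Because every joint is expressed in the rest pose, no per-link coordinate frames are needed; this is the property that makes the representation convenient to linearize in the joint angles.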
Proceedings of the first ACM conference on Online social networks - COSN '13, 2013
ACM Transactions on Graphics, 2005
2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
Recent state-of-the-art performance on human-body pose estimation has been achieved with Deep Convolutional Networks (ConvNets). Traditional ConvNet architectures include pooling layers which reduce computational requirements, introduce invariance and prevent over-training. These benefits of pooling come at the cost of reduced localization accuracy. We introduce a novel architecture which includes an efficient 'position refinement' model that is trained to estimate the joint offset location within a small region of the image. This refinement model is jointly trained in cascade with a state-of-the-art ConvNet model [21] to achieve improved accuracy in human joint location estimation. We show that the variance of our detector approaches the variance of human annotations on the FLIC [20] dataset and outperforms all existing approaches on the MPII-human-pose dataset [1].
This paper demonstrates a content-based retrieval strategy that can tell whether there are naked people present in an image. No manual intervention is required. The approach combines color and texture properties to obtain an effective mask for skin regions. The skin mask is shown to be effective for a wide range of shades and colors of skin. These skin regions are then fed to a specialized grouper, which attempts to group a human figure using geometric constraints on human structure. This...
2010 20th International Conference on Pattern Recognition, 2010
... George Williams, Graham Taylor, Kirill Smolskiy, Christoph Bregler. Dept. of Computer Science, Courant Institute, New York University. george,graham,kirill,chris@movement.nyu.edu Abstract ... Page 3. Figure 3. Multi-Modal Architecture 3.4 Discriminant Classification Campbell et al. ...
This paper presents an algorithm for learning the time-varying shape of a non-rigid 3D object from uncalibrated 2D tracking data. We model shape motion as a rigid component (rotation and translation) combined with a non-rigid deformation. Reconstruction is ill-posed if arbitrary deformations are allowed. We constrain the problem by assuming that the object shape at each time instant is drawn from a Gaussian distribution. Based on this assumption, the algorithm simultaneously estimates 3D shape and motion for each time frame, learns the parameters of the Gaussian, and robustly fills in missing data points. We then extend the algorithm to model temporal smoothness in object shape, thus allowing it to handle severe cases of missing data.
This paper describes a new technique for object recognition based on learning appearance models. The image is decomposed into local regions which are described by a new texture representation derived from the output of multiscale, multi-orientation filter banks. We call this representation "Generalized Second Moments" as it can be viewed as a generalization of the windowed second moment matrix.
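For readers unfamiliar with the term, the ordinary windowed second moment matrix mentioned above can be sketched as follows. This is only the standard matrix built from image gradients, not the paper's generalized version derived from filter-bank outputs; the function name, the central-difference gradients, and the square unweighted window are assumptions made for illustration.

```python
def second_moment_matrix(image, row, col, radius=1):
    """Sum gradient outer products over a (2*radius+1)^2 window.

    image is a list of equal-length rows of numbers. Returns the entries
    (sxx, sxy, syy) of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]].
    """
    height, width = len(image), len(image[0])
    sxx = sxy = syy = 0.0
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            # Central differences need both neighbors, so skip border pixels
            # and anything outside the image.
            if not (1 <= r < height - 1 and 1 <= c < width - 1):
                continue
            ix = (image[r][c + 1] - image[r][c - 1]) / 2.0
            iy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    return sxx, sxy, syy

# A 5x5 image containing a single vertical edge.
edge = [[0, 0, 0, 1, 1] for _ in range(5)]
print(second_moment_matrix(edge, 2, 2))  # (1.5, 0.0, 0.0)
```

For the vertical edge all gradient energy lies along x, so only the first entry is nonzero; a textured or corner-like region would populate the other entries as well.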
... Mixtures of Second Moment Experts Christoph Bregler and Jitendra Malik ... The new technique has a 6.5% misclassification rate, compared to eigen-images which give 17.4% misclassification rate, and nearest neighbors which give 15.7% misclassification rate. 1 Introduction ...
In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features. We propose a new human body pose dataset, FLIC-motion, that extends the FLIC dataset with additional motion features. We apply our architecture to this dataset and report significantly better performance than current state-of-the-art pose detection systems.