Papers by Shahriar Negahdaripour
IEEE Oceanic Engineering Society. OCEANS'98. Conference Proceedings (Cat. No.98CH36259)
Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, 2003.
Over the last decade, there has been increasing interest in developing vision systems and technologies that support the operation of unmanned platforms for positioning, mapping, and navigation. Until very recently, these developments relied on images from standard CCD cameras with a single optical center and limited field of view, making them restrictive for some applications. Panoramic images have been explored extensively in recent years [12, 13, 15, 17, 18, 19]. The particular configuration of interest to this investigation yields a conical view, which is most applicable for airborne and underwater platforms. Unlike a single catadioptric camera [2, 15], a combination of conventional cameras may be used to generate images at much higher resolution [12]. In this paper, we derive complete mathematical models of the projection and image motion equations for a down-look conical camera that may be installed on a mobile platform, e.g., an airborne or submersible system for terrain flyover imaging. We describe the calibration of a system comprising multiple cameras with overlapping fields of view to generate the conical view. We demonstrate with synthetic and real data that such images provide better accuracy in 3D visual motion estimation, which is the underlying issue in 3D positioning, navigation, mapping, image registration, and photo-mosaicking.
2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007
Target-based positioning and 3-D target reconstruction are critical capabilities in deploying submersible platforms for a range of underwater applications, e.g., search and inspection missions. While optical cameras provide high-resolution images and target details, they are constrained by a limited visibility range. In highly turbid waters, targets at distances of up to tens of meters can be recorded by high-frequency (MHz) 2-D sonar imaging systems that have been introduced to the commercial market in recent years. Because of their lower resolution and SNR and inferior target details compared to an optical camera in favorable visibility conditions, the integration of both sensing modalities can enable operation in a wider range of conditions, with generally better performance than deploying either system alone. In this paper, the estimation of the 3-D motion of the integrated system and the 3-D reconstruction of scene features are addressed. We do not require establishing matches between optical and sonar features, referred to as opti-acoustic correspondences, but rather matches within either the sonar or the optical motion sequence. In addition to improving motion estimation accuracy, advantages of the system include overcoming certain inherent ambiguities of monocular vision, e.g., the scale-factor ambiguity and the dual interpretation of planar scenes. We discuss how the proposed solution provides an effective strategy for addressing the rather complex opti-acoustic stereo matching problem. Experiments with real data demonstrate our technical contributions.
Proceedings. 1991 IEEE International Conference on Robotics and Automation
Submersibles require the capability to accurately maintain position when observing, photographing, or working at a site. The most direct way to maintain station in a near-bottom environment is to track or lock onto stationary objects on the ocean floor. A particular advantage of an optical stationkeeping system is its ability to use natural rather than manmade beacons. Improvements to our previously reported optical flow methods for detecting vehicle motion are presented. Experimental results indicate that an adaptation of Newton-Raphson search, combined with the use of a low-noise, high-accuracy camera, drastically reduces the number of points at which computations need be done. Experiments with an algorithm that accounts for the illumination variations one encounters in undersea environments show significant improvement in the estimation of vehicle motion.
Computer Vision, Graphics, and Image Processing, 1989
OCEANS 2010 MTS/IEEE SEATTLE, 2010
Computer vision is challenged by the underwater environment. Poor visibility, geometric distortions, and nonuniform illumination typically make underwater vision less trivial than open-air vision. One effect that can be rather strong in this domain is sunlight flicker. Here, submerged objects and the water volume itself are illuminated in a natural random pattern that varies spatially and temporally. This phenomenon has been considered mainly as a significant disturbance to vision. We show that the spatiotemporal variations of flicker can actually be beneficial to underwater vision. Specifically, flicker disambiguates stereo correspondence. This disambiguation is very simple, yet it yields accurate results. Under flickering illumination, each object point in the scene has a unique, unambiguous temporal signature. This temporal signature enables us to find dense and accurate correspondence underwater. The process may be enhanced by involving the spatial variability of the flicker field in the solution. The method is demonstrated underwater by in-situ experiments and may be useful for a wide range of shallow-water applications.
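The core idea — that each scene point's intensity time series under flicker is a unique signature — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the function and parameter names are ours, and it assumes rectified image stacks so that correspondences lie on the same scanline:

```python
import numpy as np

def temporal_signature_disparity(left, right, max_disp):
    """left, right: (T, H, W) rectified image stacks under flickering light.
    For each left pixel, pick the disparity whose right pixel has the most
    similar temporal signature (normalized correlation along the time axis)."""
    T, H, W = left.shape

    def normalize(stack):
        # zero-mean, unit-norm time series per pixel
        z = stack - stack.mean(axis=0)
        return z / (np.linalg.norm(z, axis=0) + 1e-9)

    L, R = normalize(left.astype(float)), normalize(right.astype(float))
    disp = np.zeros((H, W), dtype=int)
    for y in range(H):
        for x in range(W):
            best, best_d = -np.inf, 0
            for d in range(min(max_disp + 1, x + 1)):  # keep x - d >= 0
                score = (L[:, y, x] * R[:, y, x - d]).sum()
                if score > best:
                    best, best_d = score, d
            disp[y, x] = best_d
    return disp
```

Because the flicker signatures are effectively random over time, the correlation peak is sharp without any spatial window, which is what makes the disambiguation dense and simple.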
IEEE International Conference on Image Processing 2005, 2005
Projective homography sits at the heart of many problems in image registration. In addition to the many methods for estimating the homography parameters [5], analytical expressions for assessing the accuracy of the transformation parameters have been proposed [4]. We show that these expressions provide less accurate bounds than those based on the earlier results of Weng et al. [7]. The discrepancy becomes more critical in applications involving the integration of frame-to-frame homographies and their uncertainties, as in the reconstruction of terrain mosaics and the camera trajectory from flyover imagery. We demonstrate these issues through selected examples.
Proceedings 2007 IEEE International Conference on Robotics and Automation, 2007
Many applications in mobile and underwater robotics employ 3D vision techniques for navigation and mapping. These techniques usually involve the extraction and 3D reconstruction of scene interest points. Nevertheless, in large environments the huge volume of acquired information can pose serious problems for real-time data processing. Moreover, in order to minimize drift, these techniques use data association to close trajectory loops, decreasing the uncertainty in the estimated robot position and increasing the precision of the resulting 3D models. When faced with large numbers of features, the efficiency of data association decreases drastically, affecting global performance. This paper proposes a framework that greatly reduces the number of extracted features with minimal impact on the precision of the 3D scene model. This is achieved by minimizing representation redundancy: the geometry of the environment is analyzed, and only those features that are both photometrically and geometrically significant are extracted.
Proceedings of OCEANS 2005 MTS/IEEE
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes into account the 3-D shape of the terrain. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed to generate a composite ortho-photo covering a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity, in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
The 2nd Canadian Conference on Computer and Robot Vision (CRV'05)
Producing high-resolution underwater imagery in a range of visibility conditions is a capability in critical demand for a number of applications. A new generation of forward-scan acoustic video cameras, which have become available for commercial applications in recent years, produce images with considerably more target detail than optical systems in turbid waters. Previous computer processing of sonar imagery has dominantly involved target segmentation, classification, and recognition by exploiting 2-D visual cues from texture or object/shadow shape in a single frame. Processing of video is becoming ever more important because of applications that involve target tracking, object identification in search and inspection, and self-localization and mapping, among many others. This paper addresses the image registration problem for acoustic video, and the preprocessing steps to be applied to the raw video from a DIDSON acoustic camera for image calibration, filtering, and enhancement to achieve reliable results.
A common problem in video surveys in very shallow waters is the presence of strong light fluctuations due to sunlight refraction. Refracted sunlight casts fast-moving patterns that can significantly degrade the quality of the acquired data. Motivated by the growing need to improve the quality of shallow-water imagery, we propose a method to remove sunlight patterns from video sequences. The method exploits the fact that video sequences allow several observations of the same area of the sea floor over time. It is based on computing the image difference between a given reference frame and the temporal median of a registered set of neighboring images. A key observation is that this difference has two components with separable spectral content: one related to the illumination field (lower spatial frequencies) and the other to the registration error (higher frequencies). The illumination field, recovered by lowpass filtering, is used to correct the reference image. In addition to removing the sun-flickering patterns, an important advantage of the approach is its ability to preserve the sharpness of the corrected image, even in the presence of registration inaccuracies. The effectiveness of the method is illustrated on image sets acquired under strong camera motion and containing non-rigid benthic structures. The results testify to the good performance and generality of the approach.
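The median/difference/lowpass pipeline described above can be sketched compactly. This is a simplified illustration under our own assumptions, not the paper's code: the frames are taken as already registered, names are illustrative, and a plain box filter stands in for whatever lowpass filter the authors used:

```python
import numpy as np

def box_blur(img, k):
    """Separable k-tap box filter, a simple stand-in for the lowpass step."""
    kern = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kern, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kern, mode="same"), 0, out)

def deflicker(frames, ref_idx, k=9):
    """frames: (T, H, W) stack registered to the reference frame.
    The reference-minus-median difference holds flicker (low frequencies)
    plus registration error (high frequencies); subtracting only its
    lowpass part removes flicker while preserving sharpness."""
    frames = frames.astype(float)
    ref = frames[ref_idx]
    diff = ref - np.median(frames, axis=0)   # flicker + registration error
    illum = box_blur(diff, k)                # keep only the illumination part
    return ref - illum
```

Subtracting the lowpass-filtered difference, rather than replacing the frame by the median, is what keeps high-frequency scene detail from the reference frame intact even when registration is imperfect.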
2009 Workshop on Applications of Computer Vision (WACV), 2009
Consider photography in scattering media. One goal is to enhance the images and compensate for scattering effects. A second goal is to estimate a distance map of the scene. A prior method exists to achieve these goals. It is based on acquiring two images from a fixed position, using a single camera mounted with a polarizer at different settings. However, the shortcomings of this polarization-based method include having to acquire the images sequentially, a reduced light level, and inapplicability at a low backscatter degree of polarization. In this paper, a new technique is described that alleviates these issues by integrating polarization and stereo cues. More precisely, the earlier single-camera method is extended to a pair of cameras displaced by a finite baseline. Each camera utilizes a polarizer at a different setting. Stereo disparity and polarization analysis are fused to construct de-scattered left and right views. The binocular stereo cues provide additional geometric constraints for distance computation. Moreover, the proposed technique acquires the two raw images simultaneously; thus it can be applied to dynamic scenes. Underwater experiments are presented.
OCEANS 2010 MTS/IEEE SEATTLE, 2010
Binocular stereo vision is a common technique for the recovery of three-dimensional shape. Underwater, backscatter degrades image quality and consequently the performance of stereo vision-based 3-D reconstruction techniques. Recently, we proposed a method that exploits the depth cue in the backscatter components of stereo pairs as an additional constraint for recovering the 3-D scene structure. In this paper, we compare the performance of this method with the application of classic normalized-SSD-based minimization to raw underwater data, as well as to de-scattered images. Results of experiments with synthetic and real data are presented to assess the performance of our method against these other techniques.
Proceedings of OCEANS 2005 MTS/IEEE
This paper presents a method for the automatic creation of 2D mosaics of the sea floor, using video sequences acquired at different altitudes above the sea floor. The benefit of using different-altitude sequences comes from the fact that higher-altitude sequences can be used to guide the motion estimation of the lower ones, thus increasing the robustness and efficiency of the mosaicing process. Illustrative results are presented using sequences of the same coral reef patch, captured with a single video camera. The sequences present some of the common difficulties of underwater 2D mosaicing, namely a non-flat, moving environment and changing lighting conditions. Compared to the case of single-sequence mosaic creation, we show that by combining geometric information from different sequences, we are able to successfully estimate the topology of much lower-altitude sequences. This results in higher-resolution mapping of the sea floor. The importance of this work is underlined by the fact that the presented methods require inexpensive image acquisition and processing equipment, thus potentially benefiting a very large group of marine scientists.
This paper presents a method to recover the full motion (3 rotations and 3 translations) of the head from an input video using a cylindrical head model. Given an initial reference template of the head image and the corresponding head pose, the head model is created and full head motion is recovered automatically. The robustness of the approach is achieved by a combination of three techniques. First, we use the iteratively re-weighted least squares (IRLS) technique in conjunction with the image gradient to accommodate non-rigid motion and occlusion. Second, while tracking, the templates are dynamically updated to diminish the effects of self-occlusion and gradual lighting changes and to maintain accurate tracking even when the face moves out of view of the camera. Third, to minimize the error accumulation inherent in the use of dynamic templates, we re-register images to a reference template whenever the head pose is close to that in the template. The performance of the method, which runs in real time, was evaluated in three separate experiments using image sequences (both synthetic and real) for which ground-truth head motion was known. The real sequences included pitch and yaw as large as 40° and 75°, respectively. The average recovery accuracy of the 3D rotations was about 3°. In a further test, the method was used as part of a facial expression analysis system intended for use with spontaneous facial behavior, in which moderate head motion is common. The image data consisted of 1 minute of video from each of 10 subjects engaged in a 2-person interview. The method successfully stabilized face and eye images, allowing 98% accuracy in automatic blink recognition.
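The IRLS component can be illustrated in isolation. Below is a generic L1-style IRLS solver for an overdetermined linear system — a stand-in for the robust motion-parameter estimation, not the paper's code. Down-weighting large residuals is what lets the tracker tolerate pixels violating the rigid-motion model (non-rigid motion, occlusion):

```python
import numpy as np

def irls(A, b, iters=20, eps=1e-6):
    """Iteratively re-weighted least squares approximating min ||Ax - b||_1.
    Rows with large residuals (outliers, e.g. occluded pixels) receive
    small weights, so they barely influence the fit."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]    # ordinary least-squares start
    for _ in range(iters):
        r = A @ x - b
        w = 1.0 / np.maximum(np.abs(r), eps)    # L1 reweighting, eps-capped
        Aw = A * w[:, None]                     # row-weighted design matrix
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b) # weighted normal equations
    return x
```

In the tracking context, `A` would stack the linearized motion constraints (image gradients against the model's motion templates) and `b` the temporal intensity differences; here we only demonstrate the robustness mechanism itself.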
Journal of the Optical Society of America A, 1988
Finding the relationship between two coordinate systems by using pairs of measurements of the coordinates of a number of points in both systems is a classic photogrammetric task. The solution has applications in stereophotogrammetry and in robotics. We present here a closed-form solution to the least-squares problem for three or more points. Currently, various empirical, graphical, and numerical iterative methods are in use. Derivation of a closed-form solution can be simplified by using unit quaternions to represent rotation, as was shown in an earlier paper [J. Opt. Soc. Am. A 4, 629 (1987)]. Since orthonormal matrices are more widely used to represent rotation, we now present a solution in which 3×3 matrices are used. Our method requires the computation of the square root of a symmetric matrix. We compare the new result with that obtained by an alternative method in which orthonormality is not directly enforced. In this other method, a best-fit linear transformation is found, and then the nearest orthonormal matrix is chosen for the rotation. We note that the best translational offset is the difference between the centroid of the coordinates in one system and the rotated and scaled centroid of the coordinates in the other system. The best scale is equal to the ratio of the root-mean-square deviations of the coordinates in the two systems from their respective centroids. These exact results are to be preferred to approximate methods based on measurements of a few selected points.
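The centroid and scale relations stated in the abstract translate directly into code. The sketch below is our own minimal numpy version: it uses the stated centroid-difference translation and RMS-ratio scale, but solves the rotation by SVD-based orthogonal Procrustes, a common stand-in for the paper's symmetric-matrix square root construction:

```python
import numpy as np

def absolute_orientation(A, B):
    """Closed-form least-squares fit of s, R, t with B ≈ s * R @ A + t.
    A, B: (3, N) arrays of corresponding points in the two systems."""
    ca = A.mean(axis=1, keepdims=True)
    cb = B.mean(axis=1, keepdims=True)
    A0, B0 = A - ca, B - cb
    # scale: ratio of RMS deviations from the respective centroids
    s = np.sqrt((B0 ** 2).sum() / (A0 ** 2).sum())
    # rotation: orthogonal Procrustes via SVD (stand-in for the paper's
    # square-root-of-a-symmetric-matrix solution)
    U, _, Vt = np.linalg.svd(B0 @ A0.T)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # guard against reflection
    R = U @ D @ Vt
    # translation: centroid difference after rotating and scaling
    t = cb - s * R @ ca
    return s, R, t
```

With noise-free correspondences the recovery is exact; with noisy data the same formulas give the least-squares optimum over all point pairs, which is the abstract's argument against methods based on a few selected points.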
Marine Ecology, 2007
Four hurricanes impacted the reefs of Florida in 2005. In this study, we evaluate the combined impacts of hurricanes Dennis, Katrina, Rita, and Wilma on a population of Acropora palmata using a newly developed video-mosaic methodology that provides a high-resolution, spatially accurate landscape view of the reef benthos. Storm damage to A. palmata was surprisingly limited; only 2 of 19 colonies were removed from the study plot at Molasses Reef. The net tissue loss for the colonies that remained was only 10%, and the mean diameter of colonies decreased slightly, from 88.4 to 79.6 cm. In contrast, the damage to the reef framework was more severe: a large section (6 m in diameter) was dislodged, overturned, and transported to the bottom of the reef spur. The data presented here show that two-dimensional video-mosaic technology is well suited to assessing the impacts of physical disturbance on coral reefs and can be used to complement existing survey methodologies.
Journal of Multimedia, 2006
Belief propagation and graph cuts have emerged as powerful tools for computing efficient approximate solutions to the stereo disparity field modeled as a Markov random field (MRF). These algorithms have provided the best performance on a standard data set [1]. However, reliance on the brightness constancy (BC) assumption severely limits the range of their applications. Previously, augmenting BC with the gradient constancy (GC) assumption has been shown to produce a more robust optical flow algorithm [2], [3]. In this paper, these constraints are integrated within the MRF framework to devise an enhanced global method that broadens the application domains of stereo computation. Results of experiments with both semi-synthetic data and more challenging ocean images are presented to illustrate that the proposed method generally outperforms earlier dense optical flow and stereo algorithms.
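One way to picture the integration is at the level of the MRF data term. The sketch below is our own illustration, not the paper's formulation: per-disparity costs mix absolute brightness differences (BC) with absolute horizontal-gradient differences (GC), producing a cost volume that a BP or graph-cut solver would then regularize over the MRF:

```python
import numpy as np

def bc_gc_data_cost(left, right, max_disp, alpha=0.5):
    """Per-disparity data costs mixing brightness constancy (BC) and
    gradient constancy (GC). Returns a (max_disp+1, H, W) cost volume;
    alpha trades off the two cues (an illustrative parameter, not the
    paper's)."""
    left, right = left.astype(float), right.astype(float)
    H, W = left.shape
    gx_l = np.gradient(left, axis=1)   # horizontal image gradients
    gx_r = np.gradient(right, axis=1)
    cost = np.full((max_disp + 1, H, W), np.inf)  # inf where d > x (no match)
    for d in range(max_disp + 1):
        bc = np.abs(left[:, d:] - right[:, :W - d])
        gc = np.abs(gx_l[:, d:] - gx_r[:, :W - d])
        cost[d, :, d:] = (1 - alpha) * bc + alpha * gc
    return cost
```

The gradient term is what keeps the data cost meaningful under the additive illumination changes common in ocean imagery, where raw brightness differences between the two views are unreliable.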
Journal of Field Robotics, 2009
Scene modeling has a key role in applications ranging from visual mapping to augmented reality. This paper presents an end-to-end solution for creating accurate three-dimensional (3D) textured models using monocular video sequences. The methods are developed within the framework of sequential structure from motion, in which a 3D model of the environment is maintained and updated as new visual information becomes available. The proposed approach contains contributions at different levels. The camera pose is recovered by directly associating the 3D scene model with local image observations, using a dual-registration approach. Compared to standard structure-from-motion techniques, this approach decreases error accumulation while increasing robustness to scene occlusions and feature association failures, and it allows 3D reconstruction for any type of scene. Motivated by the need to map large areas, a novel 3D vertex selection mechanism is proposed that takes into account the geometry of the scene. Vertices are selected not only to have high reconstruction accuracy but also to be representative of the local shape of the scene. This results in a reduction in the complexity of the final 3D model, with minimal loss of precision. As a final step, a composite visual map of the scene (mosaic)
Image and Vision Computing, 2009
This paper presents a novel approach for combining a set of registered images into a composite mosaic with no visible seams and minimal texture distortion. To promote execution speed when building large-area mosaics, the mosaic space is divided into disjoint regions of image intersection based on a geometric criterion. Pairwise image blending is performed independently in each region by means of watershed segmentation and graph-cut optimization. A contribution of this work, the use of watershed segmentation to find possible cuts over areas of low photometric difference, allows searching over a much smaller set of watershed segments instead of over the entire set of pixels in the intersection zone. The proposed method presents several advantages. The use of graph cuts over image pairs guarantees the globally optimal solution for each intersection region. The independence of these regions makes the algorithm suitable for parallel implementation. The separate use of the geometric and photometric criteria removes the need for a weighting parameter. Finally, the method allows the efficient creation of large mosaics without user intervention. We illustrate the performance of the approach on image sequences with prominent 3D content and moving objects.