Geometry of Views

Consider the 3-dimensional world projected through a point in space (the camera) onto a 2-dimensional plane in space (the image). This is the basic pinhole camera model that most of our work assumes. The model lets us formalize computer vision in a geometric and algebraic language. In this language we prove fundamental constraints and invariants between images and objects, which can be used to develop algorithms for a wide array of tasks. Most of our work addresses the uncalibrated case.
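
As a minimal sketch of the pinhole model described above, the following projects world points through a 3x4 camera matrix. The function name `project` and the example camera (identity intrinsics, centered at the origin) are illustrative assumptions, not part of the original work.

```python
import numpy as np

def project(P, X):
    """Project Nx3 world points through a 3x4 camera matrix P to Nx2 image points."""
    X_h = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous world coordinates
    x_h = X_h @ P.T                                  # homogeneous image coordinates
    return x_h[:, :2] / x_h[:, 2:3]                  # perspective division

# Hypothetical camera: identity intrinsics, centered at the origin, looking down +Z.
P = np.hstack([np.eye(3), np.zeros((3, 1))])
X = np.array([[1.0, 2.0, 4.0], [0.0, -1.0, 2.0]])
x = project(P, X)
print(x)
```

In the uncalibrated case, `P` is an arbitrary 3x4 matrix of rank 3 and is known only up to a projective transformation of space.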

Plane + Parallax

Plane + Parallax provides a basis for 3D scene analysis even in scenarios where estimating the scene geometry or the camera geometry is difficult. Direct estimation of 3D motion is a difficult and ill-conditioned problem because of the large number of degrees of freedom. Planar motion estimation, on the other hand, is a much easier problem, with only 8 unknown variables in the projective model. Plane + Parallax builds on a 2D motion estimation technique by viewing 3D space relative to some arbitrary planar surface in the scene. The core of the method is detecting a single planar surface in the scene directly from image intensities and computing its 2D motion in the image plane. The detected 2D motion is used to register the images so that the planar surface appears stationary. The parallax is defined as the residual image displacement field that remains after this registration. In effect, the registration cancels the camera rotation, so the residual displacement field is affected only by the 3D translation of the camera. The magnitude of each parallax displacement is directly related to the perpendicular distance of the corresponding scene point from the planar surface, which defines a 3D structure relative to the plane. This approach can be applied to many image analysis tasks, such as the calculation of ego-motion and the detection of moving objects. Algorithms based on this method are robust and can handle multiple moving objects and occlusions.
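
The registration step can be illustrated with a small synthetic sketch. It assumes known cameras with identity intrinsics and a known reference plane, so the plane's homography is built from the geometry rather than estimated from image intensities as in the method described above. All names and numeric values are hypothetical.

```python
import numpy as np

def to_pixels(x_h):
    """Convert homogeneous image points (Nx3) to inhomogeneous points (Nx2)."""
    return x_h[:, :2] / x_h[:, 2:3]

def rotation_y(angle):
    """Rotation about the Y axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Hypothetical setup: camera 1 is [I | 0], camera 2 is [R | t].
R = rotation_y(0.05)
t = np.array([0.3, 0.1, 0.2])

# Reference plane n.X = d in camera-1 coordinates (here the plane Z = 5).
n, d = np.array([0.0, 0.0, 1.0]), 5.0

# Homography induced by the plane, mapping image 1 to image 2.
H = R + np.outer(t, n) / d

# Scene points: the first two lie on the plane, the last two lie off it.
X = np.array([[1.0, 0.5, 5.0], [-1.0, 2.0, 5.0], [0.5, -0.5, 3.0], [-0.8, 0.2, 8.0]])

x1 = X              # homogeneous points in image 1
x2 = X @ R.T + t    # homogeneous points in image 2

# Register image 2 to image 1 so that the plane appears stationary.
x2_reg = x2 @ np.linalg.inv(H).T

# Residual parallax displacement, measured in image 1.
parallax = to_pixels(x2_reg) - to_pixels(x1)

# Epipole in image 1: the projection of the second camera center -R^T t.
epipole = -R.T @ t

# A point, its registered match, and the epipole are collinear
# exactly when the determinant of the three stacked vectors is zero.
collinearity = np.array(
    [np.linalg.det(np.stack([a, b, epipole])) for a, b in zip(x1, x2_reg)]
)
print(parallax)
print(collinearity)
```

In this sketch the points on the plane have zero parallax, while the off-plane points are displaced along lines through the epipole, which is determined by the camera translation.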

Selected Papers

Michal Irani, Benny Rousso, Shmuel Peleg. Robust Recovery of Ego-Motion. CAIP 1993, Budapest, 13-15 Sept 1993.

Michal Irani, Benny Rousso, Shmuel Peleg. Computing Occluding and Transparent Motions. IJCV, Vol. 12, No. 1, January 1994.

A. Shashua and N. Navab. Relative Affine Structure: Canonical Model for 3D from 2D Geometry and Applications. PAMI Vol. 18(9), 1996.

Trilinear Tensor

Given a few images of a 3D scene, we study the relationships that exist between the images of a given point in the world. Under a projective transformation from 3D space to the image plane, matching points across three images satisfy trilinear constraints. The Trilinear Tensor is the set of coefficients of these trilinear constraints, and it captures the geometric relations between three uncalibrated projective cameras. It is a powerful tool for the analysis of multiple images of a scene. Given seven matching triplets across three views, we can calculate its 27 coefficients by solving linear equations. The Trilinear Tensor is uniquely determined for any configuration of three views, including the case in which the three camera centers are collinear, and its coefficients can be calculated directly from this configuration. The tensor contains a great deal of information, the most important being that the projections of any scene point into the three views must satisfy this set of trilinear equations.

The Trilinear Tensor has several advantages. It has no critical surface and is therefore numerically stable. It is structure invariant: no segmentation is needed, and all scene points can participate in its calculation. Many interesting geometric relationships can be derived from it: camera translation, camera rotation, epipolar geometry, the fundamental matrix, intrinsic homographies, and relative affine structure. Currently, calculating the tensor assumes prior correspondence of points; the use of the constant brightness equation for direct calculation of the tensor from image intensities is in development. One of our successful tensor-based algorithms is a new method for representing scenes that avoids an explicit 3D model.
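
The linear calculation from matching triplets can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses the point-point-point incidence form that is common in the trifocal-tensor literature, with the first camera fixed to [I | 0], and checks the result against a tensor built from synthetic cameras. All function names are hypothetical. Ten triplets are used rather than the minimum of seven, only to keep the synthetic system well conditioned.

```python
import numpy as np

def skew(v):
    """Return the 3x3 cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def trilinear_equations(x1, x2, x3):
    """9x27 block of linear equations from one matching triplet.

    Encodes [x2]_x (sum_i x1[i] * T[i]) [x3]_x = 0 for the unknown 3x3x3 tensor T.
    """
    coeffs = np.einsum('i,rj,ks->rsijk', x1, skew(x2), skew(x3))
    return coeffs.reshape(9, 27)

def estimate_tensor(pts1, pts2, pts3):
    """Least-squares tensor (up to scale) from N >= 7 homogeneous triplets."""
    A = np.vstack([trilinear_equations(a, b, c) for a, b, c in zip(pts1, pts2, pts3)])
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3, 3), A  # null vector of A, reshaped to the tensor

def tensor_from_cameras(P2, P3):
    """Reference tensor for cameras [I | 0], P2, P3 (used only to check the sketch)."""
    return np.stack([
        np.outer(P2[:, i], P3[:, 3]) - np.outer(P2[:, 3], P3[:, i]) for i in range(3)
    ])

# Synthetic, uncalibrated test data (all values are arbitrary).
rng = np.random.default_rng(0)
P2 = np.hstack([np.eye(3) + 0.1 * rng.normal(size=(3, 3)), rng.normal(size=(3, 1))])
P3 = np.hstack([np.eye(3) + 0.1 * rng.normal(size=(3, 3)), rng.normal(size=(3, 1))])
X = rng.uniform(-1.0, 1.0, size=(10, 3)) + np.array([0.0, 0.0, 4.0])
X_h = np.hstack([X, np.ones((10, 1))])

# Homogeneous image points in the three views.
pts1, pts2, pts3 = X, X_h @ P2.T, X_h @ P3.T

T_est, A = estimate_tensor(pts1, pts2, pts3)
T_true = tensor_from_cameras(P2, P3)
T_true = T_true / np.linalg.norm(T_true)

# The tensor is defined up to scale and sign, so compare by normalized correlation.
agreement = abs(np.sum(T_est * T_true))
residual = np.abs(A @ T_est.ravel()).max()
print(agreement, residual)
```

With noisy correspondences the same linear system is solved in the least-squares sense, and the image coordinates would normally be normalized first to improve conditioning.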

Selected Papers

A. Shashua. Algebraic Functions For Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) Vol. 17(8), pp. 779--789, 1995.

A. Shashua and M. Werman. Trilinearity of Three Perspective Views and its Associated Tensor. Int. Conf. Computer Vision, Boston, June 1995.

A. Shashua and S. Avidan. The Rank4 Constraint In Multiple View Geometry. To appear in ECCV, April 1996.

Benny Rousso, Shai Avidan, Amnon Shashua, Shmuel Peleg. Robust Recovery of Camera Rotation from Three Frames. ARPA Image Understanding Workshop, February 1996.

A. Shashua and P. Anandan. Trilinear Constraints Revisited: Generalized Trilinear Constraints and the Tensor Brightness Constraint. In Proc. of the ARPA Image Understanding Workshop, Palm Springs, Feb. 1996.
