Novel view synthesis using a translating camera
Geetika Sharma a, Ankita Kumar a, Shakti Kamal a,
Santanu Chaudhury b,*, J.B. Srivastava a
a Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110 016, India
b Department of Electrical Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110 016, India
Abstract
This paper addresses the problem of synthesizing novel views of a scene using images taken by an uncalibrated translating camera. We propose a method for synthesis of views corresponding to translational motion of the camera. Our
scheme can handle occlusions and changes in visibility in the synthesized views. We give a characterisation of the viewpoints for which views can be synthesized. Experimental results have established the validity and effectiveness of the method. Our synthesis scheme can also be used to detect translational pan motion of the camera in a
given video sequence. We have also presented experimental results to illustrate this feature of our scheme.
Keywords: Camera translation; Image-based rendering; Novel view synthesis; Pan-detection
1. Introduction
View synthesis from images of real-world scenes
has gained much attention in recent times mainly
due to its wide and numerous applications, ranging
from video compression to virtual walkthroughs
to special effects and animation. Active research
in this area strives for faster rendering algorithms
and more realistic synthesized views.
In this work, we have considered the problem of
synthesizing novel views of a scene using two or
more images taken by an uncalibrated, translating
camera. Since the given images are taken by a
translating camera their image planes are parallel.
Under the additional assumption of constant but
unknown internal camera parameters, we have
developed a synthesis scheme which produces perspectively correct views for arbitrary translation of
the virtual camera. Our technique also computes z-buffer values for the given and novel views so that
changes in visibility between objects in the scene
may be handled correctly. Further, our technique
can be used to detect translational camera motion
in video sequences which has applications in video
segmentation and characterisation of video sequences based on camera motion patterns. Our
scheme can be used as part of a rendering engine
for virtual walkthroughs and also to generate
translational videos of a scene from still images
of it. Additionally, the translational pan detection
scheme may be combined with the view synthesis
scheme to first identify and then compress translational videos.
Previous work in this area includes Akhloufi et al. (1999), who require the fundamental matrices relating the given and novel views to be known and do not explicitly handle occlusions; in contrast, we do not require knowledge of the fundamental matrix and handle occlusions explicitly. Avidan and
Shashua (1998) need to estimate the trifocal tensor
for view synthesis. While they use arbitrary input
views, they cannot handle occlusions. Beier and
Neely (1992) use morphing between line segments
specified and matched by a human animator and
Chen and Williams (1993) require the camera
transformation and range data to be given. Fusiello et al. (2003) use relative affine structure (Shashua and Navab, 1996), which is a projective
reconstruction. They can generate novel views
only for the same displacement of the virtual camera as between the given views and do not handle
occlusions explicitly. Our technique can generate
all in-between views and also extrapolate. Genc
and Ponce (2001) use constraints imposed by
weak-perspective and para-perspective cameras
while we work with a perspective camera. Chang
and Zakhor (2001) propose a novel representation
consisting of multiple depth and intensity levels for
new view synthesis, however, they require calibrated cameras. Inamoto and Saito (2002), Saito
et al. (2002) and Yaguchi and Saito (2002) use a
set of views taken from multiple cameras to reconstruct the scene in Projective Grid Space defined
by the epipolar geometry between two chosen basis cameras. Buehler et al. (2001) address the
problem of video stabilisation using image-based
rendering. A new sequence as seen from a stabilised camera trajectory is synthesized using a quasi-affine reconstruction of the scene. Unlike them
we do not require any kind of reconstruction to
synthesize novel views.
Lhuillier and Quan (2003) give a method for
obtaining a quasi-dense disparity map and a novel
representation of a pair of images for view synthesis. They handle occlusions by rendering new views
in a heuristically determined order using disparity
values. Computing disparity from image information alone requires rectified images which correspond to a translation of the camera in the
direction of the u axis of the image. Our technique
can handle occlusions using z-buffer values and
arbitrary camera translation. Seitz and Dyer
(1996) do a rectification and interpolation of arbitrary views followed by projective transformation
of the interpolated new view. We can create novel
views for any virtual viewpoint within the rectangle whose diagonal is the line joining the viewpoints of the two given views and not only on
the line joining them. In the case of xyz translation, the virtual viewpoint can move in rectangles
of different sizes depending on the amount of z
translation. Also, we compute z-buffer values for
explicit occlusion handling. The novel views we
render are perspectively correct while most existing
techniques, except Seitz and Dyer (1996) and Avidan and Shashua (1998), require camera parameters to produce correct perspective views.
Camera motion detection is an important prerequisite in video processing. Techniques in this
area, like Srinivasan et al. (1997) and Sudhir and
Lee (1997), are based on optical flow; Jadon et
al. (2002) uses fuzzy set theory; Park et al. (1994)
uses a transformation model based on perspective
projection, 3D rotation and zoom. We present a
novel approach to camera motion detection based
on the geometric relationships between objects in a
static scene and the constraints imposed by translational camera motion. Our technique for view
synthesis can be used for detection of translational
pan motion in video sequences in which the internal parameters of the camera do not change and
can identify translationally related frames, i.e.,
frames whose image planes are parallel, even
though the camera may have undergone arbitrary
motion between them.
This paper is organised as follows. In Section 2
we describe the view synthesis scheme for camera
translation in xy direction while Section 3 describes the scheme for translation in xyz direction.
We describe the application of our technique to
translational pan detection in Section 4 and conclude in Section 5.
2. Camera translation in xy direction
We first consider the problem of view synthesis
from a set of views taken by an uncalibrated camera translating parallel to the image plane, i.e., in
the xy direction.
2.1. Synthesis from two views
Let I1 and I2 be the two views of the scene and
let their centres of projection be C1 and C2, respectively. Also, let $P = (X, Y, Z)^T$ be a point in the scene and $p_i = (u_i, v_i, 1)^T$ its image in the ith view, $i = 1, 2$. Then, the world to image projection is given by (Hartley and Zisserman, 2000)
$$\lambda_1 p_1 = K(RP + t_1) \qquad (1)$$
where
$$K = \begin{pmatrix} f s_u & 0 & x_0 \\ 0 & f s_v & y_0 \\ 0 & 0 & 1 \end{pmatrix}$$
is the matrix of camera internals, R is a rotation
matrix representing the orientation of the camera
and t1 = (x1,y1,z1)T is a vector representing the
position of the first camera. In K, f is the focal
length, su and sv are the scale factors along the u
and v axes of the image plane, respectively, and
x0, y0 are the coordinates of the principal point. Also, $\lambda_1$ is the depth of the point P in the coordinate system of the first camera.
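To make the imaging model concrete, Eq. (1) can be written directly in code. This is a small illustrative sketch of ours, not part of the paper, and the numerical values of the internal parameters are made up for the example.

```python
import numpy as np

def project(P, K, R, t):
    """Eq. (1): lambda * p = K (R P + t). Returns the pixel (u, v) and the depth lambda."""
    x = K @ (R @ np.asarray(P, dtype=float) + np.asarray(t, dtype=float))
    return x[:2] / x[2], x[2]

# Illustrative internals; f, s_u, s_v, x_0, y_0 are assumed values, not from the paper.
f, s_u, s_v, x0, y0 = 1.0, 800.0, 800.0, 320.0, 240.0
K = np.array([[f * s_u, 0.0, x0],
              [0.0, f * s_v, y0],
              [0.0, 0.0, 1.0]])
p, lam = project(P=[1.0, 0.5, 4.0], K=K, R=np.eye(3), t=[0.1, 0.0, 0.0])
```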
We assume that $R \neq I$ and $t_1 \neq (0, 0, 0)^T$, so that the camera is not aligned with the world coordinate system. This allows us to further assume that the origin of the world coordinate system is visible in the given views. So if we have n point correspondences given between I1 and I2, without any loss of generality, we may choose one of them as the origin of the world coordinate system. Let $p_1^0$ denote its image in the first view. Then, $\lambda_1^0 p_1^0 = K t_1$ and substituting this in Eq. (1) we get
$$P = R^{-1} K^{-1} (\lambda_1 p_1 - \lambda_1^0 p_1^0) \qquad (2)$$
The second view $I_2$ has camera matrix $K[R \mid t_1 + t_2]$, where $t_2 = (x_2, y_2, 0)^T$, since we are assuming camera translation parallel to the image plane, so that $\lambda_2 p_2 = K(RP + t_1 + t_2)$. The visibility of the origin in this view reduces the projection equation to
$$P = R^{-1} K^{-1} (\lambda_2 p_2 - \lambda_2^0 p_2^0) \qquad (3)$$
and zero translation along the z direction implies $\lambda_1 = \lambda_2$ and $\lambda_1^0 = \lambda_2^0$. Eliminating P between (2) and (3), we get
$$\lambda_1 (p_2 - p_1) = \lambda_1^0 (p_2^0 - p_1^0) \qquad (4)$$
The only unknowns in the above equation are $\lambda_1^0$, which is fixed, and $\lambda_1$ for each point correspondence. Also, each point correspondence gives us two equations, linear in the unknowns. So, given $n \geq 2$ point correspondences we can set up a system of $2(n - 1)$ equations in n unknowns which can be solved using singular value decomposition. The $\lambda$'s computed are actually the values that would be stored in a z-buffer, as they are the depths of scene points from the centre of projection in the camera coordinate system. Since these $\lambda$'s are computed using image information alone, they can be used for rendering new views without any perspective error. It can be shown that the equations in (4) imply the well-known epipolar constraint. To the best of our knowledge, the equations in (4) have not been documented in the literature.
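For concreteness, the following is a minimal numerical sketch of this step, assuming the correspondences are available as NumPy arrays. The function name and interface are ours, not the paper's, and the recovered depths are defined only up to a common positive scale (the SVD returns a unit null vector).

```python
import numpy as np

def estimate_depths(p1, p2, origin_idx=0):
    """Solve Eq. (4), lambda_1 (p_2 - p_1) = lambda_1^0 (p_2^0 - p_1^0), for the
    depths of all correspondences in the xy-translation case.

    p1, p2     : (n, 2) arrays of corresponding points (u, v) in views I1 and I2.
    origin_idx : index of the correspondence chosen as the world origin.
    Returns (lambdas, lambda_0): depths of the non-origin points (origin removed,
    original order kept) and the depth of the origin point.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    n = len(p1)
    d0 = p2[origin_idx] - p1[origin_idx]            # p_2^0 - p_1^0
    rest = [i for i in range(n) if i != origin_idx]

    # Unknown vector x = [lambda_i for non-origin points] + [lambda_1^0];
    # each correspondence contributes the u and v components of Eq. (4).
    A = np.zeros((2 * (n - 1), n))
    for row, i in enumerate(rest):
        d = p2[i] - p1[i]
        A[2 * row, row], A[2 * row + 1, row] = d[0], d[1]
        A[2 * row, -1], A[2 * row + 1, -1] = -d0[0], -d0[1]

    x = np.linalg.svd(A)[2][-1]                     # right singular vector of the
    if x[-1] < 0:                                   # smallest singular value
        x = -x                                      # fix sign so depths are positive
    return x[:-1], x[-1]
```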
2.2. Synthesis of novel views
Suppose that the virtual camera undergoes
a translation $t_s = (x_s, y_s, 0)^T$ relative to the first camera. Then, the projection equation for the novel view $I_s$ is
$$\lambda_s p_s = K(RP + t_1 + t_s)$$
Again, since $z_s = 0$, $\lambda_s = \lambda_1$. Using (1), we get
$$\lambda_1 p_s = \lambda_1 p_1 + K t_s \qquad (5)$$
and, for the origin,
$$\lambda_1^0 p_s^0 = \lambda_1^0 p_1^0 + K t_s \qquad (6)$$
where $p_s^0 = (u_s^0, v_s^0, 1)^T$ is the image of the origin in the new view $I_s$. Equating the first coordinates on both sides of (6) and rearranging, we get $x_s = \lambda_1^0 (u_s^0 - u_1^0) / (f s_u)$.
Since $\lambda_1^0$, $u_1^0$, $f$ and $s_u$ are fixed by the given views, different values of $u_s^0$ correspond to different translations $x_s$ in the x direction. Thus, a translation for the virtual camera may be chosen, interactively, by giving a value to $u_s^0$. Similarly, the y translation of the virtual camera may be chosen by giving a value to $v_s^0$. It follows that a choice of $p_s^0$ fixes the translation of the virtual camera and specifies the new view. Substituting $K t_s$ from (6) in (5), we get
$$\lambda_1 p_s = \lambda_1 p_1 + \lambda_1^0 (p_s^0 - p_1^0) \qquad (7)$$
The positions of point correspondences in the
new view can be obtained from the above equation. Since we do not assume dense correspondences, the new view can be rendered by
triangulating it and the given views using corresponding points as vertices. Triangles in the new
view are then texture mapped by combining textures from corresponding triangles in the given
views. Since the computed $\lambda$'s act as z-buffer values for corresponding points, these can be used
to handle occlusions while rendering the new view.
Thus, given n point correspondences the algorithm for synthesis of new views is as follows (a minimal sketch of step 3, building on the depth computation sketched in Section 2.1, follows the list):
1. Set up a system of equations using (4) and compute the $\lambda$'s.
2. Specify a new view by giving values to $u_s^0$ and $v_s^0$.
3. Render the positions of corresponding points in the new view using (7). Resolve any visibility issues using the $\lambda$'s as z-buffer values.
4. Triangulate the new view using the rendered points as vertices and texture map the triangles from the given views.
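The sketch below illustrates step 3 only (point positions via Eq. (7)); the triangulation and texture mapping of step 4 are left to a standard rendering library. It reuses the depths from the sketch in Section 2.1 and is an illustration of ours, not the paper's implementation.

```python
import numpy as np

def synthesize_points(p1, lambdas, lambda_0, p1_0, ps_0):
    """Eq. (7): lambda_1 p_s = lambda_1 p_1 + lambda_1^0 (p_s^0 - p_1^0), i.e.
        p_s = p_1 + (lambda_1^0 / lambda_1) * (p_s^0 - p_1^0).

    p1       : (m, 2) non-origin points in the first view
    lambdas  : (m,) their depths from Eq. (4)
    lambda_0 : depth of the chosen origin point
    p1_0     : (2,) image of the origin in the first view
    ps_0     : (2,) chosen image of the origin in the novel view (fixes t_s)
    Returns the (m, 2) point positions in the novel view; `lambdas` double as
    z-buffer values when resolving visibility during texture mapping.
    """
    shift = np.asarray(ps_0, float) - np.asarray(p1_0, float)
    scale = (lambda_0 / np.asarray(lambdas, float))[:, None]
    return np.asarray(p1, float) + scale * shift
```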
2.3. Characterisation of viewpoints
Our view synthesis scheme works for any arbitrary translation of the camera. However, it is possible that the chosen camera translation is so large that, in the new view, all corresponding points except the origin move outside the field of view (FOV) of the virtual camera. None of the corresponding points would be visible in the image, in
such a case, making it impossible to triangulate
and render the new view. To avoid such choices,
we give a characterisation of all viewpoints for
which it is guaranteed that all the corresponding
points lie within the FOV of the virtual camera.
Note that we only want to ensure that all corresponding points are within the FOV of the virtual camera; it is possible that, subsequently, some points become occluded by others due to the scene geometry. We also show that the synthesized view is in fact a convex combination of the given views. We have $\lambda_1 (p_2 - p_1) = K t_2$. Combining this equation with (7) we get
$$u_s = \Big(1 - \frac{x_s}{x_2}\Big) u_1 + \frac{x_s}{x_2}\, u_2, \qquad v_s = \Big(1 - \frac{y_s}{y_2}\Big) v_1 + \frac{y_s}{y_2}\, v_2$$
Since the image is restricted to a finite portion
of the image plane, the u and v coordinates of image points are bounded. Also, since the matrix of
camera internals does not change across the given
and the synthesized views, these bounds are the
same for all the views. Let a be the lower bound
for the u axis and b, the upper bound. We will obtain a condition on xs so that us is within the
bounds of the image plane of the new view. We
have
$$a \leq u_1 \leq b, \qquad a \leq u_2 \leq b \qquad (8)$$
First suppose that $0 \leq x_s \leq x_2$. Then, multiplying the first inequality in (8) by $1 - \frac{x_s}{x_2}$, the second by $\frac{x_s}{x_2}$ and adding, we get $a \leq u_s \leq b$. Following a similar approach we can show that if c and d are the bounds for the v axis, then $c \leq v_s \leq d$ if $0 \leq y_s \leq y_2$. Thus, if we specify $I_s$ such that the centre of projection remains within the rectangle whose diagonal is the line joining $C_1$ and $C_2$, all the points in the fields of view of $I_1$ and $I_2$ are also in the FOV of $I_s$. We call this rectangle the rectangle of renderable views. It is the shaded rectangle in Fig. 4.
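Spelling out the step used above: for $0 \leq x_s \leq x_2$ the weights $1 - x_s/x_2$ and $x_s/x_2$ are non-negative and sum to one, so
$$a = \Big(1 - \tfrac{x_s}{x_2}\Big)a + \tfrac{x_s}{x_2}\,a \;\leq\; \Big(1 - \tfrac{x_s}{x_2}\Big)u_1 + \tfrac{x_s}{x_2}\,u_2 = u_s \;\leq\; \Big(1 - \tfrac{x_s}{x_2}\Big)b + \tfrac{x_s}{x_2}\,b = b.$$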
If $x_s > x_2$ and there is a point in the scene such that its u coordinate in $I_1$ is a and in $I_2$ is b, then its image in $I_s$ will have u coordinate $a(1 - \frac{x_s}{x_2}) + b\,\frac{x_s}{x_2} > b$. Such a point will not lie in the FOV of $I_s$. Thus, if $x_s > x_2$ we
cannot guarantee that all points in the fields of
view I1 and I2 will lie in the FOV of Is as well. Similarly, for xs < 0, we can show that there can be
points in the fields of view of I1 and I2 which are
not in the FOV of Is.
We conclude that for all corresponding points
to lie in the FOV of the new view we must move
the centre of projection only within the rectangle
of renderable views. Note that in this case, us be-
comes a convex combination of u1 and u2 and vs
becomes a convex combination of v1 and v2. We
now give an interpretation of these conditions in
terms of the image of the origin in the new frame.
We have
$$\lambda_1^0 (u_2^0 - u_s^0) = f s_u (x_2 - x_s)$$
Since $f s_u, \lambda_1^0 > 0$ and $(x_2 - x_s) \geq 0$, we must have $u_s^0 \leq u_2^0$. Similarly, comparing the second coordinates on both sides, we get $v_s^0 \leq v_2^0$. Thus, to remain within the rectangle of renderable views we must specify $u_s^0 \leq u_2^0$ and $v_s^0 \leq v_2^0$. Also, note that specifying $u_s^0 = u_2^0$ implies $x_s = x_2$ and gives the set of views along the boundary of the rectangle of renderable views with $x_s = x_2$. Similarly, the views along the other boundaries of the rectangle may be obtained by giving appropriate values to $u_s^0$ and $v_s^0$.
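As a small illustrative helper (ours, not the paper's), the condition on the chosen origin position can be checked directly. It assumes the convention of this section that the virtual camera stays between the two given centres of projection, and uses min/max so that either ordering of the given views is accepted.

```python
def inside_rectangle_of_renderable_views(ps_0, p1_0, p2_0):
    """True if the chosen origin position p_s^0 = (u_s^0, v_s^0) keeps the virtual
    camera inside the rectangle of renderable views, i.e. u_s^0 lies between
    u_1^0 and u_2^0 and v_s^0 lies between v_1^0 and v_2^0."""
    u_ok = min(p1_0[0], p2_0[0]) <= ps_0[0] <= max(p1_0[0], p2_0[0])
    v_ok = min(p1_0[1], p2_0[1]) <= ps_0[1] <= max(p1_0[1], p2_0[1])
    return u_ok and v_ok
```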
We would like to mention here that this is only a characterisation of viewpoints for which all corresponding points will lie in the FOV of the virtual camera; our technique can be used to synthesize views for any translation of the virtual camera. However, it is possible that for viewpoints outside the rectangle of renderable views, some or all of the corresponding points move outside the FOV of the virtual camera.

Fig. 4. Rectangle of renderable views: the camera can move in the shaded rectangle, which is the rectangle of renderable views.
2.4. Results
We have experimented with the proposed scheme on
a variety of scenes. Correspondences were established between feature points detected by the Harris corner detector. Fig. 1 shows the input images of
an outdoor scene with translation in the x direction
and six synthesized novel views. As the viewpoint
shifts in the synthesized views, some portions of
the chimneys in the background become occluded
by the tree in the foreground while others come
into view. These changes in visibility have been correctly rendered by our method. Fig. 2(a)-(d) are
four views of the same scene synthesized by Lhuillier and Quan (2003) who work with uncalibrated
cameras. The images have been obtained from
http://wwwlasmea.univ-bpclermont.fr/Personnel/
Maxime.Lhuillier/Interpol2.html. Observe the distortion in the shape of the tree. While occlusions
are being handled correctly, due to the distortion
in shape, parts of the image are not rendered correctly. For instance in (a), a portion of the chimney
that would have been occluded had the shape of the
tree been preserved is visible.
Fig. 1. Results for pure x translation: (a) and (h) are the input images with translation only in the x direction. (b)-(g) are six synthesized views with translation in the x direction. Note the movement of the tree with respect to the two chimneys highlighted by the box. In (b)-(d) the first chimney starts getting occluded by the tree while the second chimney starts becoming visible. In (e), the first chimney is completely occluded; it starts becoming visible again in (f) and (g), while the second chimney is completely visible in these views.

Fig. 2. (a)-(d) Four views synthesized by Lhuillier and Quan (2003).

2.5. Synthesis from N ≥ 3 views

Our technique can be extended to synthesize novel views from three or more views. This allows for a better coverage of the scene and expands the rectangle of renderable views to the union of the rectangles formed when the images are taken two at a time. These facts are illustrated by Fig. 3, which shows three input images of a lab scene and six synthesized views. The input views have been obtained by translation in both the x and y directions. Again, the changes in visibility induced by a shift of the viewpoint have been correctly rendered by our scheme. Using all three views we get correspondences on different parts of the big box which would not have been possible from just the first and third views. Also, since the second view has y translation, we can generate novel views with y translation, i.e., the space of renderable views expands.

Fig. 3. Results for xy translation: (a)-(c) are the input views. (a) and (b) are related by a translation in y, while (a) and (c) are related by a translation in x. (d)-(f) are three synthesized views with translation in the x direction, while (g)-(i) are those from a sequence with xy translation. Observe the changing positions of the objects relative to each other as highlighted by the boxes.

3. Camera translation in xyz direction
In this section we describe the view synthesis
scheme when the input views have been obtained
by a camera translating in an arbitrary or xyz
direction.
3.1. Synthesis from two views
The projection equations for the two views $I_1$ and $I_2$ are given by
$$\lambda_i p_i = K(RP + t_i), \qquad i = 1, 2,$$
where $t_i = (x_i, y_i, z_i)^T$ and $z_1 \neq z_2$. The equation relating the $\lambda$ values of a point in the two views is
$$\lambda_2 p_2 - \lambda_1 p_1 = \lambda_2^0 p_2^0 - \lambda_1^0 p_1^0 \qquad (9)$$
In this equation the number of unknowns increases since $\lambda_1 \neq \lambda_2$. However, the equations continue to be linear in the unknowns and can be solved given $n \geq 3$ point correspondences. The equation defining the new view is
$$\lambda_s p_s = \lambda_1 p_1 + \lambda_s^0 p_s^0 - \lambda_1^0 p_1^0 \qquad (10)$$
The non-linearity on the left-hand side can be resolved by first computing $\lambda_s$ from the last coordinate on both sides and then computing $u_s$ and $v_s$. To specify the new view we need to give values to $u_s^0$, $v_s^0$ and $\lambda_s^0$, which corresponds to the translation of the camera in the z direction. The view synthesis scheme and the rendering algorithm are the same as for the xy translation case except that we use (9) to compute the $\lambda$'s and (10) to render the new views.

If we want the virtual camera to undergo a non-zero translation in each direction, we can choose $u_s^0$, $v_s^0$ and $\lambda_s^0$ independent of each other. However, if we want zero translations along the x, y or both x and y directions, we do not have enough independent parameters to choose $u_s^0$ and $v_s^0$ independently. Such translations can be specified if the internal parameters $x_0$ and $y_0$ are known. We can, however, specify a zero translation along the z direction by choosing $\lambda_s^0 = \lambda_1^0$ and $u_s^0$ and $v_s^0$ arbitrarily. In this case $\lambda_s = \lambda_1$ and choosing $u_s^0 = u_1^0$ corresponds to zero translation along the x direction. Thus, if the translation along the z direction is zero, we can have zero translations along x or y as well. Note that the $\lambda_s$'s are the z-buffer values again. Since they have been computed using image information alone, any view rendered using these values will be perspectively correct.

Fig. 5 shows a lab and an outdoor scene with translation in the xyz direction and three synthesized views. As the viewpoint translates in z, towards the scene, the size of the objects in the images increases. This increase in the size of the objects is apparent in the synthesized views.

Fig. 5. Results for xyz translation: (a) and (b) are the input images of a lab scene with translation in the xyz direction. (c)-(e) are three synthesized views. (f) and (g) are input images of an outdoor scene. (h)-(j) are three synthesized frames. Note the increase in the size of the objects as the virtual camera translates closer to the scene.
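As an illustration of (9) and (10), the following is a minimal numerical sketch in the same spirit as the xy-translation sketch above. The function names and the NumPy-based setup are ours, not the paper's, and the recovered $\lambda$'s are only defined up to a common scale.

```python
import numpy as np

def estimate_depths_xyz(p1, p2, origin_idx=0):
    """Depths for the xyz-translation case via Eq. (9):
        lambda_2 p_2 - lambda_1 p_1 = lambda_2^0 p_2^0 - lambda_1^0 p_1^0
    p1, p2 : (n, 2) corresponding image points in the two views.
    Unknowns: lambda_1, lambda_2 per non-origin point plus lambda_1^0,
    lambda_2^0, all up to a common scale (hence the SVD solution)."""
    p1 = np.column_stack([np.asarray(p1, float), np.ones(len(p1))])  # homogeneous
    p2 = np.column_stack([np.asarray(p2, float), np.ones(len(p2))])
    rest = [i for i in range(len(p1)) if i != origin_idx]
    A = np.zeros((3 * len(rest), 2 * len(rest) + 2))
    for r, i in enumerate(rest):
        A[3 * r:3 * r + 3, 2 * r] = -p1[i]           # -lambda_1 * p_1
        A[3 * r:3 * r + 3, 2 * r + 1] = p2[i]        # +lambda_2 * p_2
        A[3 * r:3 * r + 3, -2] = p1[origin_idx]      # +lambda_1^0 * p_1^0
        A[3 * r:3 * r + 3, -1] = -p2[origin_idx]     # -lambda_2^0 * p_2^0
    x = np.linalg.svd(A)[2][-1]                      # null-space direction
    return -x if x[-1] < 0 else x                    # fix overall sign

def synthesize_point_xyz(p1, lam1, lam1_0, p1_0, ps_0, lams_0):
    """Eq. (10): lambda_s p_s = lambda_1 p_1 + lambda_s^0 p_s^0 - lambda_1^0 p_1^0.
    The third (homogeneous) coordinate yields lambda_s, then u_s, v_s follow."""
    rhs = (lam1 * np.append(p1, 1.0) + lams_0 * np.append(ps_0, 1.0)
           - lam1_0 * np.append(p1_0, 1.0))
    return rhs[:2] / rhs[2], rhs[2]                  # (u_s, v_s) and its z-buffer value
```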
3.2. Characterisation of viewpoints
As in Section 2.3 we give a characterisation of
all viewpoints for which it is guaranteed that all
corresponding points will lie within the field of
view (FOV) of the virtual camera. Since the orientation matrix R is the same in all the views and
the translation for the new view is specified with
respect to the first view, to characterise the set
(i)
of viewpoints we may assume that the first camera is aligned. Let the first camera be $K[I \mid 0]$, the second be $K[I \mid t_1]$ and the new camera be $K[I \mid t_s]$, $t_s = (x_s, y_s, z_s)^T$. Let a and b be the bounds for the u axis and c and d be the bounds for the v axis.
The view-volume of a camera is defined by four planes which intersect at the centre of projection and pass through the boundaries of the image plane. In order to ensure that all points in the fields of view of the given views are also in the FOV of the novel view, we will intersect the view-volumes of the given cameras and require that the view-volume of the new camera contain the intersection. For the first view, let the planes defining the view-volume be $\pi_1$, $\pi_2$, $\pi_3$ and $\pi_4$. Then, the equations of these planes are $\pi_1: fX - aZ = 0$, $\pi_2: fY - dZ = 0$, $\pi_3: fX - bZ = 0$, $\pi_4: fY - cZ = 0$. Let $N_i$ denote the normal of $\pi_i$, $i = 1, 2, 3, 4$, that points outside the view-volume. Let $P = (X, Y, Z)^T$ be any point inside the view-volume. Then, the dot product of $N_i$, $i = 1, 2, 3, 4$, with the vector joining any point on the plane $\pi_i$ and P is less than zero. Choosing the centre of projection as the point on the plane, P will lie inside the view-volume if
$$\frac{a}{f} Z < X < \frac{b}{f} Z \quad \text{and} \quad \frac{c}{f} Z < Y < \frac{d}{f} Z$$
Similarly, if $\pi'_1$, $\pi'_2$, $\pi'_3$ and $\pi'_4$ are the planes defining the second view-volume, then $\pi'_1: fX - aZ + fx_1 - az_1 = 0$, $\pi'_2: fY - dZ + fy_1 - dz_1 = 0$, $\pi'_3: fX - bZ + fx_1 - bz_1 = 0$, $\pi'_4: fY - cZ + fy_1 - cz_1 = 0$. A point P lies inside the second view-volume if
$$\frac{a}{f}(Z + z_1) - x_1 < X < \frac{b}{f}(Z + z_1) - x_1 \quad \text{and} \quad \frac{c}{f}(Z + z_1) - y_1 < Y < \frac{d}{f}(Z + z_1) - y_1$$
We may assume that $y_2 > y_1$, choosing as $I_1$ the view with the lesser vertical displacement. Then, the intersection of the given view-volumes is bounded by the left and bottom planes of $I_2$ and the right and top planes of $I_1$. Requiring that a point lying in the given view-volumes also lie in the new view-volume gives
$$\frac{a}{f} z_s - \max\!\left(0,\ \frac{a}{f} z_1 - x_1\right) \;\leq\; x_s \;\leq\; \frac{b}{f} z_s - \min\!\left(0,\ \frac{b}{f} z_1 - x_1\right)$$
We can get similar relations for $y_s$. Thus, if the internal parameters and the size of the image plane are known, we can get the bounds for $x_s$ and $y_s$. Using these we can obtain bounds on $u_s^0$ and $v_s^0$. Note that the amount of x or y translation permissible depends on the z translation. The set of possible viewpoints for the virtual camera turns out to be a set of rectangles of different sizes depending on the amount of z translation, as shown in Fig. 6.

Fig. 6. The set of possible virtual camera locations is the set of solid rectangles.
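A small helper of ours that simply transcribes the inequality above; it assumes the same normalised image coordinates used in this section, and analogous bounds for $y_s$ follow with c, d and $y_1$ in place of a, b and $x_1$.

```python
def xs_bounds(a, b, f, x1, z1, zs):
    """Admissible range of x_s for a virtual camera with forward translation z_s:
      (a/f) z_s - max(0, (a/f) z_1 - x_1) <= x_s <= (b/f) z_s - min(0, (b/f) z_1 - x_1)
    a, b are the u-axis image bounds; (x_1, z_1) come from the second camera's translation."""
    lo = (a / f) * zs - max(0.0, (a / f) * z1 - x1)
    hi = (b / f) * zs - min(0.0, (b / f) * z1 - x1)
    return lo, hi
```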
4. Translational pan detection

In this section we describe how our view synthesis scheme can be used to detect whether a video segment, in which the internal parameters of the camera are constant, is a translational pan sequence. We choose three frames from the sequence. The idea is that if these frames are in fact part of a translational pan sequence then the corresponding points in these frames must satisfy all the geometric relationships that we have used for view synthesis in the previous sections. We determine to what extent these constraints are satisfied by the corresponding points and then decide whether a given sequence is a translational pan sequence or not.

Let the three frames be I1, I2 and I3 and set up point correspondences between them. Treating I1 and I3 as input images we synthesize the frame
corresponding to I2. This is done by choosing a point
visible in I2 as the origin and synthesizing a new
view with the origin in the same position as in I2.
We then measure the similarity between the rendered view and I2. The mean of the distances between
the positions of corresponding points in the rendered view and their positions in I2 is used as a
measure of similarity. Any one of the corresponding points could be chosen as the origin to synthesize the new view. However, we choose that point
which gives the minimum mean error since that
point synthesizes a frame closest to the given
frame. If the mean error is within a certain threshold, the segment is classified as a translational pan
sequence. The threshold is determined from a
training data set which consists of translational
pan sequences.
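The following is a hedged sketch of this test, reusing the xy-translation helpers estimate_depths and synthesize_points sketched in Section 2; the function names and the array-based interface are ours, and the default threshold is the training-derived value reported in the next paragraph.

```python
import numpy as np

def translational_pan_error(p1, p2, p3, threshold=2.275863):
    """Classify whether frames I1, I2, I3 belong to a translational pan sequence.

    p1, p2, p3 : (n, 2) corresponding points in the three frames.
    I1 and I3 are treated as input views; the frame corresponding to I2 is
    synthesized for every choice of origin point, and the smallest mean distance
    between synthesized and observed positions is compared with the threshold.
    """
    p1, p2, p3 = (np.asarray(p, float) for p in (p1, p2, p3))
    best = np.inf
    for origin in range(len(p1)):
        lambdas, lambda_0 = estimate_depths(p1, p3, origin_idx=origin)
        rest = [i for i in range(len(p1)) if i != origin]
        # Pin the origin to its observed position in I2 and synthesize the rest.
        ps = synthesize_points(p1[rest], lambdas, lambda_0, p1[origin], p2[origin])
        err = np.mean(np.linalg.norm(ps - p2[rest], axis=1))
        best = min(best, err)
    return best, best < threshold
```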
We have tested our algorithm on a number of
sequences. In order to determine the threshold
for the mean error to declare the two frames to
be translationally related, we used a training set
consisting of 80 sequences to assess the distribution pattern of errors. On the basis of the distribution, we have found that with 99% confidence
we can identify a frame as translational pan if the mean error is less than 2.275863 pixels. Fig. 7 shows three sequences for which it is known that the coffee and house sequences are translational pan sequences and the lab sequence is not. These have been correctly classified by our technique.

Fig. 7. Sequences for pan detection: frames in the middle row were checked for translational pan using correspondences between frames in the first and last rows. The mean errors were 2.22 for the coffee sequence, 1.96 for the house sequence and 56.36 for the lab sequence.
Our technique can also determine if three nonconsecutive frames of a sequence are translationally related, i.e., if their image planes are parallel,
although the camera motion between them may
be arbitrary. Thus, we can identify both translational pan sequences as well as translationally related frames in a video sequence. This is possible
because our technique is based on the geometric
relationships between points in a static scene and
the constraints imposed by translational camera
motion which are independent of intermediate
camera motion.
5. Conclusions
In this paper we have proposed a technique for
the synthesis of novel views using two or more
views taken by an uncalibrated translating camera.
Our scheme produces correct perspective views
while most techniques require calibrated cameras
to produce correct perspective views. We have also
characterised the set of viewpoints for which new
views can be rendered. Our scheme can also be
used for detecting translational pan motion in video sequences in which the internal camera parameters do not change. This work can be extended to
handle arbitrary motion of the camera. However,
some additional information, for example, correspondence of vanishing points, would be required.
This will be a topic for future endeavours.
References
Akhloufi, M.A., Polotski, V., Cohen, P., 1999. Virtual view
synthesis from uncalibrated stereo cameras. In: Proc.
Internat. Conf. on Multimedia Computing and Systems,
vol. 2, pp. 672-676.
Avidan, S., Shashua, A., 1998. Novel view synthesis by
cascading trilinear tensors. IEEE Trans. Visualiz. Comput.
Graphics 4 (4), 293-306.
Beier, T., Neely, S., 1992. Feature-based image metamorphosis.
In: Proc. SIGGRAPH, pp. 35-42.
Buehler, C., Bosse, M., McMillan, L., 2001. Non-metric image-based rendering for video stabilization. In: Proc. Computer
Vision and Pattern Recognition, vol. 2, pp. 609-614.
Chang, N.L., Zakhor, A., 2001. Constructing a multivalued
representation for view synthesis. Internat. J. Computer
Vision 45 (2), 157-190.
Chen, S.E., Williams, L., 1993. View interpolation for image
synthesis. In: Proc. SIGGRAPH, pp. 279-288.
Fusiello, A., Caldrer, S., Sara, C., Mattern, N., Murino, V.,
2003. View synthesis from uncalibrated images using
parallax. In: Proc. Internat. Conf. on Image Analysis and
Processing, pp. 146-151.
Genc, Y., Ponce, J., 2001. Image-based rendering using
parametrised image varieties. Internat. J. Computer Vision
41 (3), 143-170.
Hartley, R., Zisserman, A., 2000. Multiple View Geometry in
Computer Vision, first ed. Cambridge University Press.
Inamoto, N., Saito, H., 2002. Intermediate view generation of
soccer scene from multiple views. In: Proc. ICPR.
Jadon, R.S., Chaudhury, S., Biswas, K.K., 2002. A fuzzy theoretic approach for camera motion detection. In: Proc. IPMU.
Lhuillier, M., Quan, L., 2003. Image-based rendering by joint
view triangulation. IEEE Trans. Circ. Systems Video
Technol. 13 (11), 1051-1063.
Park, J., Yagi, N., Enami, K., Aizawa, K., Hatori, M., 1994.
Estimation of camera parameters from image sequence for
model-based video coding. IEEE Trans. Circ. Systems
Video Technol. 4 (3), 288-296.
Saito, H., Kimura, M., Yaguchi, S., Inamoto, N., 2002. View
interpolation of multiple cameras based on projective
geometry. In: International Workshop on Pattern Recognition and Understanding for Visual Information Media.
Seitz, S.M., Dyer, C.R., 1996. View morphing. In: Proc.
SIGGRAPH, pp. 21-30.
Shashua, A., Navab, N., 1996. Relative affine structure:
Canonical model for 3D from 2D geometry and applications. IEEE Trans. Pattern Anal. Machine Intell. 18 (9),
873-883.
Srinivasan, M.V., Venkatesh, A., Hosie, R., 1997. Quantitative
estimation of camera motion parameters from video
sequences. Pattern Recognit. 30, 593-606.
Sudhir, G., Lee, J., 1997. Video annotation by motion
interpretation using optical flow streams. J. Visual Comm.
Image Represent. 7, 354-368.
Yaguchi, S., Saito, H., 2002. Arbitrary viewpoint video
synthesis from uncalibrated multiple cameras. In: Proc.
WSCG.