Writing Task 2
You are required to analyze and evaluate one Introduction from your target journals.
You may focus on the components and structure of the Introduction. (no less than 200
words)
The paper analyzed here presents a new 3D reconstruction method proposed by
Google, which improves on the current state-of-the-art 3D Gaussian Splatting method.
The Introduction opens with a brief overview of the development of 3D
reconstruction. It first lists several earlier scene representation methods:
point-based, surface-based, and volume-based representations, and NeRF, citing
representative papers for each. The overview then arrives at 3D Gaussian Splatting
(3DGS), the current state-of-the-art reconstruction method. This opening section
makes clear to the reader what is being covered: the author walks the reader through
the evolution of the field, from the early representation methods to the more recent
3DGS.
The second paragraph discusses the shortcomings of 3DGS compared with the
older NeRF method. The first is that Gaussian primitives in 3DGS appear to have the
same opacity regardless of the viewing direction of the camera. The second is that
Gaussian primitives are assumed to be sortable using only their centers, which causes
the rendered image to change abruptly as the camera moves. The description of these
technical problems is specific and clear, so readers in the field can accurately
understand the shortcomings. Raising these two problems serves to introduce the work
done in this paper: the paragraph links the technical background of the first
paragraph to the main contributions presented in the third.
The third paragraph introduces the main contributions of this paper. In this
section, the problems raised above are addressed directly, forming a problem-solution
structure that strengthens the logical flow.
Overall, the introduction is carefully structured, following a "background -
problem - solution" logic. The author first introduces the research background, then
discusses the limitations of existing technology, and finally presents the work done.
This allows the reader to clearly identify the shortcomings of current technology, and
the solutions that follow feel reasonable, making the research contribution
convincing.
Supplement
1. Introduction
The field of 3D reconstruction for novel view synthesis has explored a variety of
scene representations: point based [22], surface based [26, 44, 47], and volume based
[29, 31]. Since their introduction in NeRF [29], differentiable rendering of volumetric
scene representations has become popular due to its ability to yield photorealistic
3D reconstructions. More recently, 3D Gaussian Splatting (3DGS) combined the
speed of point based models with the differentiability of volume based representations
by representing the scene as a collection of millions of Gaussian primitives that can be
rendered via rasterization in real-time.
Unlike NeRF, 3DGS lacks a true volumetric density field, and uses Gaussians to
describe the opacity of the scene rather than its density. As such, 3DGS’s scene
representation may violate volumetric rendering, in that an anisotropic Gaussian
primitive in 3DGS will appear to have the same opacity regardless of the viewing
direction of the camera. This lack of a consistent underlying density field prohibits the
use of various algorithms and regularizers (such as the distortion loss from Mip-
NeRF360 [2]), but the biggest downside of this model is popping. Popping in 3DGS
is due to two assumptions made by the model: that primitives do not overlap, and that
(given a camera position) primitives can be sorted accurately using only their centers.
These assumptions are almost always violated in practice, which causes the rendered
image to change significantly as the camera moves due to the sort-order of primitives
changing. This popping may not be noticeable when primitives are small, but
describing a large scene using many small primitives requires a prohibitively large
amount of memory.
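The popping mechanism described above can be illustrated with a minimal sketch (not from the paper; all values are toy assumptions): two overlapping primitives are alpha-composited front-to-back, sorted only by the depth of their centers. A tiny camera move flips the center-depth order, so the blended color jumps discontinuously even though the scene barely changed.

```python
# Toy illustration (not the paper's code): sorting primitives by center
# depth alone causes "popping" when a small camera move flips the order.

def composite(ordered):
    """Front-to-back alpha compositing of (color, alpha) pairs."""
    color, transmittance = 0.0, 1.0
    for c, a in ordered:
        color += transmittance * a * c
        transmittance *= 1.0 - a
    return color

# Two overlapping primitives: (center depth along view, color, alpha).
primitives = [(2.00, 1.0, 0.6), (2.01, 0.0, 0.6)]

def render(camera_shift):
    # Toy camera model: the shift slightly perturbs the second
    # primitive's center depth, enough to swap the sort order.
    depths = [primitives[0][0], primitives[1][0] - camera_shift]
    order = sorted(zip(depths, primitives), key=lambda t: t[0])
    return composite((c, a) for _, (_, c, a) in order)

before = render(0.00)  # primitive 0 sorted in front -> 0.6
after = render(0.02)   # order flips, primitive 1 in front -> 0.24
print(before, after)   # a large color jump for a tiny camera move
```

The exact-volume-rendering formulation proposed in the paper avoids this failure by not relying on a per-primitive depth sort of centers at all.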
In this work we build on the primitive based representation of 3DGS, but we
define a physically accurate, constant density ellipsoid based representation. This
formulation lets us compute the exact form of the volume rendering integral
efficiently, which eliminates the inconsistencies of 3DGS while still maintaining real-
time framerates. Because our system is built around ray tracing (rather than 3DGS’s
rasterization) it can straightforwardly model various optical effects like radial
lens distortion, including fisheye, and defocus blur. Our method guarantees 3D
consistent realtime rendering while also improving the image quality of our 3DGS
baselines. This is particularly pronounced in the challenging large-scale Zip-NeRF
scenes [4] where our method matches the quality of state-of-the-art offline rendering
methods.