Potamias Neural Mesh Simplification CVPR 2022 Paper

Neural Mesh Simplification
Rolandos Alexandros Potamias1 Stylianos Ploumpis1,2 Stefanos Zafeiriou1,2

1 2
Imperial College London Huawei Technologies Co. Ltd
{r.potamias, s.ploumpis, s.zafeiriou}@imperial.ac.uk
Figure 1. We present a fast and learnable method for mesh simplification that generates simplified meshes in real-time.
Abstract 1. Introduction
Triangle meshes remain the most popular 3D structure
Despite the advent in rendering, editing and prepro- to represent a surface. The advent of 3D scanning devices
cessing methods of 3D meshes, their real-time execution have made feasible to collect highly detailed 3D meshes
remains still infeasible for large-scale meshes. To ease that typically hold thousand of faces. However, extreme
and accelerate such processes, mesh simplification meth- details lead to enormous memory requirements that limit
ods have been introduced with the aim to reduce the mesh their usage. A wide range of applications including render-
resolution while preserving its appearance. In this work we ing and editing along with their mobile implementations,
attempt to tackle the novel task of learnable and differen- require lightweight meshes to enable real-time processing.
tiable mesh simplification. Compared to traditional simpli- Additionally, many monocular 3D reconstruction methods
fication approaches that collapse edges in a greedy itera- that utilize analysis-by-synthesis are required to be compu-
tive manner, we propose a fast and scalable method that tational efficient and differentiable in terms of their topolo-
simplifies a given mesh in one-pass. The proposed method gies. These types of iterative optimization methods can
unfolds in three steps. Initially, a subset of the input ver- drastically benefit from an differentiable on-the-fly simpli-
tices is sampled using a sophisticated extension of random fication technique that reduce their computational footprint.
sampling. Then, we train a sparse attention network to pro- Mesh simplification is a long studied problem, with an
pose candidate triangles based on the edge connectivity of immense amount of methods developed to sustainably re-
the sampled vertices. Finally, a classification network es- duce the size of the original mesh without extremely dis-
timates the probability that a candidate triangle will be in- torting its appearance. Traditional simplification techniques
cluded in the final mesh. The fast, lightweight and differen- decimate the input mesh in a greedy-fashion by prioritiz-
tiable properties of the proposed method makes it possible ing vertices and edges according to a cost function [10, 34].
to be plugged in every learnable pipeline without introduc- However, in large-scale objects scenario, simplifying over
ing a significant overhead. We evaluate both the sampled 90% of the original mesh size requires iterating through
vertices and the generated triangles under several appear- thousands of vertices resulting in an inevitable computa-
ance error measures and compare its performance against tional burden. In addition to their computational footprint,
several state-of-the-art baselines. Furthermore, we show- traditional simplification techniques are non-differentiable
case that the running performance can be up to 10× faster and thus can not be used directly in end-to-end training
than traditional methods. processes that optimize the mesh surface. To alleviate the
aforementioned limitations we propose a learnable strategy
18583
for mesh simplification that reduces both time and compu- 2. Related Work
tational requirements and provides a plug-and-play method,
ready to be adapted in any differentiable framework. Mesh Simplification. Traditional simplification algo-
A major barrier that limits learnable simplification meth- rithms repeatedly decimate the input mesh according to
ods is the discrete nature of the mesh connectivity, i.e. a cost function to preserve its rendered appearance, until
edges and triangles. Although mesh simplification can be the desired simplification ratio is reached. Simplification
achieved in a two-step process using a learnable sampling methods can be distinguished in two major categories:
method followed by an off-the-shelf triangulation algorithm vertex clustering/decimation and edge collapse methods.
(e.g. Delaunay or Ball Pivoting) [28], such setting, apart Vertex decimation methods rank vertices according to a
from being time consuming, limits the direct optimization heuristic geometric cost function, such as their distance
of the mesh surface. Recently, several approaches have been from the average plane [34–36], to ensure that least impor-
proposed to solve the non-trival task of differential triangu- tant vertices will be decimated first. However, after every
lation using soft relaxations of the discrete setting. How- vertex deletion, a re-tessellation of the generated hole is
ever, most of them are considered impractical since they are required, thus, making such algorithms non-practical. On
applied to volumetric representations [5], demand iterative the other hand, edge collapse methods preserve the input
optimizations for the triangulation for each mesh [37] or re- topology by sequentially contracting pairs of vertices (i.e.
quire 2D parametrizations [30, 31]. Our aim is to utilize a edges). Hoppe et al. [14] pioneered an energy cost function
simple but intuitive differentiable process to directly trian- defined over the edges that is attempted to be minimized in
gulate the 3D points in one-pass. To do so, we model the every contraction step. Following this idea, in the seminal
triangulation process by generating a distribution over pos- works of [10, 33], each vertex was associated with the set
sible edges and triangles and select the ones that preserve of planes in its 1-ring neighborhood and was expressed by
the appearance of the original mesh. a fundamental quadric matrix. The authors showcased, that
In this study we propose the first learnable mesh simpli- using the quadric matrix, the distance of a point from a set
fication method that generates both points and triangulation of planes can be expressed using the sum of their quadrics,
of the simplified mesh. In contrast to previous studies [28], which is known as Quadric Error Metric (QEM). Using this
we propose soft relaxation of the discrete triangulation set- property, edges that introduce the minimum point-to-plane
ting by learning the mesh connectivity distribution in an distance were the first to be collapsed. Several approaches
unsupervised manner. The proposed method can simplify have been built upon QEM to incorporate texture [1, 11],
meshes of any scale in real-time by using an extremely effi- curvature [16, 17] and spectral properties [21, 23] or to
cient point sampling method and a lightweight triangle clas- speed-up the process using parallel processing [19, 26].
sifier. To follow the initial mesh appearance, we constrain Recently, Hanocka et al. [13] proposed the utilization of
the distribution of the edges to the priors defined by the an adaptive greedy edge collapse method as a learnable
original mesh connectivity. The proposed method is fully- pooling strategy, where edge weights are learned through
differentiable and can be adapted to any training procedure the network. However, apart from the inefficient greedy
without a significant computational footprint. Finally, the nature of the edge collapse methods, the resulting mesh
proposed method can generalize to out-of-distribution sam- faces can only be decimated approximately by a factor
ples exhibiting zero-shot capabilities. of two and thus limiting its applicability. Potamias et
The main contributions of this work are summarized as: al. [28] proposed a learnable point cloud simplification
technique that preserves the curvature features of the point
• We propose the first, to the best of our knowledge, cloud. They have also extended their method to mesh
learnable mesh simplification framework that is trained simplification task by utilizing off-the-self algorithms to
to both select vertices and generate the underlining tri- triangulate the resulting simplified point cloud. Neverthe-
angulation of the surface. less, such methodology requires careful selection of the
• The proposed model is fully differentiable and can be triangulation parameters for every sample. Importantly,
directly adapted to any learnable framework. the original topology of the mesh is neglected, i.e. the
mesh connectivity might be totally different, and cannot be
• We introduce an efficient point selection method that directly optimized in a learnable process.
extends the naive random sampling [15] to a train-
able module that samples vertices from the underlying Pooling. Inspired by the regular pooling layers in CNNs,
multinomial distribution. several graph pooling operators have been introduced to re-
• Finally, we showcase a simple but intuitive triangula- duce the size of graphs and enable hierarchical training in
tion strategy that can be adapted to point cloud mesh- graph neural network (GNN) architectures. Most of these
ing. methods utilize non-trainable point selection methods, such
as Graclus clustering algorithm [6–8] and Farthest Point
18584
Figure 2. Overview of the proposed model (bottom block). Initially, Point Sampler module (top left) samples a subset of the input vertices
from the generated multinomial distribution. To avoid isolated nodes, a k-nn graph is constructed to extend the already existing edges of the
graph. Following that, a sparse attention layer weights the connectivity between nearby vertices and generates a set of candidate triangles
(top middle). Finally, Face Classifier module defines a graph over the proposed faces and assigns a probability to each triangle based on
their relative features (top right).
Sampling (FPS) method [29], to generate hierarchical rep- nature of the edges that compose the mesh connectivity.
resentations of the input graph. Ying et al. [40] proposed PointTriNet [37] utilizes two models to suggest and classify
the first trainable task-dependent pooling layer by learning a triangles in local patches, enforcing the selection of water-
clustering assignment matrix. However, the dense nature of tight and manifold triangles using ad-hoc losses. A simi-
the soft-clustering assignment matrix makes such approach lar principle is also used in [24] where the candidate trian-
infeasible for large scale graphs [3]. To tackle such limi- gles are iteratively filtered using the estimated ratio of the
tation, several sparse pooling methods have been proposed geodesic versus the euclidean distance. Recently, there has
that select Top-K nodes based on a learnable score [3,9,20]. been an exertion to employ traditional Delaunay surface tri-
However, Top-K approaches retain only a subset of the edge angulation into the learnable triangulation process. In [31],
set of the input graph, leading to isolated nodes. Recently, the authors propose to learn a parametrization that maps the
Ranjan et al. [32] utilized a sparse soft clustering matrix to input point patches to two-dimensional spaces and triangu-
address the node connectivity issues of the previous meth- late them using Delaunay technique. In a similar manner,
ods. For additional details on graph pooling layers we re- a soft relaxation of weighted Delaunay triangulation was
fer the reader to [12]. In an abstract sense, one may re- also proposed [30] that enables gradient flow, using an in-
gard mesh simplification as a pooling process, since the inclusion score to discard proposed triangles from the final
put topology is known and we seek to find its simplified mesh. Similarly to [31], the input point cloud is triangu-
version. However, in contrast to common graph pooling lated to a spectral partitioned 2D subspace produced using
architectures, mesh simplification methods should also re- Least-Squares Conformal Map parametrization. In this pa-
spect the surface properties of the mesh, such as smoothness per we propose a modular architecture that directly learns to
and manifoldness. A natural approach to bypass the limita- generate a triangulated version of the input vertex set with-
tions is to attempt to bridge both words. In this work, we out adhering to any kind of 2D projection and mapping.
adapt graph pooling strategies along with geometric losses
to tackle the problem of mesh simplification. 3. Method
Learnable Triangulation. Albeit surface reconstruc- The architecture of the proposed model is composed by
tion method have been extensively studied over the years, mainly three components: the Point Sampler, the Edge Pre-
less attention has been devoted to learnable and differen- dictor and the Face Classifier. All modules are fully differ-
tiable triangulation methods. Most of the existing tech- entiable and are trained in an end-to-end fashion. Figure 2
niques in the literature generate mesh surfaces by estimating illustrates an overview of the proposed method.
implicit funtions [4, 27], calculating voxel grids and occu-
3.1. Point Sampler
pancy fields [22, 25] or deforming a template mesh [39].
Less progress has been established to the challenging task The first module of the proposed model is a network that
of direct point set triangulation, mainly due to the discrete samples the vertex set in a way that the structure of the mesh
18585
will remain intact. Previous works, utilized Farthest Point where Wq , Wk are learnable parameters and fi , fj are the
Sampling (FPS) to sample input point sets since it has been features of points xi , xj .
shown that it accurately preserves the underlying shape of To avoid having edges of equal probability between
the object [29]. However, the iterative nature of FPS is not nearby points, we utilize DevConv to enable feature dis-
scalable and thus impractical for large scale point sets. In similarity between them. Finally, the estimated adjacency
contrast, as shown in [15], random sampling is extremely matrix is defined using the product of the estimated proba-
fast, size agnostic, holding an O(1) complexity opposed to bilities and the original adjacency matrix, following the for-
the FPS’s quadratic O(N 2 ) complexity. Based these obser- mulation of [40], as:
vations, we propose to extend random sampling to a sophis-
ticated learnable module that breaks the uniform hypothesis \mathbf {A}_{s}[i,j] = \mathbf {S}[i,:] \: \mathbf {A} \: \mathbf {S}[j, :]^T, \quad i,j \in \mathcal {G}_{ext} (3)
of random sampling and samples nodes under the assump- Our motivation behind the use of the product between the
tion of a multinomial distribution. To do so, we train the estimated and the original connectivity is to enforce the
Point Sample module to assign an inclusion score to each edges of the simplified mesh to respect the original topol-
point in the vertex set. ogy. Note that all of the aforementioned multiplication op-
Given that the simplified point set needs to approximate erations are between the sparse matrices and can be calcu-
the structure of the input point set, the Point Sampler mod- lated very efficiently.
ule needs to be trained to select points that provide the best The set of candidate faces can be easily constructed from
coverage of the input space. Taking into account that nearby the non-zero entities of the simplified adjacency matrix As .
points will also have similar latent descriptors, we propose An initial inclusion probability pinit is assigned to each tri-
t
to utilize a graph neural network (GNN) approach that will angle defined as pinit = 1
(A [i, j] + As [i, k] + As [j, k]),
t 3 s
intuitively provide shape insights for the vertices. We em- where i, j, k are the vertices of triangle t. This initial in-
ploy an update rule based on the relative vertex coordinates clusion probability can be thought as a prior that will be in-
that can approximate the deviation of a point from its neigh- voked by the Face Classifier to produce the final (posterior)
borhood as: probability of the edge.
\mathbf {f}_i = \sigma \left ( \mathbf {W}_{\phi } \max _{j \in \mathcal {N}{(\mathbf {x}_i)}} \mathbf {W}_{\theta } (\mathbf {x}_i- \mathbf {x}_j)\right ) (1) 3.3. Face Classifier
The face classifier is responsible to assign to each trian-
where N (xi ) denotes the neighbouring points of xi , gle an inclusion score pt that captures the probability that
Wϕ , Wθ are learnable parameters and σ(·) a non-linearity. this triangle will be present in the simplified mesh. To es-
We refer to this GNN layer as DevConv. timate this probability, we first construct a k-nn graph Gtri
With such formulation, the advantage is two-fold. Pri- that connects each candidate triangle with its k-neighbours
marily, the point descriptors may better describe the topo- based on their barycenter distances. Then a GNN module,
logical characteristics of a given point, where points in namely TriConv, that acts on Gtri embeds faces to the latent
sharp and rough regions will receive larger values compared space. To better capture the interactions of two triangles n
to smooth areas. Secondly, the sampling module gains ro- and m in space, we utilize a relative position encoding rn,m
bustness to noise, since the combination of the maximum defined as:
as aggregation function and the relative coordinates pro-
vides to the network the ability to easily identify outliers. \begin {split} \mathbf {r}_{n, m} = [(\mathbf {t}_n^{min} - &\mathbf {t}_m^{min}) || (\mathbf {t}_n^{max} - \mathbf {t}_m^{max}) || (\mathbf {b}_n - \mathbf {b}_m) ], \\ &\mathbf {t}_n^{max} = max( \mathbf {e}^{n}_{ij}, \mathbf {e}^{n}_{ik}, \mathbf {e}^{n}_{jk}) \end {split}
(4)
3.2. Edge Predictor
Following the Point Sampler, the Edge Predictor mod- where tmax
n , tmin
n ∈ R3 , enij the edge vectors xi − xj
ule is responsible to predict the connectivity between the for triangle n; bn , bm the barycenters of triangles n, m and
sampled points. To do so, we initially extend the original || the concatenation operation. Finally, the update rule of
mesh connectivity by inserting edges defined by k-nn graph TriConv can be defined as follows:
over the sampled points to avoid isolated nodes in the final
mesh. The extended graph Gext is then processed by a De- \mathbf {f}^{(l)}_{n} = \sum \limits _{k \in \mathcal {N}(t_n)} MLP([ \mathbf {r}_{n,k} || (\mathbf {f}^{(l-1)}_{n} - \mathbf {f}^{(l-1)}_{k})]) (5)
vConv layer followed by a sparse self-attention layer [38]
that predicts the probability that the point xi is connected
with the point xj . Such probability is formulated as: where N (tn ) the neighborhood and fn the previous
(l−1)
feature of face n. In order to generate the final inclusion

probability pt we stack three TriConv layers toped with a
(2)
softmax activation function. We use the prior probability
\mathbf {S}[i,j] = \frac {\exp \left ( \left (\mathbf {W}_q \mathbf {f}_j\right )^T \left ( \mathbf {W}_k \mathbf {f}_i\right )\right )}{\sum \limits _{k \in \mathcal {N}(\mathbf {x}_i)} \exp \left ( \left ( \mathbf {W}_q \mathbf {f}_j\right )^T \left ( \mathbf {W}_k \mathbf {f}_k\right )\right )}
pinit
t as the initial feature of each triangle.
18586
3.4. Losses where x denotes a point from the ground truth surface S
and y a point from the generated surface Ss with proba-
To train the proposed framework we utilize a set of loss
bility py . The second term of equation (8) estimates the
functions that oblige the simplified meshes to preserve
average distance between point y and its k-nearest triangles
the visual appearance of the originals. The basic idea
tk in the generated surface Ss apart from the one that point
underlying the utilized losses is to enforce the selection of
y belongs to. This last term can be intuitively described as
salient points that are representative of the mesh structure
the error introduced when the triangle in which y belongs
and to penalize badly formed triangles.
is not present in the generated triangulation. To make
the reverse term robust, we sample a sufficient amount of
Probabilistic Chamfer Distance: To improve uniform
points from each generated triangle.
sampling we force the Point Selector to assign high proba-
bilities to the points that cover the surface of the point cloud.
We mathematically formulate this by using a modified prob- Triangle Collision: To avoid having triangles that pen-
abilistic Chamfer distance: etrate each other we introduce a loss term that directly pe-
nalizes the probabilities of such triangles. We measure the
\begin {split} d_{\mathcal {P}, \mathcal {P}_s} = \sum _{\mathbf {y} \in \mathcal {P}_s} p(\mathbf {y}) \min _{\mathbf {x} \in \mathcal {P}} \|\mathbf {x}-\mathbf {y} \|^2 \\ + \sum _{\mathbf {x} \in \mathcal {P}} p(\mathbf {y})\min _{\mathbf {y} \in \mathcal {P}_s} \| \mathbf {x}-\mathbf {y} \|^2 \end {split} \label {eq:pcchamf} collision of a triangle in terms of line segments (i.e. edges
of nearby triangles) penetrating its surface. In particular,
(6)
we compute the planes defined by each face and we penal-
ize nearby triangles formed from edges that penetrate this
plane. The penalty applied to such irregular triangle is pro-
where P denotes the input vertex set, Ps the sampled points portional to the number of planes that it penetrates and it is
and p(y) their respective probabilities. Note that this loss defined as:
is applied only to the Point Sampler module and therefore
can be pre-trained and adapted to any learnable framework.
\mathcal {L}_c = \frac {1}{|\mathcal {F}_s|} \sum \limits _{t \in \mathcal {F}_s} p_t m_c(t) (9)
Probabilistic Surfaces Distance: To avoid having tri-
angles in regions that are not existing in the original mesh
and penalize the presence of surface holes, we formulate a where pt denotes the probability of triangle t, mc (t) the
Chamfer-inspired loss that measures the distance between a number of faces penetrated by triangle t and Fs the set of
ground truth and a probabilistic surface. In this setting, the generated triangles.
forward term of the distance, i.e. the distance between the
generated surface Ss and the original, enforces triangles to Edge Crossings and Overlaping triangles: Although
fit the ground truth surface S. We may calculate this term triangle collision loss may be sufficient to penalize triangles
by using the barycenters of the two surfaces as follows: that penetrate the surface of their neighboring triangles, it
can not capture and penalize overlaping triangles with par-
d_{\mathcal {S}, \mathcal {S}_s}^{f} = \sum _{\hat {\mathbf {b}} \in \mathcal {S}_s} p_{\hat {\mathbf {b}}} \min _{\textbf {b} \in \mathcal {P}} \| \hat {\mathbf {b}} -\mathbf {b}\|^2 (7) allel planes. To address this limitation we introduce two
additional losses that penalize such scenarios, namely the
where b stands for the barycenters of the ground truth sur- edge crossings loss Le and the overlap loss Lo . We calcu-
face, b̂ the barycenters of the generated triangles with re- late edge crossings of line segments (edges) of nearby tri-
spective probabilities pb̂ . In this manner, barycenters of angles and we directly penalize triangles that carry an edge
triangles in non existing regions, e.g. a triangle connect- that crosses another edge. Finally, to avoid overlaping trian-
ing the dogs legs, will have greater distance compared to gles in space, we sample a sufficient number of points from
barycenters of triangles in existing regions of the ground each generated face and compute an estimate that belongs to
truth surface. a given face. This can be efficiently implemented by mea-
In contrary to the forward term, the reverse term of the suring the sum of the areas produced by the sampled point
distance function aims to penalize areas with small proba- and the vertices of each of the k-nearest triangles. Simi-
bilities, i.e. areas that when discarded may result into the lar to collision loss, the penalty applied to each triangle is
introduction of surface holes. We can capture this mathe- proportional to the number of faces that it penetrates.
matically as follows: Overall Objective Finally, the overall loss to be mini-
mized is formed as:
\begin {split} d_{\mathcal {S}, \mathcal {S}_s}^{r} &= \sum _{y \in \mathcal {S}_s} p_\mathbf {y} \min _{\mathbf {x} \in \mathcal {P}} \|\mathbf {x}-\mathbf {y} \|^2 \\ &+ \left (1-p_\mathbf {y}\right )\frac {1}{k}\sum _k p_{t_k} \|\mathbf {x}_{t_k}-\mathbf {y} \|^2 \label {eq:revchamf} \end {split} \mathcal {L} = d_{\mathcal {P}_1, \mathcal {P}_2} + d_{\mathcal {S}_1, \mathcal {S}_2}^{f} + d_{\mathcal {S}_1, \mathcal {S}_2}^{r} + \lambda _c \mathcal {L}_c + \lambda _e \mathcal {L}_e + \lambda _o \mathcal {L}_o (10)
(8)
where λc , λe , λo are hyperparameters that scale the loss
functions.
18587
Figure 3. Qualitative comparison of the proposed and the baseline methods. The top row contains a human shape simplified by 90% and
the bottom row shows a cat model simplified by 80%. Figure better viewed in zoom.
Table 1. Quantitative comparison of the proposed and the baseline methods under several simplification ratios. Best approaches are
highlighted in bold and second best in blue.
Ns /Norg = 0.05 Ns /Norg = 0.1 Ns /Norg = 0.2 Ns /Norg = 0.5

Method CD WA LE NE Time CD WA LE NE Time CD WA LE NE Time CD WA LE NE Time
PointTriNet [37] 1.06 12.14 0.98 0.20 107.2 0.47 11.64 0.52 0.17 238.1 0.21 11.48 0.27 0.13 581.9 0.08 14.92 0.12 0.08 1333.3
QEM [10] 0.45 0.00 0.94 0.14 45.6 0.22 0.00 0.53 0.10 31.1 0.11 0.00 0.27 0.07 28.4 0.05 0.00 0.13 0.03 25.6
DSE [30] 1.39 6.64 0.95 0.23 271.8 0.51 4.38 0.50 0.19 490.7 0.24 3.12 0.26 0.15 941.0 0.07 2.39 0.12 0.08 2245.2
Potamias el al. [28] 10.47 8.53 1.23 0.21 105.2 0.83 8.39 0.76 0.17 149.3 0.49 4.72 0.43 0.14 158.7 0.20 4.67 0.22 0.10 183.7
Proposed 1.02 2.17 0.90 0.19 4.1 0.42 2.21 0.47 0.15 4.2 0.19 2.49 0.24 0.11 4.2 0.06 3.57 0.10 0.06 4.4
4. Experiments two learnable and differentiable triangulation methods, i.e.

PointTriNet [37] and DSE [30], to triangulate point clouds
In this section we evaluate the simplified meshes pro- sampled with FPS and thus introduce an alternative for dif-
duced by the proposed framework. We initially examine the ferentiable mesh simplification. Finally, we compared the
effect of Point Sampler and then we assess the quality of the proposed method against a recently introduced learnable
produced triangulation along with an assessment of the re- point cloud simplification method [28], which utilizes the
spective run times. For additional experimental results we Ball-Pivoting algorithm to triangulate the simplified point
refer the reader to the supplementary material. clouds.
Dataset: To train the proposed method we utilized the
benchmark TOSCA [2] dataset that contains 80 high reso- 4.1. Evaluation of the Simplified Meshes
lution meshes. We used the same train test split as in [28] To assess the triangulation performance, we measure
and test the model using topologies not present in the train- the percentage of non-watertight edges (WA), i.e the edges
ing set. In such setting, the devised model may be directly which have one, three or more incident faces. Compared
utilized to out-of-distribution meshes and generalize across to manifoldness metric, WA better assesses the triangula-
different topologies. tion quality when holes are present, given that the edges
Implementation: We trained our model for 150 epochs surrounding a hole are still manifold (a watertight edge is
with learning rate of 1e − 5 and a weight decay of 0.99 on manifold but a manifold edge is not guaranteed to be wa-
every epoch using the Adam optimizer [18]. The generated tertight). Additionally, to evaluate the simplification per-
simplified meshes are produced by selecting only the faces formance of each method we sample 50K points from the
with a probability above 0.5. We further constrain the gen- surface of every simplified mesh and measure the Cham-
erated meshes to be manifold by selecting for each edge the fer distance (CD) and normal dissimilarity (NE) from its
two incident to it faces with the highest probability. corresponding points in the original mesh. Finally, we cal-
Baselines: For comparison, we selected several base- culate the MSE error between the first 200 eigenvectors of
lines with different properties. In particular, we compared the Laplacian of the original and the simplified model (LE).
against the popular quadric mesh simplification (QEM) In Table 1 we quantitatively compare the proposed method
[10] which remains among the most popular and efficient with the aforementioned baselines. Although the iterative
methods in mesh simplification. Additionally, we selected QEM method better preserves the appearance of the simpli-
18588
Figure 4. Qualitative assessment of the proposed point sampling module. The proposed Point Sampler selects points that preserve both the
structure and the details of the input cloud. The proposed method better maintains the structure of the object compared to uniform and [28]
counterparts and improves the smooth point clouds produced by FPS module. The top row shows a dragon point cloud simplified to 5%
of its input size where as the bottom row shows the bunny shape simplified to 20% of its input size. Please note that both shapes are not
present in the TOSCA dataset.
Table 2. Quantitative evaluation of the point sampling module at different simplification ratios. The right part of the table includes the
simplification performance of the various methods when tested to noisy point clouds.
w/o Noise w Noise

Ns /Norg = 0.8 Ns /Norg = 0.2 Ns /Norg = 0.05 Ns /Norg = 0.8 Ns /Norg = 0.2 Ns /Norg =0.05
Method CD Curv Time CD Curv Time CD Curv Time CD Curv Time CD Curv Time CD Curv Time
FPS 0.01 2.21 21.2 0.81 2.99 7.84 4.02 3.42 3.86 2.11 4.07 21.2 3.47 4.35 7.84 7.31 4.79 3.86
Uniform 0.24 2.19 0.05 1.85 2.44 0.04 6.71 2.73 0.03 2.58 3.71 0.05 3.67 3.97 0.04 7.43 4.20 0.03
Potamias et al. [28] 0.03 2.14 120. 1.18 2.27 35.1 4.78 2.51 17.7 2.11 3.14 120. 3.38 3.65 35.1 7.61 4.12 17.7
Point Sampler 0.05 2.16 0.05 1.22 2.24 0.09 5.12 2.51 0.09 1.99 3.12 0.05 3.21 3.18 0.09 7.18 4.01 0.09
fied mesh, the proposed method achieves smaller Laplacian by a large margin. In particular, the proposed method runs
error while at the same time attain competing performance up to 10× faster than the optimized QEM method and at
over all error metrics. On the contrary, the proposed method least 100× faster than its faster differentiable counterpart.
outperforms all differentiable methods and overcomes the Another important feature of the proposed method is that it
limitations of previous triangulation approaches. In partic- remains almost unaffected by the mesh scale, due to the effi-
ular, as can be easily observed in Figure 3, the proposed cient point sampler and the lightweight structure of the face
method has very few holes compared to PointTriNet [37] classifier. In summary, the results highlight the fact that the
as well as less triangles in non-existing regions, such as the proposed method could be directly plugged into any render-
triangles occurring between the thigh and the hand (top row ing process without introducing any significant overhead.
Figure 3), due to the probabilistic surface distance loss that
4.3. Evaluation of Point Sampler
penalizes such triangles. For further qualitative assessment
see also Figure 1. Point Sampler is responsible for the selection of the ver-
tices to be maintained in the simplified mesh. To assess
4.2. Time Performance
the performance of the sampling module we measure the
One of the most prominent applications of mesh simplifi- structural error in terms of i) the Chamfer distance (CD), ii)
cation is inevitably rendering. Real-time rendering requires the details preserved using the two-sided curvature error as
lightweight model structures, therefore simplification algo- suggested in [28] and iii) the time required to simplify the
rithms are commonly equipped in rendering pipelines. To input point cloud (in seconds) under several simplification
this end, the time performance is of crucial importance for ratios. We compare the proposed method with FPS [29],
the simplification process. To assess time performance, we uniform random sampling as utilized in [15], and a recently
measured the execution time for each method to simplify 20 introduced point cloud simplification module [28]. In Fig-
meshes under different simplification ratios. Experimental ure 4 we qualitative compare the simplified point clouds.
result presented in Table 1 showcase the efficiency of the Quantitative results are presented in Table 2 showcasing
proposed method, outperforming its baseline counterparts that the proposed point sampler outperforms the uniform
18589
random sampler as proposed in [15] and also demonstrate
a balance between the smooth and the sharp results pro-
duced by FPS and Potamias et al. [28]. Importantly, the pro-
posed method remains unaffected from the size of the input,
achieving a sampling of 420%- to 2400%- times faster than
FPS and Potamias et al. [28]. This result clearly demon-
strates that the proposed method can be directly utilized to
sample large-scale point clouds with the minimal computa-
tional overhead, greatly advancing the naive random sam-
pling approach [15]. Finally, we assess the performance of
the proposed sampling module under noisy conditions by
training on clean datasets and testing on noisy point clouds Figure 5. Fine-tuning the proposed method for curvature driven
by adding zero mean and unit std Gaussian noise to the simplification.
data. (right block of Table 2). It can be easily observed
that the proposed method is less affected by the presence
of noise compared to its counterparts by virtue of the Dev-
Conv, which encodes points based on the maximum relative
features of the neighborhood.
4.4. Curvature-based Simplification

A significant property of the proposed method is that
all of its components are fully differentiable. Thus, it can
be seamlessly integrated to an arbitrary iterative framework
which requires gradients to flow throughout the optimiza-
tion process. In this experiment, we attempt to exploit
the capability of the model to be adapted to a trainable
Figure 6. Geodesic and Spectral distance comparison between
pipeline that generates customized simplification. In par- QEM and the proposed method. Distances are measured from the
ticular, we fine-tuned the proposed method to generate sim- nose-tip of the cat.
plified meshes that preserve the curvature of the original.
To achieve this, we utilized a loss term that measures the
curvature difference between the original and the simpli-
fied meshes as proposed in [28]. Experimental results re-
5. Conclusion and Limitations
vealed that the fine-tuned method achieved an improvement In this work we attempt to mark a step towards a totally
of 40% (in average) to the curvature error compared to the learnable mesh simplification framework. We proposed the
original version. An example is illustrated in Figure 5 where first differentiable mesh simplification model based on the
it can be easily observed that the fine-tuned version focuses advances of graph neural networks. The run-time efficiency
on preserving the rough details of the original mesh (such and lightweight structure of the proposed model enables its
as the eyes and the nose-tip) compared to the smooth mesh direct use in a wide range of differentiable applications. The
produced by the untouched version of the proposed method. proposed method outperforms all of its differentiable coun-
terparts. A limitation of the proposed method is that, al-
though we enforce the model using tailor-made loss func-
4.5. Evaluation of intrinsic distances
tion to preserve the topology and the manifoldness of the
To evaluate the distortion of intrinsic properties in the generated meshes, it can not be explicitly guaranteed. In-
simplified mesh we rely on geodesic and spectral distances, evitably, QEM produces smother watertight surfaces with
measured between vertices in a spotted area of the surface. finer details compared to the proposed method. However,
The biharmonic spectral distance is calculated as in [21]. QEM comes with a greater computational cost, along with
A qualitative color-coded comparison of the cat shape sim- a non-differentiable nature that limits its range of applica-
plified by 90% between QEM and the proposed method is tions. We believe that the rationale underlying the proposed
illustrated in Figure 6. Although, both methods manage to method and the presented findings of this paper will benefit
satisfactorily preserve the geodesic distances, the QEM ap- the 3D computer vision community.
proach introduces enough distortion to the biharmonic spec- Acknowledgements. Dr. Stefanos Zafeiriou ac-
tral distances. Comparisons with additional baselines are knowledges support from EPSRC fellowship Deform
provided in the supplementary material. (EP/S010203/1).
18590
References [15] Qingyong Hu, Bo Yang, Linhai Xie, Stefano Rosa, Yulan
Guo, Zhihua Wang, Niki Trigoni, and Andrew Markham.
[1] Kanchan Bahirat, Chengyuan Lai, Ryan P. Mcmahan, and Randla-net: Efficient semantic segmentation of large-scale
Balakrishnan Prabhakaran. Designing and evaluating a mesh point clouds. In Proceedings of the IEEE/CVF Conference
simplification algorithm for virtual reality. ACM Trans. Mul- on Computer Vision and Pattern Recognition, pages 11108–
timedia Comput. Commun. Appl., 14(3s), June 2018. 2 11117, 2020. 2, 4, 7, 8
[2] Alexander M Bronstein, Michael M Bronstein, and Ron [16] SJ Kim, WK Jeong, and CH Kim. Lod generation with dis-
Kimmel. Numerical geometry of non-rigid shapes. Springer crete curvature error metric. In Proceedings of Korea Israel
Science & Business Media, 2008. 6 Bi-National Conference, pages 97–104. Citeseer, 1999. 2
[3] Cătălina Cangea, Petar Veličković, Nikola Jovanović, [17] Sun-Jeong Kim, Chang-Hun Kim, and David Levin. Surface
Thomas Kipf, and Pietro Liò. Towards sparse hierarchical simplification using a discrete curvature norm. Computers &
graph classifiers. arXiv preprint arXiv:1811.01287, 2018. 3 Graphics, 26(5):657–663, 2002. 2
[4] Zhiqin Chen and Hao Zhang. Learning implicit fields for [18] Diederik P Kingma and Jimmy Ba. Adam: A method for
generative shape modeling. In Proceedings of the IEEE/CVF stochastic optimization. arXiv preprint arXiv:1412.6980,
Conference on Computer Vision and Pattern Recognition, 2014. 6
pages 5939–5948, 2019. 3 [19] Hyunho Lee and Min-Ho Kyung. Parallel mesh simplifica-
[5] Angela Dai and Matthias Nießner. Scan2mesh: From un- tion using embedded tree collapsing. The Visual Computer,
structured range scans to 3d meshes. In Proceedings of 32(6):967–976, 2016. 2
the IEEE/CVF Conference on Computer Vision and Pattern [20] Junhyun Lee, Inyeop Lee, and Jaewoo Kang. Self-attention
Recognition, pages 5574–5583, 2019. 2 graph pooling. In Kamalika Chaudhuri and Ruslan Salakhut-
[6] Michaël Defferrard, Xavier Bresson, and Pierre Van- dinov, editors, Proceedings of the 36th International Con-
dergheynst. Convolutional neural networks on graphs with ference on Machine Learning, volume 97 of Proceedings of
fast localized spectral filtering. In D. Lee, M. Sugiyama, Machine Learning Research, pages 3734–3743. PMLR, 09–
U. Luxburg, I. Guyon, and R. Garnett, editors, Advances in 15 Jun 2019. 3
Neural Information Processing Systems, volume 29. Curran [21] Thibault Lescoat, Hsueh-Ti Derek Liu, Jean-Marc Thiery,
Associates, Inc., 2016. 2 Alec Jacobson, Tamy Boubekeur, and Maks Ovsjanikov.
[7] Inderjit S Dhillon, Yuqiang Guan, and Brian Kulis. Weighted Spectral mesh simplification. Computer Graphics Forum,
graph cuts without eigenvectors a multilevel approach. IEEE 39(2):315–324, 2020. 2, 8
transactions on pattern analysis and machine intelligence, [22] Yiyi Liao, Simon Donne, and Andreas Geiger. Deep march-
29(11):1944–1957, 2007. 2 ing cubes: Learning explicit surface representations. In Pro-
[8] Xiang Feng, Wanggen Wan, Richard Yi Da Xu, Haoyu Chen, ceedings of the IEEE Conference on Computer Vision and
Pengfei Li, and J Alfredo Sánchez. A perceptual quality met- Pattern Recognition, pages 2916–2925, 2018. 3
ric for 3d triangle meshes based on spatial pooling. Frontiers [23] Hsueh-Ti Derek Liu, Alec Jacobson, and Maks Ovsjanikov.
of Computer Science, 12(4):798–812, 2018. 2 Spectral coarsening of geometric operators. ACM Trans.
Graph., 38(4), July 2019. 2
[9] Hongyang Gao and Shuiwang Ji. Graph u-nets. In Interna-
[24] Minghua Liu, Xiaoshuai Zhang, and Hao Su. Meshing point
tional Conference on Machine Learning, pages 2083–2092,
clouds with predicted intrinsic-extrinsic ratio guidance. In
2019. 3
European Conference on Computer Vision, pages 68–84.
[10] Michael Garland and Paul S Heckbert. Surface simplification Springer, 2020. 3
using quadric error metrics. In Proceedings of the 24th an-
[25] Lars Mescheder, Michael Oechsle, Michael Niemeyer, Se-
nual conference on Computer graphics and interactive tech-
bastian Nowozin, and Andreas Geiger. Occupancy networks:
niques, pages 209–216, 1997. 1, 2, 6
Learning 3d reconstruction in function space. In Proceedings
[11] Michael Garland and Paul S Heckbert. Simplifying sur- of the IEEE/CVF Conference on Computer Vision and Pat-
faces with color and texture using quadric error metrics. In tern Recognition, pages 4460–4470, 2019. 3
Proceedings Visualization’98 (Cat. No. 98CB36276), pages [26] Alexandros Papageorgiou and Nikos Platis. Triangular mesh
263–269. IEEE, 1998. 2 simplification on the gpu. The Visual Computer, 31(2):235–
[12] Daniele Grattarola, Daniele Zambon, Filippo Maria Bianchi, 244, 2015. 2
and Cesare Alippi. Understanding pooling in graph neural [27] Jeong Joon Park, Peter Florence, Julian Straub, Richard
networks. arXiv preprint arXiv:2110.05292, 2021. 3 Newcombe, and Steven Lovegrove. Deepsdf: Learning con-
[13] Rana Hanocka, Amir Hertz, Noa Fish, Raja Giryes, Shachar tinuous signed distance functions for shape representation.
Fleishman, and Daniel Cohen-Or. Meshcnn: A network with In Proceedings of the IEEE/CVF Conference on Computer
an edge. ACM Transactions on Graphics (TOG), 38(4):90:1– Vision and Pattern Recognition, pages 165–174, 2019. 3
90:12, 2019. 2 [28] Rolandos Alexandros Potamias, Giorgos Bouritsas, and Ste-
[14] Hugues Hoppe, Tony DeRose, Tom Duchamp, John McDon- fanos Zafeiriou. Revisiting point cloud simplification:
ald, and Werner Stuetzle. Surface reconstruction from unor- A learnable feature preserving approach. arXiv preprint
ganized points. In Proceedings of the 19th annual conference arXiv:2109.14982, 2021. 2, 6, 7, 8
on computer graphics and interactive techniques, pages 71– [29] Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J
78, 1992. 2 Guibas. Pointnet++: Deep hierarchical feature learning on
18591
point sets in a metric space. In I. Guyon, U. V. Luxburg,
S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R.
Garnett, editors, Advances in Neural Information Processing
Systems, volume 30. Curran Associates, Inc., 2017. 3, 4, 7
[30] Marie-Julie Rakotosaona, Noam Aigerman, Niloy Mitra,
Maks Ovsjanikov, and Paul Guerrero. Differentiable surface
triangulation. arXiv preprint arXiv:2109.10695, 2021. 2, 3,
6
[31] Marie-Julie Rakotosaona, Paul Guerrero, Noam Aigerman,
Niloy J Mitra, and Maks Ovsjanikov. Learning delaunay
surface elements for mesh reconstruction. In Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, pages 22–31, 2021. 2, 3
[32] Ekagra Ranjan, Soumya Sanyal, and Partha Pratim Taluk-
dar. ASAP: Adaptive structure aware pooling for learn-
ing hierarchical graph representations. arXiv preprint
arXiv:1911.07979, 2019. 3
[33] Rémi Ronfard and Jarek Rossignac. Full-range approxima-
tion of triangulated polyhedra. In Computer Graphics Fo-
rum, volume 15, pages 67–76. Wiley Online Library, 1996.
2
[34] Jarek Rossignac and Paul Borrel. Multi-resolution 3d ap-
proximations for rendering complex scenes. In Modeling in
computer graphics, pages 455–465. Springer, 1993. 1, 2
[35] William J Schroeder. A topology modifying progressive dec-
imation algorithm. In Proceedings. Visualization’97 (Cat.
No. 97CB36155), pages 205–212. IEEE, 1997. 2
[36] William J Schroeder, Jonathan A Zarge, and William E
Lorensen. Decimation of triangle meshes. In Proceedings
of the 19th annual conference on Computer graphics and in-
teractive techniques, pages 65–70, 1992. 2
[37] Nicholas Sharp and Maks Ovsjanikov. Pointtrinet: Learned
triangulation of 3d point sets. In European Conference on
Computer Vision, pages 762–778. Springer, 2020. 2, 3, 6, 7
[38] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszko-
reit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia
Polosukhin. Attention is all you need. In Advances in neural
information processing systems, pages 5998–6008, 2017. 4
[39] Chao Wen, Yinda Zhang, Zhuwen Li, and Yanwei Fu.
Pixel2mesh++: Multi-view 3d mesh generation via deforma-
tion. In Proceedings of the IEEE/CVF International Confer-
ence on Computer Vision, pages 1042–1051, 2019. 3
[40] Zhitao Ying, Jiaxuan You, Christopher Morris, Xiang Ren,
Will Hamilton, and Jure Leskovec. Hierarchical graph repre-
sentation learning with differentiable pooling. In S. Bengio,
H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi,
and R. Garnett, editors, Advances in Neural Information Pro-
cessing Systems, volume 31. Curran Associates, Inc., 2018.
3, 4
18592

Potamias Neural Mesh Simplification CVPR 2022 Paper

Uploaded by

Copyright:

Available Formats

Potamias Neural Mesh Simplification CVPR 2022 Paper

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Potamias Neural Mesh Simplification CVPR 2022 Paper

Uploaded by

Copyright:

Available Formats

Neural Mesh Simplification

Rolandos Alexandros Potamias1 Stylianos Ploumpis1,2 Stefanos Zafeiriou1,2

feature of face n. In order to generate the final inclusion

Ns /Norg = 0.05 Ns /Norg = 0.1 Ns /Norg = 0.2 Ns /Norg = 0.5

4. Experiments two learnable and differentiable triangulation methods, i.e.

w/o Noise w Noise

4.4. Curvature-based Simplification

You might also like