3D Freeform Surfaces from Planar Sketches Using
Neural Networks
Usman Khan, Abdelaziz Terchi, Sungwoo Lim, David Wright, and Sheng-Feng Qin
Brunel University, Uxbridge, Middlesex, United Kingdom
{Usman.Khan, Aziz.Terchi, Sungwoo.Lim, David.Wright,
Sheng.Feng.Qin}@brunel.ac.uk
Abstract. A novel intelligent approach to 3D freeform surface reconstruction
from planar sketches is proposed. A multilayer perceptron (MLP) neural
network is employed to induce 3D freeform surfaces from planar freehand
curves. Planar curves were used to represent the boundaries of a freeform
surface patch. The curves were varied iteratively and sampled to produce
training data to train and test the neural network. The obtained results
demonstrate that the network successfully learned the inverse-projection map
and correctly inferred the respective surfaces from fresh curves.
Keywords: neural networks, freeform surfaces, sketch-based interfaces.
1 Introduction
The preliminary stages of the conceptual product design process are characterised by
a high degree of creative activity. Designers strive to convert new ideas into graphical
form as soon as possible. It can be argued that sketching is an essential activity for
creative design. The reasons are manifold. It permits the rapid exploration and
evaluation of concepts [1]. It also assists the designer’s short-term memory and
facilitates communication with other people. When designers sketch shapes on a sheet
of paper, they start with a vague concept, which they progressively refine into a final
product. While numerous iterations are usually undertaken, the salient properties of
the original idea are often maintained. Recently, the desire to automate the early phases of conceptual product design has given impetus to the development of intelligent tools that simulate the way sketching is performed by designers [2-4]. However,
most existing approaches are restricted to fairly simple objects such as planar and
polygonal shapes. Consideration of complex freeform surfaces is a challenging problem that has surprisingly received little attention in the literature.
The problem of reconstructing a three-dimensional (3D) shape from a planar drawing is a fundamental problem in computer vision and computer-aided geometric design. Clowes [5] developed a classification method based on labelling drawings and sorting their edges to recognise polyhedral shapes. Though this method was extended to other line drawings [6-8], such work mainly involved determining depth from a 2D drawing consisting of flat surfaces with straight-line edges. With regard to freeform surfaces, some of the foundational work was developed by Igarashi et
al. [3] who reproduced rough freeform models from freehand sketch input. Since then
only moderate progress has been achieved in recovering freeform surfaces from online sketches. Michalik et al. [9] proposed a constraint-based system that
reconstructed a 3D B-spline surface from a 2D sketch. These approaches employ
techniques based on rules or constraints to extract the correlation between the 2D
drawings and their respective 3D shapes. In the same vein, the work of Lipson and
Shpitalni [6] is also based on the notion of correlation.
Work in recognition of shape features from 2D input was reported by Nezis and
Vosniakos [10]. The topology of the input drawing was exploited to categorise the
shape features. Peng and Shamsuddin [11] claimed that a neural network was able to estimate the pose of a 3D object from a 2D image taken from an arbitrary viewpoint.
Reconstruction of 3D shapes by estimating their depth was addressed by Yuan and Niemann [12]. They represented objects using a triangular mesh from reverse-engineered data and demonstrated that a neural network could reconstruct 3D
geometry from 2D input.
Early work pertaining to reconstruction of freeform surfaces was covered by Gu
and Yan [13]. A non-uniform rational B-spline (NURBS) surface was fitted over scattered data
from a reverse engineering source using an unsupervised neural network. Hoffman
and Varady [14] and Barhak and Fischer [15] extended this line of research. However,
their methods required that all three dimensions be available for reconstruction
purposes.
The present paper proposes and develops a methodology for 3D freeform surface
inference from freehand planar sketches. The methodology is based on neural
networks. Specifically, an MLP neural network, trained with a momentum-augmented
backpropagation learning algorithm, is employed to induce 3D freeform surfaces from
2D sketches. The reconstruction procedure consists of two steps: first a neural
network is trained on pairs of normalised 3D surfaces and their corresponding
projection curves, then the trained neural network is used to reconstruct unknown 2D
sketches. The methodology was tested with a range of data and produced satisfactory results.
The remainder of this paper is organised as follows. In section 2, 3D freeform surface reconstruction is formulated as an inverse problem. In section 3, neural networks together with their learning algorithms are discussed. The data generation procedure is described in section 4. The computational results are presented in section 5. Finally, section 6 treats conclusions and future work.
2 Problem Formulation
Volumetric concepts originate in the mind of a designer as 3D entities. They are then
transformed, via an isometric projection onto an arbitrary view plane, into planar
sketches. Such a task is considered the direct problem. The 3D freeform surface inference problem consists of extracting the 3D geometry from the 2D sketch, i.e., recovering the depth information that was lost during the projection process. This process can be
regarded as the inverse process of the original projection. The direct problem is, in
general, a well-posed problem and can be solved analytically using concepts from
projective geometry.
In contrast, the inverse problem is, in general, ill-posed. The solution may not be unique, may lack continuity, and may be highly influenced by the amount of noise present in the data. Therefore, 3D surface reconstruction is indeterminate in that an
infinite number of possible 3D surfaces can correspond to the same 2D curve. To
obtain a unique and physically meaningful solution requires additional information in
terms of general assumptions, constraints and clues from experience. In the context of
this paper, the planar curves are constrained to lie in the x-z or the y-z planes and their
control points are restricted to vary only along the z-direction. Such a constraint preserves the planar property of the inferred 3D surfaces and leads to a single one-to-one mapping from the input 2D curves to the expected 3D surfaces. This
renders the inverse problem tractable.
Given a set of p ordered pairs {(x_i, y_i), i = 1,…, p} with x_i ∈ ℝ² and y_i ∈ ℝ³, the surface reconstruction problem is to find a mapping F : ℝ² → ℝ³ such that F(x_i) = y_i, i = 1,…, p. In practice, the function F is unknown and must be determined from the given data {(x_i, y_i), i = 1,…, p}. A neural network solution of this problem is a two-step process: training, where the neural network learns the function from the training data {x_i, y_i}, and generalisation, where the neural network predicts the output for a test input. We demonstrate how an MLP neural network, trained with a momentum-augmented backpropagation algorithm on a collection of 2D-3D dependencies, can approximate the inverse map in a computationally efficient form.
3 Neural Networks
Neural networks are connectionist computational models motivated by the need to
understand how the human brain might function. A neural network consists of a large
number of simple processing elements called neurons. Feedforward neural networks
have established universal approximation capability [16] and have proven to be a potent tool in the solution of approximation, regression, classification and inverse problems. For this reason, an MLP neural network is selected for the solution of the reconstruction problem. The MLP neural network is composed of three layers: the input layer, the hidden layer and the output layer. The neurons of the input layer feed data to the hidden layer, which performs the following nonlinear transformation:
$$s_j = f\Big(\sum_k w_{jk} x_k\Big). \qquad (1)$$
where x_k are the neuron input signals, s_j are the hidden-neuron outputs, w_jk are the synaptic weights and f is an activation function. For the MLP neural network, the sigmoid function is used as the activation function. The output layer neurons perform the transformation:
$$y_j = f\Big(\sum_k w_{jk} s_k\Big). \qquad (2)$$
where y_j are the output layer neuron outputs and w_jk are the synaptic weights. Neural network training can be formulated as a nonlinear unconstrained optimisation problem, so the training process can be realised by minimising the error function E defined by:
$$E = \frac{1}{2} \sum_{k=1}^{p} \sum_{j=1}^{n} (y_{jk} - t_{jk})^2. \qquad (3)$$
where y_jk is the actual output value at the j-th neuron of the output layer for the k-th pattern and t_jk is the corresponding target output value. The training process can be thought of as a search for the optimal set of synaptic weights such that the output error is minimised.
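To make the layer equations concrete, the following Python sketch evaluates Eqs. (1)-(3) for a three-layer MLP with sigmoid activations. It is an illustration under our own naming, not the authors' JOONE implementation; biases are omitted, as in the equations above.

```python
import numpy as np

def sigmoid(a):
    """Sigmoid activation function f."""
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W_hid, W_out):
    """Three-layer MLP forward pass.
    x: input features (80 for a sampled 2D curve);
    W_hid: hidden-layer weights, shape (n_hidden, 80);
    W_out: output-layer weights, shape (120, n_hidden)."""
    s = sigmoid(W_hid @ x)   # Eq. (1): hidden outputs s_j
    y = sigmoid(W_out @ s)   # Eq. (2): network outputs y_j
    return y

def sse(Y, T):
    """Eq. (3): summed squared error over p patterns and n outputs."""
    return 0.5 * np.sum((Y - T) ** 2)
```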
3.1 Backpropagation Algorithm
Most learning algorithms are based on the gradient descent strategy. The
backpropagation algorithm (BP) [17] is no exception. The BP algorithm uses the
steepest descent search direction with a fixed step size α to minimise the error
function. The iterative form of this algorithm can be expressed as:
$$w_{k+1} = w_k - \alpha g_k. \qquad (4)$$
where w denotes the vector of synaptic weights and g = ∇ E(w) is the gradient of the
error function E with respect to the weight vector w.
In the BP learning algorithm the weight changes are proportional to the gradient of the error. The larger the learning rate, the larger the weight changes on each iteration, and the quicker the network learns. However, the size of the learning rate can also influence the network's ability to achieve a stable solution. In a neighbourhood of the error surface where the gradient retains the same sign, a larger value of the learning rate α results in a more rapid reduction of the error function. On the other hand, in an area where the gradient rapidly changes sign, a smaller value of α maintains the descent direction along the error surface.
Despite its computational simplicity and popularity, the BP training algorithm is
plagued by such problems as slow convergence, oscillation, divergence and
“zigzagging” effect. The BP learning algorithm is in essence a gradient descent
optimisation strategy over a multidimensional error surface in the weight space. Such a strategy exhibits inherently slow convergence, especially on large-scale problems.
This trait becomes more pronounced when the condition number of the Hessian
matrix is large. The condition number is the ratio of the largest to the smallest
eigenvalue of the network's Hessian matrix. The Hessian matrix is the matrix of
second order derivatives of the error function with respect to the weights.
In many cases the error hypersurface is no longer isotropic but rather exhibits
substantially different curvatures along different directions, leading to the formation
of long narrow valleys. For most points on the surface, the gradient does not point
towards the minimum, and successive gradient descent steps oscillate from one side to the other. Progress towards the minimum becomes very slow. This suggests a method that dynamically adapts the value of the learning rate α to the topography of the error surface.
3.2 Momentum-Augmented Backpropagation
To circumvent the above problem, the BP update in Eq. (4) is augmented with a momentum term:
$$w_{k+1} = w_k - \alpha g_k + \beta (w_k - w_{k-1}). \qquad (5)$$
The momentum term β has the following effects: 1) it smoothes the oscillations across narrow valleys; 2) it amplifies the learning rate when all the weights change in the same direction; and 3) it enables the algorithm to escape from shallow local minima.
In essence, the momentum strategy implements a variable learning rate implicitly.
It introduces a kind of 'inertia' in the dynamics of the weight vector. Once the weight
vector starts moving in a particular direction in the weight space, it tends to continue
moving along the same direction.
If the weight vector acquires sufficient momentum, it bypasses local minima and
continues moving downhill. This increases the speed along narrow valleys, and
prevents oscillations across them. This effect can also be regarded as a smoothing of
the gradient and becomes more pronounced as the momentum term approaches unity.
However, a conservative choice of the momentum term should be adopted because of the adverse effects that might emerge: in a narrow bend of a valley, the weight movement might jump over the valley walls if too much momentum has been acquired.
The learning algorithm requires the a priori selection of the learning rate and the momentum coefficient. However, it may not be easy to choose judicious values for these parameters because no theoretical basis seems to exist for the selection of optimal values. One possible strategy is to experiment with different values of these parameters to determine their influence on the overall performance. The momentum-augmented backpropagation algorithm may be used in both batch and on-line training modes. In this paper the batch version is used.
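The following NumPy sketch shows one way to realise the batch momentum update of Eq. (5) for the two-weight-layer network sketched earlier. The initial weight range, random seed and sigmoid output layer are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def train_batch_momentum(X, T, n_hidden=50, alpha=0.7, beta=0.6,
                         epochs=5000, seed=0):
    """Batch gradient descent with momentum, Eq. (5):
    w_{k+1} = w_k - alpha * g_k + beta * (w_k - w_{k-1}).
    X: (p, n_in) input patterns; T: (p, n_out) target patterns."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.uniform(-0.5, 0.5, (n_in, n_hidden))
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, n_out))
    dW1_prev, dW2_prev = np.zeros_like(W1), np.zeros_like(W2)
    f = lambda a: 1.0 / (1.0 + np.exp(-a))
    for _ in range(epochs):
        # forward pass over the whole training set (batch mode)
        S = f(X @ W1)                        # hidden outputs, Eq. (1)
        Y = f(S @ W2)                        # network outputs, Eq. (2)
        # backward pass: gradients of the error E of Eq. (3)
        delta_out = (Y - T) * Y * (1 - Y)    # output-layer deltas
        delta_hid = (delta_out @ W2.T) * S * (1 - S)
        g2 = S.T @ delta_out
        g1 = X.T @ delta_hid
        # momentum-augmented weight update, Eq. (5)
        dW2 = -alpha * g2 + beta * dW2_prev
        dW1 = -alpha * g1 + beta * dW1_prev
        W2 += dW2
        W1 += dW1
        dW2_prev, dW1_prev = dW2, dW1
    return W1, W2
```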
4 Data Generation
The neural network used in this paper is trained in a supervised mode via a collection
of input-output pairs to optimise the network parameters (i.e. synaptic weights and
biases). Training is accomplished through a learning algorithm that iteratively adjusts
the network parameters until the mean squared error (MSE) between the predicted
and the desired outputs reaches a suitable minimum.
A training set was generated from a family of freeform surfaces whose edges, also referred to as boundaries, consisted of four orthogonally arranged planar curves.
An example of a planar curve is shown in Fig. 1. Each curve was governed by four
independent control points and represented as a Non-Uniform Rational B-Spline
(NURBS). Two control points determined the ends of the curve whereas the remaining
ones controlled its general shape. NURBS control points need not intersect the curve
and can lie anywhere in the 3D space. The curve was uniformly sampled and the
coordinates of the sample points formed the input features for the neural network.
The planar curves were placed in the x-z plane or the y-z plane and their control
points were only altered along the z-direction to maintain their planar property. Each of the four boundary curves was uniformly sampled at 10 positions. Hence a surface, whether represented in 2D or 3D, consisted of 40 sample points. A point on the 3D
surface is represented by the x, y and z coordinates whereas in 2D, it is represented by
its x and y coordinates. Therefore a 3D surface is represented by 120 independent
features and its respective 2D curve by 80 features.
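As an illustration of this sampling step, the sketch below evaluates one clamped cubic boundary curve (a NURBS with unit weights reduces to a B-spline) at 10 uniform parameter values and projects the 3D samples onto a view plane. The specific projection matrix and control-point values are assumptions; the paper only states that an isometric projection is used.

```python
import numpy as np
from scipy.interpolate import splev

def sample_boundary(ctrl, n_samples=10):
    """Uniformly sample a planar cubic boundary curve defined by 4
    control points (ctrl has shape (4, 3))."""
    knots = np.array([0., 0., 0., 0., 1., 1., 1., 1.])  # clamped cubic
    u = np.linspace(0.0, 1.0, n_samples)
    return np.array(splev(u, (knots, ctrl.T, 3))).T     # (n_samples, 3)

def project_isometric(pts3d):
    """Project 3D samples onto the view plane.  This particular
    isometric projection matrix is an assumption."""
    c30, s30 = np.cos(np.pi / 6), np.sin(np.pi / 6)
    x, y, z = pts3d.T
    return np.column_stack([(x - y) * c30, (x + y) * s30 - z])

# One boundary curve in the x-z plane; only z-coordinates are varied.
ctrl = np.array([[0., 0., 0.], [1/3, 0., .4], [2/3, 0., .7], [1., 0., 0.]])
pts3d = sample_boundary(ctrl)      # 10 points -> 30 of the 120 3D features
pts2d = project_isometric(pts3d)   # 10 points -> 20 of the 80 2D features
```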
Fig. 1. Planar 3D NURBS curve. Each control point of the curve lies on the same plane as the
others.
The positions of the control points were varied to produce a class of unique
freeform surfaces. Each surface was projected onto the view plane to produce the
respective 2D planar projection. The training set is composed of pattern pairs, each
containing a 3D surface and its corresponding 2D curve.
The data set was normalised so that each 3D pattern would fit within a unit cube and its respective 2D pattern within the unit square. Normalisation ensures that the values lie within the characteristic bounds of the activation functions.
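A minimal sketch of such a normalisation follows; it uses per-axis min-max scaling, and whether the authors instead applied a single uniform scale factor is not specified in the paper.

```python
import numpy as np

def normalise_unit_box(points):
    """Translate and scale sample points into the unit square (2D
    input) or unit cube (3D output).  points: shape (n, 2) or (n, 3)."""
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0.0] = 1.0        # guard degenerate (constant) axes
    return (points - lo) / span
```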
Fig. 2 shows two examples of normalised pattern pairs that were used to train the neural network. The 2D input patterns are depicted in Fig. 2 (a), and their corresponding 3D output patterns are shown in Fig. 2 (b).
Fig. 2. Examples of 2D input patterns (a) and corresponding 3D output patterns (b)
It can be seen that the boundary of each surface is described by a series of sample points and fits within the unit square in 2D and the unit cube in 3D. Notice that the viewpoint of the desired 3D pattern coincides with the viewpoint of the 2D input pattern.
The entire data set was composed of 4096 patterns. The whole set cannot be used to train the network because no data would be left to test the network's ability to generalise to fresh inputs. Therefore the data set was randomly split into three subsets used for training, validation and testing. Accordingly, the numbers of training, validation and testing pattern pairs were 2867, 819 and 410 respectively, corresponding to a 70, 20 and 10 percent split of the data.
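The split can be reproduced along the following lines (the random seed is, of course, an assumption):

```python
import numpy as np

def split_dataset(X, Y, fractions=(0.7, 0.2, 0.1), seed=42):
    """Randomly split pattern pairs into training, validation and
    test subsets; with p = 4096 this yields 2867, 819 and 410 pairs."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_tr = round(fractions[0] * len(X))    # 2867 for p = 4096
    n_va = round(fractions[1] * len(X))    # 819
    tr, va, te = np.split(idx, [n_tr, n_tr + n_va])
    return (X[tr], Y[tr]), (X[va], Y[va]), (X[te], Y[te])
```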
5 Computational Results
A three-layer MLP network was employed in our research. The input and output layer dimensions of the neural network were determined from the features of the training set. The input layer consists of 80 nodes while the output layer consists of 120 nodes. The number of nodes in the hidden layer is freely adjustable and results in different network performance depending on the number of hidden nodes used. The parameters used in the network are shown in Table 1.
Table 1. Network Architecture and Parameters

Number of Input Nodes: 80
Number of Output Nodes: 120
Learning Rate (α): 0.7
Momentum (β): 0.6
Number of Epochs: 5000
Number of Training Patterns: 2867
Learning Mode: Batch
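For reference, a roughly equivalent network can be configured from Table 1 with scikit-learn's MLPRegressor. This is an assumed modern stand-in, not the authors' setup (they used JOONE), and MLPRegressor always applies a linear output layer, so it only approximates the architecture above.

```python
from sklearn.neural_network import MLPRegressor

net = MLPRegressor(
    hidden_layer_sizes=(50,),  # best hidden-layer size found below
    activation='logistic',     # sigmoid hidden units
    solver='sgd',              # gradient descent with momentum
    learning_rate_init=0.7,    # learning rate (alpha)
    momentum=0.6,              # momentum (beta)
    batch_size=2867,           # full-batch learning mode
    max_iter=5000,             # number of epochs
)
# net.fit(X_train, Y_train) would then train on the 80-feature 2D
# inputs against the 120-feature 3D targets.
```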
The number of hidden nodes indicates the network complexity and governs how
accurately it learns the mapping from the input patterns to the outputs. It also affects
how long the network takes to perform each training cycle. The higher the number of
hidden nodes, the more computation is required and hence a longer training time is
needed. Experimentation with different numbers of nodes in the hidden layer was
conducted. Multiple neural networks were trained with similar parameters such as the
learning rate, momentum and training sets. In this case the learning rate was 0.7 and
the momentum was 0.6. Only the number of hidden nodes was changed. It was found that a neural network with 50 hidden nodes produced the lowest reconstruction error over a fixed number of epochs. This was determined by comparing the average reconstruction error of the networks on a fresh test set containing 410 patterns.
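A hedged sketch of this model-selection loop follows; the candidate hidden-layer sizes and the placeholder data are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)                  # placeholder data only
X_tr, Y_tr = rng.random((2867, 80)), rng.random((2867, 120))
X_te, Y_te = rng.random((410, 80)), rng.random((410, 120))

best_size, best_err = None, np.inf
for n_hidden in (10, 20, 30, 40, 50, 60, 70):   # assumed candidates
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                       activation='logistic', solver='sgd',
                       learning_rate_init=0.7, momentum=0.6,
                       batch_size=len(X_tr), max_iter=5000)
    net.fit(X_tr, Y_tr)
    err = np.mean((net.predict(X_te) - Y_te) ** 2)  # test error
    if err < best_err:
        best_size, best_err = n_hidden, err
```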
Finally, a new network with 50 hidden units was trained again for 5000 epochs. The final training error was 0.06. At the end of training, the network was saved and the test set was applied to it. The obtained results show that the neural network was able to infer the 3D shape of a freeform surface from its respective 2D input pattern.
Fig. 3. Test Input Patterns (a) with Predicted and Desired Outputs (b)
An example test pattern that was applied to the trained network is shown in Fig. 3 (a). The predicted and the expected 3D patterns that correspond to the 2D input are shown in Fig. 3 (b). The predicted pattern is depicted in green whereas the desired pattern is in blue. It can be seen from the plots in Fig. 3 (b) that the two surfaces in each image are almost identical and hence that the neural network has inferred the desired shape. However, small deviations in the predicted patterns can be seen when observed closely; these reflect the limits of the network's ability to predict the desired surfaces. The distribution of errors is presented in Fig. 4, which shows the Euclidean distance between each point on the predicted surface and its corresponding point on the desired surface. The RMS error for this pattern was 0.33%.
Fig. 4. Distribution of Squared Errors Between Predicted Output and Expected Output
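The error measures reported here can be computed as follows. This is a sketch: the exact definition behind the 0.33% figure is not fully specified in the paper, so the percentage below is one plausible reading relative to the unit-cube scale.

```python
import numpy as np

def point_errors(pred, desired):
    """Euclidean distance between each predicted surface sample
    point and its corresponding desired point (both shape (40, 3))."""
    return np.linalg.norm(pred - desired, axis=1)

def rms_percent(pred, desired):
    """RMS coordinate error as a percentage of the unit-cube scale."""
    return 100.0 * np.sqrt(np.mean((pred - desired) ** 2))
```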
6 Conclusions and Future Work
In this paper a methodology for the inference of 3D freeform surfaces from 2D surface representations using neural networks has been proposed. A representative dataset was generated by iteratively adjusting the control points of freeform surface boundary curves, which were then uniformly sampled. The dataset was normalised and randomly split into three subsets: training, validation and test sets. An MLP was optimised using different numbers of hidden nodes. The best network, i.e. the network with the lowest training RMSE, was trained with a representative family of 2D and 3D pattern pairs. The neural network was then applied to a set of 2D patterns that had not been
encountered before. The obtained 3D results demonstrate that the target freeform surfaces can be reproduced from 2D input patterns to within 2% accuracy. Future work will extend the methodology to more complex shapes and reconstruct the 3D surface that corresponds to the inferred surface boundary.
Acknowledgements
This research is supported by the EPSRC, research grant [EPSRC GR/S01701/01].
The methodology was implemented in Java 1.5 using Netbeans 5.0 based on a
Windows 2000/XP platform. The neural network was developed using the Java Object Oriented Neural Engine (JOONE) and the 3D graphics were drawn using Jun for Java. The
authors would also like to thank Yosh Nishinaka from Software Research Associates
Inc. and Paolo Marrone from JOONE for their assistance.
References
1. Lim, S., Lee, B., Duffy, A.: Incremental modelling of ambiguous geometric ideas (IMAGI): representation and maintenance of vague geometry. Artificial Intelligence in
Engineering 15 (2001) 93-108
2. Karpenko, O., Hughes, J., Raskar, R.: Free-form Sketching with variational implicit
surfaces. Eurographics 21 (2002) 585-594
3. Igarashi, T., Matsuoka, S., Tanaka, H.: Teddy: A Sketching Interface for 3D Freeform
Design. 26th International Conference on Computer Graphics and Interactive Techniques
(1999) 409-416
4. Alexe, A., Gaildrat, V., Barthe, L.: Interactive Modelling from Sketches using Spherical Implicit Functions. Proceedings of the 3rd International Conference on Computer Graphics, Virtual Reality, Visualisation and Interaction in Africa (2004) 25-34
5. Clowes, M.: On Seeing Things. Artificial Intelligence 2 (1971) 79-116
6. Lipson, H., Shpitalni, M.: Correlation-Based Reconstruction of a 3D Object from a Single
Freehand Sketch. AAAI Spring Symposium on Sketch Understanding, Palo Alto, USA
(2002) 99-104
7. Varley, P., Martin, R.: Estimating Depth from Line Drawings. Symposium on Solid
Modelling. ACM Press, Saarbrücken, Germany (2002) 180-191
8. Shpitalni, M., Lipson, H.: 3D conceptual design of sheet metal products by sketching.
Journal of Materials Processing Technology 103 (2000) 128-134
9. Michalik, P., Kim, D.H., Bruderlin, B.D.: Sketch- and constraint-based design of B-spline
surfaces. Proceedings of the seventh ACM symposium on Solid modelling and
applications. ACM Press, Saarbrücken, Germany (2002) 297-304
10. Nezis, K., Vosniakos, G.: Recognizing 2 1/2D shape features using a neural network and
heuristics. Computer-Aided Design 29 (1997) 523-539
11. Peng, L.W., Shamsuddin, S.M.: 3D Object Reconstruction and Representation Using
Neural Networks. Computer graphics and interactive techniques in Australia and South
East Asia, Singapore (2004) 139-147
12. Yuan, C., Niemann, H.: Neural Networks for appearance-based 3-D object recognition.
Neurocomputing 51 (2003) 249-264
13. Gu, P., Yan, X.: Neural network approach to the reconstruction of freeform surfaces for
reverse engineering. Computer Aided Design 27 (1995)
14. Hoffman, M., Varady, L.: Free-form Surfaces for Scattered Data by Neural Networks.
Journal for Geometry and Graphics 2 (1998) 1-6
15. Barhak, J., Fischer, A.: Adaptive reconstruction of freeform objects with 3D SOM neural
network grids. Computer and Graphics 26 (2002) 745-751
16. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal
approximators. Neural Networks 2 (1989) 359-366
17. Rumelhart, D.E., McClelland, J.L.: Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1. MIT Press, Massachusetts (1986)