Python Bindings For The Open Source Electromagnetic Simulator Meep
Python Bindings For The Open Source Electromagnetic Simulator Meep
Python Bindings For The Open Source Electromagnetic Simulator Meep
MEEP.
Emmanuel Lambert1, Martin Fiers1, Shavkat Nizamov2, Martijn Tassaert1, Steven G. Johnson3,
Peter Bienstman1, Wim Bogaerts1
1
Ghent University - IMEC, Department of Information Technology (INTEC), Photonics Research Group, Sint-Pietersnieuwstraat
41, B-9000 Gent, Belgium
2
Samarkand State University, Physics Faculty, 140104, University blvd. 15, Samarkand, Uzbekistan
3
Massachusetts Institute of Technology, Department of Mathematics, Center for Material Science & Engineering, Research
Laboratory of Electronics, Cambridge MA 02139, USA
Part of this work was funded by the European Union in the framework of the FP7 integrated
project HELIOS. Martin Fiers and Martijn Tassaert acknowledge the UGent Special Research
Fund for a doctoral grant. Shavkat Nizamov acknowledges the RFBR grant 09-02-90205. Wim
Bogaerts acknowledges the Flemish Fund for Scientific Research (FWO) for a postdoctoral
grant.
Article abstract:
Meep is a broadly used and acknowledged open-source package for FDTD electromagnetic
simulations. We describe how Python bindings for Meep leverage the tool. We outline new
perspectives for integration of Meep with other libraries in the Python ecosystem. We show that
Python bindings allow using Meep more effectively in a large scale parallel computing
architecture. We describe the technical implementation of Python-Meep with SWIG and
describe different architectures for interfacing data with the Meep core engine. Some
applications of Python-Meep in our photonics and plasmonics research are briefly touched. We
illustrate generic benefits to the wider research and open-source community.
Introduction
We see several generic benefits that Python bindings bring to the wider
community of Meep users. Firstly, they enable the integration of Meep with
existing Python open source libraries for scientific computing. The most
acknowledged are Numpy and SciPy [8]. Numpy is an extension to the Python
language which adds support for large, multi-dimensional matrix operations
and related mathematical functions [9]. SciPy is a higher level library with
mathematical tools and algorithms. Suppose for example that we want to
explore a certain parameter space for the optimal configuration of a photonic
waveguide (i.e. we want to simulate the electromagnetic behaviour of this
waveguide with Meep for various parameter values). Optimization algorithms
such as simulated annealing (provided by SciPy) or genetic algorithms
(provided by PyGene), can now be used to explore this parameter space on a
supercomputer and optimize against a certain target function. Numerical
algorithms offered by Numpy can be used for processing of simulation results.
Combining these libraries with Meep is a promising option for many researchers
already familiar with them.
Suppose for example that we have a computer cluster with 1600 cores at our
disposal and that we want to scan a parameter space with 150 combinations of
parameters. Let's assume that each simulation can be efficiently scaled over 16
cores with MPI. Combining MPI and IPython, we can run 100 Python-Meep
simulations simultaneously, with each simulation consuming 16 cores. If each
simulation takes 30 minutes to complete, then we can execute the full
parameter space in just one hour (30 minutes for 100 simultaneous simulations
on 16 cores per simulation, followed by another 30 minutes for the subsequent
50 simultaneous simulations).
Both dimensions are independent of one another and have different scaling
properties. The scaling behaviour of Python-Meep over the first dimension (the
number of cores for MPI-run) is similar to the standard Meep: the Python layer
does not interfere with the MPI-specific commands in the Meep core. Figure 2b
shows the scaling of a benchmark 3-dimensional simulation with MPI. The total
calculation time is shown for different resolutions (i.e. sizes of the
computational volume). This is compared with the scaling that we ideally
expect: i.e. when we double the number of nodes, we expect the calculation
time to halve. For a given resolution, there is an upper limit to the number of
cores over which we can scale efficiently. For a 3-dimensional simulation, the
communication and synchronization overhead increases with the 4th power of
the number of computing cores. At some point, the added benefit of extra
calculation power is smaller than the additional overhead that is created: in
such a case, the total calculation times even increases. In figure 2b, we can see
that scaling performance is better for more complex, high resolution problems.
Figure 2b illustrates the scaling of a 3D Python-Meep simulation with MPI. The actual
calculation times are show for different resolutions and compared with the calculation times
that we ideally expect.
For the second dimensions (the IPython engines), there is no inherent scaling
limit as the different IPython engines are essentially separated programs
running in parallel, with no intercommunication.
Figure 2c below shows a graphical user interface that was built with PyQt [14]
on top of this IPython based framework: we can conveniently launch new
Python-Meep simulations and inspect results of simulations that have
terminated.
Figure 2c illustrates the graphical user interface of the photonic simulation framework of UGent.
It shows the parameters used in a range of Python-Meep simulations with the corresponding
result for each simulation, i.e. the transmission calculated from the fluxes. It offers the
possibility to inspect results and subsequently launch new simulations (with different
parameters) to a computing cluster. This high level of automation aids in the rapid design of
new components.
A taste of Python-Meep
Figures 3a/3b : a basic Python-Meep simulation script (a) and its equivalent in Scheme (b).
Note that the coordinate system is different in both versions.
Technical implementation of the Python bindings
The Meep core library (written in C++) provides a mechanism of callbacks for
integration with the simulation script: whenever the runtime engine needs
information about specific properties of the simulation, a function defined by
the user is called. This mechanism is used intensively, for example in the
definition of the material properties of the simulation volume or in the definition
of a custom electromagnetic source.
The Python-Meep bindings were developed using SWIG, an open source tool
that allows connecting programs written in C/C++ with a variety of high-level
programming languages [18]. The flexibility of SWIG allows for an elegant
integration with this callback mechanism. Based on our experiences with
performance and ease of use for the end user, the actual implementation
technique evolved in three phases (described below and illustrated in figure 4).
Figure 4 illustrates the alternative architectures that were implemented for definition of the
material properties in the simulation volume. First architecture: using a pure Python class for
callback (a). In this case, the C++/Python boundary is crossed whenever callback occurs
(potentially millions of times for material definition). Second architecture: using inline C/C++
for large simulation volumes with many grid points (b): the callback occurs completely in the
C/C++ domain (great performance). Third architecture: the user works in Python only,
creating a Numpy matrix with the material definition (c). Meep can directly access this matrix
using a pointer, while the user works in pure Python (also with great performance but with
increased memory consumption).
In a first straightforward implementation, Python-Meep provides an abstract
Callback class, from which the user inherits in pure Python. In that class, the
user implements the required functionality, such as definition of the material
properties (see figure 3). For many complex simulations however (i.e. with
high resolution), the performance of this pure Python callback was not
sufficient : the callback function for definition of the materials is typically called
a million times or more. The overhead of swapping from C++ to Python,
subsequently running a piece of interpreted Python code and returning the
results back to C++ is small, but it becomes problematic when the callback is
executed hundreds of thousands or millions of times.
Initially, this drawback was solved by allowing users to define a callback
function in C or C++, with the rest of the simulation script in Python. In this
scheme, the users C++ code is compiled at runtime and dynamically linked
with the Python-Meep bindings: the callback is then done completely inside the
C++ domain. This solution provides the required performance. The Python
package weave allows for very elegant inclusion of inline C/C++. It largely
abstracts the overhead for the user of mixing Python with C/C++.
Nevertheless, combining 2 languages remains a drawback for certain end
users, many of whom are not familiar with C/C++.
In the original Scheme interface, the performance issue with this repeated
callback occurs less often: in this implementation, the standard callback
mechanism is largely bypassed by the authors of Meep. A tighter integration of
the C++ core and definitions in Scheme is realized.
We subseqently worked towards a similar solution that would allow a pure
Python definition of even complex high-resolution simulations. The
breakthrough came by combining SWIG with Numpy matrices. Numpy is known
for its great performance, thanks to the fact that Numpy stores and processes
its data in C and exposes only a thin interface to Python. Therefore, if we
define a Numpy matrix in Python with the material properties of our simulation
volume, that matrix is directly accessible from Meep using C coding
conventions (basically a pointer). The integration then comes down to writing a
wrapper around the Meep callback functionality. This wrapper retrieves the
actual values from the Numpy matrix and returns them to Meep. Figure 4
further illustrates this architecture in contrast with the other two. Code-wise,
we provide a user-friendly class CallbackMatrix from which the user inherits. In
the class, he creates a Numpy matrix with size corresponding to the discretized
simulation volume (or a multiple for better accuracy). This architecture offers
excellent performance, while allowing the user to work in pure Python. A
drawback is the increased memory consumption, as we have to store the
Numpy matrix before it is interfaced to Meep. Figure 5 illustrates the technique
for the straight waveguide example of figure 3.
Figure 5 : use of the technique with Numpy matrix for describing the straight waveguide of
figure 3. The user inherits from CallbackMatrix2D and assigns the Numpy matrix to an
attribute.
As we see in the last line of the code snippet of figure 5, the Python-Meep
function set_matrix_2D is used for interfacing the Numpy matrix with the
underlying C++ code. In the C++ code of the Python-Meep wrapper, the
function signature is :
void set_matrix_3D(double* matrix, int dimX, int dimY, int dimZ, ...)
The first parameter is of type double* and is a pointer to the actual values in
the Numpy matrix. The following two or three int parameters indicate the
matrix dimensions. In Python the matrix is of type numpy.ndarray.
We want to seamlessly pass the Numpy matrix as parameter to the functions
set_matrix_2D and set_matrix_3D. It is therefore required to define some kind
of translation between the Python type numpy.ndarray and an equivalent tuple
of parameters double* and int in C++. In SWIG, the technique for such a
translation is called a typemap. Normally, the definition of typemaps is a
complicated and tedious task. Luckily, a range of typemaps for Numpy are
already available in the open source community ("numpy.i" [19]). They are
called IN_ARRAY2 and IN_ARRAY3 for respectively 2- and 3-dimensional Numpy
arrays.
In our SWIG definition file, we have to link up the signature of the
set_matrix_2D function with the typemap. This is done using the code below.
When we pass a Numpy array to the function in Python, it is automatically
expanded in the three or four corresponding parameters of the C++ function.
//Include the Numy header file, so that Numpy types are known
%{
#define SWIG_FILE_WITH_INIT
#include <numpy/npy_common.h>
%}
%init %{
import_array();
%}
All three of the above techniques for defining material matrices are available to
users of Python-Meep. The approach with the Numpy matrix is the preferred
one for simulations of moderate size. For very large simulation volumes, using
a C/C++ callback function may currently be more appropriate, as it has lower
memory requirements. In future versions, we are planning to explore PyTables
[20] as an approach for processing very large matrices: PyTables combines
HDF5 and Numpy and allows storing huge matrices on disk, thus limiting the
memory consumption.
Figure 6 illustrates the field profile without spatial shaping of the source (a), versus a field
profile when the source is shaped according to an amplitude matrix calculated by Fimmwave
and imported by Python-Meep (b). A field profile that is useful for a realistic design should have
a constant spatial distribution of the power intensity over time for a given cross-section: in (a),
we see that there are major changes over time in the spatial distribution of the power intensity
for the chosen cross-section. In constrast, the profile in (b) shows a constant spatial
distribution of the power intensity over the full length of the waveguide.
Open source
The Python-Meep bindings are distributed by its authors under the terms of the
GNU General Public License (v2). The source code is publicly available on
Launchpad [23] and the community is invited to further contribute to the
project's development.
Conclusion
We conclude that the recently released Python bindings for Meep bring
interesting benefits for the wider research and open source community. First of
all, Python is a convenient alternative for those researchers who want to use
Meep but are not familiar with the Scheme programming language. The Python
bindings enable the integration of Meep with other software libraries in the
Python ecosystem (such as libraries for visualization and libraries with
numerical and scientific algorithms). We can also leverage the parallel
computing capabilities of Meep by combining MPI with the IPython framework.
We discussed the technical implementation of the Python-Meep bindings with
SWIG and three different architectures for interfacing data with the Meep core
engine. We have illustrated how we use Python-Meep in our silicon photonics
and plasmonics research. Some options for improvement in future versions
were discussed. We have released the Python-Meep bindings as open source:
in this way, the community of users can contribute to its further development.
References :
[1] - Allen Taflove and Susan C. Hagness (2005). Computational Electrodynamics: The Finite-
Difference Time-Domain Method, 3rd ed. Artech House Publishers. ISBN 1-58053-832-0.
http://www.artechhouse.com/Detail.aspx?strBookId=1123.
[2] - Ardavan F. Oskooi, David Roundy, Mihai Ibanescu, Peter Bermel, J. D. Joannopoulos, and
Steven G. Johnson, "MEEP: A flexible free-software package for electromagnetic simulations by
the FDTD method," Computer Physics Communications, vol. 181, pp. 687-702 (2010).
[3] - Gerald Jay Sussman and Guy Lewis Steele, Jr. (December 1975), "Scheme: An
Interpreter for Extended Lambda Calculus" (postscript or PDF), AI Memos (MIT AI Lab) AIM-
349
[4] - IEEE Standard for the Scheme Programming Language, IEEE part number STDPD14209.
[9] - Travis E. Oliphant, Python for Scientific Computing, Comput. Sci. Eng. 9, 10 (2007).
From the same author, the book Guide to Numpy (December 7, 2006) was released in the
public domain. It can be downloaded at http://www.tramy.us/numpybook.pdf
[11] Mayavi2 is a Python library for 3D Scientific Data Visualization and Plotting.
http://code.enthought.com/projects/mayavi/
[12] - Gropp, William; Lusk, Ewing; Skjellum, Anthony (1994). Using MPI: portable parallel
programming with the message-passing interface. MIT Press In Scientific And Engineering
Computation Series, Cambridge, MA, USA. 307 pp. ISBN 0-262-57104-8
[13] - Fernando Perez, Brian E. Granger, "IPython: A System for Interactive Scientific
Computing," Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007,
doi:10.1109/MCSE.2007.53.
[14] - PyQt are Python bindings for Nokia's Qt application framework. It runs Windows,
MacOS/X, Linux. http://www.riverbankcomputing.co.uk/software/pyqt/intro
[15] HDF5 is a set of file formats and libraries designed to store and organize large amounts
of numerical data. Originally developed at the National Center for Supercomputing Applications,
and currently supported by HDF Group. http://www.hdfgroup.org
[16] - John Hughes. Why Functional Programming Matters, in D. Turner, editor, Research
Topics in Functional Programming. Addison Wesley, 1990.
[17] - M. P. Atkinson,Peter Buneman,Ronald Morrison, Data types and persistence, par 4.2.1
[18] - David M. Beazley - "Using SWIG to Control, Prototype, and Debug C Programs with
Python". 4th International Python Conference, Livermore, California, June, 1996.
[19] Bill Spotz, Sandia National Laboratories, numpy.i: a SWIG Interface File for NumPy,
Decmber 2007, document available in the Numpy distribution.
[20] PyTables is a package for managing hierarchical datasets designed to efficiently cope
with extremely large amounts of data. http://www.pytables.org
[21] - Boost.Python, a C++ library which enables seamless interoperability between C++ and
Python. http://www.boost.org/doc/libs/1_43_0/libs/python/doc/index.html