Optimizing and Interfacing With Cython: Centre de Biophysique Moléculaire (Orléans) Synchrotron Soleil (ST Aubin)
Konrad HINSEN
Centre de Biophysique Moléculaire (Orléans)
and
Synchrotron Soleil (St Aubin)
Extension modules
▹ Python permits modules to be written in C. Such modules are called extension modules.
▹ Extension modules can define extension types, which are very similar to classes, but more efficient.
▹ Extension modules are usually compiled to produce shared libraries (Unix) or dynamic-link libraries (DLL, Windows). This places certain restrictions on the C code in the module.
▹ To client code, extension modules look just like Python modules.
▹ Many modules in the standard library are in fact extension modules.
▹ Many scientific packages (including NumPy) consist of a mix of Python modules and extension modules.
▹ Writing extension modules in plain C is both difficult and a lot of work. Nowadays most programmers use interface generators (SWIG, f2py, ...) or a special extension-module language: Cython.
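To illustrate that extension modules look like ordinary modules to client code, here is a small sketch that classifies an imported module by how it was loaded. The helper name `module_kind` is hypothetical, and which standard-library modules are compiled varies between platforms and Python builds.

```python
import importlib.machinery
import json
import math


def module_kind(module):
    """Classify a module as 'built-in', 'extension' or 'python'.

    Hypothetical helper, for illustration only.
    """
    filename = getattr(module, "__file__", None)
    if filename is None:
        # Compiled directly into the interpreter executable
        return "built-in"
    if filename.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        # Loaded from a shared library (Unix) or DLL (Windows)
        return "extension"
    return "python"


# Both modules are imported and used in exactly the same way
for mod in (math, json):
    print(mod.__name__, "->", module_kind(mod))
```

Client code only finds out by inspecting the module, as done here: `import` and attribute access are identical in both cases.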
Cython
▹ Compiler that compiles a Python module to a C extension module
  ➥ 5% acceleration at best!
▹ Language extensions for writing C in Python syntax
  ➥ hybrid Python/C programming

Applications:
▹ optimizing a Python function by translating it to C incrementally
▹ writing extension modules more conveniently
▹ writing interfaces to C libraries
Cython vs. Pyrex
Pyrex: the original compiler, developed by Greg Ewing as a research
project.
Cython: a fork of the Pyrex source code, made by the Sage development
team because they needed to add features at a faster pace than Greg was
willing to handle.
Present state:
● Pyrex is a slowly evolving, small and stable compiler written in Python.
● Cython is a much larger and more rapidly evolving compiler that includes compiled modules itself.
● The two projects exchange ideas and source code.
● Cython has some features (optimization, array support) that make it the better choice for numerical code.
Example: Python
    def exp(x, terms = 50):
        sum = 0.
        power = 1.
        fact = 1.
        for i in range(terms):
            sum += power/fact
            power *= x
            fact *= i+1
        return sum

Note: This is not the best algorithm for calculating an exponential function!
Example: Cython
    def exp(double x, int terms = 50):   # automatic conversion Python->C
        cdef double sum                   # declaration of C variables
        cdef double power
        cdef double fact
        cdef int i
        sum = 0.
        power = 1.
        fact = 1.
        for i in range(terms):            # conversion to integer loop, loop in C
            sum += power/fact
            power *= x
            fact *= i+1
        return sum                        # automatic conversion C->Python
Performance
50 000 exponential calculations on my laptop:
    Python:    1.05 s
    Cython:    0.042 s
    math.exp:  0.013 s
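A measurement of this kind can be reproduced for the pure-Python and `math.exp` cases with a sketch like the one below. The function is renamed `exp_series` here to avoid confusion with `math.exp`, and the argument 1.0 is an arbitrary choice, since the value used for the timings above is not stated. Absolute times depend on the machine and the Python version.

```python
import math
import timeit


def exp_series(x, terms=50):
    """Pure-Python power series, same algorithm as the Python example."""
    total = 0.0
    power = 1.0
    fact = 1.0
    for i in range(terms):
        total += power / fact
        power *= x
        fact *= i + 1
    return total


n = 50_000  # number of calls, as in the measurement
t_series = timeit.timeit(lambda: exp_series(1.0), number=n)
t_math = timeit.timeit(lambda: math.exp(1.0), number=n)
print(f"pure Python: {t_series:.3f} s   math.exp: {t_math:.3f} s")
```

Timing the Cython version works the same way once the compiled module is importable.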
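    def exp(double x, int terms = 50):
        cdef double sum
        cdef double power
        cdef double fact
        cdef int i
        with nogil:    # release the global interpreter lock; only C operations inside
            sum = 0.
            power = 1.
            fact = 1.
            for i in range(terms):
                sum += power/fact
                power *= x
                fact *= i+1
        return sum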
NumPy arrays in Cython
    cimport numpy
    import numpy

    def array_sum(numpy.ndarray[double, ndim=1] a):   # verification of Python data type
        cdef double sum                                # variable declarations in C
        cdef int i
        sum = 0.
        for i in range(a.shape[0]):                    # loop in C
            sum += a[i]
        return sum                                     # automatic conversion C->Python
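The argument check performed on entry to `array_sum` can be approximated in plain Python. The sketch below is only an analogy, not what Cython generates: it uses the standard-library `array` module and `memoryview` instead of NumPy, and the error message is made up for illustration.

```python
from array import array


def array_sum(a):
    """Sum a one-dimensional buffer of C doubles, checking its type first."""
    view = memoryview(a)
    # Mimic the argument check: element type 'd' (C double), one dimension
    if view.format != "d" or view.ndim != 1:
        raise ValueError("expected a one-dimensional buffer of doubles")
    total = 0.0
    for i in range(view.shape[0]):
        total += view[i]
    return total


print(array_sum(array("d", [1.0, 2.0, 3.5])))
```

In the Cython version the check happens once per call, after which the loop reads the array data directly in C. Here every element access still goes through the interpreter.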
Compiling with NumPy
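    from distutils.core import setup, Extension
    from Cython.Distutils import build_ext
    import numpy.distutils.misc_util

    # Directories containing the NumPy C header files
    include_dirs = numpy.distutils.misc_util.get_numpy_include_dirs()

    setup(version = "0.1",
          ext_modules = [Extension('array_sum',
                                   ['array_sum.pyx'],
                                   include_dirs=include_dirs)],
          cmdclass = {'build_ext': build_ext},   # compile .pyx files with Cython
          )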
Interfacing to C code
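GSL definitions:
    cdef extern from "gsl/gsl_sf_bessel.h":
        ...
    ...
        int GSL_SUCCESS
        int GSL_EUNDRFLW
        gsl_error_handler_t* gsl_set_error_handler_off()

    # Turn off GSL's default error handler, which aborts the program,
    # so that errors are reported through the returned status code
    gsl_set_error_handler_off()

Status check in the wrapper:
        ...
        or status == GSL_EUNDRFLW:
        ...
        raise ValueError(gsl_strerror(status))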
Extension types
Python class:
    class Counter:
        def __init__(self, value=0):
            self.value = value
        def increment(self):
            self.value += 1
        def getValue(self):
            return self.value

Extension type:
    cdef class Counter:
        cdef int value
        def __init__(self, int value=0):
            self.value = value
        def increment(self):
            self.value += 1
        def getValue(self):
            return self.value
Main differences:
- the extension type stores its internal state in C variables
- the extension type can have C methods defined with cdef
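From the client's point of view the two variants are used in the same way. The sketch below runs the pure-Python class. With the extension type the method calls would be identical, but the attribute access on the last line would fail, because attributes declared with `cdef` are not visible from Python unless they are declared `public` or `readonly`.

```python
class Counter:
    """Pure-Python counter, following the Python class above."""

    def __init__(self, value=0):
        self.value = value

    def increment(self):
        self.value += 1

    def getValue(self):
        return self.value


c = Counter()
c.increment()
c.increment()
print(c.getValue())
# The Python class exposes its state as an ordinary attribute
print(c.value)
```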
C methods
    cdef class Counter:
        def __init__(self, int value=0):
            self.value = value
        ...

    cdef class CounterBy10(Counter):
        ...
            self.value += 10
0) Take the NumPy version of the simulator as the starting point.
1) Move the function calc_forces into a new module and import it into the script.
2) Port the algorithm of the non-NumPy version (with the two explicit loops) into this function, which is the one to optimize.
3) Move from Python to Cython by adding the C type declarations.

0) Take the NumPy version of the simulator as the starting point.
1) Create a new module containing a single function "laplacien" and modify the original method to call it.
2) Port the algorithm of the non-NumPy version (with the explicit loops) into this function, which is the one to optimize.
3) Move from Python to Cython by adding the C type declarations.

Provide a Python interface to erf and erfc. Write a Python script that
verifies that erf(x)+erfc(x)=1 for several x.
Note: if you want to use the symbols erf and erfc for your Python
functions, you will have to rename the C functions. This is done as
follows:
    cdef extern from "mconf.h":
        double c_erf "erf" (double x)
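The verification script could look roughly like the sketch below. It uses `math.erf` and `math.erfc` from the standard library as stand-ins for the functions you are asked to wrap. The helper `max_deviation`, the sample points and the tolerance are arbitrary choices made for illustration.

```python
import math

# Stand-ins for the functions the exercise asks you to wrap with Cython;
# replace these two lines with an import from your own extension module
erf = math.erf
erfc = math.erfc


def max_deviation(xs):
    """Largest |erf(x) + erfc(x) - 1| over the given points."""
    return max(abs(erf(x) + erfc(x) - 1.0) for x in xs)


xs = [-3.0, -1.0, -0.1, 0.0, 0.5, 2.0, 10.0]
deviation = max_deviation(xs)
print("largest deviation from 1:", deviation)
assert deviation < 1e-12
```

The identity can only hold up to rounding error, which is why the script compares against a small tolerance instead of testing for exact equality.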