| =============== |
| C-API for NumPy |
| =============== |
| |
| :Author: Travis Oliphant |
| :Discussions to: `numpy-discussion@scipy.org`__ |
| :Created: October 2005 |
| |
| __ http://www.scipy.org/Mailing_Lists |
| |
| The C API of NumPy is (mostly) backward compatible with Numeric. |
| |
| There are a few non-standard Numeric usages (that were not really part |
| of the API) that will need to be changed: |
| |
| * If you used any of the function pointers in the ``PyArray_Descr`` |
| structure you will have to modify your usage of those. First, |
| the pointers are all under the member named ``f``. So ``descr->cast`` |
| is now ``descr->f->cast``. In addition, the |
| casting functions have eliminated the strides argument (use |
| ``PyArray_CastTo`` if you need strided casting). All functions have |
| one or two ``PyArrayObject *`` arguments at the end. This allows the |
| flexible arrays and mis-behaved arrays to be handled. |
| |
| * The ``descr->zero`` and ``descr->one`` constants have been replaced with |
| the function calls ``PyArray_Zero`` and ``PyArray_One`` (be sure to read the |
| code and free the resulting memory if you use these calls). |
| |
| * If you passed ``array->dimensions`` and ``array->strides`` around |
| to functions, you will need to fix some code. These are now |
| ``npy_intp*`` pointers. On 32-bit systems there won't be a problem. |
| However, on 64-bit systems, you will need to make changes to avoid |
| errors and segfaults. |
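| |
| The 64-bit pitfall in the last item can be sketched in plain C. This is a |
| minimal illustration, not NumPy code: ``intp_t`` is a hypothetical stand-in |
| for ``npy_intp`` (assumed here to be pointer-sized, like ``intptr_t``), and |
| ``widen_dims`` is a hypothetical helper that copies legacy ``int`` dimensions |
| element by element instead of casting the pointer. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for npy_intp; assumed to be pointer-sized. */
typedef intptr_t intp_t;

/*
 * Copy nd legacy int dimensions into a pointer-sized array.
 * Casting the int* to intp_t* instead would read the wrong bytes
 * whenever sizeof(intp_t) != sizeof(int).
 */
void
widen_dims(const int *src, intp_t *dst, int nd)
{
    assert(src != NULL && dst != NULL && nd >= 0);
    for (int i = 0; i < nd; i++) {
        dst[i] = (intp_t)src[i];
    }
}
```

| On a typical 64-bit build ``sizeof(intp_t)`` is 8 while ``sizeof(int)`` is 4, |
| which is why the element-wise copy is needed there. |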
| |
| |
| The header files ``arrayobject.h`` and ``ufuncobject.h`` contain many defines |
| that you may find useful. The files ``__ufunc_api.h`` and |
| ``__multiarray_api.h`` contain the available C-API function calls with |
| their function signatures. |
| |
| All of these headers are installed to |
| ``<YOUR_PYTHON_LOCATION>/site-packages/numpy/core/include`` |
| |
| |
| Getting arrays in C-code |
| ========================= |
| |
| All new arrays can be created using ``PyArray_NewFromDescr``. A simple interface |
| equivalent to ``PyArray_FromDims`` is ``PyArray_SimpleNew(nd, dims, typenum)`` |
| and to ``PyArray_FromDimsAndData`` is |
| ``PyArray_SimpleNewFromData(nd, dims, typenum, data)``. |
| |
| ``PyArray_NewFromDescr`` is a very flexible function. |
| |
| :: |
| |
| PyObject * PyArray_NewFromDescr(PyTypeObject *subtype, PyArray_Descr *descr, |
| int nd, npy_intp *dims, |
| npy_intp *strides, char *data, |
| int flags, PyObject *obj); |
| |
| ``subtype`` : ``PyTypeObject *`` |
| The subtype that should be created (either pass in |
| ``&PyArray_Type``, ``&PyBigArray_Type``, or ``obj->ob_type``, |
| where ``obj`` is an instance of a subtype (or subclass) of |
| ``PyArray_Type`` or ``PyBigArray_Type``). |
| |
| ``descr`` : ``PyArray_Descr *`` |
| The type descriptor for the array. This is a Python object (this |
| function steals a reference to it). The easiest way to get one is |
| using ``PyArray_DescrFromType(<typenum>)``. If you want to use a |
| flexible size array, then you need to use |
| ``PyArray_DescrNewFromType(<flexible typenum>)`` and set its ``elsize`` |
| member to the desired size. The typenum in both of these cases |
| is one of the ``PyArray_XXXX`` enumerated types. |
| |
| ``nd`` : ``int`` |
| The number of dimensions (<``MAX_DIMS``) |
| |
| ``*dims`` : ``npy_intp *`` |
| A pointer to the size in each dimension. Information will be |
| copied from here. |
| |
| ``*strides`` : ``npy_intp *`` |
| The strides this array should have. For new arrays created by this |
| routine, this should be ``NULL``. If you pass in memory for this array |
| to use, then you can pass in the strides information as well |
| (otherwise it will be created for you and default to C-contiguous |
| or Fortran contiguous). Any strides will be copied into the array |
| structure. Do not pass in bad strides information! |
| |
| ``PyArray_CheckStrides(...)`` can help, but you must call it yourself if |
| you are unsure. You cannot pass in strides information when data is |
| ``NULL`` and this routine is creating its own memory. |
| |
| ``*data`` : ``char *`` |
| ``NULL`` for creating brand-new memory. If you want this array to wrap |
| another memory area, then pass the pointer here. You are |
| responsible for deleting the memory in that case, but do not do so |
| until the new array object has been deleted. The best way to |
| handle that is to get the memory from another Python object, |
| ``INCREF`` that Python object after passing its data pointer to this |
| routine, and set the ``->base`` member of the returned array to the |
| Python object. *You are responsible for* setting ``PyArray_BASE(ret)`` |
| to the base object. Failure to do so will create a memory leak. |
| |
| If you pass in a data buffer, the ``flags`` argument will be the flags |
| of the new array. If you create a new array, a non-zero flags |
| argument indicates that you want the array to be in Fortran order. |
| |
| ``flags`` : ``int`` |
| Either the flags showing how to interpret the data buffer passed |
| in, or if a new array is created, nonzero to indicate a Fortran |
| order array. See below for an explanation of the flags. |
| |
| ``obj`` : ``PyObject *`` |
| If ``subtype`` is ``&PyArray_Type`` or ``&PyBigArray_Type``, this argument is |
| ignored. Otherwise, the ``__array_finalize__`` method of the subtype |
| is called (if present) and passed this object. This is usually an |
| array of the type to be created (so the ``__array_finalize__`` method |
| must handle an array argument), but it can be anything. |
| |
| Note: The returned array object will be uninitialized unless the type is |
| ``PyArray_OBJECT`` in which case the memory will be set to ``NULL``. |
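| |
| When ``strides`` is ``NULL``, the default strides follow from the shape and |
| the item size. As a rough sketch of that arithmetic (not the library's |
| actual implementation), the hypothetical helper ``default_strides`` below |
| fills C-order or Fortran-order strides. ``intp_t`` is a hypothetical |
| stand-in for ``npy_intp``, and zero-length dimensions are not treated |
| specially. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t intp_t;  /* hypothetical stand-in for npy_intp */

/*
 * Fill strides (in bytes) for a freshly allocated, contiguous array.
 * fortran == 0: C order, last index varies fastest.
 * fortran != 0: Fortran order, first index varies fastest.
 */
void
default_strides(int nd, const intp_t *dims, intp_t itemsize,
                int fortran, intp_t *strides)
{
    assert(nd >= 0 && itemsize > 0);
    intp_t step = itemsize;
    if (fortran) {
        for (int i = 0; i < nd; i++) {
            strides[i] = step;
            step *= dims[i];
        }
    }
    else {
        for (int i = nd - 1; i >= 0; i--) {
            strides[i] = step;
            step *= dims[i];
        }
    }
}
```

| For a 2 x 3 x 4 array of 8-byte items this yields strides of (96, 32, 8) in |
| C order and (8, 16, 48) in Fortran order. |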
| |
| ``PyArray_SimpleNew(nd, dims, typenum)`` is a drop-in replacement for |
| ``PyArray_FromDims`` (except it takes ``npy_intp*`` dims instead of ``int*`` dims |
| which matters on 64-bit systems) and it does not initialize the memory |
| to zero. |
| |
| ``PyArray_SimpleNew`` is just a macro for ``PyArray_New`` with default arguments. |
| Use ``PyArray_FILLWBYTE(arr, 0)`` to fill with zeros. |
| |
| The ``PyArray_FromDims`` and family of functions are still available and |
| are loose wrappers around this function. These functions still take |
| ``int *`` arguments. This should be fine on 32-bit systems, but on 64-bit |
| systems you may run into trouble if you frequently passed |
| ``PyArray_FromDims`` the dimensions member of the old ``PyArrayObject`` structure |
| because ``sizeof(npy_intp) != sizeof(int)``. |
| |
| |
| Getting an arrayobject from an arbitrary Python object |
| ====================================================== |
| |
| ``PyArray_FromAny(...)`` |
| |
| This function replaces ``PyArray_ContiguousFromObject`` and friends (those |
| function calls still remain but they are loose wrappers around the |
| ``PyArray_FromAny`` call). |
| |
| :: |
| |
| static PyObject * |
| PyArray_FromAny(PyObject *op, PyArray_Descr *dtype, int min_depth, |
| int max_depth, int requires, PyObject *context) |
| |
| |
| ``op`` : ``PyObject *`` |
| The Python object to "convert" to an array object |
| |
| ``dtype`` : ``PyArray_Descr *`` |
| The desired data-type descriptor. This can be ``NULL``, if the |
| descriptor should be determined by the object. Unless ``FORCECAST`` is |
| present in ``requires``, this call will generate an error if the data |
| type cannot be safely obtained from the object. |
| |
| ``min_depth`` : ``int`` |
| The minimum depth of array needed, or 0 if it doesn't matter. |
| |
| ``max_depth`` : ``int`` |
| The maximum depth of array allowed, or 0 if it doesn't matter. |
| |
| ``requires`` : ``int`` |
| A flag indicating the "requirements" of the returned array. These |
| are the usual ndarray flags (see `NDArray flags`_ below). In |
| addition, there are three flags used only for the ``FromAny`` |
| family of functions: |
| |
| - ``ENSURECOPY``: always copy the array. Returned arrays always |
| have ``CONTIGUOUS``, ``ALIGNED``, and ``WRITEABLE`` set. |
| - ``ENSUREARRAY``: ensure the returned array is an ndarray (or a |
| bigndarray if ``op`` is one). |
| - ``FORCECAST``: cause a cast to occur regardless of whether or |
| not it is safe. |
| |
| ``context`` : ``PyObject *`` |
| If the Python object ``op`` is not a NumPy array, but has an |
| ``__array__`` method, context is passed as the second argument to |
| that method (the first is the typecode). Almost always this |
| parameter is ``NULL``. |
| |
| |
| ``PyArray_ContiguousFromAny(op, typenum, min_depth, max_depth)`` is |
| equivalent to ``PyArray_ContiguousFromObject(...)`` (which is still |
| available), except it will return the subclass if op is already a |
| subclass of the ndarray. The ``ContiguousFromObject`` version will |
| always return an ndarray (or a bigndarray). |
| |
| Passing Data Type information to C-code |
| ======================================= |
| |
| All datatypes are handled using the ``PyArray_Descr *`` structure. |
| This structure can be obtained from a Python object using |
| ``PyArray_DescrConverter`` and ``PyArray_DescrConverter2``. The former |
| returns the default ``PyArray_LONG`` descriptor when the input object |
| is ``None``, while the latter returns ``NULL`` when the input object is ``None``. |
| |
| See the ``arraymethods.c`` and ``multiarraymodule.c`` files for many |
| examples of usage. |
| |
| Getting at the structure of the array. |
| -------------------------------------- |
| |
| You should use the ``#defines`` provided to access array structure portions: |
| |
| - ``PyArray_DATA(obj)`` : returns a ``void *`` to the array data |
| - ``PyArray_BYTES(obj)`` : returns a ``char *`` to the array data |
| - ``PyArray_ITEMSIZE(obj)`` |
| - ``PyArray_NDIM(obj)`` |
| - ``PyArray_DIMS(obj)`` |
| - ``PyArray_DIM(obj, n)`` |
| - ``PyArray_STRIDES(obj)`` |
| - ``PyArray_STRIDE(obj,n)`` |
| - ``PyArray_DESCR(obj)`` |
| - ``PyArray_BASE(obj)`` |
| |
| See ``arrayobject.h`` for more. |
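| |
| These accessors are usually combined to locate an element: the address of |
| element (i0, i1, ...) is the data pointer plus the sum of each index times |
| its stride. A minimal sketch, assuming plain C arrays in place of a real |
| ``PyArrayObject`` (``element_ptr`` and ``intp_t`` are hypothetical names, |
| with ``bytes`` and ``strides`` playing the roles of ``PyArray_BYTES(obj)`` |
| and ``PyArray_STRIDES(obj)``): |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t intp_t;  /* hypothetical stand-in for npy_intp */

/* Return the address of the element at index idx[0..nd-1]. */
char *
element_ptr(char *bytes, const intp_t *strides, const intp_t *idx, int nd)
{
    assert(bytes != NULL && nd >= 0);
    for (int i = 0; i < nd; i++) {
        bytes += idx[i] * strides[i];  /* strides are in bytes */
    }
    return bytes;
}
```

| Because only the strides decide the layout, the same helper works for a |
| transposed view: swap the strides and the indices and the same element is |
| reached. |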
| |
| |
| NDArray Flags |
| ============= |
| |
| The ``flags`` attribute of the ``PyArrayObject`` structure contains important |
| information about the memory used by the array (pointed to by the data member). |
| This flags information must be kept accurate, or strange results and even |
| segfaults may result. |
| |
| There are 6 (binary) flags that describe the memory area used by the |
| data buffer. These constants are defined in ``arrayobject.h`` and |
| determine the bit-position of the flag. Python exposes a nice attribute- |
| based interface as well as a dictionary-like interface for getting |
| (and, if appropriate, setting) these flags. |
| |
| Memory areas of all kinds can be pointed to by an ndarray, necessitating |
| these flags. If you get an arbitrary ``PyArrayObject`` in C-code, |
| you need to be aware of the flags that are set. |
| If you need to guarantee a certain kind of array |
| (like ``NPY_CONTIGUOUS`` and ``NPY_BEHAVED``), then pass these requirements into the |
| ``PyArray_FromAny`` function. |
| |
| |
| ``NPY_CONTIGUOUS`` |
| True if the array is (C-style) contiguous in memory. |
| ``NPY_FORTRAN`` |
| True if the array is (Fortran-style) contiguous in memory. |
| |
| Notice that contiguous 1-d arrays are always both Fortran contiguous |
| and C contiguous. Both flags are conveniences only: whether an array is |
| ``NPY_CONTIGUOUS`` or ``NPY_FORTRAN`` can always be determined from its |
| ``strides``, ``dimensions``, and ``itemsize`` attributes. |
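| |
| Because the two contiguity flags are derived quantities, they can be |
| recomputed from the shape, strides, and item size. The following is a |
| simplified sketch of such a check, not NumPy's own routine: |
| ``is_c_contiguous``, ``is_f_contiguous``, and ``intp_t`` are hypothetical |
| names, and corner cases such as zero- or one-length dimensions are ignored. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t intp_t;  /* hypothetical stand-in for npy_intp */

/* Nonzero if strides match a C-order layout (last index fastest). */
int
is_c_contiguous(int nd, const intp_t *dims, const intp_t *strides,
                intp_t itemsize)
{
    assert(nd >= 0 && itemsize > 0);
    intp_t expected = itemsize;
    for (int i = nd - 1; i >= 0; i--) {
        if (strides[i] != expected) {
            return 0;
        }
        expected *= dims[i];
    }
    return 1;
}

/* Nonzero if strides match a Fortran-order layout (first index fastest). */
int
is_f_contiguous(int nd, const intp_t *dims, const intp_t *strides,
                intp_t itemsize)
{
    assert(nd >= 0 && itemsize > 0);
    intp_t expected = itemsize;
    for (int i = 0; i < nd; i++) {
        if (strides[i] != expected) {
            return 0;
        }
        expected *= dims[i];
    }
    return 1;
}
```

| A 1-d array of five 8-byte items with a stride of 8 passes both checks, |
| while a 2 x 3 array with strides (24, 8) passes only the C-order check. |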
| |
| ``NPY_OWNDATA`` |
| True if the array owns the memory (it will try and free it using |
| ``PyDataMem_FREE()`` on deallocation --- so it better really own it). |
| |
| The following three flags facilitate using a data pointer that is a memory-mapped |
| array, or part of some larger record array. But, they may have other uses... |
| |
| ``NPY_ALIGNED`` |
| True if the data buffer is aligned for the type and the strides |
| are multiples of the alignment factor as well. This can be |
| checked. |
| |
| ``NPY_WRITEABLE`` |
| True only if the data buffer can be "written" to. |
| |
| ``NPY_UPDATEIFCOPY`` |
| This is a special flag that is set if this array represents a copy |
| made because a user required certain flags in ``PyArray_FromAny`` and |
| a copy had to be made of some other array (and the user asked for |
| this flag to be set in such a situation). The base attribute then |
| points to the "misbehaved" array (which is set read-only). When |
| the array with this flag set is deallocated, it will copy its |
| contents back to the "misbehaved" array (casting if necessary) and |
| will reset the "misbehaved" array to ``WRITEABLE``. If the |
| "misbehaved" array was not ``WRITEABLE`` to begin with then |
| ``PyArray_FromAny`` would have returned an error because ``UPDATEIFCOPY`` |
| would not have been possible. |
| |
| |
| ``PyArray_UpdateFlags(obj, flags)`` will update the ``obj->flags`` for |
| ``flags`` which can be any of ``NPY_CONTIGUOUS``, ``NPY_FORTRAN``, ``NPY_ALIGNED``, or |
| ``NPY_WRITEABLE``. |
| |
| Some useful combinations of these flags: |
| |
| - ``NPY_BEHAVED = NPY_ALIGNED | NPY_WRITEABLE`` |
| - ``NPY_CARRAY = NPY_DEFAULT = NPY_CONTIGUOUS | NPY_BEHAVED`` |
| - ``NPY_CARRAY_RO = NPY_CONTIGUOUS | NPY_ALIGNED`` |
| - ``NPY_FARRAY = NPY_FORTRAN | NPY_BEHAVED`` |
| - ``NPY_FARRAY_RO = NPY_FORTRAN | NPY_ALIGNED`` |
| |
| The macro ``PyArray_CHKFLAGS(obj, flags)`` can test any combination of flags. |
| There are several default combinations defined as macros already |
| (see ``arrayobject.h``). |
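| |
| The combinations listed above are ordinary bit masks, so testing them is a |
| single mask comparison. The sketch below uses hypothetical ``DEMO_*`` |
| constants with illustrative bit positions (the real values are defined in |
| ``arrayobject.h`` and may differ) and a hypothetical ``demo_checkflags`` |
| helper that performs an all-bits-set test of the kind the flag-checking |
| macro provides. |

```c
#include <assert.h>

/* Illustrative bit positions only; the real constants live in arrayobject.h. */
enum {
    DEMO_CONTIGUOUS = 0x0001,
    DEMO_FORTRAN    = 0x0002,
    DEMO_ALIGNED    = 0x0100,
    DEMO_WRITEABLE  = 0x0400,
    DEMO_BEHAVED    = DEMO_ALIGNED | DEMO_WRITEABLE,
    DEMO_CARRAY     = DEMO_CONTIGUOUS | DEMO_BEHAVED,
    DEMO_CARRAY_RO  = DEMO_CONTIGUOUS | DEMO_ALIGNED
};

/* Nonzero only if every bit in mask is also set in flags. */
int
demo_checkflags(int flags, int mask)
{
    assert(mask != 0);  /* an empty mask would pass trivially */
    return (flags & mask) == mask;
}
```

| A read-only, aligned, C-contiguous array satisfies ``DEMO_CARRAY_RO`` but |
| not ``DEMO_CARRAY``, because the writeable bit is missing. |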
| |
| In particular, there are ``ISBEHAVED``, ``ISBEHAVED_RO``, ``ISCARRAY`` |
| and ``ISFARRAY`` macros that also check to make sure the array is in |
| native byte order (as determined by the data-type descriptor). |
| |
| There are more C-API enhancements, which you can discover by reading the |
| code or by buying the book (http://www.trelgol.com). |