Skip to content

Potential Issue with 2-D ARGOUTVIEWM Wrapping #17398

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
leakec opened this issue Sep 29, 2020 · 13 comments · Fixed by #17399
Closed

Potential Issue with 2-D ARGOUTVIEWM Wrapping #17398

leakec opened this issue Sep 29, 2020 · 13 comments · Fixed by #17399

Comments

@leakec
Copy link
Contributor

leakec commented Sep 29, 2020

I am having an issue with 2-D ARGOUTVIEWM wrapping with swig. The code works when using the form (double** ARGOUTVIEWM_ARRAY2, int* DIM1, int* DIM2), but not when using the form (int* DIM1, int* DIM2, double** ARGOUTVIEWM_ARRAY2). Moreover, when using the latter form, the form that causes a seg fault, swig issues a warning:
"warning: cast to pointer from integer of different size." Looking at the wrapped swig code, it appears that the wrong argument is being freed. When using the first form, this issue does not arise. It should also be noted that if ARGOUTVIEW arrays are used rather than ARGOUTVIEWM arrays, this issue also does not arise.

Since this is a swig wrapping issue, I will include the C++ code, swig code, setup file, and test file. After building the code with "python setup.py build" and moving the library from the build folder to the current directly, the test.py file can be used to reproduce the error, i.e. just run "python test.py".

Reproducing code example:

BF.h:

#include <iostream>

#ifndef BF_H
#define BF_H

void getw(double** arrOut, int* dimOut, int* nOut);
void getw2(int* dimOut, int* nOut, double** arrOut);

#endif

BF.cxx:

#include "BF.h"

void getw(double** arrOut, int* dimOut, int* nOut){
	int m=5; int dim=2;
	*arrOut = new double[m*dim];
	for (int k=0;k<m*dim;k++)
		(*arrOut)[k] = 2.;
	*dimOut = dim;
	*nOut = m;
	return;
}

void getw2(int* dimOut, int* nOut, double** arrOut){
	int m=5; int dim=2;
	*arrOut = new double[m*dim];
	for (int k=0;k<m*dim;k++)
		(*arrOut)[k] = 2.;
	*dimOut = dim;
	*nOut = m;
	return;
}

BF.i

// BF.i

%module BF
%{
#define SWIG_FILE_WITH_INIT
#include <iostream>
#include "BF.h"
%}

%include "numpy.i"

%init %{
        import_array();
%}

// Apply typemaps to allow hooks into Python

%apply (double** ARGOUTVIEWM_ARRAY2, int* DIM1, int* DIM2){(double** arrOut, int* dimOut, int* nOut)}; 
%apply (int* DIM1, int* DIM2, double** ARGOUTVIEWM_ARRAY2){(int* dimOut, int* nOut, double** arrOut)}; 

%include "BF.h"

setup.py

#! /usr/bin/env python

# System imports
from distutils.core import *
from distutils      import sysconfig

# Third-party modules - we depend on numpy for everything
import numpy

# Obtain the numpy include directory.  This logic works across numpy versions.
try:
    numpy_include = numpy.get_include()
except AttributeError:
    numpy_include = numpy.get_numpy_include()

# crop extension module
_BF = Extension("_BF",
                   ["BF.i","BF.cxx"],
                   include_dirs = [numpy_include],
                   swig_opts = ['-c++','-py3'],
                   extra_compile_args = ["--verbose"]
                   )

# NumyTypemapTests setup
setup(  name        = "test",
        description = "A simple test to show the use of ARGOUTVIEWM_ARRAY2",
        author      = "foobar",
        version     = "1.0",
        ext_modules = [_BF]
        )

test.py:

import numpy as np
from BF import getw, getw2

print("Running getw")
W = getw()
del W
getw()
getw()
getw()
getw()

print("Running getw2")
W = getw2()
del W
getw2()
getw2()
getw2()
getw2()
getw2()

Error message:

GDB traceback:
#0  __GI___libc_free (mem=0x2) at malloc.c:3102
#1  0x000000000043da3e in ?? ()
#2  0x00007ffff70f05fa in ?? () from /usr/lib/python3/dist-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so
#3  0x00000000005cc894 in _PyDict_DelItem_KnownHash ()
#4  0x000000000056b042 in _PyEval_EvalFrameDefault ()
#5  0x0000000000565972 in _PyEval_EvalCodeWithName ()
#6  0x0000000000686053 in PyEval_EvalCode ()
#7  0x00000000006753d1 in ?? ()
#8  0x000000000067544f in ?? ()
#9  0x0000000000675507 in PyRun_FileExFlags ()
#10 0x000000000067758a in PyRun_SimpleFileExFlags ()
#11 0x00000000006ae99e in Py_RunMain ()
#12 0x00000000006aed29 in Py_BytesMain ()
#13 0x00007ffff7dd30b3 in __libc_start_main (main=0x4ebd20 <main>, argc=2, argv=0x7fffffffddd8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fffffffddc8) at ../csu/libc-start.c:308
#14 0x00000000005f62ee in _start ()

Terminal printout:
Running getw
Running getw2
Segmentation fault (core dumped)

NumPy/Python version information:

1.17.4 3.8.2 (default, Jul 16 2020, 14:00:26)
[GCC 9.3.0]

@seberg
Copy link
Member

seberg commented Sep 29, 2020

Not sure, looking at the swig code, I don't really see much of a reason these should differ. Could it be that the problem is just the new which is then free'd using free (by swig/NumPy)?

EDIT: To be a bit more clear, please try this using malloc instead of new.

@leakec
Copy link
Contributor Author

leakec commented Sep 29, 2020

I don't think so, as the getw function does the same thing and runs without error, and it is only the getw2 function that has the seg fault issue. Moreover, swig only issues a warning when wrapping the getw2 function, and the warning points to int* begin freed via the PyCapsule object. In the getw function, it is the double** pointer that is being freed via the PyCapsule object.

Also, since replacing ARGOUTVIEWM with ARGOUTVIEW for the getw2 function results in working code, this leads me to believe that the free is working fine, and it is ARGOUTVIEWM that is causing the problem.

EDIT: I just tried replacing the new with malloc, and the same issue persists.

@eric-wieser
Copy link
Member

I've never heard of ARGOUTVIEWM before - is it a numpy thing or a swig thing - and where is the behavior implemented?

@seberg
Copy link
Member

seberg commented Sep 29, 2020

Well, the main difference between ARGOUTVIEWM and ARGOUTVIEW appears to be that ARGOUTVIEW does not free the memory, if I read the code correctly. So that is where my guess came from. And swig definitely uses free to free that memory, and while I do not know C++ well, mixing new and free seems probably not good. So, I do think you have to use malloc: https://stackoverflow.com/questions/4061514/c-object-created-with-new-destroyed-with-free-how-bad-is-this

@leakec sorry for being pedantic, but you tried replacing both new with malloc, right?

The only other idea I have is that dimensions are normally npy_intp and not int, but since the %apply you got spells out int, I do not really see swig not casting the int correctly.

@eric-wieser it seems to be defined here:

(DATA_TYPE** ARGOUTVIEWM_ARRAY2, DIM_TYPE* DIM1, DIM_TYPE* DIM2)
but my swig knowledge is not really existing.

@eric-wieser
Copy link
Member

Ah, found it:

numpy/tools/swig/numpy.i

Lines 2509 to 2542 in cb107d1

/* Typemap suite for (DATA_TYPE** ARGOUTVIEWM_ARRAY2, DIM_TYPE* DIM1, DIM_TYPE* DIM2)
*/
%typemap(in,numinputs=0)
(DATA_TYPE** ARGOUTVIEWM_ARRAY2, DIM_TYPE* DIM1 , DIM_TYPE* DIM2 )
(DATA_TYPE* data_temp = NULL , DIM_TYPE dim1_temp, DIM_TYPE dim2_temp)
{
$1 = &data_temp;
$2 = &dim1_temp;
$3 = &dim2_temp;
}
%typemap(argout,
fragment="NumPy_Backward_Compatibility,NumPy_Utilities")
(DATA_TYPE** ARGOUTVIEWM_ARRAY2, DIM_TYPE* DIM1, DIM_TYPE* DIM2)
{
npy_intp dims[2] = { *$2, *$3 };
PyObject* obj = PyArray_SimpleNewFromData(2, dims, DATA_TYPECODE, (void*)(*$1));
PyArrayObject* array = (PyArrayObject*) obj;
if (!array) SWIG_fail;
%#ifdef SWIGPY_USE_CAPSULE
PyObject* cap = PyCapsule_New((void*)(*$1), SWIGPY_CAPSULE_NAME, free_cap);
%#else
PyObject* cap = PyCObject_FromVoidPtr((void*)(*$1), free);
%#endif
%#if NPY_API_VERSION < 0x00000007
PyArray_BASE(array) = cap;
%#else
PyArray_SetBaseObject(array,cap);
%#endif
$result = SWIG_Python_AppendOutput($result,obj);
}

numpy/tools/swig/numpy.i

Lines 2544 to 2577 in cb107d1

/* Typemap suite for (DIM_TYPE* DIM1, DIM_TYPE* DIM2, DATA_TYPE** ARGOUTVIEWM_ARRAY2)
*/
%typemap(in,numinputs=0)
(DIM_TYPE* DIM1 , DIM_TYPE* DIM2 , DATA_TYPE** ARGOUTVIEWM_ARRAY2)
(DIM_TYPE dim1_temp, DIM_TYPE dim2_temp, DATA_TYPE* data_temp = NULL )
{
$1 = &dim1_temp;
$2 = &dim2_temp;
$3 = &data_temp;
}
%typemap(argout,
fragment="NumPy_Backward_Compatibility,NumPy_Utilities")
(DIM_TYPE* DIM1, DIM_TYPE* DIM2, DATA_TYPE** ARGOUTVIEWM_ARRAY2)
{
npy_intp dims[2] = { *$1, *$2 };
PyObject* obj = PyArray_SimpleNewFromData(2, dims, DATA_TYPECODE, (void*)(*$3));
PyArrayObject* array = (PyArrayObject*) obj;
if (!array) SWIG_fail;
%#ifdef SWIGPY_USE_CAPSULE
PyObject* cap = PyCapsule_New((void*)(*$1), SWIGPY_CAPSULE_NAME, free_cap);
%#else
PyObject* cap = PyCObject_FromVoidPtr((void*)(*$1), free);
%#endif
%#if NPY_API_VERSION < 0x00000007
PyArray_BASE(array) = cap;
%#else
PyArray_SetBaseObject(array,cap);
%#endif
$result = SWIG_Python_AppendOutput($result,obj);
}

@leakec
Copy link
Contributor Author

leakec commented Sep 29, 2020

@seberg No worries, yes I replaced both calls to new with malloc, and still the same issue. Its probably worth noting that I have used swig to wrap C++ in the past and have used new almost every time without issue. However, in the past I have always used ARGOUTVIEW arrays, as the programs were small enough that I did not need to use ARGOUTVIEWM arrays to release the memory after it is used.

@eric-wieser Yes, that is what I see as well. When using (double** ARGOUTVIEWM_ARRAY2, int* DIM1, int* DIM2) the program appears to wrap correctly, and the free is called on the double** argument. Whereas when using (int* DIM1, int* DIM2, double** ARGOUTVIEWM_ARRAY2) the free is called on one of the int* arguments instead. My guess is that there is an issue in the typemap that has the ARGOUTVIEWM argument last.

@leakec
Copy link
Contributor Author

leakec commented Sep 29, 2020

I think I found it! See this snippet of the numpy.i file:

/* Typemap suite for (DATA_TYPE** ARGOUTVIEWM_ARRAY2, DIM_TYPE* DIM1, DIM_TYPE* DIM2)
 */
%typemap(in,numinputs=0)
  (DATA_TYPE** ARGOUTVIEWM_ARRAY2, DIM_TYPE* DIM1     , DIM_TYPE* DIM2     )
  (DATA_TYPE*  data_temp = NULL  , DIM_TYPE  dim1_temp, DIM_TYPE  dim2_temp)
{
  $1 = &data_temp;
  $2 = &dim1_temp;
  $3 = &dim2_temp;
}
%typemap(argout,
         fragment="NumPy_Backward_Compatibility,NumPy_Utilities")
  (DATA_TYPE** ARGOUTVIEWM_ARRAY2, DIM_TYPE* DIM1, DIM_TYPE* DIM2)
{
  npy_intp dims[2] = { *$2, *$3 };
  PyObject* obj = PyArray_SimpleNewFromData(2, dims, DATA_TYPECODE, (void*)(*$1));
  PyArrayObject* array = (PyArrayObject*) obj;

  if (!array) SWIG_fail;

%#ifdef SWIGPY_USE_CAPSULE
    PyObject* cap = PyCapsule_New((void*)(*$1), SWIGPY_CAPSULE_NAME, free_cap);
%#else
    PyObject* cap = PyCObject_FromVoidPtr((void*)(*$1), free);
%#endif

%#if NPY_API_VERSION < 0x00000007
  PyArray_BASE(array) = cap;
%#else
  PyArray_SetBaseObject(array,cap);
%#endif

  $result = SWIG_Python_AppendOutput($result,obj);
}

/* Typemap suite for (DIM_TYPE* DIM1, DIM_TYPE* DIM2, DATA_TYPE** ARGOUTVIEWM_ARRAY2)
 */
%typemap(in,numinputs=0)
  (DIM_TYPE* DIM1     , DIM_TYPE* DIM2     , DATA_TYPE** ARGOUTVIEWM_ARRAY2)
  (DIM_TYPE  dim1_temp, DIM_TYPE  dim2_temp, DATA_TYPE*  data_temp = NULL  )
{
  $1 = &dim1_temp;
  $2 = &dim2_temp;
  $3 = &data_temp;
}
%typemap(argout,
         fragment="NumPy_Backward_Compatibility,NumPy_Utilities")
  (DIM_TYPE* DIM1, DIM_TYPE* DIM2, DATA_TYPE** ARGOUTVIEWM_ARRAY2)
{
  npy_intp dims[2] = { *$1, *$2 };
  PyObject* obj = PyArray_SimpleNewFromData(2, dims, DATA_TYPECODE, (void*)(*$3));
  PyArrayObject* array = (PyArrayObject*) obj;

  if (!array) SWIG_fail;

%#ifdef SWIGPY_USE_CAPSULE
    PyObject* cap = PyCapsule_New((void*)(*$1), SWIGPY_CAPSULE_NAME, free_cap);
%#else
    PyObject* cap = PyCObject_FromVoidPtr((void*)(*$1), free);
%#endif

%#if NPY_API_VERSION < 0x00000007
  PyArray_BASE(array) = cap;
%#else
  PyArray_SetBaseObject(array,cap);
%#endif

  $result = SWIG_Python_AppendOutput($result,obj);
}

As you can see, the ($1) argument is freed in both cases. If I switch the freed arguments to ($3) in the second case, the case where the ARGOUTVIEWM array is the last argument , it works!

@eric-wieser
Copy link
Member

I see the bug.

 %#ifdef SWIGPY_USE_CAPSULE 
     PyObject* cap = PyCapsule_New((void*)(*$1), SWIGPY_CAPSULE_NAME, free_cap); 
 %#else 
     PyObject* cap = PyCObject_FromVoidPtr((void*)(*$1), free); 
 %#endif

should be

 %#ifdef SWIGPY_USE_CAPSULE 
     PyObject* cap = PyCapsule_New((void*)(*$3), SWIGPY_CAPSULE_NAME, free_cap); 
 %#else 
     PyObject* cap = PyCObject_FromVoidPtr((void*)(*$3), free); 
 %#endif

@seberg's warnings about new vs malloc are valid though - mixing them is undefined behavior, meaning you might get away with it until without warning you won't. Don't take the risk.

@leakec
Copy link
Contributor Author

leakec commented Sep 29, 2020

@seberg Great minds think alike :)

I'll propose the fix in a few minutes.

@eric-wieser
Copy link
Member

Looks like we cross-posted. Mind making a PR? Perhaps even check the others too, if you're feeling generous.

@seberg
Copy link
Member

seberg commented Sep 29, 2020

@leakec thanks for digging it out, had not noticed that typo obviously! A PR would be great, I am not sure if we run tests, but there do seem swig specific tests in the folder.

@leakec
Copy link
Contributor Author

leakec commented Sep 29, 2020

@seberg @eric-wieser Sure thing, I will check the other ARGOUTVIEWM arrays as well. I'm pretty new to open source thing, so it will take me a few minutes to figure out how to make the PR correctly.

@leakec
Copy link
Contributor Author

leakec commented Sep 29, 2020

@seberg @eric-wieser I was able to correct all of the ARGOUTVIEWM memory de-allocation. However, adding an ARGOUTVIEWM case to the current unit tests for testArray.py results in a double free. I believe this is because the Array class already deallocates that memory. Would you like me to create a separate unit test for the ARGOUTVIEWM arrays?
EDIT: Submitted pull request that includes modified numpy.i file, but no changes to the tests. Please let me know if you would like me to create a new test for the ARGOUTVIEWM array typemaps.

@leakec leakec closed this as completed Sep 29, 2020
mattip added a commit that referenced this issue Sep 30, 2020
BLD: Fixed ARGOUTVIEWM memory deallocation. Closes #17398.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants