
BUG: Prevent passing of size 0 to array alloc C functions #13691


Merged: 1 commit into numpy:master on Jun 11, 2019

Conversation

@oribro (Contributor) commented Jun 1, 2019

Hi,
This is my first bug-fix attempt in the core code. I took the approach of ensuring that size 0 is not passed to PyDataMem_NEW and PyDataMem_RENEW, both found in core/multiarray/alloc.c.
I looked up all the call sites I could find for these functions and added an explicit check for size 0; when it triggers, the call to the alloc function is skipped.
After that I added assert(size != 0) inside these functions to make any remaining size-0 calls fail.
In ctors.c I found code that takes the "pass size 1 and make the allocation succeed" approach; for consistency I changed it to skip as well.
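For illustration, here is a minimal sketch of that pattern (hypothetical names; the real functions live in core/multiarray/alloc.c and its call sites):

    #include <assert.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Allocator-side guard: any remaining size-0 call fails loudly
     * in a build where asserts are enabled. */
    static void *
    datamem_new(size_t size)
    {
        assert(size != 0);
        return malloc(size);
    }

    /* Caller-side check: skip the allocation entirely when the
     * computed size is 0. */
    static void *
    maybe_alloc(size_t nelem, size_t elsize)
    {
        size_t size = nelem * elsize;
        if (size == 0) {
            return NULL;    /* nothing to allocate */
        }
        return datamem_new(size);
    }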
I verified that the existing tests pass on my machine:

numpy 1.17.0.dev0+d9ed18a
Python 3.6.8 (default, Dec 24 2018, 19:24:27) 
[GCC 5.4.0 20160609] on linux

See #13645

I'd appreciate any reviews, thanks!

@mattip (Member) commented Jun 2, 2019

> I verified that the existing tests pass on my machine:

It might have been better to add the asserts first and check that they fire, and then make sure we have tests covering each of the cases where you check for zero size. Then we could be reasonably sure our test suite exercises the new code paths. Coverage points to a few code paths as untested under the size-0 condition.

(Assuming you know this: when you want to check a C-level assert, you need to compile with CFLAGS=-UNDEBUG or so, or use a debug Python build, which does this for you. The Travis jobs with USE_DEBUG=1 check this.)
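For background, this is standard C rather than anything NumPy-specific: assert() compiles to a no-op whenever NDEBUG is defined, which release builds do by default, and -UNDEBUG undefines the macro so the checks stay live. A standalone demonstration:

    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    int main(void)
    {
        size_t size = 0;
        /* With NDEBUG defined (the usual release setting) this line
         * compiles to nothing; built with -UNDEBUG it aborts the
         * process -- the same "Aborted (core dumped)" seen when a
         * test trips one of the new checks in alloc.c. */
        assert(size != 0);
        printf("asserts are compiled out in this build\n");
        return 0;
    }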

@oribro (Contributor, Author) commented Jun 3, 2019

Thanks, I wasn't aware of UNDEBUG. How can I make it work with runtests.py?
I'm confused because the assert is in C but the tests are in Python.

I removed my size 0 checks and kept only the assertions.
Then I tried to run runtests.py using gcov with python runtests.py -v --gcov and got this output:
........................                                         [ 8%]
numpy/core/tests/test_item_selection.py .....                    [ 8%]
numpy/core/tests/test_longdouble.py .................ssssssssss.... [ 8%]
numpy/core/tests/test_machar.py .                                [ 8%]
numpy/core/tests/test_mem_overlap.py ...................         [ 9%]
numpy/core/tests/test_memmap.py ...................              [ 9%]
numpy/core/tests/test_multiarray.py .................................... [ 9%]
.......................s.......................Aborted (core dumped)

But I couldn't find a way to trace the exception or find the generated report.
From the docs I read:

> Running individual test files can be useful; it's much faster than running the whole test suite or that of a whole module (example: np.random.test()). This can be done with:
>
>     $ python path_to_testfile/test_file.py
>
> That also takes extra arguments, like --pdb, which drops you into the Python debugger when a test fails or an exception is raised.

But pdb didn't work for me with runtests.py combined with the C assert.

@seberg (Member) commented Jun 3, 2019

@oribro that is perfect; the Aborted means you hit the assert. If you look at python runtests.py --help you will find an example call for gdb, which should be sufficient to get a nice traceback telling you which assertion was hit.

@oribro (Contributor, Author) commented Jun 4, 2019

OK, I made some progress and found the untested code paths that @mattip mentioned.
I would like to try writing new tests for those paths, and could use gdb's Python extension commands such as py-list and py-bt.
I was impressed to read in the docs that there is a way to see where we are in the Python code while stepping through the C code.
So I got python-dbg, but the docs only mention:

> Building NumPy with a Python built with debug support (on Linux distributions typically packaged as python-dbg) is highly recommended.

And I'm not sure how to make the tests work with it.
I tried many variations; my best bet was gdb --args python3.6-dbg runtests.py -v
(python3.6-dbg instead of python, so that gdb reads the symbols from the debug interpreter),
but I got this:

Reading symbols from python3.6-dbg...
(gdb) r
Starting program: /usr/bin/python3.6-dbg runtests.py -v
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Building, see build.log...
[Detaching after fork from child process 14225]
Build OK
Traceback (most recent call last):
  File "runtests.py", line 491, in <module>
    main(argv=sys.argv[1:])
  File "runtests.py", line 297, in main
    tests=tests)
  File "/home/orib/numpy/build/testenv/lib/python3.6/site-packages/numpy/_pytesttester.py", line 122, in __call__
    import pytest
ModuleNotFoundError: No module named 'pytest'
[Inferior 1 (process 14221) exited with code 01]
(gdb) 

What am I missing here?

@seberg (Member) commented Jun 4, 2019

Well, if that is all, then it sounds pretty harmless: it just does not find pytest, which means you probably need to install pytest for the debug Python version. If you installed it yourself or in its own environment, maybe python3.6-dbg -m ensurepip (and then python3.6-dbg -m pip install pytest)? But it depends on your setup.

@oribro (Contributor, Author) commented Jun 5, 2019

Thanks! Here is a summary of what I have found so far:
When I added only the itemsize == 0 check in scalarapi.c, all the tests passed with the asserts applied.
Then I ran gcov and lcov; here are the results for the remaining checks:

  • The new size 0 checks in methods.c and in item_selection.c are covered by the tests.
  • I traced the checks I added to convert_datatype.c and iterators.c to the PyArrayNeighborhoodIter paths exercised by calls to _multiarray_tests.test_neighborhood_iterator, which I found difficult to debug; for example, in test_multiarray.py:
    # Simple, 1d tests
    def test_simple(self, dt):
        # Test padding with constant values
        x = np.zeros(10, dtype=np.void)
        r = [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 0]]
        l = _multiarray_tests.test_neighborhood_iterator(
            x, [-1, 1], x[0], NEIGH_MODE['zero'])
        assert_array_equal(l, r)

        r = [[1, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 1]]
        l = _multiarray_tests.test_neighborhood_iterator(
            x, [-1, 1], x[0], NEIGH_MODE['one'])
        assert_array_equal(l, r)

        r = [[x[4], 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, x[4]]]
        l = _multiarray_tests.test_neighborhood_iterator(
            x, [-1, 1], x[4], NEIGH_MODE['constant'])

In general, I found the size-0 problem to be related to flexible data types such as np.dtype('V'), or more specifically to PyArray_Descr.elsize:

> For data types that are always the same size (such as long), this holds the size of the data type. For flexible data types where different arrays can have a different element size, this should be 0.

Which other examples of data types with size 0 can be tested? For the tests above, when I changed the data type, either the tested C function was not called for the new dt, or an error was raised stating that the dt is not supported or that the conversion did not succeed, so it was quite confusing.

  • For the file ctors.c, the new uncovered paths led to creating an array from a string or a file, for example:
    def test_bool_fromstring(self):
        v = np.array([True, False, True, False], dtype=np.bool_)
        y = np.fromstring('1 0 -2.3 0.0', sep=' ', dtype=np.bool_)
        assert_array_equal(v, y)

such that np.fromstring calls PyArray_FromString, which has a check for itemsize == 0 (where itemsize = dtype->elsize). It then proceeds to the array_from_text function, which contains:

if (num < 0 && thisbuf == size) {
    totalbytes += bytes;
    if (totalbytes == 0) {
        break;
    }
    tmp = PyDataMem_RENEW(PyArray_DATA(r), totalbytes);

and since totalbytes = bytes = size * dtype->elsize, I couldn't find a way for it to be 0: dtype->elsize == 0 was checked earlier, and if size were 0 then num would have to be 0 as well, yet the if condition requires num < 0 (restated as code below).
The other uncovered size checks in ctors.c were similar, but maybe I missed something.
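To restate that dead-path argument as code, here is a simplified model of the accounting above (names mirror the snippet; this is a sketch of my reading of array_from_text, not the real function):

    #include <assert.h>
    #include <stddef.h>

    /* num < 0 means "read until EOF" (the branch in question);
     * size is the number of elements read into the current buffer;
     * elsize is dtype->elsize, already checked nonzero upstream. */
    static void
    model_renew_size(long num, size_t size, size_t elsize)
    {
        assert(elsize != 0);        /* rejected by PyArray_FromString */
        size_t bytes = size * elsize;
        size_t totalbytes = bytes;
        if (num < 0) {
            /* size == 0 would have produced num == 0, not num < 0,
             * so this branch only runs with size > 0, and therefore
             * totalbytes = size * elsize > 0. */
            assert(totalbytes != 0);
        }
    }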

So yeah, I put some effort into tackling this issue, but I feel I have reached a dead end with no clear strategy for continuing.
I'd appreciate any guidance from here.
Thank you!

@eric-wieser (Member) commented

> Which other examples of data types with size 0 can be tested?

np.dtype([]) is a good one. Unfortunately a lot of places misinterpret that dtype. I have a local patch that corrects the interpretation, but it exposed the very set of bugs you're looking at here.
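From the C side, that dtype looks roughly like this (a sketch assuming an extension module where import_array() has already been called; error handling trimmed):

    #include <Python.h>
    #include <numpy/arrayobject.h>

    /* Build the C-level descriptor for np.dtype([]) and report its
     * elsize; the 0 it returns is exactly the value that used to
     * reach the allocators this PR now guards. */
    static int
    empty_struct_elsize(void)
    {
        PyObject *spec = PyList_New(0);     /* the [] in np.dtype([]) */
        PyArray_Descr *descr = NULL;
        if (spec == NULL || !PyArray_DescrConverter(spec, &descr)) {
            Py_XDECREF(spec);
            return -1;
        }
        Py_DECREF(spec);
        int elsize = descr->elsize;         /* == 0 for this dtype */
        Py_DECREF(descr);
        return elsize;
    }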

@oribro force-pushed the core-forbid-malloc0 branch from d9ed18a to 8dfbc81 on June 5, 2019 18:20
@oribro (Contributor, Author) commented Jun 5, 2019

OK, so for now I've kept the checks that were covered by the existing tests, and updated the PR.

Inline review comment on this hunk:

if (PyArray_DATA(self) == NULL) {
    fa->nd = 0;
-   fa->data = PyDataMem_NEW(PyArray_DESCR(self)->elsize);
+   fa->data = PyDataMem_NEW(elsize);
Member:

I realize this was here before your PR, but it seems strange. What is this supposed to do? We could not allocate num bytes of memory, so now we allocate elsize instead and raise an error?

Your code looks "correct", but the whole idea looks wrong. Maybe it is a separate issue; this code somehow slipped in during different refactorings?

@oribro (Contributor, Author):

I agree, this is quite odd.
I can add that this NULL check was not covered by the tests even before my changes, and I found it difficult to reproduce.

Member:

opened #13724 to clean up the code not related to this PR

@mattip (Member) commented Jun 6, 2019

LGTM. Anyone else want to take a look?

@eric-wieser self-requested a review on June 6, 2019 15:33
@mattip (Member) commented Jun 9, 2019

Merging #13724 seems to have caused a conflict. @oribro, can you rebase off master and force-push?

@oribro force-pushed the core-forbid-malloc0 branch from 8dfbc81 to 03052b7 on June 9, 2019 20:17
@oribro (Contributor, Author) commented Jun 9, 2019

@mattip Sure

@mattip merged commit 31bb4ca into numpy:master on Jun 11, 2019
@mattip (Member) commented Jun 11, 2019

Coverage is all green now, in it goes. Thanks @oribro

Inline review comment on this hunk:

else {
    PyArray_DIMS(r)[0] = *nread;
    ((PyArrayObject_fields *)r)->data = tmp;
    const size_t nsize = PyArray_MAX(*nread,1)*dtype->elsize;
Member:

I think this MAX was actually a previous failed attempt to prevent this.
