
BUG: Prevent passing of size 0 to array alloc C functions #13691


Merged: 1 commit into numpy:master on Jun 11, 2019

Conversation

@oribro (Contributor) commented Jun 1, 2019

Hi,
This is my first bug-fix attempt in the core code. I took the approach of ensuring that size 0 is not passed to PyDataMem_NEW and PyDataMem_RENEW, both found in core/multiarray/alloc.c.
I looked up all the call sites I could find for these functions and added an explicit check for size 0; when it triggers, the call to the alloc function is skipped.
After that I added assert(size != 0) inside these functions to make any remaining size-0 calls fail.
In ctors.c I found code that takes the "pass size 1 and make the allocation succeed" approach; for consistency I changed it to skip as well.
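For illustration, here is a minimal sketch of that pattern (hypothetical names; the real functions live in core/multiarray/alloc.c and its call sites):

    #include <assert.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Allocator-side guard: any remaining size-0 call fails loudly
     * in a build where asserts are enabled. */
    static void *
    datamem_new(size_t size)
    {
        assert(size != 0);
        return malloc(size);
    }

    /* Caller-side check: skip the allocation entirely when the
     * computed size is 0. */
    static void *
    maybe_alloc(size_t nelem, size_t elsize)
    {
        size_t size = nelem * elsize;
        if (size == 0) {
            return NULL;    /* nothing to allocate */
        }
        return datamem_new(size);
    }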
I verified that the existing tests pass on my machine:

numpy 1.17.0.dev0+d9ed18a
Python 3.6.8 (default, Dec 24 2018, 19:24:27) 
[GCC 5.4.0 20160609] on linux

See #13645

I'd appreciate any reviews, thanks!

@mattip (Member) commented Jun 2, 2019

> I verified that the existing tests pass on my machine:

It might have been better to add the asserts first and check that they fire, and then make sure we have tests covering each of the cases where you check for zero size. Then we could be reasonably sure our test suite exercises the new code paths. Coverage points to a few code paths as untested under the size-0 condition.

(Assuming you know this: when you want to check a C-level assert, you need to compile with CFLAGS=-UNDEBUG or so, or use a debug Python build, which does this for you. The Travis jobs with USE_DEBUG=1 check this.)
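For background, this is standard C rather than anything NumPy-specific: assert() compiles to a no-op whenever NDEBUG is defined, which release builds do by default, and -UNDEBUG undefines the macro so the checks stay live. A standalone demonstration:

    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    int main(void)
    {
        size_t size = 0;
        /* With NDEBUG defined (the usual release setting) this line
         * compiles to nothing; built with -UNDEBUG it aborts the
         * process -- the same "Aborted (core dumped)" seen when a
         * test trips one of the new checks in alloc.c. */
        assert(size != 0);
        printf("asserts are compiled out in this build\n");
        return 0;
    }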

@oribro (Contributor, Author) commented Jun 3, 2019

Thanks, I wasn't aware of UNDEBUG. How can I make it work with runtests.py?
I'm confused because the assert is in C but the tests are in Python.

I removed my size 0 checks and kept only the assertions.
Then I tried to run runtests.py using gcov with python runtests.py -v --gcov and got this output:
........................                                         [ 8%]
numpy/core/tests/test_item_selection.py .....                    [ 8%]
numpy/core/tests/test_longdouble.py .................ssssssssss.... [ 8%]
numpy/core/tests/test_machar.py .                                [ 8%]
numpy/core/tests/test_mem_overlap.py ...................         [ 9%]
numpy/core/tests/test_memmap.py ...................              [ 9%]
numpy/core/tests/test_multiarray.py .................................... [ 9%]
.......................s.......................Aborted (core dumped)

But I couldn't find a way to trace the exception or find the generated report.
From the docs I read:

> Running individual test files can be useful; it's much faster than running the whole test suite or that of a whole module (example: np.random.test()). This can be done with:
>
>     $ python path_to_testfile/test_file.py
>
> That also takes extra arguments, like --pdb, which drops you into the Python debugger when a test fails or an exception is raised.

But pdb didn't work for me with runtests.py combined with the C assert.

@seberg (Member) commented Jun 3, 2019

@oribro that is perfect; the Aborted means you hit the assert. If you look at python runtests.py --help you will find an example call for gdb, which should be sufficient to get a nice traceback telling you which assertion was hit.

@oribro (Contributor, Author) commented Jun 4, 2019

OK, I made some progress and found the untested code paths that @mattip mentioned.
I would like to try writing new tests for those paths, and could use gdb's Python extension commands such as py-list and py-bt.
I was impressed to read in the docs that there is a way to see where we are in the Python code while stepping through the C code.
So I got python-dbg, but the docs only mention:

> Building NumPy with a Python built with debug support (on Linux distributions typically packaged as python-dbg) is highly recommended.

And I'm not sure how to make the tests work with it.
I tried many variations; my best bet was gdb --args python3.6-dbg runtests.py -v
(python3.6-dbg instead of python, so that gdb reads the symbols from the debug interpreter),
but I got this:

Reading symbols from python3.6-dbg...
(gdb) r
Starting program: /usr/bin/python3.6-dbg runtests.py -v
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Building, see build.log...
[Detaching after fork from child process 14225]
Build OK
Traceback (most recent call last):
  File "runtests.py", line 491, in <module>
    main(argv=sys.argv[1:])
  File "runtests.py", line 297, in main
    tests=tests)
  File "/home/orib/numpy/build/testenv/lib/python3.6/site-packages/numpy/_pytesttester.py", line 122, in __call__
    import pytest
ModuleNotFoundError: No module named 'pytest'
[Inferior 1 (process 14221) exited with code 01]
(gdb) 

What am I missing here?

@seberg (Member) commented Jun 4, 2019

Well, if that is all, then it sounds pretty harmless: it just does not find pytest, which means you probably need to install pytest for the debug Python version. If you installed it yourself or in its own environment, maybe python3.6-dbg -m ensurepip (and then python3.6-dbg -m pip install pytest)? But it depends on your setup.

@oribro (Contributor, Author) commented Jun 5, 2019

Thanks! Here is a summary of what I have found so far:
When I added only the itemsize == 0 check in scalarapi.c, all the tests passed with the asserts applied.
Then I ran gcov and lcov; here are the results for the remaining checks:

  • The new size 0 checks in methods.c and in item_selection.c are covered by the tests.
  • I traced the checks I added to convert_datatype.c and iterators.c to the PyArrayNeighborhoodIter paths exercised by calls to _multiarray_tests.test_neighborhood_iterator, which I found difficult to debug; for example, in test_multiarray.py:
    # Simple, 1d tests
    def test_simple(self, dt):
        # Test padding with constant values
        x = np.zeros(10, dtype=np.void)
        r = [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 0]]
        l = _multiarray_tests.test_neighborhood_iterator(
            x, [-1, 1], x[0], NEIGH_MODE['zero'])
        assert_array_equal(l, r)

        r = [[1, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 1]]
        l = _multiarray_tests.test_neighborhood_iterator(
            x, [-1, 1], x[0], NEIGH_MODE['one'])
        assert_array_equal(l, r)

        r = [[x[4], 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, x[4]]]
        l = _multiarray_tests.test_neighborhood_iterator(
            x, [-1, 1], x[4], NEIGH_MODE['constant'])

In general, I found the size-0 problem to be related to flexible data types such as np.dtype('V'), or more specifically to PyArray_Descr.elsize:

> For data types that are always the same size (such as long), this holds the size of the data type. For flexible data types where different arrays can have a different element size, this should be 0.

Which other examples of data types with size 0 can be tested? For the tests above, when I changed the data type, either the tested C function was not called for the new dt, or an error was raised stating that the dt is not supported or that the conversion did not succeed, so it was quite confusing.

  • For the file ctors.c, the new uncovered paths led to creating an array from a string or a file, for example:
    def test_bool_fromstring(self):
        v = np.array([True, False, True, False], dtype=np.bool_)
        y = np.fromstring('1 0 -2.3 0.0', sep=' ', dtype=np.bool_)
        assert_array_equal(v, y)

such that np.fromstring calls PyArray_FromString, which has a check for itemsize == 0 (where itemsize = dtype->elsize). It then proceeds to the array_from_text function, which contains:

if (num < 0 && thisbuf == size) {
    totalbytes += bytes;
    if (totalbytes == 0) {
        break;
    }
    tmp = PyDataMem_RENEW(PyArray_DATA(r), totalbytes);

and since totalbytes = bytes = size * dtype->elsize, I couldn't find a way for it to be 0: dtype->elsize == 0 was checked earlier, and if size were 0 then num would have to be 0 as well, yet the if condition requires num < 0 (restated as code below).
The other uncovered size checks in ctors.c were similar, but maybe I missed something.
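To restate that dead-path argument as code, here is a simplified model of the accounting above (names mirror the snippet; this is a sketch of my reading of array_from_text, not the real function):

    #include <assert.h>
    #include <stddef.h>

    /* num < 0 means "read until EOF" (the branch in question);
     * size is the number of elements read into the current buffer;
     * elsize is dtype->elsize, already checked nonzero upstream. */
    static void
    model_renew_size(long num, size_t size, size_t elsize)
    {
        assert(elsize != 0);        /* rejected by PyArray_FromString */
        size_t bytes = size * elsize;
        size_t totalbytes = bytes;
        if (num < 0) {
            /* size == 0 would have produced num == 0, not num < 0,
             * so this branch only runs with size > 0, and therefore
             * totalbytes = size * elsize > 0. */
            assert(totalbytes != 0);
        }
    }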

So yeah, I put some effort into tackling this issue, but I feel I have reached a dead end with no clear strategy for continuing.
I'd appreciate any guidance from here.
Thank you!

@eric-wieser (Member) commented

> Which other examples of data types with size 0 can be tested?

np.dtype([]) is a good one. Unfortunately a lot of places misinterpret that dtype. I have a local patch that corrects the interpretation, but it exposed the very set of bugs you're looking at here.
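From the C side, that dtype looks roughly like this (a sketch assuming an extension module where import_array() has already been called; error handling trimmed):

    #include <Python.h>
    #include <numpy/arrayobject.h>

    /* Build the C-level descriptor for np.dtype([]) and report its
     * elsize; the 0 it returns is exactly the value that used to
     * reach the allocators this PR now guards. */
    static int
    empty_struct_elsize(void)
    {
        PyObject *spec = PyList_New(0);     /* the [] in np.dtype([]) */
        PyArray_Descr *descr = NULL;
        if (spec == NULL || !PyArray_DescrConverter(spec, &descr)) {
            Py_XDECREF(spec);
            return -1;
        }
        Py_DECREF(spec);
        int elsize = descr->elsize;         /* == 0 for this dtype */
        Py_DECREF(descr);
        return elsize;
    }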

@oribro force-pushed the core-forbid-malloc0 branch from d9ed18a to 8dfbc81 on June 5, 2019 18:20
@oribro (Contributor, Author) commented Jun 5, 2019

OK, so for now I've kept the checks that were covered by the existing tests, and updated the PR.

Inline review comment on this hunk:

if (PyArray_DATA(self) == NULL) {
    fa->nd = 0;
-   fa->data = PyDataMem_NEW(PyArray_DESCR(self)->elsize);
+   fa->data = PyDataMem_NEW(elsize);
Member:

I realize this was here before your PR, but it seems strange. What is this supposed to do? We could not allocate num bytes of memory, so now we allocate elsize instead and raise an error?

Your code looks "correct", but the whole idea looks wrong. Maybe it is a separate issue; this code somehow slipped in during different refactorings?

@oribro (Contributor, Author):

I agree, this is quite odd.
I can add that this NULL check was not covered by the tests even before my changes, and I found it difficult to reproduce.

Member:

opened #13724 to clean up the code not related to this PR

@mattip (Member) commented Jun 6, 2019

LGTM. Anyone else want to take a look?

@eric-wieser self-requested a review on June 6, 2019 15:33
@mattip (Member) commented Jun 9, 2019

Merging #13724 seems to have caused a conflict. @oribro, can you rebase off master and force-push?

@oribro force-pushed the core-forbid-malloc0 branch from 8dfbc81 to 03052b7 on June 9, 2019 20:17
@oribro (Contributor, Author) commented Jun 9, 2019

@mattip Sure

@mattip merged commit 31bb4ca into numpy:master on Jun 11, 2019
@mattip (Member) commented Jun 11, 2019

Coverage is all green now, in it goes. Thanks @oribro

Inline review comment on this hunk:

else {
    PyArray_DIMS(r)[0] = *nread;
    ((PyArrayObject_fields *)r)->data = tmp;
    const size_t nsize = PyArray_MAX(*nread,1)*dtype->elsize;
Member:

I think this MAX was actually a previous failed attempt to prevent this.
