Skip to content

BUG: fix some errors raised when minlength is incorrect in np.bincount #4542

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 27, 2014

Conversation

immerrr
Copy link
Contributor

@immerrr immerrr commented Mar 24, 2014

I've encountered weird exceptions raised from np.bincount when minlength is incorrect. It turns out, PyArray_PyIntAsIntp return values were handled incorrectly in that function, this PR is an attempt to fix this.

I'd also argue that minlength=0 is a valid input, but that would be an API change and hence worth a separate discussion.

Before

In [1]: np.bincount([1,2,3], minlength="foobar")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-6edd4f5aca9f> in <module>()
----> 1 np.bincount([1,2,3], minlength="foobar")

ValueError: minlength must be positive

In [2]: np.bincount([], minlength="foobar")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-fb3e912f6d10> in <module>()
----> 1 np.bincount([], minlength="foobar")

ValueError: negative dimensions are not allowed

In [3]: np.bincount([], minlength=0)
---------------------------------------------------------------------------
SystemError                               Traceback (most recent call last)
<ipython-input-3-555b8e15ec70> in <module>()
----> 1 np.bincount([], minlength=0)

SystemError: error return without exception set

After

In [1]: np.bincount([1,2,3], minlength="foobar")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-6edd4f5aca9f> in <module>()
----> 1 np.bincount([1,2,3], minlength="foobar")

TypeError: an integer is required

In [2]: np.bincount([], minlength="foobar")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-fb3e912f6d10> in <module>()
----> 1 np.bincount([], minlength="foobar")

TypeError: an integer is required

In [3]: np.bincount([], minlength=0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-555b8e15ec70> in <module>()
----> 1 np.bincount([], minlength=0)

ValueError: minlength must be positive

else if (!(minlength = PyArray_PyIntAsIntp(mlength))) {
if (mlength == Py_None) {
minlength = 0;
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
else {

@charris
Copy link
Member

charris commented Mar 24, 2014

Looks OK, needs some tests. @jaimefrio has been working on this function also.

@jaimefrio
Copy link
Member

In that humongous PR #4330 I put together, I did something similar, see here, but I don't think my approach is superior to this one. So it also LGTM.

As for the tests, I also put together some tests for errors in that PR, see here, feel free to copy/paste or use them as inspiration.

@charris
Copy link
Member

charris commented Mar 24, 2014

@immerrr After the tests go in, squash Minor fixes into the first commit.

@immerrr
Copy link
Contributor Author

immerrr commented Mar 25, 2014

@charris sure.

I intended to check with "assert_raises_regexp" instead of "assert_raises", because some errors had incorrect messages. Is this OK to pass this function from nose.tools via numpy.testing.utils ?

@charris
Copy link
Member

charris commented Mar 25, 2014

I was thinking you could just check the error type, but checking the messages might be better. The potential problem with checking messages is that they are more likely to get edited, but that probably wouldn't be hard to fix when needed.

I don't see a problem with exporting assert_raises_regexp from testing.utils, it's what we do with assert_raises and assert_raises_regexp might be generally useful.

@charris
Copy link
Member

charris commented Mar 25, 2014

Stack exchange yields this little nugget,

    with assert_raises(HTTPError) as cm:
         call_your_method(p1, p2)
    ex = cm.exception # raised exception is available through exception property of context
    ok_(excep.code == 401, 'HTTPError should be Unauthorized!')

That's neat. We might want to directly export assert_raises so we can use it as a context manager. The numpy testing utils were put together before context managers were available and as far as I can see, the only advantage of the numpy version is documentation and lazy import. Since the function is already widely used, I don't see that lazy import buys anything.

@charris
Copy link
Member

charris commented Mar 25, 2014

Some experimentation

In [46]: from nose.tools import assert_raises, assert_raises_regexp

In [47]: with assert_raises_regexp(ValueError, 'foo'):
   ....:     raise ValueError('foo')
   ....: 

In [48]: with assert_raises_regexp(ValueError, 'bar'):
   ....:     raise ValueError('foo')
   ....: 
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-48-e0e351ddfb93> in <module>()
      1 with assert_raises_regexp(ValueError, 'bar'):
----> 2     raise ValueError('foo')

/usr/lib64/python2.7/unittest/case.pyc in __exit__(self, exc_type, exc_value, tb)
    165         if not expected_regexp.search(str(exc_value)):
    166             raise self.failureException('"%s" does not match "%s"' %
--> 167                      (expected_regexp.pattern, str(exc_value)))
    168         return True
    169 

AssertionError: "bar" does not match "foo"

@immerrr
Copy link
Contributor Author

immerrr commented Mar 25, 2014

I was thinking you could just check the error type, but checking the messages might be better. The potential problem with checking messages is that they are more likely to get edited, but that probably wouldn't be hard to fix when needed.

Agreed on that. My usual problem with assert_raises is that there's not many exceptions to describe behaviour contract violations (ValueError, TypeError and RuntimeError as last resort) unless time is invested in devising a separate exception hierarchy. And when only checking types, it sometimes happens that the expected exception is "shadowed" by an unexpected one from elsewhere.

Stack exchange yields this little nugget,

Some experimentation

I usually don't like the extra noise added by with-statement which kind of stands out of the regular assertions. Luckily, assert_raises_regexp supports the same mode of operation as assert_raises:

assert_raises_regexp(Exception, "pattern", call_your_method, p1, p2)
# or
assert_raises_regexp(Exception, "pattern", lambda: call_your_method(p1, p2))

I'm going to add necessary tests tonight.

@charris
Copy link
Member

charris commented Mar 25, 2014

Great.

@charris
Copy link
Member

charris commented Mar 25, 2014

Tests are failing, looks like a failure to import nose correctly. I suspect it has to do with the import_nose function. Looks like a lazy import and in this case I'm not sure why we don't just import it.

@charris
Copy link
Member

charris commented Mar 25, 2014

However, for some reason it seems to work for python 2.7.

@charris
Copy link
Member

charris commented Mar 25, 2014

Looks like nose 1.3.0 is using a deprecated function in assert_raises_regexp. Evidently python 2.7 isn't raising that warning. Grrr @NoSe.

@@ -1546,6 +1546,23 @@ def test_empty_with_minlength(self):
y = np.bincount(x, minlength=5)
assert_array_equal(y, np.zeros(5, dtype=int))

def test_with_incorrect_minlength(self):
x = np.array([], dtype=int)
assert_raises_regexp(TypeError, "an integer is required",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is raising a deprecation warning in nose 1.3.0 which numpy testing turns into an exception. Try self.assertRaisesRegexp instead.

@charris
Copy link
Member

charris commented Mar 25, 2014

Nuts. The assertRaisesRegexp method is new in Python 2.7, so won't work in Python 2.6 in anycase. I'm thinking the assert_raises context manager might be useful here as you can easily get the exception with message

In [5]: with assert_raises(ValueError) as e:
   ...:     raise ValueError('foo')
   ...: 

In [6]: e.exception
Out[6]: ValueError('foo')

Or maybe it is easier to implement our own assert_raises_regexp.

@charris
Copy link
Member

charris commented Mar 26, 2014

Looks like we can copy the function from unittest. We need

class _AssertRaisesContext(object):
    """A context manager used to implement TestCase.assertRaises* methods."""

    def __init__(self, expected, test_case, expected_regexp=None):
        self.expected = expected
        self.failureException = test_case.failureException
        self.expected_regex = expected_regexp

    def __enter__(self):
        pass

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None:
            try:
                exc_name = self.expected.__name__
            except AttributeError:
                exc_name = str(self.expected)
            raise self.failureException(
                "{0} not raised".format(exc_name))
        if not issubclass(exc_type, self.expected):
            # let unexpected exceptions pass through
            return False
        if self.expected_regex is None:
            return True

        expected_regexp = self.expected_regex
        if isinstance(expected_regexp, basestring):
            expected_regexp = re.compile(expected_regexp)
        if not expected_regexp.search(str(exc_value)):
            raise self.failureException('"%s" does not match "%s"' %
                     (expected_regexp.pattern, str(exc_value)))
        return True

And

    def assertRaisesRegexp(self, expected_exception, expected_regexp,
                           callable_obj=None, *args, **kwargs):
        """Asserts that the message in a raised exception matches a regexp.

        Args:
            expected_exception: Exception class expected to be raised.
            expected_regexp: Regexp (re pattern object or string) expected
                    to be found in error message.
            callable_obj: Function to be called.
            args: Extra args.
            kwargs: Extra kwargs.
        """
        context = _AssertRaisesContext(expected_exception, self, expected_regexp)
        if callable_obj is None:
            return context
        with context:
            callable_obj(*args, **kwargs)

In Python 3.1 and above this is called assertRegex, which is why assertRaisesRegexp is deprecated on those test machines. The code above is from the python 2.7 beta, so might be worth checking a more current version.

@immerrr
Copy link
Contributor Author

immerrr commented Mar 26, 2014

Ok, will rename to assert_raises_regex and provide backward-compatibility wrappers.

But this really looks like it can be fixed at nose level to avoid all these python-specific hacks in each project...

@charris
Copy link
Member

charris commented Mar 26, 2014

The main problem is that nose uses unittest and the function isn't available in Python 2.6. Python 2.6 is in both RHEL and one of the Ubuntu long term support releases, so we will need to support it for maybe four more years.

@charris
Copy link
Member

charris commented Mar 26, 2014

Supposedly the six compatibility package would take care of the python 3 problem, but I didn't find the function in its code. There is a unittest2 package on pypi that backports all the new functions introduced in Python 2.7 to earlier versions, but after chasing this around, it looked easier to just include the code for the one function rather than get another dependency.

@charris
Copy link
Member

charris commented Mar 27, 2014

Well done! Thanks @immerrr.

charris added a commit that referenced this pull request Mar 27, 2014
BUG: fix some errors raised when minlength is incorrect in np.bincount
@charris charris merged commit 419cb15 into numpy:master Mar 27, 2014
@charris
Copy link
Member

charris commented Mar 27, 2014

BTW, there was also an assertWarnsRegex added in Python 3.2 that might be worth having if you are up for that.

@immerrr
Copy link
Contributor Author

immerrr commented Mar 27, 2014

It is funny that I did mention it in nose-devs/nose#789, but didn't think of bringing its goodness to numpy.

@immerrr immerrr deleted the fix-bincount-systemerror branch March 27, 2014 07:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants