[NOMRG] MNT Check travis CRON job #18644

Closed
wants to merge 4 commits

Conversation

alfaro96
Member

Trying to see why the travis CRON job is failing.

@alfaro96 changed the title from "[NOMRG] MNT Check travis CRON job" to "[scipy-dev] [NOMRG] MNT Check travis CRON job" on Oct 19, 2020
@alfaro96
Member Author

alfaro96 commented Oct 20, 2020

The problem is related to downcasting of NumPy data types.

For instance, the following line:

self.X_stride = X.strides[0] / X.itemsize

throws this error:

TypeError: 'float' object cannot be interpreted as an integer

The data type of X_stride is:

cdef np.npy_intp X_stride

and changing np.npy_intp to int solves the issue.
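
For reference, here is a minimal pure-Python illustration of the same error message (an assumption about the mechanism, not the actual Cython code path): asking Python to interpret a float as an integer raises exactly this TypeError.

import operator

operator.index(4.0)
# TypeError: 'float' object cannot be interpreted as an integer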

I have not found anything related in the NumPy or Cython issue trackers.

Pinging @thomasjpfan, who previously investigated the travis CRON job.

@alfaro96
Member Author

alfaro96 commented Oct 20, 2020

The error is unrelated to the NumPy development version. The latest stable version (numpy==1.19.2) fails too.

Edit: The travis CRON job works for numpy<=1.19.1.

@alfaro96
Member Author

The breaking change is numpy/numpy#16986.

@alfaro96
Member Author

alfaro96 commented Oct 22, 2020

Hey @mattip,

Sorry to bother you. Do you know why this is happening?

I identified that numpy/numpy#16986 introduces the incompatibility, but, honestly, I am not familiar with the NumPy code. I know that the issue is that downcasting is not working for NumPy data types (see #18644 (comment)).

I would like your opinion on this.

Thank you so much!

@mattip

mattip commented Oct 22, 2020

The offending operation x / y is producing a float. Maybe previously python/cython would convert it silently to an int, and now it refuses. You could try x // y instead, if you are sure it will not change the output value.
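
A minimal sketch of this suggestion (the variable names are illustrative; the values are the stride and itemsize reported later in this thread):

strides0, itemsize = 32, 8
print(strides0 / itemsize)   # 4.0 -> true division always returns a float
print(strides0 // itemsize)  # 4   -> floor division returns an int
assert strides0 % itemsize == 0  # both give the same value only when itemsize divides strides0 exactly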

@mattip

mattip commented Oct 22, 2020

Did the python version get upgraded to 3.9?

@alfaro96
Member Author

alfaro96 commented Oct 23, 2020

Thank you @mattip for the quick reply, I really appreciate it.

Did the python version get upgraded to 3.9?

No, I am using the following Python environment:

>>> sklearn.show_versions()

System:
    python: 3.6.12 (default, Sep 10 2020, 18:03:29)  [GCC 8.3.0]
executable: /usr/local/bin/python
   machine: Linux-4.19.76-linuxkit-x86_64-with-debian-10.5

Python dependencies:
          pip: 20.2.3
   setuptools: 50.3.0
      sklearn: 0.24.dev0
        numpy: 1.19.2
        scipy: 1.4.1
       Cython: 3.0a6
       pandas: 1.0.3
   matplotlib: 3.2.1
       joblib: 0.15.1
threadpoolctl: 2.0.0

Built with OpenMP: True

However, note the behavior with the following environment, where only numpy is downgraded to 1.19.1:

>>> sklearn.show_versions()

System:
    python: 3.6.12 (default, Sep 10 2020, 18:03:29)  [GCC 8.3.0]
executable: /usr/local/bin/python
   machine: Linux-4.19.76-linuxkit-x86_64-with-debian-10.5

Python dependencies:
          pip: 20.2.3
   setuptools: 50.3.0
      sklearn: 0.24.dev0
        numpy: 1.19.1
        scipy: 1.4.1
       Cython: 3.0a6
       pandas: 1.0.3
   matplotlib: 3.2.1
       joblib: 0.15.1
threadpoolctl: 2.0.0

Built with OpenMP: True

With it, the following code snippet works (the data type of X_stride is np.npy_intp):

self.X_stride = X.strides[0] / X.itemsize

I have compared numpy==1.19.1 and numpy==1.19.2, and I suspect that the breaking change could be numpy/numpy#16986, because it is the only one modifying the NumPy declarations for Cython.

WDYT?

@mattip

mattip commented Oct 23, 2020

Something else is going on. The numpy declarations in #16986 are identical to the cython ones: strides has been a npy_intp pointer forever. Could you somehow access the value of X.strides[0] and X.itemsize at the point of failure? Maybe somehow 1.19.2 has changed one of the values. Which test is failing?

I don't understand the comment, what is the intent of the word "downcasting"? All the values involved in self.X_stride = X.strides[0] / X.itemsize are now and have always been npy_intp, no?

@alfaro96
Member Author

alfaro96 commented Oct 23, 2020

Something else is going on. The numpy declarations in #16986 are identical to the cython ones: strides has been a npy_intp pointer forever. Could you somehow access the value of X.strides[0] and X.itemsize at the point of failure? Maybe somehow 1.19.2 has changed one of the values. Which test is failing?

For instance, sklearn/utils/tests/test_seq_dataset.py::test_seq_dataset_basic_iteration[make_dense_dataset_64] test is failing for numpy==1.19.2 and working for numpy==1.19.1.

The values of X.strides[0] and X.itemsize are 32 and 8, respectively. The division (X.strides[0] / X.itemsize) returns 4.0. These values are the same for both numpy==1.19.1 and numpy==1.19.2.
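
For illustration, these values can be reproduced with a small NumPy snippet (a sketch assuming an arbitrary C-contiguous float64 array with 4 columns, which yields the same numbers):

import numpy as np

X = np.zeros((10, 4), dtype=np.float64)  # arbitrary C-contiguous float64 array
print(X.strides[0], X.itemsize)          # 32 8
print(X.strides[0] / X.itemsize)         # 4.0 (a Python float, not an int)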

I don't understand the comment, what is the intent of the word "downcasting"?

By downcasting I mean casting from a floating-point value to an integer.

All the values involved in self.X_stride = X.strides[0] / X.itemsize are now and have always been npy_intp, no?

Yes, the data types of X.strides[0] and X.itemsize are np.npy_intp in both numpy==1.19.1 and numpy==1.19.2.

My main concern is that although X.strides[0] / X.itemsize returns a floating-point value with both numpy==1.19.1 and numpy==1.19.2, only with numpy==1.19.2 is the following error raised:

TypeError: 'float' object cannot be interpreted as an integer

@mattip

mattip commented Oct 25, 2020

For instance, sklearn/utils/tests/test_seq_dataset.py::test_seq_dataset_basic_iteration[make_dense_dataset_64] test is failing for numpy==1.19.2 and numpy==1.19.1.

The travis CRON job works for numpy<=1.19.1.

These statements seem to contradict each other.

@alfaro96
Member Author

alfaro96 commented Oct 25, 2020

For instance, sklearn/utils/tests/test_seq_dataset.py::test_seq_dataset_basic_iteration[make_dense_dataset_64] test is failing for numpy==1.19.2 and numpy==1.19.1.

The travis CRON job works for numpy<=1.19.1.

These statements seem to contradict each other.

I have edited the reply. I wanted to say:

For instance, sklearn/utils/tests/test_seq_dataset.py::test_seq_dataset_basic_iteration[make_dense_dataset_64] test is failing for numpy==1.19.2 and working for numpy==1.19.1.

@alfaro96
Member Author

FYI, I have run two workflows: one with numpy==1.19.1 (working) and another with numpy==1.19.2 (failing).

@alfaro96
Member Author

We have decided to apply an explicit cast of the floating-point values to integers.
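
A minimal sketch of such an explicit cast (illustrative only; the exact change applied to the scikit-learn sources may differ):

import numpy as np

X = np.zeros((10, 4), dtype=np.float64)
X_stride = int(X.strides[0] / X.itemsize)  # 4 as an int, instead of the float 4.0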

Thank you @mattip for your patience and support, I really appreciate your help with this issue!

@alfaro96 alfaro96 closed this Oct 30, 2020
@alfaro96 alfaro96 deleted the check_travis_cron_job branch October 30, 2020 08:02