-
-
Notifications
You must be signed in to change notification settings - Fork 26.2k
MAINT Remove -Wcpp warnings when compiling sklearn.cluster._dbscan_inner #24881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT Remove -Wcpp warnings when compiling sklearn.cluster._dbscan_inner #24881
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to include sklearn.clusters._dbscan_inner
in the setup.py
and more precisely in the tuple: USE_NEWEST_NUMPY_C_API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you, @OmarManzoor.
Was this accessible or quite easy to you?
Feel free to create similar PRs for #24875 or more advanced ones (we might have little Cython-related issues, yet many things can be improved in implementations, e.g. resolving TODOs
).
@jjerphan I think this was a good issue to start with. Thank you for creating it. Also thank you for sharing the document link. I did review two other pages on Cython along the lines of memory views. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR!
@@ -5,14 +5,12 @@ | |||
from libcpp.vector cimport vector | |||
cimport numpy as cnp | |||
|
|||
cnp.import_array() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we still use cnp
in this file, it is safer to keep cnp.import_array()
. This is the recommendation from Cython
as well:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting this!
As of Cython>=3.0, it looks like not calling cnp.import_array()
won't be a problem because this piece of C-code (which should be injected at runtime takes care of calling import_array
from NumPy. Still not calling cnp.import_array()
might raise warnings, so let's keep this instruction as you propose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we bring back the cnp.import_array() should we remove sklearn.clusters._dbscan_inner from the USE_NEWEST_NUMPY_C_API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not calling cnp.import_array
is legal and doable if we only use cnp
to access types (e.g. cnp.float64_t
), which we should be able to here.
Moreover, dbscan_inner
is still using NumPy array via cnp.ndarray
. Is it possible to use memoryview instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jjerphan Sorry about that. When I reverted the removal of cnp.import_array I accidentally reverted the other changes too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually changing my review based on the comment made by @thomasjpfan: to me, the reliance on calling cnp.import_array
still needs to be discussed in the last unresolved thread.
@@ -5,14 +5,12 @@ | |||
from libcpp.vector cimport vector | |||
cimport numpy as cnp | |||
|
|||
cnp.import_array() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not calling cnp.import_array
is legal and doable if we only use cnp
to access types (e.g. cnp.float64_t
), which we should be able to here.
Moreover, dbscan_inner
is still using NumPy array via cnp.ndarray
. Is it possible to use memoryview instead.
I leave @thomasjpfan or @glemaitre merge since they also have reviewed this PR. |
…scikit-learn into cython_dbscan_inner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Reference Issues/PRs
Towards #24875
What does this implement/fix? Explain your changes.
Any other comments?