Skip to content

MAINT fix cython 3 compilation errors #26771

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 6, 2023

Conversation

jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented Jul 5, 2023

Cython compilation errors seen in scipy dev build with cython 3.0.0rc1.dev0.

  • enum with non base 10 int
cdef enum:
    # Max value for our rand_r replacement (near the bottom).
    # We don't use RAND_MAX because it's different across platforms and
    # particularly tiny on Windows/MSVC.
    RAND_R_MAX = 0x7FFFFFFF
    ^
------------------------------------------------------------

sklearn/tree/_utils.pxd:26:4: Compiler crash in AnalyseDeclarationsTransform

File 'ModuleNode.py', line 203, in analyse_declarations: ModuleNode(_utils.pxd:1:0,
    full_module_name = 'sklearn.tree._utils',
    is_pxd = True)
File 'Nodes.py', line 393, in analyse_declarations: StatListNode(_utils.pxd:11:0)
File 'Nodes.py', line 393, in analyse_declarations: StatListNode(_utils.pxd:22:5)
File 'Nodes.py', line 1728, in analyse_declarations: CEnumDefNode(_utils.pxd:22:5,
    in_pxd = True,
    visibility = 'private')
File 'Nodes.py', line 1786, in analyse_enum_declarations: CEnumDefItemNode(_utils.pxd:26:4,
    name = 'RAND_R_MAX')

Compiler crash traceback from this point on:
  File "/usr/share/miniconda/envs/testvenv/lib/python3.11/site-packages/Cython/Compiler/Nodes.py", line 1786, in analyse_enum_declarations
    enum_value = int(self.value.value)
                 ^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: '0x7FFFFFFF'
  • non matching function declaration
Error compiling Cython file:
------------------------------------------------------------
...
        double lower_bound,
        double upper_bound,
    ) noexcept nogil:
        pass

    cdef inline bint _check_monotonicity(
         ^
------------------------------------------------------------

sklearn/tree/_criterion.pyx:232:9: Signature not compatible with previous declaration

Error compiling Cython file:
------------------------------------------------------------
...
            self,
            cnp.int8_t monotonic_cst,
            double lower_bound,
            double upper_bound,
    ) noexcept nogil
    cdef inline bint _check_monotonicity(
                                        ^
------------------------------------------------------------

sklearn/tree/_criterion.pxd:92:40: Previous declaration is here

@github-actions
Copy link

github-actions bot commented Jul 5, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 5b23a36. Link to the linter CI: here

@jeremiedbb jeremiedbb changed the title MAINT fix cython enum use base 10 int MAINT fix cython 3 compilation errors Jul 5, 2023
@lesteve
Copy link
Member

lesteve commented Jul 5, 2023

Just curious, do you understand the reason for the cython enum?

@lesteve
Copy link
Member

lesteve commented Jul 5, 2023

Also is it worth reporting to cython as a breaking change (even if the fix is straightforward)?

@jeremiedbb
Copy link
Member Author

Just curious, do you understand the reason for the cython enum?

Not deeply. It looks like cython is trying to do int("0x7FFFFFFF") which raises an error because it expects a base 10 int. int("0x7FFFFFFF", 16) would work. Anyway I find that directly writing the value in base 10 is better, so I'm happy to make this change without complaining :)

Copy link
Member

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont‘t like that we define the same constant 3 times, but that‘s not the point of this PR.

RAND_R_MAX = 0x7FFFFFFF
# It corresponds to the maximum representable value for
# 32-bit signed integers (i.e. 2^31 - 1).
RAND_R_MAX = 2147483647
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you compares, if cd gives the same results?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc tests are meant to ensure that. Like the example in the docstring of GraphicalLasso, which uses enet_coordinate_descent.

also:

In [4]: %%cython
   ...: cdef enum:
   ...:     R = 0x7FFFFFFF
   ...: print(R)
   ...:
2147483647

@jeremiedbb
Copy link
Member Author

I dont‘t like that we define the same constant 3 times

me neither. I tried to resolve that and get rid of the enum a while ago and it didn't go well at all :)

@lesteve
Copy link
Member

lesteve commented Jul 6, 2023

OK let's merge this one!

@lesteve lesteve merged commit 8abf663 into scikit-learn:main Jul 6, 2023
punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Sep 18, 2023
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants