Skip to content

TST Add TODO for global_dtype in sklearn/tree/tests/test_tree.py #22926

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from

Conversation

adam2392
Copy link
Member

@adam2392 adam2392 commented Mar 23, 2022

Reference Issues/PRs

See: #22881 (comment)

What does this implement/fix? Explain your changes.

Adds an inline comment for TODO behavior change for global_dtype tests in trees in the future.

Any other comments?

cc: @jjerphan

Copy link
Member

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR, @adam2392.

Whilst tree-based models do not preserve dtypes, we must still test that they work on 32bit datasets -- #22881 can still be addressed for this case.

@adam2392 adam2392 requested a review from jjerphan March 24, 2022 15:39
Copy link
Member

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adam2392: here are some precisions.

Actually, I think it would be better to address #22881 for the two files you modified in two independent pull requests, taking the changes you made.

adam2392 and others added 2 commits March 24, 2022 13:13
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
@adam2392 adam2392 changed the title TST Add TODO for global_dtype in sklearn/tree/tests/test_tree.py and sklearn/ensemble/tests/test_forest.py TST Add TODO for global_dtype in sklearn/tree/tests/test_tree.py Mar 24, 2022
@adam2392 adam2392 requested a review from jjerphan March 24, 2022 17:16
Copy link
Member

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience, @adam2392.

We decided to put such PRs on hold if attribute dtype aren't preserved estimator fitted attributes (which is the case for DecisionTree-based estimator).

@adam2392
Copy link
Member Author

Hi @jjerphan ah okay.

what is needed to preserve the dtype of the decision trees? Perhaps I can make a PR?

@jjerphan
Copy link
Member

The internals have been written in Cython for float64 (double). At a first sight, datastructures need to be to be ported to float32 but that's not trivial, see the structures' definition: https://github.com/scikit-learn/scikit-learn/blob/3850935ea610b5231720fdf865c837aeff79ab1b/sklearn/tree/_tree.pxd

@adam2392 adam2392 closed this Mar 8, 2023
@adam2392 adam2392 deleted the globaltree branch March 8, 2023 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants