-
-
Notifications
You must be signed in to change notification settings - Fork 26k
ENH Use scipy Yeo-Johnson implementation in PowerTransformer for scipy >= 1.9 #31227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Use scipy Yeo-Johnson implementation in PowerTransformer for scipy >= 1.9 #31227
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yaichm @smarie @eroussama @Dlimim @HamzaLuffy @AmineHannoun @xuefeng-xu thanks for this nice PR. Seems solid. Only a few nitpicks.
sklearn/utils/fixes.py
Outdated
@@ -80,6 +81,39 @@ def _sparse_linalg_cg(A, b, **kwargs): | |||
return scipy.sparse.linalg.cg(A, b, **kwargs) | |||
|
|||
|
|||
#TODO : remove this when minimum version of scipy >= 1.9.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#TODO : remove this when minimum version of scipy >= 1.9.0 | |
# TODO: Remove this when minimum version of scipy >= 1.9.0, also remove | |
# sklearn.preprocessing._yeo_johnson_optimize and use scipy.stats.yeojohnson instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we still need _yeo_johnson_optimize
since we need to get rid of the nan
, see also _box_cox_optimize
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
@smarie Could you instruct how to use linting tools and the CI reports? |
doc/whats_new/upcoming_changes/sklearn.preprocessing/31227.fix.rst
Outdated
Show resolved
Hide resolved
Yes, good catch thanks @lorentzenchr . We reviewed the pre-commit tool together from https://pre-commit.com/#4-optional-run-against-all-the-files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a nitpick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there are other small details to fix:
doc/whats_new/upcoming_changes/sklearn.preprocessing/31227.fix.rst
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,5 @@ | |||
- Fixed numerical stability (mostly overflows) of the Yeo-Johnson transform with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ogrisel @lorentzenchr I suggest to explicitly tell about the replacement of implem, what do you think ? Like this:
- Fixed numerical stability (mostly overflows) of the Yeo-Johnson transform with | |
- Now using ``scipy.stats.yeojohnson`` instead of our own implementation of the Yeo-Johnson transform. Fixed numerical stability (mostly overflows) with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We all agree, but I do not have the permission to accept this suggestion myself. @yaichm has to do it I guess.
Once done, we can merge. |
Initial PR by :user: `Xuefeng Xu <@xuefeng-xu>` completed by `Mohamed Yaich<@yaichm>`, | ||
`Oussama Er-rabie<@eroussama>`, `Mohammed Yaslam Dlimi<@Dlimim>`, | ||
`Hamza Zaroual<@HamzaLuffy>`, `Amine Hannoun<@AmineHannoun>` and `Sylvain Marié<@smarie>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need something like this, please double-check the rendering in the changelog by clicking on "check the rendered doc" CI status at the bottom of the PR.
Initial PR by :user: `Xuefeng Xu <@xuefeng-xu>` completed by `Mohamed Yaich<@yaichm>`, | |
`Oussama Er-rabie<@eroussama>`, `Mohammed Yaslam Dlimi<@Dlimim>`, | |
`Hamza Zaroual<@HamzaLuffy>`, `Amine Hannoun<@AmineHannoun>` and `Sylvain Marié<@smarie>`. | |
Initial PR by :user:`Xuefeng Xu <xuefeng-xu>` completed by :user:`Mohamed Yaich <yaichm>`, | |
:user:`Oussama Er-rabie <eroussama>`, :user:`Mohammed Yaslam Dlimi <Dlimim>`, | |
:user:`Hamza Zaroual <HamzaLuffy>`, :user:`Amine Hannoun <AmineHannoun>` and :user:`Sylvain Marié <smarie>`. |
The current rendering is slightly off, see this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the changelog rendering looks good now see this
Thanks for your PR, merging! |
Congratulations, team ! And thanks to the reviewers |
Reference Issues/PRs
This PR is finalizing #27818 to close #26308 and also fix the warning issue reported in #23319 (comment)
Close #27818
What does this implement/fix? Explain your changes.
Use scipy.stats.yeojohnson instead of our own implementation as @lorentzenchr suggested.
Any other comments?