ENH use log1p and expm1 in Yeo-Johnson transformation and its inverse #27868


Closed
wants to merge 2 commits into from

Conversation

xuefeng-xu
Contributor

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This PR is inspired by scipy's Yeo-Johnson transformation, and it also implements the inverse transform.
https://github.com/scipy/scipy/blob/fcf7b652bc27e47d215557bda61c84d19adc3aae/scipy/stats/_morestats.py#L1495-L1516

Specifically, if $\lambda=1$, the transformation reduces to the identity, so we can skip the computation and return $x$ directly.

Any other comments?

The formula of the YJ transformation:

$$
\psi(x, \lambda) =
\begin{cases}
\dfrac{(x+1)^{\lambda} - 1}{\lambda} & \lambda \neq 0,\ x \geq 0 \\
\log(x+1) & \lambda = 0,\ x \geq 0 \\
-\dfrac{(-x+1)^{2-\lambda} - 1}{2-\lambda} & \lambda \neq 2,\ x < 0 \\
-\log(-x+1) & \lambda = 2,\ x < 0
\end{cases}
$$
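A minimal sketch of the `log1p`/`expm1` rewrite this PR describes, for both directions (function names here are illustrative, not scikit-learn's actual `_yeo_johnson_transform`): since $(x+1)^\lambda = e^{\lambda \log(1+x)}$, each power branch becomes `expm1(lmbda * log1p(x)) / lmbda`, which is more accurate near $x = 0$.

```python
import numpy as np

def yeo_johnson(x, lmbda):
    """Yeo-Johnson transform via log1p/expm1 (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    if lmbda == 1:  # transform reduces to the identity
        return x.copy()
    out = np.empty_like(x)
    pos = x >= 0
    if lmbda == 0:
        out[pos] = np.log1p(x[pos])
    else:  # ((x+1)**lmbda - 1) / lmbda, computed stably
        out[pos] = np.expm1(lmbda * np.log1p(x[pos])) / lmbda
    if lmbda == 2:
        out[~pos] = -np.log1p(-x[~pos])
    else:  # -(((-x+1)**(2-lmbda)) - 1) / (2-lmbda), computed stably
        out[~pos] = -np.expm1((2 - lmbda) * np.log1p(-x[~pos])) / (2 - lmbda)
    return out

def yeo_johnson_inverse(y, lmbda):
    """Inverse transform; relies on the YJ transform preserving sign."""
    y = np.asarray(y, dtype=float)
    if lmbda == 1:
        return y.copy()
    x = np.empty_like(y)
    pos = y >= 0
    if lmbda == 0:
        x[pos] = np.expm1(y[pos])
    else:  # x = (lmbda*y + 1)**(1/lmbda) - 1
        x[pos] = np.expm1(np.log1p(lmbda * y[pos]) / lmbda)
    if lmbda == 2:
        x[~pos] = -np.expm1(-y[~pos])
    else:  # x = -((1 - (2-lmbda)*y)**(1/(2-lmbda)) - 1)
        x[~pos] = -np.expm1(np.log1p(-(2 - lmbda) * y[~pos]) / (2 - lmbda))
    return x
```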


✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 92bba12.

@s-banach
Contributor

If _yeo_johnson_transform accepted an optional out parameter, then _yeo_johnson_optimize could reuse the same out array every time it calls _yeo_johnson_transform and reduce some array allocations.

I do this in my own personal work to speed up Yeo-Johnson; maybe you could add it here along with this optimization?

@xuefeng-xu
Contributor Author

Thanks, @s-banach. But scikit-learn will later switch to scipy.stats.yeojohnson for YJ to resolve another issue, see #26308, so I will probably keep the current implementation as is. Maybe you could open a PR at scipy?

@lorentzenchr
Member

As explained in #26308, we want to (sooner or later) rely on scipy and get rid of our own implementation.
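For reference, the scipy function the maintainers plan to delegate to can be called in two modes; this sketch assumes scipy is installed and uses only the documented `scipy.stats.yeojohnson` signature:

```python
import numpy as np
from scipy import stats

x = np.array([-1.5, 0.0, 2.0, 5.0])

# Fixed lambda: returns just the transformed array.
# lambda = 1 is the identity transform.
y = stats.yeojohnson(x, lmbda=1.0)

# No lambda given: scipy estimates it by maximum likelihood
# and returns (transformed array, fitted lambda).
y_mle, lmbda = stats.yeojohnson(x)
```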

3 participants