Merged
37 commits
1cf77b0
fix issue 28946
rnmourao Jun 19, 2024
cbac667
lint
rnmourao Jun 20, 2024
8ee0689
Merge branch 'scikit-learn:main' into issue_28946
maf-rnmourao Jun 20, 2024
26750a6
Merge branch 'main' into issue_28946
maf-rnmourao Jun 21, 2024
019238c
Merge branch 'main' into issue_28946
maf-rnmourao Jun 23, 2024
0d63296
Merge branch 'main' into issue_28946
maf-rnmourao Jul 15, 2024
9a9b7d8
Merge branch 'main' into issue_28946
maf-rnmourao Jul 22, 2024
0916652
Merge branch 'main' into issue_28946
maf-rnmourao Aug 15, 2024
4919f1a
Merge branch 'main' into issue_28946
maf-rnmourao Sep 9, 2024
36339e5
Merge branch 'scikit-learn:main' into issue_28946
maf-rnmourao Oct 30, 2024
47037a8
Update sklearn/preprocessing/tests/test_data.py
maf-rnmourao Nov 23, 2024
2cb8a3e
Update sklearn/preprocessing/tests/test_data.py
maf-rnmourao Nov 23, 2024
3d1ded9
Update sklearn/preprocessing/tests/test_data.py
maf-rnmourao Nov 23, 2024
5b4f6b6
Merge branch 'main' into issue_28946
maf-rnmourao Nov 23, 2024
15007e2
Merge branch 'main' into issue_28946
maf-rnmourao Nov 25, 2024
587eb12
refined warning message for NaNs in inverse transform
rnmourao Nov 25, 2024
01f6aaf
linting fixes
rnmourao Nov 25, 2024
6a98c89
linting fixes
rnmourao Nov 26, 2024
2552e2a
fix whats new number
rnmourao Nov 26, 2024
e82d244
Merge branch 'main' into issue_28946
maf-rnmourao Nov 26, 2024
2c95673
adjust the with nest logic
rnmourao Nov 28, 2024
c81d5fc
Merge branch 'main' into issue_28946
maf-rnmourao Nov 28, 2024
c039ee2
Update doc/whats_new/upcoming_changes/sklearn.preprocessing/29307.enh…
maf-rnmourao Dec 4, 2024
cd5d8e4
added TransformationFailedWarning; light check for Yeo-Johnson invers…
rnmourao Dec 4, 2024
68dfc84
Merge branch 'main' into issue_28946
maf-rnmourao Dec 5, 2024
5041cf9
replaced TransformFailedWarning with UserWarning
rnmourao Dec 5, 2024
0b6de10
checking all warnings
rnmourao Dec 6, 2024
5f44202
a more elegant test
rnmourao Dec 6, 2024
4814802
Merge branch 'main' into issue_28946
maf-rnmourao Dec 6, 2024
1875cb9
Merge branch 'main' into issue_28946
maf-rnmourao Dec 6, 2024
264c4c4
Update doc/whats_new/upcoming_changes/sklearn.preprocessing/29307.enh…
maf-rnmourao Sep 2, 2025
54fcae3
Update doc/whats_new/upcoming_changes/sklearn.preprocessing/29307.enh…
maf-rnmourao Sep 2, 2025
519abe4
Update sklearn/preprocessing/_data.py
maf-rnmourao Sep 2, 2025
18bdbea
Update sklearn/preprocessing/_data.py
maf-rnmourao Sep 2, 2025
fbf5798
Update sklearn/preprocessing/tests/test_data.py
maf-rnmourao Sep 2, 2025
5b57c0f
Merge branch 'main' into issue_28946
maf-rnmourao Sep 2, 2025
85b2484
lint fixes
rnmourao Sep 2, 2025
@@ -0,0 +1,4 @@
+- The :class:`preprocessing.PowerTransformer` now returns a warning
+  when NaN values are encountered in the inverse transform, `inverse_transform`, typically
+  caused by extremely skewed data.
+  By :user:Roberto Mourao <maf-rnmourao>
Member

@maf-rnmourao the rst syntax is not quite right, would you be kind enough to open a PR with the following change 🙏:

By :user:`Roberto Mourao <maf-rnmourao>`

From the dev website:
[image]

Contributor Author

Hi @lesteve,

Here is the PR: #32093

Best Regards,

18 changes: 15 additions & 3 deletions sklearn/preprocessing/_data.py
@@ -3501,9 +3501,21 @@ def inverse_transform(self, X):
             "yeo-johnson": self._yeo_johnson_inverse_transform,
         }[self.method]
         for i, lmbda in enumerate(self.lambdas_):
-            with np.errstate(invalid="ignore"):  # hide NaN warnings
-                X[:, i] = inv_fun(X[:, i], lmbda)
-
+            with warnings.catch_warnings(record=True) as captured_warnings:
+                with np.errstate(invalid="warn"):
+                    X[:, i] = inv_fun(X[:, i], lmbda)
+            if any(
+                "invalid value encountered in power" in str(w.message)
+                for w in captured_warnings
+            ):
+                warnings.warn(
+                    f"Some values in column {i} of the inverse-transformed data "
+                    f"are NaN. This may be caused by numerical issues in the "
+                    f"transformation process, e.g. extremely skewed data. "
+                    f"Consider inspecting the input data or preprocessing it "
+                    f"before applying the transformation.",
+                    UserWarning,
+                )
         return X

     def _yeo_johnson_inverse_transform(self, x, lmbda):
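The capture-then-inspect pattern used in the hunk above can be tried in isolation. The following is a minimal sketch using only NumPy and the standard library; `power_with_nan_check` is a hypothetical helper written for illustration and is not part of scikit-learn, and the `simplefilter("always")` call is an addition of this sketch (the PR code does not include it) to make sure repeated warnings are always recorded.

```python
import warnings

import numpy as np


def power_with_nan_check(base, exponent):
    """Compute base ** exponent and report whether NumPy flagged invalid values.

    Hypothetical helper for illustration only; not scikit-learn API.
    """
    with warnings.catch_warnings(record=True) as captured:
        # Assumption of this sketch: record every warning, even repeated ones.
        warnings.simplefilter("always")
        with np.errstate(invalid="warn"):
            result = np.power(base, exponent)
    # Inspect the recorded warnings only after leaving the catch block, so a
    # warning raised in response would not itself be swallowed by the recorder.
    flagged = any(
        "invalid value encountered in power" in str(w.message) for w in captured
    )
    return result, flagged


# A negative base with a fractional exponent has no real result, so NumPy
# produces NaN for that entry and emits the "invalid value" RuntimeWarning.
result, flagged = power_with_nan_check(np.array([-8.0, 4.0]), 0.5)
clean_result, clean_flagged = power_with_nan_check(np.array([4.0]), 0.5)
```

Checking the captured list outside the `with warnings.catch_warnings(...)` block matters: a `warnings.warn` call issued while the recorder is still active would be recorded rather than shown to the user.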
18 changes: 18 additions & 0 deletions sklearn/preprocessing/tests/test_data.py
@@ -2760,6 +2760,24 @@ def test_power_transformer_constant_feature(standardize):
assert_allclose(Xt_, X)


+def test_yeo_johnson_inverse_transform_warning():
+    """Check if a warning is triggered when the inverse transformations of the
+    Box-Cox and Yeo-Johnson transformers return NaN values."""
+    trans = PowerTransformer(method="yeo-johnson")
+    x = np.array([1, 1, 1e10]).reshape(-1, 1)  # extreme skew
+    trans.fit(x)
+    lmbda = trans.lambdas_[0]
+    assert lmbda < 0  # Should be negative
+
+    # any value `psi` for which lambda * psi + 1 <= 0 will result in nan due
+    # to lacking support
+    psi = np.array([10]).reshape(-1, 1)
+    with pytest.warns(UserWarning, match="Some values in column"):
+        x_inv = trans.inverse_transform(psi).item()
+
+    assert np.isnan(x_inv)
+
+
 @pytest.mark.skipif(
     sp_version < parse_version("1.12"),
     reason="scipy version 1.12 required for stable yeo-johnson",
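The comment in the test about `lambda * psi + 1` can be checked by hand. For non-negative inputs and a non-zero lambda, the Yeo-Johnson inverse is `(lambda * psi + 1) ** (1 / lambda) - 1`, so with a negative lambda the base of the power becomes negative once `psi` exceeds `-1 / lambda`. The sketch below assumes `lmbda = -0.3` purely for illustration (it is not the value fitted in the test), and `yeo_johnson_inverse_nonneg` is a hypothetical helper covering only this branch of the inverse, not scikit-learn's implementation.

```python
import numpy as np


def yeo_johnson_inverse_nonneg(psi, lmbda):
    """Inverse Yeo-Johnson for psi >= 0 and lmbda != 0 (illustrative helper)."""
    with np.errstate(invalid="ignore"):  # keep the demo quiet; NaN is checked below
        return np.power(lmbda * psi + 1.0, 1.0 / lmbda) - 1.0


lmbda = -0.3  # assumed value for this sketch; any negative lambda bounds the range
upper_bound = -1.0 / lmbda  # forward-transformed values of x >= 0 stay below this

# psi = 1 lies inside the transform's range, so the inverse is finite.
inside = yeo_johnson_inverse_nonneg(np.array([1.0]), lmbda)

# psi = 10 lies beyond the bound: the base is -2.0, the exponent is fractional,
# and the power has no real value, so NumPy returns NaN.
outside = yeo_johnson_inverse_nonneg(np.array([10.0]), lmbda)

# Round trip for the in-range value using the forward formula for x >= 0.
roundtrip = (np.power(inside + 1.0, lmbda) - 1.0) / lmbda
```

This mirrors the test's setup: a strongly skewed fit yields a negative lambda, and an input to `inverse_transform` above the resulting bound falls outside the transform's support.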