Enhance ROC Curve Display Tests for Improved Clarity and Maintainability #31253

NEREUScode · 2025-04-25T18:50:50Z

Commit Description:

Replaced the data_binary fixture that filtered classes from a multiclass dataset with a new fixture generating a synthetic binary classification dataset using make_classification. This ensures consistent data characteristics, introduces label noise, and better simulates real-world classification challenges.

PR Description:

Summary of Changes:

This PR refactors the data_binary fixture in the test_roc_curve_display.py file. The previous fixture filtered a multiclass dataset (Iris) to create a binary classification task. However, this approach resulted in AUC values consistently reaching 1.0, which does not reflect real-world challenges.
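As a rough illustration of the problem described above, the sketch below filters Iris down to two classes and scores a simple classifier on it. It assumes the old fixture kept classes 0 and 1 (`y < 2`) and uses `LogisticRegression` purely as a stand-in model; neither detail is taken from this PR's diff.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Assumption: the old fixture kept only classes 0 and 1 of Iris.
X, y = load_iris(return_X_y=True)
mask = y < 2
X_bin, y_bin = X[mask], y[mask]

# Stand-in classifier (hypothetical choice, for illustration only).
clf = LogisticRegression().fit(X_bin, y_bin)
scores = clf.predict_proba(X_bin)[:, 1]

# These two classes are linearly separable, so the AUC sits at or very near 1.0.
auc_iris = roc_auc_score(y_bin, scores)
print(f"AUC on filtered Iris: {auc_iris:.3f}")
```

With an AUC this close to the ceiling, the resulting ROC curve is a near-degenerate step, which gives the display tests little to distinguish.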

The new fixture utilizes make_classification from sklearn.datasets to generate a synthetic binary classification dataset with the following characteristics:

- 200 samples and 20 features.
- 5 informative features and 2 redundant features.
- 10% label noise (`flip_y=0.1`) to simulate real-world imperfections in the data.
- Class separation (`class_sep=0.8`) set to avoid perfect separation.
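The characteristics listed above can be sketched as a plain function. The parameter values come from this PR's description and lint diff; the function name `make_binary_data` is hypothetical, and in the test module the body would presumably sit inside a pytest fixture rather than a bare function.

```python
from sklearn.datasets import make_classification


def make_binary_data():
    """Return a synthetic binary classification dataset (X, y).

    Hypothetical helper name; values mirror those quoted in the PR.
    """
    X, y = make_classification(
        n_samples=200,
        n_features=20,
        n_informative=5,
        n_redundant=2,
        flip_y=0.1,  # Add some label noise
        class_sep=0.8,  # Reduce separation for more overlap
        random_state=42,  # Fixed seed keeps the test data reproducible
    )
    return X, y


X_syn, y_syn = make_binary_data()
print(X_syn.shape, sorted(set(y_syn.tolist())))
```

Fixing `random_state` matters here: label noise is only useful in a test if every run sees the same noisy labels.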

These changes provide a more complex and representative dataset for testing `RocCurveDisplay` and related metrics, making the tests more robust.

Reference Issues/PRs:

Fixes Use more complex data in test_roc_curve_display.py #31243
See also ENH add from_cv_results in RocCurveDisplay (single RocCurveDisplay) #30399 (comment)

For Reviewers:

This change ensures that the dataset used for testing is more reflective of real-world data, particularly in classification tasks that may involve noise and less clear separation between classes.


github-actions · 2025-04-25T18:51:41Z

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here

`ruff format`

ruff detected issues. Please run ruff format locally and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.11.2.


--- sklearn/metrics/_plot/tests/test_roc_curve_display.py
+++ sklearn/metrics/_plot/tests/test_roc_curve_display.py
@@ -31,8 +31,8 @@
         n_features=20,
         n_informative=5,
         n_redundant=2,
-        flip_y=0.1,        # Add some label noise
-        class_sep=0.8,     # Reduce separation for more overlap
+        flip_y=0.1,  # Add some label noise
+        class_sep=0.8,  # Reduce separation for more overlap
         random_state=42,
     )
     return X, y

1 file would be reformatted, 918 files already formatted

Generated for commit: 4cfe688. Link to the linter CI: here

mohammed benyamna added 3 commits April 25, 2025 19:33

Update test_roc_curve_display.py

e299bf6

Update test_roc_curve_display.py

7ab9430

github-actions bot added the module:metrics label Apr 25, 2025

Replace filtered data fixture with synthetic binary dataset

4cfe688

NEREUScode closed this Apr 25, 2025
