Skip to content

FIX: In the regression setting, cv=LeaveOneGroupOut() and cv=LeavePGroupsOut() are not working #696

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 15, 2025

Conversation

sulphatet
Copy link
Contributor

Description

This pull request fixes an issue where the _MapieRegressor internal class did not support LeaveOneGroupOut and LeavePGroupsOut as a valid cross-validation strategy. This was due to incorrect passing of the groups parameter to the _check_no_agg_cv logic, which has now been corrected.

Fixes #542

Type of change

Please remove options that are irrelevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Added a new unit test: test_group_cv_output_shape, which:
    • Imports and applies LeaveOneGroupOut and LeavePGroupsOut with groups passed to .fit()

    • Asserts that predicted intervals have the correct shape

Note: Now that the _MapieRegressor is an internal class, classes like SplitConformalRegressor class in the v1 API wraps _MapieRegressor using cv="prefit". This change does not directly affect it, but in future refactors, I think the handling of CV and group parameters could be carefully unified, given that SplitConformalRegressor inherits from _MapieRegressor.

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully: make lint
  • Typing passes successfully: make type-check
  • Unit tests pass successfully: make tests
  • Coverage is 100%: make coverage
  • When updating documentation: doc builds successfully and without warnings: make doc
  • When updating documentation: code examples in doc run successfully: make doctest

@Valentin-Laurent
Copy link
Collaborator

Hello @sulphatet, thank you for this contribution. I agree with your note, by the way.

Suggestion about the test: we could make it simply about not breaking when using LeaveOneGroupOut and LeavePGroupsOut, without the parametrization of alpha and verifications on output shape (so no predict needed).

We have a lot of "end-to-end" tests like that in MAPIE, but they end up making the test suite quite long to run.

What do you think?

@sulphatet
Copy link
Contributor Author

Thanks for the feedback @Valentin-Laurent that makes sense, and I’ve simplified the test as suggested to only check that .fit() works without running .predict(). Let me know if this change is more in line what you were thinking. I'd be happy to make any further modifications.

Additionally, since _MapieRegressor is now internal and being inherited by classes like SplitConformalRegressor, I think we’ll eventually also want to ensure group-based CV is fully tested through those public APIs?

@Valentin-Laurent Valentin-Laurent requested a review from Copilot May 15, 2025 07:54
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request fixes an issue with the internal regressor class not properly supporting group‑based cross-validation strategies by correctly passing the groups parameter.

  • Adds new unit tests to verify that LeaveOneGroupOut and LeavePGroupsOut CV splitters work as expected.
  • Updates the internal _MapieRegressor fit logic to include the groups parameter when checking the CV strategy.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
mapie/tests/test_regression.py Added unit test to validate group‑based CV fit execution.
mapie/estimator/regressor.py Updated _MapieRegressor to pass groups to _check_no_agg_cv.
Comments suppressed due to low confidence (1)

mapie/estimator/regressor.py:531

  • [nitpick] Consider aligning the closing parenthesis on line 536 with the function call's opening for improved readability.
self.use_split_method_ = _check_no_agg_cv(

@Valentin-Laurent
Copy link
Collaborator

Valentin-Laurent commented May 15, 2025

Looks good to me, thank you 👍

we’ll eventually also want to ensure group-based CV is fully tested through those public APIs
That's right.

However, having too many functional ("end-to-end") tests is not ideal (it makes the test suite huge and slow if we want to cover all possible scenarios). Actually we already have functional tests in tests_v1/test_functional/test_non-regression... covering groups, so I'd say we're fine for now.

In the future, I'd rather focus on having a more modular codebase, so we can better unit test and catch issues like the one you just fixed.

@Valentin-Laurent Valentin-Laurent self-requested a review May 15, 2025 08:08
@Valentin-Laurent Valentin-Laurent merged commit e1936d9 into scikit-learn-contrib:master May 15, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In the regression setting, cv=LeaveOneGroupOut() and cv=LeavePGroupsOut() are not working
2 participants