FEAT: Venn-ABERS calibrator for binary classification is added. #736

Open: 1 commit to merge into base: master

@dimoibiehg commented Jul 30, 2025

Description

Adds Venn-ABERS calibration for binary classification problems.
The PR implements the Inductive Venn-ABERS Predictors (IVAP) algorithm described in "Large-scale probabilistic prediction with and without validity guarantees" by Vovk et al. (https://arxiv.org/pdf/1511.00213.pdf).
This is a MAPIE wrapper for the implementation in https://github.com/ptocca/VennABERS/.
Note that VennABERSCalibrator uses its own specific calibration algorithm.
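
For readers new to the method, the idea behind IVAP can be sketched naively: for each test score, fit isotonic regression on the calibration set plus the test score labelled 0 to obtain p0, then again with the test score labelled 1 to obtain p1. The sketch below is not the code in this PR or in ptocca/VennABERS, which rely on the faster algorithms from the paper. The names `isotonic_fit`, `venn_abers_interval`, and `merge_log_loss` are hypothetical, and a pure-Python pool-adjacent-violators routine is swapped in only to keep the example dependency-free.

```python
from typing import Dict, List, Sequence, Tuple


def isotonic_fit(scores: Sequence[float], labels: Sequence[float]) -> Dict[float, float]:
    """Non-decreasing least-squares fit (pool-adjacent-violators), keyed by score."""
    # Pool points that share a score so the fit is a function of the score.
    sums: Dict[float, float] = {}
    counts: Dict[float, int] = {}
    for s, y in zip(scores, labels):
        sums[s] = sums.get(s, 0.0) + y
        counts[s] = counts.get(s, 0) + 1

    # Each block stores [label sum, weight, scores covered by the block].
    blocks: List[list] = []
    for s in sorted(sums):
        blocks.append([sums[s], counts[s], [s]])
        # Merge while the previous block's mean exceeds the last block's mean
        # (means compared by cross-multiplication; weights are positive).
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            total, weight, members = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += weight
            blocks[-1][2].extend(members)

    return {s: total / weight for total, weight, members in blocks for s in members}


def venn_abers_interval(
    cal_scores: Sequence[float], cal_labels: Sequence[int], test_score: float
) -> Tuple[float, float]:
    """Return (p0, p1) for one test score by refitting with each tentative label."""
    scores = list(cal_scores) + [test_score]
    p0 = isotonic_fit(scores, list(cal_labels) + [0])[test_score]
    p1 = isotonic_fit(scores, list(cal_labels) + [1])[test_score]
    return p0, p1


def merge_log_loss(p0: float, p1: float) -> float:
    """Single probability p1 / (1 - p0 + p1), the log-loss merge given in the paper."""
    return p1 / (1.0 - p0 + p1)


cal_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
cal_labels = [0, 0, 1, 0, 1, 1]
p0, p1 = venn_abers_interval(cal_scores, cal_labels, 0.5)
p = merge_log_loss(p0, p1)
```

This refits twice per test point, so it is only suitable as a readable reference for small inputs, not as a replacement for the precomputed approach.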

Type of change

Please remove options that are irrelevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

The following tests were added at the end of test_calibration.py:

  • test_venn_abers_initialized
  • test_venn_abers_default_parameters
  • test_venn_abers_prefit_cv_argument
  • test_venn_abers_split_cv_argument
  • test_venn_abers_invalid_cv_argument
  • test_venn_abers_binary_classification
  • test_venn_abers_prefit_split_same_results
  • test_venn_abers_calibration_effect
  • test_venn_abers_with_pipeline
  • test_venn_abers_multiclass_error

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully: make lint
  • Typing passes successfully: make type-check
  • Unit tests pass successfully: make tests
  • Coverage is 100%: make coverage
  • When updating documentation: doc builds successfully and without warnings: make doc
  • When updating documentation: code examples in doc run successfully: make doctest

@Valentin-Laurent
Collaborator

Hello @dimoibiehg, thank you so much for this high-quality PR.

MAPIE core devs are on holiday; we'll be back at our keyboards by the end of August and are willing to merge your work. A few remarks came to mind after a quick read; feel free to proceed in the meantime if you'd like:

  • I understand the code is adapted from another repository. We should acknowledge this somewhere and ensure we fully comply with their MIT License.
  • There are no tests to validate that the method gives correct results. Maybe we can adapt the test notebook from the original repo.
  • We'll also need documentation: a simple example and a quick written introduction to the method (on that point: we're trying to move away from overly technical explanations and focus on non-academic users).
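
One possible shape for such a correctness test, as a sketch only: compare a candidate predictor against a trusted reference on random inputs, and check that every output is a valid interval. The signature `(cal_scores, cal_labels, test_score) -> (p0, p1)` and the names `check_against_reference` and `frequency_stand_in` are assumptions made for illustration, not MAPIE's API. The stand-in ignores the scores entirely and exists only so the harness runs on its own.

```python
import random
from typing import Callable, Sequence, Tuple

# Hypothetical predictor signature: (cal_scores, cal_labels, test_score) -> (p0, p1)
Predictor = Callable[[Sequence[float], Sequence[int], float], Tuple[float, float]]


def frequency_stand_in(
    cal_scores: Sequence[float], cal_labels: Sequence[int], test_score: float
) -> Tuple[float, float]:
    """Stand-in predictor: positive frequency with the test label set to 0, then 1."""
    n, k = len(cal_labels), sum(cal_labels)
    return k / (n + 1), (k + 1) / (n + 1)


def check_against_reference(
    candidate: Predictor,
    reference: Predictor,
    n_cases: int = 50,
    n_cal: int = 30,
    tol: float = 1e-9,
    seed: int = 0,
) -> int:
    """Run both predictors on random cases; return the number of cases checked."""
    rng = random.Random(seed)
    for _ in range(n_cases):
        cal_scores = [rng.random() for _ in range(n_cal)]
        # Labels are drawn so that higher scores are more often positive.
        cal_labels = [int(rng.random() < s) for s in cal_scores]
        test_score = rng.random()

        p0, p1 = candidate(cal_scores, cal_labels, test_score)
        r0, r1 = reference(cal_scores, cal_labels, test_score)

        # The pair must be a valid interval of probabilities.
        assert 0.0 <= p0 <= p1 <= 1.0, (p0, p1)
        # The candidate must agree with the reference up to the tolerance.
        assert abs(p0 - r0) <= tol and abs(p1 - r1) <= tol, (p0, p1, r0, r1)
    return n_cases


checked = check_against_reference(frequency_stand_in, frequency_stand_in)
```

In practice the candidate would be the new calibrator and the reference an implementation already trusted to be correct.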

Thanks again, looking forward to reviewing this in more detail soon. In the meantime, let's give Copilot a bit of work; it sometimes catches interesting things.

@Valentin-Laurent requested a review from Copilot on July 31, 2025, 12:24
@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements Venn-ABERS calibration for binary classification problems in MAPIE, providing a new calibration method based on the Inductive Venn-ABERS Predictors (IVAP) algorithm. The implementation adds the VennABERSCalibrator class which wraps the algorithm described in the Vovk et al. paper.

Key changes include:

  • Addition of a complete VennABERSCalibrator class with fit and predict methods
  • Implementation of the core Venn-ABERS algorithms (Algorithms 1-4 from the paper)
  • Comprehensive test suite covering initialization, parameter validation, and calibration functionality

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

File                              Description
mapie/calibration.py              Adds the complete VennABERSCalibrator class implementation with all core algorithms
mapie/tests/test_calibration.py   Adds comprehensive test suite for the new calibrator including edge cases and validation
HISTORY.rst                       Documents the new feature addition
AUTHORS.rst                       Adds the contributor credit


valid_inputs = ["binary"]

calibration_points_: list[Tuple[float, Union[int, float]]]

Copilot AI Jul 31, 2025

The type annotation should use List from typing instead of the built-in list for consistency with other type annotations in the codebase and better compatibility with older Python versions.

Suggested change
- calibration_points_: list[Tuple[float, Union[int, float]]]
+ calibration_points_: List[Tuple[float, Union[int, float]]]
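
As background for this and the similar suggestions below: subscripting the built-in `list` in an annotation is only supported from Python 3.9 onward (PEP 585). On older interpreters it raises `TypeError` when the annotation is evaluated, unless evaluation is deferred with `from __future__ import annotations`. Whether MAPIE still targets such versions is an assumption here. The alias `CalibrationPoints` and the function `sort_points` are hypothetical names used only to show the `typing` spelling.

```python
from typing import List, Tuple, Union

# typing.List works on every Python 3 version that ships the typing module,
# whereas list[...] in an evaluated annotation needs Python 3.9 or newer.
CalibrationPoints = List[Tuple[float, Union[int, float]]]


def sort_points(points: CalibrationPoints) -> CalibrationPoints:
    """Return the (score, label) pairs ordered by score, then by label."""
    return sorted(points)


ordered = sort_points([(0.7, 1), (0.2, 0), (0.2, 1)])
```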

Copilot uses AI. Check for mistakes.

+ (", ").join(self.valid_inputs) + "."
)

def _prepare_data(self, calibr_points: list[Tuple[float, float]]) -> Tuple:

Copilot AI Jul 31, 2025

The type annotation should use List from typing instead of the built-in list for consistency with other type annotations in the codebase.

Suggested change
- def _prepare_data(self, calibr_points: list[Tuple[float, float]]) -> Tuple:
+ def _prepare_data(self, calibr_points: List[Tuple[float, float]]) -> Tuple:


return y_prime, y_csd, x_prime, pts_unique, k_prime

def _algorithm1(self, P: Dict, k_prime: int) -> list:

Copilot AI Jul 31, 2025

The return type annotation should use List from typing instead of the built-in list for consistency with other type annotations in the codebase.

Suggested change
- def _algorithm1(self, P: Dict, k_prime: int) -> list:
+ def _algorithm1(self, P: Dict, k_prime: int) -> List:

S.append(P[i])
return S

def _algorithm2(self, P: Dict, S: list, k_prime: int) -> NDArray:

Copilot AI Jul 31, 2025

The parameter type annotation should use List from typing instead of the built-in list for consistency with other type annotations in the codebase.

Suggested change
- def _algorithm2(self, P: Dict, S: list, k_prime: int) -> NDArray:
+ def _algorithm2(self, P: Dict, S: List, k_prime: int) -> NDArray:


return F1

def _algorithm3(self, P: Dict, k_prime: int) -> list:

Copilot AI Jul 31, 2025

The return type annotation should use List from typing instead of the built-in list for consistency with other type annotations in the codebase.

Suggested change
- def _algorithm3(self, P: Dict, k_prime: int) -> list:
+ def _algorithm3(self, P: Dict, k_prime: int) -> List:


return S

def _algorithm4(self, P: Dict, S: list, k_prime: int) -> NDArray:

Copilot AI Jul 31, 2025

The parameter type annotation should use List from typing instead of the built-in list for consistency with other type annotations in the codebase.

Suggested change
- def _algorithm4(self, P: Dict, S: list, k_prime: int) -> NDArray:
+ def _algorithm4(self, P: Dict, S: List, k_prime: int) -> NDArray:

pos1 = np.searchsorted(pts_unique[:-1], test_objects, side='right') + 1
return F0[pos0], F1[pos1]

def _scores_to_multi_probs(self, calibr_points: list[Tuple[float, float]],

Copilot AI Jul 31, 2025

The parameter type annotation should use List from typing instead of the built-in list for consistency with other type annotations in the codebase.

Suggested change
- def _scores_to_multi_probs(self, calibr_points: list[Tuple[float, float]],
+ def _scores_to_multi_probs(self, calibr_points: List[Tuple[float, float]],

@Valentin-Laurent
Collaborator

In addition to my previous comment about testing the correctness of the implementation, here's another implementation of Venn-Abers, validated by Vovk at COPA 2023: https://github.com/ip200/venn-abers

@dimoibiehg
Author

> In addition to my previous comment regarding testing the correctness of implementation, here's another implementation of Venn-Abers, validated by Vovk at COPA 2023: https://github.com/ip200/venn-abers

This is a much more complete implementation, including multi-class classification.
Now I'm hesitant to move forward with this PR. It would be worth integrating this new repository in a new PR instead, wouldn't it?

@Valentin-Laurent
Collaborator

If you're willing to, it would definitely be worth it, yes, because this repository seems very reliable in terms of correctness of implementation.

Because the scope would be larger, starting with a reduced scope (if possible and relevant) may be a good idea.

In any case, we have to make sure we actually have the right to reuse the code. It seems that this is OK: both repos are under the MIT license, which according to French Wikipedia allows sublicensing (provided we mention the original license and copyright).
