FEA Add DummyClassifier strategy that produces randomized probabilities #31488

cboseak · 2025-06-05T13:20:07Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This PR adds a new strategy to DummyClassifier called "random_proba" that generates randomized probability distributions for classification tasks. This strategy can be used for benchmarking and testing purposes where completely random probabilistic outputs are desirable.
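To make the idea concrete, here is a minimal, dependency-free sketch of what "randomized probability distributions" could look like: each row is non-negative and sums to 1. The helper name `random_proba_rows` and the choice of sampling uniformly from the simplex are assumptions for illustration only; the PR's actual implementation is not shown in this thread.

```python
import random


def random_proba_rows(n_samples, n_classes, seed=None):
    """Return n_samples rows of n_classes non-negative floats, each row summing to 1.

    Hypothetical helper, not part of scikit-learn.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        # Normalizing i.i.d. exponential draws yields a point drawn uniformly
        # from the probability simplex, i.e. a Dirichlet(1, ..., 1) sample.
        draws = [rng.expovariate(1.0) for _ in range(n_classes)]
        total = sum(draws)
        rows.append([d / total for d in draws])
    return rows


proba = random_proba_rows(n_samples=4, n_classes=3, seed=0)
```

Fixing the seed makes the output reproducible, which matters when such a classifier is used to test a pipeline.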

github-actions · 2025-06-05T13:21:04Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

_Generated for commit: e39ae04. Link to the linter CI: here_

tmcclintock

LGTM. Thanks for implementing this so quickly. I'd have done it myself but I appreciate you being so eager!

I left one non-blocking suggestion that I think will be ignored, since it doesn't seem to be the trend in this project.

Great work! 🚀

sklearn/tests/test_dummy.py

betatim

Looks good to me.

I think we need to do two things in addition to a second review:

1. do we want this feature?
2. is there a better name for the strategy? uniform-proba doesn't really tell you what it is if you don't already know the answer. Could we have this behaviour as part of uniform (so no new name needed)? Ideas welcome

Thanks a lot for making the PR so quickly and without waiting for a 👍 / 👎. As so often, the discussion about naming and whether we want to do this can take much longer than the actual implementation work. Patience please :D

cboseak · 2025-06-06T13:01:37Z

> do we want this feature?

I see it as a 'why not' feature. It's added functionality that doesn't hinder or affect existing functionality. Worst case, it goes unused, but it shouldn't negatively affect anyone. It's a two-way door decision.

> is there a better name for the strategy? uniform-proba doesn't really tell you what it is if you don't already know the answer.

Just let me know what to update it to. I named it what was suggested in the issue but have no preference on the name.

tmcclintock · 2025-06-08T15:11:20Z

@betatim thank you for your healthy skepticism :)

1. Do we need this? Yes, I think so. I have personally seen this functionality implemented at three companies in order to test their ML pipelines, so it would likely be used in many instances.
2. I think uniform-proba is good for two reasons:
   i. proba implies that the strategy applies to the probabilities (because of predict_proba)
   ii. uniform implies randomness and a relation to the uniform strategy, which exists, since uniform applies to the predicted labels while uniform-proba applies to the probabilities
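To illustrate the distinction in point ii, here is a short sketch of how the existing uniform strategy behaves with the current public API: `predict` draws labels at random, while `predict_proba` returns the same constant row for every sample. The toy data is made up for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy data for illustration; DummyClassifier ignores the feature values.
X = np.zeros((6, 1))
y = np.array([0, 1, 2, 0, 1, 2])

clf = DummyClassifier(strategy="uniform", random_state=0).fit(X, y)

# Labels are drawn uniformly at random from the observed classes.
labels = clf.predict(X)

# Probabilities are not random: every row is 1 / n_classes for each class.
proba = clf.predict_proba(X)
```

The strategy proposed in this PR would randomize the rows of `proba` as well, which the existing strategy does not do.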

[31462] DummyClassifier strategy that produces randomized probabilities

aa4b63c

cboseak and others added 3 commits June 5, 2025 09:05

[31462] DummyClassifier strategy that produces randomized probabilities

8a27749

Merge branch 'main' into 31462

287f367

changelog

7167d18

tmcclintock approved these changes Jun 6, 2025

sklearn/tests/test_dummy.py (outdated; resolved)

tmcclintock mentioned this pull request Jun 6, 2025

Feat: DummyClassifier strategy that produces randomized probabilities #31462

Open

betatim changed the title ~~[31462] DummyClassifier strategy that produces randomized probabilities~~ FEA Add DummyClassifier strategy that produces randomized probabilities Jun 6, 2025

betatim reviewed Jun 6, 2025

betatim added labels "Needs Decision - Include Feature" (requires decision regarding including feature) and "Quick Review" (for PRs that are quick to review) Jun 6, 2025

Update sklearn/tests/test_dummy.py based on suggestion

e39ae04

Co-authored-by: Tom McClintock <thmsmcclintock@gmail.com>
