Skip to content

Add requires_positive_y estimator tag #14095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 24, 2019

Conversation

rth
Copy link
Member

@rth rth commented Jun 14, 2019

This adds a requires_positive_y estimator tag, for estimators that only work with a positive y for regression, such as Poisson regression in the GLM PR #9405.

This tag only makes sense for regressors, but this is not enforced. Estimator tag validation in general could be done in some other PR.

The GLM PR distinguishes estimators, that work with strictly positive y as well as positive or zero. Here I only consider the strictly positive case, as I think for common tests that is enough.

@@ -1519,6 +1519,9 @@ non_deterministic
requires_positive_data - unused for now
whether the estimator requires positive X.

requires_positive_y
whether the estimator requires a positive y (only applicable for regression).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should call the one above requires_positive_X to be consistent, not sure. In any case, I am not so keen on requires_positive_target if we try to reach naming consistency in the other direction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you not like target? I'm fine either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanna actually add the one above and do the same thing you did for y? The tag is not used in the estimator or tags yet, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you not like target? I'm fine either way.

Mostly to be consistent with fit(X, y), but it's not too critical I agree.

Wanna actually add the one above and do the same thing you did for y? The tag is not used in the estimator or tags yet, I think.

Sure will make another PR.


# doesn't error on actual estimator
LogisticRegression,
LogisticRegression(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this run on AdaBoostClassifier which took significant time (around 3 sec) for two checks. Since it looks like this only intends to check that initialized/non initialized estimators work, replaced it by a faster estimator.

AdaBoostClassifier is picked up by common tests anyway.

@rth
Copy link
Member Author

rth commented Jun 15, 2019

(The test failure in one of the jobs is unrelated).

@rth
Copy link
Member Author

rth commented Jun 20, 2019

In case you are able to have a quick look (should be easy to review) @thomasjpfan @glemaitre. Already has a +1 )

@rth
Copy link
Member Author

rth commented Jun 21, 2019

I think I have addressed all comments. If there aren't new ones, could someone please merge this then, as this PR has a +2? Thanks!

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just get some consensus on naming: requires_positive_y vs requires_positive_target. @rth prefers the former, I think I prefer the latter for consistency with requires_positive_data. Other opinions? Does it matter if tags are still experimental?

@thomasjpfan
Copy link
Member

We have X_types instead of data_types, while also using requires_positive_data. I am +0.5 on using X and y since it used in all our function signatures.

TLDR: +0.5 on requires_positive_y and changing requires_positive_data to requires_positive_X.

@rth
Copy link
Member Author

rth commented Jun 24, 2019

OK, changed requires_positive_data to requires_positive_X for consistency with X_types and requires_positive_y. Though no strong objections on this, can change those to _data and _target if necessary.

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also more consistent with X_types.

Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
@rth
Copy link
Member Author

rth commented Jun 24, 2019

Thanks, @jnothman -- I added your suggestion! This should be good to merge then (CI is green)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants