Skip to content

[MRG+1] Option to suppress validation for finiteness #7548

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Jun 8, 2017

Conversation

jnothman
Copy link
Member

@jnothman jnothman commented Oct 2, 2016

Fixes #1363. I'm not sure if this is the best way to make a global configuration parameter available, or the best place to document it.

@jnothman jnothman force-pushed the suppress-validation branch from 12dd724 to 1cbf18b Compare October 2, 2016 02:52
@jnothman
Copy link
Member Author

jnothman commented Oct 2, 2016

And perhaps in the future we will add more validation suppression as a result of this flag (e.g. some of the stuff in utils.multiclass) and so we might need better ways to warn users that the behaviour of this flag may become more expansive without backwards compatibility concerns.

Is it worth using this option, for instance, to assume labels are already encoded in 0 to n_classes - 1? (or because this is a fit-time concern, is that premature optimisation?)

@amueller
Copy link
Member

amueller commented Oct 3, 2016

Currently the variable name is a bit to generic for my taste. It sounds like it suppresses all validation. Maybe FAST_VALIDATION?

@jnothman
Copy link
Member Author

jnothman commented Oct 4, 2016

SUPPRESS_COSTLY_VALIDATION?? :) Problem with FAST_VALIDATION is the "only do it with care" factor. How about LESS_VALIDATION? PRESUME_VALID?

@jnothman
Copy link
Member Author

jnothman commented Oct 5, 2016

I've renamed to PRESUME_FINITE

@amueller
Copy link
Member

amueller commented Oct 5, 2016

Sounds good. To me ASSUME_FINITE sounds more natural but you're the native speaker ;)

@amueller
Copy link
Member

amueller commented Oct 5, 2016

LGTM. I'd like to have @GaelVaroquaux sign off on this if he has time.

@amueller amueller changed the title [MRG] Option to suppress validation for finiteness [MRG + 1] Option to suppress validation for finiteness Oct 5, 2016
@amueller
Copy link
Member

amueller commented Oct 5, 2016

The documentation is quite hard to find right now, but I'm not sure what a better place would be.

@jnothman
Copy link
Member Author

jnothman commented Oct 5, 2016

I'm happy to change to ASSUME. The difference is that often you assume something but still check; when you presume something, you go with that presumption blindly.

No hurry here.

@NelleV
Copy link
Member

NelleV commented Oct 6, 2016

I like PRESUME_FINITE.
It looks good to me as well.

@NelleV NelleV changed the title [MRG + 1] Option to suppress validation for finiteness [MRG + 2] Option to suppress validation for finiteness Oct 7, 2016
@jnothman
Copy link
Member Author

jnothman commented Oct 7, 2016

But we do have assume_* elsewhere in the code (assume_centered and assume_unique) with the same meaning so perhaps we should follow that convention.

@TomDLT
Copy link
Member

TomDLT commented Oct 13, 2016

Small concern about putting the check inside assert_all_finite(X), as it is a public function advertised in the doc.
Shouldn't we just add it inside _ensure_sparse_format, check_array and check_X_y?

@amueller
Copy link
Member

@TomDLT I'm not sure I understand your concern. We should document that the behavior is overwritten by the flag, but it doesn't change any current behavior.

@jnothman
Copy link
Member Author

I think the idea that we're assuming rather than asserting finiteness when
the flag is on is pretty clear.

On 15 October 2016 at 04:04, Andreas Mueller notifications@github.com
wrote:

@TomDLT https://github.com/TomDLT I'm not sure I understand your
concern. We should document that the behavior is overwritten by the flag,
but it doesn't change any current behavior.


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#7548 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAEz6xVXXhOe7etRA3SvLIboMAiKrSQAks5qz7YCgaJpZM4KL6yC
.

@jnothman
Copy link
Member Author

have added a note to docs.

@NelleV
Copy link
Member

NelleV commented Oct 31, 2016

Are we still waiting for @GaelVaroquaux to comment on that one?

@GaelVaroquaux
Copy link
Member

LGTM.

One think that would be good would be to add a contextmanager to do that. I fear Ill-written code that suppresses validation in more places that it would like.

@jnothman
Copy link
Member Author

jnothman commented Nov 1, 2016

Do you foresee with sklearn.set_validation analogous to np.seterr? If we can agree on where it lives and how broadly it is named, I'm happy to implement here before merge. Otherwise, I think we could merge this PR as is (modulo what's new).

@GaelVaroquaux
Copy link
Member

GaelVaroquaux commented Nov 1, 2016 via email

@amueller
Copy link
Member

I'm fine with either way, slightly leaning to set_validation

@jnothman
Copy link
Member Author

Yes, I'll get back to it some time soon! I know, it'll be a nice feature to have.

@jnothman
Copy link
Member Author

@amueller, my concern with that first example is that introducing a context manger into func where there was none before wrecks everything from my_callback, despite some_dict not setting any of the same parameters as my_callback. Functions that use context managers then need to be documented for their imperviousness anyway.

I get the documentation simplicity of your approach. It makes sense in numpy where the errstate is homogeneous in the types of settings it's dealing with; and perhaps in matplotlib where the config is everything. Here we only have one setting, and as far as I can tell, configurations are unlikely to have much to do with each other. A context, except at the outermost level of library invocation, is likely to set only few of the params.

But this is almost entirely philosophy and I should probably just adopt what you suggest.

@tacaswell
Copy link
Contributor

You currently only have one global configuration 😈. If you expect it to stay that way, then why aren't these called ignore_nan_context and set_ignore_nan_state, etc?

Users are very clever and will do all sorts of 'interesting' things (and will not be persuaded by 'do not do that', and documenting what not to do just gives some users ideas!) so even if sklearn never writes functions like this, some downstream library will.

@jnothman
Copy link
Member Author

jnothman commented Dec 20, 2016 via email

@tacaswell
Copy link
Contributor

Sorry if I was arguing too forcefully or was out of line. I do not know the audience here (but do know I am the peanut gallery!).

@jnothman
Copy link
Member Author

jnothman commented Dec 21, 2016 via email

@jnothman
Copy link
Member Author

Pushed conformant changes

@jnothman
Copy link
Member Author

jnothman commented Feb 3, 2017

@amueller, I assume this still has your +1? Another review?

@amueller
Copy link
Member

amueller commented Jun 7, 2017

I'll review again and ping @GaelVaroquaux to check it out

import os
from contextlib import contextmanager as _contextmanager

_ASSUME_FINITE = bool(os.environ.get('SKLEARN_ASSUME_FINITE', False))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to add more config in the future, don't we want them to be bundled in a dictionary?

@amueller
Copy link
Member

amueller commented Jun 7, 2017

still +1 but maybe have a the global variable be a dict if we want to add an option for the repr?


Parameters
----------
assume_finite : bool, optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad param docstring

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so? Because I should say that it needs to be specified by name?

The point here is that the user always has the option to change, but can leave things unchanged Too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forget it

@jnothman
Copy link
Member Author

jnothman commented Jun 7, 2017 via email

@amueller
Copy link
Member

amueller commented Jun 8, 2017

Ok I'll do that in my PR ;)

@jnothman
Copy link
Member Author

jnothman commented Jun 8, 2017 via email

@agramfort
Copy link
Member

looks great.

@amueller green button ?

@amueller amueller merged commit ee88cf4 into scikit-learn:master Jun 8, 2017
@amueller
Copy link
Member

amueller commented Jun 8, 2017

yes ;)

Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
* ENH add suppress validation option

* TST skip problematic doctest

* Rename SUPPRESS_VALIDATION to PRESUME_FINITE

* Change PRESUME_ to ASSUME_ for convention's sake

* DOC add note regarding assert_all_finite

* ENH add set_config context manager for ASSUME_FINITE

* Make ASSUME_FINITE private and provide get_config

* Fix ImportError due to incomplete change in last commit

* TST/DOC tests and more cautious documentation for set_config

* DOC what's new entry for validation suppression

* context manager is now config_context; set_config affects global config

* Rename missed set_config to config_context

* Fix mis-named test

* Mention set_config in narrative docs

* More explicit about limmited restoration of context

* Handle case where error raised in config_context

* Reset all settings after exiting context manager
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants