ENH estimator freezing #9397

Conversation
* `est.frozen = True` results in `clone(est) is est`
* `est.frozen = True` means `est` is not refit by some meta-estimators
* Some tests are not yet implemented
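The first bullet can be sketched in a few lines. This is a minimal stand-in, assuming the `frozen` attribute proposed in this PR; `clone_respecting_frozen` and `Toy` are hypothetical names, not part of scikit-learn:

```python
import copy

def clone_respecting_frozen(est):
    """Hypothetical stand-in for sklearn.base.clone under this PR:
    a frozen estimator is returned as-is; anything else is copied."""
    if getattr(est, 'frozen', False):
        return est
    return copy.deepcopy(est)  # sketch; the real clone rebuilds from get_params()

class Toy:
    def __init__(self):
        self.frozen = False

t = Toy()
print(clone_respecting_frozen(t) is t)   # False: a normal estimator is copied
t.frozen = True
print(clone_respecting_frozen(t) is t)   # True: a frozen one is returned itself
```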
OMG OMG OMG *jumps up and down excitedly*
Thanks for the enthusiasm! If you'd like to suggest a compelling example...
I'll also throw in some doubt about the API (@GaelVaroquaux). This is definitely less magical than #8374, which overwrites the fit* methods with a no-op. But if we consider maintainers of scikit-learn-compatible metaestimators, this creates a divide between those metaestimators which are `frozen_fit`-aware and those which are not. Users will have to request `frozen_fit` support from those maintainers. If they implement it, but also preserve compatibility with pre-freezing versions of scikit-learn, user code will silently misbehave if either the wrong version of scikit-learn or the wrong version of the metaestimator is installed. We can help users with the wrong version of scikit-learn by providing a function `sklearn.base.freeze` rather than asking the user to set a public attribute. But the problem with third-party metaestimators stands with this design, where #8374 has none.
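The point about a helper function failing loudly, rather than an attribute failing silently, can be sketched like this. `freeze` here is the hypothetical helper suggested above; `sklearn.base.freeze` does not exist in released scikit-learn:

```python
def freeze(estimator):
    """Hypothetical sklearn.base.freeze: mark an estimator as frozen."""
    estimator.frozen = True
    return estimator

class Dummy:
    pass

est = freeze(Dummy())
print(est.frozen)  # True

# Contrast: on an old scikit-learn, `est.frozen = True` silently sets an
# attribute that nothing reads, whereas `from sklearn.base import freeze`
# raises ImportError immediately on versions without freezing support.
```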
Compelling use-case: cross-validating CalibratedClassifierCV with
I'm not sure I like `frozen_fit`. I guess it's the simplest solution? How about we add a method `freeze` that sets a private `_frozen` that is checked in `clone`, but which also makes `fit` a no-op and removes any `fit_X` methods? That would make "unfreezing" harder, but we could store these methods and add an `unfreeze()` or `thaw()`?
That would mean no other code needs to change. I guess the question is whether we want the meta-estimators to handle this, or the estimators.
If someone else wrote a meta-estimator, with the current solution they would have to use `frozen_fit`, but if we let the estimators handle it, they might get surprising behavior?
I see pros and cons for both approaches.
We can't always ensure that `fit_transform()` is the same as `fit().transform()`, and here we are now replacing `fit_transform(X)` by `fit(X_old).transform(X)`, which makes me very nervous. Though I guess we are controlling the context in which it happens.
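The concern can be made concrete with a small example (illustrative only; it uses a plain `StandardScaler`, nothing from this PR): refitting on new data and reusing old fitted statistics give different outputs.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_old = np.array([[0.0], [10.0]])   # mean 5, std 5
X_new = np.array([[5.0], [7.0]])

# Normal behaviour: fit_transform refits on X_new (mean 6, std 1)
refit = StandardScaler().fit_transform(X_new)

# Frozen behaviour being proposed: reuse the statistics fitted on X_old
scaler = StandardScaler().fit(X_old)
frozen = scaler.transform(X_new)

print(refit.ravel())    # [-1.  1.]
print(frozen.ravel())   # [0.  0.4]
```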
```
out
    estimator if ``method == 'fit'``, else the output of ``transform`` etc.
    If the estimator has attribute ``frozen`` set to True, it will not be
    refit.
```
This would benefit from an example.
```
def frozen_fit(estimator, method, X, y, **kwargs):
    """Fit the estimator if not frozen, and return the result of method

    A frozen estimator has an attribute ``frozen`` set to True
```
It took me a bit to understand what this function is doing. Maybe expand a bit on the explanation? I'm not sure I have actually understood it, I think....
How about naming this "fit_if_not_frozen". Maybe that would help @amueller (and others) understand.
I think a better way to make it easier to understand is to remove `method` and apply only to `fit`.
```
return estimator
if not method.startswith('fit_'):
    raise ValueError('method must be "fit" or begin with "fit_"')
method = getattr(estimator, method[4:])
```
Wow, this is a bit of a hack lol. And a model doesn't necessarily have that, right? If I call `fit_transform` on a frozen t-SNE, it's gonna give me an attribute error. Not sure if there's a better solution though.
Isn't an attribute error what you should be getting, though?
Is it? The call is `frozen_fit(TSNE(), 'fit_transform', X, None)`. Using substring-matching on method names is not something I would expect.
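For reference, a self-contained sketch of the dispatch being discussed, reconstructed from the diff excerpts in this PR (not the exact code; `ToyScaler` is a hypothetical stand-in estimator):

```python
def frozen_fit(estimator, method, X, y, **kwargs):
    """If frozen, skip fitting: 'fit' returns the estimator unchanged and
    'fit_transform' falls back to plain 'transform' on the fitted model."""
    if getattr(estimator, 'frozen', False):
        if method == 'fit':
            return estimator
        if not method.startswith('fit_'):
            raise ValueError('method must be "fit" or begin with "fit_"')
        # 'fit_transform' -> 'transform': this is the substring matching
        # (method[4:]) that raises AttributeError for e.g. a frozen TSNE,
        # which has fit_transform but no transform.
        return getattr(estimator, method[4:])(X, **kwargs)
    return getattr(estimator, method)(X, y, **kwargs)

class ToyScaler:
    """Minimal transformer for demonstration (hypothetical)."""
    frozen = False
    def fit(self, X, y=None):
        self.offset_ = min(X)
        return self
    def transform(self, X):
        return [x - self.offset_ for x in X]
    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)

t = ToyScaler().fit([10, 20])
t.frozen = True
print(frozen_fit(t, 'fit_transform', [15], None))  # [5] -- reuses offset_ = 10
```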
```
With transfer learning:

>>> tfidf = TfidfVectorizer().fit(large_X)
>>> tfidf.frozen = True
```
do we want to add `BaseEstimator.freeze`? it saves like 5 characters ;) But it also allows us to change the implementation details.
I'd rather not: having a "freeze()" method rather than a "frozen" attribute means that the logic is modifiable in subclasses: the contract is more open: "freeze()" could change more to the estimator. This means that it is potentially harder to understand.
I don't think a method is helpful either. It only makes it harder to use something that, for whatever strange reason, does not inherit from base.
@jnothman hm that's a fair point. I'm just concerned that this will be very hard to change in the future, if we ever decide that's necessary.
```
    """
    if getattr(estimator, 'frozen', False):
        if method == 'fit':
            return estimator
```
Ideally we should check it was fit before, but I guess that's hard and we defer to when it's used?
```diff
@@ -586,9 +590,9 @@ def _transform_one(transformer, weight, X):
 def _fit_transform_one(transformer, weight, X, y,
                        **fit_params):
     if hasattr(transformer, 'fit_transform'):
-        res = transformer.fit_transform(X, y, **fit_params)
```
If we'd add `and not transformer.frozen` in the line above, that would simplify `frozen_fit` a lot, I feel. But maybe more susceptible to bugs?
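A sketch of what that suggestion might look like (hypothetical; the real `_fit_transform_one` in sklearn/pipeline.py also handles weights and returns the transformer alongside the result):

```python
def fit_transform_one_sketch(transformer, X, y, **fit_params):
    """Check `frozen` directly in the pipeline helper instead of routing
    everything through frozen_fit (illustrative only)."""
    if getattr(transformer, 'frozen', False):
        return transformer.transform(X)          # never refit a frozen model
    if hasattr(transformer, 'fit_transform'):
        return transformer.fit_transform(X, y, **fit_params)
    return transformer.fit(X, y, **fit_params).transform(X)

class Shift:
    """Minimal transformer without fit_transform (hypothetical)."""
    frozen = False
    def fit(self, X, y=None):
        self.offset_ = min(X)
        return self
    def transform(self, X):
        return [x - self.offset_ for x in X]

s = Shift().fit([10])
s.frozen = True
print(fit_transform_one_sketch(s, [15], None))  # [5]: reuses offset_ = 10
```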
Thanks for the review. I think it's worthwhile providing the `frozen_fit` helper because, as you imply, the logic is not altogether trivial. And yes, t-SNE is not useful frozen, but I don't think there's a way we can conclude that from the API.

It's irrelevant whether `fit_transform` and `fit().transform()` are the same. Freezing is to declare that you only want to use `transform`.

I think the big question here, assuming this is a feature we want, is whether we want a model where the fitting is blocked by (a) the estimator itself or by (b) the metaestimator. The only disadvantage to (a) that I can see is the need for some magic (although pretty straightforward, as in #8374).
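Approach (a) can be sketched roughly like this. The `freeze` helper here is hypothetical and only gestures at the #8374 style (overwriting fit* with no-ops on the instance); #8374's actual implementation differs in its details:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def freeze(est):
    """Hypothetical approach-(a) freeze: shadow fit* on the instance so
    any meta-estimator's call to fit becomes a no-op."""
    est.frozen = True
    est.fit = lambda *args, **kwargs: est               # no-op fit
    if hasattr(est, 'transform'):
        est.fit_transform = lambda X, *a, **k: est.transform(X)
    return est

X_old = np.array([[0.0], [10.0]])     # mean 5, std 5
scaler = freeze(StandardScaler().fit(X_old))

# An unaware meta-estimator calling fit_transform no longer refits:
out = scaler.fit_transform(np.array([[5.0]]))
print(out)   # [[0.]] -- (5 - 5) / 5, using the X_old statistics
```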
lol I just realized you addressed everything I mentioned in my review in your comment on the PR above... (I got confused by the email formatting and just read it). I also checked out your old PR. I guess this is the better method, because it's less invasive. I would still maybe use a public method so we can change the implementation later if needed? And if we put the logic in the meta-estimator, I think it's not more complicated to add an "if frozen" than wrapping the
When you have a moment, @GaelVaroquaux, I'd really appreciate a response to my comments regarding choosing magic as opposed to creating maintainability, compatibility, and user surprise issues.
```
if not method.startswith('fit_'):
    raise ValueError('method must be "fit" or begin with "fit_"')
method = getattr(estimator, method[4:])
# FIXME: what do we do with kwargs?
```
I would say: pass them along, to minimize surprise; that way, if people have coded an estimator that takes extra arguments in fit_*, they get what they expect. I guess that's an argument for sample_props, and then actively battling arguments in fit_*.
```diff
@@ -367,6 +367,9 @@ def test_estimator_weights_format():
     assert_array_equal(eclf1.predict_proba(X), eclf2.predict_proba(X))


+def test_frozen():
+    raise NotImplementedError()
```
??
Is this a TODO for you to address?
Yes, the not-implemented tests are TODOs to be addressed. Atm, the concern I would like you to consider is that this design is bad in that it puts a burden onto meta-estimator maintainers, and will surprise users / break user code if their dependencies are mismatched, as discussed at #9397 (comment).
I think at the sprint we generally felt that moving stuff into meta-estimators is better than making the estimator interface too complicated... but I feel a more flexible interface would still be good. I guess in most of the cases where I want this behavior, current sklearn errors, so I was less concerned about silently misbehaving code. But for the transfer learning case, this is definitely an issue. Maybe we should loop in @glemaitre? imblearn.pipeline is probably the most commonly used non-sklearn meta-estimator.
I don't understand the argument. The whole point of a method would be that you could change more to the estimator. Basically you're saying it's bad to separate implementation from interface. Sorry, I don't agree.
The other way to solve the unaware-metaestimator problem is to suggest that a user wrap their frozen estimator in a Pipeline before applying the external estimator. But that screams "unreadable hack".
(Andreas Mueller, continuing the comment quoted above:) And if we go the attribute route, I'm pretty sure within a year we'll come across a case where we can only make things work by having `frozen` be a property with a custom setter that does magic in the background.
I tried to catch up on the conversation feed. I still miss a lot of perspective regarding all the use cases. I really see the concerns raised by @jnothman in #9397 (comment), but I don't see a clear solution. Could we have a frozen-aware base meta-estimator and a non-frozen-aware meta-estimator which we could subclass from?
How would we then implement raising an error if a frozen estimator is passed to a non-supporting meta-estimator? It doesn't help. In terms of maintainability, #8374 remains meritorious.
The frozen estimator will have an attribute
I also like #8374. Is
@glemaitre the point is that the non-frozen-aware ones are the ones that have been written before and are now installed on people's machines, and we can't change them ;)
Of course, that was the missing point :) stupid me.
The magic is in overwriting methods. But #8374's approach is also not so different (except for attribute access and things like that) from just having something like:

```python
class FreezeWrap(BaseEstimator):
    # clone returns this unchanged

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, *args, **kwargs):
        return self

    # TODO: make this disappear if estimator lacks transform()
    def fit_transform(self, X, *args, **kwargs):
        return self.estimator.transform(X)

    # TODO: classes_, _estimator_type, etc.
```

#8374 excels in not having to reimplement these things.
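To make the idea concrete, here is a runnable variant of that sketch, with a `transform` method added so it works standalone. All names here are illustrative, not part of scikit-learn:

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.preprocessing import StandardScaler

class FreezeWrap(BaseEstimator):
    """Wrap a pre-fitted estimator so fit/fit_transform never refit it."""
    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, *args, **kwargs):
        return self                                  # no-op

    def fit_transform(self, X, *args, **kwargs):
        return self.estimator.transform(X)           # skip refitting

    def transform(self, X):
        return self.estimator.transform(X)

X_old = np.array([[0.0], [10.0]])                    # mean 5, std 5
frozen = FreezeWrap(StandardScaler().fit(X_old))

# Even fit_transform on new data reuses the X_old statistics:
out = frozen.fit_transform(np.array([[5.0], [7.0]]))
print(out.ravel())   # [0.  0.4]
```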
Basically, I am strongly inclined to close this implementation, because of the maintenance risks involved. If we want this feature, it will be through something like #8374 or my previous comment, rather than through meta-estimators.
Maybe this solution just needs to help metaestimator implementations support freezing by creating a common test. This would involve replacing any estimator arguments with a frozen version, and asserting that its fit is not called. Common testing for metaestimators isn't really something that's been possible before.
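Such a common test might look roughly like this (all names hypothetical; a real version would iterate over every estimator-valued parameter of the meta-estimator):

```python
class FitSpy:
    """Frozen stand-in estimator that records whether fit was called."""
    def __init__(self):
        self.frozen = True
        self.fit_calls = 0
    def fit(self, X, y=None):
        self.fit_calls += 1
        return self
    def transform(self, X):
        return X

def check_respects_freezing(make_metaestimator):
    """Common-test sketch: a freezing-aware meta-estimator must never
    call fit on a frozen sub-estimator."""
    spy = FitSpy()
    meta = make_metaestimator(spy)
    meta.fit([[0.0]], [0])
    assert spy.fit_calls == 0, "meta-estimator refit a frozen estimator"

# A toy compliant meta-estimator passes the check:
class PassThrough:
    def __init__(self, estimator):
        self.estimator = estimator
    def fit(self, X, y=None):
        if not getattr(self.estimator, 'frozen', False):
            self.estimator.fit(X, y)
        return self

check_respects_freezing(PassThrough)
print("check passed")
```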
Resolves #8370
Resolves #8374

* `est.frozen = True` results in `clone(est) is est`
* `est.frozen = True` means `est` is not refit by some meta-estimators