WIP/RFC Targettransformer #8988
Conversation
1.0
>>> tt.estimator_.coef_
array([ 2.])
Remove line
estimator : object
    Estimator object to wrap.

func : function
Are there cases in which it is useful to have a transformer here, i.e. one of:
- The transformation involves statistics from the training portion of y
- The transformation is parameterised and those parameters should be searchable
The second one can be done, but only in an ugly way, by searching over functions. Do you have a use case for the first?
I like the second bullet; the first doesn't seem pipeline-friendly, but I guess it could be useful in custom code.
Well, LabelBinarizer without specifying the set of labels is an example of learnt state... Centring or Box-Cox for regression?
Yeah, I completely agree; I misread the first bullet earlier re: training portion.
Come to think of it, centering / standardization in linear models is also a good example of something that could be expressed under this form, and also confirms @amueller's point that we care about the score after inverse-transforming.
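To make the learnt-state example concrete, here is a minimal hand-rolled sketch (not the PR's code): the centring offset for y is a statistic of the training targets only, and the score is computed after inverse-transforming the predictions back to the original domain.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = 50.0 + X @ np.array([1.0, 2.0, 3.0]) + rng.randn(100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Learnt state": the centring offset is estimated on the training targets only.
y_mean = y_train.mean()
reg = Ridge().fit(X_train, y_train - y_mean)

# Predictions are inverse-transformed (offset added back) before scoring,
# so the score lives in the original target domain.
y_pred = reg.predict(X_test) + y_mean
print(r2_score(y_test, y_pred))
```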
with #8022 I wouldn't need to add the estimator to donttest ^^
"TargetTransformer only implements a score method if " | ||
"estimator is a classifier or regressor, but " + err) | ||
|
||
|
So you are computing the score in the "outer" domain, not in the "inner" domain in which estimator_ is fit.
An alternative would be: return self.estimator_.score(X, self.func(y)), which doesn't need error checking but returns scores in the "inner" domain, where the estimator is truly fit.
I would prefer this alternative. I guess it depends on the use cases this transformer is designed for?
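For comparison, a standalone sketch of the two alternatives (regressor case only; the helper names are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score


def score_outer(estimator, inverse_func, X, y):
    # "Outer" domain: inverse-transform the predictions, compare to the untouched y.
    return r2_score(y, inverse_func(estimator.predict(X)))


def score_inner(estimator, func, X, y):
    # "Inner" domain: transform y and reuse the wrapped estimator's own score().
    return estimator.score(X, func(y))


rng = np.random.RandomState(0)
X = np.linspace(1, 10, 50).reshape(-1, 1)
y = np.exp(0.5 * X.ravel() + rng.normal(scale=0.3, size=50))
est = LinearRegression().fit(X, np.log(y))  # the estimator is fit on log-targets

print(score_outer(est, np.exp, X, y))  # R^2 against the original targets
print(score_inner(est, np.log, X, y))  # R^2 against the log-targets
```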
The name is confusing: this is called TargetTransformer but it's not a transformer: it's a metaclassifier. (Is it intended to be a metaestimator that could wrap transformers as well?)
Call it Retarget? Not sure I'm serious.
MovingTargetClassifier :P (definitely not serious). More seriously: FunctionRetargeter? (along the lines of FunctionTransformer)
I think BoxCox is a good example. How would we support that best? Do the simple case first or try to get it right? TargetFunction? TransformTarget?
Well, I could imagine
TransformedY(BoxCox(), SVR())
Would we then want a shorthand for the common case you now handle, e.g.:
TransformedY(FunctionTransformer(np.log, np.exp), SVR())?
1d y would automatically be converted to and from 2d in fitting and transforming.
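A rough sketch of what such a TransformedY wrapper might look like (only the name and argument order come from the comment above; the reshaping behaviour and everything else are assumptions, with StandardScaler standing in for a Box-Cox transformer):

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler


class TransformedY(BaseEstimator, RegressorMixin):
    def __init__(self, transformer, estimator):
        self.transformer = transformer
        self.estimator = estimator

    def fit(self, X, y):
        # 1d y is reshaped to 2d for the transformer, then ravelled back
        # before fitting the wrapped estimator.
        self.transformer_ = clone(self.transformer)
        y_2d = np.asarray(y).reshape(-1, 1)
        y_trans = self.transformer_.fit_transform(y_2d).ravel()
        self.estimator_ = clone(self.estimator).fit(X, y_trans)
        return self

    def predict(self, X):
        # Predictions are mapped back to the original target domain.
        y_pred = self.estimator_.predict(X).reshape(-1, 1)
        return self.transformer_.inverse_transform(y_pred).ravel()


rng = np.random.RandomState(0)
X = rng.randn(50, 2)
y = 100 + 10 * X[:, 0] - 5 * X[:, 1]
model = TransformedY(StandardScaler(), Ridge()).fit(X, y)
print(model.predict(X[:5]))
```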
I would have base_estimator as the first parameter, as I think it usually is. We can have it be a transformer or a tuple? Though I'd actually prefer kwargs to be more explicit, either specifying …
I prefer the explicit API with kwargs to the tuple.
Quoth @ogrisel on the API discussion:
If all are None, this just does an identity function.
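As a sketch of that identity default (the parameter names func / inverse_func are an assumption here, not the agreed API):

```python
import numpy as np


def _identity(y):
    # Used when no transformation is specified.
    return y


def _resolve_funcs(func=None, inverse_func=None):
    # Fall back to the identity when neither function is given.
    return (func if func is not None else _identity,
            inverse_func if inverse_func is not None else _identity)


func, inverse_func = _resolve_funcs()            # all None -> identity
assert np.array_equal(func(np.arange(3)), np.arange(3))

func, inverse_func = _resolve_funcs(np.log, np.exp)
assert np.allclose(inverse_func(func(2.0)), 2.0)
```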
FYI I'll make some changes and drop the PR on Andy's branch.
@glemaitre? You already have it open, right?
Here is what I have for tonight: https://github.com/amueller/scikit-learn/pull/30/files @amueller @jnothman I got stuck with the fit of the transformer. With the FunctionTransformer, we can turn check_array on or off, which is really useful. However, in the case of BoxCox there is nothing that allows this, which means y has to be a 2d array to pass through fit. How did you intend to get around this problem?
If a transformer is provided, we'd need to assume that it requires a 2d array. We would need to reshape at fit, and ravel (and check n_samples is maintained) at transform. I can see problems in that, such as that existing label transformers would not work where they require 1d.
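A rough sketch of the reshape-at-fit / ravel-afterwards idea (the helper name and the n_samples check are assumptions; StandardScaler stands in for a transformer that requires 2d input):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def fit_transform_target(transformer, y):
    # Fit a transformer that requires 2d input on a possibly 1d y,
    # returning a result with y's original dimensionality.
    y = np.asarray(y)
    y_2d = y.reshape(-1, 1) if y.ndim == 1 else y
    y_trans = transformer.fit_transform(y_2d)
    # The transformer must not change the number of samples.
    if y_trans.shape[0] != y.shape[0]:
        raise ValueError("Transformer changed the number of samples in y.")
    return y_trans.ravel() if y.ndim == 1 else y_trans


y = np.array([1.0, 2.0, 3.0, 4.0])
print(fit_transform_target(StandardScaler(), y))  # 1d in, 1d out
```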
Closed by #9041
Implements #8678.
No tests yet.