
WIP/RFC Targettransformer #8988


Closed
wants to merge 3 commits into from

Conversation

amueller
Member

@amueller amueller commented Jun 5, 2017

Implements #8678.

No tests yet.

1.0
>>> tt.estimator_.coef_
array([ 2.])
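The snippet above shows the wrapped estimator recovering a coefficient of 2 after the target transform. The PR's actual implementation is not shown here, but the core idea can be sketched roughly as follows (the class name `SimpleTargetTransformer` and its body are a hypothetical simplification for illustration; `func`/`inverse_func` follow the naming discussed below):

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.linear_model import LinearRegression

class SimpleTargetTransformer(BaseEstimator):
    """Hypothetical sketch: fit the wrapped estimator on func(y),
    predict by applying inverse_func to its predictions."""

    def __init__(self, estimator, func, inverse_func):
        self.estimator = estimator
        self.func = func
        self.inverse_func = inverse_func

    def fit(self, X, y):
        # Train the inner estimator in the transformed ("inner") domain.
        self.estimator_ = clone(self.estimator)
        self.estimator_.fit(X, self.func(y))
        return self

    def predict(self, X):
        # Map predictions back to the original ("outer") domain.
        return self.inverse_func(self.estimator_.predict(X))

# y = exp(2x) becomes exactly linear after a log transform of the target.
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.exp(2 * X.ravel())
tt = SimpleTargetTransformer(LinearRegression(), np.log, np.exp)
tt.fit(X, y)
print(tt.estimator_.coef_)  # close to array([2.])
```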

Member

Remove line

estimator : object
Estimator object to wrap.

func : function
Member

Are there cases in which it is useful to have a transformer here, i.e. one of:

  • The transformation involves statistics from the training portion of y
  • The transformation is parameterised and those parameters should be searchable

Member Author

@amueller amueller Jun 6, 2017

The second one can be done, though it's ugly, by searching over functions. Do you have a use-case for the first?

Member

I like the second bullet; the first doesn't seem pipeline-friendly but I guess could be useful in custom code

Member

Well LabelBinarizer without specifying the set of labels is an example of learnt state... Centring or Box Cox for regression?

Member

Yeah I completely agree, I misread the first bullet earlier re:training portion.

Come to think of it, centering / standardization in linear models is also a good example of something that could be expressed under this form, and also confirms @amueller's point that we care about the score after inverse-transforming.
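The centering/standardization case can be made concrete: the target transform has learned state (mean and scale estimated from the training targets), and we ultimately care about error after inverse-transforming back to the original units. A minimal sketch (synthetic data, not from the PR):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Standardizing y uses statistics learned from the training targets,
# so it is a stateful transform, not a fixed function.
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = 50.0 + 10.0 * X[:, 0] + rng.randn(100)

mean_, std_ = y.mean(), y.std()            # "fit" on the training y
reg = Ridge().fit(X, (y - mean_) / std_)   # train in the inner domain
pred = reg.predict(X) * std_ + mean_       # inverse-transform predictions
print(np.abs(pred - y).mean())             # error in the original units
```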

@amueller
Member Author

amueller commented Jun 6, 2017

with #8022 I wouldn't need to add the estimator to donttest ^^

"TargetTransformer only implements a score method if "
"estimator is a classifier or regressor, but " + err)


Member

So you are computing the score in the "outer" domain, not in the "inner" domain that the estimator_ is fit.

An alternative would be: return self.estimator_.score(X, self.func(y)) which doesn't need error checking but returns scores in the "inner" domain, where the estimator is truly fit.

I would prefer this alternative. I guess it depends on the use cases this transformer is designed for?
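The two scoring conventions under discussion can be contrasted on a toy problem (this is an illustration of the trade-off, not code from the PR):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.RandomState(0)
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.exp(X.ravel() + 0.1 * rng.randn(10))   # targets in the "outer" domain
func, inverse_func = np.log, np.exp

est = LinearRegression().fit(X, func(y))      # fit in the "inner" domain

# Score in the inner domain, where the estimator was actually fit:
inner_score = est.score(X, func(y))
# Score in the outer domain, after inverse-transforming predictions:
outer_score = r2_score(y, inverse_func(est.predict(X)))
print(inner_score, outer_score)
```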

@vene
Member

vene commented Jun 6, 2017

The name is confusing: this is called TargetTransformer, but it's not a transformer: it's a meta-classifier. (Is it intended to be a meta-estimator that could wrap transformers as well?)

@jnothman
Member

jnothman commented Jun 6, 2017 via email

@vene
Member

vene commented Jun 6, 2017

MovingTargetClassifier :P (definitely not serious)

More seriously: FunctionRetargeter? (along the lines of FunctionTransformer)

@amueller
Member Author

amueller commented Jun 6, 2017

I think BoxCox is a good example. How would we best support that?
Do the simple case first, or try to get it right from the start?

TargetFunction? TransformTarget?

@jnothman
Member

jnothman commented Jun 6, 2017 via email

@amueller
Member Author

amueller commented Jun 6, 2017

I would have base_estimator as the first parameter, as I think usually it is.

We can have it be a transformer or a tuple? Though I'd actually prefer kwargs to be more explicit, either specifying transformer or func and inverse_func?

@vene
Member

vene commented Jun 6, 2017

I prefer the explicit API with kwargs to the tuple.

@vene
Member

vene commented Jun 6, 2017

quoth @ogrisel on API discussion

(transformer=None, func=None, inverse_func=None)

If all are None, this just applies the identity function.
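How the proposed keyword arguments might be resolved can be sketched like this (the helper `_resolve_target_transform` is hypothetical, written only to illustrate the mutual exclusivity and the identity default; it is not the PR's code):

```python
def _resolve_target_transform(transformer=None, func=None, inverse_func=None):
    """Hypothetical resolution of the proposed keyword arguments."""
    if transformer is not None:
        if func is not None or inverse_func is not None:
            raise ValueError("'transformer' and 'func'/'inverse_func' "
                             "are mutually exclusive")
        # Delegate both directions to the transformer object.
        return transformer.transform, transformer.inverse_transform
    identity = lambda y: y
    # If everything is None, fall back to the identity in both directions.
    return func or identity, inverse_func or identity

f, inv = _resolve_target_transform()
print(f(3.0), inv(3.0))  # prints 3.0 3.0
```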

@vene
Member

vene commented Jun 6, 2017

TransformedTargetRegressor?

@glemaitre
Member

FYI I'll make some changes and open the PR against Andy's branch

@amueller
Member Author

amueller commented Jun 6, 2017

@glemaitre? You already have it open, right?

@glemaitre
Member

This is what I have for tonight here

@amueller @jnothman I got stuck on the fit of the transformer.
With FunctionTransformer, we can toggle check_array, which is really useful.
However, in the case of BoxCox there is no such option, which means that y would have to be a 2d array to pass through fit.

What did you have in mind to get around this problem?
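The shape mismatch can be seen directly: scipy's `boxcox` only accepts 1-d data, while transformer-style `fit`/`transform` (and `check_array`) conventionally works on 2-d arrays, so a meta-estimator would have to reshape y around the transform. A small sketch of the friction (illustrative, not from the PR):

```python
import numpy as np
from scipy.stats import boxcox

rng = np.random.RandomState(0)
y = rng.lognormal(size=20)          # strictly positive 1-d targets

# scipy's boxcox only accepts 1-d input; it returns the transformed
# data and the estimated lambda when lmbda is not given.
y_bc, lmbda = boxcox(y)

# Transformer-style APIs conventionally expect 2-d arrays, so the
# meta-estimator would have to ravel before and reshape after:
y_2d = y.reshape(-1, 1)             # what check_array would require
y_bc_2d = boxcox(y_2d.ravel())[0].reshape(-1, 1)
print(y_bc_2d.shape)                # prints (20, 1)
```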

@jnothman
Member

jnothman commented Jun 7, 2017 via email

@glemaitre glemaitre mentioned this pull request Jun 7, 2017
3 tasks
@jnothman
Member

Closed by #9041

@jnothman jnothman closed this Feb 13, 2018