LabelBinarizer and LabelEncoder fit and transform signatures not compatible with Pipeline #3112

Closed
@hxu

Description

I get this error when I try to use LabelBinarizer and LabelEncoder in a Pipeline:

sklearn/pipeline.pyc in fit_transform(self, X, y, **fit_params)
    141         Xt, fit_params = self._pre_transform(X, y, **fit_params)
    142         if hasattr(self.steps[-1][-1], 'fit_transform'):
--> 143             return self.steps[-1][-1].fit_transform(Xt, y, **fit_params)
    144         else:
    145             return self.steps[-1][-1].fit(Xt, y, **fit_params).transform(Xt)

TypeError: fit_transform() takes exactly 2 arguments (3 given)

It seems like this is because these classes' fit and transform signatures differ from those of most other estimators: they accept only a single argument, while Pipeline passes both X and y.
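The mismatch can be checked by inspecting the signatures directly. This is a minimal sketch, assuming a scikit-learn version in which the label transformers still take a single `y` argument; `StandardScaler` is used here only as a representative "ordinary" transformer and is not mentioned in the report above.

```python
import inspect

from sklearn.preprocessing import LabelBinarizer, LabelEncoder, StandardScaler

# Parameter names of fit_transform for the two label transformers
# and for a typical feature transformer (chosen for comparison only).
binarizer_params = list(inspect.signature(LabelBinarizer.fit_transform).parameters)
encoder_params = list(inspect.signature(LabelEncoder.fit_transform).parameters)
scaler_params = list(inspect.signature(StandardScaler.fit_transform).parameters)

print("LabelBinarizer:", binarizer_params)
print("LabelEncoder:  ", encoder_params)
print("StandardScaler:", scaler_params)
```

Since Pipeline calls `fit_transform(Xt, y, **fit_params)` on its last step, a method that takes only `self` and `y` receives one positional argument too many, which matches the TypeError in the traceback.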

I think this is a pretty easy fix (just change the signatures to something like def fit(self, X, y=None)) that I'd be happy to send a pull request for, but I wanted to check first whether there are other reasons for the current signatures that I haven't thought of.
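Until the signatures change, one possible workaround is a thin adapter that exposes the `(X, y=None)` signature and delegates to `LabelBinarizer`. This is only a sketch of the idea, not the fix proposed for the library itself; the class name `PipelineLabelBinarizer` and the attribute `binarizer_` are hypothetical names introduced for illustration.

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelBinarizer


class PipelineLabelBinarizer(BaseEstimator, TransformerMixin):
    """Hypothetical adapter: LabelBinarizer with a Pipeline-style signature."""

    def fit(self, X, y=None):
        # The values to binarize arrive as X; y is accepted and ignored.
        self.binarizer_ = LabelBinarizer().fit(X)
        return self

    def transform(self, X):
        return self.binarizer_.transform(X)


# fit_transform comes from TransformerMixin, which accepts (X, y=None).
pipe = Pipeline([("binarize", PipelineLabelBinarizer())])
out = pipe.fit_transform(["cat", "dog", "bird", "cat"])
print(out)
```

The adapter treats the pipeline input `X` as the labels to encode, which fits the use case of binarizing a categorical column rather than a target vector.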
