ENH: Add feature_names_ property to PolynomialFeatures #6216

Closed
wants to merge 1 commit into from

maniteja123
Contributor

Added a feature_names_ property to PolynomialFeatures in preprocessing; see issue #6185. I am not sure this is the expected solution, so please let me know if anything else needs to be done. Thanks.

@aniryou

aniryou commented Jan 28, 2016

There are multiple features, and your approach doesn't take that into consideration.

For polynomial with degree 2, 3 features, we need:
['bias', 'X1', 'X2', 'X3', 'X1^2', 'X1*X2', 'X1*X3', 'X2^2', 'X2*X3', 'X3^2']

I generated it with:

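@property
def feature_names_(self):
    """Names of the output features, one per row of ``powers_``."""
    check_is_fitted(self, 'n_input_features_')

    def pstr(p):
        # All-zero exponents correspond to the constant (bias) column.
        if np.count_nonzero(p) == 0:
            return 'bias'
        # Indices of the inputs that appear in this term, and their exponents.
        idx = np.nonzero(p)[0]
        exps = p[idx]
        vstrs = ['X' + str(i + 1) for i in idx]
        # Omit the exponent when it is 1, so X1^1 is written as X1.
        estrs = ['^' + str(e) if e > 1 else '' for e in exps]
        return '*'.join(v + e for v, e in zip(vstrs, estrs))

    return [pstr(power) for power in self.powers_]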
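
For anyone who wants to check the expected list without a fitted estimator, here is a minimal standalone sketch of the same naming scheme. It assumes the exponents are enumerated with combinations with replacement, degree by degree, as the docstring in the diff describes; `polynomial_powers` and `power_to_name` are hypothetical helper names and are not part of scikit-learn.

```python
from itertools import combinations_with_replacement


def polynomial_powers(n_features, degree):
    """Yield one exponent tuple per output feature, constant term first.

    Assumption: terms are ordered by degree, then by combinations with
    replacement of the input feature indices.
    """
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(n_features), d):
            # Count how often each input index occurs in this combination.
            yield tuple(combo.count(i) for i in range(n_features))


def power_to_name(power, prefix="X"):
    """Format one exponent tuple, e.g. (1, 0, 2) -> 'X1*X3^2'."""
    terms = [
        f"{prefix}{i + 1}" + (f"^{e}" if e > 1 else "")
        for i, e in enumerate(power)
        if e > 0
    ]
    # An all-zero exponent tuple is the constant (bias) column.
    return "*".join(terms) if terms else "bias"


names = [power_to_name(p) for p in polynomial_powers(n_features=3, degree=2)]
print(names)
```

For 3 features and degree 2 this reproduces the ten names listed above, in the same order.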

@maniteja123
Contributor Author

Thanks for clarifying. So the property expected is the output feature mapping in terms of the polynomials of input features. Will do that and let you know.

@maniteja123
Contributor Author

Hi everyone, the suggestion by @aniryou seems to solve the use case here. I don't think I have enough experience in this domain to decide on the best approach, so please let me know whether I should go ahead with that idea. I will proceed according to the consensus reached here. Thanks.

@@ -1151,6 +1151,10 @@ class PolynomialFeatures(BaseEstimator, TransformerMixin):
features is computed by iterating over all suitably sized combinations
of input features.

feature_names_ : list, shape [n_input_features_]
Represents the names of input features
It is of the form ``['X1', 'X2', 'X3'...]``
Member

They should be lowercase; uppercase indicates matrices. Also, these are the output features, right?

@jakevdp
Member

jakevdp commented Feb 16, 2016

I think @aniryou's version is the better one to use. I also agree that this should be a get_feature_names method, since that name is already used in a couple of other places in the package and the pipeline code expects it on a transformer.
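
A rough sketch of what the method form might look like, combining the review feedback in this thread (lowercase names, a method rather than a property). `PolyNamesSketch` is a hypothetical stand-in class, not scikit-learn's PolynomialFeatures, and the optional `input_features` argument is an assumption made for illustration only.

```python
from itertools import combinations_with_replacement


class PolyNamesSketch:
    """Hypothetical stand-in used only to illustrate get_feature_names."""

    def __init__(self, degree=2):
        self.degree = degree

    def fit(self, X):
        # Only the number of input columns matters for naming.
        self.n_input_features_ = len(X[0])
        return self

    def get_feature_names(self, input_features=None):
        """Return one name per output feature, constant term first."""
        if not hasattr(self, "n_input_features_"):
            raise ValueError("Call fit before get_feature_names.")
        n = self.n_input_features_
        if input_features is None:
            # Lowercase defaults, since uppercase suggests matrices.
            input_features = [f"x{i + 1}" for i in range(n)]
        if len(input_features) != n:
            raise ValueError("input_features must have one name per column.")

        names = []
        for d in range(self.degree + 1):
            for combo in combinations_with_replacement(range(n), d):
                terms = []
                for i in range(n):
                    e = combo.count(i)  # exponent of input i in this term
                    if e == 1:
                        terms.append(input_features[i])
                    elif e > 1:
                        terms.append(f"{input_features[i]}^{e}")
                names.append("*".join(terms) if terms else "bias")
        return names


sketch = PolyNamesSketch(degree=2).fit([[0.0, 0.0]])
default_names = sketch.get_feature_names()
custom_names = sketch.get_feature_names(["a", "b"])
print(default_names)
print(custom_names)
```

Making this a method rather than a property leaves room for the caller to pass their own column names, which a property cannot do.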

@maniteja123
Contributor Author

Thanks for clarifying. It would indeed be better if someone experienced completes it. Sorry I didn't get to do the right thing here. Closing this as it is superseded by #6372.


4 participants