Categorical feature in Tree-based classifiers #5442

@saj1919

Hi,
How are categorical features handled in sklearn?
I have used the R version, which handles strings directly. AN6U5 also says that tree classifiers can handle categorical features (link: http://datascience.stackexchange.com/questions/5226/strings-as-features-in-decision-tree-random-forest).

But when I tried data containing city names and state names, I could not pass them in directly. I had to use LabelEncoder so that RandomForest in sklearn could handle them. Now the categorical data are just numbers, so how can RandomForest tell whether a given column is categorical or not?
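
For reference, the workaround described above can be sketched roughly as follows. The toy data (city names, state codes, a numeric `population` column, and the labels `y`) is made up purely for illustration and is not my real dataset; the variable names are likewise hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

# Made-up toy data: two string columns and one numeric column.
cities = ["Pune", "Mumbai", "Austin", "Dallas", "Pune", "Austin"]
states = ["MH", "MH", "TX", "TX", "MH", "TX"]
population = [3.1, 12.4, 0.9, 1.3, 3.1, 0.9]
y = [0, 0, 1, 1, 0, 1]

# LabelEncoder maps each distinct string to an integer code
# (codes follow the sorted order of the distinct values).
city_codes = LabelEncoder().fit_transform(cities)
state_codes = LabelEncoder().fit_transform(states)

# After encoding, every column is just a number; nothing in X marks
# the first two columns as categorical rather than numeric.
X = np.column_stack([city_codes, state_codes, population])

clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, y)
```

Here `X` ends up as a plain floating-point array, so as far as I can tell the forest only ever sees numbers in all three columns.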

So my question is: how does it handle numeric and categorical data at the same time? Or does it not handle them at all?