ColumnTransformer give "TypeError: invalid type promotion" #20090

Closed
@princyok

Versions

sklearn 0.23.2

Description

Instantiating ColumnTransformer with the remainder argument set to "passthrough" produces a TypeError under certain circumstances. I narrowed down one such circumstance.

The error occurs when exactly n - 1 columns are transformed (where n is the total number of columns) and the one column that gets passed through (i.e., not transformed) has a dtype that cannot be converted to that of the other columns. The root cause is that sklearn tries to combine the arrays with numpy.hstack and fails.
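The failure can be shown with NumPy alone, without sklearn. The sketch below uses two stand-in arrays (`scaled` and `passthrough` are illustrative names, not sklearn internals) and a small hypothetical helper, `try_hstack`, that returns the exception instead of raising it. The exact error message depends on the NumPy version, but it should be a TypeError in either case.

```python
import numpy as np

# Stand-ins for the two blocks that have to be combined:
# the scaled float columns and the single passed-through datetime column.
scaled = np.random.default_rng(0).random((20, 3))                     # float64
passthrough = np.full((20, 1), np.datetime64("2021-05-14T12:00:00"))  # datetime64


def try_hstack(blocks):
    """Hypothetical helper: return the stacked array, or the TypeError raised."""
    try:
        return np.hstack(blocks)
    except TypeError as exc:
        return exc


result = try_hstack([scaled, passthrough])
# float64 and datetime64 have no common dtype, so hstack cannot promote them.
print(type(result).__name__, result)
```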

Code to Reproduce

import datetime

import numpy as np
import pandas as pd
from sklearn import compose, preprocessing

# Three float columns plus one datetime column.
prng = np.random.default_rng()
d = pd.DataFrame(prng.random((20, 3)), columns=["aaa", "bbb", "ccc"])
d["time"] = datetime.datetime.now()

# Scale all three float columns, leaving only "time" to pass through.
columns = ["aaa", "bbb", "ccc"]
t = compose.ColumnTransformer(
    [("stnd", preprocessing.StandardScaler(), columns)],
    remainder="passthrough",
)
t.fit_transform(d)  # raises TypeError: invalid type promotion

The above code produces "TypeError: invalid type promotion". The dtype of the "time" column is datetime and that of the others is float. As described above, letting only the "time" column pass through causes hstack to fail when it tries to concatenate the arrays. If you change columns to columns = ["aaa", "bbb"], it works as expected. Changing the remainder argument to remainder="drop" also works.
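One possible explanation for why passing through two columns works is sketched below. It assumes the passed-through block ends up as a NumPy array in the same way `DataFrame.to_numpy()` would produce one; that is an assumption about the conversion, not a confirmed description of sklearn's internals. A single datetime column keeps its datetime64 dtype, while a mixed float/datetime selection falls back to object, which hstack can combine with floats.

```python
import datetime

import numpy as np
import pandas as pd

d = pd.DataFrame(
    np.random.default_rng(0).random((20, 3)), columns=["aaa", "bbb", "ccc"]
)
d["time"] = datetime.datetime.now()

# Only "time" left over: the array keeps a datetime64 dtype.
one_col = d[["time"]].to_numpy()
# "ccc" and "time" left over: mixed dtypes fall back to object.
two_cols = d[["ccc", "time"]].to_numpy()
print(one_col.dtype, two_cols.dtype)

# An object block can be stacked with a float block; the result is object.
stacked = np.hstack([d[["aaa", "bbb"]].to_numpy(), two_cols])
print(stacked.dtype, stacked.shape)
```

Note that the stacked result is an object array, so downstream estimators that expect numeric input may still need the datetime column converted or dropped.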
