ColumnTransformer behavior for negative column indexes #12946

@albertcthomas

Description

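It is not clear how ColumnTransformer should behave when negative integers are passed as column indexes. In the example below, the column selected with -1 is one-hot encoded but is also kept by remainder='passthrough'.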

Steps/Code to Reproduce

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

X = np.random.randn(2, 2)
X_categories = np.array([[1], [2]])
X = np.concatenate([X, X_categories], axis=1)

print('---- With negative index ----')
ohe = OneHotEncoder(categories='auto')
tf_1 = ColumnTransformer([('ohe', ohe, [-1])], remainder='passthrough')
print(tf_1.fit_transform(X))

print('---- With positive index ----')
tf_2 = ColumnTransformer([('ohe', ohe, [2])], remainder='passthrough')
print(tf_2.fit_transform(X))

Expected Results

The first transformer, tf_1, should either raise an error or give the same result as the second transformer, tf_2.

Actual Results

---- With negative index ----
[[ 1.          0.          0.10600662 -0.46707426  1.        ]
 [ 0.          1.         -1.33177629  2.29186299  2.        ]]
---- With positive index ----
[[ 1.          0.          0.10600662 -0.46707426]
 [ 0.          1.         -1.33177629  2.29186299]]
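Until the behavior is settled, one possible workaround is to convert negative indexes to their non-negative equivalents before building the ColumnTransformer. The sketch below is only an illustration: `normalize_column_indexes` is a hypothetical helper, not part of scikit-learn, and it assumes the columns are given as a list of plain integers.

```python
def normalize_column_indexes(columns, n_features):
    """Map negative integer column indexes to non-negative ones.

    Hypothetical helper (not a scikit-learn function). Raises IndexError
    for indexes that fall outside [-n_features, n_features).
    """
    normalized = []
    for col in columns:
        if not -n_features <= col < n_features:
            raise IndexError(
                f"column index {col} is out of bounds for {n_features} features"
            )
        # Python's modulo maps -1 to n_features - 1, -2 to n_features - 2, ...
        normalized.append(col % n_features)
    return normalized


# Possible usage with the reproduction script above (X has 3 columns):
# cols = normalize_column_indexes([-1], X.shape[1])
# tf_1 = ColumnTransformer([('ohe', ohe, cols)], remainder='passthrough')
```

With the index normalized this way, tf_1 is built from the same column list as tf_2, so both should select and drop the same column.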

Metadata

Assignees

No one assigned

    Labels

    Bug
    Easy: Well-defined and straightforward way to resolve
