Description
Describe the workflow you want to enable
As noted in #27037, handling the index of an input container can be hairy. The solution implemented in #27044 works, but it does not cover pandas.Series inputs. I'd like to modify the logic in PandasAdapter.create_container so that it checks whether X_original is a pandas.DataFrame or a pandas.Series. This would allow transformers that accept 1-dimensional inputs and output 2-dimensional dataframes to preserve the index of their input.
Describe your proposed solution
I'd like to change line 124 from this:
elif isinstance(X_original, pd.DataFrame):
To this:
elif isinstance(X_original, (pd.DataFrame, pd.Series)):
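To illustrate the effect of the widened check, here is a minimal sketch. It is not the scikit-learn source: resolve_index is a hypothetical helper that only mirrors the branch being discussed, and the sample data is made up.

```python
import pandas as pd


def resolve_index(X_original):
    """Hypothetical helper: choose the index for the output container.

    Mirrors only the proposed branch; it is not the real
    PandasAdapter.create_container.
    """
    # Proposed check: accept a Series as well as a DataFrame.
    if isinstance(X_original, (pd.DataFrame, pd.Series)):
        return X_original.index
    # Anything else (lists, arrays) carries no usable index.
    return None


series = pd.Series(["a", "b", "c"], index=[10, 20, 30])
frame = pd.DataFrame({"x": [1, 2, 3]}, index=[10, 20, 30])

series_index = resolve_index(series)
frame_index = resolve_index(frame)
list_index = resolve_index([[1], [2], [3]])
```

With the original DataFrame-only check, the Series case would fall through to the no-index branch, which is the behavior this issue is about.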
Describe alternatives you've considered, if relevant
User sets the index on their own:
some_series = pd.Series(...)
trf = SomeTransformer().set_output(transform="pandas")
out_frame = trf.fit_transform(some_series).set_index(some_series.index)
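A self-contained sketch of that workaround, using pandas only. toy_transform is a hypothetical stand-in for a 1-dimensional to 2-dimensional transformer, and the data is invented for illustration.

```python
import pandas as pd


def toy_transform(values):
    """Hypothetical 1-D -> 2-D transform that drops the input index."""
    raw = list(values)  # iterating a Series yields its values, not its index
    return pd.DataFrame({"value": raw, "length": [len(v) for v in raw]})


some_series = pd.Series(["ab", "cde", "f"], index=["r1", "r2", "r3"])

# The output frame gets a default RangeIndex.
out_frame = toy_transform(some_series)

# Workaround: the caller reattaches the original index by hand.
restored = out_frame.set_index(some_series.index)
```

This works, but every caller has to remember the extra set_index step, which is what the proposed change would make unnecessary.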
Additional context
I recognize that most transformers in scikit-learn expect 2-dimensional inputs. But some packages that depend on scikit-learn (like mlxtend) have transformers that turn 1-dimensional input into 2-dimensional output. I believe this change would greatly benefit them. See the newly updated TransactionEncoder for an example.
I'm willing to submit a PR if this is an acceptable enhancement.