Description
Describe the bug
To use the new `feature_names_in_` attribute of RandomForestClassifier, added in scikit-learn 1.0, you need to fit the model on x_train as a DataFrame, because a DataFrame is the convenient way to carry feature names in its column headers. But when I fit the model, the following warning is raised:
UserWarning: X does not have valid feature names, but RandomForestClassifier was fitted with feature names
warnings.warn(
(The unclosed parenthesis in the warning is exactly what I get from Jupyter Notebook.)
I know I can fit the model on `x_train.values` to avoid this warning, but if x_train then contains only the numeric data, what is the point of having the `feature_names_in_` attribute in version 1.0?
Steps/Code to Reproduce
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split  # missing from the original snippet
# The first line of 'train_modified.csv' holds the feature names; they are just `a b c d`.
data = pd.read_csv(r'F:\train_modified.csv')
target = 'Disbursed'
x_columns = [x for x in data.columns if x != target]
X = data[x_columns]
y = data[target]
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=10)
X = x_train
y = y_train
# Calling fit on this DataFrame X raises the warning.
rf0 = RandomForestClassifier(oob_score=True, random_state=10)
rf0.fit(X, y)
print(rf0.oob_score_)
Expected Results
No warning, because the newly added `feature_names_in_` attribute only requires that x_train carry its feature names.
Actual Results
UserWarning: X does not have valid feature names, but RandomForestClassifier was fitted with feature names
warnings.warn(
Versions
System:
python: 3.8.11 (default, Aug 6 2021, 09:57:55) [MSC v.1916 64 bit (AMD64)]
executable: E:\Anaconda3\python.exe
machine: Windows-10-10.0.18363-SP0
Python dependencies:
pip: 21.3.1
setuptools: 58.0.4
sklearn: 1.0.1
numpy: 1.19.2
scipy: 1.7.1
Cython: 0.29.24
pandas: 1.3.2
matplotlib: 3.4.2
joblib: 1.0.1
threadpoolctl: 2.2.0
Built with OpenMP: True
Process finished with exit code 0