Closed
Description
Describe the bug
When a dictionary with unequal weights is supplied to the keyword argument class_weight and sample_weight is also specified, the array passed as sample_weight is modified in place by the call to fit.
Steps/Code to Reproduce
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import numpy as np
X, y = load_iris(return_X_y=True)
np.random.seed(1234)
W = np.random.random(len(X)) * 10.0
print('Sum of weight before: {}'.format(W.sum()))
# fit model
clf = LogisticRegression(random_state=0,
                         class_weight={0: 1.0, 1: 10.0, 2: 1.0},
                         max_iter=200)
clf.fit(X, y, sample_weight=W)
print('Sum of weight after: {}'.format(W.sum()))
Produces results:
Sum of weight before: 761.8436984163643
Sum of weight after: 3075.667075252882
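The jump in the sum is consistent with the class weights being multiplied into the caller's array in place: only the samples of class 1 (weight 10.0) would grow. The following is a minimal NumPy-only sketch of that suspected mechanism, not scikit-learn's actual code; the helper names combine_in_place and combine_copy and the toy data are made up for illustration.

```python
import numpy as np

# Toy data (hypothetical): two samples per class, simple weights.
y = np.array([0, 0, 1, 1, 2, 2])
class_weight = {0: 1.0, 1: 10.0, 2: 1.0}
W = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Look up the class weight that applies to each sample.
per_sample = np.array([class_weight[label] for label in y])


def combine_in_place(sample_weight, per_sample):
    # Assumed buggy pattern: "*=" writes into the caller's array.
    sample_weight *= per_sample
    return sample_weight


def combine_copy(sample_weight, per_sample):
    # Safe pattern: "*" allocates a new array and leaves the input alone.
    return sample_weight * per_sample


sum_before = W.sum()

safe = combine_copy(W, per_sample)
sum_after_copy = W.sum()        # W is untouched here

combine_in_place(W, per_sample)
sum_after_in_place = W.sum()    # W itself has now been rescaled

print(sum_before, sum_after_copy, sum_after_in_place)
```

If this is what happens inside fit, replacing the in-place multiplication with an out-of-place one (or copying sample_weight on entry) would leave W unchanged.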
Expected Results
The weight object (W) should be unaltered after the LogisticRegression.fit call.
The expected result can be achieved by deep copying the weight object:
# fit model
import copy
clf = LogisticRegression(random_state=0,
                         class_weight={0: 1.0, 1: 10.0, 2: 1.0},
                         max_iter=200)
clf.fit(X, y, sample_weight=copy.deepcopy(W))
Sum of weight before: 761.8436984163643
Sum of weight after: 761.8436984163643
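For a plain numeric NumPy array, W.copy() should be enough in place of copy.deepcopy. Marking the array read-only is also a cheap way to detect in-place writes, since NumPy raises ValueError on any attempt to write to it. The sketch below illustrates both ideas with a made-up stand-in function, scale_in_place, rather than a real estimator; it assumes the callee writes to the array directly.

```python
import numpy as np


def scale_in_place(weights, factor):
    # Hypothetical stand-in for a callee that mutates its weight argument.
    weights *= factor
    return weights


rng = np.random.default_rng(1234)
W = rng.random(150) * 10.0
before = W.copy()

# Workaround: give the callee its own copy so W stays intact.
scale_in_place(W.copy(), 10.0)
unchanged_after_copy = np.array_equal(W, before)

# Detection: a read-only array turns a silent mutation into an error.
W.setflags(write=False)
try:
    scale_in_place(W, 10.0)
    raised = False
except ValueError:
    raised = True
finally:
    W.setflags(write=True)

unchanged_after_guard = np.array_equal(W, before)
print(unchanged_after_copy, raised, unchanged_after_guard)
```

The read-only flag is only a debugging aid: it makes the offending call fail loudly instead of corrupting the weights, so it is not a substitute for passing a copy.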
Actual Results
The weight object (W) was modified (sum of weights before was 761.8, after 3075.7).
Versions
System:
python: 3.8.3 (default, May 19 2020, 06:50:17) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\kusan\work\bayes_inj_risk\.venv\Scripts\python.exe
machine: Windows-10-10.0.19041-SP0
Python dependencies:
pip: 20.2.2
setuptools: 49.6.0
sklearn: 0.23.2
numpy: 1.19.1
scipy: 1.5.2
Cython: None
pandas: 1.1.1
matplotlib: 3.3.1
joblib: 0.16.0
threadpoolctl: 2.1.0
Built with OpenMP: True