From d60666a654891a78d564284fac861a764b4c2f94 Mon Sep 17 00:00:00 2001
From: arnaudstiegler
Date: Tue, 18 Jun 2019 18:35:33 -0400
Subject: [PATCH 1/2] updated class_weight explanation

---
 doc/glossary.rst | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/doc/glossary.rst b/doc/glossary.rst
index ed2b026fca1d0..a3cc0f79851cd 100644
--- a/doc/glossary.rst
+++ b/doc/glossary.rst
@@ -1412,7 +1412,12 @@ functions or non-estimator constructors.
         ``class_weight='balanced'`` can be used to give all classes
         equal weight by giving each sample a weight inversely related
         to its class's prevalence in the training data:
-        ``n_samples / (n_classes * np.bincount(y))``.
+        ``n_samples / (n_classes * np.bincount(y))``. Class weights will be
+        used differently depending on the algorithm: for linear models (such
+        as linear SVM or logistic regression), the class weights will alter the
+        loss function by weighting the loss of each sample by its class weight.
+        For tree-based algorithms, the class weights will be used when
+        calculating the splitting criteria.
         **Note** however that this rebalancing does not take the weight of
         samples in each class into account.
 

From 4bc3f3710ff0d20e0c469bf9f89e76b0fec11920 Mon Sep 17 00:00:00 2001
From: arnaudstiegler
Date: Wed, 19 Jun 2019 11:06:48 -0400
Subject: [PATCH 2/2] glossary_class_weight

---
 doc/glossary.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/glossary.rst b/doc/glossary.rst
index a3cc0f79851cd..dba7ffa746732 100644
--- a/doc/glossary.rst
+++ b/doc/glossary.rst
@@ -1416,8 +1416,8 @@ functions or non-estimator constructors.
         used differently depending on the algorithm: for linear models (such
         as linear SVM or logistic regression), the class weights will alter the
         loss function by weighting the loss of each sample by its class weight.
-        For tree-based algorithms, the class weights will be used when
-        calculating the splitting criteria.
+        For tree-based algorithms, the class weights will be used for
+        reweighting the splitting criterion.
         **Note** however that this rebalancing does not take the weight of
         samples in each class into account.
 
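
Reviewer illustration (not part of the patch): the glossary text above describes the ``balanced`` formula and its two uses, and the sketch below may help check the wording. It is a minimal pure-Python approximation, not scikit-learn's implementation. The helper names `balanced_class_weights`, `weighted_total_loss` and `weighted_gini` are hypothetical. `collections.Counter` stands in for `np.bincount`, and Gini impurity is assumed as the example splitting criterion.

```python
from collections import Counter


def balanced_class_weights(y):
    """Return {label: n_samples / (n_classes * count(label))}."""
    counts = Counter(y)  # plays the role of np.bincount(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


def weighted_total_loss(per_sample_loss, y, class_weight):
    """Linear-model view: each sample's loss is scaled by its class weight."""
    return sum(class_weight[label] * loss
               for label, loss in zip(y, per_sample_loss))


def weighted_gini(y, class_weight):
    """Tree view: Gini impurity from class-weighted counts, not raw counts."""
    weighted_counts = Counter()
    for label in y:
        weighted_counts[label] += class_weight[label]
    total = sum(weighted_counts.values())
    return 1.0 - sum((w / total) ** 2 for w in weighted_counts.values())


# Toy target: 8 samples of class 0 and 2 samples of class 1.
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(y)

# With a unit loss per sample, both classes contribute equally after weighting.
total_loss = weighted_total_loss([1.0] * len(y), y, weights)

# Impurity of the same node with uniform weights and with balanced weights.
gini_uniform = weighted_gini(y, {0: 1.0, 1: 1.0})
gini_balanced = weighted_gini(y, weights)
```

In this toy case the minority class gets the larger weight, so the node looks evenly mixed to the weighted criterion even though the raw counts are 8 to 2.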