Skip to content

Conversation

jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented May 13, 2020

Fixes #17203

KMeans scales sample weight to sum up to n_samples. It has been done inplace when we generalized the use of check_sample_weight.

@rth

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Please add a what's new entry.

......................

- |Fix| Fixed a bug in :class:`cluster.KMeans` where the sample weights
provided by the user was modified in place. :pr:`17204` by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
provided by the user was modified in place. :pr:`17204` by
provided by the user were modified in place. :pr:`17204` by

@@ -177,7 +177,7 @@ def _check_normalize_sample_weight(sample_weight, X):
# an array of 1 (i.e. samples_weight is None) is already normalized
n_samples = len(sample_weight)
scale = n_samples / sample_weight.sum()
sample_weight *= scale
sample_weight = sample_weight * scale
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe make an explicit copy above instead?

@NicolasHug
Copy link
Member

I'll justmerge and will fix the typo in #17205

@NicolasHug NicolasHug changed the title [MRG] Fix don't use inplace operation on sample weight in KMeans FIX don't modify sample weight inplace in KMeans May 13, 2020
@NicolasHug NicolasHug merged commit ccf5c36 into scikit-learn:master May 13, 2020
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
adrinjalali pushed a commit to adrinjalali/scikit-learn that referenced this pull request May 18, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kmeans fit sideeffect: sample_weight is parameter is modified
4 participants