
DOC use KBinsDiscretizer in lieu of KMeans in vector quantization example #24374


Merged (17 commits) on Oct 17, 2022

Conversation

@x110 (Contributor) commented Sep 6, 2022

Fixes #23896

@ogrisel (Member) left a comment

Could you please add titles to each of the imshow plots?

# create an array from labels and values
face_compressed = np.choose(labels, values)
est = preprocessing.KBinsDiscretizer(
    n_bins=n_bins, strategy="uniform", encode="ordinal", random_state=0
)

I would rather keep on using the k-means strategy here.
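For context, a minimal sketch of quantizing an image with `KBinsDiscretizer` using the k-means strategy suggested in this review. The synthetic image and the parameter values here are assumptions for illustration, not the example's actual code:

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Synthetic 8-bit grayscale "image" standing in for the raccoon face.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32)).astype(np.float64)

n_bins = 8
# strategy="kmeans" places bin edges via a 1D k-means clustering of the
# pixel values; encode="ordinal" yields one integer code per pixel.
est = KBinsDiscretizer(n_bins=n_bins, strategy="kmeans", encode="ordinal")
labels = est.fit_transform(image.reshape(-1, 1)).reshape(image.shape)

# The quantized image uses at most n_bins distinct values.
print(np.unique(labels).size)
```

The fitted `est.bin_edges_` then gives the k-means bin boundaries, from which bin centers can be derived for display.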

@glemaitre glemaitre changed the title Vector quantization example DOC use KBinsDiscretizer in lieu of KMeans in vector quantization example Sep 9, 2022
@glemaitre (Member):

This PR only partially fixes the original issue. The initial issue reported was about the image size (in kB): the quantized image was larger than the original due to a data type issue.

So I think that the narration should be improved by:

  • looking at the number of unique values in the original vs. the quantized image to show the compression
  • explaining the expansion in memory and linking it to the data type

The best would be to change the example into a notebook-style example where each of these points could be discussed in a separate section/cell.
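The two points above can be sketched in plain NumPy; the array shape here is a hypothetical stand-in for the actual raccoon-face image:

```python
import numpy as np

# Hypothetical 8-bit grayscale image standing in for the raccoon face.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# Quantize to 8 levels (3 bits) by uniform binning.
n_bins = 8
bin_edges = np.linspace(0, 256, n_bins + 1)
labels = np.digitize(image, bin_edges[1:-1]).astype(np.uint8)

# Point 1: the quantized image has far fewer unique values.
print(np.unique(image).size)   # up to 256 distinct values
print(np.unique(labels).size)  # at most 8 distinct values

# Point 2: memory depends on the dtype, not on the number of distinct
# values; casting the codes to float64 would inflate the array 8x.
print(labels.nbytes, labels.astype(np.float64).nbytes)
```

This is why the quantized image in the original example could end up larger on disk than the source: the codes were stored in a wider dtype than the 8-bit input.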

@glemaitre (Member):

I solved the conflicts with main. I thought I would give changing the example to the notebook style a try, so I added some additional narrative but kept the original idea of this PR.

I will check the rendering and will probably merge if it looks fine.

@ArturoAmorQ (Member) left a comment

Thanks for the PR @x110, this is indeed an improvement over the current version of the example. Here is a first batch of suggestions.

x110 and others added 3 commits October 17, 2022 07:25
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
@glemaitre glemaitre self-requested a review October 17, 2022 09:31
glemaitre and others added 2 commits October 17, 2022 11:34
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
@glemaitre (Member):

I accepted the other suggestions from @ArturoAmorQ, which are a net improvement over my first draft. I will wait for the example generation to check that the rendering is fine and, if it is, I will merge.

@ArturoAmorQ (Member) left a comment

Just a small typo, otherwise LGTM. Thanks @x110!

Comment on lines +110 to +115
_, ax = plt.subplots()
ax.hist(raccoon_face.ravel(), bins=256)
color = "tab:orange"
for center in bin_center:
    ax.axvline(center, color=color)
    ax.text(center - 10, ax.get_ybound()[1] + 100, f"{center:.1f}", color=color)
@ArturoAmorQ (Member) commented Oct 17, 2022

For info, what I meant in this comment was the following:

Suggested change

Before:

    _, ax = plt.subplots()
    ax.hist(raccoon_face.ravel(), bins=256)
    color = "tab:orange"
    for center in bin_center:
        ax.axvline(center, color=color)
        ax.text(center - 10, ax.get_ybound()[1] + 100, f"{center:.1f}", color=color)

After:

    _, ax = plt.subplots()
    ax.hist(raccoon_face.ravel(), bins=256)
    ax.set(xlabel="Original pixel values", ylabel="Count of pixels")
    ax1 = ax.twiny()
    ax1.set_xlim(ax.get_xlim())
    ax1.set(
        xticks=bin_center,
        xticklabels=list(range(8)),
        xlabel="Sub-sampled pixel values",
    )
    for center in bin_center:
        ax.axvline(center, color="tab:orange", ls="--")
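For reference, the suggested version runs stand-alone with a non-interactive backend; `raccoon_face` and `bin_center` are hypothetical stand-ins here for the example's actual image and discretizer bin centers:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripting
import matplotlib.pyplot as plt
import numpy as np

# Stand-ins: an 8-bit image and 8 bin centers from the discretizer.
rng = np.random.default_rng(0)
raccoon_face = rng.integers(0, 256, size=(64, 64))
bin_center = np.linspace(16, 240, 8)

_, ax = plt.subplots()
ax.hist(raccoon_face.ravel(), bins=256)
ax.set(xlabel="Original pixel values", ylabel="Count of pixels")

# Secondary x-axis mapping each bin center to its ordinal code 0..7.
ax1 = ax.twiny()
ax1.set_xlim(ax.get_xlim())
ax1.set(
    xticks=bin_center,
    xticklabels=list(range(8)),
    xlabel="Sub-sampled pixel values",
)
for center in bin_center:
    ax.axvline(center, color="tab:orange", ls="--")
```

The twin axis shares the x-limits of the histogram, so each dashed line sits under the code it represents.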

Something similar should then be done for the plot using k-means.


Yep, that is what I did at first, but I find it more complex to understand than just adding the annotation within the for loop.

@glemaitre glemaitre merged commit 5af75b9 into scikit-learn:main Oct 17, 2022
@glemaitre (Member):

The rendering is fine. Thanks @x110 @ArturoAmorQ. Merging.

glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2022
…mple (scikit-learn#24374)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>

Successfully merging this pull request may close these issues.

Vector Quantization Example increases rather than decreases memory use
4 participants