-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
DOC use KBinsDiscretizer in lieu of KMeans in vector quantization example #24374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC use KBinsDiscretizer in lieu of KMeans in vector quantization example #24374
Conversation
…nto vector_quantization_example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add titles to each of the imshow plots?
# create an array from labels and values | ||
face_compressed = np.choose(labels, values) | ||
est = preprocessing.KBinsDiscretizer( | ||
n_bins=n_bins, strategy="uniform", encode="ordinal", random_state=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather keep on using the k-means strategy here.
This PR only partially fixes the original issue. The initial issue reported was about the image size (in kB) where the quantized image was larger than the original due to some data type issues. So I think that the narration should be improved by:
The best would be to change the example into a notebook-style example where each of these points could be discuss in a different section/cell |
…nto vector_quantization_example
I solve the conflicts with I will check the rendering and I will probably merge if this is fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @x110, this is indeed an improvement to the current version of the example. Here is a first batch of suggestions.
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
I accepted the other suggestion from @ArturoAmorQ which are a net improvement in regard to my first draft. I will wait for the example generation to check that the rendering is fine and if it is, I will merge it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a small typo, otherwise then LGTM. Thanks @x110!
_, ax = plt.subplots() | ||
ax.hist(raccoon_face.ravel(), bins=256) | ||
color = "tab:orange" | ||
for center in bin_center: | ||
ax.axvline(center, color=color) | ||
ax.text(center - 10, ax.get_ybound()[1] + 100, f"{center:.1f}", color=color) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For info, what I meant in this comment was the following:
_, ax = plt.subplots() | |
ax.hist(raccoon_face.ravel(), bins=256) | |
color = "tab:orange" | |
for center in bin_center: | |
ax.axvline(center, color=color) | |
ax.text(center - 10, ax.get_ybound()[1] + 100, f"{center:.1f}", color=color) | |
_, ax = plt.subplots() | |
ax.hist(raccoon_face.ravel(), bins=256) | |
ax.set(xlabel="Original pixel values", ylabel="Count of pixels") | |
ax1 = ax.twiny() | |
ax1.set_xlim(ax.get_xlim()) | |
ax1.set( | |
xticks=bin_center, | |
xticklabels=list(range(8)), | |
xlabel="Sub-sampled pixel values", | |
) | |
for center in bin_center: | |
ax.axvline(center, color="tab:orange", ls="--") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something similar should then be done for the plot using k-means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, it is what I did at first. But I find it more complex to understand than just adding the annotation within the for loop.
The rendering is fine. Thanks @x110 @ArturoAmorQ. Merging. |
…mple (scikit-learn#24374) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
Fixes #23896