Description
Describe the issue linked to the documentation
The Vector Quantization Example doesn't seem to demonstrate Vector Quantization.
As written, the k-means clustering approach used in the example converts a grayscale `uint8` `face` to an `int32` representation (`labels`). This increases the image's memory use by 4x.
```python
print(f'face dtype: {face.dtype}')
print(f'face bytes: {face.nbytes}')
print(f'labels dtype: {labels.dtype}')
print(f'labels bytes: {labels.nbytes}')
```

```
face dtype: uint8
face bytes: 786432
labels dtype: int32
labels bytes: 3145728
```
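For reference, a minimal sketch (not the example's actual code) that reproduces those numbers, assuming the grayscale face is SciPy's 768x1024 raccoon image and the clustering is scikit-learn's `KMeans` with 5 clusters; the parameter choices here are placeholders:

```python
from scipy import datasets          # SciPy >= 1.10; older versions expose scipy.misc.face
from sklearn.cluster import KMeans

# assumption: the example's grayscale face is SciPy's 768x1024 raccoon image
face = datasets.face(gray=True)                 # uint8, nbytes = 786432
X = face.reshape(-1, 1)                         # one sample per pixel

kmeans = KMeans(n_clusters=5, n_init=4, random_state=0).fit(X)
labels = kmeans.labels_.reshape(face.shape)     # int32 cluster indices

print(f'face dtype: {face.dtype}, bytes: {face.nbytes}')        # uint8, 786432
print(f'labels dtype: {labels.dtype}, bytes: {labels.nbytes}')  # int32, 3145728
```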
Expected output
Vector quantization output demonstrates a decrease in memory use.
Additional details
From Wikipedia: "Vector quantization, also called "block quantization" or "pattern matching quantization" is often used in lossy data compression. It works by encoding values from a multidimensional vector space into a finite set of values from a discrete subspace of lower dimension. A lower-space vector requires less storage space, so the data is compressed."
I'm guessing k-means outputs `int32` labels by default. The cluster `labels` are in the range 0, 1, 2, 3, 4. While this could be compressed to a 4-bit integer, `uint8` is as small as we can go with NumPy, so the example does not effectively illustrate the data compression.

Perhaps the tutorial's assumption is that the values contained in `labels` could be compressed through some other algorithm (i.e. outside of NumPy). However, for someone unfamiliar with Vector Quantization, it may seem odd that someone would quantize a vector in a way that both loses information and increases memory use.
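To make that concrete: even after casting to NumPy's smallest integer dtype, the quantized grayscale labels occupy exactly as many bytes as the original image, and any further gain has to come from manual bit packing rather than a native NumPy dtype. A quick sketch, reusing the `face` and `labels` arrays from the reproduction above:

```python
import numpy as np

labels_u8 = labels.astype(np.uint8)   # smallest NumPy integer dtype
print(labels_u8.nbytes)               # 786432 -- identical to the uint8 face, no saving

# Illustration only: packing two 0-4 codes into each byte (a "4-bit" representation)
# halves the storage, but requires manual bit twiddling, not a native 4-bit dtype.
packed = (labels_u8[:, 0::2] << 4) | labels_u8[:, 1::2]
print(packed.nbytes)                  # 393216
```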
Suggest a potential alternative/fix
- add a comment to clarify that the quantized representation of the original face could be further compressed by another algorithm
- replace the gray image with a color image and cast the k-means output to `uint8` to demonstrate the compression. Converting three 8-bit channels to one 8-bit channel would reduce `nbytes` by roughly 67% (see the sketch after this list).
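A rough sketch of that second suggestion, assuming the color raccoon face from `scipy.datasets.face` and an arbitrary 32-color codebook; the image, cluster count, and pixel subsampling are placeholder choices, not something the example would have to use:

```python
import numpy as np
from scipy import datasets              # SciPy >= 1.10
from sklearn.cluster import KMeans

face = datasets.face()                  # uint8 color image, shape (768, 1024, 3)
pixels = face.reshape(-1, 3)

# Fit the codebook on a random subsample of pixels to keep KMeans fast.
rng = np.random.default_rng(0)
sample = pixels[rng.choice(len(pixels), 10_000, replace=False)]
kmeans = KMeans(n_clusters=32, n_init=4, random_state=0).fit(sample)

labels = kmeans.predict(pixels).astype(np.uint8)                # 32 codes fit in one byte
codebook = kmeans.cluster_centers_.round().astype(np.uint8)     # 32 x 3 bytes

print(face.nbytes)                            # 2359296
print(labels.nbytes + codebook.nbytes)        # 786528  (~67% smaller)

# Quantized reconstruction: look up each pixel's code in the codebook.
quantized = codebook[labels].reshape(face.shape)
```

This lines up with the ~67% figure in the bullet above: three 8-bit channels collapse to one 8-bit code per pixel, plus a tiny codebook.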