Skip to content

Add oxipng to pre-commit #29925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Add oxipng to pre-commit #29925

wants to merge 1 commit into from

Conversation

QuLogic
Copy link
Member

@QuLogic QuLogic commented Apr 16, 2025

PR summary

@oscargus suggested oxipng in #29816.

I went with the hook suggestion of -o 4 compression (though their normal default is 2), though maybe we want to up it to the max, since it shouldn't happen very often. I did not enable metadata stripping (as suggested) since knowing what version of Matplotlib was used for the plot might be useful. I also didn't add alpha channel optmization, as they say it is technically lossy.

PR checklist

@QuLogic
Copy link
Member Author

QuLogic commented Apr 16, 2025

Running oxipng -o $level lib/matplotlib/tests/baseline_images/*/*.png from a fresh set each level, the timing and size is:
Figure_1
And in actual numbers:

Level Time Size
Original 17458920
0 2.159 11734051
1 4.954 11010397
2 12.094 10494049
3 42.332 10440070
4 81.065 10148715
5 157.035 10145413
6 192.527 10145351

Note that oxipng is multithreaded and these are wall times, so multiply by ~11.5 for CPU time.

Levels 5 and 6 seem useless as they take a so much longer and saved just a bit over 3K from the 1200 images. And I'm not even sure if we should bother going from the default 2 to 4 now...

@QuLogic QuLogic closed this Apr 16, 2025
@QuLogic QuLogic reopened this Apr 16, 2025
@QuLogic
Copy link
Member Author

QuLogic commented Apr 16, 2025

And it seems oxipng is timing out on pre-commit CI, but I'm not sure if it's trying to build itself from source, or run on all the existing PNGs in the repo.

@QuLogic
Copy link
Member Author

QuLogic commented Apr 16, 2025

The compressed images cause a single failure due to a warning:

______________________________________________________________________ test_alpha[png] ______________________________________________________________________

args = (), kwds = {'extension': 'png', 'request': <FixtureRequest for <Function test_alpha[png]>>}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)

/usr/lib64/python3.13/contextlib.py:85: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
lib/matplotlib/testing/compare.py:474: in compare_images
    expected_image = _load_image(expected)
lib/matplotlib/testing/compare.py:405: in _load_image
    img = img.convert("RGB")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <PIL.PngImagePlugin.PngImageFile image mode=P size=200x100 at 0x7FF8F9401FD0>, mode = 'RGB', matrix = None, dither = None, palette = <Palette.WEB: 0>
colors = 256

    def convert(
        self,
        mode: str | None = None,
        matrix: tuple[float, ...] | None = None,
        dither: Dither | None = None,
        palette: Palette = Palette.WEB,
        colors: int = 256,
    ) -> Image:
        """
        Returns a converted copy of this image. For the "P" mode, this
        method translates pixels through the palette.  If mode is
        omitted, a mode is chosen so that all information in the image
        and the palette can be represented without a palette.
    
        This supports all possible conversions between "L", "RGB" and "CMYK". The
        ``matrix`` argument only supports "L" and "RGB".
    
        When translating a color image to grayscale (mode "L"),
        the library uses the ITU-R 601-2 luma transform::
    
            L = R * 299/1000 + G * 587/1000 + B * 114/1000
    
        The default method of converting a grayscale ("L") or "RGB"
        image into a bilevel (mode "1") image uses Floyd-Steinberg
        dither to approximate the original image luminosity levels. If
        dither is ``None``, all values larger than 127 are set to 255 (white),
        all other values to 0 (black). To use other thresholds, use the
        :py:meth:`~PIL.Image.Image.point` method.
    
        When converting from "RGBA" to "P" without a ``matrix`` argument,
        this passes the operation to :py:meth:`~PIL.Image.Image.quantize`,
        and ``dither`` and ``palette`` are ignored.
    
        When converting from "PA", if an "RGBA" palette is present, the alpha
        channel from the image will be used instead of the values from the palette.
    
        :param mode: The requested mode. See: :ref:`concept-modes`.
        :param matrix: An optional conversion matrix.  If given, this
           should be 4- or 12-tuple containing floating point values.
        :param dither: Dithering method, used when converting from
           mode "RGB" to "P" or from "RGB" or "L" to "1".
           Available methods are :data:`Dither.NONE` or :data:`Dither.FLOYDSTEINBERG`
           (default). Note that this is not used when ``matrix`` is supplied.
        :param palette: Palette to use when converting from mode "RGB"
           to "P".  Available palettes are :data:`Palette.WEB` or
           :data:`Palette.ADAPTIVE`.
        :param colors: Number of colors to use for the :data:`Palette.ADAPTIVE`
           palette. Defaults to 256.
        :rtype: :py:class:`~PIL.Image.Image`
        :returns: An :py:class:`~PIL.Image.Image` object.
        """
    
        if mode in ("BGR;15", "BGR;16", "BGR;24"):
            deprecate(mode, 12)
    
        self.load()
    
        has_transparency = "transparency" in self.info
        if not mode and self.mode == "P":
            # determine default mode
            if self.palette:
                mode = self.palette.mode
            else:
                mode = "RGB"
            if mode == "RGB" and has_transparency:
                mode = "RGBA"
        if not mode or (mode == self.mode and not matrix):
            return self.copy()
    
        if matrix:
            # matrix conversion
            if mode not in ("L", "RGB"):
                msg = "illegal conversion"
                raise ValueError(msg)
            im = self.im.convert_matrix(mode, matrix)
            new_im = self._new(im)
            if has_transparency and self.im.bands == 3:
                transparency = new_im.info["transparency"]
    
                def convert_transparency(
                    m: tuple[float, ...], v: tuple[int, int, int]
                ) -> int:
                    value = m[0] * v[0] + m[1] * v[1] + m[2] * v[2] + m[3] * 0.5
                    return max(0, min(255, int(value)))
    
                if mode == "L":
                    transparency = convert_transparency(matrix, transparency)
                elif len(mode) == 3:
                    transparency = tuple(
                        convert_transparency(matrix[i * 4 : i * 4 + 4], transparency)
                        for i in range(0, len(transparency))
                    )
                new_im.info["transparency"] = transparency
            return new_im
    
        if mode == "P" and self.mode == "RGBA":
            return self.quantize(colors)
    
        trns = None
        delete_trns = False
        # transparency handling
        if has_transparency:
            if (self.mode in ("1", "L", "I", "I;16") and mode in ("LA", "RGBA")) or (
                self.mode == "RGB" and mode in ("La", "LA", "RGBa", "RGBA")
            ):
                # Use transparent conversion to promote from transparent
                # color to an alpha channel.
                new_im = self._new(
                    self.im.convert_transparent(mode, self.info["transparency"])
                )
                del new_im.info["transparency"]
                return new_im
            elif self.mode in ("L", "RGB", "P") and mode in ("L", "RGB", "P"):
                t = self.info["transparency"]
                if isinstance(t, bytes):
                    # Dragons. This can't be represented by a single color
>                   warnings.warn(
                        "Palette images with Transparency expressed in bytes should be "
                        "converted to RGBA images"
                    )
E                   UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images

venv/lib64/python3.13/site-packages/PIL/Image.py:1054: UserWarning

@oscargus
Copy link
Member

oscargus commented Apr 16, 2025

Thanks for giving it a try and sorry for the unexpected hassle. I use it in a GitLab project and since the pre-commit CI doesn't support that, I have to run it on the diff from the latest main (set in the script), so not sure if it runs on all files, but it is a reasonable guess I would assume.

A small caveat that is worthwhile to consider is that if the compression improves, it will fail the tests. I wanted to have some sort of threshold, but the developers didn't see a need to support that (but suggested that one scripts it). (Edit: one may not want to commit a new image, just because the new compression removed 4 bytes...)

But, good thing, size decreases. Just a bit boring if one will need to re-commit lots of compressed images...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants