cgifsave file size larger than expected for some gifs #2576

jmreid · 2021-12-10T20:16:54Z

I have a few example gifs that seem to really increase in filesize when run through libvips. This doesn't seem to be the case for all gifs, but i haven't been able to figure out why it's seemingly much worse for some files.

Given this example input gif (~897 KB):

vipsthumbnail (using cgif) returns a 3.5MB image:
vipsthumbnail "example.gif[n=-1]" --size 200 -o example_vips_out.gif

Gifsicle returns a less nicer-looking image, but it's ~1MB:
gifsicle --resize-fit 200x_ -o example_gifsicle_out.gif example.gif

Here's another example I found online:

This is a 570KB file, and if I run it through vips to resize to 1400px wide:

vipsthumbnail "test.gif[n=-1]" --size 1400 -o test_vips_out.gif

du -h test_vips_out.gif
 24M	test_vips_out.gif

Using gifsicle:

gifsicle --resize-fit 1400x_ -o test_gifsicle_out.gif test.gif  

du -h test_gifsicle_out.gif
344K	test_gifsicle_out.gif

Again, Gifsicle's image is not as nice looking. But that filesize difference is quite a bit, especially because the source is only 570KB.

I'm running:

vips --vips-version
libvips 8.12.1-Wed Nov 24 15:41:04 UTC 2021

gifsicle --version
LCDF Gifsicle 1.93

@dloebl @MCLoebl Found you a couple examples! 😄

The text was updated successfully, but these errors were encountered:

dloebl · 2021-12-11T08:07:05Z

Thanks!
I had a quick look:
example.gif: has only a global colour table with 128 entries
example_vips_out.gif: almost every frame has its own local colour table with 256 entries
example_gifsicle_out.gif: has only a global colour table with 128 entries

test.gif: has only a global colour table with 64 entries
test_vips_out.gif: almost every frame has its own local colour table with 256 entries
test_gifsicle_out.gif: has only a global colour table with 64 entries

The extra colour tables are of course extra data in the GIF.
Plus: The larger the used colour table is of a given frame, the worse the compression ratio will become.

Would be interesting to know whether the number of colours actually increases from example.gif to example_vips_out.gif.
@jcupitt Can sizing an image down increase the number of colours in it? The output appears to get a bit noisy as well.

jcupitt · 2021-12-11T10:12:31Z

Hi @jmreid, thanks for these interesting examples.

Yes, resizing will make new in-between colours, since it effectively averages pixels.

Something we discussed a few months ago is getting the GIF decoder to attach the colour tables from the source image. The encoder could then just reuse that rather than finding a new table -- at least the result would be no worse.

The difficulty comes if processing is more than a simple resize. For example, a watermark might be drawn on a few frames. In this case, you probably would want to recompute the colour tables.

How about:

The loader examines the GIF. If there is a single global table, it attaches it as metadata, eg. gif-tables perhaps. We could attach many tables and record frame numbers, but that sounds like a lot of work for probably very marginal gain.
The saver checks for the existence of gif-tables and flips into a special save mode (save GIF source as GIF).
- It sets the saved colour table as the current one, then for each frame:
  - Compute the maximum encoding error of this frame with this table. If the frame is a simple resize, the error will be bounded by half the largest inter-colour distance in the table.
  - Error > threshold plus a bit? Recompute the table and attach as a local table.
  - Dither to the output.

Does that sound enough? I've probably missed something.

We'd need two new functions: something to compute maximum encoding error given an RGB image and a colour table, and something to find the maximum inter-colour distance in a table.

jcupitt · 2021-12-11T10:32:38Z

Another possibility would be to add a GIF save option for reuse-tables and make it the application's responsibility to decide when simple reuse was better than optimisation. So (for example) the shopify image resize system could spot cases when it was doing GIF -> resize -> GIF and set the flag then.

The change would be:

Loader attaches gif-tables iff there is a single global colour table.
Application layer spots cases when table reuse is OK (eg. resize, crop, flip, rotate etc. are fine to reuse, but watermark, brighten, darken, filter etc. would need reoptimisation).
Saver checks for reuse-tables option being set and the existence of gif-tables metadata. If both are OK, it flips to a special save mode where it sets a single global table and never tries to reoptimise. It'll still need to dither, of course.

On the command line, it'd be something like:

vipsthumbnail input.gif[n=-1] --size 128x128 -o output.gif[reuse-tables]

That sounds really simple and safe, and not much extra work for the application level. There would be no change for any of the bindings either. Is that better? @kleisauke, do you have any thoughts on this?

kleisauke · 2021-12-11T11:00:54Z

I like the reuse-tables idea. Is it also possible to do that in libvips instead of doing it at the application level? I remembered a similar discussion here: https://gitter.im/libvips/devchat?at=6120f3f8aa48d1340c24e090.

Perhaps it would make sense to overload vips_image_decode with an additional parameter indicating that the operation changes the color (and thus re-quantization is needed)?

jcupitt · 2021-12-11T12:38:34Z

I thought about doing in libvips, but it would add a lot of special cases everywhere and extra code we'd have to maintain.

The application level already knows if tables can be reused, so it's a lot simpler to just get them to pass the information down to the writer.

jcupitt · 2021-12-11T12:43:54Z

I'd forgotten about the INDEXED idea. Yes, that's another possibility, though I think it wouldn't help with resize, since images would need to be converted back to RGB for that.

We could implement both: INDEXED would make things like flip, crop, rot90 automatically reuse the colour table, and reuse-tables could be set by the application level to skip table recomputation for simple resize.

jcupitt · 2021-12-11T14:29:24Z

I was bored and did a quick table reuse prototype:

https://github.com/libvips/libvips/tree/add-gif-tables

But it doesn't work! It seems libimagequant has no API to reuse an old colour table. there's liq_get_palette() to get the colour table out of a call to liq_image_quantize(), but there's nothing to set your own palette before calling liq_write_remapped_image().

I think we'd need to get changes upstream.

dloebl · 2021-12-11T15:35:36Z

Sounds good to reuse the palette(s) for certain GIF to GIF operations.

I had a look at the second case (G-Shock watch) and visualized possible transparency as green. Hope that adds something to the discussion:
gifsicle --resize-fit 700x_ -o test_gifsicle_out.gif watch.gif

vipsthumbnail "watch.gif[n=-1]" --size 700 -o test_vips_out.gif

Before resizing, frame 2 and the following frames are practically identical to their corresponding previous frame.
When setting everything possible to transparent, encoding frames 2 and following is, therefore, more or less encoding single-colour images (which LZW compresses very efficiently).
After resizing, a lot of noise is introduced and the transparency optimization is much less effective.
What would help is if resizing/colour-quantization preserves locally identical parts of two frames as locally identical. Then the transparency optimization would still work well and cgif produces a much smaller animation.

jcupitt · 2021-12-11T15:46:18Z

How about doing a low-pass filter before looking for similar pixels? That would remove the slight noise and let you find similar areas, not just similar points.

jcupitt · 2021-12-11T18:04:52Z

I did another quick hack adding set-palette to @lovell's libimagequant:

https://github.com/jcupitt/libimagequant/tree/add-set-palette

I now see:

$ vipsthumbnail g-shock.gif[n=-1] --size 1400 -o x.gif
$ ls -l x.gif 
-rw-r--r-- 1 john john 24191667 Dec 11 18:01 x.gif

ie. the bad behaviour you saw before, but with reuse-tables:

$ vipsthumbnail g-shock.gif[n=-1] --size 1400 -o x.gif[reuse-tables]
reusing cmap
74 frames
0 cmaps
$ ls -l x.gif 
-rw-r--r-- 1 john john 435381 Dec 11 18:01 x.gif

Much saner. The colours look a bit messed up though, I'm not sure why. No doubt I've done something dumb.

edit perhaps quantizr would be a better place to add API, since it's not a fork?

jcupitt · 2021-12-12T13:41:14Z

... I had a quick look at INDEXED mode, but that would need libnsgif changes -- the current API always gives you a fully decoded RGBA frame.

dloebl · 2021-12-13T18:01:36Z

How about doing a low-pass filter before looking for similar pixels? That would remove the slight noise and let you find similar areas, not just similar points.

Thanks for the suggestion, it's a good idea to remove the noise when doing the transparency optimization.

Maybe there is a bit of a misunderstanding in this issue:
My understanding is that the GIF encoding layer of libvips (cgifsave) is at the moment more libimagequant-and-cgif-encodes-the-GIF.
cgif is just the GIF encoder so far and we found it important that the encoder encodes exactly the image it gets as the input. So we didn't do any lossy compression yet and just started looking into the lossless optimization that Matthias mentioned in our meeting.
From our perspective, it would be interesting to integrate image pre-processing methods (e.g. quantization, resizing) in cgif that have in mind the compression ratio of GIFs.
It's of course not a task that can be done super quickly, but adding a layer above the bare GIF encoder might be helpful.
Would that be interesting for libvips?

jcupitt · 2021-12-13T18:57:33Z

Sure. I was thinking the same thing -- really you need to take control of the whole process (palette selection, dither, encode) if you want to optimise it significantly. Perhaps there's a possible collaboration with @DarthSim (the very skilled quantizr person)?

DarthSim · 2021-12-14T11:45:49Z

Hey everyone!

I think the core problem here is the usage of local color tables while the originals use only a global one. And here're two parts of the problem I see:

Every 256 color palette is 1KB of data.
Different palettes hugely increase the difference between frames and reduce the effectiveness of frames optimizations.

It increases the quality though. So I think what we need here is a compromise. If the original uses the global palette only, the result should use only the global palette, too. However, as an author of an app, I'd like to have control over this behavior. Ideally, the decoder should save the info about global/local palettes usage and their sizes so an app could decide how the encoder should behave.

Since animated images are represented as a column of frames inside of Vips, we can generate a single palette for all the frames in one pass. The problem here is that both LIQ and Quantizr change the palette while remapping the image. This most probably will lead to the color changes in all frames but the last one. Not good.

Alternatively, in the case of using only the global palette, we could remap all the frames in one pass. This will remove the palette change issue but will all an unwanted dithering on the top edges of frames.

To fix this, I can add an option like frame_height to Quantizr. If this option is set, Quantizr will nullify pixel errors in each frame_height row during dithering. Not sure if it's possible with LIQ.

I don't think the palette reusage is a good option. It's good in terms of speed, but if the image was resized, blurred, sharpened, color-adjusted, etc, the original's palette won't fit the image anymore.

Anyway, I'm going to spend some time on experiments on it on holidays. I'll let you know if I achieve something.

jcupitt · 2021-12-14T15:17:07Z

The problem for libvips is that global palette optimisation needs a lot of memory and prevents streaming, which (in turn) limits parellelism,

We'd really like a GIF saver which can work a frame at a time. That's the motivation behind the use of local palettes: make one for the first frame, and switch to local palettes if encoding error goes over some threshold (eg. perhaps there's a cut in the video). The current change detector is really bad (just the sum of index pixels), so replacing that would be a simple improvement,

Palette reuse can work well. For example, for resize, colours will mostly be the average of existing colours, so peak encode error is constrained. Imagine the original palette as 256 points scattered in a 256 x 256 x 256 RGB cube, and imagine a sphere at each point. Now expand each sphere equally until it just touches the nearest sphere. The radius of the largest sphere is the peak encode error on resize, blur, rotate, etc.

DarthSim · 2021-12-14T16:18:50Z

I agree that replacing the changes detector may improve things. However, this new detector should be really good. For example, if some new colors appeared in a new frame, the detector should notice this, otherwise, we're risking losing ones. I doubt it's possible to generate a good palette for multiple frames without having all these frames' data.

On the other hand, having this good changes detector that detects small yet noticeable changes, we will get redundant local palettes and less effective frames optimizations.

Palette reuse can work in case we don't change the colors of the image. If we apply a watermark, extend an image with some color outside of the palette, or apply color adjustments, the original palette will simply become invalid.

jcupitt · 2021-12-14T16:39:21Z

How about a function like max_quantization_error()? You give it an RGBA frame and a palette, and it somehow quickly estimates the maximum quant error. Perhaps ignoring 1% of outliers.

libvips could measure the quant error for the first frame, then for subsequent frames, generate a new palette if abs(this_frame_error - first_frame_error) > threshold.

DarthSim · 2021-12-14T18:41:23Z

It sounds like remapping without actual remapping :( If I got your idea right, we'll need to find the best color from the palette for each pixel and get the error (actual_color - palette_color). This is basically what happens during the remapping process. And this is not fast at all :(

dloebl · 2021-12-14T22:05:47Z

My brother and I had another look at the resizing problem. We made a branch gifresize where we do a very simple GIF-to-GIF resizing which avoids the size explosion and makes a reasonable resize. Regarding image quality, the resizing is not ideal yet as no color smoothing is done. But it's hopefully a first step.

For the proof-of-principle purpose, we used gifdec as the decoder (included as a git submodule). As libvips uses libnsgif, we will switch. Important is that this decoder outputs also just RGB data. But knowing that it came from a GIF, we can reconstruct the color table and color indices.

Since we just take RGB data as the input to the encoder, we used for simplicity only local color tables, but no size explosion happens. So local tables are not the main issue here, what matters is that identical areas of subsequent frames stay identical -- then the GIF transparency optimization in combination with LZW leads to really good compression.

Our resize tool works but it is more a draft right now. We can include further functionalities like this simple resize in cgif_tools and extend cgif to more than a bare encoder. With your help (@jcupitt, @DarthSim) we can make the quality better as well (color smoothing/quantization, etc.).
If you prefer, a similar resizing method like the one here could be integrated into libvips/cgifsave and Matthias and I would be happy to help. If it's helpful, we could have another chat so that we are all on the same page.

Results:
$ ./gifresize example.gif cgif_example.gif 200 113

$ ./gifresize test.gif cgif_test.gif 1400 438

$ du -h --apparent-size test.gif example.gif
558K	test.gif
874K	example.gif

$ du -h --apparent-size cgif_test.gif cgif_example.gif 
369K	cgif_test.gif
988K	cgif_example.gif

DarthSim · 2021-12-15T07:50:11Z

I agree, local palettes themselves are not a big problem. The problem is that the colors of a local palette most probably will differ from the colors of the global one so will indexes. I'm not sure how exactly frames optimizations are implemented in CGIG (didn't have a chance to check), but I bet this will lead to what we see in #2576 (comment).

jcupitt · 2021-12-15T09:46:47Z

Hmm then perhaps the remapper could compute peak error? Perhaps a public field in the struct has the peak error number after the remapper finishes. Then after calling remap, libvips could decide whether to reoptimise the palette (and call the remapper again for that frame).

DarthSim · 2021-12-15T13:45:44Z

I rewrote the code of vips_cgifsave to quantize all frames in one pass and to use the global palette only. It expectedly uses much more memory and it's a bit slower but the file size is much better.

I used the G-Shock gif for the test and set the result's bitsize to 6 as in the original image:

Current implementation:

$ /usr/bin/time -f %M:%e vipsthumbnail "/images/animated_gif12.gif[n=-1]" --size 1400 -o animated_gif12_vips_out.gif[bitdepth=6]
102280:7.72

$ du -h /images/animated_gif12_vips_out.gif
4.1M	/images/animated_gif12_vips_out.gif

My implementation:

$ /usr/bin/time -f %M:%e vipsthumbnail "/images/animated_gif12.gif[n=-1]" --size 1400 -o animated_gif12_vips_out1.gif[bitdepth=6]
598180:8.88

$ du -h /images/animated_gif12_vips_out1.gif
372K	/images/animated_gif12_vips_out1.gif

Hmm then perhaps the remapper could compute peak error? Perhaps a public field in the struct has the peak error number after the remapper finishes. Then after calling remap, libvips could decide whether to reoptimise the palette (and call the remapper again for that frame).

Well, this may do the job. LIQ calculates MSE, but not at high speeds. I can add the same to Quantizr.

jmreid · 2021-12-15T17:14:34Z

Is it safe to assume that Quantizr will be the quantizer of choice in libvips? If we're working to add these improvements there, then LIQ will end up being the inferior package to use.

Which is fine by me! We're happy to use Quantizr if it becomes the golden path for the best vips experience.

kleisauke · 2021-12-15T19:01:07Z

I noticed that you could also use the liq_histogram_* functions available in libimagequant.
https://github.com/ImageOptim/libimagequant#multiple-images-with-the-same-palette

So instead of quantizing all frames in one pass (which effectively reverts PR #2445), or generating a palette for each frame (current behavior), you can just call liq_histogram_add_image() for each frame to generate/optimize a single palette. Unfortunately, this API was introduced in version 2.8.0, so it is not available in Lovell's fork and thus GPL-licensed.

I had a quick attempt to implement this with commit kleisauke@86e499b and I see (with the second example image):

$ /usr/bin/time -f %M:%e vipsthumbnail test.gif[n=-1] --size 1400 -o x.gif
105504:11.86
$ du -h x.gif
24M	x.gif

$ /usr/bin/time -f %M:%e vipsthumbnail test.gif[n=-1] --size 1400 -o x.gif
821900:6.56
$ du -h x.gif
660K	x.gif

So, much higher memory usage, but faster and a better file size.

DarthSim · 2021-12-15T20:18:19Z

The results look good! Unfortunately, I see a few problems here:

As you mentioned, liq_histogram_quantize is available in the GPL-licensed versions only, which is not good. I don't think it's a good idea to lock the feature on a GPL-licensed lib. At least, I hope you guys won't do so 😅
You still need all the frames in the memory so you can remap them. Looks like using liq_histogram_quantize eats more memory than quantizing all the frames in one pass (or Quantizr is just more optimal in terms of memory usage, IDK).
When liq_histogram_quantize is used, the remapper doesn't improve the palette. From my experience, improving the palette after remapping gives a pretty good boost of quality.

kleisauke · 2021-12-16T11:35:56Z

Agreed, it's GPL-licensed so we should not spend much time on it. But it's worth investigating this API to add something similar to Quantizr.

I had another idea, instead of keeping those liq_image frame pointers in memory, we could generate the global palette in a separate sink_disc pass. See for example commit kleisauke@0c05e6e, with that I see:

$ /usr/bin/time -f %M:%e vipsthumbnail test.gif[n=-1] --size 1400 -o x.gif
124264:8.90
$ du -h x.gif
660K	x.gif

So, the memory usage is more or less the same, but the execution time is faster and the file size is much better.

DarthSim · 2021-12-19T14:37:59Z

Nice! Had the same idea about two sink_disk passes but thought it would be quite slow.

I implemented histogram API in Quantizr and tried your code - works great!

joshuamsager · 2022-01-17T19:52:56Z

Is there any progress on this? Is there any way I can help? Having this improved in libvips would be very beneficial

lovell · 2022-01-19T10:38:02Z

There's a fix to the GIF change detector discussed at #2622 might help a little bit here, as it should reduce the number of frame local palettes for a typical GIF.

maleblond · 2022-01-24T20:09:05Z

Thanks a lot @dloebl for your effort! @joshuamsager and I built both libvips and cgif from your branches to see what's the impact of this fix on GIF sizes with real GIFs uploaded by our users that are resized by us.

We detailed our analysis process below. Please let us know if you need more details around anything. We're also happy to help testing / comparing multiple configuration options with real world GIFs (i.e. testing multiple "quality" settings, to see how it impacts file-size and quality on real world images).

Impact on previously mentioned problematic GIFs

This table shows the file size after resizing the original image with libvips

Original image	Original size	New width	Resized (without libvips fix)	Resized (with libvips fix)	Diff (%)
	~897kB	200px	3.7MB	2.3MB	~38% smaller
	1MB	1400px	24.3MB	814KB	~97% smaller

We're seeing an outstanding improvement for the last image and a significant improvement on the first image (the resized file size is still more than twice the original).

The quality of the resized images also seems fine. For example, this is a comparison between the original image and the one resized to a 1400 px wide GIF with the version of libvips that includes the file size fixes:

Original

Resized

Holistic impact

We compared the file size of thousands of resized GIFs with libvips 8.12.1, libvips with #2628 and gifsicle. The results are encouraging, but we're still observing that GIFs tends to be significantly bigger w/ libvips.

For instance, without the libvips fix, 3.25% of all resized gifs were at least 5 times bigger than the gifsicle equivalent. With the fix, that went down to ~1%.

About 28% of the resized GIFs are at least 1.5 times bigger than gifsicle, and this didn't change significantly with the fix. Only ~2% of the resized GIFs are more than 10% smaller than gifsicle.

Sample GIFs

To help further debugging, here are some additional problematic GIFs. All of those images were resized with the following commands:
vipsthumbnail <filepath>[n=-1] --size <desiredSize>
gifsicle --resize-fit <desiredSize> <filepath>

Original size	Desired dimension	Libvips 8.12.1	Libvips (built from #2628)	Gifsicle 1.93
6.35MB	200x	1.67MB	1.67MB	194KB
6.38MB	200x	5.04MB	5.04MB	1.4MB
7.9MB	1024x1024	18.6MB	18.7MB	7.8MB

dloebl · 2022-01-30T21:32:39Z

Thanks @maleblond and @joshuamsager for this detailed analysis!
I made another change to #2628 that should lead to an additional size reduction.
To reduce the file size further I added a lossy option in #2628 (maxerror parameter in gifsave - it's switched on by default). You can think of it as inter-frame denoising, which increases the similarity between frames and therefore makes the inter-frame/transparency compression in GIF animations more effective.

In my tests, this method reduces the file size while keeping the high-quality resize of libvips. Your feedback on quality/file size would be very helpful. Can you also see an improvement with your data set? Just keep in mind that the method is lossy, so one should not increase the maxerror parameter too much (otherwise it could lead to ghosting artifacts with certain animated GIFs).

Furthermore:
Please note that this trick cannot be applied to frames with an active alpha channel (like the first two examples you provided) yet, but that's something on my ToDo already. At the moment, cgif does not support inter-frame compression and a user-defined alpha channel within the same frame.

joshuamsager · 2022-02-01T17:53:32Z

Thanks @dloebl for your improvements. With the default configuration for maxerror, we saw a substantial improvement to file sizes when testing 89beda7. We are working on what impact this made to the quality of GIFs across the board.

maleblond · 2022-02-08T15:44:13Z

Hi @dloebl, thanks a lot for your work on this !

I took a deep look at our image size + quality metrics and there is no doubt your work has made a huge impact. We plan on setting maxerror to 16, which we think is a fair balance between size and quality. I've put a detailed analysis below with some metrics.

Do you have a rough idea of when your PRs could make it into the next libvips / cgif releases? Is there any way we can help?

Detailed analysis

Image size

Metric type	Libvips 8.12.2	Maxerror set to 6	Maxerror set to 16	Maxerror set to 32
avg	1.55MB	1.13MB	0.92MB	0.77MB
p50	0.36MB	0.20MB	0.086MB	0.08MB
p95	7.48MB	5.65MB	4.45MB	3.79MB
p99	14.45MB	11.59MB	9.26MB	7.77MB

As expected, the higher maxerror is, lower is the image size. We also compared the GIF sizes between our new image processing server (using libvips) and our old (using gifsicle), and we found out that the sizes are much more comparable now, and even generally smaller with libvips.

The image quality metrics we use are SSIM and DeltaE . It basically tell us how different the output GIF is different from the original ones, from a scale of 0 to 100 (0 being identical, 100 being opposite). Even if those metrics are not really meant to be used for animated images / videos, we were able to observe that increasing maxerror would increase deltaE + SSIM:

Delta E, lower is better

Metric type	Libvips 8.12.2	Maxerror set to 6	Maxerror set to 16	Maxerror set to 32
avg	0.17	0.14	0.22	0.28
p50	0.072	0.07	0.085	0.092
p95	0.25	0.38	0.59	1.05
p99	1.05	0.77	1.32	1.60

The metrics for SSIM are similar.

We also did side by side comparison of selected GIFs. We observed that our old image processing server (gifsicle) was serving GIFs that were lower quality than when using libvips with maxerror 32. Using a maxerror set to 16 leads to overall better filesize and quality than with our old image processing server, so we chose to go with that option.

jcupitt · 2022-02-08T18:18:24Z

These results look great!

Do you have a rough idea of when your PRs could make it into the next libvips / cgif releases? Is there any way we can help?

From the libvips POV, we run on roughly a six-month cycle, so the next version (8.13) is due in perhaps May or early June.

maleblond · 2022-02-08T18:55:02Z

From the libvips POV, we run on roughly a six-month cycle, so the next version (8.13) is due in perhaps May or early June.

@jcupitt Got it, thanks! Is there any change it's considered a "bug fix" and it's included in the next patch release (which appears to be done more frequently)? Definitely not a big deal; we can fallback to building from source if we want to

@dloebl I noticed #2628 is in "draft mode"; is the latest commit production-ready as is? You mentioned adding support for lossy compression for GIFs w/ active alpha channel, but we were wondering if you were planning to do it in the same PR or a follow-up one? From our POV, we'd be good using your PR as is for now, since we saw really good improvements even without the active alpha channel support 😄

jcupitt · 2022-02-08T20:53:10Z

This feels like a large change, so I think it's better to keep it for 8.13. I'd build from master (as you say).

dloebl · 2022-02-09T16:53:21Z

@maleblond Thanks a lot for these systematic benchmarks! I'm very happy to see that the improvement is so significant. The initial value of maxerror=6 is maybe a bit conservative. My reasoning to keep this parameter relatively low was that I saw a few ghosting artifacts in a manual review. I added a few examples of ghosting artifacts and further notes to #2628.

Regarding the next cgif release:
#2628 depends on cgif v0.2.0. I plan to release v0.2.0 within the next couple of weeks. Behind the scenes, I did some restructuring of cgif to make it more flexible for future improvements, that's why I would like to do a few additional checks. Improving the compression for GIFs with an active alpha channel does not have to be done in #2628. I discussed the topic with @MCLoebl and we have a plan for how this improvement can be implemented in cgif. But it is something that we should not do in a rush since there are a few tricky cases to consider.

kleisauke · 2022-02-24T16:19:55Z

I implemented histogram API in Quantizr and tried your code - works great!

I finally had some time to incorporate that, see commit kleisauke@e0be9a6. This avoids having the libimagequant symbol aliases in Quantizr. This implementation also removes the 2000x2000 pixel limit, which helps kleisauke/net-vips#157.

Here's the file size comparison table from above with the extra added column built from this commit.

Details

Original size	Desired dimension	Libvips 8.12.1	Libvips (built from #2628)	Libvips (built from commit kleisauke@`e0be9a6`)	Gifsicle 1.93
~897kB	200x	3.7MB	2.3MB	1.77MB	?
1MB	1400x	24.3MB	814KB	674KB	?
6.35MB	200x	1.67MB	1.67MB	1.45MB	194KB
6.38MB	200x	5.04MB	5.04MB	4.69MB	1.4MB
7.9MB	1024x1024	18.6MB	18.7MB	18.55MB	7.8MB

(Tested on Fedora 35 with libimagequant v2.15.1 and cgif v0.2.0)

jcupitt · 2022-03-03T13:25:23Z

I've just tweaked the gifsave change detector to spot more cases (Kleis found a nasty one).

It's not the best way to do this. We are testing if the input image image changes when we want to know is "is the current palette good enough?"

How about:

Adding something to quantizr_remap() to measure the error on each RGBA -> index conversion (just euclidean distance, or perhaps distance squared) and accumulate a histogram.
Adding API like eg. quantizr_get_error_histogram() to read the error histogram from the previous remap.

Then libvips GIF write would do this:

For each output frame:
If this is the first frame, compute a palette.
Remap this frame.
Fetch the error hist and compute the average error of the top 1% of pixels (or maybe peak error after removing the worst 1%?).
If we've just computed a new palette, make a note of the error, otherwise compare it to the error from the frame the palette was generated for.
If the error is significantly worse (more than 1%?), recompute the palette.
Write the frame and loop.

So we now only recompute the palette if the rendering error gets bad, not every time the image changes. It should make smaller images for video clips (a very common case).

kleisauke · 2022-03-04T19:13:38Z

Note that we could get rid of the internal change detector if we switch back to global palettes. The histogram API of Quantizr and libimagequant to generate a global palette in a separate sink_disc pass could help with this.

I've just rebased this branch, see:
master...kleisauke:cgifsave-global-table

I now see:

$ vipsthumbnail test1.gif[n=-1] -s 200x -o tn_%s.gif
$ vipsthumbnail test2.gif[n=-1] -s 1400x -o tn_%s.gif
$ vipsthumbnail test3.gif[n=-1] -s 200x -o tn_%s.gif
$ vipsthumbnail test4.gif[n=-1] -s 200x -o tn_%s.gif
$ vipsthumbnail test5.gif[n=-1] -s 1024x1024 -o tn_%s.gif
$ du -h --apparent-size tn_*.gif
2.1M	tn_test1.gif
734K	tn_test2.gif
1.4M	tn_test3.gif
4.5M	tn_test4.gif
18M	tn_test5.gif

It's surprisingly fast without having to compromise on memory usage and file size (as noted in #2576 (comment)).

Looking at the commits that added the histogram API to libimagequant (commit ImageOptim/libimagequant@bdb17d2, ImageOptim/libimagequant@1b858a3, ImageOptim/libimagequant@cfc15ec and ImageOptim/libimagequant@e5f45cb), it looks like it's just making some internal features publicly available. I think this is doable in Lovell's BSD-2-Clause licensed fork without having to look at those commits (clean room design), but I'm not a lawyer.

Another option is to migrate the pre-built binaries to using Quantizr. However, that would imply that the effort parameter is no-op, but maybe that's not a issue.

jcupitt · 2022-03-05T10:30:58Z

Note that we could get rid of the internal change detector if we switch back to global palettes. The histogram API of Quantizr and libimagequant to generate a global palette in a separate sink_disc pass could help with this.

That's true, but you can't make two passes if the upstream image is sequential, like the output of thumbnail. We'd need to buffer the whole RGBA image to be written as GIF in memory and process it from there. We could have a global-palette option that enabled buffering in memory and global optimization, but I'm not it should be the default.

I tried to make a list of the strategies that have been suggested:

Global palette optimization Buffer the whole image to be written in memory in the GIF writer and generate one global palette. Pro smaller output image Con high memory use, slower, lower image quality.
Local palettes via change detection (current behaviour) Look for frame to frame changes and reoptimize the palette if the change is significant. Pro Low memory use, fast, good image quality Con larger output image, change detection can be unreliable.
Local palettes via error threshold Look for errors in frame encoding going out of bounds and reoptimize the palette if they do. Pro Low memory use, fast (hopefully?), good image quality Con larger output image (though smaller than 2.).
Reuse encoding palette Attach the palette during GIF load, reuse that palette if possible for GIF save. This is what imagemagick does -- they have a specific flag you must use if you want to reoptimize. Pro extremely fast, low memory use Con can produce terrible results if you need to reoptimize and don't.

By far the most common case must be GIF -> GIF, so maybe 4. is the best strategy? Plus an option to generate a global palette with 1., or a local palette with 3.

kleisauke · 2022-03-05T18:21:02Z

That's true, but you can't make two passes if the upstream image is sequential, like the output of thumbnail. We'd need to buffer the whole RGBA image to be written as GIF in memory and process it from there.

I thought running two passes over a sequential image was fine (modifying the test suite to test sequential access in gifsave seems also to work without any errors, see e.g. commit kleisauke@8d3aa99). But indeed, writing interlaced PNG images (on which this logic is based on) also seems to load the entire image in memory.

libvips/libvips/foreign/vipspng.c

Lines 1035 to 1043 in 5f7c12e

    
           	/* Check input image. If we are writing interlaced, we need to make 7 
        
           	 * passes over the image. We advertise ourselves as seq, so to ensure 
        
           	 * we only suck once from upstream, switch to WIO.  
        
           	 */ 
        
           	if( interlace ) { 
        
           		if( !(write->memory = vips_image_copy_memory( in )) ) 
        
           			return( -1 ); 
        
           		in = write->memory; 
        
           	}

I think this way (i.e. without doing vips_image_copy_memory()) was safe, since vips_{webp,gif}load already loads the entire image into memory, which are the most common formats to save as GIF. Conversions from JPEG/PNG/TIFF to GIF are much more unusual and probably generate those "out of order read"-errors on that branch.

By far the most common case must be GIF -> GIF, so maybe 4. is the best strategy? Plus an option to generate a global palette with 1., or a local palette with 3.

Thanks for listing the pros and cons! Global palette optimization (1.) was faster in my tests, without affecting memory usage, than its current behaviour with local palettes (2.). However, I only tested it with the GIF images mentioned in this issue.

You're right about the lower image quality when generating global palettes, especially when there are more than 256 colours in the image. Gifsicle resolves this neatly by generating global colour palettes by default and switching back to local palettes when it discovers that there are more than 256 colours in the image.
https://github.com/kohler/gifsicle/blob/15c36c2f2c113c21650fa81a6381841ca1536c59/src/merge.c#L185-L186

Reusing encoding palette (4.) sounds like an interesting strategy that would accelerate GIF to GIF processing, which is the most common case. I'm not sure if we should support GIF images with local palettes in that case, since it requires adding each palette of the frame as a separate metadata item.

DarthSim · 2022-04-26T08:48:09Z

Using the 3rd approach may be better than using the 2nd in terms of a better trigger to generate a new palette. Yet I see come cons here:

Remapping is about 60% of the job. So if we need to regenerate the palette, we'll spend 1.6 times more time.
Calculating MSE is easy as we calculate the error for each pixel anyway. Building an error histogram on the other hand will take much more time and much more memory (0.5 of RGBA image size).
LIQ calculates MSE only when effort is high enough, and it doesn't count the error histogram.

Here's what I'd offer:

For each frame, calculate the palette.
Calculate the difference between palettes:
- For each entry from the new palette, find the closest entry from the old palette and calc the distance between them.
- Calc mean distance - this is our error.
If the error is too high, use the new palette for remapping. Otherwise, use the old palette.

Pros:

No fancy stuff, no surprises. We use only the functions that are supported both by LIQ and Quantizr.
We ignore minor changes that don't affect the palette.
This is definitely more accurate than the frame sum comparison. And this doesn't require frame size limitations.

Cons:

We quantize every frame no matter what. But I bet we do the same with the current approach anyway.

DarthSim · 2022-04-26T08:53:53Z

And yeah, I still think that the global palette approach is the best one. GIF -> GIF conversion is the most common case, and most GIFs have global palettes. So I think that by trying to generate the best palette we are trying to solve a VERY rare case.

jcupitt · 2022-05-07T11:32:27Z

I was looking at hacking up a test implementation for (4) and realized that libnsgif (our GIF loader) can't tell us if an image has local colour tables, so that's blocked, sigh.

I'll ask the libnsgif maintainers if they'd be willing to add this feature (it should be simple, I think).

jcupitt · 2022-05-07T13:53:13Z

... libnsgif are adding this feature. Once it's done, I'll hack up a quick test of strategy (4) for us to evaluate.

Slight tangent: how about end of May for a release of libvips 8.13? It'd be good if we could resolve this issue by then.

... if there are no local colour tables. See #2576

jcupitt · 2022-05-09T14:51:20Z

Here's a libvips branch which attaches the input GIF palette, if possible:

https://github.com/libvips/libvips/tree/gif-palette

Eg.:

$ python3
Python 3.10.4 (main, Apr  2 2022, 09:04:19) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyvips
>>> x = pyvips.Image.new_from_file("3198.gif")
>>> x.get("gif-palette")
[-14010776, -4605511, -12035707, -16119469, -14282072, ...
>>> len(x.get("gif-palette"))
256
>>>

Each int is actually a int32 of RGBA bytes, hence the -ves.

Would anyone like to try getting GIF save to use these palettes if they are available?

* save GIF palette as metadata ... if there are no local colour tables. See #2576 * cgifsave: reuse global palette, if possible * add reuse_palette parameter * add reoptimise parameter and reuse palette by default * attach global palette even if local palettes are present * add check for presence of use-lct * Revert "add check for presence of use-lct" This reverts commit cd0f14e. * Revert "attach global palette even if local palettes are present" This reverts commit 4085b9e. * move global palette quantization to cgif_build * rename member variable gct * update comments * improve error handling * update documentation Co-authored-by: John Cupitt <jcupitt@gmail.com>

jcupitt · 2022-05-23T12:25:25Z

Master now has GIF palette reuse! Good job everyone.

Global palettes are attached on GIF load
On save, if the image has an attached palette, it's written with that
On save, if there's no attached palette, or if the reoptimise option is set, local palettes are generated on significant change

There's no option to generate a new global palette on write, perhaps there should be.

This will be in 8.13.

jcupitt · 2022-05-23T12:26:16Z

... let's make a new issue for any further improvements to GIF handling, this one has become very large,

dloebl mentioned this issue Jan 18, 2022

[WIP] Add cgif_rgb API dloebl/cgif#37

Draft

16 tasks

dloebl mentioned this issue Jan 19, 2022

cgifsave: avoid size issue #2628

Merged

lovell mentioned this issue Apr 3, 2022

Very slow when processing animated gif images lovell/sharp#3176

Closed

dloebl mentioned this issue Apr 26, 2022

cgifsave: allow to apply maxerror optimization to frames with alpha channel #2778

Merged

tlsa mentioned this issue May 8, 2022

Update LibNSGIF to latest upstream #2791

Merged

jcupitt added a commit that referenced this issue May 9, 2022

save GIF palette as metadata

6a3d770

... if there are no local colour tables. See #2576

dloebl mentioned this issue May 10, 2022

cgifsave: reuse global palette, if possible #2797

Merged

jcupitt closed this as completed May 23, 2022

wchaws mentioned this issue Jun 14, 2022

gif resize larger than original file wchaws/serverless-image-handler#122

Open

6 tasks

lovell mentioned this issue Apr 4, 2023

Animated GIF becomes up to 6x larger in size lovell/sharp#3610

Closed

3 tasks

Uh oh!

cgifsave file size larger than expected for some gifs #2576

cgifsave file size larger than expected for some gifs #2576

Comments

jmreid commented Dec 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

dloebl commented Dec 11, 2021

Uh oh!

jcupitt commented Dec 11, 2021

Uh oh!

jcupitt commented Dec 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kleisauke commented Dec 11, 2021

Uh oh!

jcupitt commented Dec 11, 2021

Uh oh!

jcupitt commented Dec 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jcupitt commented Dec 11, 2021

Uh oh!

dloebl commented Dec 11, 2021

Uh oh!

jcupitt commented Dec 11, 2021

Uh oh!

jcupitt commented Dec 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jcupitt commented Dec 12, 2021

Uh oh!

dloebl commented Dec 13, 2021

Uh oh!

jcupitt commented Dec 13, 2021

Uh oh!

DarthSim commented Dec 14, 2021

Uh oh!

jcupitt commented Dec 14, 2021

Uh oh!

DarthSim commented Dec 14, 2021

Uh oh!

jcupitt commented Dec 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DarthSim commented Dec 14, 2021

Uh oh!

dloebl commented Dec 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DarthSim commented Dec 15, 2021

Uh oh!

jcupitt commented Dec 15, 2021

Uh oh!

DarthSim commented Dec 15, 2021

Uh oh!

jmreid commented Dec 15, 2021

Uh oh!

kleisauke commented Dec 15, 2021

Uh oh!

DarthSim commented Dec 15, 2021

Uh oh!

kleisauke commented Dec 16, 2021

Uh oh!

DarthSim commented Dec 19, 2021

Uh oh!

joshuamsager commented Jan 17, 2022

Uh oh!

lovell commented Jan 19, 2022

Uh oh!

maleblond commented Jan 24, 2022

Impact on previously mentioned problematic GIFs

Holistic impact

Sample GIFs

Uh oh!

dloebl commented Jan 30, 2022

Uh oh!

joshuamsager commented Feb 1, 2022

Uh oh!

maleblond commented Feb 8, 2022

Detailed analysis

jmreid commented Dec 10, 2021 •

edited

Loading

jcupitt commented Dec 11, 2021 •

edited

Loading

jcupitt commented Dec 11, 2021 •

edited

Loading

jcupitt commented Dec 11, 2021 •

edited

Loading

jcupitt commented Dec 14, 2021 •

edited

Loading

dloebl commented Dec 14, 2021 •

edited

Loading

jcupitt commented Feb 8, 2022 •

edited

Loading

jcupitt commented Mar 3, 2022 •

edited

Loading

jcupitt commented Mar 5, 2022 •

edited

Loading

jcupitt commented May 7, 2022 •

edited

Loading