Agg: Remove 16-bit limits #28904

QuLogic · 2024-09-28T08:08:47Z

PR summary

When rendering objects, Agg rasterizes them into scan line objects (an x/y point, horizontal length, and colour), and the renderer class writes those to the pixels in the final buffer. Though we have determined that Agg buffers cannot be larger than 2**16, the scan line classes that we use contain 16-bit signed integers internally, cutting off positive values at half the maximum.

Since the renderer uses 32-bit integers, this can cause odd behaviour where any horizontal span that starts before 2**15 (such as a horizontal spine) is rendered correctly even if it cross that point, but those that start after (such as a vertical spine or any portion of an angled line) end up clipped. For example, see how the spines and lines break in #28893.

A similar problem occurs for resampled images, which also uses Agg scanlines internally. See the breakage in #26368 for an example.

The example in that issue also contains horizontal spines that are wider than 2**15, which also exhibit strange behaviour.

By moving to 32-bit scan lines, positions and lengths of the lines will no longer be clipped, and this fixes rendering on very large figures.

Now, this does mean that internal bookkeeping will use double the memory. I have not yet benchmarked the memory usage to see what effect that would have. But I have implemented this two ways to show how it might work:

In _image_resample.h, the larger scan lines are used if the output image is >=2**15 in either dimension.
In _backend_agg.h, the larger scan lines are always used. While this is unfortunately the more common case, it's also the harder one to conditionalize as the type is in the class, and the template is used everywhere.

That being said, having the scan lines be conditional means that double the code is emitted instead. For the Agg backend, it increases by 4096 bytes (after stripping debug info), whereas for the image extension, it increased by 65536 bytes vs no increase for just using the larger one.

However, given the breakage in #23826, it may be necessary to always use the 32-bit scan lines.

Fixes #23826
Fixes #26368
Fixes #28893

PR checklist

"closes #0000" is in the body of the PR description to link the related issue
new and changed code is tested
[n/a] Plotting related features are demonstrated in an example
New Features and API Changes are noted with a directive and release note
Documentation complies with general and docstring guidelines

tacaswell · 2024-09-28T12:29:18Z

Even if we double the book keeping cost, I would expect it to be small in absolute terms? Our output is small (on the scale of the data our users feed us, on the scale of a small of the footprint of a typical modern program and available memory on a typical machine).

timhoffm · 2024-09-28T15:44:11Z

... and given AGG is basically unchanged since 2006 and typical RAM sizes back then were about 512MB, I believe we can afford a worst-case factor of 2 increase now after 20 years.

QuLogic · 2024-10-01T10:51:42Z

I ran tests with pytest-memray (starting with #28856 to avoid the leak), and 99% of the tests didn't change peak memory usage. About 16 were within 50K, 3 more within 100K, and 17 over 1M bytes difference (both above and below the original). Running those 17 all over again, a lot were close to zero difference the second time over. I spot checked the remaining ones and the difference was in NumPy random state somewhere. So I think almost all the differences in peak memory usage were just global state and/or due to caching from other tests running earlier depending on parallel job distribution.

timhoffm · 2024-10-01T11:30:35Z

src/_image_resample.h

@@ -699,8 +702,8 @@ static void get_filter(const resample_params_t &params,
 }


-template<typename color_type>
-void resample(
+template<typename color_type, bool is_large = false>


Would this be a better name?

Suggested change

template<typename color_type, bool is_large = false>

template<typename color_type, bool use_32bit = false>

Since there's not really been much change to memory usage, I'm going to drop the template change and just always use the larger scan lines. That'll save us some code space, and template complications.

When rendering objects, Agg rasterizes them into scan line objects (an x/y point, horizontal length, and colour), and the renderer class writes those to the pixels in the final buffer. Though we have determined that Agg buffers cannot be larger than 2**16, the scan line classes that we use contain 16-bit _signed_ integers internally, cutting off positive values at half the maximum. Since the renderer uses 32-bit integers, this can cause odd behaviour where any horizontal span that _starts_ before 2**15 (such as a horizontal spine) is rendered correctly even if it cross that point, but those that start after (such as a vertical spine or any portion of an angled line) end up clipped. For example, see how the spines and lines break in matplotlib#28893. A similar problem occurs for resampled images, which also uses Agg scanlines internally. See the breakage in matplotlib#26368 for an example. The example in that issue also contains horizontal spines that are wider than 2**15, which also exhibit strange behaviour. By moving to 32-bit scan lines, positions and lengths of the lines will no longer be clipped, and this fixes rendering on very large figures. Fixes matplotlib#23826 Fixes matplotlib#26368 Fixes matplotlib#28893

For very large Figures with a single Axes, the horizontal spines may be long enough to need splitting into separate lines. For large enough coordinates, the `x1+x2` calculation may overflow, flipping the sign bit, causing the coordinates to become negative. This causes an infinite loop as some internal condition is never met. By dividing `x1`/`x2` by 2 first, we avoid the overflow, and can calculate the split point correctly.

QuLogic · 2024-10-02T08:26:45Z

After some additional checks, I was able to determine that the image dimension limit is also outdated, and we can expand it to match Agg's coordinates at 2**23 pixels. Also after some searching, I believe I've essentially re-created #3451 which was closed due to the CXX removal.

With these changes, I am able to run:

import matplotlib.pyplot as plt
import numpy as np

t = np.arange(0, 3000, 0.01)
s1 = np.sin(2 * np.pi * 20 * t)

fig, ax = plt.subplots(figsize=(83886.07, .5), layout='constrained')
ax.plo(t, s1)
ax.set(xlim=(0, 2), xticks=[], yticks=[])
ax.grid(True)

fig.savefig('lines.png')

and produce an 8388607x50-pixel (i.e., 2**23-1) image with lines that are not broken in any visible way.

Shortening the example from #26368:

import numpy as np
import matplotlib.pyplot as plt

image = np.zeros((100000, 800, 3))
linspace = np.linspace(0, 1, 800)
image[:, :, 0] = 1 - linspace  # Red channel
image[:, :, 2] = linspace  # Blue channel

plt.figure(figsize=(10000, 0.5), dpi=100)
ax = plt.axes([0, 0, 1, 1], frameon=False)
ax.set_xticks([])
ax.set_yticks([])

ax.imshow(image, aspect="auto")

plt.savefig("gradient_image.png", pad_inches=0)

which produces a 1Mx50-pixel image of a gradient. Unfortunately, the resampling step takes about 16s per 100000 pixels, or, nearly 3 minutes for 1M pixels and 5-7GiB of RAM. So I did not test out the full 8M pixel image.

I also read issues that mentioned the 32768-pixel limit as being something about sqrt(max int), i.e., total number of pixels below max int, though I'm not sure why. I believe we have changed most multiplications to upcast to AGG_INT64, and (2**23-1)**2 should be well within that. With RGBA, it's also approximately 256 TiB, so I have neither the RAM nor the time to test that.

With 32-bit scan lines, we are able to take advantage of the full 32-bit range of coordinates. However, as noted in `extern/agg24-svn/include/agg_rasterizer_scanline_aa.h`, Agg uses 24.8-bit fixed point coordinates. With one bit taken for the sign, the maximum coordinate is now 2**23.

tacaswell · 2024-10-07T19:12:36Z

Does this need a rebase?

QuLogic · 2024-10-08T10:07:08Z

It could to fix the CI, but from other PRs, it seems that macOS/homebrew broke again for other reasons.

QuLogic added backend: agg Performance labels Sep 28, 2024

github-actions bot added the topic: images label Sep 28, 2024

timhoffm reviewed Oct 1, 2024

View reviewed changes

timhoffm approved these changes Oct 1, 2024

View reviewed changes

tacaswell added this to the v3.10.0 milestone Oct 1, 2024

QuLogic added 2 commits October 2, 2024 03:12

QuLogic force-pushed the remove-16bit-limits branch from ff86c19 to ace1cc2 Compare October 2, 2024 08:07

QuLogic changed the title ~~Agg: Use 32-bit scan line classes~~ Agg: Remove 16-bit limits Oct 2, 2024

QuLogic marked this pull request as ready for review October 2, 2024 08:27

QuLogic force-pushed the remove-16bit-limits branch from ace1cc2 to 9259989 Compare October 2, 2024 08:32

ksunden approved these changes Oct 2, 2024

View reviewed changes

greglucas approved these changes Oct 8, 2024

View reviewed changes

greglucas merged commit 9b33fbd into matplotlib:main Oct 8, 2024
38 of 44 checks passed

QuLogic deleted the remove-16bit-limits branch October 8, 2024 22:23

zqianem mentioned this pull request Jan 8, 2025

[Bug]: High memory usage in Figure.savefig #29434

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Agg: Remove 16-bit limits #28904

Agg: Remove 16-bit limits #28904

Uh oh!

QuLogic commented Sep 28, 2024

Uh oh!

tacaswell commented Sep 28, 2024

Uh oh!

timhoffm commented Sep 28, 2024

Uh oh!

QuLogic commented Oct 1, 2024 •

edited

Loading

Uh oh!

timhoffm Oct 1, 2024

Uh oh!

QuLogic Oct 1, 2024

Uh oh!

QuLogic commented Oct 2, 2024

Uh oh!

tacaswell commented Oct 7, 2024

Uh oh!

QuLogic commented Oct 8, 2024

Uh oh!

Uh oh!

Uh oh!

	template<typename color_type, bool is_large = false>
	template<typename color_type, bool use_32bit = false>

Uh oh!

Agg: Remove 16-bit limits #28904

Agg: Remove 16-bit limits #28904

Uh oh!

Conversation

QuLogic commented Sep 28, 2024

PR summary

PR checklist

Uh oh!

tacaswell commented Sep 28, 2024

Uh oh!

timhoffm commented Sep 28, 2024

Uh oh!

QuLogic commented Oct 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

timhoffm Oct 1, 2024

Choose a reason for hiding this comment

Uh oh!

QuLogic Oct 1, 2024

Choose a reason for hiding this comment

Uh oh!

QuLogic commented Oct 2, 2024

Uh oh!

tacaswell commented Oct 7, 2024

Uh oh!

QuLogic commented Oct 8, 2024

Uh oh!

Uh oh!

Uh oh!

QuLogic commented Oct 1, 2024 •

edited

Loading