reducev performance #2757

kleisauke · 2022-04-11T19:27:23Z

Maintainer

The extra vips_sequential in reducev, introduced in commit 3ca2068, can slow down multi-threaded resize operations. As a result, --vips-concurrency 1 or VIPS_CONCURRENCY=1 may perform better than the same operation run with full concurrency. See, for example, this gist:
https://gist.github.com/kleisauke/7d6f194cde52737ce8a494ee0a5e55e7

In that gist, removing this extra vips_sequential makes the C path of reducev ~67% faster and the AVX2/SSE4.1 paths (see the simd branch) between ~40% and ~52% faster. After that, the multi-threaded timings are all in line with each other, differing by at most 16 ms.

This discussion was opened to find possible alternatives for resolving the "out of order read" errors under heavy load, which were fixed with that commit.

(This was initially discussed on Gitter, but I thought it would be better to track it here.)

jcupitt · 2022-04-12T08:24:57Z

Maintainer

I agree, that was a really expensive fix; there must be something better.

I think this is where that extra seq came from: #617, i.e. it was part of the revised seq mechanism.

It'd be good to get a reproducer for this problem so we can test possible improvements. I've tried commenting out the extra seq and provoking an out of order read, but I've not managed it yet.

There's also this issue from a month later: #639. That one finds and fixes a couple of small cache size issues, so (perhaps??) it makes the extra seq unnecessary now?

I've tried using VIPS_STALL, i.e.:

$ VIPS_STALL=1 vips reducev k2.jpg x.jpg 1.5
$ VIPS_STALL=1 vips reducev st-francis.jpg x.jpg 1.5
$ VIPS_STALL=1 vips reducev st-francis.jpg x.jpg 16

But I've not been able to provoke an out of order read.

6 replies

jcupitt Apr 12, 2022
Maintainer

Ah that's a shame. But good to have a reproducer, nice!

jcupitt Apr 12, 2022
Maintainer

That doesn't reproduce on this PC :( I'll try a laptop.

kleisauke Apr 12, 2022
Maintainer Author

Ah, I could not reproduce this with --vips-concurrency 32 (IIRC, that's the same number of threads as your Threadripper workstation). However, I was able to reproduce it with --vips-concurrency 6 (which is the same number of threads as my i5-8600K workstation).

kleisauke Apr 20, 2022
Maintainer Author

Here's another benchmark: https://gist.github.com/kleisauke/8e27833e3c0ac0b4f35200203cc36bf4 (on Windows with a 5000x5000 image).

It's curious to see that this performance degradation affects the simd branch more than the master branch. Perhaps the THINSTRIP -> FATSTRIP change could cause that (commit kleisauke@ef2f42f)?

kleisauke Aug 1, 2022
Maintainer Author

I've updated the above benchmark gist for libvips 8.13.0 and the latest simd branch revision. Here are some handy commits to test with:
kleisauke@735ac0a (based on the master branch)
kleisauke@6733cd4 (based on the simd branch)

jcupitt · 2023-11-15T16:43:55Z

Maintainer

There's an issue with sample code and a possible improvement here: #3754

1 reply

jcupitt Nov 15, 2023
Maintainer

To save you a click, the idea is that we should have some thread-private storage containing an allocation number (a small integer assigned in order from 0 by the threadpool allocation function; this can go in the base class) and order by that. sequential should stall threads which are out of order, and only let them through in strict allocation order.

This means out of order reads should be impossible, except in the cases where there's very large non-locality.
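For anyone who wants to play with the ordering idea outside libvips, here is a minimal Python sketch of it. It is not libvips code and not the sample code from #3754: the names AllocationSequencer, allocate, run_in_order and demo are all hypothetical, and it assumes every allocated number is eventually presented by some thread. Each worker keeps its allocation number in thread-private storage and is stalled until all lower numbers have gone through.

```python
import threading


class AllocationSequencer:
    """Hypothetical sketch: let threads through in strict allocation order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_alloc = 0  # next allocation number to hand out
        self._next_run = 0    # allocation number allowed through next

    def allocate(self):
        # Stands in for the threadpool allocation function: numbers are
        # assigned in order, starting from 0.
        with self._cond:
            n = self._next_alloc
            self._next_alloc += 1
            return n

    def run_in_order(self, n, fn):
        # Stall the caller until it is the turn of allocation number n,
        # then run fn and release the next number in line.
        with self._cond:
            while n != self._next_run:
                self._cond.wait()
            try:
                return fn()
            finally:
                self._next_run += 1
                self._cond.notify_all()


def demo(n_workers=8):
    seq = AllocationSequencer()
    local = threading.local()  # thread-private storage for the number
    order = []

    def worker(alloc):
        local.alloc = alloc
        seq.run_in_order(local.alloc, lambda: order.append(local.alloc))

    numbers = [seq.allocate() for _ in range(n_workers)]
    # Start the workers in reverse so they arrive out of order on purpose.
    threads = [threading.Thread(target=worker, args=(n,))
               for n in reversed(numbers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


if __name__ == "__main__":
    print(demo())
```

In this sketch the work runs while the condition's lock is held, which is fine for a serial read but would need rethinking for anything heavier. It also stalls forever if a thread holding an allocated number never arrives, so a real version would need some way out of that.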
