imgproc: supports CV_SIMD_SCALABLE in pointSetBoundingRect #27479

fengyuentau · 2025-06-25T08:38:48Z

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

fengyuentau · 2025-06-25T08:44:08Z

Performance stats:

SpacemiT K1 (RISC-V):

                  Name of Test                   base-gcc patch-gcc patch-gcc 
                                                                        vs    
                                                                     base-gcc 
                                                                    (x-factor)
BoundingRect::TestBoundingRect::(CV_32F, 400)     0.001     0.001      1.95   
BoundingRect::TestBoundingRect::(CV_32F, 511)     0.002     0.001      2.12   
BoundingRect::TestBoundingRect::(CV_32F, 1000)    0.003     0.001      2.53   
BoundingRect::TestBoundingRect::(CV_32F, 10000)   0.025     0.011      2.15   
BoundingRect::TestBoundingRect::(CV_32F, 100000)  0.245     0.108      2.27   
BoundingRect::TestBoundingRect::(CV_32S, 400)     0.001     0.001      1.35   
BoundingRect::TestBoundingRect::(CV_32S, 511)     0.001     0.001      1.38   
BoundingRect::TestBoundingRect::(CV_32S, 1000)    0.002     0.001      1.54   
BoundingRect::TestBoundingRect::(CV_32S, 10000)   0.014     0.011      1.25   
BoundingRect::TestBoundingRect::(CV_32S, 100000)  0.139     0.109      1.27

                  Name of Test                   base-clang patch-clang patch-clang
                                                                            vs     
                                                                        base-clang 
                                                                        (x-factor) 
BoundingRect::TestBoundingRect::(CV_32F, 400)      0.002       0.001       2.68    
BoundingRect::TestBoundingRect::(CV_32F, 511)      0.002       0.001       2.85    
BoundingRect::TestBoundingRect::(CV_32F, 1000)     0.004       0.001       3.38    
BoundingRect::TestBoundingRect::(CV_32F, 10000)    0.039       0.013       3.03    
BoundingRect::TestBoundingRect::(CV_32F, 100000)   0.380       0.120       3.17    
BoundingRect::TestBoundingRect::(CV_32S, 400)      0.001       0.001       1.89    
BoundingRect::TestBoundingRect::(CV_32S, 511)      0.002       0.001       2.01    
BoundingRect::TestBoundingRect::(CV_32S, 1000)     0.003       0.001       2.35    
BoundingRect::TestBoundingRect::(CV_32S, 10000)    0.026       0.013       2.02    
BoundingRect::TestBoundingRect::(CV_32S, 100000)   0.248       0.120       2.06

i7-12700K:

                  Name of Test                   i7-base i7-patch  i7-patch 
                                                                      vs    
                                                                   i7-base  
                                                                  (x-factor)
BoundingRect::TestBoundingRect::(CV_32F, 400)     0.000   0.000      1.05   
BoundingRect::TestBoundingRect::(CV_32F, 511)     0.000   0.000      1.03   
BoundingRect::TestBoundingRect::(CV_32F, 1000)    0.000   0.000      1.04   
BoundingRect::TestBoundingRect::(CV_32F, 10000)   0.004   0.004      1.04   
BoundingRect::TestBoundingRect::(CV_32F, 100000)  0.042   0.041      1.02   
BoundingRect::TestBoundingRect::(CV_32S, 400)     0.000   0.000      1.09   
BoundingRect::TestBoundingRect::(CV_32S, 511)     0.000   0.000      1.09   
BoundingRect::TestBoundingRect::(CV_32S, 1000)    0.001   0.001      1.01   
BoundingRect::TestBoundingRect::(CV_32S, 10000)   0.005   0.004      1.11   
BoundingRect::TestBoundingRect::(CV_32S, 100000)  0.044   0.042      1.03

m2:

                  Name of Test                   m2-base m2-patch  m2-patch
                                                                      vs
                                                                   m2-base
                                                                  (x-factor)
BoundingRect::TestBoundingRect::(CV_32F, 400)     0.000   0.000      1.10
BoundingRect::TestBoundingRect::(CV_32F, 511)     0.000   0.000      0.97
BoundingRect::TestBoundingRect::(CV_32F, 1000)    0.000   0.000      0.98
BoundingRect::TestBoundingRect::(CV_32F, 10000)   0.003   0.003      0.99
BoundingRect::TestBoundingRect::(CV_32F, 100000)  0.031   0.031      1.00
BoundingRect::TestBoundingRect::(CV_32S, 400)     0.000   0.000      0.95
BoundingRect::TestBoundingRect::(CV_32S, 511)     0.000   0.000      0.92
BoundingRect::TestBoundingRect::(CV_32S, 1000)    0.000   0.000      1.04
BoundingRect::TestBoundingRect::(CV_32S, 10000)   0.003   0.003      1.00
BoundingRect::TestBoundingRect::(CV_32S, 100000)  0.029   0.031      0.92

fengyuentau · 2025-06-25T08:48:01Z

modules/imgproc/src/geometry.cpp

-        }
-        for(int j = 16; j < VTraits<v_uint8>::vlanes(); j*=2)
+        constexpr int max_nlanes = VTraits<v_int32>::max_nlanes;
+        int arr_minval[max_nlanes], arr_maxval[max_nlanes];


We can do better than this if we have something like vtrn1q and vtrn2q in universal intrinsics, which does something like this:

v: [x1 y1 x2 y2] vtrn1q(v, v): [x1 x1 x2 x2] vtrn2q(v, v): [y1 y1 y2 y2]

IMHO you can add extra branch with #ifdef CV_NEON. It gives you a chance to fix regressions on Mac M chips.

Sure we can use CV_NEON for conditional compilation.

Tested with vtrn and actually the performance did not go better on m2.

Could you push it as separate branch? It may be useful with other ARM configs. I want to test it with Jetson.

Let me give you a patch instead.

diff --git a/modules/imgproc/src/geometry.cpp b/modules/imgproc/src/geometry.cpp index c98684f6d4..412460e0e2 100644 --- a/modules/imgproc/src/geometry.cpp +++ b/modules/imgproc/src/geometry.cpp @@ -739,6 +739,16 @@ static Rect pointSetBoundingRect( const Mat& points ) minval = v_min(ptXY2, minval); maxval = v_max(ptXY2, maxval); } + #if CV_NEON_AARCH64 + int32x4_t x_minval = vtrn1q_s32(minval.val, minval.val); + int32x4_t y_minval = vtrn2q_s32(minval.val, minval.val); + int32x4_t x_maxval = vtrn1q_s32(maxval.val, maxval.val); + int32x4_t y_maxval = vtrn2q_s32(maxval.val, maxval.val); + xmin = vminvq_s32(x_minval); + ymin = vminvq_s32(y_minval); + xmax = vmaxvq_s32(x_maxval); + ymax = vmaxvq_s32(y_maxval); + #else constexpr int max_nlanes = VTraits<v_int32>::max_nlanes; int arr_minval[max_nlanes], arr_maxval[max_nlanes]; vx_store(arr_minval, minval); @@ -754,6 +764,7 @@ static Rect pointSetBoundingRect( const Mat& points ) if (xmax < arr_maxval[2*j]) xmax = arr_maxval[2*j]; if (ymax < arr_maxval[2*j+1]) ymax = arr_maxval[2*j+1]; } + #endif #endif for( ; i < npoints; i++ ) { @@ -794,6 +805,17 @@ static Rect pointSetBoundingRect( const Mat& points ) minval = v_min(ptXY2, minval); maxval = v_max(ptXY2, maxval); } + #if CV_NEON_AARCH64 + v_int32 i_minval = v_floor(minval), i_maxval = v_floor(maxval); + int32x4_t x_minval = vtrn1q_s32(i_minval.val, i_minval.val); + int32x4_t y_minval = vtrn2q_s32(i_minval.val, i_minval.val); + int32x4_t x_maxval = vtrn1q_s32(i_maxval.val, i_maxval.val); + int32x4_t y_maxval = vtrn2q_s32(i_maxval.val, i_maxval.val); + xmin = vminvq_s32(x_minval); + ymin = vminvq_s32(y_minval); + xmax = vmaxvq_s32(x_maxval); + ymax = vmaxvq_s32(y_maxval); + #else constexpr int max_nlanes = VTraits<v_int32>::max_nlanes; int arr_minval[max_nlanes], arr_maxval[max_nlanes]; vx_store(arr_minval, v_floor(minval)); @@ -809,6 +831,7 @@ static Rect pointSetBoundingRect( const Mat& points ) if (xmax < arr_maxval[2*j]) xmax = arr_maxval[2*j]; if (ymax < arr_maxval[2*j+1]) ymax = arr_maxval[2*j+1]; } + #endif #endif for( ; i < npoints; i++ ) {

I observed that with older version of gcc (8.3), the halide trick which deals with the tail of simd processing, can lead to some performance degradation. Newer compiler seems fine.

However, if the halide trick is dropped, the performance on i7-12700K also degrades.

modules/imgproc/src/geometry.cpp

asmorkalov · 2025-07-14T10:00:45Z

Hm, there is strange fault on armv7:

[----------] 4 tests from Imgproc_BoundingRect_Types
[ RUN      ] Imgproc_BoundingRect_Types.accuracy/0, where GetParam() = 4
[       OK ] Imgproc_BoundingRect_Types.accuracy/0 (81 ms)
[ RUN      ] Imgproc_BoundingRect_Types.accuracy/1, where GetParam() = 5
[       OK ] Imgproc_BoundingRect_Types.accuracy/1 (82 ms)
[ RUN      ] Imgproc_BoundingRect_Types.alignment/0, where GetParam() = 4
***  ***: ./bin/opencv_test_imgproc terminated
***  ***: ./bin/opencv_test_imgproc terminated

I'm debugging it right now.

asmorkalov · 2025-07-14T12:11:48Z

The test check the function behaviour with unaligned input. It was introduced by Maxim in c4ce942. armv7 is more sensitive to alignment and triggers sigbus:

Note: Google Test filter = Imgproc_BoundingRect_Types.alignment/0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Imgproc_BoundingRect_Types
[ RUN      ] Imgproc_BoundingRect_Types.alignment/0, where GetParam() = 4

Program received signal SIGBUS, Bus error.
0xb6be3ab0 in pointSetBoundingRect(cv::Mat const&) () from /mnt/disk/opencv-build/lib/libopencv_imgproc.so.413
(gdb) bt
#0  0xb6be3ab0 in pointSetBoundingRect(cv::Mat const&) () from /mnt/disk/opencv-build/lib/libopencv_imgproc.so.413
Backtrace stopped: Cannot access memory at address 0x100fff8

Newer generation just charges some performance penalties here.

feat: pointSetBoundingRect supports CV_SIMD_SCALABLE

ba8bee5

fengyuentau added this to the 4.13.0 milestone Jun 25, 2025

fengyuentau requested a review from asmorkalov June 25, 2025 08:38

fengyuentau added optimization category: imgproc platform: riscv labels Jun 25, 2025

asmorkalov self-assigned this Jun 25, 2025

fengyuentau commented Jun 25, 2025

View reviewed changes

fengyuentau added 3 commits June 26, 2025 11:36

fix: fix scalar accuracy

8944340

fix: fix accuracy of simd

164b614

doc: fix tab

9cb463a

asmorkalov reviewed Jul 8, 2025

View reviewed changes

modules/imgproc/src/geometry.cpp Show resolved Hide resolved

fix: drop memcpy

f119229

vpisarev reviewed Jul 9, 2025

View reviewed changes

modules/imgproc/src/geometry.cpp Outdated Show resolved Hide resolved

vpisarev self-requested a review July 9, 2025 19:45

feat: use min and max in place of if conditions

23add81

vpisarev approved these changes Jul 12, 2025

View reviewed changes

asmorkalov approved these changes Jul 14, 2025

View reviewed changes

asmorkalov self-requested a review July 14, 2025 12:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

imgproc: supports CV_SIMD_SCALABLE in pointSetBoundingRect #27479

imgproc: supports CV_SIMD_SCALABLE in pointSetBoundingRect #27479

fengyuentau commented Jun 25, 2025

Uh oh!

fengyuentau commented Jun 25, 2025 •

edited

Loading

Uh oh!

fengyuentau Jun 25, 2025

Uh oh!

asmorkalov Jun 25, 2025

Uh oh!

fengyuentau Jun 25, 2025

Uh oh!

fengyuentau Jun 25, 2025

Uh oh!

asmorkalov Jun 25, 2025

Uh oh!

fengyuentau Jun 25, 2025

Uh oh!

fengyuentau Jun 25, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

asmorkalov commented Jul 14, 2025

Uh oh!

asmorkalov commented Jul 14, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

imgproc: supports CV_SIMD_SCALABLE in pointSetBoundingRect #27479

Are you sure you want to change the base?

imgproc: supports CV_SIMD_SCALABLE in pointSetBoundingRect #27479

Conversation

fengyuentau commented Jun 25, 2025

Pull Request Readiness Checklist

Uh oh!

fengyuentau commented Jun 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fengyuentau Jun 25, 2025

Choose a reason for hiding this comment

Uh oh!

asmorkalov Jun 25, 2025

Choose a reason for hiding this comment

Uh oh!

fengyuentau Jun 25, 2025

Choose a reason for hiding this comment

Uh oh!

fengyuentau Jun 25, 2025

Choose a reason for hiding this comment

Uh oh!

asmorkalov Jun 25, 2025

Choose a reason for hiding this comment

Uh oh!

fengyuentau Jun 25, 2025

Choose a reason for hiding this comment

Uh oh!

fengyuentau Jun 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

asmorkalov commented Jul 14, 2025

Uh oh!

asmorkalov commented Jul 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

fengyuentau commented Jun 25, 2025 •

edited

Loading

fengyuentau Jun 25, 2025 •

edited

Loading

asmorkalov commented Jul 14, 2025 •

edited

Loading