GAPI Fluid: SIMD Multiply kernel. #21024

anna-khakimova · 2021-11-09T11:28:55Z

SIMD optimization for Fluid Multiply kernel.

Performance report:

force_builders=Linux AVX2,Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
Xbuild_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

build_image:Custom=centos:7
buildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

Xbuild_image:Custom=ubuntu-openvino-2021.3.0:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.2.0

buildworker:Custom Win=windows-3

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
Xtest_opencl:Custom=OFF
Xtest_bigdata:Custom=1
Xtest_filter:Custom=*

CPU_BASELINE:Custom Win=AVX512_SKX
CPU_BASELINE:Custom=SSE4_2

sivanov-work · 2021-11-09T11:47:05Z

modules/gapi/src/backends/fluid/gfluidcore.cpp

 {
-    // like OpenCV: returns 0, if y=0
+    // like OpenCV: returns 0, if DST type=uchar/short/ushort and divider(y)=0
    auto result = y? scale * x / y: 0;


Can SRC2 be float?

Yes, it can. The next overload handles this case.
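
A minimal sketch of the two-overload arrangement discussed in this thread, assuming the overloads are selected on the DST type as the updated comment suggests. The names `div_sketch` and `saturate_sketch` are hypothetical stand-ins, not the functions in gfluidcore.cpp, and the float overload's behavior (plain IEEE division, so a zero divisor gives inf or NaN) is an assumption rather than something shown in the diff.

```cpp
#include <cmath>
#include <limits>
#include <type_traits>

// Hypothetical helper: round to nearest and clamp into DST's range.
// Assumes a finite input value.
template<typename DST>
DST saturate_sketch(float v)
{
    const float lo = static_cast<float>(std::numeric_limits<DST>::lowest());
    const float hi = static_cast<float>(std::numeric_limits<DST>::max());
    const float r  = std::nearbyint(v);
    return static_cast<DST>(r < lo ? lo : (r > hi ? hi : r));
}

// Integer destinations (uchar/short/ushort): a zero divisor yields 0.
// The test on y also works when SRC2 is float, since 0.0f converts to false.
template<typename DST, typename SRC1, typename SRC2>
typename std::enable_if<!std::is_same<DST, float>::value, DST>::type
div_sketch(SRC1 x, SRC2 y, float scale = 1.f)
{
    const float result = y ? scale * x / y : 0.f;
    return saturate_sketch<DST>(result);
}

// Float destination (assumed behavior): no special case for a zero divisor.
template<typename DST, typename SRC1, typename SRC2>
typename std::enable_if<std::is_same<DST, float>::value, DST>::type
div_sketch(SRC1 x, SRC2 y, float scale = 1.f)
{
    return scale * x / y;
}
```

For example, `div_sketch<unsigned char>(10, 0)` takes the first overload and returns 0, while `div_sketch<float>(1.f, y)` with `y == 0.f` takes the second.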

sivanov-work · 2021-11-09T11:52:39Z

modules/gapi/src/backends/fluid/gfluidcore.cpp

+        BINARY_(uchar,  uchar,  uchar,  run_arithm, dst, src1, src2, ARITHM_MULTIPLY, scale);
+        BINARY_(uchar,  ushort, ushort, run_arithm, dst, src1, src2, ARITHM_MULTIPLY, scale);
+        BINARY_(uchar,  short,  short,  run_arithm, dst, src1, src2, ARITHM_MULTIPLY, scale);
+        BINARY_(uchar,  float,  float,  run_arithm, dst, src1, src2, ARITHM_MULTIPLY, scale);


Are SRC1 and SRC2 supposed to be the same type? I mean, if the call falls into the SFINAE-selected div/mul overloads, do we need to satisfy a static check on std::is_same<SRC1, SRC2>?

This check already exists. Please take a look here
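
The kind of compile-time guard being asked about can be sketched as follows. This is an illustration only: `mul_same_type_sketch` is a hypothetical function, it does no saturation, and the existing check referenced in the reply is not reproduced here.

```cpp
#include <type_traits>

// Hypothetical element-wise multiply that rejects mismatched source types
// at compile time instead of silently converting one of them.
template<typename DST, typename SRC1, typename SRC2>
int mul_same_type_sketch(const SRC1* in1, const SRC2* in2, DST* out,
                         int length, float scale)
{
    static_assert(std::is_same<SRC1, SRC2>::value,
                  "both sources must have the same element type");

    for (int x = 0; x < length; ++x)
        out[x] = static_cast<DST>(scale * in1[x] * in2[x]);  // no saturation in this sketch

    return length;  // number of processed elements
}
```

Calling it with, say, a `short` row and a `float` row fails to compile with the message above, which is the behavior the question is about.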

sivanov-work · 2021-11-09T11:54:21Z

modules/gapi/src/backends/fluid/gfluidcore_func.dispatch.cpp

+namespace gapi {
+namespace fluid {
+
+#define DIV_SIMD(SRC, DST)                                                  \


It looks like the argument order here differs from BINARY_:
https://github.com/opencv/opencv/pull/21024/files#diff-fe0ac6b5a07c1bec59c3756100eb186c9117a82dd1d951ef6997b8423a89943dR787

What do you think about aligning it, DST <--> SRC, to match the BINARY_ order, or vice versa?

The first two arguments of div_simd() have the SRC type and the third has the DST type, so the macro takes the SRC type first and the DST type second.

The BINARY_ macro works well, so within the scope of this task I can't change its API. Otherwise I would have to change its invocation in every kernel that uses it.

I see your point.
But what do you think about aligning the argument order of the newly added mul/div functions with the existing BINARY_ convention?

UPDATED: Or would that affect template type deduction?
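
On the deduction question: template arguments deduced from the function arguments do not depend on the order in which the template parameters are declared. Only explicitly specified template arguments do. A small sketch with two hypothetical functions (neither is from the PR) that differ only in the declaration order of SRC and DST:

```cpp
// Template parameters declared as (SRC, DST), matching the order of the
// function arguments, as in the DIV_SIMD(SRC, DST) macro.
template<typename SRC, typename DST>
int mul_src_first(const SRC* in1, const SRC* in2, DST* out, int length)
{
    for (int x = 0; x < length; ++x)
        out[x] = static_cast<DST>(in1[x] * in2[x]);
    return length;
}

// Template parameters declared as (DST, SRC), i.e. DST first as in the
// BINARY_ invocations. The function arguments are unchanged.
template<typename DST, typename SRC>
int mul_dst_first(const SRC* in1, const SRC* in2, DST* out, int length)
{
    for (int x = 0; x < length; ++x)
        out[x] = static_cast<DST>(in1[x] * in2[x]);
    return length;
}
```

Both can be called as `mul_src_first(a, b, out, n)` and `mul_dst_first(a, b, out, n)` with identical results. The order only matters for explicit calls such as `mul_src_first<short, int>(...)` versus `mul_dst_first<int, short>(...)`.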

sivanov-work · 2021-11-09T12:16:16Z

modules/gapi/src/backends/fluid/gfluidcore_func.simd.hpp

+CV_ALWAYS_INLINE
+typename std::enable_if<(std::is_same<SRC, short>::value && std::is_same<DST, ushort>::value) ||
+                        (std::is_same<SRC, ushort>::value && std::is_same<DST, ushort>::value) ||
+                        (std::is_same<SRC, ushort>::value && std::is_same<DST, short>::value), int>::type


IMHO: enable_if is good for a simple condition check; otherwise it gets confusing.
But as I noticed, we need to pass the type into v_load_f32 and other functions, so we need to know the actual type here...

IMHO again:
To keep the reader's eyes from scattering over this code, I think it is possible to introduce at least a macro:

#define SRC_DST_SHORT_USHORT (std::is_same<SRC, >...)
#define DST_SHORT_USHORT (std::is_same ...)
...
template<...>
CV_ALWAYS_INLINE typename std::enable_if<SRC_DST_SHORT_USHORT, int >::type div_hal(...)
...
template<...>
CV_ALWAYS_INLINE typename std::enable_if<SRC_DST_SHORT_USHORT, int >::type mul_hal(...)

Feel free to ignore it
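
A compilable sketch of this suggestion, assuming the macro simply names the condition from the enable_if in the hunk above. The macro name follows the comment. The `_sketch` function names and their scalar bodies are hypothetical placeholders, not the PR's SIMD code, and `ushort` is declared locally so the block does not depend on OpenCV headers.

```cpp
#include <type_traits>

namespace sketch {

using ushort = unsigned short;

// Names the SRC/DST combinations once. It relies on template parameters
// called SRC and DST being in scope at the point of use.
#define SRC_DST_SHORT_USHORT                                                  \
    ((std::is_same<SRC, short>::value  && std::is_same<DST, ushort>::value) || \
     (std::is_same<SRC, ushort>::value && std::is_same<DST, ushort>::value) || \
     (std::is_same<SRC, ushort>::value && std::is_same<DST, short>::value))

template<typename SRC, typename DST>
typename std::enable_if<SRC_DST_SHORT_USHORT, int>::type
mul_hal_sketch(const SRC* in1, const SRC* in2, DST* out, int length, float scale)
{
    int x = 0;
    for (; x < length; ++x)
        out[x] = static_cast<DST>(scale * in1[x] * in2[x]);  // no saturation here
    return x;
}

template<typename SRC, typename DST>
typename std::enable_if<SRC_DST_SHORT_USHORT, int>::type
div_hal_sketch(const SRC* in1, const SRC* in2, DST* out, int length, float scale)
{
    int x = 0;
    for (; x < length; ++x)
        out[x] = static_cast<DST>(in2[x] ? scale * in1[x] / in2[x] : 0.f);
    return x;
}

#undef SRC_DST_SHORT_USHORT

} // namespace sketch
```

A `constexpr` trait or an alias template would give the same readability without a macro. Which of the two reads better alongside the existing code is a style call.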

sivanov-work · 2021-11-09T12:37:00Z

modules/gapi/src/backends/fluid/gfluidcore.cpp

+        {
+#if CV_SIMD
+            x = div_simd(in1, in2, out, length, scale);
+#endif
            for (; x < length; ++x)


Is there no loop-unrolling macro in OpenCV?

Does it make sense here?
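
For context, the shape of this hunk is a vector body that returns the index of the first unprocessed element, followed by a scalar loop over the tail. A plain C++ sketch of that shape, with a hand-unrolled block of four standing in for the SIMD part: `mul_block4_sketch` and `mul_row_sketch` are hypothetical names, no OpenCV intrinsics are used, and whether unrolling the tail by hand would pay off is something to measure rather than assume.

```cpp
// Stand-in for the SIMD body: handles full blocks of 4 elements and
// returns the index of the first element it did not process.
inline int mul_block4_sketch(const float* in1, const float* in2, float* out,
                             int length, float scale)
{
    int x = 0;
    for (; x <= length - 4; x += 4)
    {
        out[x]     = scale * in1[x]     * in2[x];
        out[x + 1] = scale * in1[x + 1] * in2[x + 1];
        out[x + 2] = scale * in1[x + 2] * in2[x + 2];
        out[x + 3] = scale * in1[x + 3] * in2[x + 3];
    }
    return x;
}

// Row kernel: block body first, then a scalar loop for the remaining
// 0..3 elements, mirroring the "x = ..._simd(...); for (; x < length; ++x)" pattern.
inline void mul_row_sketch(const float* in1, const float* in2, float* out,
                           int length, float scale)
{
    int x = mul_block4_sketch(in1, in2, out, length, scale);
    for (; x < length; ++x)
        out[x] = scale * in1[x] * in2[x];
}
```

Since the tail is at most one block short of a full vector, it usually covers only a few elements per row.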

dmatveev

👍 thanks!

alalek · 2021-11-15T17:17:52Z

Need to rebase after "Div" PR merge (git rebase -i upstream/4.x and drop "div" commits).

anna-khakimova · 2021-11-17T10:21:12Z

Need to rebase after "Div" PR merge (git rebase -i upstream/4.x and drop "div" commits).

@alalek Ok. Done. Please take a look.

anna-khakimova requested review from sivanov-work, terfendail and alalek November 9, 2021 11:29

sivanov-work approved these changes Nov 9, 2021

dmatveev self-assigned this Nov 15, 2021

dmatveev added this to the 4.5.5 milestone Nov 15, 2021

dmatveev approved these changes Nov 15, 2021

anna-khakimova force-pushed the ak/simd_mul branch from ecc0595 to 5ad0b60 Compare November 17, 2021 08:24

Fluid: SIMD multiply kernel

c47673b

anna-khakimova force-pushed the ak/simd_mul branch from 5ad0b60 to c47673b Compare November 17, 2021 08:37

opencv-pushbot merged commit 4b6047e into opencv:4.x Nov 17, 2021

alalek mentioned this pull request Dec 30, 2021

(5.x) Merge 4.x #21371

Merged

alalek mentioned this pull request Feb 22, 2022

(5.x) Merge 4.x #21651

Merged
