Skip to content

GAPI Fluid: Enable dynamic dispatching for AbsDiffC kernel. #21204

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 13, 2021

Conversation

anna-khakimova
Copy link
Member

@anna-khakimova anna-khakimova commented Dec 6, 2021

Moved SIMD for GAPI Fluid AbsDiffC kernel to enable dynamic dispatching.

Performance report:

FluidAbsDiffCSIMDEnableDynamicDispatch.xlsx

force_builders=Linux AVX2,Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
Xbuild_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

build_image:Custom=centos:7
buildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

Xbuild_image:Custom=ubuntu-openvino-2021.3.0:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.2.0

buildworker:Custom Win=windows-3

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
Xtest_opencl:Custom=OFF
Xtest_bigdata:Custom=1
Xtest_filter:Custom=*

CPU_BASELINE:Custom Win=AVX512_SKX
CPU_BASELINE:Custom=SSE4_2


int w = 0;
#if CV_SIMD
w = absdiffc_simd(in, scalar, out, width, chan);
w = absdiffc_simd(in, scalar, out, length, chan);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was a bug before?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it wasn't a bug. I just took the length calculation to a higher level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, it was width then became width * chan here - so data amount as increased in chan times. Looks like we didn't use by whole data range before (was bug) or processing more data (possible out-of-bound ) for now

Copy link
Member Author

@anna-khakimova anna-khakimova Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it wasn't so. Earlier we calculate length inside absdiffc_simd_c3() and absdiffc_simd_c1c2c4() functions and then used length for a loop. You can see it here for example. Now length is calculated 2 level up of call stack (i.e. here).

@@ -1240,13 +1002,14 @@ static void run_absdiffc(Buffer &dst, const View &src, const float scalar[])

int width = dst.length();
int chan = dst.meta().chan;
const int length = width * chan;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for interest: why all types is signed? is it by historical reasons?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is for history :-)

@@ -129,6 +129,17 @@ MULC_SIMD(float, float)

#undef MULC_SIMD

#define ABSDIFFC_SIMD(T) \
int absdiffc_simd(const T in[], const float scalar[], T out[], \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no difference. This structure was organized for dynamic dispatching support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you describe relationship by short schema between them please?
Because i see two functions with the similar names, with the similar agruments and in the same namespace .
May be one of fucntions should have *_impl suffic, may be i wrong

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am i right in my assumption (it's definitely clear for me by now) that here we have function declaration
but in file before we have function definition?

If yes, then i suggest to rename MACRO as ABSDIFFC_SIMD_DECLARATION & ABSDIFFC_SIMD_DEFINITION or something like that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gfluidcore_func.hpp is header file for forward declaration of wrappers in the gfluidcore_func.dispatch.cpp. It's necessary for C++ compiler.

Copy link
Member Author

@anna-khakimova anna-khakimova Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suggest to rename MACRO as ABSDIFFC_SIMD_DECLARATION & ABSDIFFC_SIMD_DEFINITION or something like that

Ok. I would prefer to do this in the next PR together with MACRO for SIMD of the other kernels. All together.

//-------------------------

#define ABSDIFFC_SIMD(SRC) \
int absdiffc_simd(const SRC in[], const float scalar[], SRC out[], \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always_inline ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of dynamic dispatching support, any key words before return type break compilation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any key words before return type break compilation.

could you please provide observed error message?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is implementation of external interface declared on the line 154 and used in other compilation modules, e.g. .dispatch.cpp. So it can't be inlined.

@dmatveev dmatveev self-assigned this Dec 8, 2021
Copy link
Contributor

@sivanov-work sivanov-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions

@@ -50,7 +50,7 @@ namespace opencv_test
class MinPerfTest : public TestPerfParams<tuple<cv::Size, MatType, cv::GCompileArgs>> {};
class MaxPerfTest : public TestPerfParams<tuple<cv::Size, MatType, cv::GCompileArgs>> {};
class AbsDiffPerfTest : public TestPerfParams<tuple<cv::Size, MatType, cv::GCompileArgs>> {};
class AbsDiffCPerfTest : public TestPerfParams<tuple<cv::Size, MatType, cv::GCompileArgs>> {};
class AbsDiffCPerfTest : public TestPerfParams<tuple<compare_f, cv::Size, MatType, cv::GCompileArgs>> {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be exact? why do we need a comparison function here?

Copy link
Member Author

@anna-khakimova anna-khakimova Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the meaning of the question. Why does Alexey Trutnev add comparison functions to all GAPI performance tests in the PR? Probably to understand whether the kernels works correctly in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the question was why this check is not exact any more but we discussed this double/float change on call. Thanks

Comment on lines +973 to +975
compare_f cmpF = get<0>(GetParam());
cv::Size sz_in = get<1>(GetParam());
MatType type = get<2>(GetParam());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tie()

Copy link
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@opencv-pushbot opencv-pushbot merged commit f1053d4 into opencv:4.x Dec 13, 2021
@alalek alalek mentioned this pull request Dec 30, 2021
@alalek alalek mentioned this pull request Feb 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants