Skip to content

G-API: oneVPL - Performance: Add async decode pipeline & add cached pool #20901

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 10, 2021

Conversation

sivanov-work
Copy link
Contributor

@sivanov-work sivanov-work commented Oct 19, 2021

This PR is a part of large PR #20469

Improve performance of VPL pipeline processing:

  • make it by fully async: queue multiple VPL decode requests to utilize full VPL decoder usage
  • check decoding readiness without blocking
  • Turn on cached surface pool: avoid of O(n) in searching free surface from pool

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Build configuration

force_builders=XCustom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
build_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

Xbuild_image:Custom=centos:7
Xbuildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

build_image:Custom=ubuntu-openvino-2021.3.0:20.04
Xbuild_image:Custom Win=openvino-2021.2.0
build_image:Custom Mac=openvino-2021.2.0

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
test_opencl:Custom=OFF
test_bigdata:Custom=1
test_filter:Custom=*


build_image:Custom Win=gapi-onevpl-2021.6.0
buildworker:Custom Win=windows-3
build_contrib:Custom Win=OFF

GAPI_LOG_WARNING(nullptr, "[" << my_sess.session <<
"] has no surface, reason: " <<
ex.what());
// TODO it is supposed to place `break;` here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it Ok to get into this catch? I mean even now it can be just exited correctly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same comment can be applied to all the catch blocks below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's OK here: swap_surface can throw exception in case of no available "free" surface is present in pool. This lack-of-resource state processing (like std::bad_alloc for new allocation ) allows us to go on async-waiting result state and put off decoder queuing job until one or more async result completed and resource freeing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue of bad_alloc is that it can be handled explicitly - here we handle all the exceptions uniformly and this bad-alloc / general error difference is not clear to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the reason why I adhered my point about using abort() in GAPI_Assert instead of throwing exception:
because abort() means the library itself problem which is not supposed to be handed by user but introducing BUG to fix. Otherwise exception has recoverable semantic

In current catch block we could face with two different kind of exception:

swap_surface ->get_free_surface() -> find KEY in pool (exception means library problem and supposed to be abort() with BUG/Issue creation)

and the second

swap_surface ->get_free_surface()-> pool.find_free() -> no free surface at the time (resource unavailable at moment)

The second is expected to be processed.
No other exception types is throwing.

So, according to idea that GAPI_Assert has no abort inside - we need to catch only second case. So it make sense to find out what types of exception is throwing by GAPI_Assert and sort out different types in pool.find_free

using namespace cv::gapi::wip::onevpl;

const auto params = GetParam();
source_t src = findDataFile(get<0>(params));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we put false here as second argument to not complain on file not found error? Or I am wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe yes, i will add it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like false is confusing here. it added but has reverted

From my perspective:

  1. if user didn't configured EXTRA path in here initTestDataPathSilent() then SkipException is OK here
  2. IF user did configured EXTRA path here, then IT MUST contain requested test files. So findDataFile must request it availability

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please align it with the latest news from @mpashchenkov

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asked a detailed question in @mpashchenkov PR by #20995 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#20995 (comment)
So:
test must fail if no env OPENCV_TEST_DATA_PATH is set (in case of test requires external test data)
test must fail if OPENCV_TEST_DATA_PATH set and filepath is incorrect.

Looks like i need to remove my initTestDataPathSilent check


const auto params = GetParam();
source_t src = findDataFile(get<0>(params));
codec_t type = get<1>(params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you find using std::tie(src, type) = params; better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as for me this transforms into the same 3 lines

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind using std::tie instead of get<N> is type safety

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure that it is important in UTs, because an one tuple type is for a single one test and when you changed the tuple set - it had caused by the reason that you intended to change test by itself.
If a some common tuple set (hypothetically) was changed by adding new type for a some specific single new one test then it is more convenient for me to do not brake ALL tests which depended from common tuple set by adding std::ignore for all of them (in case if i have 100500 legacy unit tests).

Additionally, AND which is more important for me: please compare

std::string path;
...
std::tie(path,...) = GetParams();
path = findDataFile(path);
...

vs

auto path = findDataFile(std::get<0>(param));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because an one tuple type is for a single one test and when you changed the tuple set - it had caused by the reason that you intended to change test by itself.

Not always, unfortunately - sometimes these parameters are introduced only for particular tests in the suite and the rest is not updated.

std::vector<CfgParam> cfg_params {
CfgParam::create<std::string>("mfxImplDescription.Impl", "MFX_IMPL_TYPE_HARDWARE"),
CfgParam::create("mfxImplDescription.mfxDecoderDescription.decoder.CodecID", type),
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why there are difference between instantiations? std::string is not deducible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first function deduces template argument as const char* which uses as type for construction cv::variant later.
cv::variant is pretty different from std::variant and cannot support construct_as semantic for std::string (At the moment when current PR introduced). Thus explicit template instantiation requested here due to limited functionality of cv::variant

std::count_if(surfaces.begin(), surfaces.end(),
[](const surface_ptr_t& val) {
GAPI_DbgAssert(val && "Pool contains empty surface");
return !val->get_locks_count();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is still more tricky to read !int()
Is this some common way to represent logic statement in some famous libraries (likes STD)?
Just a minor question, not to address!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder when you asked - as for me it's well known expression, especially in C in the same way as if (num %2) and so on

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Asya's point, get_lock_count() == 0 is more straight to understand

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok will change by majority request

// prepare sync object for new surface
LegacyDecodeSession::op_handle_t sync_pair{};

// enqueue qecode operation with current session surface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for such comment!
qecode -> decode

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

ready_frames.emplace(cv::MediaFrame(std::move(frame_adapter)), sess.generate_frame_meta());

// pop away ready sync onject
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks


if (my_sess.last_status == MFX_ERR_NONE) {
my_sess.sync_queue.emplace(sync_pair);
} else if (MFX_ERR_MORE_DATA != my_sess.last_status) /* suppress MFX_ERR_MORE_DATA warning */ {
Copy link
Contributor

@AsyaPronina AsyaPronina Oct 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

different order of operands in comparison against if above, however, such order is mostly required for == comparison...

is it intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was written in different times of day :)

using namespace cv::gapi::wip::onevpl;

const auto params = GetParam();
source_t src = findDataFile(get<0>(params));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we usually use std::tie instead of std::get<>

};

cv::gapi::wip::IStreamSource::Ptr source_ptr;
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, wondering why it isn't under ifdef ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it makes sense.
What is the current order for UT: to skip or to disable ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added #ifdef HAVE_ONEVPL in the beginning

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disable (put under ifdef)

// create empty pool
pool_t pool;
pool.reserve(pool_size);
// create empty pool with reservation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd start with capital and usually we use these tags before comment:
NB, FIXME, TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, what is NB ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nota bene

@dmatveev dmatveev self-assigned this Oct 26, 2021
@dmatveev dmatveev added this to the 4.5.5 milestone Oct 26, 2021

const auto params = GetParam();
source_t src = findDataFile(get<0>(params));
codec_t type = get<1>(params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind using std::tie instead of get<N> is type safety

Comment on lines +59 to +57
TEST_CYCLE()
{
source_ptr->pull(out);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be honest I'm not sure how well it maps to the idea of a performance test this way.

TEST_CYCLE() measures the same thing again and again to calculate its statistic values. Here you actually decode different frames, but measure it as the same random value - not sure if it is correct.

Maybe running the whole file under the same TEST_CYCLE would make sense w.r.t stats, but then this test can take foerever...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I's true, because different frames take up different decode duration but...

It's supposed to be compared in performance metrics against VideoCap on the same input streams (and frames) AND find out hotspots by using Vtune metrics in overall pipeline. I means auxiliary code around on direct MFX decode api

Wrapping TEST_CYCLE for all case and to incorporate a source creation also will measure MFX library/DX11 CoCreate load/unload and so on and doesn't give me relative result performance in comparison with VideoCapdecode api

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapping TEST_CYCLE for all case and to incorporate a source creation also will measure MFX library/DX11 CoCreate load/unload and so on and doesn't give me relative result performance in comparison with VideoCapdecode api

This makes sense, but can it be done only once before the TEST_CYCLE() body kicks in? Is there a way to decompose it to prologue/epilogue so that initialization/deinitialization for every individual run is left out of profile?


using namespace cv::gapi::wip;

source_t src = findDataFile(GetParam(), false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is the method which throws SkipTestException so we don't need to test it on our own

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It receives initial PATH and return modified PATH according to the OPENCV_TEST_DAT_PATH prefix but for only single case if file is found by path in result.
So, if path doesn't find then no modified PATH returned and no exception was throws in case of second false params.
So i was confused by changing second params to false
Let's remove false here - false means optional, so when requested file doesn't found it returns non-modified path and throw Skip otherwise.

My suggestions:
if initTestDataPathSilent is success, that means OPENCV EXTRA is present that MUST provide test video file by requested path otherwise configuration is incorrect. so false must be removed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please align with @mpashchenkov

// create empty pool
pool_t pool;
pool.reserve(pool_size);
// create empty pool with reservation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nota bene

GAPI_LOG_WARNING(nullptr, "[" << my_sess.session <<
"] has no surface, reason: " <<
ex.what());
// TODO it is supposed to place `break;` here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue of bad_alloc is that it can be handled explicitly - here we handle all the exceptions uniformly and this bad-alloc / general error difference is not clear to me

Comment on lines 296 to 298
sess.swap_surface(*this);
return ExecutionStatus::Continue;
} catch (const std::exception& ex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here too

Comment on lines 361 to 364
try {
sess.swap_surface(*this);
return ExecutionStatus::Continue;
} catch (const std::exception& ex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here too

Copy link
Contributor

@TolyaTalamanov TolyaTalamanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :+1

};

cv::gapi::wip::IStreamSource::Ptr source_ptr;
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disable (put under ifdef)

@sivanov-work sivanov-work force-pushed the merge_source_unite_perf_mod branch from 1220067 to 1d8df9f Compare November 2, 2021 12:48
Copy link
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok to merge

@sivanov-work
Copy link
Contributor Author

@alalek Could you merge it please? alignment with VPL builder would be done later when all parts of vpl source merged

@alalek alalek merged commit da63442 into opencv:4.x Nov 10, 2021
@alalek alalek mentioned this pull request Dec 30, 2021
@alalek alalek mentioned this pull request Feb 22, 2022
@sivanov-work sivanov-work deleted the merge_source_unite_perf_mod branch March 5, 2022 05:48
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
…perf_mod

 G-API: oneVPL - Performance: Add async decode pipeline & add cached pool

* Add async decode pipeline & intro cached pool

* Fix performacne test with checking OPENCV_EXTRA

* Add sip perf test with no VPL

* Fix misprint

* Remove empty line..

* Apply some comments

* Apply some comments

* Make perf test fail if no OPENCV_TEST_DATA_PATH declared
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants