
Add capacity to Videocapture to return the extraData from FFmpeg when required #20978

Merged
merged 16 commits into opencv:4.x
Nov 23, 2021

Conversation

Contributor

@cudawarped cudawarped commented Oct 29, 2021

Following the feedback to #20444, it is not desirable to add the facility for VideoCapture to write RTSP streams directly inside of it; however, it would be acceptable to write these from cudacodec::VideoReader() instead. This would leave VideoCapture() as it is, only able to do one thing or the other (either decode or output raw encoded data, not both at the same time), while cudacodec::VideoReader() would be able to do both at the same time.

In order to achieve this in a robust way it would be desirable for the extraData from FFmpeg to be output at the beginning of the raw output contained in the frame retrieved when in raw mode (VideoCapture::set(CAP_PROP_FORMAT, -1)). Therefore this pull request updates rawMode to append any extra data received during the initial negotiation of an RTSP stream or during the parsing of an MPEG4 file header (not related to file writing but required for cudacodec::VideoReader() to decode MPEG4).
For h264[5] RTSP streams this ensures that the parameter sets, if available, are always returned on the first call to grab()/read(). This has two purposes:

  1. To ensure the parameter sets are available even if they are not transmitted in band. This is common for Axis IP cameras.
  2. To allow callers of VideoCapture::grab()/read() to split the raw stream over multiple files by appending the parameter sets to the beginning of any new file.
    For (1) there is no alternative. For (2), if the parameter sets were provided in band it would be possible to parse the raw bitstream and search for them, however that would be a lot of work when that information is already provided by FFmpeg.
    For MPEG4 files this information is only supplied in the header and is required for decoding.

Two properties are also required to enable the raw encoded bitstream to be written to multiple files; these are:

  1. an indicator as to whether the last frame was a key frame or not - each new file needs to start at a key frame to avoid storing unusable frame diffs,
  2. the length in bytes of the parameter sets contained in the last frame - required to split the parameter sets from the frame without having to parse the stream. Any call to VideoCapture::get(CAP_PROP_LF_PARAM_SET_LEN) returning a number greater than zero indicates the presence of a parameter set at the beginning of the raw bitstream.
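To make the intended workflow concrete, below is a minimal sketch of how a caller might split a raw h264 stream over multiple files using the two proposed properties. It is illustrative only and assumes the property names as proposed in this description (CAP_PROP_LF_KEY_FRAME, CAP_PROP_LF_PARAM_SET_LEN), which may change during review:

#include <opencv2/videoio.hpp>
#include <fstream>
#include <string>

// Sketch only: CAP_PROP_LF_KEY_FRAME and CAP_PROP_LF_PARAM_SET_LEN are the names proposed
// above and are not part of the released API.
void writeRawSegments(const std::string& url, int framesPerFile)
{
    cv::VideoCapture cap(url, cv::CAP_FFMPEG);
    cap.set(cv::CAP_PROP_FORMAT, -1);           // raw mode: grab()/retrieve() return encoded packets
    cv::Mat packet, paramSets;
    std::ofstream out;
    int fileIdx = 0, framesInFile = 0;
    while (cap.grab())
    {
        cap.retrieve(packet);                   // 1xN (or Nx1) CV_8UC1 buffer of encoded data
        const int paramSetLen = (int)cap.get(cv::CAP_PROP_LF_PARAM_SET_LEN);
        if (paramSetLen > 0)                    // cache the parameter sets for later files
            paramSets = cv::Mat(1, paramSetLen, CV_8UC1, packet.data).clone();
        const bool keyFrame = cap.get(cv::CAP_PROP_LF_KEY_FRAME) != 0;
        if (!out.is_open() || (keyFrame && framesInFile >= framesPerFile))
        {
            out.close();
            out.open("segment_" + std::to_string(fileIdx++) + ".h264", std::ios::binary);
            framesInFile = 0;
            if (paramSetLen == 0 && !paramSets.empty())   // prepend SPS/PPS if not already in this packet
                out.write((const char*)paramSets.data, (std::streamsize)paramSets.total());
        }
        out.write((const char*)packet.data, (std::streamsize)packet.total());
        framesInFile++;
    }
}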

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom Win
build_contrib:Custom Win=OFF

build_image:Custom Win=ffmpeg
Xbuild_image:Custom Win=ffmpeg-plugin
buildworker:Custom Win=windows-2
test_modules:Custom Win=videoio
test_opencl:Custom Win=ON

James Bowley and others added 8 commits June 10, 2019 17:42
Old flag resulted in software implementation being selected when the Intel decoder is not the primary adapter.
…egotiation of an RTSP stream or during the parsing of an MPEG4 file header.

Member

@alalek alalek left a comment

only able to do one thing or the other, either decode or output raw encoded data

The backend should not write files, but it could emit RAW packets from the demuxer.
This can be designed similarly to audio streams, which return extra data through additional .retrieve() calls (for now, audio is available for the MSMF backend only).
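For reference, that audio pattern looks roughly like the sketch below; treat the property names (CAP_PROP_AUDIO_STREAM, CAP_PROP_AUDIO_BASE_INDEX) as illustrative, the point is only that extra data arrives through an additional .retrieve() index rather than inside the video packet:

#include <opencv2/videoio.hpp>
#include <string>

void readVideoAndAudio(const std::string& file)
{
    // Open with an audio stream requested (MSMF backend only at the moment).
    cv::VideoCapture cap(file, cv::CAP_MSMF, { cv::CAP_PROP_AUDIO_STREAM, 0 });
    const int audioBaseIdx = (int)cap.get(cv::CAP_PROP_AUDIO_BASE_INDEX);
    cv::Mat frame, audioSamples;
    while (cap.grab())
    {
        cap.retrieve(frame);                          // index 0 (default): the video frame
        if (audioBaseIdx > 0)
            cap.retrieve(audioSamples, audioBaseIdx); // extra data via a backend-provided index
    }
}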

@@ -200,6 +200,8 @@ enum VideoCaptureProperties {
CAP_PROP_AUDIO_TOTAL_CHANNELS = 64, //!< (read-only) Number of audio channels in the selected audio stream (mono, stereo, etc)
CAP_PROP_AUDIO_TOTAL_STREAMS = 65, //!< (read-only) Number of audio streams.
CAP_PROP_AUDIO_SYNCHRONIZE = 66, //!< (open, read) Enables audio synchronization.
CAP_PROP_LF_KEY_FRAME=67, //!< Indicates whether the last frame was a key frame or not.
Member

last frame

The name doesn't make sense because streams from IP cameras don't have an end or duration.

Contributor Author

I agree. As the flag refers to the packet data stored in the last raw frame, would something similar to the below make more sense?

CAP_PROP_LAST_RAW_FRAME_CONTAINS_KEY_FRAME
CAP_PROP_LRF_CONTAINS_KEY_FRAME
CAP_PROP_LRF_HAS_KEY_FRAME
CAP_PROP_LRF_HAS_KF

Regarding the other property, as the extra data is not necessarily going to be a parameter set, as is the case for mp4v, maybe it makes more sense to call it extra data.

CAP_PROP_LRF_PARAM_SET_LEN
CAP_PROP_LRF_EXTRA_DATA_LEN

I did try to avoid using extra properties but couldn't think of a way of sending this information inside the raw frame without breaking any code which currently relies on it.

Comment on lines 1620 to 1621
case CAP_PROP_LF_KEY_FRAME:
return packet.flags == 1;
Member

Wrong usage.

1

Don't use magic number.

== 1

Must be ((packet.flags & AV_PKT_FLAG_KEY) != 0) ? 1 : 0

Check libavcodec/packet.h for other values.


Anyway this value is not about the frame. This property is about the packet (due to B/P frames). Frames may be returned in a different order.

@@ -468,7 +468,7 @@ static void ffmpeg_check_read_raw(VideoCapture& cap)
cap >> data;
EXPECT_EQ(CV_8UC1, data.type()) << "CV_8UC1 != " << typeToString(data.type());
EXPECT_TRUE(data.rows == 1 || data.cols == 1) << data.size;
EXPECT_EQ((size_t)29729, data.total());
EXPECT_EQ((size_t)29774, data.total());
Member

Changed behavior.

This patch probably emits a broken stream.

Contributor Author

@cudawarped cudawarped Oct 30, 2021

From the Nvidia Video Codec SDK samples it appears that mpeg4 files contained in mp4 containers cannot be decoded without the additional information inside the mp4 header.
This means that highgui/video/big_buck_bunny.mp4 cannot currently be decoded inside cudacodec.
If you have access to the latest Nvidia Video Codec SDK samples (sorry, I can't find a recent version on GitHub) they have included this information for MPEG4 decoding inside Samples\Utils\FFmpegDemuxer.h, extract below

bMp4MPEG4 = eVideoCodec == AV_CODEC_ID_MPEG4 && (
        !strcmp(fmtc->iformat->long_name, "QuickTime / MOV")
        || !strcmp(fmtc->iformat->long_name, "FLV (Flash Video)")
        || !strcmp(fmtc->iformat->long_name, "Matroska / WebM")
    );

if (bMp4MPEG4 && (frameCount == 0)) {

    int extraDataSize = fmtc->streams[iVideoStream]->codecpar->extradata_size;

    if (extraDataSize > 0) {

        // extradata contains start codes 00 00 01. Subtract its size
        pDataWithHeader = (uint8_t *)av_malloc(extraDataSize + pkt.size - 3*sizeof(uint8_t));

        if (!pDataWithHeader) {
            LOG(ERROR) << "FFmpeg error: " << __FILE__ << " " << __LINE__;
            return false;
        }

        memcpy(pDataWithHeader, fmtc->streams[iVideoStream]->codecpar->extradata, extraDataSize);
        memcpy(pDataWithHeader+extraDataSize, pkt.data+3, pkt.size - 3*sizeof(uint8_t));

        *ppVideo = pDataWithHeader;
        *pnVideoBytes = extraDataSize + pkt.size - 3*sizeof(uint8_t);
    }

} else {
    *ppVideo = pkt.data;
    *pnVideoBytes = pkt.size;
}

although the decoder works without removing the start code in this case.

*width = p.size;
if (includeExtraData) {
includeExtraData = false;
extraDataLen = ic->streams[video_stream]->codec->extradata_size + paddExtraData;
Member

Please take a look at FFmpeg bitstream filters.
Especially at h264_mp4toannexb, which is already applied.
You are doing similar things, but it is unclear "why?"

Contributor Author

@cudawarped cudawarped Nov 1, 2021

Hi alalek,

My understanding was that annexb simply inserts start codes between NAL units to convert from a length-prefixed stream of data. I didn't think it had anything to do with the parameters required to set up the decoder, am I wrong here?

If not, I'm sorry, my explanation above must be confusing: I was trying to explain two things at once (file writing and extra data required for decoding) and probably failed on both counts. Forgetting the file writing for a second, there are two reasons for this pull request:

  1. Currently cudacodec::VideoReader() is unable to decode the mp4v codec (MPEG4 contained inside mp4 files, one of the formats OpenCV appears to write natively on Windows from VideoWriter, and the format of the test file highgui/video/big_buck_bunny.mp4) because the extra data contained in the header of the mp4 file, which is required for decoding, is not present in the raw bitstream. I think this might have been why files with this fourcc code are not currently included in cudacodec (to decode mp4v, case CV_FOURCC_MACRO('m', 'p', '4', 'v'): // fallthru or similar would need to be present here).
  2. Using h264 as an example, from what I have read and from experience with several IP cameras, the sequence and picture parameter sets (SPS and PPS) which are required for decoding h264 can be sent either:
    a) in band (RTP) just at the beginning of a stream, and/or
    b) in band (RTP) throughout the stream, and/or
    c) in the Session Description returned from an RTSP DESCRIBE request, e.g. sprop-parameter-sets=Z0LgC5ZUCg/I,aM4BrFSAa
    If the parameter sets are only transmitted in the Session Description (c) then cudacodec is unable to decode the received bitstream.

To fix (1) and (2) I thought that the extra data should be sent at the beginning of the first raw frame. From my testing this does not break any existing code relying on VideoCapture() when in "raw" mode. I also thought that it would be useful to signal the length of this data somewhere so that anyone using the API could trim and/or store it if required (for example, when writing a stream over multiple files this data would need to be included at the beginning of each file if it is not transmitted in band).

Member

@alalek alalek Nov 10, 2021

It is better to separate the changes to extra data for decoding into another PR.

BTW, FFmpeg does not have extra bsf for MPEG4 cuvid decoding:
https://github.com/FFmpeg/FFmpeg/blame/945b2dcc631a891fcc8c911891fd6730a40b8248/libavcodec/cuviddec.c#L1138-L1160

NVIDIA doesn't declare any support for MPEG4: https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new

Contributor Author

@cudawarped cudawarped Nov 10, 2021

Hi alalek,

Firstly, thank you for looking back into this for me. I am happy to remove the MPEG4 condition, but I am not sure if that is what you are suggesting; see below.

It is better to separate PRs with changing of extra data for decoding.

The extra data required for the RTSP streams is the same as that required for MPEG4. If I do separate PRs for RTSP and MPEG4 then that would just involve removing

const bool bMp4MPEG4 = eVideoCodec == AV_CODEC_ID_MPEG4 && ( !strcmp(ic->iformat->long_name, "QuickTime / MOV")
    || !strcmp(ic->iformat->long_name, "FLV (Flash Video)") || !strcmp(ic->iformat->long_name, "Matroska / WebM"));

from this PR and placing it into another, which I am happy to do, but I do not know if that is what you mean?

BTW, FFmpeg does not have extra bsf for MPEG4 cuvid decoding:
https://github.com/FFmpeg/FFmpeg/blame/945b2dcc631a891fcc8c911891fd6730a40b8248/libavcodec/cuviddec.c#L1138-L1160

From the Nvidia Video Codec SDK example for MPEG4, it does not need an additional bsf; it just needs the extra info contained in
fmtc->streams[iVideoStream]->codecpar->extradata

NVIDIA doesn't declare any support for MPEG4: https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new

I am not sure why Nvidia doesn't list it on its support matrix, but it is and has been supported since Fermi (2010), and it is included in their latest programming guide dated October 26, 2021.

Member

it just needs the extra info contained in fmtc->streams[iVideoStream]->codecpar->extradata

So we should not modify "packets".
At least FFmpeg doesn't do that either:

ffmpeg -dump_attachment "extra.data" -i testdata/highgui/video/big_buck_bunny.mp4 -vcodec copy -an -f rawvideo test.mp4v

So codec's extradata is a dedicated output for FFmpeg.


I believe we should interpret that as a dedicated output too and implement this through the call:

cap.retrieve(codec_extradata_buffer, <video_codec_extradata_idx_property_value>);

Contributor Author

So we should not modify "packets".

Ideally no; I added it to the head of the existing packet to avoid changing the API, and didn't think of using retrieve, which seems like a much better option.

So will it be OK to modify everything down from
virtual bool retrieveFrame(int, cv::OutputArray frame) CV_OVERRIDE
to include <video_codec_extradata_idx_property_value>?

Member

Just use the "int" parameter to retrieve non-frame data.
The "0" value corresponds to the video stream (default).

Contributor Author

@alalek updated to address your concerns.

Member

@alalek alalek left a comment

Thank you for the update!

@@ -1402,20 +1402,26 @@ bool CvCapture_FFMPEG::grabFrame()
return valid;
}

bool CvCapture_FFMPEG::retrieveFrame(int, unsigned char** data, int* step, int* width, int* height, int* cn)
bool CvCapture_FFMPEG::retrieveFrame(int, unsigned char** data, int* step, int* width, int* height, int* cn, const int flag)
Member

const int flag

There is already the first parameter for that value.

}));

Mat data;
cap.retrieve(data, 1);
Member

1

  1. It is better to avoid magic numbers.
  2. How may users/applications know whether this feature is supported? This can be done through a property; see my previous comment with:
    cap.retrieve(codec_extradata_buffer, <video_codec_extradata_idx_property_value>);
    The value in brackets should be fetched first through
    int codec_extradata_idx = (int)cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX);
    and validated
    if (codec_extradata_idx <= 0) bailout;
    The retrieve call should go after that (put together in the sketch below):
    cap.retrieve(codec_extradata_buffer, codec_extradata_idx);
  3. Need to decide whether we want to expose codec_extradata for RAW streams only or for other cases too.
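Assembling those steps, the suggested pattern would look something like this sketch (assuming the CAP_PROP_CODEC_EXTRADATA_INDEX name discussed in this thread):

#include <opencv2/videoio.hpp>
#include <string>

cv::Mat readCodecExtraData(const std::string& video_file)
{
    cv::VideoCapture cap(video_file, cv::CAP_FFMPEG);
    cap.set(cv::CAP_PROP_FORMAT, -1);                  // raw mode
    cv::Mat codec_extradata_buffer;
    const int codec_extradata_idx = (int)cap.get(cv::CAP_PROP_CODEC_EXTRADATA_INDEX);
    if (codec_extradata_idx <= 0)
        return codec_extradata_buffer;                 // bailout: not supported by this backend
    if (cap.grab())                                    // grab first so the stream is initialised
        cap.retrieve(codec_extradata_buffer, codec_extradata_idx);
    return codec_extradata_buffer;                     // 1xN CV_8UC1, possibly empty
}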

Contributor Author

How may users/applications know whether this feature is supported? This can be done through a property; see my previous comment with:
cap.retrieve(codec_extradata_buffer, <video_codec_extradata_idx_property_value>);
Value in brackets should be fetched first through
int codec_extradata_idx = (int)cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX);
and validated
if (codec_extradata_idx <= 0) bailout;
Retrieve call should go after that:
cap.retrieve(codec_extradata_buffer, codec_extradata_idx);

Should the extraData field be used to return other info as well in the future? If so, is it better to have a separate enum for the type of data

enum {
    CODEC_EXTRADATA = 0
}

or to include two properties in the existing enum as

CAP_PROP_CODEC_EXTRADATA_FLAG
CAP_PROP_CODEC_EXTRADATA_SUPPORTED

I'm not sure if it would then make sense to have something like

bool codec_extradata_supported = (bool)cap.get(CAP_PROP_CODEC_EXTRADATA_SUPPORTED);
if (!codec_extradata_supported ) bailout;
cap.retrieve(codec_extradata_buffer, CODEC_EXTRADATA (or CAP_PROP_CODEC_EXTRADATA_FLAG));

Alternatively we could stick with a single flag as

if (!cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX)) bailout;
cap.retrieve(codec_extradata_buffer, CAP_PROP_CODEC_EXTRADATA_INDEX);

Need to decide whether we want to expose codec_extradata for RAW streams only or for other cases too.

I can't think of a use case for this, but I also can't see any reason for not also exposing this capability when decoding.

Member

A single property should be enough.
The "support" check should look like if (cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX) <= 0) bailout;. The "0" parameter for .retrieve() is already reserved for the video data, so it is considered invalid for the codec's extradata.

Using CAP_PROP_CODEC_EXTRADATA_INDEX directly as the index is not enough for the .retrieve() call.
It would be hard to avoid conflicts with other backends or features (e.g., audio). Also, there are different types/meanings of "property" and ".retrieve() index" - we don't want to mix them, just as we don't mix seconds and meters. Please use the returned value from .get() instead (the result is backend-specific).


other info as well in the future

For other info we could add similar properties, e.g.

CAP_PROP_CAMERA_PINHOLE_INTRINSIC_MATRIX_INDEX = ...,  //!< .retrieve() index to fetch camera intrinsic matrix
CAP_PROP_CAMERA_PINHOLE_DISTORTION_PARAMETERS_INDEX = ...,  //!< .retrieve() index to fetch camera distortion parameters

(backends may or may not support them).


Maybe we could improve the property naming:

  • CAP_PROP_CODEC_EXTRADATA_INDEX
  • CAP_PROP_CODEC_EXTRADATA_RETRIEVE_INDEX
  • CAP_PROP_RETRIEVE_CODEC_EXTRADATA_INDEX
  • CAP_PROP_RETRIEVE_INDEX_CODEC_EXTRADATA
  • other suggestions

/cc @vpisarev @asmorkalov

Contributor Author

CAP_PROP_RETRIEVE_CODEC_EXTRADATA_FLAG?

@@ -30,7 +30,7 @@ OPENCV_FFMPEG_API int cvSetCaptureProperty_FFMPEG(struct CvCapture_FFMPEG* cap,
OPENCV_FFMPEG_API double cvGetCaptureProperty_FFMPEG(struct CvCapture_FFMPEG* cap, int prop);
OPENCV_FFMPEG_API int cvGrabFrame_FFMPEG(struct CvCapture_FFMPEG* cap);
OPENCV_FFMPEG_API int cvRetrieveFrame_FFMPEG(struct CvCapture_FFMPEG* capture, unsigned char** data,
int* step, int* width, int* height, int* cn);
int* step, int* width, int* height, int* cn, const int flag);
Member

We should not change the already existing API.
The legacy API should be used with flag=0 only.

flag != 0 should be handled differently.
Please take a look at the addition of the retrieveHWFrame implementation (which is missing the flag parameter too...).

Contributor Author

👍

Add property to determine if returning extra data is supported.
Always allow extra data to be returned on calls to cap.retrieve()
Update test case.
EXPECT_NO_THROW(cap.open(video_file, CAP_FFMPEG));
Mat data;
int codecExtradataIdx = (int)cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX);
EXPECT_TRUE(codecExtradataIdx > 0);
Member

@alalek alalek Nov 22, 2021

EXPECT_TRUE(codecExtradataIdx > 0);

FFmpeg wrapper for Windows is updated separately, so test code should be aligned.

#ifdef _WIN32  // handle old FFmpeg backend
if (codecExtradataIdx <= 0)
    throw SkipTestException("Codec extra data is not supported by backend or video stream");
#endif

Contributor Author

@cudawarped cudawarped Nov 22, 2021

👍

Member

@alalek alalek left a comment

Need to design and specify behavior for video streams without codec extra data.

@@ -584,6 +585,7 @@ void CvCapture_FFMPEG::init()
va_type = cv::VIDEO_ACCELERATION_NONE; // TODO OpenCV 5.0: change to _ANY?
hw_device = -1;
use_opencl = 0;
extraDataIdx = 1;
Member

Some codecs don't use/have extra data, so we should not force that.

Contributor Author

@cudawarped cudawarped Nov 22, 2021

Do you mean that
cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX) > 0
is there to indicate that extra data is supported and that it exists, not just that it is supported by the video backend?

Or that it should just not be a member variable? If so, where should the index be defined?

Contributor Author

Where should we define the index for the extra data?

Member

Currently we have cap.get(CAP_PROP_CODEC_EXTRADATA_INDEX) > 0 iff the backend supports this property:

  • it is still true if there is no support for the used codec
  • it is still true if this data is empty

I'm OK with that for now.

*height = 1;
*cn = 1;
return p.data != NULL;
return *data != NULL;
Member

return *data != NULL;

Why do we fail even if the codec extradata is empty? (e.g., not needed for the video stream in use)

Contributor Author

@cudawarped cudawarped Nov 22, 2021

e.g., not needed for the video stream in use

Sorry alalek, can you clarify? If we request extra data but it is not available, shouldn't the function return false?

Member

"return false;" means error here (C style coding rudiment - exception with proper message is more suitable here).
Empty data is an empty cv::Mat buffer.
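In code, the convention being described might look like this hypothetical helper (not the actual backend implementation):

#include <opencv2/core.hpp>

static bool copyExtraData(const unsigned char* extradata, int size, cv::OutputArray dst)
{
    if (extradata == nullptr || size <= 0)
    {
        dst.release();   // no codec extradata: return an empty buffer, but report success
        return true;
    }
    cv::Mat(1, size, CV_8UC1, const_cast<unsigned char*>(extradata)).copyTo(dst);
    return true;         // false would be reserved for real errors
}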

Enforce existing return status convention.
#ifdef _WIN32 // handle old FFmpeg backend
if (codecExtradataIdx <= 0)
throw SkipTestException("Codec extra data is not supported by backend or video stream");
#endif
cap.retrieve(data, codecExtradataIdx);
Member

Please validate the .retrieve() return value (it is a test).

ASSERT_TRUE(...)


Also, we lost the isOpened() check above:

ASSERT_TRUE(cap.isOpened()) << "Can't open the video: " << video_file;

Contributor Author

👍

Contributor Author

Sorry alalek, I missed the << video_file; info in the message, and I haven't kicked the build bot again just for that in case there are more changes.

Member

@alalek alalek left a comment

Looks good to me! Thank you 👍

@alalek alalek merged commit 97c6ec6 into opencv:4.x Nov 23, 2021
@alalek alalek mentioned this pull request Dec 30, 2021
@alalek alalek mentioned this pull request Feb 22, 2022
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023
…_enchancement

Add capacity to Videocapture to return the extraData from FFmpeg when required

* Update rawMode to append any extra data received during the initial negotiation of an RTSP stream or during the parsing of an MPEG4 file header.

* Adjust test data to account for extraData

* Address warning.

* Change added property names and remove parameter set start code check.

* Output extra data on calls to retrieve instead of appending to the first packet.

* Reverted old test case and added new one to evaluate new functionality.

* Add missing definition.

* Remove flag from legacy api.
Add property to determine if returning extra data is supported.
Always allow extra data to be returned on calls to cap.retrieve()
Update test case.

* Update condition which indicates CAP_PROP_CODEC_EXTRADATA_INDEX is not supported in test case.

* Include compatibility for windows dll if not updated.
Enforce existing return status convention.

* Fix return error and missing test constraints.