Optimize VideoCapture with cvtColor instead of sws_scale #27652

dkurt · 2025-08-10T17:38:48Z

Pull Request Readiness Checklist

Test code that reads every frame with FFmpeg from a 1600x900, roughly 1-hour mp4 video:

using CPU:
total time: 45.6 seconds -> 30.6 seconds

using GPU HW acceleration:
total time: 232 seconds -> 69 seconds

HW utilization with sws_scale:

Device 0 [NVIDIA GeForce RTX 4090] PCIe GEN 4@16x RX: 54.54 MiB/s TX: 869.8 MiB/s
GPU 2805MHz MEM 10501MH TEMP  34°C FAN  32% POW 109 / 480 W
GPU[||                        7%] MEM[|          0.905Gi/23.988Gi] DEC[|||   29%]

HW utilization with cvtColor:

Device 0 [NVIDIA GeForce RTX 4090] PCIe GEN 4@16x RX: 46.44 MiB/s TX: 2.885 GiB/s
GPU 2805MHz MEM 10501MH TEMP  34°C FAN  32% POW 107 / 480 W
GPU[||||||                   21%] MEM[|          0.905Gi/23.988Gi] DEC[|||||100%]

import os
import time

import cv2 as cv

# Enable NVIDIA hardware decoding in the FFmpeg backend.
# Remove this line to measure the CPU-only path.
os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "hwaccel;cuvid|video_codec;h264_cuvid|vsync;0"

start = time.time()
cap = cv.VideoCapture("test.mp4", cv.CAP_FFMPEG)
assert cap.isOpened()
# Read every frame until the end of the stream.
while True:
    has_frame, frame = cap.read()
    if not has_frame:
        break
print(time.time() - start)
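
The option string used in the script packs key/value pairs as `key;value` joined by `|`. A minimal sketch for switching between the CPU and GPU runs follows. The helper names `build_capture_options`, `parse_capture_options` and `configure` are hypothetical and not part of OpenCV, and the sketch assumes the variable is set before `cv.VideoCapture` is opened.

```python
import os

ENV_NAME = "OPENCV_FFMPEG_CAPTURE_OPTIONS"

# Options copied from the benchmark script in this PR.
GPU_OPTIONS = {"hwaccel": "cuvid", "video_codec": "h264_cuvid", "vsync": "0"}


def build_capture_options(options):
    """Join pairs as 'key;value' separated by '|' (hypothetical helper)."""
    return "|".join(f"{key};{value}" for key, value in options.items())


def parse_capture_options(text):
    """Inverse of build_capture_options, handy for logging or checks."""
    return dict(pair.split(";", 1) for pair in text.split("|") if pair)


def configure(use_gpu):
    """Set or clear the variable; assumed to be called before opening the capture."""
    if use_gpu:
        os.environ[ENV_NAME] = build_capture_options(GPU_OPTIONS)
    else:
        os.environ.pop(ENV_NAME, None)


if __name__ == "__main__":
    configure(use_gpu=True)
    print(os.environ[ENV_NAME])
```

Calling `configure(False)` before opening the capture corresponds to the "using CPU" measurement, and `configure(True)` to the "using GPU HW acceleration" one.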

$ ffprobe -i test.mp4

ffprobe version n7.0.3 Copyright (c) 2007-2025 the FFmpeg developers
  built with gcc 13 (Ubuntu 13.2.0-23ubuntu4)
  configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --enable-nvdec --enable-swresample --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --disable-static --enable-shared
  libavutil      59.  8.100 / 59.  8.100
  libavcodec     61.  3.100 / 61.  3.100
  libavformat    61.  1.100 / 61.  1.100
  libavdevice    61.  1.100 / 61.  1.100
  libavfilter    10.  1.100 / 10.  1.100
  libswscale      8.  1.100 /  8.  1.100
  libswresample   5.  1.100 /  5.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf60.3.100
  Duration: 00:55:27.48, start: 0.000000, bitrate: 199 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 1600x900, 126 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
      Metadata:
        handler_name    : VideoHandler
        vendor_id       : [0][0][0][0]
        encoder         : Lavc60.3.100 libx264
  Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 66 kb/s (default)
      Metadata:
        handler_name    : SoundHandler
        vendor_id       : [0][0][0][0]
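
For reference, the timings above can be turned into speedups and approximate decode throughput. This is only a back-of-the-envelope sketch: the frame count is an assumption (ffprobe duration multiplied by the nominal 25 fps), not a measured value, and `summarize` is a hypothetical helper.

```python
# Figures copied from the PR description and the ffprobe output.
DURATION_S = 55 * 60 + 27.48  # Duration: 00:55:27.48
FPS = 25                      # nominal frame rate reported by ffprobe

# (seconds with sws_scale, seconds with cvtColor)
TIMINGS_S = {
    "CPU": (45.6, 30.6),
    "GPU (cuvid)": (232.0, 69.0),
}


def summarize(timings=TIMINGS_S, duration_s=DURATION_S, fps=FPS):
    """Return speedup and approximate frames/s for each configuration."""
    approx_frames = duration_s * fps  # assumption: constant frame rate
    rows = {}
    for name, (before, after) in timings.items():
        rows[name] = {
            "speedup": before / after,
            "fps_before": approx_frames / before,
            "fps_after": approx_frames / after,
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize().items():
        print(
            f"{name}: {row['speedup']:.2f}x, "
            f"{row['fps_before']:.0f} -> {row['fps_after']:.0f} frames/s"
        )
```

With these inputs the CPU path comes out at roughly 1.5x and the GPU path at roughly 3.4x.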

resolves #21969

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

asmorkalov · 2025-08-11T07:51:50Z

What does "using CPU: total_time: 45 seconds" mean?

dkurt · 2025-08-11T07:57:43Z

> What does "using CPU: total_time: 45 seconds" mean?

It is the same Python script, but without os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "hwaccel;cuvid|video_codec;h264_cuvid|vsync;0"

I've figured out that sws_scale for AV_PIX_FMT_YUV420P is also slower than OpenCV's cvtColor (with cvtColor the script runs in about 30 seconds, which is about 1.5x faster). I will update this PR to demonstrate it later.

Use Opencv NV12 to RGB instead of sws_scale

c969c05

dkurt mentioned this pull request Aug 10, 2025

After setting the decoder to hevc_cuvid, VideoCapture reads video more slowly #21969

Open

4 tasks

asmorkalov added this to the 4.13.0 milestone Aug 11, 2025

asmorkalov self-requested a review August 11, 2025 07:52

asmorkalov self-assigned this Aug 11, 2025

asmorkalov added optimization category: videoio labels Aug 11, 2025

dkurt added 3 commits August 11, 2025 14:10

Accept u, v, separately in cvtThreePlaneYUVtoBGR

ebbdc80

OPENCV_FFMPEG_USE_SWSCALE environment option

ab33b28

fix tests

8e3511c

dkurt changed the title ~~Use Opencv NV12 to RGB instead of sws_scale~~ Optimize VideoCapture with OpenCV cvtColor instead of sws_scale Aug 11, 2025

dkurt changed the title ~~Optimize VideoCapture with OpenCV cvtColor instead of sws_scale~~ Optimize VideoCapture with cvtColor instead of sws_scale Aug 11, 2025
