
Fast way to write pyvips image to FFmpeg stdin (and others; suggestions; big read) #198

Closed
@Tremeschin

Description


Hey @jcupitt, it looks like you're very active helping people here, so I thought it wouldn't hurt to ask something. This will be a bit long since I'll give you the full context, and I'd appreciate any suggestions.

I'm working on a project making a music visualizer, and unsurprisingly I'm dealing with lots and lots of image processing: Gaussian blur, vignetting, resizing, alpha compositing, etc.

While numpy + PIL work, they aren't that fast compared to a proper image library or a GPU-accelerated canvas (the latter wouldn't quite work because I have to move a lot of images back and forth, and texture transfers are somewhat expensive, so I'd have to process them on the GPU itself; I'm not that deep into it yet).

For an 8.5-second audio clip, the non-multiprocessed code takes about 3 minutes to make a 720p60 video. I thought: hmm, I don't have only one thread on my CPU, so multiprocessing would work better, right? No! The IPC and mutex shenanigans I wasn't aware of didn't scale up much, though they did cut the processing time in half, down to 1m30s.

I tried something like NodeJS, and considered calling C++ from Python to alpha composite and process the images very fast; the first didn't quite work, and I haven't tried the second yet.

So I stumbled across pyvips, and with little effort I could alpha composite 100 particle images (at random coordinates) onto a 4K image at 84 fps!! It didn't even use much RAM, and only half the CPU.

However, when piping the images to FFmpeg, we have to convert them into a format that can be written to stdin and that FFmpeg can read, given its arguments.

Here comes my question after this short context: when I use something like

  • image.write_to_buffer('.JPEG', Q=100)

  • image = np.ndarray(buffer=image.write_to_memory(), dtype=format_to_dtype[image.format], shape=[image.height, image.width, image.bands])

Piping the images takes about 0.0764 seconds in total per cycle (render + convert to bytes + write to stdin), but those two example lines alone take about 0.0617 seconds to run (numbers averaged over 510 piped frames). That's nearly all the time spent in the loop.
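For context, the `format_to_dtype` mapping in the second line above is the libvips-band-format-to-numpy-dtype table from the pyvips documentation's numpy interoperability example; a self-contained version looks like this:

```python
import numpy as np

# Mapping from libvips band formats to numpy dtypes, as given in the
# pyvips documentation's numpy example. For typical 8-bit RGBA frames
# the format is 'uchar', so the dtype is np.uint8.
format_to_dtype = {
    'uchar': np.uint8,
    'char': np.int8,
    'ushort': np.uint16,
    'short': np.int16,
    'uint': np.uint32,
    'int': np.int32,
    'float': np.float32,
    'double': np.float64,
    'complex': np.complex64,
    'dpcomplex': np.complex128,
}
print(format_to_dtype['uchar'])
```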

I'm not sure how to ask this, but am I doing something wrong, or is there a better way of getting the full-fat raw image out of a pyvips Image object and sending it to FFmpeg's stdin?

Again, this is my main bottleneck, so any advice on quickly converting the images to a video is welcome.
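One thing worth noting: `write_to_memory()` already returns a Python bytes object of uncompressed pixel data, so it can in principle be written to FFmpeg's stdin directly, skipping both the JPEG encode and the numpy wrapper (which adds an extra copy per frame). A minimal sketch of the idea, where a zero-filled buffer stands in for the pyvips frame and `wc -c` stands in for the ffmpeg process:

```python
import subprocess

# One 720p RGBA frame is width * height * bands bytes of raw pixel data.
width, height, bands = 1280, 720, 4
raw = bytes(width * height * bands)  # stand-in for image.write_to_memory()

# Write the raw bytes to a child process's stdin, as one would with FFmpeg.
# 'wc -c' just counts the bytes it receives; it is a placeholder for the
# real ffmpeg command line.
proc = subprocess.run(['wc', '-c'], input=raw, capture_output=True)
print(int(proc.stdout))  # bytes received by the child: 1280*720*4 = 3686400
```

With real FFmpeg, the receiving side would need `-f rawvideo -pix_fmt rgba -s 1280x720` so it knows how to slice the headerless byte stream into frames.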

I use somewhat canonical piping arguments (with the ffmpeg-python package):

self.pipe_subprocess = (
    ffmpeg
    .input('pipe:', format='image2pipe', pix_fmt='rgba', r=self.context.fps, s='{}x{}'.format(self.context.width, self.context.height))
    .output(output, pix_fmt='yuv420p', vcodec='libx264', r=self.context.fps, crf=18, loglevel="quiet")
    .global_args('-i', self.context.input_file)
    .overwrite_output()
    .run_async(pipe_stdin=True)
)

I only swap image2pipe for rawvideo when piping the numpy array's raw data.

I've seen and read a few places before asking this, most notably:

And I've tried digging through the docs on the pyvips.Image class.

I'm looking forward to using this library for images from now on; it works really well and is VERY simple to use.

I almost cried when I saw the Image.composite method, because I had manually implemented something equivalent by hand here (spoiler: it took a while to crop and composite only the needed parts).

And looks like pyvips can handle big images like they are nothing!!

Thanks for the project; it makes libvips easy to use from Python.
