Zero copy using CUDA allocated memory on Jetson TX2

Nicolas Dufresne nicolas at
Fri Dec 13 17:57:35 UTC 2019


On Thursday, 12 December 2019 at 08:57 -0600, itaidagan wrote:
> Hello everyone,
> I'm currently working on a plugin derived from VideoFlip that takes a frame,
> copies the contents of the regular Mat to a GpuMat (using "upload"),
> performs a cv::cuda::remap function and then copies the remapped GpuMat back
> to the buffer frame (using "download").
> I'd like to accelerate this process by avoiding the unnecessary copies. Is
> there a way to tell gstreamer to allocate buffers using cudaMalloc in a way
> that will allow me to implement this pipeline without any unnecessary
> copies?

It is not guaranteed to be used, but each element can reply to the
ALLOCATION query coming from upstream (received on its sink pad) and
offer an allocator, a buffer pool, or both. For video elements, this is
nearly always implemented to announce at least support for GstVideoMeta
(allowing flexible strides to be used).

What happens next depends on the element that precedes yours. In
general, to achieve zero-copy, it is better if you control the code
of each of the elements that are going to share memory, as this only
works in pairs.

> Here's the relevant code:
>     // Upload Mat to GpuMat
>     jmundistort->frame_gpu_mat.upload(jmundistort->frame_mat);
>     // Remap
>     cuda::remap(
>             jmundistort->frame_gpu_mat,
>             jmundistort->undistorted_frame_gpu_mat,
>             jmundistort->dist_params.map1_gpu,
>             jmundistort->dist_params.map2_gpu,
>             INTER_CUBIC);
>     // Download
>     jmundistort->undistorted_frame_gpu_mat.download(jmundistort->frame_mat);