Slow Memory Access in AppSink

Michael Olbrich m.olbrich at pengutronix.de
Wed Sep 2 05:57:28 UTC 2020


Hi,

On Tue, Sep 01, 2020 at 02:44:08AM -0500, Kazanian wrote:
> I'm working with an NXP i.MX8M and I have built a GStreamer pipeline where
> I decode an H.264 stream and put the data in an appsink. In the appsink
> callback function, I do some resorting of all pixels to get them into a
> different format. The pipeline looks like this:
> H264-file -> h264parse -> vpudec -> appsink -> resorting algorithm
> 
> It works so far, but to get memory access to the frame buffer in the
> callback-function, I use the function gst_buffer_map with the flag
> GST_MAP_WRITE and it takes too much time. For a 1920x1080 video frame, it
> takes 34 ms and then the resorting algorithm takes 5 ms.  If I use
> GST_MAP_READ instead, the map function is fast (0.02 ms), but then the
> resorting algorithm takes much longer. Probably because the data has to be
> fetched from some other memory.
> What exactly is the reason for this? What can I do to make the mapping
> faster?

The vpudec element is the one provided by NXP, right? I don't know exactly
what that element does, but what happens is probably something like this:

The buffers provided by vpudec are mapped uncached, so any CPU access to
them will be _really_ slow. There is nothing you can do about that.
And since you're decoding H.264, the decoder still needs the buffer as a
reference frame to decode the next one, so you're not allowed to write to
it. If you map it with GST_MAP_WRITE, the buffer is therefore copied in the
background. That copy out of uncached memory would be the 34 ms you
measure. With GST_MAP_READ there is no copy, but your algorithm then reads
the uncached memory directly.
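
To illustrate what that amounts to, here is a minimal sketch in plain C
that does the copy explicitly. It is not GStreamer code and I don't claim
it is any faster: `mapped` merely stands in for the data pointer you get
from a GST_MAP_READ mapping, and resort_pairs() is a hypothetical
placeholder for your resorting algorithm, since I don't know your pixel
format. The only point is that the slow source is read once, front to
back, and all further work happens in ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical placeholder for the real resorting step: de-interleave
 * byte pairs so that all even-indexed bytes come first, followed by all
 * odd-indexed bytes. `size` must be even. */
static void resort_pairs(const uint8_t *src, uint8_t *dst, size_t size)
{
    assert(size % 2 == 0);
    size_t half = size / 2;
    for (size_t i = 0; i < half; i++) {
        dst[i] = src[2 * i];
        dst[half + i] = src[2 * i + 1];
    }
}

/* `mapped` stands in for the read-only frame data (assumed to be slow,
 * uncached memory). It is read exactly once, linearly, into `scratch`,
 * which the caller allocates in normal memory. The resort then reads
 * only from `scratch` and writes the result to `out`. All three buffers
 * are `size` bytes long. */
static void copy_then_resort(const uint8_t *mapped, size_t size,
                             uint8_t *scratch, uint8_t *out)
{
    memcpy(scratch, mapped, size);
    resort_pairs(scratch, out, size);
}
```

In a real appsink callback, `scratch` and `out` would be allocated once
up front and reused for every frame.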

> My idea to solve this was to allocate a few buffers before starting the
> pipeline, call gst_buffer_map with the flag GST_MAP_WRITE on them
> beforehand, and let GStreamer use these pre-allocated buffers. But I have
> not found a way to tell GStreamer to write the decoded data into these
> buffers. Is there a way to do this?

I don't think that's possible. The hardware decoder has special
requirements for its buffers, so it cannot just write into any memory you
provide. And, as noted above, it needs an unmodified copy of the buffer to
decode the next frame.
So if you want to modify the buffer, it must be copied. And
performance-wise it really does not matter where that copy happens.
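
To put a rough number on that copy, here is a small sketch of the
arithmetic. It assumes the decoder outputs NV12 (12 bits per pixel) with
no row padding, which I have not verified for vpudec. Under that
assumption a 1920x1080 frame is 3110400 bytes, and copying it in the
34 ms you measured works out to roughly 91 MB/s. The function names are
made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one frame, assuming NV12: a full-resolution luma plane plus a
 * half-size interleaved chroma plane, i.e. 3/2 bytes per pixel, with no
 * row padding. The pixel format is an assumption. */
static size_t nv12_frame_bytes(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/* Effective copy rate in bytes per second when `bytes` bytes are copied
 * in `ms` milliseconds. */
static double copy_rate_bytes_per_s(size_t bytes, double ms)
{
    assert(ms > 0.0);
    return (double)bytes / (ms / 1000.0);
}
```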

Regards,
Michael

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

