[EXT] Re: V4L2 video decoder buffer fence

Nicolas Dufresne nicolas at ndufresne.ca
Wed Jan 13 16:12:18 UTC 2021


On Tue, Jan 12, 2021 at 22:15, Bing Song <bing.song at nxp.com> wrote:

> For the video decoder output buffer, an out-fence can't be used because of
> video decoder output buffer reordering.
>

Hantro is a stateless decoder, so reordering happens in userspace. For this
reason, it is not affected by the fence / queue ordering limitations.

Of course, fences need to be used inside the userspace decoder, so that you
don't do lock-step decoding inside your reordering queue. This is also true
for current queue-based decoding; I have patches pending to ensure that for
the V4L2codecs plugin.
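
As a rough illustration of the point about not decoding in lock step, here is
a minimal Python sketch of a reordering queue that stores a fence with each
frame and only waits on it when the frame is about to be output. It is not the
V4L2codecs implementation: `ReorderQueue`, `max_reorder` and the frame names
are hypothetical, and `concurrent.futures.Future` merely stands in for a real
fence.

```python
import heapq
from concurrent.futures import Future


class ReorderQueue:
    """Hold decoded frames until they can be emitted in display order."""

    def __init__(self, max_reorder):
        self.max_reorder = max_reorder  # hypothetical reorder depth
        self._heap = []                 # entries: (display_order, frame_id, fence)

    def push(self, display_order, frame_id, fence):
        # Queue the frame immediately; its decode job may still be running.
        heapq.heappush(self._heap, (display_order, frame_id, fence))
        return self._drain(self.max_reorder)

    def flush(self):
        # End of stream: emit everything that is still queued.
        return self._drain(0)

    def _drain(self, keep):
        out = []
        while len(self._heap) > keep:
            _, frame_id, fence = heapq.heappop(self._heap)
            fence.result()  # the only blocking point: wait just before output
            out.append(frame_id)
        return out


# Decode order I0, P3, B1, B2; the numbers are the display order.
fences = {name: Future() for name in ("I0", "P3", "B1", "B2")}
queue = ReorderQueue(max_reorder=2)
out = []
out += queue.push(0, "I0", fences["I0"])  # fence still pending, no blocking
out += queue.push(3, "P3", fences["P3"])
for fence in fences.values():
    fence.set_result(None)                # stand-in for the hardware finishing
out += queue.push(1, "B1", fences["B1"])
out += queue.push(2, "B2", fences["B2"])
out += queue.flush()
```

The first two pushes return without waiting even though their fences have not
signalled yet, which is the behaviour a lock-step queue would not give you.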

> The video decoder output buffer can use an input fence that comes from
> gl-render in Weston. Can this use case improve performance?
>
I'm not sure how much full-duplex fences would benefit performance.
Many GL stacks use implicit fences (only visible to kernel drivers, notably
etnaviv). In that case, waiting for the fence is not about performance but
correctness. I saw a kernel patch from Philipp Zabel that addresses this
inside VB2 (which does not have any fence support), but the downside is
that it blocks userspace inside the qbuf ioctl. Implicit fences are strictly
kernel-side. If they were explicit, we could wait or poll in userspace before
reusing that buffer.
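
To illustrate waiting on an explicit fence from userspace, here is a minimal
Python sketch. It assumes the fence is exported as a file descriptor that
becomes readable once signalled, which is how Linux sync_file descriptors
behave; V4L2 does not currently hand out such descriptors. `wait_fence` is a
hypothetical helper, and a pipe stands in for the fence so the sketch can run
anywhere poll() is available.

```python
import os
import select


def wait_fence(fd, timeout_ms):
    """Return True once fd becomes readable, False if the timeout expires."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    # poll() returns an empty list when nothing is ready before the timeout.
    return bool(poller.poll(timeout_ms))


# Stand-in for a fence fd (assumption): the read end of a pipe.
read_end, write_end = os.pipe()
still_pending = wait_fence(read_end, 0)   # nothing written yet
os.write(write_end, b"x")                 # "signal" the stand-in fence
signalled = wait_fence(read_end, 100)
os.close(read_end)
os.close(write_end)
```

With a timeout of 0 this is a non-blocking check, so the same helper covers
both the "poll" and the "wait" case before a buffer is queued again.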

>
>
> Regards,
>
> Bing
>
>
>
> *From:* gstreamer-devel <gstreamer-devel-bounces at lists.freedesktop.org> *On
> Behalf Of *Nicolas Dufresne
> *Sent:* January 12, 2021 22:19
> *To:* Discussion of the development of and with GStreamer <
> gstreamer-devel at lists.freedesktop.org>
> *Subject:* [EXT] Re: V4L2 video decoder buffer fence
>
>
>
> On Tue, Jan 12, 2021 at 02:15, Bing Song <bing.song at nxp.com> wrote:
>
> Hi,
>
>
>
> I want to implement V4L2 video decoder buffer fences, but I don't see how
> they would benefit performance. The video HW decoder decodes in a single
> step. We use the Hantro video decoder: CPU software parses the SPS/PPS and
> slice headers, and the HW decodes each video frame in one step. How can
> dma-buf fences benefit decode performance?
>
>
>
> Fences alone don't improve performance. You need to combine these fences
> with a GPU or a display driver API to actually gain.
>
>
>
> Fences in GPU and display drivers are used to parallelize the processing
> without using extra threads, and so without the context-switch cost.
>
>
>
> With the fences, the driver can deliver incomplete frames and program the
> next job without blocking. This is equivalent to adding a render delay of 1
> frame, but without the full frame latency.
>
>
>
> Note that fences are not yet supported in the V4L2 API; there was a
> proposal, but with some limitations (ordering and timestamp related).
>
>
>
>
>
> Regards,
>
> Bing
>
> _______________________________________________
> gstreamer-devel mailing list
> gstreamer-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel