Support for 2D engines/blitters in V4L2 and DRM

Wed Apr 24 16:54:27 UTC 2019

On 2019-04-24 5:44 p.m., Nicolas Dufresne wrote:
> On Wednesday, 24 April 2019 at 17:06 +0200, Daniel Vetter wrote:
>> On Wed, Apr 24, 2019 at 4:41 PM Paul Kocialkowski
>> <paul.kocialkowski at bootlin.com> wrote:
>>> On Wed, 2019-04-24 at 16:39 +0200, Michel Dänzer wrote:
>>>> On 2019-04-24 2:01 p.m., Nicolas Dufresne wrote:
>>>>>
>>>>> Rendering a video stream is more complex than what you describe here.
>>>>> Whenever there is an unexpected delay (late delivery of a frame, for
>>>>> example), you may end up in a situation where one frame is ready after
>>>>> the targeted vblank. If there is another frame that targets the
>>>>> following vblank and gets ready on time, the previous frame should be
>>>>> replaced by the most recent one.
>>>>>
>>>>> With fences, what happens is that even if you received the next frame
>>>>> on time, naively replacing it is not possible, because we don't know
>>>>> when the fence for the next frame will be signalled. If you simply
>>>>> always replace the current frame, you may end up skipping a lot more
>>>>> vblanks than you expect, and that results in jumpy playback.
>>>>
>>>> So you want to be able to replace a queued flip with another one then.
>>>> That doesn't necessarily require allowing more than one flip to be
>>>> queued ahead of time.
>>>
>>> There might be other ways to do it, but this one has plenty of
>>> advantages.
>>
>> The point of kms (well one of the reasons) was to separate the
>> implementation of modesetting for specific hw from policy decisions
>> like which frames to drop and how to schedule them. Kernel gives
>> tools, userspace implements the actual protocols.
>>
>> There's definitely a bit of a gap around scheduling flips for a specific
>> frame or allowing to cancel/overwrite an already scheduled flip, but
>> no one yet has come up with a clear proposal for new uapi + example
>> implementation + userspace implementation + big enough support from
>> other compositors that this is what they want too.

Actually, the ATOMIC_AMEND patches propose a way to replace a scheduled
flip?

>>>> Note that this can also be done in userspace with explicit fencing (by
>>>> only selecting a frame and submitting it to the kernel after all
>>>> corresponding fences have signalled), at least to some degree, but the
>>>> kernel should be able to do it up to a later point in time and more
>>>> reliably, with less risk of missing a flip for a frame which becomes
>>>> ready just in time.
>>>
>>> Indeed, but it would be great if we could do that with implicit fencing
>>> as well.
>>
>> 1. extract implicit fences from dma-buf. This part is just an idea,
>> but easy to implement once we have someone who actually wants this.
>> All we need is a new ioctl on the dma-buf to export the fences from
>> the reservation_object as a sync_file (either the exclusive or the
>> shared ones, selected with a flag).
>> 2. do the exact same frame scheduling as with explicit fencing
>> 3. supply explicit fences in your atomic ioctl calls - these should
>> overrule any implicit fences (assuming correct kernel drivers, but we
>> have helpers so you can assume they all work correctly).
>>
>> By design this is possible, it's just that no one yet bothered enough
>> to make it happen.
>> -Daniel
> 
> I'm not sure I understand the workflow of this one. I'm all in favour of
> leaving the hard work to userspace. Note that I have assumed explicit
> fences from the start; I don't think implicit fences will ever exist in
> v4l2, but I might be wrong. What I understood is that there was a
> previous attempt, but it raised more issues than it actually solved.
> That being said, how exactly do you handle the following use cases:
> 
>  - A frame was lost by the capture driver, but it was scheduled as being
> the next buffer to render (normally the previous frame should remain).

Userspace just doesn't call into the kernel to flip to the lost frame,
so the previous one remains.

>  - The scheduled frame is late for the next vblank (didn't signal on
> time), a new one may be better for the next vblank, but we will only
> know when its fence is signalled.

Userspace only selects a frame and submits it to the kernel after all
its fences have signalled.

> Better in this context means that the presentation time of this frame is
> closer to the next vblank time. Keep in mind that the idea is to
> schedule the frames before they are signalled, in order to make the usage
> of the fence useful in lowering the latency.

Fences are about signalling completion, not about low latency.

With a display server, the client can send frames to the display server
ahead of time, only the display server needs to wait for fences to
signal before submitting frames to the kernel.
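
As a rough illustration of the userspace policy described above (submit
only a frame whose fence has signalled, preferring the one whose
presentation time is closest to the next vblank), here is a minimal
sketch. The names Frame, fence_signalled and pick_frame are hypothetical,
and the sketch assumes each fence is represented by a file descriptor that
polls readable once it has signalled, which is my understanding of how
sync_file fds behave. The actual flip submission is left out.

```python
import os
import select
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    pts: float     # target presentation time, in seconds (hypothetical field)
    fence_fd: int  # fd assumed to poll readable once the frame is ready


def fence_signalled(fd: int) -> bool:
    """Non-blocking check whether the fence behind this fd has signalled."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    # A zero timeout returns immediately; a non-empty list means "ready".
    return bool(poller.poll(0))


def pick_frame(queue: List[Frame], next_vblank: float) -> Optional[Frame]:
    """Pick the ready frame whose presentation time is closest to next_vblank.

    Returns None when no fence has signalled yet, in which case userspace
    submits nothing and the previous frame stays on screen.
    """
    ready = [f for f in queue if fence_signalled(f.fence_fd)]
    if not ready:
        return None
    return min(ready, key=lambda f: abs(f.pts - next_vblank))
```

A compositor would call something like pick_frame() shortly before its
submission deadline for each vblank and pass the result to the kernel;
frames that were skipped can then be released back to the producer.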

> Of course, as Michel said, we could just always wait on the fence and
> just schedule. But if you do that, why would you care about implementing
> the fence in v4l2 to start with? DQBuf does just that already.

A fence is more likely to work out of the box with non-V4L-related code
than DQBuf?

-- 
Earthling Michel Dänzer               |              https://www.amd.com
Libre software enthusiast             |             Mesa and X developer
