[PATCH] drm/i915: Before pageflip, also wait for shared dmabuf fences.

Tue Sep 13 12:52:01 UTC 2016

On 13.09.2016 at 11:39, Chris Wilson wrote:
> On Tue, Sep 13, 2016 at 10:44:11AM +0200, Christian König wrote:
>> On 09.09.2016 at 03:15, Michel Dänzer wrote:
>>> On 09/09/16 01:23 AM, Chris Wilson wrote:
>>>> On Thu, Sep 08, 2016 at 05:21:42PM +0200, Mario Kleiner wrote:
>>>>> On 09/08/2016 08:30 AM, Chris Wilson wrote:
>>>>>> On Thu, Sep 08, 2016 at 02:14:43AM +0200, Mario Kleiner wrote:
>>>>>>> amdgpu-kms uses shared fences for its prime exported dmabufs,
>>>>>>> instead of an exclusive fence. Therefore we need to wait for
>>>>>>> all fences of the dmabuf reservation object to prevent
>>>>>>> unsynchronized rendering and flipping.
>>>>>> No. Fix the root cause as this affects not just flips but copies -
>>>>>> this implies that everybody using the resv object must wait for all
>>>>>> fences. The resv object is not just used for prime, but all fencing, so
>>>>>> this breaks the ability to schedule parallel operations across engine.
>>>>>> -Chris
>>>>>>
>>>>> Ok. I think I now understand the difference, but let's check: The
>>>>> exclusive fence is essentially acting a bit like a write lock, and
>>>>> the shared fences like read locks? So you can have multiple readers
>>>>> but only one writer at a time?
>>>> That's how we (i915.ko and I hope the rest of the world) are using them.
>>>> In the model where there is just one reservation object on the GEM
>>>> object, that reservation object is then shared between internal driver
>>>> scheduling and external. We are reliant on being able to use buffers on
>>>> multiple engines through the virtue of the shared fences, and to serialise
>>>> after writes by waiting on the exclusive fence. (So we can have concurrent
>>>> reads on the display engine, render engines and on the CPU - or
>>>> alternatively an exclusive writer.)
>>>>
>>>> In the near future, i915 flips will wait on the common reservation object
>>>> not only for dma-bufs, but also its own GEM objects.
>>>>> Ie.:
>>>>>
>>>>> Writer must wait for all fences before starting write access to a
>>>>> buffer, then attach the exclusive fence and signal it on end of
>>>>> write access. E.g., write to renderbuffer, write to texture etc.
>>>> Yes.
>>>>> Readers must wait for exclusive fence, then attach a shared fence
>>>>> per reader and signal it on end of read access? E.g., read from
>>>>> texture, fb, scanout?
>>>> Yes.
>>>>> Is that correct? In that case we'd have a missing exclusive fence in
>>>>> amdgpu for the linear target dmabuf? Probably beyond my level of
>>>>> knowledge to fix this?
>>>> i915.ko requires the client to mark which buffers are written to.
>>>>
>>>> In ttm, there are ttm_validate_buffer objects which mark whether they
>>>> should be using shared or exclusive fences. Afaict, in amdgpu they are
>>>> all set to shared, the relevant user interface seems to be
>>>> amdgpu_bo_list_set().
>>> This all makes sense to me.
>>>
>>> Christian, why is amdgpu setting only shared fences? Can we fix that?
>> No, amdgpu relies on the fact that we even allow concurrent write
>> accesses by userspace.
>>
>> E.g. one part of the buffer can be rendered by one engine while
>> another part could be rendered by another engine.
>>
>> Just imagine X which is composing a buffer with both the 3D engine
>> as well as the DMA engine.
>>
>> All engines need to run in parallel and you need to wait for all of
>> them to finish before scanout.
>>
>> Everybody who needs exclusive access to the reservation object
>> (like scanouts do) needs to wait for all fences, not just the
>> exclusive one.
>>
>> The Intel driver clearly needs to be fixed here.
> If you are not using implicit fencing, you have to pass explicit fences
> instead.

Which is exactly what we do, but only for driver-internal command
submissions.

With amdgpu, all command submissions from the same process can run
concurrently. Only when we see a fence from another driver or process
do we wait for it to complete before starting a command submission.
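
To illustrate that rule, here is a rough sketch in plain C. It is only a
model under stated assumptions: struct sub_fence, its driver_id and
process_id fields, and both helper functions are made-up names for this
mail, not the amdgpu code or any kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a fence; NOT the kernel's fence structure. */
struct sub_fence {
    int driver_id;   /* assumed: identifies the driver that created the fence */
    int process_id;  /* assumed: identifies the process that submitted the work */
    bool signaled;   /* true once the work behind the fence has completed */
};

/*
 * A new submission from (driver_id, process_id) waits only for
 * unsignaled fences that come from another driver or another process.
 * Fences from the same driver and process are not waited on, so those
 * submissions can run concurrently.
 */
static bool submission_must_wait(const struct sub_fence *f,
                                 int driver_id, int process_id)
{
    if (f->signaled)
        return false;
    return f->driver_id != driver_id || f->process_id != process_id;
}

/* Count how many fences on a buffer the new submission has to wait for. */
static size_t count_waits(const struct sub_fence *fences, size_t n,
                          int driver_id, int process_id)
{
    size_t waits = 0;

    assert(fences != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (submission_must_wait(&fences[i], driver_id, process_id))
            waits++;
    }
    return waits;
}
```

In this model, a buffer carrying one unsignaled fence from the
submitting process itself and two from other owners results in two
waits, independent of whether any of those fences is shared or
exclusive.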

Other drivers can't make any assumption about what a shared access is
actually doing with a buffer (e.g. writing or reading).

So the way i915.ko is using the reservation object and its shared
fences is certainly not correct and needs to be fixed.
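
The difference between the two wait policies can be sketched like this.
Again, this is only a simplified model: struct resv_model,
MODEL_MAX_SHARED and the two pending_*() helpers are hypothetical names
chosen for illustration, not the reservation object API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_SHARED 8  /* arbitrary limit for this sketch */

/* Hypothetical fence: only tracks whether the work has finished. */
struct resv_fence {
    bool signaled;
};

/* Hypothetical reservation object: one exclusive slot plus shared slots. */
struct resv_model {
    const struct resv_fence *excl;                      /* may be NULL */
    const struct resv_fence *shared[MODEL_MAX_SHARED];
    size_t shared_count;
};

static bool fence_pending(const struct resv_fence *f)
{
    return f != NULL && !f->signaled;
}

/* Policy A: look at the exclusive fence only. */
static bool pending_exclusive_only(const struct resv_model *r)
{
    return fence_pending(r->excl);
}

/* Policy B: look at the exclusive fence and every shared fence. */
static bool pending_all(const struct resv_model *r)
{
    assert(r->shared_count <= MODEL_MAX_SHARED);
    if (fence_pending(r->excl))
        return true;
    for (size_t i = 0; i < r->shared_count; i++) {
        if (fence_pending(r->shared[i]))
            return true;
    }
    return false;
}
```

With two engines rendering into the buffer and both attaching shared
fences, policy A reports nothing pending while the rendering is still
running, whereas policy B keeps a scanout waiting until both fences
have signaled.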

Regards,
Christian.

> -Chris
>