[PATCH] drm/i915: Before pageflip, also wait for shared dmabuf fences.

Wed Sep 21 12:56:36 UTC 2016

On Wed, Sep 21, 2016 at 1:19 PM, Christian König
<deathsimple at vodafone.de> wrote:
> On 21.09.2016 at 13:04, Daniel Vetter wrote:
>>
>> On Wed, Sep 21, 2016 at 12:30 PM, Christian König
>> <deathsimple at vodafone.de> wrote:
>>>
>>> On 21.09.2016 at 11:56, Michel Dänzer wrote:
>>>>
>>>>
>>>> Looks like there are different interpretations of the semantics of
>>>> exclusive vs. shared fences. Where are these semantics documented?
>>>
>>>
>>> Yeah, I also think that this is the primary question here.
>>>
>>> IIRC the fences were explicitly called exclusive/shared instead of
>>> writing/reading on purpose.
>>>
>>> I absolutely don't mind switching them to writing/reading semantics, but
>>> amdgpu really needs multiple writers at the same time.
>>>
>>> So in this case the writing side of a reservation object needs to be a
>>> collection of fences as well.
>>
>> You can't have multiple writers with implicit syncing. That confusion
>> is exactly why we called them shared/exclusive. Multiple writers
>> generally means that you do some form of fencing in userspace
>> (unsync'ed gl buffer access is the common one). What you do for
>> private buffers doesn't matter, but when you render into a
>> shared/winsys buffer you really need to set the exclusive fence (and
>> there can only ever be one). So probably needs some userspace
>> adjustments to make sure you don't accidentally set an exclusive write
>> hazard when you don't really want that implicit sync.
>
>
> Nope, that isn't true.
>
> We use multiple writers without implicit syncing between processes in the
> amdgpu stack perfectly fine.
>
> See amdgpu_sync.c for the implementation. What we do there is look at all
> the fences associated with a reservation object and only sync to those
> that are from another process.
>
> Then we use implicit syncing for command submissions in the form of
> "dependencies". E.g. for each CS we report back an identifier of that
> submission to user space and on the next submission you can give this
> identifier as dependency which needs to be satisfied before the command
> submission can start running.

This is called explicit fencing. It is implemented with a driver-private
primitive (and not sync_file fds like on Android), but it is still
conceptually explicit fencing. Implicit fencing can really only handle
one writer, at least as currently implemented by struct
reservation_object.
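
To make the one-writer limit concrete, here is a minimal toy model of the
shared/exclusive split. It is only a sketch and not the kernel's struct
reservation_object API: every toy_* name, the fixed-size array and the
integer fence ids are assumptions made up for illustration. It shows one
exclusive slot, any number of shared fences, and the wait rule that
implicit sync derives from them.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SHARED 8

/* Hypothetical stand-in for a fence: an id plus the context that emitted it. */
struct toy_fence {
    unsigned int id;
    unsigned int context;
};

/* Toy reservation object: a single exclusive slot and a list of shared fences. */
struct toy_resv {
    const struct toy_fence *exclusive;               /* at most one */
    const struct toy_fence *shared[TOY_MAX_SHARED];  /* any number, up to the cap */
    size_t shared_count;
};

/* Add a shared fence (a user that does not need exclusive access). */
void toy_resv_add_shared(struct toy_resv *r, const struct toy_fence *f)
{
    assert(r->shared_count < TOY_MAX_SHARED);
    r->shared[r->shared_count++] = f;
}

/*
 * Install a new exclusive fence. There is only one slot, so the previous
 * exclusive fence is replaced. The shared list is reset on the assumption
 * that the new exclusive user already waited for all of them.
 */
void toy_resv_set_exclusive(struct toy_resv *r, const struct toy_fence *f)
{
    r->exclusive = f;
    r->shared_count = 0;
}

/*
 * Collect the fences an implicitly synced user has to wait for:
 * - shared access waits for the exclusive fence only,
 * - exclusive access waits for the exclusive fence and all shared fences.
 * Returns the number of fences written to out[].
 */
size_t toy_resv_wait_list(const struct toy_resv *r, int exclusive_access,
                          const struct toy_fence **out, size_t out_len)
{
    size_t n = 0;

    if (r->exclusive && n < out_len)
        out[n++] = r->exclusive;

    if (exclusive_access) {
        for (size_t i = 0; i < r->shared_count && n < out_len; i++)
            out[n++] = r->shared[i];
    }
    return n;
}
```

In this model a second concurrent writer has nowhere to go: setting a new
exclusive fence replaces the old one, so two writers can only be ordered
one after the other, never tracked side by side.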

> This was done to allow multiple engines (3D, DMA, Compute) to compose a
> buffer while still allowing compatibility with protocols like DRI2/DRI3.

Instead of the current solution you need to stop attaching exclusive
fences to non-shared buffers (which are coordinated using the
driver-private explicit fencing you're describing), and only attach
exclusive fences to shared buffers (DRI2/3, PRIME, whatever). Since
you're doing explicit syncing for internal stuff anyway you can still
opt to ignore the exclusive fences if you want to (driven by a flag or
something similar).
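
A rough sketch of that policy, with everything in it hypothetical: the
toy_* names, the "exported" field and the flag are made up for
illustration and are not amdgpu or DRM interfaces. Putting the write
fence of a private buffer into the shared slot is also just an
assumption about what "not exclusive" would mean in practice.

```c
#include <stdbool.h>

/* Which slot of the reservation object a write fence is attached to. */
enum toy_slot {
    TOY_SLOT_SHARED,
    TOY_SLOT_EXCLUSIVE,
};

/* Hypothetical buffer object: exported is true for DRI2/3, PRIME, etc. */
struct toy_bo {
    bool exported;
};

/*
 * Only buffers visible to other processes or drivers get an exclusive
 * fence for writes. Private buffers are ordered by the driver's own
 * explicit dependencies, so their write fence goes into the shared slot
 * (an assumption of this sketch).
 */
enum toy_slot toy_write_fence_slot(const struct toy_bo *bo)
{
    return bo->exported ? TOY_SLOT_EXCLUSIVE : TOY_SLOT_SHARED;
}

/*
 * Whether a submission waits for the exclusive fence. A submission that
 * is fully ordered by explicit dependencies may opt out through a
 * (hypothetical) per-submission flag.
 */
bool toy_must_wait_exclusive(bool ignore_implicit_sync)
{
    return !ignore_implicit_sync;
}
```

With this split, a winsys buffer always carries a write hazard that other
drivers can see, while private buffers never create one by accident.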
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch