[RFC] replacing dma_resv API
Christian König
ckoenig.leichtzumerken at gmail.com
Thu Aug 22 09:27:40 UTC 2019
On 21.08.19 at 22:05, Daniel Vetter wrote:
> On Wed, Aug 21, 2019 at 06:13:27PM +0200, Daniel Vetter wrote:
>> On Wed, Aug 21, 2019 at 02:31:37PM +0200, Christian König wrote:
>>> Hi everyone,
>>>
>>> In previous discussions it surfaced that different drivers use the shared
>>> and exclusive fences in the dma_resv object with different meanings.
>>>
>>> This is problematic when we share buffers between those drivers, and the
>>> differing requirements for implicit and explicit synchronization have led
>>> to quite a number of workarounds related to this.
>>>
>>> So I started an effort to get all drivers back to a common understanding
>>> of what the fences in the dma_resv object mean, and to be able to use the
>>> object for different kinds of workloads independent of the classic DRM
>>> command submission interface.
>>>
>>> The result is this patch set, which modifies the dma_resv API to move away
>>> from a single exclusive fence and multiple shared fences, towards a notion
>>> where we have explicit categories for writers, readers and others.
>>>
>>> To do this I came up with a new container called dma_resv_fences, which
>>> can store either a single fence or multiple fences in a dma_fence_array.
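(Sketching the idea: the container boils down to a single RCU-protected
fence pointer which may reference either one fence or a dma_fence_array;
the exact layout here is illustrative, not necessarily what the patches
use:)

    struct dma_resv_fences {
            /*
             * Either a single struct dma_fence, or the base fence of a
             * dma_fence_array wrapping multiple fences.
             */
            struct dma_fence __rcu *fence;
    };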
>>>
>>> This actually turned out to be quite a bit simpler as well, since we
>>> don't need any complicated dance between RCU and sequence-count-protected
>>> updates any more.
>>>
>>> Instead we can just grab a reference to the dma_fence_array under RCU
>>> and so keep the current state of synchronization alive until we are done
>>> with it.
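(For illustration, taking such a snapshot could then be as simple as the
following, using the existing dma_fence_get_rcu_safe() helper and assuming
the dma_resv_fences layout sketched above:)

    struct dma_fence *fence;

    rcu_read_lock();
    fence = dma_fence_get_rcu_safe(&fences->fence);
    rcu_read_unlock();

    /*
     * 'fence' now pins the snapshot of the synchronization state; drop
     * the reference with dma_fence_put() when done.
     */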
>>>
>>> This results in both a small performance improvement, since we don't need
>>> as many barriers any more, and fewer lines of code in the actual
>>> implementation.
>> I think you traded correctness for the lack of barriers/retry loops here,
>> see my reply later on. But I haven't grokked the full thing in detail, so I
>> might easily have missed something.
>>
>> But high level first, and I don't get this at all. Current state:
>>
>> Ill-defined semantics, no docs. You have to look at the implementations.
>>
>> New state after your patch series:
>>
>> Ill-defined semantics (but hey, different!), no docs. You still have to
>> look at the implementations to understand what's going on.
>>
>> I think what has actually changed (aside from the entire implementation)
>> is just these three things:
>> - we now allow multiple exclusive fences
> This isn't really new either; you could already attach a dma_fence_array to
> the exclusive slot.
Correct; the problem is really that in this case we still wouldn't have
clear semantics for what each fence means.
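For illustration, this is how multiple fences could already be packed into
the exclusive slot with the existing helpers (error handling omitted;
'obj', 'fences' and 'num_fences' assumed from context):

    struct dma_fence_array *array;

    /* Wrap the individual fences into one container fence... */
    array = dma_fence_array_create(num_fences, fences,
                                   dma_fence_context_alloc(1), 1, false);

    /* ...and attach it as the single exclusive fence. */
    dma_resv_add_excl_fence(obj, &array->base);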
>> - exclusive was renamed to writer fences, shared to reader fences
> A bit more context on why I think this is pure bikeshedding: we've had (what
> at least felt like) a multi-year bikeshed on what to call these, with the two
> options being writers/readers and exclusive/shared. Somehow (it's not
> documented, hooray) we ended up going with exclusive/shared. Switching that
> over to the other bikeshed again, still without documenting what exactly you
> should be putting there (since amdgpu still doesn't always fill out the
> writer, because that's not how amdgpu works), feels really silly.
I simply haven't changed the implementation in amdgpu because I wanted to
negotiate what we are actually going to do first.
>> - there's a new "other" group, for ... otherworldly fences?
> I guess this is to better handle the amdkfd magic fence, or the vm fences?
Both; this is simply for fences which don't participate in implicit
synchronization at all.
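A rough sketch of how the writer/reader/other split could sit in the
object; the member names here are illustrative, taken from the categories
discussed above rather than from the actual patches:

    struct dma_resv {
            struct ww_mutex lock;

            struct dma_resv_fences writers; /* implicit sync, exclusive */
            struct dma_resv_fences readers; /* implicit sync, shared */
            struct dma_resv_fences others;  /* no implicit sync at all,
                                             * e.g. VM updates or amdkfd
                                             * eviction fences */
    };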
> Still no idea, since it's not used.
>
> One other thing I've found while trying to figure out your motivation here
> (since I'm not getting what you're aiming at) is that setting the exclusive
> fence through the old interface now sets both the exclusive and shared fences.
>
> I guess if that's all (I'm assuming I'm blind) we can just add a "give me
> all the fences" interface, and use that for the drivers that want that.
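For reference, the existing dma_resv_get_fences_rcu() already comes close
to such a "give me all the fences" interface:

    struct dma_fence *excl;
    struct dma_fence **shared;
    unsigned int shared_count, i;
    int ret;

    ret = dma_resv_get_fences_rcu(obj, &excl, &shared_count, &shared);
    if (ret)
            return ret;

    /* ... wait on or inspect all of them ... */

    dma_fence_put(excl);
    for (i = 0; i < shared_count; i++)
            dma_fence_put(shared[i]);
    kfree(shared);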
>
>> Afaiui we have the following two issues with the current fence semantics:
>> - amdgpu came up with a totally different notion of implicit sync, using
>> the owner to figure out when to sync. I have no idea at all how that
>> meshes with multiple writers, but I guess there's a connection.
>> - amdkfd does a very fancy eviction/preempt fence. Is that what the other
>> bucket is for?
>>
>> I guess I could read the amdgpu/ttm code in very fine detail and figure
>> this out, but I really don't see how that's moving stuff forward.
>>
>> Also, I think it'd be really good to decouple semantic changes from
>> implementation changes, because untangling them if we have to revert one
>> or the other is going to be nigh impossible. And dma_* is not really an
>> area where we can proudly claim that reverts don't happen.
> I think we should go even further with this, and start earlier.
>
> step 1: Document the current semantics.
I don't think that this is a good idea, because we don't have clear
current semantics.
What we have is a container with fences and no definition of what those
fences mean.
We would just spend a lot of time documenting that we messed it up, with
no gain at all.
The aim of this patch set is to:
a) replace the current container with something which can be re-used
multiple times, and
b) actually define what the fences in the container mean.
I mixed those two goals up in a single patch, and you are absolutely
right that this wasn't a good idea; going to fix that for the next
iteration.
Maybe it will become clearer then what I'm trying to do here,
Christian.
> Once we have that, we can look at the amdkfd and amdgpu vm stuff and
> whatever else there is, and figure out what's missing. Maybe even throw in
> the exact thing you're doing in amdkfd/gpu into the above documentation,
> in an effort to cover what's done. I can add some entertaining things from
> i915's side too :-)
>
> And I mean actual real docs that explain stuff, not oneliner kerneldocs
> for functions and that's it. Without that I think we'll just move in
> circles and go nowhere at all.
> -Daniel