[rfc repost] drm sync objects - a new beginning (make ickle happier?)

Thu Apr 20 08:36:21 UTC 2017

Am 19.04.2017 um 21:14 schrieb Dave Airlie:
> On 20 April 2017 at 04:42, Dave Airlie <airlied at gmail.com> wrote:
>> On 19 April 2017 at 22:07, Christian König <deathsimple at vodafone.de> wrote:
>>> Am 13.04.2017 um 03:41 schrieb Dave Airlie:
>>>> Okay I've taken Chris's suggestions to heart and reworked things
>>>> around a sem_file to see how they might look.
>>>>
>>>> This means the drm_syncobj are currently only useful for semaphores,
>>>> the flags field could be used in future to use it for other things,
>>>> and we can reintroduce some of the API then if needed.
>>>>
>>>> This refactors sync_file first to add some basic rcu wrappers
>>>> about the fence pointer, as this point never updates this should
>>>> all be fine unlocked.
>>>>
>>>> It then creates the sem_file with a mutex, and uses that to
>>>> track the semaphores with reduced fops and the replace and
>>>> get APIs.
>>>>
>>>> Then it reworks the drm stuff on top, and fixes amdgpu bug
>>>> with old_fence.
>>>>
>>>> Let's see if anyone prefers one approach over the other.
>>>
>>> Yeah, I clearly prefer keeping only one object type for synchronization in
>>> the kernel.
>>>
>>> As I wrote in the other mail the argument of using the sync file for
>>> semaphores was to be able to use it as in fence with the atomic mode setting
>>> as well.
>>>
>>> That a wait consumes a previous signal should be a specific behavior of the
>>> operation and not the property of the object.
>>>
>>> In other words I'm fine with using the sync_file in a 1:1 fashion with
>>> Vulkan, but for the atomic API we probably want 1:N to be able to flip a
>>> rendering result on multiple CRTCs at the same time.
>> Well ideally atomic modesetting should be moved to using syncobjects
>> as an option.
>>
>> I'd rather sync_files were limited in scope to interaction with non-drm drivers,
>> and possibly interprocess operations, consuming fd's is bad and merging doesn't
>> really fix that.

I agree that sync_files are a bit inflexible, but we can probably all 
agree that lesson learned from flink names is that we should stick with 
fd's for interprocess operations.

So at least that is a good starting point and the additional semantics 
of replacing the fence instead of merging it doesn't sound like so much 
of a difference here.

Another really good design decision of the sync_files is that you can't 
build deadlocks with it.

>> I'm starting to narrow down to what I think the sync_obj needs to do, and I'm
>> contemplating a bit more something like the following:
>>
>> a) no file backing, a simple kref object that gets tracked in an idr
>> (like a gem object).

I'm not even sure if an idr with a new object type is the right 
approach. See for amdgpu we have the fence numbers which are returned to 
userspace with every command submission.

Those fence numbers can be translated back into a fence structure (as 
long as they aren't ancient and already signaled).

So what we could do is just add a functionality to export those fence 
numbers into a semaphore fd with optionally replacing the fence inside 
an existing semaphore fd.

This way you only need to keep fds from foreign processes.

>> This object can have an optional fence attached to it. If there is no
>> fence, it's unsignalled,
>> if there is a fence it's signalled.

The price question is what happens when you try to wait for an object 
which doesn't (yet) have a fence attached to it?

Waiting is not really an option, since then you can created deadlocks 
inside the kernel. E.g. CS A waits for CS B and B waits for A.

So I think that creating an unsignaled semaphore fd should be forbidden 
in the first place.

> This should say, if there's a fence, the status of the fence.
> Thanks to Andres for pointing it out, it is 5am.

Yeah, I know what you mean. How cute little ones are the lack of sleep 
makes thinking straight rather complicated.

Christian.

>
> Dave.