Shared semaphores for amdgpu

Thu Mar 9 09:12:40 UTC 2017

On 09.03.2017 at 09:15, Dave Airlie wrote:
> On 9 March 2017 at 17:38, Christian König <christian.koenig at amd.com> wrote:
>>> I do wonder if we need the separate sem signal/wait interface, I think
>>> we should just add
>>> semaphore chunks to the CS interface.
>> Yeah, that's what I've said as well from the very beginning.
>>
>> Another question is if we should really create another implementation to
>> share semaphores between processes.
>>
>> In other words putting the current fences inside the semaphore into a
>> sync_file with the signal_on_any bit set would have pretty much the same
>> effect, except that the resulting object then had the sync_file semantics
>> for adding new fences and can be used in the atomic IOCTLs as well.
> So the Vulkan external semaphore spec has two different types of
> semaphore semantics, I'm not sure the sync_file semantics match the
> first type, only the second.

I haven't completely read that part of the spec yet, but from what I
know the first type of semantics is actually a bit scary, and I'm not
sure we want to fully support it.

In particular, being able to wait on a semaphore object that has not
been signaled yet can easily lead to deadlocks and to resources being
tied up in the kernel and the windowing system.

Imagine that you send a command submission (CS) to the kernel with the
request to wait for a semaphore object and then never signal that
semaphore object. At least for amdgpu the kernel driver would accept
that CS and push it into the scheduler. This operation needs memory, so
by doing this the application would tie up kernel memory without any
prospect of releasing it anytime soon.

We could of course try to limit the number of waiting CSes in the
kernel, but then we have the problem of deadlocks again. E.g. the
signaling CS wouldn't be accepted by the kernel because we already have
too many waiters.
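
To make that limit problem concrete, here is a toy model in Python. It
is only a sketch and not how the amdgpu scheduler works: ToyScheduler,
max_queued and demo are made-up names, and the assumption is simply that
every accepted CS occupies one queue slot until it has run.

```python
class ToyScheduler:
    """Toy model: every accepted CS holds a queue slot until it has run."""

    def __init__(self, max_queued):
        self.max_queued = max_queued
        self.queued = []        # accepted CSes that have not run yet
        self.signaled = set()   # semaphores that have been signaled

    def submit(self, name, wait_on=None, signals=None):
        """Return True if the CS was accepted, False if it was rejected."""
        if len(self.queued) >= self.max_queued:
            return False        # limit reached, even for the signaler
        self.queued.append((name, wait_on, signals))
        self._run_ready()
        return True

    def _run_ready(self):
        # Keep running CSes whose semaphore is signaled until nothing moves.
        progress = True
        while progress:
            progress = False
            for cs in list(self.queued):
                _name, wait_on, signals = cs
                if wait_on is None or wait_on in self.signaled:
                    self.queued.remove(cs)
                    if signals is not None:
                        self.signaled.add(signals)
                    progress = True


def demo(max_queued):
    """Submit two waiters, then the CS that would signal their semaphore."""
    sched = ToyScheduler(max_queued)
    accepted = [sched.submit("waiter%d" % i, wait_on="sem") for i in range(2)]
    accepted.append(sched.submit("signaler", signals="sem"))
    return accepted, len(sched.queued)
```

With a limit of two, both waiters are accepted and the signaler is
rejected, so the two waiters stay queued forever. With a limit of three
the signaler gets in and the queue drains.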

In addition to that, you can easily build deadlocks of the form CS A
depends on CS B and CS B depends on CS A. The exact same problem for
Android fences was discussed on the list as well, but the semantics
there are specifically designed so that you can't build deadlocks with
them.
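
The circular case can be illustrated with a small dependency-graph
check. This is a sketch in Python under the assumption that the waits
are written down as a plain dict from each CS to the CSes it waits on.
find_cycle is a hypothetical helper for illustration, not something the
kernel provides.

```python
def find_cycle(deps):
    """Return a list of CS names forming a wait cycle, or None.

    deps maps each CS name to the names of the CSes it waits on.
    """
    WHITE, GREY, BLACK = 0, 1, 2    # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for dep in deps.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: that is a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(deps):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

For example, find_cycle({"A": ["B"], "B": ["A"]}) reports the cycle
A, B, A, while a chain such as {"A": ["B"], "B": []} yields None.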

> I think we would still need separate objects to do the first type,
> which I want for VR stuff..

Which is perfectly reasonable: sharing the object between processes
takes time, so you only want to do it once.

As a possible solution what do you think about adding some new 
functionality to the sync file IOCTLs?

IIRC we currently only support adding new fences to the sync file and
then waiting for all of them in the CS/Atomic page flip.

But what if we also allow replacing the fence(s) in the sync file? And
then, in addition to that, consuming the fence in the CS/Atomic page
flip IOCTL?

That's trivial to implement and should give us pretty much the same 
semantics as the shared semaphore object in Vulkan.
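
Roughly, the three operations could be modeled like this. This is a toy
sketch in Python of how I read the idea, not the real sync_file
interface: ToyFence, ToySyncFile and the method names are all made up
for illustration.

```python
class ToyFence:
    """Stand-in for a fence; only carries a name for illustration."""

    def __init__(self, name):
        self.name = name


class ToySyncFile:
    """Toy container that stays shared while its fences change."""

    def __init__(self):
        self.fences = []

    def add(self, fence):
        # What the text above describes as supported today:
        # the new fence joins the ones already in the sync file.
        self.fences.append(fence)

    def replace(self, fence):
        # Proposed: drop the old fence(s) and keep only the new one,
        # comparable to signaling the shared semaphore again.
        self.fences = [fence]

    def consume(self):
        # Proposed: hand the current fences to the waiter (CS or page
        # flip) and leave the sync file empty for the next round.
        taken, self.fences = self.fences, []
        return taken
```

The object itself would be shared between the processes only once, and
each later signal/wait pair would then map to one replace() followed by
one consume().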

Christian.

>
> I'll try and think about it a bit harder tomorrow.
>
> Dave.