Shared semaphores for amdgpu
david1.zhou at amd.com
Thu Mar 9 09:43:02 UTC 2017
On 2017-03-09 17:12, Christian König wrote:
> On 09.03.2017 at 09:15, Dave Airlie wrote:
>> On 9 March 2017 at 17:38, Christian König <christian.koenig at amd.com> wrote:
>>>> I do wonder if we need the separate sem signal/wait interface, I think
>>>> we should just add
>>>> semaphore chunks to the CS interface.
>>> Yeah, that's what I've said as well from the very beginning.
>>> Another question is if we should really create another
>>> implementation to
>>> share semaphores between processes.
>>> In other words, putting the current fences inside the semaphore into a
>>> sync_file with the signal_on_any bit set would have pretty much the same
>>> effect, except that the resulting object would then have the sync_file
>>> interface for adding new fences and could be used in the atomic IOCTLs
>>> as well.
>> So the Vulkan external semaphore spec has two different types of
>> semantics. I'm not sure the sync_file semantics match the first type,
>> only the second.
> I haven't completely read that part of the spec yet, but from what I
> know the first semantics is actually a bit scary and I'm not sure if
> we want to fully support that.
> In particular, being able to wait on a semaphore object which is not
> yet signaled can easily lead to deadlocks and to resources being bound
> in the kernel and windowing system.
> Imagine that you send a command submission to the kernel with the
> request to wait for a semaphore object and then never signal that
> semaphore object. At least for amdgpu the kernel driver would accept
> that CS and push it into the scheduler. This operation needs memory,
> so by doing this the application would bind kernel memory without the
> prospect of releasing it anytime soon.
> We could of course try to limit the number of waiting CS in the
> kernel, but then we have the problem of deadlocks again. E.g. the
> signaling CS wouldn't be accepted by the kernel because we already
> have too many waiters.
> In addition to that, you can easily build deadlocks of the form CS A
> depends on CS B and CS B depends on CS A. The exact same problem for
> Android fences was discussed on the list as well, but the semantics
> there are specifically designed so that you can't build deadlocks
> with them.
Forbidding waits on an unsignaled sem will be enough to address this concern.
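
To make that rule concrete, here is a minimal sketch as a toy Python model. It is not the attached amdgpu code: the names Semaphore, submit and WaitOnUnsignaled are hypothetical and exist only for illustration. It assumes a binary semaphore that holds at most one pending fence, and a submission path that checks every wait before it queues anything.

```python
class WaitOnUnsignaled(Exception):
    """Raised when a submission waits on a semaphore with no pending signal."""


class Semaphore:
    """Toy binary semaphore: holds at most one pending fence (hypothetical)."""

    def __init__(self):
        self.fence = None  # id of the CS that will signal, or None

    def signal(self, cs_id):
        # A signal operation attaches the signaling submission's fence.
        self.fence = cs_id

    def wait(self):
        # Reject the wait when no signal operation has been queued yet.
        if self.fence is None:
            raise WaitOnUnsignaled("no signal queued for this semaphore")
        fence, self.fence = self.fence, None  # the wait consumes the fence
        return fence


def submit(cs_id, waits, signals):
    """Check all waits up front, then attach signals; return the dependencies."""
    if any(sem.fence is None for sem in waits):
        # Nothing has been queued or consumed at this point.
        raise WaitOnUnsignaled(f"{cs_id} waits on an unsignaled semaphore")
    deps = [sem.wait() for sem in waits]
    for sem in signals:
        sem.signal(cs_id)
    return deps
```

Under these assumptions the "CS A depends on CS B and CS B depends on CS A" case cannot be built: whichever of the two is submitted first waits on a semaphore nobody has signaled yet and is rejected before it binds any memory in the scheduler.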
>> I think we would still need separate objects to do the first type,
Agreed, the implementation does indeed do this.
>> which I want for VR stuff..
> Which is perfectly reasonable, sharing the object between processes
> takes time. So you only want to do this once.
> As a possible solution what do you think about adding some new
> functionality to the sync file IOCTLs?
> IIRC we currently only support adding new fences to the sync file and
> then waiting for all of them in the CS/Atomic page flip.
> But what if we also allow replacing the fence(s) in the sync file? And
> then, in addition to that, consuming the fence in the CS/Atomic page flip?
I feel the new sem implementation I attached is very good, or at
least a good start. Maybe we could discuss from there rather than
talking on the fly, so that we can fix any problems we find in it.
> That's trivial to implement and should give us pretty much the same
> semantics as the shared semaphore object in Vulkan.
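
For illustration only, the replace-and-consume idea quoted above can be sketched as a toy Python model. This is not the sync_file IOCTL interface: SyncFile, add_fence, replace_fences and consume are hypothetical names, and fences are plain strings here.

```python
class SyncFile:
    """Toy stand-in for a sync file: a container of fence ids (hypothetical)."""

    def __init__(self):
        self.fences = []

    def add_fence(self, fence):
        # Behaviour described as existing in the thread: fences accumulate
        # and a waiter waits for all of them.
        self.fences.append(fence)

    def replace_fences(self, fence):
        # Proposed addition: drop the current fences and install a new one.
        # This plays the role of a semaphore signal.
        self.fences = [fence]

    def consume(self):
        # Proposed addition: hand the fences to the waiting CS or page flip
        # and leave the file empty. This plays the role of a semaphore wait.
        taken, self.fences = self.fences, []
        return taken
```

In this model the same object can be shared once and then reused: each replace_fences followed by a consume behaves like one signal/wait pair.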
>> I'll try and think about it a bit harder tomorrow.