[PATCH 2/2] dma-buf: fix reservation_object_wait_timeout_rcu to wait correctly v2
Christian König
deathsimple at vodafone.de
Thu Aug 10 18:19:52 UTC 2017
On 10.08.2017 at 19:11, Chris Wilson wrote:
> Quoting Alex Deucher (2017-08-10 18:01:49)
>> From: Christian König <christian.koenig at amd.com>
>>
>> With hardware resets in mind it is possible that all shared fences are
>> signaled, but the exclusive isn't. Fix waiting for everything in this situation.
> I'm still puzzling over this one.
>
> Setting an exclusive fence will clear all shared, so we must have added
> the shared fences after the exclusive fence. But those shared fences must
> follow the exclusive fence, you should not be able to reorder the shared
> fences ahead of the exclusive. If you have completed shared fences, but
> incomplete exclusive, doesn't that imply you have an ordering issue?
No, that is not an ordering issue.
The problem happens when the shared fences are "aborted" because of a
GPU reset.
You see, the exclusive fence is often a DMA operation moving bytes on
behalf of the kernel, while the shared fences are the actual command
submissions from user space.
What can happen is that the userspace process is killed and/or its
scheduled command submission is aborted for other reasons.
In this case the shared fences get an error code, are set to the
signaled state and the job they represent is never executed.
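To make the scenario concrete, here is a rough userspace sketch in C. It is
not the kernel code: the struct and function names (model_fence, model_resv,
model_fence_abort, idle_shared_only, idle_check_all) are hypothetical and only
model the behaviour described above. idle_shared_only encodes the assumption
that signaled shared fences imply a signaled exclusive fence, while
idle_check_all looks at both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a fence; NOT the kernel's struct dma_fence. */
struct model_fence {
	bool signaled; /* true once the work completed or was aborted */
	int error;     /* 0 on success, negative code if the job was aborted */
};

/* Hypothetical model of a reservation object: one exclusive fence
 * (e.g. a kernel-initiated DMA move) plus a list of shared fences
 * (e.g. command submissions from user space). */
struct model_resv {
	struct model_fence *excl;
	struct model_fence **shared;
	size_t shared_count;
};

/* Abort a job: it is never executed, but its fence still ends up
 * signaled, carrying an error code. */
static void model_fence_abort(struct model_fence *f, int error)
{
	assert(f != NULL);
	f->error = error;
	f->signaled = true;
}

/* Assumes shared fences can only signal after the exclusive fence,
 * so the exclusive fence is only inspected when no shared fences exist. */
static bool idle_shared_only(const struct model_resv *r)
{
	assert(r != NULL);
	if (r->shared_count > 0) {
		for (size_t i = 0; i < r->shared_count; i++)
			if (!r->shared[i]->signaled)
				return false;
		return true;
	}
	return r->excl == NULL || r->excl->signaled;
}

/* Checks every shared fence and always checks the exclusive fence too. */
static bool idle_check_all(const struct model_resv *r)
{
	assert(r != NULL);
	for (size_t i = 0; i < r->shared_count; i++)
		if (!r->shared[i]->signaled)
			return false;
	return r->excl == NULL || r->excl->signaled;
}
```

With one pending exclusive fence and one shared fence that has been aborted
via model_fence_abort, idle_shared_only reports the object as idle although
the DMA move is still outstanding, whereas idle_check_all does not.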
Regards,
Christian.