[PATCH 09/19] drm/radeon: handle lockup in delayed work, v2
maarten.lankhorst at canonical.com
Fri Aug 1 10:46:17 PDT 2014
On 01-08-14 18:35, Christian König wrote:
> On 31.07.2014 at 17:33, Maarten Lankhorst wrote:
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>> V1 had a nasty bug breaking gpu lockup recovery. The fix is to not
>> allow radeon_fence_driver_check_lockup to take exclusive_lock,
>> and to kill it during lockup recovery instead.
> That looks like the delayed work starts running as soon as we submit a fence, and not when it's needed for waiting.
> Since it's a backup for failing IRQs I would rather put it into radeon_irq_kms.c and start/stop it when the IRQs are started/stopped.
The delayed work is not just a backup for failing IRQs; it is also the handler used to detect lockups. That is why I trigger it after processing fences, and reset the timer after processing.
Specifically, what happened in V1 was this scenario:
- lockup occurs
- write lock taken by gpu_reset
- delayed work runs, tries to acquire read lock, blocks
- gpu_reset tries to cancel delayed work synchronously
- has to wait for delayed work to finish -> deadlock