Avoid uninterruptible sleep during process exit

Andrey Grodzovsky andrey.grodzovsky at amd.com
Tue Apr 24 15:30:33 UTC 2018


The following 3 patches address an issue we encounter in the AMDGPU driver.

When a GPU pipe stalls for some reason (shader code error, incorrectly
programmed registers, etc.), an uninterruptible wait in the kernel leaves the
user process in an unresponsive state that can only be remedied by a hard
reset of the system.

Each patch addresses a different case of this problem.

The first one covers normal exit (not from signal processing): the change in
kernel/signal.c allows propagation of the KILL signal to a process marked as
exiting.

The second one covers exit due to death from an unhandled signal during signal
processing: it avoids waiting for SIGKILL when called from
...->do_signal->get_signal->do_group_exit->do_exit->...->wait_event_killable

The third one is not related to process exit; it just avoids an uninterruptible
wait for completion of a particular job on the GPU pipe.

P.S. I am sending this to the kernel mailing list mainly because of the first
patch; the other 2 are intended more for amd-gfx at lists.freedesktop.org and
are included here just to provide more context for the problem we are trying
to solve.

Andrey Grodzovsky (3):
  signals: Allow generation of SIGKILL to exiting task.
  drm/scheduler: Don't call wait_event_killable for signaled process.
  drm/amdgpu: Switch to interrupted wait to recover from ring hang.

drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   | 14 ++++++++++----
drivers/gpu/drm/scheduler/gpu_scheduler.c |  5 +++--
kernel/signal.c                           |  4 ++--
3 files changed, 15 insertions(+), 8 deletions(-)


