[PATCH] drm/amdgpu: Fix a race of IB test
Christian König
ckoenig.leichtzumerken at gmail.com
Sat Sep 11 07:45:37 UTC 2021
On 11.09.21 at 03:55, xinhui pan wrote:
> Direct IB submission should be exclusive. So use write lock.
>
> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 19323b4cce7b..acbe02928791 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1358,10 +1358,15 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
> }
>
> /* Avoid accidently unparking the sched thread during GPU reset */
> - r = down_read_killable(&adev->reset_sem);
> + r = down_write_killable(&adev->reset_sem);
> if (r)
> return r;
>
> + /* Avoid concurrently IB test but not cancel it as I don't know whether we
> + * would add more code in the delayed init work.
> + */
> + flush_delayed_work(&adev->delayed_init_work);
> +
That won't work. It is at least theoretically possible that the delayed
init work waits for the reset_sem which we are holding here, and then the
flush never returns.
That is very unlikely to happen, but lockdep might be able to point it out
with a nice backtrace in the logs.
On the other hand, the delayed init work and a direct IB test through this
interface should be able to run at the same time, so I would just drop the
flush.
Christian.
> /* hold on the scheduler */
> for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
> struct amdgpu_ring *ring = adev->rings[i];
> @@ -1387,7 +1392,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
> kthread_unpark(ring->sched.thread);
> }
>
> - up_read(&adev->reset_sem);
> + up_write(&adev->reset_sem);
>
> pm_runtime_mark_last_busy(dev->dev);
> pm_runtime_put_autosuspend(dev->dev);