[PATCH v2] drm/amdgpu: Fix a race of IB test

Lazar, Lijo lijo.lazar at amd.com
Mon Sep 13 04:00:43 UTC 2021



On 9/13/2021 5:18 AM, xinhui pan wrote:
> Direct IB submission should be exclusive. So use write lock.
> 
> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 19323b4cce7b..be5d12ed3db1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1358,7 +1358,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
>   	}
>   
>   	/* Avoid accidently unparking the sched thread during GPU reset */
> -	r = down_read_killable(&adev->reset_sem);
> +	r = down_write_killable(&adev->reset_sem);

There are many ioctls and debugfs calls which take this lock and, as you 
know, its purpose is to keep them from running while a reset is in 
progress. Its purpose is *not* to fix whatever concurrency issues those 
calls have among themselves; fixing those issues this way is just lazy 
and not acceptable.

Taking the write lock here will also take away any fairness this rw lock 
gives to the writer, which is supposed to be the reset thread.

Thanks,
Lijo

>   	if (r)
>   		return r;
>   
> @@ -1387,7 +1387,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
>   		kthread_unpark(ring->sched.thread);
>   	}
>   
> -	up_read(&adev->reset_sem);
> +	up_write(&adev->reset_sem);
>   
>   	pm_runtime_mark_last_busy(dev->dev);
>   	pm_runtime_put_autosuspend(dev->dev);
> 