[PATCH 1/3] drm/amdgpu: Forward soft recovery errors to userspace

Christian König christian.koenig at amd.com
Fri Mar 8 08:33:38 UTC 2024


On 07.03.24 at 20:04, Joshua Ashton wrote:
> As we discussed before[1], soft recovery should be
> forwarded to userspace, or we can get into a really
> bad state where apps will keep submitting hanging
> command buffers cascading us to a hard reset.

Marek, you have been in favor of this for ages. So I would like to ask
you to put your Reviewed-by on it, and I will just push it into our
internal kernel branch.

Regards,
Christian.

>
> 1: https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60f18@froggi.es/
> Signed-off-by: Joshua Ashton <joshua at froggi.es>
>
> Cc: Friedrich Vock <friedrich.vock at gmx.de>
> Cc: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> Cc: Christian König <christian.koenig at amd.com>
> Cc: André Almeida <andrealmeid at igalia.com>
> Cc: stable at vger.kernel.org
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 4b3000c21ef2..aebf59855e9f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -262,9 +262,8 @@ amdgpu_job_prepare_job(struct drm_sched_job *sched_job,
>   	struct dma_fence *fence = NULL;
>   	int r;
>   
> -	/* Ignore soft recovered fences here */
>   	r = drm_sched_entity_error(s_entity);
> -	if (r && r != -ENODATA)
> +	if (r)
>   		goto error;
>   
>   	if (!fence && job->gang_submit)


