[PATCH] nouveau/gsp: add a 50ms delay between fbsr and driver unload rpcs

Danilo Krummrich dakr at kernel.org
Thu Jul 3 21:46:00 UTC 2025


On 7/3/25 1:27 AM, Dave Airlie wrote:
> From: Dave Airlie <airlied at redhat.com>
> 
> This fixes a bunch of command hangs after runtime suspend/resume.
> 
> This fixes a regression caused by code movement in the commit below.
> That commit seems to change timings just enough to trigger the problem,
> and adding the sleep seems to avoid it.
> 
> I've spent some time trying to root-cause it, to no great avail.
> It looks like a bug on the firmware side, but it could be a bug
> in our RPC handling that I can't find.
> 
> Either way, we should land the workaround to fix the problem,
> while we continue to work out the root cause.

I think we should add a TODO above the msleep(); what do you think would be a
good comment here?

I can add it when applying the patch if you want.

> Signed-off-by: Dave Airlie <airlied at redhat.com>
> Cc: Ben Skeggs <bskeggs at nvidia.com>
> Cc: Danilo Krummrich <dakr at kernel.org>
> Fixes: 21b039715ce9 ("drm/nouveau/gsp: add hals for fbsr.suspend/resume()")
> ---
>   drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
> index baf42339f93e..ff362a6d9f5c 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
> @@ -1744,6 +1744,9 @@ r535_gsp_fini(struct nvkm_gsp *gsp, bool suspend)
>   			nvkm_gsp_sg_free(gsp->subdev.device, &gsp->sr.sgt);
>   			return ret;
>   		}
> +
> +		/* without this Turing ends up resetting all channels after resume. */
> +		msleep(50);
>   	}
>   
>   	ret = r535_gsp_rpc_unloading_guest_driver(gsp, suspend);
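
To illustrate the ordering the hunk above enforces and where a TODO could sit, here is a minimal userspace sketch. It is not the nouveau code: fbsr_suspend_stub(), msleep_stub(), unload_driver_stub() and gsp_fini_sketch() are hypothetical stand-ins, the stub records the delay instead of sleeping, and the TODO wording is only one possible suggestion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Steps of the suspend path, recorded so the ordering can be inspected. */
enum step { STEP_FBSR_SUSPEND = 1, STEP_DELAY, STEP_UNLOAD };

static enum step trace[8];
static size_t trace_len;
static unsigned int slept_ms;

static void record(enum step s)
{
	if (trace_len < sizeof(trace) / sizeof(trace[0]))
		trace[trace_len++] = s;
}

/* Hypothetical stand-in for the fbsr suspend RPC. */
static int fbsr_suspend_stub(void)
{
	record(STEP_FBSR_SUSPEND);
	return 0;
}

/* Hypothetical stand-in for msleep(); records the delay instead of sleeping. */
static void msleep_stub(unsigned int ms)
{
	record(STEP_DELAY);
	slept_ms += ms;
}

/* Hypothetical stand-in for the "unloading guest driver" RPC. */
static int unload_driver_stub(bool suspend)
{
	(void)suspend;
	record(STEP_UNLOAD);
	return 0;
}

static int gsp_fini_sketch(bool suspend)
{
	int ret;

	if (suspend) {
		ret = fbsr_suspend_stub();
		if (ret)
			return ret;

		/*
		 * TODO: Root-cause why this delay is needed and remove it.
		 * Without it, Turing ends up resetting all channels after
		 * resume. Possibly a firmware bug, possibly our RPC handling.
		 */
		msleep_stub(50);
	}

	return unload_driver_stub(suspend);
}
```

Calling gsp_fini_sketch(true) runs the fbsr stub, then the delay, then the unload stub; gsp_fini_sketch(false) skips the first two, matching the placement of the msleep() inside the suspend branch.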


