[PATCH] drm/amdgpu: fix gpu recovery disable with per queue reset
Kim, Jonathan
Jonathan.Kim at amd.com
Thu Jan 9 14:57:41 UTC 2025
[Public]
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Thursday, January 9, 2025 1:14 AM
> To: Kim, Jonathan <Jonathan.Kim at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Kasiviswanathan, Harish <Harish.Kasiviswanathan at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: fix gpu recovery disable with per queue reset
>
>
>
> On 1/9/2025 1:31 AM, Jonathan Kim wrote:
> > Per queue reset should be bypassed when gpu recovery is disabled
> > with module parameter.
> >
> > Signed-off-by: Jonathan Kim <jonathan.kim at amd.com>
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> > index cc66ebb7bae1..441568163e20 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> > @@ -1131,6 +1131,9 @@ uint64_t kgd_gfx_v9_hqd_get_pq_addr(struct
> amdgpu_device *adev,
> > uint32_t low, high;
> > uint64_t queue_addr = 0;
> >
> > + if (!amdgpu_gpu_recovery)
> > + return 0;
> > +
> > kgd_gfx_v9_acquire_queue(adev, pipe_id, queue_id, inst);
> > amdgpu_gfx_rlc_enter_safe_mode(adev, inst);
> >
> > @@ -1179,6 +1182,9 @@ uint64_t kgd_gfx_v9_hqd_reset(struct amdgpu_device
> *adev,
> > uint32_t low, high, pipe_reset_data = 0;
> > uint64_t queue_addr = 0;
> >
> > + if (!amdgpu_gpu_recovery)
> > + return 0;
> > +
>
> I think the right place for this check is not inside callback, should be
> from the place where the callback gets called.
I don't think it really matters. We're going to have different reset types in the future that my come from different callers.
It's probably easier to remember to put the bypass where the reset is actually happening.
Jon
>
> Thanks,
> Lijo
>
> > kgd_gfx_v9_acquire_queue(adev, pipe_id, queue_id, inst);
> > amdgpu_gfx_rlc_enter_safe_mode(adev, inst);
> >
More information about the amd-gfx
mailing list