[PATCH] drm/amdgpu: add ring reset messages

Russell, Kent Kent.Russell at amd.com
Mon Oct 28 15:16:53 UTC 2024


[Public]

Seems simple enough to me.


Reviewed-by: Kent Russell <kent.russell at amd.com>



> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Alex Deucher
> Sent: Monday, October 28, 2024 10:42 AM
> To: Deucher, Alexander <Alexander.Deucher at amd.com>
> Cc: amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add ring reset messages
>
> Ping?
>
> On Fri, Oct 18, 2024 at 11:47 AM Alex Deucher <alexdeucher at gmail.com> wrote:
> >
> > Ping?
> >
> > On Tue, Oct 15, 2024 at 2:28 PM Alex Deucher <alexander.deucher at amd.com> wrote:
> > >
> > > Add messages to make it clear when a per-ring reset
> > > happens. This is helpful for debugging and aligns with
> > > other reset methods.
> > >
> > > Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > index 102742f1faa2..2d60552a13ac 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > @@ -137,6 +137,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
> > >         /* attempt a per ring reset */
> > >         if (amdgpu_gpu_recovery &&
> > >             ring->funcs->reset) {
> > > +               dev_err(adev->dev, "Starting %s ring reset\n", s_job->sched->name);
> > >                 /* stop the scheduler, but don't mess with the
> > >                  * bad job yet because if ring reset fails
> > >                  * we'll fall back to full GPU reset.
> > > @@ -150,8 +151,10 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
> > >                         amdgpu_fence_driver_force_completion(ring);
> > >                         if (amdgpu_ring_sched_ready(ring))
> > >                                 drm_sched_start(&ring->sched);
> > > +                       dev_err(adev->dev, "Ring reset success\n");
> > >                         goto exit;
> > >                 }
> > > +               dev_err(adev->dev, "Ring reset failure\n");
> > >         }
> > >
> > >         if (amdgpu_device_should_recover_gpu(ring->adev)) {
> > > --
> > > 2.46.2
> > >

