[PATCH 2/2] drm/amdgpu: set job guilty if reset skipped

Alex Deucher alexdeucher at gmail.com
Thu Jan 14 17:25:12 UTC 2021


On Thu, Jan 14, 2021 at 9:48 AM Andrey Grodzovsky
<Andrey.Grodzovsky at amd.com> wrote:
>
> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>
> Andrey
>
> On 1/14/21 8:37 AM, Horace Chen wrote:
> > If two jobs on two different rings time out within a very
> > short period, the reset for the second job is skipped
> > because a reset is already in progress.
> >
> > That does not mean the second job is not guilty: it also
> > timed out and may be a bad job. So before bailing out of the
> > reset, we need to increase the karma for this job too.
> >
> > Signed-off-by: Horace Chen <horace.chen at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index a28e138ac72c..d1112e29c8b4 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -4572,6 +4572,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> >               if (atomic_cmpxchg(&hive->in_reset, 0, 1) != 0) {
> >                       DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
> >                               job ? job->base.id : -1, hive->hive_id);
> > +                     if(job)

Please add a space between the if and the (.  E.g.,

if (job)

> > +                             drm_sched_increase_karma(&job->base);
> >                       amdgpu_put_xgmi_hive(hive);
> >                       return 0;
> >               }
> > @@ -4596,6 +4598,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> >                       dev_info(adev->dev, "Bailing on TDR for s_job:%llx, as another already in progress",
> >                                       job ? job->base.id : -1);
> >                       r = 0;
> > +                     if(job)

Same here.

Alex

> > +                             drm_sched_increase_karma(&job->base);
> >                       goto skip_recovery;
> >               }
> >

