[PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii

Russell, Kent Kent.Russell at amd.com
Tue Jan 11 14:41:26 UTC 2022


[AMD Official Use Only]

Reviewed-by: Kent Russell <kent.russell at amd.com>


> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Felix Kuehling
> Sent: Monday, January 10, 2022 7:13 PM
> To: amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH 1/2] drm/amdkfd: Fix DQM asserts on Hawaii
>
> Ping.
>
> On 2021-12-08 3:25 a.m., Felix Kuehling wrote:
>
> > start_nocpsch would never set dqm->sched_running on Hawaii due to an
> > early return statement. This would trigger asserts in other functions
> > > and leave the queue manager in an inconsistent state.
> >
> > > Bug: https://github.com/RadeonOpenCompute/ROCm/issues/1624
> > Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 9 ++++++---
> >   1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > index dd0b952f0173..104b70e61ba0 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > @@ -1004,14 +1004,17 @@ static void uninitialize(struct device_queue_manager *dqm)
> >
> >   static int start_nocpsch(struct device_queue_manager *dqm)
> >   {
> > +   int r = 0;
> > +
> >     pr_info("SW scheduler is used");
> >     init_interrupts(dqm);
> >
> >     if (dqm->dev->adev->asic_type == CHIP_HAWAII)
> > -           return pm_init(&dqm->packet_mgr, dqm);
> > -   dqm->sched_running = true;
> > +           r = pm_init(&dqm->packet_mgr, dqm);
> > +   if (!r)
> > +           dqm->sched_running = true;
> >
> > -   return 0;
> > +   return r;
> >   }
> >
> >   static int stop_nocpsch(struct device_queue_manager *dqm)
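
For readers following the control-flow change, here is a minimal standalone sketch of the before and after behaviour described in the commit message. It is not the kernel code: struct dqm_model, pm_init_stub() and the *_old/*_new function names are hypothetical stand-ins that exist only for this illustration, and the interrupt setup and logging are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real amdkfd definitions. */
enum chip_model { CHIP_HAWAII, CHIP_OTHER };

struct dqm_model {
	enum chip_model asic_type;
	bool sched_running;
	int pm_init_result;	/* value the stubbed pm_init() should return */
};

/* Stub standing in for pm_init(): 0 on success, negative on failure. */
static int pm_init_stub(const struct dqm_model *dqm)
{
	return dqm->pm_init_result;
}

/* Control flow before the patch: Hawaii returns early, so the flag is never set. */
static int start_nocpsch_old(struct dqm_model *dqm)
{
	if (dqm->asic_type == CHIP_HAWAII)
		return pm_init_stub(dqm);
	dqm->sched_running = true;

	return 0;
}

/* Control flow after the patch: the flag is set whenever startup succeeded. */
static int start_nocpsch_new(struct dqm_model *dqm)
{
	int r = 0;

	if (dqm->asic_type == CHIP_HAWAII)
		r = pm_init_stub(dqm);
	if (!r)
		dqm->sched_running = true;

	return r;
}

/* Exercises both variants on the cases the commit message talks about. */
static void run_sketch_checks(void)
{
	struct dqm_model old_hawaii = { CHIP_HAWAII, false, 0 };
	struct dqm_model new_hawaii = { CHIP_HAWAII, false, 0 };
	struct dqm_model new_failed = { CHIP_HAWAII, false, -12 };	/* arbitrary error code */

	/* Old flow: success is reported, yet the scheduler is not marked running. */
	assert(start_nocpsch_old(&old_hawaii) == 0);
	assert(!old_hawaii.sched_running);

	/* New flow: success marks the scheduler as running. */
	assert(start_nocpsch_new(&new_hawaii) == 0);
	assert(new_hawaii.sched_running);

	/* New flow: a failing pm_init is propagated and the flag stays clear. */
	assert(start_nocpsch_new(&new_failed) == -12);
	assert(!new_failed.sched_running);
}
```

Calling run_sketch_checks() from a test program exercises the three Hawaii cases. On any other chip in this model, both variants set the flag and return 0, so only the Hawaii path changes behaviour.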

