[PATCH] drm/amdgpu: fix ib test hang with gfxoff enabled

Huang Rui ray.huang at amd.com
Fri Jun 1 09:29:43 UTC 2018


On Fri, Jun 01, 2018 at 05:13:49PM +0800, Christian König wrote:
> Am 01.06.2018 um 08:41 schrieb Huang Rui:
> > After deferring the execution of the gfx/compute IB tests, by the time they
> > run the gfx block has already gone into a "mid state" of gfxoff.
> >
> > PWR_MISC_CNTL_STATUS: PWR_GFXOFF_STATUS field (2:1 bits)
> > 0 = GFXOFF.
> > 1 = Transition out of GFXOFF state.
> > 2 = Not in GFXOFF.
> > 3 = Transition into GFXOFF.
> >
> > If we hit a mid state (1 or 3), the doorbell write interrupt cannot wake the
> > gfx block up successfully. The field value is 1 when we issue the IB test,
> > so we get the hang. This is the root cause of the issue we encountered.
> >
> > Meanwhile, we cannot set GFX clockgating after gfx is already in the "off"
> > state. So we should move the gfx powergating and gfxoff enabling to the end
> > of initialization, after the IB test and clockgating.
> 
> Mhm, that still looks like an only half-baked solution:
> 
> 1. What prevents this bug from happening during "normal" IB submission 
> from userspace?
> 
> 2. Shouldn't we poll the PWR_MISC_CNTL_STATUS register to make sure we 
> are not in any transition phase instead?
> 

Yes, right. How about also adding polling of PWR_MISC_CNTL_STATUS in
amdgpu_ring_commit(), after set_wptr, to confirm that the status is "0" or "2"?
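
A rough, untested sketch of what such a check might look like. Only the
field position (bits 2:1) and the four status values come from the commit
message quoted above. Every name below (the enum, the helpers, the
read callback, the fake register) is made up for illustration and is not an
existing amdgpu function, and the retry count is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* PWR_GFXOFF_STATUS sits in bits 2:1 of PWR_MISC_CNTL_STATUS. */
#define PWR_GFXOFF_STATUS_SHIFT 1
#define PWR_GFXOFF_STATUS_MASK  (0x3u << PWR_GFXOFF_STATUS_SHIFT)

/* Hypothetical names for the four documented field values. */
enum gfxoff_status {
	GFXOFF_STATUS_OFF      = 0, /* GFXOFF */
	GFXOFF_STATUS_LEAVING  = 1, /* transition out of GFXOFF */
	GFXOFF_STATUS_ON       = 2, /* not in GFXOFF */
	GFXOFF_STATUS_ENTERING = 3, /* transition into GFXOFF */
};

/* Hypothetical accessor; a real driver would use its own MMIO read helper. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Extract the 2-bit status field from the raw register value. */
static inline enum gfxoff_status gfxoff_status_decode(uint32_t reg)
{
	return (enum gfxoff_status)((reg & PWR_GFXOFF_STATUS_MASK) >>
				    PWR_GFXOFF_STATUS_SHIFT);
}

/* Only 0 and 2 are steady states; 1 and 3 are the "mid states". */
static inline bool gfxoff_status_is_stable(enum gfxoff_status s)
{
	return s == GFXOFF_STATUS_OFF || s == GFXOFF_STATUS_ON;
}

/*
 * Poll until the field reads 0 or 2.
 * Returns 0 once a stable state is seen, -1 if max_tries reads all
 * returned a transition state.
 */
static int gfxoff_wait_stable(read_reg_fn read_reg, void *ctx,
			      unsigned int max_tries)
{
	unsigned int i;

	for (i = 0; i < max_tries; i++) {
		if (gfxoff_status_is_stable(gfxoff_status_decode(read_reg(ctx))))
			return 0;
		/* A real driver would insert a short delay between reads. */
	}
	return -1;
}

/* Stand-in for the hardware register, used only to exercise the sketch. */
struct fake_reg {
	const uint32_t *vals; /* successive raw register values */
	unsigned int n;       /* number of entries in vals */
	unsigned int idx;     /* next entry to return */
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *fr = ctx;
	uint32_t v = fr->vals[fr->idx];

	/* Advance through the sequence, then keep repeating the last value. */
	if (fr->idx + 1 < fr->n)
		fr->idx++;
	return v;
}
```

With a fake register that reads 0x2, 0x2, 0x4 (status 1, 1, 2), the poll
succeeds on the third read. With one stuck at 0x6 (status 3) it gives up
after max_tries.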

Thanks,
Ray

