[PATCH] drm/amdgpu: fix ib test hang with gfxoff enabled
Huang Rui
ray.huang at amd.com
Fri Jun 1 09:29:43 UTC 2018
On Fri, Jun 01, 2018 at 05:13:49PM +0800, Christian König wrote:
> Am 01.06.2018 um 08:41 schrieb Huang Rui:
> > After deferring the execution of the gfx/compute ib tests, the tests run at
> > a point where gfx has already gone into the "mid state" of gfxoff.
> >
> > PWR_MISC_CNTL_STATUS: PWR_GFXOFF_STATUS field (bits 2:1)
> > 0 = GFXOFF.
> > 1 = Transition out of GFXOFF state.
> > 2 = Not in GFXOFF.
> > 3 = Transition into GFXOFF.
> >
> > If we hit a mid state (1 or 3), the doorbell write interrupt cannot wake
> > gfx up successfully. The field value is 1 when we issue the ib test, so we
> > get the hang. This is the root cause of the issue.
> >
> > Meanwhile, we cannot set GFX clockgating once gfx is already in the "off"
> > state. So we should move the gfx powergating and gfxoff enabling to the end
> > of initialization, after the ib test and clockgating.
>
> Mhm, that still looks like only a half-baked solution:
>
> 1. What prevents this bug from happening during "normal" IB submission
> from userspace?
>
> 2. Shouldn't we poll the PWR_MISC_CNTL_STATUS register to make sure we
> are not in any transition phase instead?
>
Yes, right. How about also adding polling of PWR_MISC_CNTL_STATUS in
amdgpu_ring_commit(), after set_wptr, to confirm that the status is "0" or "2"?
Thanks,
Ray
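
The polling idea discussed above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the actual amdgpu code: the
field position (bits 2:1) and the four status values come from the patch
description, while the enum names, gfxoff_wait_stable(), the read_reg_fn
callback and the fake_reg test stub are all hypothetical names made up for
this example. A real driver would read the register through its own accessor
and delay between reads.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* From the patch description: PWR_GFXOFF_STATUS occupies bits 2:1. */
#define PWR_GFXOFF_STATUS_SHIFT 1
#define PWR_GFXOFF_STATUS_MASK  (0x3u << PWR_GFXOFF_STATUS_SHIFT)

/* Hypothetical names for the four documented field values. */
enum gfxoff_status {
	GFXOFF_STATUS_OFF      = 0, /* GFXOFF */
	GFXOFF_STATUS_LEAVING  = 1, /* transition out of GFXOFF */
	GFXOFF_STATUS_ON       = 2, /* not in GFXOFF */
	GFXOFF_STATUS_ENTERING = 3, /* transition into GFXOFF */
};

/* Hypothetical register accessor, standing in for the driver's read helper. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Extract the 2-bit status field from a raw register value. */
static enum gfxoff_status gfxoff_status_decode(uint32_t reg)
{
	return (enum gfxoff_status)((reg & PWR_GFXOFF_STATUS_MASK) >>
				    PWR_GFXOFF_STATUS_SHIFT);
}

/* Only 0 and 2 are stable; 1 and 3 are the "mid states". */
static bool gfxoff_status_is_stable(enum gfxoff_status s)
{
	return s == GFXOFF_STATUS_OFF || s == GFXOFF_STATUS_ON;
}

/*
 * Read the register up to max_polls times and return true as soon as the
 * status is stable. Returns false if every read saw a transition state.
 */
static bool gfxoff_wait_stable(read_reg_fn read_reg, void *ctx,
			       unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (gfxoff_status_is_stable(gfxoff_status_decode(read_reg(ctx))))
			return true;
		/* A real driver would sleep or delay here between reads. */
	}
	return false;
}

/* Test stub: replays a fixed sequence of values, repeating the last one. */
struct fake_reg {
	const uint32_t *values;
	size_t count;
	size_t pos;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;
	uint32_t v = r->values[r->pos];

	if (r->pos + 1 < r->count)
		r->pos++;
	return v;
}
```

Bounding the number of polls keeps a stuck transition from turning into an
endless loop; the caller can then decide how to handle the failure.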