[PATCH 3/3] drm/amd: Add safety check to make sure RLC is only turned off while in GFXOFF

Alex Deucher alexdeucher at gmail.com
Wed May 17 12:24:16 UTC 2023


On Tue, May 16, 2023 at 10:23 PM Limonciello, Mario <mlimonci at amd.com> wrote:
>
>
> On 5/16/2023 4:57 PM, Alex Deucher wrote:
> > On Tue, May 16, 2023 at 5:50 PM Limonciello, Mario <mlimonci at amd.com> wrote:
> >>
> >> On 5/16/2023 4:39 PM, Alex Deucher wrote:
> >>> On Tue, May 16, 2023 at 2:15 PM Mario Limonciello
> >>> <mario.limonciello at amd.com> wrote:
> >>>> On GFX11 if RLC is stopped when not in GFXOFF the system will hang.
> >>>> Prevent this case from ever happening.
> >>>>
> >>>> Tested-by: Juan Martinez <Juan.Martinez at amd.com>
> >>>> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
> >>>> ---
> >>>>    drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 ++++
> >>>>    1 file changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> >>>> index dcbdb2641086..f1f879d9ed8d 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> >>>> @@ -1766,6 +1766,10 @@ static void gfx_v11_0_rlc_stop(struct amdgpu_device *adev)
> >>>>    {
> >>>>           u32 tmp = RREG32_SOC15(GC, 0, regRLC_CNTL);
> >>>>
> >>>> +       if (!adev->gfx.gfx_off_state) {
> >>>> +               dev_err(adev->dev, "GFX is not in GFXOFF\n");
> >>>> +               return;
> >>>> +       }
> >>> This should move up before the RREG above?  Also, I think it would be
> >>> cleaner to just not mess with the RLC in S0i3.  Can we just return
> >>> early in smu_disable_dpms() for the APU case?  All of the DPM features
> >>> are controlled by the SMU so that function is mostly a nop of APUs
> >>> anyway.
> >>>
> >>> Alex
> >> That was what the original attempt did when we first identified this issue.
> >> Unfortunately though just skipping RLC (without patches 1 and 2) means
> >> that GFXOFF still either doesn't get toggled at suspend entry or isn't fully
> >>
> >> off at suspend entry.
> >>
> >> This leads to the graphics core behaving erratically upon resume.
> >>
> >> So if you're OK with patches 1 and 2, I'll adjust patch 3 to also skip
> >> RLC for
> >> APU.
> > Sure.
> OK, let me double check RLC skip and I'll send out a v2.
> > I wonder if we need something similar as patch 2 for other APUs?
> I expect patch 1 "alone" to help Renoir and Cezanne hitting a similar
> circumstance.
> For Rembrandt and Mendocino, they don't have IMU, so what would you poll?

For older chips, mmPWR_MISC_CNTL_STATUS (smu10), mmSMUIO_GFX_MISC_CNTL
(smu12).  See smu_v12_0_gfx_off_control() and smu10_disable_gfx_off().
It looks like smu_v13_0_gfx_off_control() doesn't wait for gfxoff like
the other functions.

Alex

> >
> > Thinking out loud here, I wonder if we shouldn't just return early in
> > the top level suspend/resume functions for S0i3.
>
> I think this can make sense for GFX10 and GFX11 maybe, but as it's already
> bifurcated I think it's probably better to do case by case basis.
>


More information about the amd-gfx mailing list