[PATCH 3/3] drm/amd: Add safety check to make sure RLC is only turned off while in GFXOFF
Limonciello, Mario
mlimonci at amd.com
Wed May 17 02:23:31 UTC 2023
On 5/16/2023 4:57 PM, Alex Deucher wrote:
> On Tue, May 16, 2023 at 5:50 PM Limonciello, Mario <mlimonci at amd.com> wrote:
>>
>> On 5/16/2023 4:39 PM, Alex Deucher wrote:
>>> On Tue, May 16, 2023 at 2:15 PM Mario Limonciello
>>> <mario.limonciello at amd.com> wrote:
>>>> On GFX11 if RLC is stopped when not in GFXOFF the system will hang.
>>>> Prevent this case from ever happening.
>>>>
>>>> Tested-by: Juan Martinez <Juan.Martinez at amd.com>
>>>> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 ++++
>>>> 1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>>>> index dcbdb2641086..f1f879d9ed8d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>>>> @@ -1766,6 +1766,10 @@ static void gfx_v11_0_rlc_stop(struct amdgpu_device *adev)
>>>> {
>>>> u32 tmp = RREG32_SOC15(GC, 0, regRLC_CNTL);
>>>>
>>>> + if (!adev->gfx.gfx_off_state) {
>>>> + dev_err(adev->dev, "GFX is not in GFXOFF\n");
>>>> + return;
>>>> + }
>>> This should move up before the RREG above? Also, I think it would be
>>> cleaner to just not mess with the RLC in S0i3. Can we just return
>>> early in smu_disable_dpms() for the APU case? All of the DPM features
>>> are controlled by the SMU so that function is mostly a nop of APUs
>>> anyway.
>>>
>>> Alex
>> That was what the original attempt did when we first identified this issue.
>> Unfortunately though just skipping RLC (without patches 1 and 2) means
>> that GFXOFF still either doesn't get toggled at suspend entry or isn't fully
>>
>> off at suspend entry.
>>
>> This leads to the graphics core behaving erratically upon resume.
>>
>> So if you're OK with patches 1 and 2, I'll adjust patch 3 to also skip
>> RLC for
>> APU.
> Sure.
OK, let me double check RLC skip and I'll send out a v2.
> I wonder if we need something similar as patch 2 for other APUs?
I expect patch 1 "alone" to help Renoir and Cezanne hitting a similar
circumstance.
For Rembrandt and Mendocino, they don't have IMU, so what would you poll?
>
> Thinking out loud here, I wonder if we shouldn't just return early in
> the top level suspend/resume functions for S0i3.
I think this can make sense for GFX10 and GFX11 maybe, but as it's already
bifurcated I think it's probably better to do case by case basis.
More information about the amd-gfx
mailing list