[PATCH] drm/amdgpu/gfx9: add gfxoff quirk

Alex Deucher alexdeucher at gmail.com
Wed Mar 10 17:06:43 UTC 2021


On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez <daniel at qtec.com> wrote:
>
> Disabling GFXOFF via the quirk list fixes a hardware lockup in
> Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83.
>
> Signed-off-by: Daniel Gomez <daniel at qtec.com>
> ---
>
> This patch is a continuation of the work here:
> https://lkml.org/lkml/2021/2/3/122 where a hardware lockup was discussed and
> a dma_fence deadlock was provoked as a side effect. To reproduce the issue
> please refer to the above link.
>
> The hardware lockup was introduced in 5.6-rc1 for our particular revision as it
> wasn't part of the new blacklist. Before that, in kernel v5.5, this hardware was
> working fine without any hardware lock because the GFXOFF was actually disabled
> by the if condition for the CHIP_RAVEN case. So this patch adds the 'Radeon
> Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to disable the GFXOFF.
>
> But besides the fix, I'd like to ask where this revision comes from. Is it
> an ASIC revision or is it hardcoded in the VBIOS from our vendor? From what I
> can see, it comes from the ASIC and I wonder if somehow we can get an APU in the
> future, 'not blacklisted', with the same problem. Then, should this table only
> filter for the vendor and device and not the revision? Do you know if there are
> any revisions for the 1002:15dd validated, tested and functional?

The PCI revision ID (RID) identifies the specific SKU within a
family.  GFXOFF is supposed to be working on all raven variants.  It
was tested and functional on all reference platforms and any OEM
platforms that launched with Linux support.  There are a lot of
dependencies on sbios in the early raven variants (0x15dd), so it's
likely more of a specific platform issue, but there is not a good way
to detect this so we use the DID/SSID/RID as a proxy.  The newer raven
variants (0x15d8) have much better GFXOFF support since they all
shipped with newer firmware and sbios.
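
To illustrate what using the DID/SSID/RID as a proxy means in practice,
here is a minimal userspace sketch of an exact-match lookup over the
quirk table from the patch.  The struct name, field names and the
should_disable_gfxoff() helper are assumptions made for illustration and
are not copied from gfx_v9_0.c; only the table entries come from the
diff quoted below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of one five-field quirk entry (names are assumed). */
struct gfxoff_quirk {
	uint16_t chip_vendor;   /* PCI vendor ID */
	uint16_t chip_device;   /* PCI device ID (DID) */
	uint16_t subsys_vendor; /* PCI subsystem vendor ID */
	uint16_t subsys_device; /* PCI subsystem device ID (SSID) */
	uint8_t  revision;      /* PCI revision ID (RID) */
};

/* Entries taken from the quoted diff; the all-zero entry ends the list. */
static const struct gfxoff_quirk quirk_list[] = {
	{ 0x1002, 0x15dd, 0x103c, 0x83e7, 0xd3 },
	{ 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc6 },
	{ 0x1002, 0x15dd, 0x1002, 0x15dd, 0x83 },
	{ 0, 0, 0, 0, 0 },
};

/* Return true only when all five IDs match a listed entry exactly. */
static bool should_disable_gfxoff(uint16_t vendor, uint16_t device,
				  uint16_t ss_vendor, uint16_t ss_device,
				  uint8_t rev)
{
	const struct gfxoff_quirk *q;

	for (q = quirk_list; q->chip_device != 0; ++q) {
		if (vendor == q->chip_vendor &&
		    device == q->chip_device &&
		    ss_vendor == q->subsys_vendor &&
		    ss_device == q->subsys_device &&
		    rev == q->revision)
			return true;
	}
	return false;
}
```

With an exact match like this, a board reporting 0x1002:0x15dd with
subsystem 0x1002:0x15dd and rev 0x83 is caught, while the same device
with a different RID or SSID is not, which is why a future unlisted
platform with the same sbios problem would still need its own entry.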

Alex


>
> Logs:
> [   27.708348] [drm] initializing kernel modesetting (RAVEN
> 0x1002:0x15DD 0x1002:0x15DD 0x83).
> [   27.789156] amdgpu: ATOM BIOS: 113-RAVEN-115
>
> Thanks in advance,
> Daniel
>
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 65db88bb6cbc..319d4b99aec8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -1243,6 +1243,8 @@ static const struct amdgpu_gfxoff_quirk amdgpu_gfxoff_quirk_list[] = {
>         { 0x1002, 0x15dd, 0x103c, 0x83e7, 0xd3 },
>         /* GFXOFF is unstable on C6 parts with a VBIOS 113-RAVEN-114 */
>         { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc6 },
> +       /* GFXOFF provokes a hw lockup on 83 parts with a VBIOS 113-RAVEN-115 */
> +       { 0x1002, 0x15dd, 0x1002, 0x15dd, 0x83 },
>         { 0, 0, 0, 0, 0 },
>  };
>
> --
> 2.30.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

