[PATCH] drm/amdgpu/gfx9: add gfxoff quirk
Alex Deucher
alexdeucher at gmail.com
Wed Mar 10 17:06:43 UTC 2021
On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez <daniel at qtec.com> wrote:
>
> Disabling GFXOFF via the quirk list fixes a hardware lockup in
> Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83.
>
> Signed-off-by: Daniel Gomez <daniel at qtec.com>
> ---
>
> This patch is a continuation of the work here:
> https://lkml.org/lkml/2021/2/3/122 where a hardware lockup was discussed and
> a dma_fence deadlock was provoked as a side effect. To reproduce the issue
> please refer to the above link.
>
> The hardware lockup was introduced in 5.6-rc1 for our particular revision as it
> wasn't part of the new blacklist. Before that, in kernel v5.5, this hardware was
> working fine without any hardware lockup because GFXOFF was actually disabled
> by the if condition for the CHIP_RAVEN case. So this patch adds the 'Radeon
> Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to disable GFXOFF.
>
> But besides the fix, I'd like to ask where this revision comes from. Is it
> an ASIC revision or is it hardcoded in the VBIOS from our vendor? From what I
> can see, it comes from the ASIC and I wonder if somehow we can get an APU in the
> future, 'not blacklisted', with the same problem. Then, should this table only
> filter for the vendor and device and not the revision? Do you know if any
> revisions of the 1002:15dd have been validated, tested and found functional?
The PCI revision ID (RID) identifies the specific SKU within a
family. GFXOFF is supposed to be working on all raven variants. It
was tested and functional on all reference platforms and any OEM
platforms that launched with Linux support. There are a lot of
dependencies on sbios in the early raven variants (0x15dd), so it's
likely more of a specific platform issue, but there is not a good way
to detect this, so we use the DID/SSID/RID as a proxy. The newer raven
variants (0x15d8) have much better GFXOFF support since they all
shipped with newer firmware and sbios.
Alex
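
For readers following along, the DID/SSID/RID proxy described above can be
sketched roughly as below. This is a userspace illustration only, not the
driver code: the struct name, field names and should_disable_gfxoff() are
hypothetical stand-ins, and the table holds only the entries visible in the
quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a quirk entry; field names are assumptions. */
struct gfxoff_quirk {
	uint16_t vendor;        /* PCI vendor ID */
	uint16_t device;        /* PCI device ID (DID) */
	uint16_t subsys_vendor; /* subsystem vendor ID */
	uint16_t subsys_device; /* subsystem device ID (SSID) */
	uint8_t revision;       /* PCI revision ID (RID) */
};

/* Only the entries shown in the quoted diff, plus the zero terminator. */
static const struct gfxoff_quirk quirk_list[] = {
	{ 0x1002, 0x15dd, 0x103c, 0x83e7, 0xd3 },
	{ 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc6 },
	{ 0x1002, 0x15dd, 0x1002, 0x15dd, 0x83 },
	{ 0, 0, 0, 0, 0 },
};

/*
 * Return true when every ID of the probed device matches a table entry.
 * The walk stops at the all-zero terminator.
 */
static bool should_disable_gfxoff(const struct gfxoff_quirk *dev)
{
	const struct gfxoff_quirk *q;

	for (q = quirk_list; q->device != 0; q++) {
		if (q->vendor == dev->vendor &&
		    q->device == dev->device &&
		    q->subsys_vendor == dev->subsys_vendor &&
		    q->subsys_device == dev->subsys_device &&
		    q->revision == dev->revision)
			return true;
	}
	return false;
}
```

Because all five fields must match in this sketch, a board reporting the same
vendor and device IDs but a different RID would not hit the table, which is
the gap Daniel's question points at.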
>
> Logs:
> [ 27.708348] [drm] initializing kernel modesetting (RAVEN
> 0x1002:0x15DD 0x1002:0x15DD 0x83).
> [ 27.789156] amdgpu: ATOM BIOS: 113-RAVEN-115
>
> Thanks in advance,
> Daniel
>
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 65db88bb6cbc..319d4b99aec8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -1243,6 +1243,8 @@ static const struct amdgpu_gfxoff_quirk amdgpu_gfxoff_quirk_list[] = {
> { 0x1002, 0x15dd, 0x103c, 0x83e7, 0xd3 },
> /* GFXOFF is unstable on C6 parts with a VBIOS 113-RAVEN-114 */
> { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc6 },
> + /* GFXOFF provokes a hw lockup on 83 parts with a VBIOS 113-RAVEN-115 */
> + { 0x1002, 0x15dd, 0x1002, 0x15dd, 0x83 },
> { 0, 0, 0, 0, 0 },
> };
>
> --
> 2.30.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel