[PATCH] drm/amd: Add a workaround for GFX11 systems that fail to flush TLB

Thu Dec 14 15:06:48 UTC 2023

On Thu, Dec 14, 2023 at 9:24 AM Liu, Shaoyun <Shaoyun.Liu at amd.com> wrote:
>
> [AMD Official Use Only - General]
>
> The gmc flush tlb function is used on both  baremetal and sriov.   But the  function  amdgpu_virt_kiq_reg_write_reg_wait is defined in amdgpu_virt.c with  name  'virt'  make it appear as a SRIOV only function, this sounds confusion . Will it make more sense to move the function out of amdgpu_virt.c file and  rename it as amdgpu_kig_reg_write_reg_wait ?
>
> Another thing I'm not sure is inside amdgpu_virt_kiq_reg_write_reg_wait , has below logic :
>         if (adev->mes.ring.sched.ready) {
>                 amdgpu_mes_reg_write_reg_wait(adev, reg0, reg1,
>                                               ref, mask);
>                 return;
>         }
> On MES enabled situation , it will always call to mes queue to do the register write and  wait .  Shouldn't this OP been directly send to kiq itself ?  The ring for kiq and  mes is different ,  driver should use kiq(adev->gfx.kiq[0].ring) for these register read/write or wait operation  and  mes ( adev->mes.ring) for add/remove queues  etc.
>

I understand why it is needed for SR-IOV.  Is there a reason to use
the MES or KIQ for TLB invalidation rather than the register method on
bare metal?  It looks like the register method is never used anymore.
 Seems like we should either, make the KIQ/MES method SR-IOV only, or
drop the register method and just always use KIQ/MES.

Alex

> Regards
> Shaoyun.liu
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Christian König
> Sent: Thursday, December 14, 2023 4:22 AM
> To: Alex Deucher <alexdeucher at gmail.com>; Limonciello, Mario <Mario.Limonciello at amd.com>
> Cc: Huang, Tim <Tim.Huang at amd.com>; amd-gfx at lists.freedesktop.org; Koenig, Christian <Christian.Koenig at amd.com>; stable at vger.kernel.org
> Subject: Re: [PATCH] drm/amd: Add a workaround for GFX11 systems that fail to flush TLB
>
> Am 13.12.23 um 20:44 schrieb Alex Deucher:
> > On Wed, Dec 13, 2023 at 2:32 PM Mario Limonciello
> > <mario.limonciello at amd.com> wrote:
> >> On 12/13/2023 13:12, Mario Limonciello wrote:
> >>> On 12/13/2023 13:07, Alex Deucher wrote:
> >>>> On Wed, Dec 13, 2023 at 1:00 PM Mario Limonciello
> >>>> <mario.limonciello at amd.com> wrote:
> >>>>> Some systems with MP1 13.0.4 or 13.0.11 have a firmware bug that
> >>>>> causes the first MES packet after resume to fail. This packet is
> >>>>> used to flush the TLB when GART is enabled.
> >>>>>
> >>>>> This issue is fixed in newer firmware, but as OEMs may not roll
> >>>>> this out to the field, introduce a workaround that will retry the
> >>>>> flush when detecting running on an older firmware and decrease
> >>>>> relevant error messages to debug while workaround is in use.
> >>>>>
> >>>>> Cc: stable at vger.kernel.org # 6.1+
> >>>>> Cc: Tim Huang <Tim.Huang at amd.com>
> >>>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3045
> >>>>> Signed-off-by: Mario Limonciello <mario.limonciello at amd.com>
> >>>>> ---
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 10 ++++++++--
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  2 ++
> >>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  | 17 ++++++++++++++++-
> >>>>>    drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  |  8 ++++++--
> >>>>>    4 files changed, 32 insertions(+), 5 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> >>>>> index 9ddbf1494326..6ce3f6e6b6de 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> >>>>> @@ -836,8 +836,14 @@ int amdgpu_mes_reg_write_reg_wait(struct
> >>>>> amdgpu_device *adev,
> >>>>>           }
> >>>>>
> >>>>>           r = adev->mes.funcs->misc_op(&adev->mes, &op_input);
> >>>>> -       if (r)
> >>>>> -               DRM_ERROR("failed to reg_write_reg_wait\n");
> >>>>> +       if (r) {
> >>>>> +               const char *msg = "failed to
> >>>>> + reg_write_reg_wait\n";
> >>>>> +
> >>>>> +               if (adev->mes.suspend_workaround)
> >>>>> +                       DRM_DEBUG(msg);
> >>>>> +               else
> >>>>> +                       DRM_ERROR(msg);
> >>>>> +       }
> >>>>>
> >>>>>    error:
> >>>>>           return r;
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> >>>>> index a27b424ffe00..90f2bba3b12b 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> >>>>> @@ -135,6 +135,8 @@ struct amdgpu_mes {
> >>>>>
> >>>>>           /* ip specific functions */
> >>>>>           const struct amdgpu_mes_funcs   *funcs;
> >>>>> +
> >>>>> +       bool                            suspend_workaround;
> >>>>>    };
> >>>>>
> >>>>>    struct amdgpu_mes_process {
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>> index 23d7b548d13f..e810c7bb3156 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>>> @@ -889,7 +889,11 @@ static int gmc_v11_0_gart_enable(struct
> >>>>> amdgpu_device *adev)
> >>>>>                   false : true;
> >>>>>
> >>>>>           adev->mmhub.funcs->set_fault_enable_default(adev, value);
> >>>>> -       gmc_v11_0_flush_gpu_tlb(adev, 0, AMDGPU_MMHUB0(0), 0);
> >>>>> +
> >>>>> +       do {
> >>>>> +               gmc_v11_0_flush_gpu_tlb(adev, 0, AMDGPU_MMHUB0(0), 0);
> >>>>> +               adev->mes.suspend_workaround = false;
> >>>>> +       } while (adev->mes.suspend_workaround);
> >>>> Shouldn't this be something like:
> >>>>
> >>>>> +       do {
> >>>>> +               gmc_v11_0_flush_gpu_tlb(adev, 0, AMDGPU_MMHUB0(0), 0);
> >>>>> +               adev->mes.suspend_workaround = false;
> >>>>> +               gmc_v11_0_flush_gpu_tlb(adev, 0, AMDGPU_MMHUB0(0), 0);
> >>>>> +       } while (adev->mes.suspend_workaround);
> >>>> If we actually need the flush.  Maybe a better approach would be to
> >>>> check if we are in s0ix in
> >>> Ah you're right; I had shifted this around to keep less stateful
> >>> variables and push them up the stack from when I first made it and
> >>> that logic is wrong now.
> >>>
> >>> I don't think the one you suggested is right either; it's going to
> >>> apply twice on ASICs that only need it once.
> >>>
> >>> I guess pending on what Christian comments on below I'll respin to
> >>> logic that only calls twice on resume for these ASICs.
> >> One more comment.  Tim and I both did an experiment for this
> >> (skipping the flush) separately.  The problem isn't the flush itself,
> >> rather it's the first MES packet after exiting GFXOFF.
>
> Well that's an ugly one. Can that happen every time GFXOFF kicks in?
>
> >>
> >> So it seems that it pushes off the issue to the next thing which is a
> >> ring buffer test:
> >>
> >> [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on
> >> comp_1.0.0 (-110).
> >> [drm:process_one_work] *ERROR* ib ring test failed (-110).
> >>
> >> So maybe a better workaround is a "dummy" command that is only sent
> >> for the broken firmware that we don't care about the outcome and discard errors.
> >>
> >> Then the workaround doesn't need to get as entangled with correct flow.
> > Yeah. Something like that seems cleaner.  Just a question of where to
> > put it since we skip GC and MES for s0ix.  Probably somewhere in
> > gmc_v11_0_resume() before gmc_v11_0_gart_enable().  Maybe add a new
> > mes callback.
>
> Please try to keep it completely outside of the TLB invalidation and VM flush handling.
>
> Regards,
> Christian.
>
> >
> > Alex
> >
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c in gmc_v11_0_flush_gpu_tlb():
> >>>> index 23d7b548d13f..bd6d9953a80e 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >>>> @@ -227,7 +227,8 @@ static void gmc_v11_0_flush_gpu_tlb(struct
> >>>> amdgpu_device *adev, uint32_t vmid,
> >>>>            * Directly use kiq to do the vm invalidation instead
> >>>>            */
> >>>>           if ((adev->gfx.kiq[0].ring.sched.ready ||
> >>>> adev->mes.ring.sched.ready) &&
> >>>> -           (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
> >>>> +           (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev)) ||
> >>>> +           !adev->in_s0ix) {
> >>>>                   amdgpu_virt_kiq_reg_write_reg_wait(adev, req,
> >>>> ack, inv_req,
> >>>>                                   1 << vmid, GET_INST(GC, 0));
> >>>>                   return;
> >>>>
> >>>> @Christian Koenig is this logic correct?
> >>>>
> >>>>           /* For SRIOV run time, driver shouldn't access the
> >>>> register through MMIO
> >>>>            * Directly use kiq to do the vm invalidation instead
> >>>>            */
> >>>>           if ((adev->gfx.kiq[0].ring.sched.ready ||
> >>>> adev->mes.ring.sched.ready) &&
> >>>>               (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
> >>>>                   amdgpu_virt_kiq_reg_write_reg_wait(adev, req,
> >>>> ack, inv_req,
> >>>>                                   1 << vmid, GET_INST(GC, 0));
> >>>>                   return;
> >>>>           }
> >>>>
> >>>> We basically always use the MES with that logic.  If that is the
> >>>> case, we should just drop the rest of that function.  Shouldn't we
> >>>> only use KIQ or MES for SR-IOV?  gmc v10 has similar logic which
> >>>> also seems wrong.
> >>>>
> >>>> Alex
> >>>>
> >>>>
> >>>>>           DRM_INFO("PCIE GART of %uM enabled (table at 0x%016llX).\n",
> >>>>>                    (unsigned int)(adev->gmc.gart_size >> 20), @@
> >>>>> -960,6 +964,17 @@ static int gmc_v11_0_resume(void *handle)
> >>>>>           int r;
> >>>>>           struct amdgpu_device *adev = (struct amdgpu_device
> >>>>> *)handle;
> >>>>>
> >>>>> +       switch (amdgpu_ip_version(adev, MP1_HWIP, 0)) {
> >>>>> +       case IP_VERSION(13, 0, 4):
> >>>>> +       case IP_VERSION(13, 0, 11):
> >>>>> +               /* avoid problems with first TLB flush after resume */
> >>>>> +               if ((adev->pm.fw_version & 0x00FFFFFF) < 0x004c4900)
> >>>>> +                       adev->mes.suspend_workaround = adev->in_s0ix;
> >>>>> +               break;
> >>>>> +       default:
> >>>>> +               break;
> >>>>> +       }
> >>>>> +
> >>>>>           r = gmc_v11_0_hw_init(adev);
> >>>>>           if (r)
> >>>>>                   return r;
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> >>>>> b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> >>>>> index 4dfec56e1b7f..84ab8c611e5e 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> >>>>> @@ -137,8 +137,12 @@ static int
> >>>>> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> >>>>>           r = amdgpu_fence_wait_polling(ring, ring->fence_drv.sync_seq,
> >>>>>                         timeout);
> >>>>>           if (r < 1) {
> >>>>> -               DRM_ERROR("MES failed to response msg=%d\n",
> >>>>> -                         x_pkt->header.opcode);
> >>>>> +               if (mes->suspend_workaround)
> >>>>> +                       DRM_DEBUG("MES failed to response msg=%d\n",
> >>>>> +                                 x_pkt->header.opcode);
> >>>>> +               else
> >>>>> +                       DRM_ERROR("MES failed to response msg=%d\n",
> >>>>> +                                 x_pkt->header.opcode);
> >>>>>
> >>>>>                   while (halt_if_hws_hang)
> >>>>>                           schedule();
> >>>>> --
> >>>>> 2.34.1
> >>>>>
>