[PATCH] drm/amdgpu: Fix page table setup on Arcturus

Thu Aug 25 15:49:20 UTC 2022

[AMD Official Use Only - General]


> -----Original Message-----
> From: Alex Deucher <alexdeucher at gmail.com>
> Sent: Thursday, August 25, 2022 11:26 AM
> To: Joshi, Mukul <Mukul.Joshi at amd.com>
> Cc: amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
> 
> [CAUTION: External Email]
> 
> On Thu, Aug 25, 2022 at 10:49 AM Joshi, Mukul <Mukul.Joshi at amd.com>
> wrote:
> >
> >
> > > -----Original Message-----
> > > From: Alex Deucher <alexdeucher at gmail.com>
> > > Sent: Thursday, August 25, 2022 9:33 AM
> > > To: Joshi, Mukul <Mukul.Joshi at amd.com>
> > > Cc: amd-gfx at lists.freedesktop.org
> > > Subject: Re: [PATCH] drm/amdgpu: Fix page table setup on Arcturus
> > >
> > > On Mon, Aug 22, 2022 at 11:53 AM Mukul Joshi <mukul.joshi at amd.com>
> > > wrote:
> > > >
> > > > When translate_further is enabled, page table depth needs to be
> > > > updated. This was missing on Arcturus MMHUB init. This was causing
> > > > address translations to fail for SDMA user-mode queues.
> > > >
> > >
> > > Do other mmhub implementations need a similar fix?  It looks like
> > > some of them are missing similar changes.
> > >
> >
> > I am not sure if there is a plan to enable translate_further on other ASICs.
> > For now, it's only enabled for Arcturus, Aldebaran and Raven.
> > If we plan to enable it on other ASICs, then yes, the other mmhub
> > implementations would need similar changes.
> 
> It would be nice to fix them up preemptively so that if we ever enable it on
> another ASIC, it will just work.
> 
Sure, I can take a look at all the mmhub and gfxhub implementations and send out a
patch for the ones that are missing this page table setup change when
translate_further is enabled.

Regards,
Mukul

> Alex
> 
> 
> >
> > Regards,
> > Mukul
> >
> > > Alex
> > >
> > > > > Fixes: 2abf2573b1c69 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
> > > > Signed-off-by: Mukul Joshi <mukul.joshi at amd.com>
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 12 ++++++++++--
> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > >
> > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > > index 6e0145b2b408..445cb06b9d26 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
> > > > @@ -295,9 +295,17 @@ static void
> > > > mmhub_v9_4_disable_identity_aperture(struct amdgpu_device
> *adev,
> > > > static void mmhub_v9_4_setup_vmid_config(struct amdgpu_device
> > > *adev, int hubid)  {
> > > >         struct amdgpu_vmhub *hub = &adev-
> >vmhub[AMDGPU_MMHUB_0];
> > > > +       unsigned int num_level, block_size;
> > > >         uint32_t tmp;
> > > >         int i;
> > > >
> > > > +       num_level = adev->vm_manager.num_level;
> > > > +       block_size = adev->vm_manager.block_size;
> > > > +       if (adev->gmc.translate_further)
> > > > +               num_level -= 1;
> > > > +       else
> > > > +               block_size -= 9;
> > > > +
> > > >         for (i = 0; i <= 14; i++) {
> > > >                 tmp = RREG32_SOC15_OFFSET(MMHUB, 0,
> > > mmVML2VC0_VM_CONTEXT1_CNTL,
> > > >                                 hubid *
> > > > MMHUB_INSTANCE_REGISTER_OFFSET
> > > > + i); @@ -305,7 +313,7 @@ static void
> > > mmhub_v9_4_setup_vmid_config(struct amdgpu_device *adev, int
> hubid)
> > > >                                     ENABLE_CONTEXT, 1);
> > > >                 tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > >                                     PAGE_TABLE_DEPTH,
> > > > -                                   adev->vm_manager.num_level);
> > > > +                                   num_level);
> > > >                 tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > >                                     RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> > > >                 tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > > @@
> > > > -323,7 +331,7 @@ static void mmhub_v9_4_setup_vmid_config(struct
> > > amdgpu_device *adev, int hubid)
> > > >                                     EXECUTE_PROTECTION_FAULT_ENABLE_DEFAULT, 1);
> > > >                 tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > >                                     PAGE_TABLE_BLOCK_SIZE,
> > > > -                                   adev->vm_manager.block_size - 9);
> > > > +                                   block_size);
> > > >                 /* Send no-retry XNACK on fault to suppress VM fault storm. */
> > > >                 tmp = REG_SET_FIELD(tmp, VML2VC0_VM_CONTEXT1_CNTL,
> > > >
> > > > RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
> > > > --
> > > > 2.35.1
> > > >