[PATCH] drm/amdkfd: enable heavy-weight TLB flush on Arcturus

Felix Kuehling felix.kuehling at amd.com
Wed Jan 19 00:16:06 UTC 2022


Am 2022-01-18 um 7:08 p.m. schrieb Russell, Kent:
> One question inline
>
>
> *KENT RUSSELL***  
>
> Sr. Software Engineer | Linux Compute Kernel
>
> 1 Commerce Valley Drive East
>
> Markham, ON L3T 7X6
>
> *O*+(1) 289-695-2122**| Ext 72122
>
>
> ------------------------------------------------------------------------
> *From:* amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of
> Felix Kuehling <felix.kuehling at amd.com>
> *Sent:* Tuesday, January 18, 2022 6:36 PM
> *To:* Huang, JinHuiEric <JinHuiEric.Huang at amd.com>;
> amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/amdkfd: enable heavy-weight TLB flush on
> Arcturus
>  
> Am 2022-01-18 um 5:45 p.m. schrieb Eric Huang:
> > SDMA FW fixes the hang issue for adding heavy-weight TLB
> > flush on Arcturus, so we can enable it.
> >
> > Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
>
> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |  6 ------
> >  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c         | 10 ++++++++--
> >  2 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index a64cbbd943ba..acb4fd973e60 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -1892,12 +1892,6 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
> >                                true);
> >        ret = unreserve_bo_and_vms(&ctx, false, false);
>> > -     /* Only apply no TLB flush on Aldebaran to
> > -      * workaround regressions on other Asics.
> > -      */
> > -     if (table_freed && (adev->asic_type != CHIP_ALDEBARAN))
> > -             *table_freed = true;
> > -
> >        goto out;
>> >  out_unreserve:
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > index b570c0454ce9..485d4c52c7de 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > @@ -1596,6 +1596,12 @@ static int
> kfd_ioctl_free_memory_of_gpu(struct file *filep,
> >        return ret;
> >  }
>> > +static bool kfd_flush_tlb_after_unmap(struct kfd_dev *dev) {
> > +     return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2)
>
> Do we need to add a check for sdma ver >=8 here?

What's the significance of version 8 for Aldebaran? This code was
working on Aldebaran without a version check before. Did we ever
publicly release an SDMA firmware older than version 8 that for Aldebaran?

Regards,
  Felix


>  
> ||
> > +            (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1) &&
> > +             dev->adev->sdma.instance[0].fw_version >= 18);
> > +}
> > +
> >  static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
> >                                        struct kfd_process *p, void
> *data)
> >  {
> > @@ -1692,7 +1698,7 @@ static int kfd_ioctl_map_memory_to_gpu(struct
> file *filep,
> >        }
>> >        /* Flush TLBs after waiting for the page table updates to
> complete */
> > -     if (table_freed) {
> > +     if (table_freed || !kfd_flush_tlb_after_unmap(dev)) {
> >                for (i = 0; i < args->n_devices; i++) {
> >                        peer = kfd_device_by_id(devices_arr[i]);
> >                        if (WARN_ON_ONCE(!peer))
> > @@ -1806,7 +1812,7 @@ static int
> kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
> >        }
> >        mutex_unlock(&p->mutex);
>> > -     if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2)) {
> > +     if (kfd_flush_tlb_after_unmap(dev)) {
> >                err = amdgpu_amdkfd_gpuvm_sync_memory(dev->adev,
> >                                (struct kgd_mem *) mem, true);
> >                if (err) {


More information about the amd-gfx mailing list