[PATCH] drm/amdkfd: enable heavy-weight TLB flush on Arcturus
Alex Deucher
alexdeucher at gmail.com
Tue Jan 18 20:15:09 UTC 2022
On Tue, Jan 18, 2022 at 3:13 PM Eric Huang <jinhuieric.huang at amd.com> wrote:
>
> I understand Alex's concern. I think usually we only check the version
> when a feature is only available in a specific version, and other
> version or newer version doesn't have.
>
> In case of FW fix, we assume the driver and FWs have to be synchronous.
> If we have driver backward compatibility for FWs, there must be a lot of
> redundant codes for FW version check. So this patch and SDMA fix will be
> pushed into ROCm 5.1 release branch at the same time.
We need to check. The firmwares are not distributed in lock step with
the driver.
Alex
>
> Regards,
> Eric
>
> On 2022-01-18 14:35, Alex Deucher wrote:
> > On Tue, Jan 18, 2022 at 2:27 PM Russell, Kent <Kent.Russell at amd.com> wrote:
> >> [AMD Official Use Only]
> >>
> >> I think what he means is that if we are using SDMA v17, this will cause issues, won't it? Should we check that SDMA version is >=18 before enabling it? Or am I misunderstanding the fix?
> > Yes, that was my concern.
> >
> > Alex
> >
> >> Kent
> >>
> >>> -----Original Message-----
> >>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Eric Huang
> >>> Sent: Tuesday, January 18, 2022 2:09 PM
> >>> To: Alex Deucher <alexdeucher at gmail.com>
> >>> Cc: amd-gfx list <amd-gfx at lists.freedesktop.org>
> >>> Subject: Re: [PATCH] drm/amdkfd: enable heavy-weight TLB flush on Arcturus
> >>>
> >>> The SDMA fix is generic and not in a specific version of FW, so we don't
> >>> have to check.
> >>>
> >>> Thanks,
> >>> Eric
> >>>
> >>> On 2022-01-18 11:35, Alex Deucher wrote:
> >>>> On Tue, Jan 18, 2022 at 11:16 AM Eric Huang <jinhuieric.huang at amd.com> wrote:
> >>>>> SDMA FW fixes the hang issue for adding heavy-weight TLB
> >>>>> flush on Arcturus, so we can enable it.
> >>>> Do we need to check for a specific fw version?
> >>>>
> >>>> Alex
> >>>>
> >>>>> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
> >>>>> ---
> >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 8 +++++---
> >>>>> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++-
> >>>>> 2 files changed, 7 insertions(+), 4 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >>>>> index a64cbbd943ba..7b24a920c12e 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >>>>> @@ -1892,10 +1892,12 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
> >>>>> true);
> >>>>> ret = unreserve_bo_and_vms(&ctx, false, false);
> >>>>>
> >>>>> - /* Only apply no TLB flush on Aldebaran to
> >>>>> - * workaround regressions on other Asics.
> >>>>> + /* Only apply no TLB flush on Aldebaran and Arcturus
> >>>>> + * to workaround regressions on other Asics.
> >>>>> */
> >>>>> - if (table_freed && (adev->asic_type != CHIP_ALDEBARAN))
> >>>>> + if (table_freed &&
> >>>>> + (adev->asic_type != CHIP_ALDEBARAN) &&
> >>>>> + (adev->asic_type != CHIP_ARCTURUS))
> >>>>> *table_freed = true;
> >>>>>
> >>>>> goto out;
> >>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> >>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> >>>>> index b570c0454ce9..ef4d676cc71c 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> >>>>> @@ -1806,7 +1806,8 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file
> >>> *filep,
> >>>>> }
> >>>>> mutex_unlock(&p->mutex);
> >>>>>
> >>>>> - if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2)) {
> >>>>> + if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2) ||
> >>>>> + KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1)) {
> >>>>> err = amdgpu_amdkfd_gpuvm_sync_memory(dev->adev,
> >>>>> (struct kgd_mem *) mem, true);
> >>>>> if (err) {
> >>>>> --
> >>>>> 2.25.1
> >>>>>
>
More information about the amd-gfx
mailing list