[PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

Zhang, Yifan Yifan1.Zhang at amd.com
Mon Oct 4 09:42:28 UTC 2021


[AMD Official Use Only]

Hi Felix,

After sync w/ James, we agree that this patch series could fix both our problems, and he verified this patch series will not cause regression of his previous issue. Do you have more comments regarding this patch series ? Thanks.

BRs,
Yifan

From: Zhu, James <James.Zhu at amd.com>
Sent: Wednesday, September 29, 2021 9:19 PM
To: Kuehling, Felix <Felix.Kuehling at amd.com>; Zhang, Yifan <Yifan1.Zhang at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init


[AMD Official Use Only]

H Felix,

Since the previous patch can help on PCO suspend/resume hung issue. Let me work with YiFan to see if
there is proper way to cover both cases.


Thanks & Best Regards!



James Zhu

________________________________
From: Kuehling, Felix <Felix.Kuehling at amd.com<mailto:Felix.Kuehling at amd.com>>
Sent: Tuesday, September 28, 2021 11:41 AM
To: Zhang, Yifan <Yifan1.Zhang at amd.com<mailto:Yifan1.Zhang at amd.com>>; amd-gfx at lists.freedesktop.org<mailto:amd-gfx at lists.freedesktop.org> <amd-gfx at lists.freedesktop.org<mailto:amd-gfx at lists.freedesktop.org>>; Zhu, James <James.Zhu at amd.com<mailto:James.Zhu at amd.com>>
Subject: Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

[+James]

This basically undoes James's change "drm/amdgpu: move iommu_resume
before ip init/resume". I assume James made his change for a reason. Can
you please discuss the issue with him and determine a solution that
solves both your problem and his?

If James' patch series was a mistake, I'd prefer to revert his patches,
because his patches complicated the initialization sequence and exposed
the iommu init sequence in amdgpu.

Thanks,
  Felix


Am 2021-09-28 um 4:28 a.m. schrieb Yifan Zhang:
> This patch is to fix clinfo failure in Raven/Picasso:
>
> Number of platforms: 1
>   Platform Profile: FULL_PROFILE
>   Platform Version: OpenCL 2.2 AMD-APP (3364.0)
>   Platform Name: AMD Accelerated Parallel Processing
>   Platform Vendor: Advanced Micro Devices, Inc.
>   Platform Extensions: cl_khr_icd cl_amd_event_callback
>
>   Platform Name: AMD Accelerated Parallel Processing Number of devices: 0
>
> Signed-off-by: Yifan Zhang <yifan1.zhang at amd.com<mailto:yifan1.zhang at amd.com>>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 4c8f2f4647c0..89ed9b091386 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2393,10 +2393,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>        if (r)
>                goto init_failed;
>
> -     r = amdgpu_amdkfd_resume_iommu(adev);
> -     if (r)
> -             goto init_failed;
> -
>        r = amdgpu_device_ip_hw_init_phase1(adev);
>        if (r)
>                goto init_failed;
> @@ -2435,6 +2431,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>        if (!adev->gmc.xgmi.pending_reset)
>                amdgpu_amdkfd_device_init(adev);
>
> +     r = amdgpu_amdkfd_resume_iommu(adev);
> +     if (r)
> +             goto init_failed;
> +
>        amdgpu_fru_get_product_info(adev);
>
>  init_failed:
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20211004/30e26375/attachment.htm>


More information about the amd-gfx mailing list