[PATCH] drm/amdgpu: Fix SDMA engine resume issue under SRIOV
Alex Deucher
alexdeucher at gmail.com
Thu Oct 6 19:56:18 UTC 2022
On Thu, Oct 6, 2022 at 2:11 PM Zhang, Bokun <Bokun.Zhang at amd.com> wrote:
>
> [AMD Official Use Only - General]
>
> Hey guys,
> Please help review this patch for the suspend and resume issue.
> I have tested it in a multi-VF environment, and I think it is OK.
Seems a little hacky, but I think that's the least intrusive option for
stable. How about the attached patches?
Alex
>
> Thanks!
>
> -----Original Message-----
> From: Bokun Zhang <Bokun.Zhang at amd.com>
> Sent: Thursday, October 6, 2022 2:09 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Zhang, Bokun <Bokun.Zhang at amd.com>
> Subject: [PATCH] drm/amdgpu: Fix SDMA engine resume issue under SRIOV
>
> - Under SRIOV, SDMA engine is shared between VFs. Therefore,
> we will not stop SDMA during hw_fini. This is not an issue
> with normal driver loading and unloading.
>
> - However, when we put the SDMA engine to suspend state and resume
> it, the issue starts to show up. Something could attempt to use
> that SDMA engine to clear or move memory before the engine is
> initialized since the DRM entity is still there.
>
> - Therefore, we will call sdma_v5_2_enable(false) during hw_fini,
> and if we are under SRIOV, we will call sdma_v5_2_enable(true)
> afterwards to allow other VFs to use SDMA. This way, the DRM
> entity of SDMA engine is emptied and it will follow the flow
> of resume code path.
>
> Signed-off-by: Bokun Zhang <Bokun.Zhang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> index f136fec7b4f4..3eaf1a573e73 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> @@ -1357,12 +1357,19 @@ static int sdma_v5_2_hw_fini(void *handle) {
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> - if (amdgpu_sriov_vf(adev))
> - return 0;
> -
> + /*
> + * Under SRIOV, a VF cannot unilaterally stop the shared SDMA engine.
> + * However, we still need to clean up the DRM entity.
> + * Therefore, we will re-enable SDMA afterwards.
> + */
> sdma_v5_2_ctx_switch_enable(adev, false);
> sdma_v5_2_enable(adev, false);
>
> + if (amdgpu_sriov_vf(adev)) {
> + sdma_v5_2_enable(adev, true);
> + sdma_v5_2_ctx_switch_enable(adev, true);
> + }
> +
> return 0;
> }
>
> --
> 2.34.1
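
For anyone following the flow in the commit message, here is a minimal standalone sketch of the patched hw_fini sequence. It is not driver code: struct mock_device and the mock_* helpers are hypothetical stand-ins for amdgpu_device, sdma_v5_2_ctx_switch_enable() and sdma_v5_2_enable(), and the entity_ready flag is only an assumed model of "the DRM entity of SDMA engine is emptied".

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_device: only the state this sketch tracks. */
struct mock_device {
	bool is_sriov_vf;        /* models amdgpu_sriov_vf(adev) */
	bool sdma_enabled;       /* engine running or halted */
	bool ctx_switch_enabled; /* context switching on or off */
	bool entity_ready;       /* assumption: true while the DRM entity may submit SDMA jobs */
};

/* Mock of sdma_v5_2_ctx_switch_enable(): records the requested state. */
static void mock_ctx_switch_enable(struct mock_device *dev, bool enable)
{
	dev->ctx_switch_enabled = enable;
}

/*
 * Mock of sdma_v5_2_enable(). Assumption for this sketch: disabling the
 * engine also tears down the DRM entity, and re-enabling does not restore it.
 */
static void mock_sdma_enable(struct mock_device *dev, bool enable)
{
	dev->sdma_enabled = enable;
	if (!enable)
		dev->entity_ready = false;
}

/* Mirrors the patched hw_fini flow from the diff above. */
static int mock_hw_fini(struct mock_device *dev)
{
	mock_ctx_switch_enable(dev, false);
	mock_sdma_enable(dev, false);

	if (dev->is_sriov_vf) {
		/* Hand the shared engine back so other VFs can keep using it. */
		mock_sdma_enable(dev, true);
		mock_ctx_switch_enable(dev, true);
	}
	return 0;
}
```

Under these assumptions, a VF ends hw_fini with the engine running again but the entity marked not ready, so nothing should be submitted to SDMA until the resume path has reinitialized it. On bare metal the engine simply stays halted.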
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-drm-amdgpu-switch-sdma-buffer-function-tear-down-to-.patch
Type: text/x-patch
Size: 10649 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20221006/50d19d9c/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amdgpu-Fix-SDMA-engine-resume-issue-under-SRIOV.patch
Type: text/x-patch
Size: 1945 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20221006/50d19d9c/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-drm-amdgpu-fix-SDMA-suspend-resume-on-SR-IOV.patch
Type: text/x-patch
Size: 3426 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20221006/50d19d9c/attachment-0002.bin>