[PATCH] drm/amdgpu: Fix SDMA engine resume issue under SRIOV
Bokun Zhang
Bokun.Zhang at amd.com
Thu Oct 6 18:08:38 UTC 2022
- Under SRIOV, the SDMA engine is shared between VFs. Therefore,
we do not stop SDMA during hw_fini. This is not an issue
with normal driver loading and unloading.
- However, when we put the SDMA engine into the suspend state and
resume it, the issue shows up. Something could attempt to use
that SDMA engine to clear or move memory before the engine is
initialized, since the DRM entity is still there.
- Therefore, we call sdma_v5_2_enable(false) during hw_fini,
and if we are under SRIOV, we call sdma_v5_2_enable(true)
afterwards to allow other VFs to use SDMA. This way, the DRM
entity of the SDMA engine is emptied and it follows the flow
of the resume code path.
Signed-off-by: Bokun Zhang <Bokun.Zhang at amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index f136fec7b4f4..3eaf1a573e73 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -1357,12 +1357,19 @@ static int sdma_v5_2_hw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	if (amdgpu_sriov_vf(adev))
-		return 0;
-
+	/*
+	 * Under SRIOV, the VF cannot single-mindedly stop SDMA engine
+	 * However, we still need to clean up the DRM entity
+	 * Therefore, we will re-enable SDMA afterwards.
+	 */
 	sdma_v5_2_ctx_switch_enable(adev, false);
 	sdma_v5_2_enable(adev, false);
+	if (amdgpu_sriov_vf(adev)) {
+		sdma_v5_2_enable(adev, true);
+		sdma_v5_2_ctx_switch_enable(adev, true);
+	}
+
 	return 0;
 }
--
2.34.1
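
For readers who want to trace the ordering outside the kernel tree, here is a
minimal user-space sketch of the patched hw_fini flow. It is an illustration
only, not driver code: struct mock_device and every mock_* function are
hypothetical stand-ins for struct amdgpu_device, amdgpu_sriov_vf(),
sdma_v5_2_ctx_switch_enable() and sdma_v5_2_enable(), and the booleans merely
model register state that the real functions program on hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct amdgpu_device; not the real kernel type. */
struct mock_device {
	bool is_sriov_vf;        /* models the result of amdgpu_sriov_vf(adev) */
	bool sdma_enabled;       /* models the engine enable state */
	bool ctx_switch_enabled; /* models the context-switch enable state */
	int disable_count;       /* number of times the engine was disabled */
};

/* Models sdma_v5_2_ctx_switch_enable(); only records the requested state. */
static void mock_ctx_switch_enable(struct mock_device *dev, bool enable)
{
	dev->ctx_switch_enabled = enable;
}

/* Models sdma_v5_2_enable(); counts disables so the flow can be checked. */
static void mock_sdma_enable(struct mock_device *dev, bool enable)
{
	if (!enable)
		dev->disable_count++;
	dev->sdma_enabled = enable;
}

/* Mirrors the call ordering of the patched sdma_v5_2_hw_fini(). */
static int mock_hw_fini(struct mock_device *dev)
{
	assert(dev != NULL);

	/* Always go through the disable path, VF or not. */
	mock_ctx_switch_enable(dev, false);
	mock_sdma_enable(dev, false);

	/* On a VF, turn the shared engine back on for the other VFs. */
	if (dev->is_sriov_vf) {
		mock_sdma_enable(dev, true);
		mock_ctx_switch_enable(dev, true);
	}

	return 0;
}
```

Calling mock_hw_fini() on a device with is_sriov_vf set runs the disable path
once and leaves both flags enabled again, while a device without it ends with
both flags cleared. The sketch says nothing about how the DRM entity is
emptied; it only shows the enable/disable ordering introduced by the patch.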