[PATCH V2] Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute" for Raven

jesse.zhang at amd.com jesse.zhang at amd.com
Wed Feb 28 08:43:16 UTC 2024


From: "Jesse.Zhang" <Jesse.Zhang at amd.com>

Fix the issue:
"amdgpu: Failed to create process VM object".

[Why]
When amdgpu initializes, seq64 creates its mapping and updates the BO
mapping in the VM page table. When clinfo runs, it also initializes a VM
for a process device through kfd_process_device_init_vm(), which ensures
that the root PD is clean through amdgpu_vm_pt_is_root_clean(). The two
conflict, so clinfo always fails.

[How]
Skip the seq64 entry when checking the VM page table.

Signed-off-by: Jesse Zhang <Jesse.Zhang at amd.com>
---
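Note: the index computation and the skip in the loop can be sketched in
standalone C as below. This is only an illustration under stated
assumptions, not the driver code: every sketch_/SKETCH_ name is
hypothetical, and the page shift, reserved size, level shift and mask are
example values rather than the ones a given ASIC uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed example values; the real ones come from the amdgpu headers. */
#define SKETCH_GPU_PAGE_SHIFT	12		/* 4 KiB GPU pages */
#define SKETCH_VA_RESERVED_TOP	(2ULL << 20)	/* 2 MiB reserved at the top */

/*
 * Return the root page-directory index that covers the byte address addr.
 * level_shift is the number of page-frame bits translated below the root
 * level, and mask limits the result to the number of root entries.
 */
static unsigned int sketch_root_pd_index(uint64_t addr,
					 unsigned int level_shift,
					 uint32_t mask)
{
	uint64_t pfn = addr >> SKETCH_GPU_PAGE_SHIFT;

	return (unsigned int)((pfn >> level_shift) & mask);
}

/*
 * Return true if no root entry other than skip_idx is in use.
 * used[] stands in for the per-entry BO pointers of the root PD.
 */
static bool sketch_root_is_clean(const int *used, unsigned int entries,
				 unsigned int skip_idx)
{
	unsigned int i;

	for (i = 0; i < entries; i++) {
		if (i == skip_idx)
			continue;
		if (used[i])
			return false;
	}
	return true;
}
```

With an assumed 48-bit address space (max_pfn = 1 << 36), a root level
shift of 27 and a mask of 0x1ff, the reserved range lands in the last root
entry, so a root PD whose only populated entry is that one is still
reported as clean.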
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
index a160265ddc07..bdae5381887e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
@@ -746,8 +746,21 @@ bool amdgpu_vm_pt_is_root_clean(struct amdgpu_device *adev,
 	enum amdgpu_vm_level root = adev->vm_manager.root_level;
 	unsigned int entries = amdgpu_vm_pt_num_entries(adev, root);
 	unsigned int i = 0;
+	u64 seq64_addr = (adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT) - AMDGPU_VA_RESERVED_TOP;
+	unsigned int shift = amdgpu_vm_pt_level_shift(adev, root);
+	unsigned int mask = amdgpu_vm_pt_entries_mask(adev, root);
+	unsigned int seq64_entry;
+
+	seq64_entry = ((seq64_addr / AMDGPU_GPU_PAGE_SIZE) >> shift) & mask;
 
 	for (i = 0; i < entries; i++) {
+		/*
+		 * seq64 reserves 2M of memory at the top of the address space
+		 * and is mapped into the VM page table when amdgpu initializes,
+		 * so skip this known root PD entry.
+		 */
+		if (i == seq64_entry)
+			continue;
 		if (to_amdgpu_bo_vm(vm->root.bo)->entries[i].bo)
 			return false;
 	}
-- 
2.34.1
