[PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault

Mon Nov 18 22:24:35 UTC 2019

Only for the debugger use case.

[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the debugger case to let the wave to
enter trap handler to save context.

[how]
Intentionally using an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault, after a retry-fault happens. This is
only valid under compute context.

Change-Id: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
Signed-off-by: Alex Sierra <alex.sierra at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index d51ac8771ae0..358a4f50fcfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3207,6 +3207,12 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
 		value = adev->dummy_page_addr;
 		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
 			AMDGPU_PTE_WRITEABLE;
+
+		if (vm->is_compute_context) {
+			/* Setting PTE flags to trigger a no-retry-fault  */
+			flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
+				AMDGPU_PTE_TF;
+		}
 	} else {
 		/* Let the hw retry silently on the PTE */
 		value = 0;
-- 
2.17.1