[PATCH 5/5] drm/amdkfd: enable subsequent retry fault
Philip Yang
Philip.Yang at amd.com
Mon Apr 26 21:26:31 UTC 2021
After draining a stale retry fault, or after failing to validate the
range for recovery, remove the fault address from the fault filter ring
so that a subsequent retry interrupt on the same address can be handled.
Otherwise the retry fault is not processed for recovery until the filter
timeout has passed.
Signed-off-by: Philip Yang <Philip.Yang at amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
---
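The reasoning in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the amdgpu implementation: the names fault_filter, filter_fault and filter_remove, the ring size, and the timeout value are all hypothetical, and time is passed in explicitly as a plain counter. It shows why an entry left in the filter suppresses a later retry on the same (addr, pasid) until the timeout expires, and how removing the entry lets the next retry through immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FILTER_SLOTS   8     /* hypothetical ring size */
#define FILTER_TIMEOUT 5000  /* hypothetical timeout, arbitrary time units */

struct fault_entry {
	bool     valid;
	uint64_t addr;
	uint16_t pasid;
	uint64_t timestamp;
};

struct fault_filter {
	struct fault_entry ring[FILTER_SLOTS];
	unsigned int next;  /* next slot to overwrite */
};

static void filter_init(struct fault_filter *f)
{
	assert(f != NULL);
	memset(f, 0, sizeof(*f));
}

/*
 * Return true if (addr, pasid) was already seen within the timeout, in
 * which case the caller drops the fault as a duplicate. Otherwise record
 * it in the ring and return false so the caller handles it.
 */
static bool filter_fault(struct fault_filter *f, uint64_t addr,
			 uint16_t pasid, uint64_t now)
{
	struct fault_entry *e;
	unsigned int i;

	for (i = 0; i < FILTER_SLOTS; i++) {
		e = &f->ring[i];
		if (e->valid && e->addr == addr && e->pasid == pasid &&
		    now - e->timestamp < FILTER_TIMEOUT)
			return true;
	}

	/* Not found or expired: overwrite the oldest slot. */
	e = &f->ring[f->next];
	f->next = (f->next + 1) % FILTER_SLOTS;
	e->valid = true;
	e->addr = addr;
	e->pasid = pasid;
	e->timestamp = now;
	return false;
}

/* Drop every entry for (addr, pasid) so the next fault is not filtered. */
static void filter_remove(struct fault_filter *f, uint64_t addr,
			  uint16_t pasid)
{
	unsigned int i;

	for (i = 0; i < FILTER_SLOTS; i++) {
		struct fault_entry *e = &f->ring[i];

		if (e->valid && e->addr == addr && e->pasid == pasid)
			e->valid = false;
	}
}
```

In this model, a second filter_fault() call for the same address and pasid inside the timeout window returns true and the fault is dropped. Calling filter_remove() first, as the patch does on the skip-recover and -EAGAIN paths, makes the next call return false so the retry is handled.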
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 00d759b257f4..d9111fea724b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2363,8 +2363,10 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 	mutex_lock(&prange->migrate_mutex);
-	if (svm_range_skip_recover(prange))
+	if (svm_range_skip_recover(prange)) {
+		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
 		goto out_unlock_range;
+	}
 	timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
 	/* skip duplicate vm fault on different pages of same range */
@@ -2426,6 +2428,7 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 	if (r == -EAGAIN) {
 		pr_debug("recover vm fault later\n");
+		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
 		r = 0;
 	}
 	return r;
--
2.17.1