[PATCH 5/6] drm/amdkfd: enable subsequent retry fault
Felix Kuehling
felix.kuehling at amd.com
Wed Apr 21 01:22:16 UTC 2021
On 2021-04-20 at 4:21 p.m., Philip Yang wrote:
> After draining the stale retry fault, or after failing to validate the
> range for recovery, the fault address has to be removed from the fault
> filter ring so that a subsequent retry interrupt on the same address can
> be handled. Otherwise the retry fault is not processed for recovery until
> the timeout has passed.
>
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
Patches 1-3 and patch 5 are
Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
I didn't see a patch 6. Was the email lost, or was it intentionally not sent?
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 45dd055118eb..d90e0cb6e573 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -2262,8 +2262,10 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>  	mutex_lock(&prange->migrate_mutex);
>
> -	if (svm_range_skip_recover(prange))
> +	if (svm_range_skip_recover(prange)) {
> +		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
>  		goto out_unlock_range;
> +	}
>
>  	timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
>  	/* skip duplicate vm fault on different pages of same range */
> @@ -2325,6 +2327,7 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>  	if (r == -EAGAIN) {
>  		pr_debug("recover vm fault later\n");
> +		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
>  		r = 0;
>  	}
>  	return r;
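
For readers unfamiliar with the fault filter, the effect of the change can be illustrated with a small user-space sketch. This is not the amdgpu implementation: the types, the ring size, the timeout value and the helper names (fault_filter, filter_faults, filter_faults_remove) are all hypothetical, and a real driver would also need locking. It only shows why an entry left in the filter suppresses later retry faults on the same address until it expires, and how removing it lets the next fault through immediately.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_SLOTS   8     /* hypothetical ring size */
#define FILTER_TIMEOUT 1000  /* hypothetical expiry, in arbitrary time units */

/* One recorded fault; all field names are illustrative assumptions. */
struct fault_entry {
	uint64_t addr;       /* faulting address */
	uint32_t pasid;      /* process address space ID */
	uint64_t timestamp;  /* time at which the fault was recorded */
	bool     valid;      /* false once the entry has been removed */
};

struct fault_filter {
	struct fault_entry ring[FILTER_SLOTS];
	size_t next;         /* slot that the next new fault overwrites */
};

/*
 * Returns true if the fault is a duplicate that should be dropped.
 * Otherwise records the fault in the ring and returns false, which
 * means the caller should go on to handle it.
 */
static bool filter_faults(struct fault_filter *f, uint64_t addr,
			  uint32_t pasid, uint64_t now)
{
	for (size_t i = 0; i < FILTER_SLOTS; i++) {
		const struct fault_entry *e = &f->ring[i];

		if (e->valid && e->addr == addr && e->pasid == pasid &&
		    now - e->timestamp < FILTER_TIMEOUT)
			return true;
	}

	f->ring[f->next] = (struct fault_entry){
		.addr = addr, .pasid = pasid, .timestamp = now, .valid = true,
	};
	f->next = (f->next + 1) % FILTER_SLOTS;
	return false;
}

/*
 * Invalidates every entry matching addr/pasid so that the next fault on
 * the same address is handled instead of being filtered out.
 */
static void filter_faults_remove(struct fault_filter *f, uint64_t addr,
				 uint32_t pasid)
{
	for (size_t i = 0; i < FILTER_SLOTS; i++) {
		struct fault_entry *e = &f->ring[i];

		if (e->valid && e->addr == addr && e->pasid == pasid)
			e->valid = false;
	}
}
```

In this sketch, a second fault on the same address and PASID shortly after the first is dropped. After filter_faults_remove() the same fault is accepted again without waiting for FILTER_TIMEOUT to elapse, which is the behaviour the patch aims for on the skip-recover and -EAGAIN paths.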