[PATCH v4] drm/amdgpu: Fix the race condition for draining retry fault

Wed Mar 12 22:40:25 UTC 2025

On 2025-03-09 20:51, Deng, Emily wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> *From:*Chen, Xiaogang <Xiaogang.Chen at amd.com>
> *Sent:* Saturday, March 8, 2025 8:38 AM
> *To:* Deng, Emily <Emily.Deng at amd.com>; amd-gfx at lists.freedesktop.org
> *Subject:* Re: [PATCH v4] drm/amdgpu: Fix the race condition for 
> draining retry fault
>
> On 3/6/2025 7:27 PM, Deng, Emily wrote:
>
>     [AMD Official Use Only - AMD Internal Distribution Only]
>
>     *From:*Chen, Xiaogang <Xiaogang.Chen at amd.com>
>     <mailto:Xiaogang.Chen at amd.com>
>     *Sent:* Friday, March 7, 2025 1:01 AM
>     *To:* Deng, Emily <Emily.Deng at amd.com>
>     <mailto:Emily.Deng at amd.com>; amd-gfx at lists.freedesktop.org
>     *Subject:* Re: [PATCH v4] drm/amdgpu: Fix the race condition for
>     draining retry fault
>
>     Thanks for catch up and fix this race condition. It looks good to
>     me. One minor thing below:
>
>     On 3/6/2025 12:03 AM, Emily Deng wrote:
>
>         Issue:
>
>         In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts
>
>           has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu
>
>         is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages
>
>         continues execution and reaches svm_range_from_addr. This results in
>
>         a "failed to find prange..." error, causing the page recovery to fail.
>
>           
>
>         How to fix:
>
>         Move the timestamp check code under the protection of svm->lock.
>
>           
>
>         v2:
>
>         Make sure all right locks are released before go out.
>
>           
>
>         v3:
>
>         Directly goto out_unlock_svms, and return -EAGAIN.
>
>           
>
>         v4:
>
>         Refine code.
>
>           
>
>         Signed-off-by: Emily Deng<Emily.Deng at amd.com>  <mailto:Emily.Deng at amd.com>
>
>         ---
>
>           drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 30 +++++++++++++++-------------
>
>           1 file changed, 16 insertions(+), 14 deletions(-)
>
>           
>
>         diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>         index d04725583f19..83ac14bf7a7a 100644
>
>         --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>         +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>
>         @@ -3008,19 +3008,6 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>                          goto out;
>
>                  }
>
>           
>
>         -       /* check if this page fault time stamp is before svms->checkpoint_ts */
>
>         -       if (svms->checkpoint_ts[gpuidx] != 0) {
>
>         -               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
>
>         -                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
>
>         -                       r = 0;
>
>         -                       goto out;
>
>         -               } else
>
>         -                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
>
>         -                        * to zero to avoid following ts wrap around give wrong comparing
>
>         -                        */
>
>         -                svms->checkpoint_ts[gpuidx] = 0;
>
>         -       }
>
>         -
>
>                  if (!p->xnack_enabled) {
>
>                          pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
>
>                          r = -EFAULT;
>
>         @@ -3040,6 +3027,20 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>                  mmap_read_lock(mm);
>
>           retry_write_locked:
>
>                  mutex_lock(&svms->lock);
>
>         +
>
>         +       /* check if this page fault time stamp is before svms->checkpoint_ts */
>
>         +       if (svms->checkpoint_ts[gpuidx] != 0) {
>
>         +               if (amdgpu_ih_ts_after_or_equal(ts,  svms->checkpoint_ts[gpuidx])) {
>
>         +                       pr_debug("draining retry fault, drop fault 0x%llx\n", addr);
>
>         +                       r = -EAGAIN;
>
>     We drop page fault because it is stale, not mean to handle it
>     again. if return -EAGAIN we do amdgpu_gmc_filter_faults_remove. If
>     after unmap, user map same range again we should treat page fault
>     happened at same range as new one.
>
>     Regards
>
>     Xiaogang
>
>     Sorry, I didn't quite catch that. So, you think we shouldn't
>     remove the fault from amdgpu_gmc_filter_faults_remove?
>
> I think return "-EAGAIN" means a page fault with an addr and a pasid 
> is not handled this time. Same page fault will come back again and kfd 
> will handle it, so remove it from filter to make sure it will be 
> handled next time.
>
> Yes, I also think we should remove it to make sure it will be handled 
> next time. From this point of view, it should be removed from filter.
>
I agree. Without removing it from the filter, we would ignore later 
faults on the same address, which would result in stalling the 
application indefinitely.

Regards,
   Felix


> Emily Deng
>
> Best Wishes
>
> In this case this page fault is stale and we do not need or cannot 
> handle it. There is no indication it will come back again and we need 
> handle it.  I am not sure if should remove it from filter. I just 
> concern if should return "-EAGAIN" in this case.
>
> Regards
>
> Xiaogang
>
>     Emily Deng
>
>     Best Wishes
>
>           
>
>         +                       goto out_unlock_svms;
>
>         +               } else
>
>         +                       /* ts is after svms->checkpoint_ts now, reset svms->checkpoint_ts
>
>         +                        * to zero to avoid following ts wrap around give wrong comparing
>
>         +                        */
>
>         +                svms->checkpoint_ts[gpuidx] = 0;
>
>         +       }
>
>         +
>
>                  prange = svm_range_from_addr(svms, addr, NULL);
>
>                  if (!prange) {
>
>                          pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
>
>         @@ -3165,7 +3166,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>                  mutex_unlock(&svms->lock);
>
>                  mmap_read_unlock(mm);
>
>           
>
>         -       svm_range_count_fault(node, p, gpuidx);
>
>         +       if (r != -EAGAIN)
>
>         +               svm_range_count_fault(node, p, gpuidx);
>
>           
>
>                  mmput(mm);
>
>           out:
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20250312/553c3f7a/attachment-0001.htm>