[PATCH v2 14/15] drm/amdgpu: Use mmu_range_notifier instead of hmm_mirror
Jason Gunthorpe
jgg at mellanox.com
Fri Nov 1 18:21:06 UTC 2019
On Fri, Nov 01, 2019 at 02:44:51PM +0000, Yang, Philip wrote:
> @@ -854,12 +853,20 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
> r = -EPERM;
> goto out_unlock;
> }
> + up_read(&mm->mmap_sem);
> + timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
> +
> +retry:
> + range->notifier_seq = mmu_range_read_begin(&bo->notifier);
>
> + down_read(&mm->mmap_sem);
> r = hmm_range_fault(range, 0);
> up_read(&mm->mmap_sem);
> -
> - if (unlikely(r < 0))
> + if (unlikely(r <= 0)) {
> + if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
> + goto retry;
> goto out_free_pfns;
> + }
I was reflecting on why this suddently became necessary, and I think
what might be happening is that hmm_range_fault() is trigging
invalidations as it runs (ie it is faulting in pages or something) and
that in turn causes the mrn to need retry.
The hmm version of this had a bug where a full
invalidate_range_start/end pair would not trigger retry, so this this
didn't happen.
This is unfortunate as the retry is unnecessary, but at this time I
can't think of a good way to separate an ignorable synchronous
invalidation caused by hmm_range_fault from an async one that cannot
be ignored..
A basic fix would be to not update the mrq seq in the notifier if
the invalidate is triggered by hmm_range_fault, but that seems
difficult to determine..
Any thoughts Jerome?
Jason
More information about the amd-gfx
mailing list