[PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

Thu Sep 27 15:36:05 UTC 2018

Yeah, I've read that as well.

My best guess is that we just need to add a call to hmm_vma_range_done() 
after taking the lock and also replace get_user_pages() with 
hmm_vma_get_pfns().

But I'm still not 100% sure how all of that is supposed to work together.

Regards,
Christian.

Am 27.09.2018 um 16:50 schrieb Kuehling, Felix:
>
> I think the answer is here: 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216
>
> Regards,
>
>   Felix
>
> *From:*Koenig, Christian
> *Sent:* Thursday, September 27, 2018 10:30 AM
> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
> *Cc:* j.glisse at gmail.com; Yang, Philip <Philip.Yang at amd.com>; 
> amd-gfx at lists.freedesktop.org
> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace 
> mmu notifier v4
>
> At least with get_user_pages() that is perfectly possible.
>
> For HMM it could be that this is prevented somehow.
>
> Christian.
>
> Am 27.09.2018 16:27 schrieb "Kuehling, Felix" <Felix.Kuehling at amd.com 
> <mailto:Felix.Kuehling at amd.com>>:
>
> > In this case you can end up accessing pages which are invalidated while 
> get_user_pages is in process.
>
> What’s the sequence of events you have in mind? Something like this?
>
>   * Page table is updated and triggers MMU notifier
>   * amdgpu MMU notifier runs and waits for pending CS to finish while
>     holding the read lock
>   * New CS starts just after invalidate_range_start MMU notifier
>     finishes but before the page table update is done
>   * get_user_pages returns outdated physical addresses
>
> I hope that’s not actually possible and that get_user_pages or 
> hmm_vma_fault would block until the page table update is done. That 
> is, invalidate_range_start marks the start of a page table update, and 
> while that update is in progress, get_user_pages or hmm_vma_fault 
> block. Jerome, can you comment on that?
>
> Thanks,
>   Felix
>
> *From:*Koenig, Christian
> *Sent:* Thursday, September 27, 2018 9:59 AM
> *To:* Kuehling, Felix <Felix.Kuehling at amd.com 
> <mailto:Felix.Kuehling at amd.com>>
> *Cc:* Yang, Philip <Philip.Yang at amd.com <mailto:Philip.Yang at amd.com>>; 
> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>; 
> Jerome Glisse <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace 
> mmu notifier v4
>
> Yeah I understand that, but again that won't work.
>
> In this case you can end up accessing pages which are invalidated 
> while get_user_pages is in process.
>
> Christian.
>
> Am 27.09.2018 15:41 schrieb "Kuehling, Felix" <Felix.Kuehling at amd.com 
> <mailto:Felix.Kuehling at amd.com>>:
>
> > I’m not planning to change that. I don’t think there is any need to 
> change it.
>
> >
> > Yeah, but when HMM doesn't provide both the start and the end hock 
> of the invalidation this way won't work any more.
> >
> > So we need to find a solution for this,
> > Christian.
>
> My whole argument is that you don’t need to hold the read lock until 
> the invalidate_range_end. Just read_lock and read_unlock in the 
> invalidate_range_start function.
>
> Regards,
>
> Felix
>
> *From:*Koenig, Christian
> *Sent:* Thursday, September 27, 2018 9:22 AM
> *To:* Kuehling, Felix <Felix.Kuehling at amd.com 
> <mailto:Felix.Kuehling at amd.com>>
> *Cc:* Yang, Philip <Philip.Yang at amd.com <mailto:Philip.Yang at amd.com>>; 
> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>; 
> Jerome Glisse <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to replace 
> mmu notifier v4
>
> Am 27.09.2018 um 15:18 schrieb Kuehling, Felix:
>
>     > The problem is here:
>     >
>
>     > ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>
>     > amdgpu_mn_unlock(p->mn);
>
>     >
>     > We need to hold the lock until the fence is added to the
>     reservation object.
>     >
>     > Otherwise somebody could have changed the page tables just in
>     the moment between the check of
>     amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to the
>     reservation object.
>
>     I’m not planning to change that. I don’t think there is any need
>     to change it.
>
>
> Yeah, but when HMM doesn't provide both the start and the end hock of 
> the invalidation this way won't work any more.
>
> So we need to find a solution for this,
> Christian.
>
>     Regards,
>
>     Felix
>
>     *From:*Koenig, Christian
>     *Sent:* Thursday, September 27, 2018 7:24 AM
>     *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>     <mailto:Felix.Kuehling at amd.com>
>     *Cc:* Yang, Philip <Philip.Yang at amd.com>
>     <mailto:Philip.Yang at amd.com>; amd-gfx at lists.freedesktop.org
>     <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>     <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>     *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>     replace mmu notifier v4
>
>     Am 27.09.2018 um 13:08 schrieb Kuehling, Felix:
>
>         > We double check that there wasn't any page table
>         modification while we prepared the submission and restart the
>         whole process when there actually was some update.
>         >
>         > The reason why we need to do this is here:
>         >
>
>         > ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>         p->fence);
>         >        amdgpu_mn_unlock(p->mn);
>
>         >
>         > Only after the new fence is added to the buffer object we
>         can release the lock so that any invalidation will now block
>         on our command submission to finish before it modifies the
>         page table.
>
>         I don’t see why this requires holding the read-lock until
>         invalidate_range_end. amdgpu_ttm_tt_affect_userptr gets called
>         while the mn read-lock is held in invalidate_range_start notifier.
>
>
>     That's not related to amdgpu_ttm_tt_affect_userptr(), this
>     function could actually be called outside the lock.
>
>     The problem is here:
>
>         ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>
>         amdgpu_mn_unlock(p->mn);
>
>
>     We need to hold the lock until the fence is added to the
>     reservation object.
>
>     Otherwise somebody could have changed the page tables just in the
>     moment between the check of amdgpu_ttm_tt_userptr_needs_pages()
>     and adding the fence to the reservation object.
>
>     Regards,
>     Christian.
>
>
>         Regards,
>
>         Felix
>
>         *From:*Koenig, Christian
>         *Sent:* Thursday, September 27, 2018 5:27 AM
>         *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>         <mailto:Felix.Kuehling at amd.com>
>         *Cc:* Yang, Philip <Philip.Yang at amd.com>
>         <mailto:Philip.Yang at amd.com>; amd-gfx at lists.freedesktop.org
>         <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>         <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>         *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>         replace mmu notifier v4
>
>         That is correct, but take a look what we do when after calling
>         the amdgpu_mn_read_lock():
>
>
>                     /* No memory allocation is allowed while holding
>             the mn lock */
>                     amdgpu_mn_lock(p->mn);
>             amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
>                             struct amdgpu_bo *bo =
>             ttm_to_amdgpu_bo(e->tv.bo);
>
>                             if
>             (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm)) {
>                                     r = -ERESTARTSYS;
>                                     goto error_abort;
>                             }
>                     }
>
>
>         We double check that there wasn't any page table modification
>         while we prepared the submission and restart the whole process
>         when there actually was some update.
>
>         The reason why we need to do this is here:
>
>             ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>             p->fence);
>                     amdgpu_mn_unlock(p->mn);
>
>
>         Only after the new fence is added to the buffer object we can
>         release the lock so that any invalidation will now block on
>         our command submission to finish before it modifies the page
>         table.
>
>         The only other option would be to add the fence first and then
>         check if there was any update to the page tables.
>
>         The issue with that approach is that adding a fence can't be
>         made undone, so if we find that there actually was an update
>         to the page tables we would need to somehow turn the CS into a
>         dummy (e.g. overwrite all IBs with NOPs or something like
>         that) and still submit it.
>
>         Not sure if that is actually possible.
>
>         Regards,
>         Christian.
>
>         Am 27.09.2018 um 10:47 schrieb Kuehling, Felix:
>
>             So back to my previous question:
>
>             >> But do we really need another lock for this? Wouldn't the
>
>             >> re-validation of userptr BOs (currently calling
>             get_user_pages) force
>
>             >> synchronization with the ongoing page table
>             invalidation through the
>
>             >> mmap_sem or other MM locks?
>
>             >
>
>             > No and yes. We don't hold any other locks while doing
>             command submission, but I expect that HMM has its own
>             mechanism to prevent that.
>
>             >
>
>             > Since we don't modify
>             amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly not
>             using this mechanism correctly.
>
>             The existing amdgpu_mn_lock/unlock should block the MMU
>             notifier while a command submission is in progress. It
>             should also block command submission while an MMU notifier
>             is in progress.
>
>             What we lose with HMM is the ability to hold a read-lock
>             for the entire duration of the invalidate_range_start
>             until invalidate_range_end. As I understand it, that lock
>             is meant to prevent new command submissions while the page
>             tables are being updated by the kernel. But my point is,
>             that get_user_pages or hmm_vma_fault should do the same
>             kind of thing. Before the command submission can go ahead,
>             it needs to update the userptr addresses. If the page
>             tables are still being updated, it will block there even
>             without holding the amdgpu_mn_read_lock.
>
>             Regards,
>
>               Felix
>
>             *From:* Koenig, Christian
>             *Sent:* Thursday, September 27, 2018 3:00 AM
>             *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>             <mailto:Felix.Kuehling at amd.com>
>             *Cc:* Yang, Philip <Philip.Yang at amd.com>
>             <mailto:Philip.Yang at amd.com>;
>             amd-gfx at lists.freedesktop.org
>             <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>             <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>             *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback
>             to replace mmu notifier v4
>
>             No, that won't work. We would still run into lock
>             inversion problems.
>
>             What we could do with the scheduler is to turn submissions
>             into dummies if we find that the page tables are now outdated.
>
>             But that would be really hacky and I'm not sure if that
>             would really work in all cases.
>
>             Christian.
>
>             Am 27.09.2018 08:53 schrieb "Kuehling, Felix"
>             <Felix.Kuehling at amd.com <mailto:Felix.Kuehling at amd.com>>:
>
>             I had a chat with Jerome yesterday. He pointed out that
>             the new blockable parameter can be used to infer whether
>             the MMU notifier is being called  in a reclaim operation.
>             So if blockable==true, it should even be safe to take the
>             BO reservation lock without problems. I think with that we
>             should be able to remove the read-write locking completely
>             and go back to locking (or try-locking for
>             blockable==false) the reservation locks in the MMU notifier?
>
>             Regards,
>               Felix
>
>             -----Original Message-----
>             From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org
>             <mailto:amd-gfx-bounces at lists.freedesktop.org>> On Behalf
>             Of Christian König
>             Sent: Saturday, September 15, 2018 3:47 AM
>             To: Kuehling, Felix <Felix.Kuehling at amd.com
>             <mailto:Felix.Kuehling at amd.com>>; Yang, Philip
>             <Philip.Yang at amd.com <mailto:Philip.Yang at amd.com>>;
>             amd-gfx at lists.freedesktop.org
>             <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>             <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
>             Subject: Re: [PATCH] drm/amdgpu: use HMM mirror callback
>             to replace mmu notifier v4
>
>             Am 14.09.2018 um 22:21 schrieb Felix Kuehling:
>             > On 2018-09-14 01:52 PM, Christian König wrote:
>             >> Am 14.09.2018 um 19:47 schrieb Philip Yang:
>             >>> On 2018-09-14 03:51 AM, Christian König wrote:
>             >>>> Am 13.09.2018 um 23:51 schrieb Felix Kuehling:
>             >>>>> On 2018-09-13 04:52 PM, Philip Yang wrote:
>             >>>>> [SNIP]
>             >>>>>>    + amdgpu_mn_read_unlock(amn);
>             >>>>>> +
>             >>>>> amdgpu_mn_read_lock/unlock support recursive locking
>             for multiple
>             >>>>> overlapping or nested invalidation ranges. But if
>             you'r locking
>             >>>>> and unlocking in the same function. Is that still a
>             concern?
>             >>> I don't understand the possible recursive case, but
>             >>> amdgpu_mn_read_lock() still support recursive locking.
>             >>>> Well the real problem is that unlocking them here
>             won't work.
>             >>>>
>             >>>> We need to hold the lock until we are sure that the
>             operation which
>             >>>> updates the page tables is completed.
>             >>>>
>             >>> The reason for this change is because hmm mirror has
>             >>> invalidate_start callback, no invalidate_end callback
>             >>>
>             >>> Check mmu_notifier.c and hmm.c again, below is entire
>             logic to
>             >>> update CPU page tables and callback:
>             >>>
>             >>> mn lock amn->lock is used to protect interval tree
>             access because
>             >>> user may submit/register new userptr anytime.
>             >>> This is same for old and new way.
>             >>>
>             >>> step 2 guarantee the GPU operation is done before
>             updating CPU page
>             >>> table.
>             >>>
>             >>> So I think the change is safe. We don't need hold mn
>             lock until the
>             >>> CPU page tables update is completed.
>             >> No, that isn't even remotely correct. The lock doesn't
>             protects the
>             >> interval tree.
>             >>
>             >>> Old:
>             >>>     1. down_read_non_owner(&amn->lock)
>             >>>     2. loop to handle BOs from node->bos through
>             interval tree
>             >>> amn->object nodes
>             >>>         gfx: wait for pending BOs fence operation
>             done, mark user
>             >>> pages dirty
>             >>>         kfd: evict user queues of the process, wait
>             for queue
>             >>> unmap/map operation done
>             >>>     3. update CPU page tables
>             >>>     4. up_read(&amn->lock)
>             >>>
>             >>> New, switch step 3 and 4
>             >>>     1. down_read_non_owner(&amn->lock)
>             >>>     2. loop to handle BOs from node->bos through
>             interval tree
>             >>> amn->object nodes
>             >>>         gfx: wait for pending BOs fence operation
>             done, mark user
>             >>> pages dirty
>             >>>         kfd: evict user queues of the process, wait
>             for queue
>             >>> unmap/map operation done
>             >>>     3. up_read(&amn->lock)
>             >>>     4. update CPU page tables
>             >> The lock is there to make sure that we serialize page
>             table updates
>             >> with command submission.
>             > As I understand it, the idea is to prevent command
>             submission (adding
>             > new fences to BOs) while a page table invalidation is in
>             progress.
>
>             Yes, exactly.
>
>             > But do we really need another lock for this? Wouldn't the
>             > re-validation of userptr BOs (currently calling
>             get_user_pages) force
>             > synchronization with the ongoing page table invalidation
>             through the
>             > mmap_sem or other MM locks?
>
>             No and yes. We don't hold any other locks while doing
>             command submission, but I expect that HMM has its own
>             mechanism to prevent that.
>
>             Since we don't modify amdgpu_mn_lock()/amdgpu_mn_unlock()
>             we are certainly not using this mechanism correctly.
>
>             Regards,
>             Christian.
>             _______________________________________________
>             amd-gfx mailing list
>             amd-gfx at lists.freedesktop.org
>             <mailto:amd-gfx at lists.freedesktop.org>
>             https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180927/c0ca6355/attachment-0001.html>