[PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4
Christian König
ckoenig.leichtzumerken at gmail.com
Thu Oct 4 07:20:38 UTC 2018
> Is my understanding correct?
Yes, offhand that sounds correct to me.
The other occasions should just be early bail-outs to optimize things
under memory pressure.
Christian.
On 03.10.2018 22:31, Philip Yang wrote:
> Hi Christian,
>
> Yes, I agree. I am working on patch 2 to replace get_user_pages() with
> HMM. One problem is that in the current gfx path we check for mmu
> invalidations multiple times in the amdgpu_cs_ioctl() path after
> get_user_pages(): in amdgpu_cs_parser_bos(), amdgpu_cs_list_validate(),
> and amdgpu_cs_submit(). For HMM, hmm_vma_range_done() has to be called
> once and only once after hmm_vma_get_pfns()/hmm_vma_fault(), so I will
> call hmm_vma_range_done() inside amdgpu_cs_submit() after taking the mn
> lock. Is my understanding correct?
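>
> A rough sketch of the flow I have in mind (illustrative only, not the
> actual patch; error handling, bo list handling and the retry cleanup
> are omitted):
>
> again:
>     down_read(&mm->mmap_sem);
>     hmm_vma_get_pfns(&range);          /* replaces get_user_pages() */
>     up_read(&mm->mmap_sem);
>
>     /* amdgpu_cs_parser_bos(), amdgpu_cs_list_validate(), ...
>      * no mmu invalidation checks here any more */
>
>     /* in amdgpu_cs_submit() */
>     amdgpu_mn_lock(p->mn);
>     if (!hmm_vma_range_done(&range)) { /* the one and only check */
>         amdgpu_mn_unlock(p->mn);
>         goto again;
>     }
>     ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>     amdgpu_mn_unlock(p->mn);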
>
> Philip
>
> On 2018-10-02 11:05 AM, Christian König wrote:
>> After checking more code and documentation and thinking about it over
>> my vacation, I think I have some new conclusions here.
>>
>> Currently we are using get_user_pages() together with an MMU notifier
>> to guarantee a coherent address space view, because get_user_pages()
>> works by grabbing a reference to the pages and ignoring concurrent
>> page table updates.
>>
>> But HMM uses a different approach by checking the address space for
>> modifications using hmm_vma_range_done() and re-trying when the
>> address space has changed.
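>>
>> The pattern from the HMM documentation looks roughly like this (a
>> minimal sketch of the documented retry loop; take_lock()/release_lock()
>> are the documentation's placeholder for a driver lock):
>>
>>     struct hmm_range range;
>>     int ret;
>>
>> again:
>>     ret = hmm_vma_get_pfns(&range);
>>     if (ret)
>>         return ret;
>>     /* ... prepare the device mapping from range.pfns ... */
>>     take_lock(driver->update);
>>     if (!hmm_vma_range_done(&range)) {
>>         /* the address space changed under us: throw away and retry */
>>         release_lock(driver->update);
>>         goto again;
>>     }
>>     /* commit the device page table update */
>>     release_lock(driver->update);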
>>
>> Now what you are trying to do is to combine get_user_pages() with the
>> HMM callback, and this is what won't work. We can either use
>> get_user_pages() with an MMU notifier or we can use HMM for the work,
>> but we can't mix and match.
>>
>> So my initial guess was correct that we just need to change both
>> sides of the implementation at the same time.
>>
>> Regards,
>> Christian.
>>
>> On 28.09.2018 17:13, Koenig, Christian wrote:
>>> No, it definitely isn't.
>>>
>>> We have literally worked for months on this with the core MM developers.
>>>
>>> Making sure that we have a consistent page array is absolutely vital
>>> for correct operation.
>>>
>>> Please also check Jerome's presentation from XDC; it perfectly
>>> explains why this approach won't work correctly.
>>>
>>> Christian.
>>>
>>> On 28.09.2018 17:07, "Yang, Philip" <Philip.Yang at amd.com> wrote:
>>> For the B path, we take mm->mmap_sem, then call hmm_vma_get_pfns() or
>>> get_user_pages(). This is obvious.
>>>
>>> For the A path, the mmu notifier pair
>>> mmu_notifier_invalidate_range_start()/mmu_notifier_invalidate_range_end()
>>> is called in many places, and the calling paths are quite complicated
>>> inside mm, so it's not obvious. I checked many of them, for example:
>>>
>>> do_munmap()
>>>   down_write(&mm->mmap_sem)
>>>   arch_unmap()
>>>     mpx_notify_unmap()...
>>>       zap_bt_entries_mapping()
>>>         zap_page_range()
>>>   up_write(&mm->mmap_sem)
>>>
>>> void zap_page_range(struct vm_area_struct *vma, unsigned long start,
>>>                     unsigned long size)
>>> {
>>>         struct mm_struct *mm = vma->vm_mm;
>>>         struct mmu_gather tlb;
>>>         unsigned long end = start + size;
>>>
>>>         lru_add_drain();
>>>         tlb_gather_mmu(&tlb, mm, start, end);
>>>         update_hiwater_rss(mm);
>>>         mmu_notifier_invalidate_range_start(mm, start, end);
>>>         for ( ; vma && vma->vm_start < end; vma = vma->vm_next)
>>>                 unmap_single_vma(&tlb, vma, start, end, NULL);
>>>         mmu_notifier_invalidate_range_end(mm, start, end);
>>>         tlb_finish_mmu(&tlb, start, end);
>>> }
>>>
>>> So AFAIK it's okay without the invalidate_range_end() callback.
>>>
>>> Regards,
>>> Philip
>>>
>>> On 2018-09-28 01:25 AM, Koenig, Christian wrote:
>>>> No, that is incorrect as well :)
>>>>
>>>> The mmap_sem isn't necessarily taken during page table updates.
>>>>
>>>> What you could do is replace get_user_pages() directly with HMM. If
>>>> I'm not completely mistaken that should work as expected.
>>>>
>>>> Christian.
>>>>
>>>> On 27.09.2018 22:18, "Yang, Philip" <Philip.Yang at amd.com> wrote:
>>>> I was trying to understand how HMM handles this concurrency issue
>>>> and how we handle it in amdgpu_ttm_tt_userptr_needs_pages() and
>>>> amdgpu_ttm_tt_affect_userptr(). HMM uses the range->valid flag; we
>>>> use gtt->mmu_invalidations and gtt->last_set_pages. Both actually
>>>> use the same lock-plus-flag idea.
>>>>
>>>> Thanks for the information. Now I understand that the fence
>>>> ttm_eu_fence_buffer_objects() adds to the BOs will block the CPU
>>>> page table update. This is another side of this concurrency issue
>>>> that I didn't know about.
>>>>
>>>> I had the same worry that there is an issue without the
>>>> invalidate_range_end() callback, given the calling sequence Felix
>>>> lists. After taking another look today, I now think it's fine
>>>> because of the mm->mmap_sem usage. This is my understanding:
>>>>
>>>> A path:
>>>>
>>>>     down_write(&mm->mmap_sem);
>>>>     mmu_notifier_invalidate_range_start()
>>>>         take_lock()
>>>>         release_lock()
>>>>     CPU page table update
>>>>     mmu_notifier_invalidate_range_end()
>>>>     up_write(&mm->mmap_sem);
>>>>
>>>> B path:
>>>>
>>>> again:
>>>>     down_read(&mm->mmap_sem);
>>>>     hmm_vma_get_pfns()
>>>>     up_read(&mm->mmap_sem);
>>>>     ...
>>>>     take_lock()
>>>>     if (!hmm_vma_range_done()) {
>>>>         release_lock()
>>>>         goto again
>>>>     }
>>>>     submit command job...
>>>>     release_lock()
>>>>
>>>> If you agree, I will submit patch v5 with some minor changes, and
>>>> submit another patch to replace get_user_pages() with HMM.
>>>>
>>>> Regards,
>>>> Philip
>>>>
>>>> On 2018-09-27 11:36 AM, Christian König wrote:
>>>>> Yeah, I've read that as well.
>>>>>
>>>>> My best guess is that we just need to add a call to
>>>>> hmm_vma_range_done() after taking the lock and also replace
>>>>> get_user_pages() with hmm_vma_get_pfns().
>>>>>
>>>>> But I'm still not 100% sure how all of that is supposed to work
>>>>> together.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> On 27.09.2018 16:50, Kuehling, Felix wrote:
>>>>>>
>>>>>> I think the answer is here:
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>> *From:* Koenig, Christian
>>>>>> *Sent:* Thursday, September 27, 2018 10:30 AM
>>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>>> *Cc:* j.glisse at gmail.com; Yang, Philip <Philip.Yang at amd.com>;
>>>>>> amd-gfx at lists.freedesktop.org
>>>>>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>>> replace mmu notifier v4
>>>>>>
>>>>>> At least with get_user_pages() that is perfectly possible.
>>>>>>
>>>>>> For HMM it could be that this is prevented somehow.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>> On 27.09.2018 16:27, "Kuehling, Felix"
>>>>>> <Felix.Kuehling at amd.com> wrote:
>>>>>>
>>>>>> > In this case you can end up accessing pages which are
>>>>>> > invalidated while get_user_pages is in progress.
>>>>>>
>>>>>> What’s the sequence of events you have in mind? Something like this?
>>>>>>
>>>>>> * Page table is updated and triggers MMU notifier
>>>>>> * amdgpu MMU notifier runs and waits for pending CS to finish
>>>>>>   while holding the read lock
>>>>>> * New CS starts just after invalidate_range_start MMU notifier
>>>>>>   finishes but before the page table update is done
>>>>>> * get_user_pages returns outdated physical addresses
>>>>>>
>>>>>> I hope that’s not actually possible and that get_user_pages or
>>>>>> hmm_vma_fault would block until the page table update is done.
>>>>>> That is, invalidate_range_start marks the start of a page table
>>>>>> update, and while that update is in progress, get_user_pages or
>>>>>> hmm_vma_fault block. Jerome, can you comment on that?
>>>>>>
>>>>>> Thanks,
>>>>>> Felix
>>>>>>
>>>>>> *From:* Koenig, Christian
>>>>>> *Sent:* Thursday, September 27, 2018 9:59 AM
>>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>;
>>>>>> amd-gfx at lists.freedesktop.org; Jerome Glisse <j.glisse at gmail.com>
>>>>>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>>> replace mmu notifier v4
>>>>>>
>>>>>> Yeah I understand that, but again that won't work.
>>>>>>
>>>>>> In this case you can end up accessing pages which are invalidated
>>>>>> while get_user_pages is in progress.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>> On 27.09.2018 15:41, "Kuehling, Felix"
>>>>>> <Felix.Kuehling at amd.com> wrote:
>>>>>>
>>>>>> > I’m not planning to change that. I don’t think there is any
>>>>>> > need to change it.
>>>>>>
>>>>>> >
>>>>>> > Yeah, but when HMM doesn't provide both the start and the end
>>>>>> > hook of the invalidation this way won't work any more.
>>>>>> >
>>>>>> > So we need to find a solution for this,
>>>>>> > Christian.
>>>>>>
>>>>>> My whole argument is that you don’t need to hold the read lock
>>>>>> until the invalidate_range_end. Just read_lock and read_unlock in
>>>>>> the invalidate_range_start function.
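>>>>>>
>>>>>> Something like this (just a sketch of the idea; the interval tree
>>>>>> walk and the fence waiting stay as they are today):
>>>>>>
>>>>>>     static int amdgpu_mn_invalidate_range_start_gfx(...)
>>>>>>     {
>>>>>>         if (amdgpu_mn_read_lock(amn, blockable))
>>>>>>             return -EAGAIN;
>>>>>>         /* wait for pending CS fences on the affected BOs */
>>>>>>         amdgpu_mn_read_unlock(amn);  /* unlock before returning */
>>>>>>         return 0;
>>>>>>     }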
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>> *From:* Koenig, Christian
>>>>>> *Sent:* Thursday, September 27, 2018 9:22 AM
>>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>;
>>>>>> amd-gfx at lists.freedesktop.org; Jerome Glisse <j.glisse at gmail.com>
>>>>>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>>> replace mmu notifier v4
>>>>>>
>>>>>> On 27.09.2018 15:18, Kuehling, Felix wrote:
>>>>>>
>>>>>> > The problem is here:
>>>>>> >
>>>>>>
>>>>>> > ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>>>>>> > amdgpu_mn_unlock(p->mn);
>>>>>>
>>>>>> >
>>>>>> > We need to hold the lock until the fence is added to the
>>>>>> > reservation object.
>>>>>> >
>>>>>> > Otherwise somebody could have changed the page tables just in
>>>>>> > the moment between the check of
>>>>>> > amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to
>>>>>> > the reservation object.
>>>>>>
>>>>>> I’m not planning to change that. I don’t think there is any
>>>>>> need to change it.
>>>>>>
>>>>>>
>>>>>> Yeah, but when HMM doesn't provide both the start and the end
>>>>>> hook of the invalidation this way won't work any more.
>>>>>>
>>>>>> So we need to find a solution for this,
>>>>>> Christian.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>> *From:* Koenig, Christian
>>>>>> *Sent:* Thursday, September 27, 2018 7:24 AM
>>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>;
>>>>>> amd-gfx at lists.freedesktop.org; Jerome Glisse <j.glisse at gmail.com>
>>>>>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>>> replace mmu notifier v4
>>>>>>
>>>>>> On 27.09.2018 13:08, Kuehling, Felix wrote:
>>>>>>
>>>>>> > We double check that there wasn't any page table modification
>>>>>> > while we prepared the submission and restart the whole process
>>>>>> > when there actually was some update.
>>>>>> >
>>>>>> > The reason why we need to do this is here:
>>>>>> >
>>>>>>
>>>>>> > ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>>>>>> > amdgpu_mn_unlock(p->mn);
>>>>>>
>>>>>> >
>>>>>> > Only after the new fence is added to the buffer object can we
>>>>>> > release the lock, so that any invalidation will now block on
>>>>>> > our command submission finishing before it modifies the page
>>>>>> > table.
>>>>>>
>>>>>> I don’t see why this requires holding the read-lock until
>>>>>> invalidate_range_end. amdgpu_ttm_tt_affect_userptr gets
>>>>>> called while the mn read-lock is held in the
>>>>>> invalidate_range_start notifier.
>>>>>>
>>>>>>
>>>>>> That's not related to amdgpu_ttm_tt_affect_userptr(); this
>>>>>> function could actually be called outside the lock.
>>>>>>
>>>>>> The problem is here:
>>>>>>
>>>>>>     ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>>>>>>     amdgpu_mn_unlock(p->mn);
>>>>>>
>>>>>>
>>>>>> We need to hold the lock until the fence is added to the
>>>>>> reservation object.
>>>>>>
>>>>>> Otherwise somebody could have changed the page tables just in
>>>>>> the moment between the check of
>>>>>> amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to
>>>>>> the reservation object.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>> *From:* Koenig, Christian
>>>>>> *Sent:* Thursday, September 27, 2018 5:27 AM
>>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>;
>>>>>> amd-gfx at lists.freedesktop.org; Jerome Glisse <j.glisse at gmail.com>
>>>>>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror
>>>>>> callback to replace mmu notifier v4
>>>>>>
>>>>>> That is correct, but take a look at what we do after calling
>>>>>> amdgpu_mn_read_lock():
>>>>>>
>>>>>>
>>>>>>     /* No memory allocation is allowed while holding the mn lock */
>>>>>>     amdgpu_mn_lock(p->mn);
>>>>>>     amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
>>>>>>         struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
>>>>>>
>>>>>>         if (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm)) {
>>>>>>             r = -ERESTARTSYS;
>>>>>>             goto error_abort;
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>
>>>>>> We double check that there wasn't any page table
>>>>>> modification while we prepared the submission and restart
>>>>>> the whole process when there actually was some update.
>>>>>>
>>>>>> The reason why we need to do this is here:
>>>>>>
>>>>>>     ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>>>>>>     amdgpu_mn_unlock(p->mn);
>>>>>>
>>>>>>
>>>>>> Only after the new fence is added to the buffer object can we
>>>>>> release the lock, so that any invalidation will now block on our
>>>>>> command submission finishing before it modifies the page table.
>>>>>>
>>>>>> The only other option would be to add the fence first and
>>>>>> then check if there was any update to the page tables.
>>>>>>
>>>>>> The issue with that approach is that adding a fence can't
>>>>>> be undone, so if we find that there actually was an
>>>>>> update to the page tables we would need to somehow turn
>>>>>> the CS into a dummy (e.g. overwrite all IBs with NOPs or
>>>>>> something like that) and still submit it.
>>>>>>
>>>>>> Not sure if that is actually possible.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> On 27.09.2018 10:47, Kuehling, Felix wrote:
>>>>>>
>>>>>> So back to my previous question:
>>>>>>
>>>>>> >> But do we really need another lock for this? Wouldn't the
>>>>>> >> re-validation of userptr BOs (currently calling get_user_pages)
>>>>>> >> force synchronization with the ongoing page table invalidation
>>>>>> >> through the mmap_sem or other MM locks?
>>>>>>
>>>>>> >
>>>>>> > No and yes. We don't hold any other locks while doing command
>>>>>> > submission, but I expect that HMM has its own mechanism to
>>>>>> > prevent that.
>>>>>> >
>>>>>> > Since we don't modify amdgpu_mn_lock()/amdgpu_mn_unlock() we
>>>>>> > are certainly not using this mechanism correctly.
>>>>>>
>>>>>> The existing amdgpu_mn_lock/unlock should block the
>>>>>> MMU notifier while a command submission is in
>>>>>> progress. It should also block command submission
>>>>>> while an MMU notifier is in progress.
>>>>>>
>>>>>> What we lose with HMM is the ability to hold a read-lock for
>>>>>> the entire duration from invalidate_range_start until
>>>>>> invalidate_range_end. As
>>>>>> I understand it, that lock is meant to prevent new
>>>>>> command submissions while the page tables are being
>>>>>> updated by the kernel. But my point is, that
>>>>>> get_user_pages or hmm_vma_fault should do the same
>>>>>> kind of thing. Before the command submission can go
>>>>>> ahead, it needs to update the userptr addresses. If
>>>>>> the page tables are still being updated, it will
>>>>>> block there even without holding the amdgpu_mn_read_lock.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>> *From:* Koenig, Christian
>>>>>> *Sent:* Thursday, September 27, 2018 3:00 AM
>>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>;
>>>>>> amd-gfx at lists.freedesktop.org; Jerome Glisse
>>>>>> <j.glisse at gmail.com>
>>>>>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror
>>>>>> callback to replace mmu notifier v4
>>>>>>
>>>>>> No, that won't work. We would still run into lock
>>>>>> inversion problems.
>>>>>>
>>>>>> What we could do with the scheduler is to turn
>>>>>> submissions into dummies if we find that the page
>>>>>> tables are now outdated.
>>>>>>
>>>>>> But that would be really hacky and I'm not sure if
>>>>>> that would really work in all cases.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>> On 27.09.2018 08:53, "Kuehling, Felix"
>>>>>> <Felix.Kuehling at amd.com> wrote:
>>>>>>
>>>>>> I had a chat with Jerome yesterday. He pointed out
>>>>>> that the new blockable parameter can be used to infer
>>>>>> whether the MMU notifier is being called in a
>>>>>> reclaim operation. So if blockable==true, it should
>>>>>> even be safe to take the BO reservation lock without
>>>>>> problems. I think with that we should be able to
>>>>>> remove the read-write locking completely and go back
>>>>>> to locking (or try-locking for blockable==false) the
>>>>>> reservation locks in the MMU notifier?
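>>>>>>
>>>>>> Roughly what I mean (a sketch of the idea only, using the
>>>>>> 4.19-era notifier signature; the real code would keep the
>>>>>> interval tree walk and fence waiting we already have):
>>>>>>
>>>>>>     static int amdgpu_mn_invalidate_range_start(struct mmu_notifier *mn,
>>>>>>                                                 struct mm_struct *mm,
>>>>>>                                                 unsigned long start,
>>>>>>                                                 unsigned long end,
>>>>>>                                                 bool blockable)
>>>>>>     {
>>>>>>         /* for each affected BO (interval tree walk omitted): */
>>>>>>         if (!blockable) {
>>>>>>             /* reclaim context: we must not sleep */
>>>>>>             if (!reservation_object_trylock(bo->tbo.resv))
>>>>>>                 return -EAGAIN;
>>>>>>         } else {
>>>>>>             reservation_object_lock(bo->tbo.resv, NULL);
>>>>>>         }
>>>>>>         /* wait for fences, invalidate the user pages ... */
>>>>>>         reservation_object_unlock(bo->tbo.resv);
>>>>>>         return 0;
>>>>>>     }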
>>>>>>
>>>>>> Regards,
>>>>>> Felix
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On
>>>>>> Behalf Of Christian König
>>>>>> Sent: Saturday, September 15, 2018 3:47 AM
>>>>>> To: Kuehling, Felix <Felix.Kuehling at amd.com>; Yang, Philip
>>>>>> <Philip.Yang at amd.com>; amd-gfx at lists.freedesktop.org;
>>>>>> Jerome Glisse <j.glisse at gmail.com>
>>>>>> Subject: Re: [PATCH] drm/amdgpu: use HMM mirror
>>>>>> callback to replace mmu notifier v4
>>>>>>
>>>>>> On 14.09.2018 22:21, Felix Kuehling wrote:
>>>>>> > On 2018-09-14 01:52 PM, Christian König wrote:
>>>>>> >> On 14.09.2018 19:47, Philip Yang wrote:
>>>>>> >>> On 2018-09-14 03:51 AM, Christian König wrote:
>>>>>> >>>> On 13.09.2018 23:51, Felix Kuehling wrote:
>>>>>> >>>>> On 2018-09-13 04:52 PM, Philip Yang wrote:
>>>>>> >>>>> [SNIP]
>>>>>> >>>>>> + amdgpu_mn_read_unlock(amn);
>>>>>> >>>>>> +
>>>>>> >>>>> amdgpu_mn_read_lock/unlock support recursive locking for
>>>>>> >>>>> multiple overlapping or nested invalidation ranges. But if
>>>>>> >>>>> you're locking and unlocking in the same function, is that
>>>>>> >>>>> still a concern?
>>>>>> >>> I don't understand the possible recursive case, but
>>>>>> >>> amdgpu_mn_read_lock() still supports recursive locking.
>>>>>> >>>> Well the real problem is that unlocking them here won't
>>>>>> >>>> work.
>>>>>> >>>>
>>>>>> >>>> We need to hold the lock until we are sure that the
>>>>>> >>>> operation which updates the page tables is completed.
>>>>>> >>>>
>>>>>> >>> The reason for this change is that the hmm mirror has an
>>>>>> >>> invalidate_start callback but no invalidate_end callback.
>>>>>> >>>
>>>>>> >>> Check mmu_notifier.c and hmm.c again; below is the entire
>>>>>> >>> logic to update CPU page tables and the callback:
>>>>>> >>>
>>>>>> >>> The mn lock amn->lock is used to protect interval tree
>>>>>> >>> access because the user may submit/register a new userptr at
>>>>>> >>> any time. This is the same for the old and the new way.
>>>>>> >>>
>>>>>> >>> Step 2 guarantees the GPU operation is done before updating
>>>>>> >>> the CPU page tables.
>>>>>> >>>
>>>>>> >>> So I think the change is safe. We don't need to hold the mn
>>>>>> >>> lock until the CPU page table update is completed.
>>>>>> >> No, that isn't even remotely correct. The lock doesn't
>>>>>> >> protect the interval tree.
>>>>>> >>
>>>>>> >>> Old:
>>>>>> >>> 1. down_read_non_owner(&amn->lock)
>>>>>> >>> 2. loop to handle BOs from node->bos through interval tree
>>>>>> >>>    amn->object nodes
>>>>>> >>>      gfx: wait for pending BOs fence operation done, mark
>>>>>> >>>           user pages dirty
>>>>>> >>>      kfd: evict user queues of the process, wait for queue
>>>>>> >>>           unmap/map operation done
>>>>>> >>> 3. update CPU page tables
>>>>>> >>> 4. up_read(&amn->lock)
>>>>>> >>>
>>>>>> >>> New (switch steps 3 and 4):
>>>>>> >>> 1. down_read_non_owner(&amn->lock)
>>>>>> >>> 2. loop to handle BOs from node->bos through interval tree
>>>>>> >>>    amn->object nodes
>>>>>> >>>      gfx: wait for pending BOs fence operation done, mark
>>>>>> >>>           user pages dirty
>>>>>> >>>      kfd: evict user queues of the process, wait for queue
>>>>>> >>>           unmap/map operation done
>>>>>> >>> 3. up_read(&amn->lock)
>>>>>> >>> 4. update CPU page tables
>>>>>> >> The lock is there to make sure that we serialize page table
>>>>>> >> updates with command submission.
>>>>>> > As I understand it, the idea is to prevent command submission
>>>>>> > (adding new fences to BOs) while a page table invalidation is
>>>>>> > in progress.
>>>>>>
>>>>>> Yes, exactly.
>>>>>>
>>>>>> > But do we really need another lock for this? Wouldn't the
>>>>>> > re-validation of userptr BOs (currently calling get_user_pages)
>>>>>> > force synchronization with the ongoing page table invalidation
>>>>>> > through the mmap_sem or other MM locks?
>>>>>>
>>>>>> No and yes. We don't hold any other locks while doing
>>>>>> command submission, but I expect that HMM has its own
>>>>>> mechanism to prevent that.
>>>>>>
>>>>>> Since we don't modify
>>>>>> amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly
>>>>>> not using this mechanism correctly.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>