[PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4
Philip Yang
philip.yang at amd.com
Wed Oct 3 20:31:37 UTC 2018
Hi Christian,
Yes, I agree. I am working on patch 2 to replace get_user_pages() with
HMM. One problem is that in the current gfx path we check for mmu
invalidation multiple times in the amdgpu_cs_ioctl() path after
get_user_pages(): in amdgpu_cs_parser_bos(), amdgpu_cs_list_validate(),
and amdgpu_cs_submit(). For HMM, hmm_vma_range_done() has to be called
once and only once after hmm_vma_get_pfns()/hmm_vma_fault(), so I will
call hmm_vma_range_done() inside amdgpu_cs_submit() after taking the mn
lock. Is my understanding correct?
Philip
On 2018-10-02 11:05 AM, Christian König wrote:
> After checking more code and documentation and thinking about it over
> my vacation, I have some new conclusions.
>
> Currently we are using get_user_pages() together with an MMU notifier
> to guarantee coherent address space view, because get_user_pages()
> works by grabbing a reference to the pages and ignoring concurrent
> page table updates.
>
> But HMM uses a different approach by checking the address space for
> modifications using hmm_vma_range_done() and re-trying when the
> address space has changed.
>
> Now what you are trying to do is to change that into get_user_pages()
> plus an HMM callback, and that is what won't work. We can either use
> get_user_pages() with an MMU notifier or we can use HMM for the whole
> thing, but we can't mix and match.
>
> So my initial guess was correct that we just need to change both sides
> of the implementation at the same time.
>
> Regards,
> Christian.
>
> On 28.09.2018 at 17:13, Koenig, Christian wrote:
>> No it definitely isn't.
>>
>> We have literally worked for months on this with the core MM developers.
>>
>> Making sure that we have a consistent page array is absolutely vital
>> for correct operation.
>>
>> Please also check Jerome's presentation from XDC; it also perfectly
>> explains why this approach won't work correctly.
>>
>> Christian.
>>
>> On 28.09.2018 17:07, "Yang, Philip" <Philip.Yang at amd.com> wrote:
>> For B path, we take mm->mmap_sem, then call hmm_vma_get_pfns() or
>> get_user_pages(). This is obvious.
>>
>> For A path, the mmu notifier pair
>> mmu_notifier_invalidate_range_start()/mmu_notifier_invalidate_range_end()
>> is called in many places, and the calling paths inside mm are quite
>> complicated, so it's not obvious. I checked many of them, for example:
>>
>> do_munmap()
>>   down_write(&mm->mmap_sem)
>>   arch_unmap()
>>     mpx_notify_unmap()...
>>       zap_bt_entries_mapping()
>>         zap_page_range()
>>   up_write(&mm->mmap_sem)
>>
>> void zap_page_range(struct vm_area_struct *vma, unsigned long start,
>>                     unsigned long size)
>> {
>>         struct mm_struct *mm = vma->vm_mm;
>>         struct mmu_gather tlb;
>>         unsigned long end = start + size;
>>
>>         lru_add_drain();
>>         tlb_gather_mmu(&tlb, mm, start, end);
>>         update_hiwater_rss(mm);
>>         mmu_notifier_invalidate_range_start(mm, start, end);
>>         for ( ; vma && vma->vm_start < end; vma = vma->vm_next)
>>                 unmap_single_vma(&tlb, vma, start, end, NULL);
>>         mmu_notifier_invalidate_range_end(mm, start, end);
>>         tlb_finish_mmu(&tlb, start, end);
>> }
>>
>> So AFAIK it's okay without the invalidate_range_end() callback.
>>
>> Regards,
>> Philip
>>
>> On 2018-09-28 01:25 AM, Koenig, Christian wrote:
>>> No, that is incorrect as well :)
>>>
>>> The mmap_sem isn't necessarily taken during page table updates.
>>>
>>> What you could do is replace get_user_pages() directly with HMM. If
>>> I'm not completely mistaken that should work as expected.
>>>
>>> Christian.
>>>
>>> On 27.09.2018 22:18, "Yang, Philip" <Philip.Yang at amd.com> wrote:
>>> I was trying to understand how HMM handles this concurrency issue
>>> and how we handle it in amdgpu_ttm_tt_userptr_needs_pages() and
>>> amdgpu_ttm_tt_affect_userptr(). HMM uses the range->valid flag; we
>>> use gtt->mmu_invalidations and gtt->last_set_pages. Both actually
>>> use the same lock-plus-flag idea.
>>>
>>> Thanks for the information. Now I understand that the fence
>>> ttm_eu_fence_buffer_objects() adds to the BOs will block the CPU
>>> page table update. This is another side of this concurrency issue
>>> I didn't know about.
>>>
>>> I had the same worry that there is an issue without the
>>> invalidate_range_end() callback, given the calling sequence Felix
>>> lists. After taking another look today I think it's fine because of
>>> the mm->mmap_sem usage; this is my understanding:
>>>
>>> A path:
>>>
>>> down_write(&mm->mmap_sem);
>>> mmu_notifier_invalidate_range_start()
>>>     take_lock()
>>>     release_lock()
>>> CPU page table update
>>> mmu_notifier_invalidate_range_end()
>>> up_write(&mm->mmap_sem);
>>>
>>> B path:
>>>
>>> again:
>>>     down_read(&mm->mmap_sem);
>>>     hmm_vma_get_pfns()
>>>     up_read(&mm->mmap_sem);
>>>     ...
>>>     ...
>>>     take_lock()
>>>     if (!hmm_vma_range_done()) {
>>>         release_lock()
>>>         goto again
>>>     }
>>>     submit command job...
>>>     release_lock()
>>>
>>> If you agree, I will submit patch v5 with some minor changes, and
>>> submit another patch to replace get_user_pages() with HMM.
>>>
>>> Regards,
>>> Philip
>>>
>>> On 2018-09-27 11:36 AM, Christian König wrote:
>>>> Yeah, I've read that as well.
>>>>
>>>> My best guess is that we just need to add a call to
>>>> hmm_vma_range_done() after taking the lock and also replace
>>>> get_user_pages() with hmm_vma_get_pfns().
>>>>
>>>> But I'm still not 100% sure how all of that is supposed to work
>>>> together.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> On 27.09.2018 at 16:50, Kuehling, Felix wrote:
>>>>>
>>>>> I think the answer is here:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216
>>>>>
>>>>> Regards,
>>>>>
>>>>> Felix
>>>>>
>>>>> *From:*Koenig, Christian
>>>>> *Sent:* Thursday, September 27, 2018 10:30 AM
>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>> *Cc:* j.glisse at gmail.com; Yang, Philip <Philip.Yang at amd.com>;
>>>>> amd-gfx at lists.freedesktop.org
>>>>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>> replace mmu notifier v4
>>>>>
>>>>> At least with get_user_pages() that is perfectly possible.
>>>>>
>>>>> For HMM it could be that this is prevented somehow.
>>>>>
>>>>> Christian.
>>>>>
>>>>> On 27.09.2018 16:27, "Kuehling, Felix"
>>>>> <Felix.Kuehling at amd.com <mailto:Felix.Kuehling at amd.com>> wrote:
>>>>>
>>>>> > In this case you can end up accessing pages which are
>>>>> > invalidated while get_user_pages is in process.
>>>>>
>>>>> What’s the sequence of events you have in mind? Something like this?
>>>>>
>>>>> * Page table is updated and triggers the MMU notifier
>>>>> * amdgpu MMU notifier runs and waits for pending CS to finish
>>>>>   while holding the read lock
>>>>> * New CS starts just after the invalidate_range_start MMU notifier
>>>>>   finishes but before the page table update is done
>>>>> * get_user_pages returns outdated physical addresses
>>>>>
>>>>> I hope that’s not actually possible and that get_user_pages or
>>>>> hmm_vma_fault would block until the page table update is done.
>>>>> That is, invalidate_range_start marks the start of a page table
>>>>> update, and while that update is in progress, get_user_pages or
>>>>> hmm_vma_fault block. Jerome, can you comment on that?
>>>>>
>>>>> Thanks,
>>>>> Felix
>>>>>
>>>>> *From:*Koenig, Christian
>>>>> *Sent:* Thursday, September 27, 2018 9:59 AM
>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com
>>>>> <mailto:Felix.Kuehling at amd.com>>
>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com
>>>>> <mailto:Philip.Yang at amd.com>>; amd-gfx at lists.freedesktop.org
>>>>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>>>>> <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
>>>>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>> replace mmu notifier v4
>>>>>
>>>>> Yeah I understand that, but again that won't work.
>>>>>
>>>>> In this case you can end up accessing pages which are invalidated
>>>>> while get_user_pages is in process.
>>>>>
>>>>> Christian.
>>>>>
>>>>> On 27.09.2018 15:41, "Kuehling, Felix"
>>>>> <Felix.Kuehling at amd.com <mailto:Felix.Kuehling at amd.com>> wrote:
>>>>>
>>>>> > I’m not planning to change that. I don’t think there is any
>>>>> > need to change it.
>>>>> >
>>>>> > Yeah, but when HMM doesn't provide both the start and the end
>>>>> > hook of the invalidation this way won't work any more.
>>>>> >
>>>>> > So we need to find a solution for this,
>>>>> > Christian.
>>>>>
>>>>> My whole argument is that you don’t need to hold the read lock
>>>>> until the invalidate_range_end. Just read_lock and read_unlock in
>>>>> the invalidate_range_start function.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Felix
>>>>>
>>>>> *From:*Koenig, Christian
>>>>> *Sent:* Thursday, September 27, 2018 9:22 AM
>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com
>>>>> <mailto:Felix.Kuehling at amd.com>>
>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com
>>>>> <mailto:Philip.Yang at amd.com>>; amd-gfx at lists.freedesktop.org
>>>>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>>>>> <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
>>>>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>> replace mmu notifier v4
>>>>>
>>>>> On 27.09.2018 at 15:18, Kuehling, Felix wrote:
>>>>>
>>>>> > The problem is here:
>>>>> >
>>>>> >     ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>>>>> >                                 p->fence);
>>>>> >     amdgpu_mn_unlock(p->mn);
>>>>> >
>>>>> > We need to hold the lock until the fence is added to the
>>>>> > reservation object.
>>>>> >
>>>>> > Otherwise somebody could have changed the page tables just in
>>>>> > the moment between the check of
>>>>> > amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to the
>>>>> > reservation object.
>>>>>
>>>>> I’m not planning to change that. I don’t think there is any
>>>>> need to change it.
>>>>>
>>>>>
>>>>> Yeah, but when HMM doesn't provide both the start and the end
>>>>> hook of the invalidation this way won't work any more.
>>>>>
>>>>> So we need to find a solution for this,
>>>>> Christian.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Felix
>>>>>
>>>>> *From:*Koenig, Christian
>>>>> *Sent:* Thursday, September 27, 2018 7:24 AM
>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>> <mailto:Felix.Kuehling at amd.com>
>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>
>>>>> <mailto:Philip.Yang at amd.com>; amd-gfx at lists.freedesktop.org
>>>>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>>>>> <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>>>>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>>>>> replace mmu notifier v4
>>>>>
>>>>> On 27.09.2018 at 13:08, Kuehling, Felix wrote:
>>>>>
>>>>> > We double check that there wasn't any page table modification
>>>>> > while we prepared the submission and restart the whole process
>>>>> > when there actually was some update.
>>>>> >
>>>>> > The reason why we need to do this is here:
>>>>> >
>>>>> >     ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>>>>> >                                 p->fence);
>>>>> >     amdgpu_mn_unlock(p->mn);
>>>>> >
>>>>> > Only after the new fence is added to the buffer object can we
>>>>> > release the lock, so that any invalidation will block on our
>>>>> > command submission to finish before it modifies the page table.
>>>>>
>>>>> I don’t see why this requires holding the read-lock until
>>>>> invalidate_range_end. amdgpu_ttm_tt_affect_userptr gets called
>>>>> while the mn read-lock is held in the invalidate_range_start
>>>>> notifier.
>>>>>
>>>>>
>>>>> That's not related to amdgpu_ttm_tt_affect_userptr(); that
>>>>> function could actually be called outside the lock.
>>>>>
>>>>> The problem is here:
>>>>>
>>>>>     ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>>>>>                                 p->fence);
>>>>>     amdgpu_mn_unlock(p->mn);
>>>>>
>>>>>
>>>>> We need to hold the lock until the fence is added to the
>>>>> reservation object.
>>>>>
>>>>> Otherwise somebody could have changed the page tables just in
>>>>> the moment between the check of
>>>>> amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to
>>>>> the reservation object.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Felix
>>>>>
>>>>> *From:*Koenig, Christian
>>>>> *Sent:* Thursday, September 27, 2018 5:27 AM
>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>> <mailto:Felix.Kuehling at amd.com>
>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>
>>>>> <mailto:Philip.Yang at amd.com>;
>>>>> amd-gfx at lists.freedesktop.org
>>>>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>>>>> <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>>>>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback
>>>>> to replace mmu notifier v4
>>>>>
>>>>> That is correct, but take a look at what we do after calling
>>>>> amdgpu_mn_read_lock():
>>>>>
>>>>>
>>>>>         /* No memory allocation is allowed while holding the
>>>>>          * mn lock */
>>>>>         amdgpu_mn_lock(p->mn);
>>>>>         amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
>>>>>                 struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
>>>>>
>>>>>                 if (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm)) {
>>>>>                         r = -ERESTARTSYS;
>>>>>                         goto error_abort;
>>>>>                 }
>>>>>         }
>>>>>
>>>>>
>>>>> We double check that there wasn't any page table
>>>>> modification while we prepared the submission and restart
>>>>> the whole process when there actually was some update.
>>>>>
>>>>> The reason why we need to do this is here:
>>>>>
>>>>>     ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>>>>>                                 p->fence);
>>>>>     amdgpu_mn_unlock(p->mn);
>>>>>
>>>>>
>>>>> Only after the new fence is added to the buffer object can we
>>>>> release the lock, so that any invalidation will block on our
>>>>> command submission to finish before it modifies the page table.
>>>>>
>>>>> The only other option would be to add the fence first and
>>>>> then check if there was any update to the page tables.
>>>>>
>>>>> The issue with that approach is that adding a fence can't be
>>>>> undone, so if we find that there actually was an update to the
>>>>> page tables we would need to somehow turn the CS into a dummy
>>>>> (e.g. overwrite all IBs with NOPs or something like that) and
>>>>> still submit it.
>>>>>
>>>>> Not sure if that is actually possible.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> On 27.09.2018 at 10:47, Kuehling, Felix wrote:
>>>>>
>>>>> So back to my previous question:
>>>>>
>>>>> >> But do we really need another lock for this? Wouldn't the
>>>>> >> re-validation of userptr BOs (currently calling
>>>>> >> get_user_pages) force synchronization with the ongoing page
>>>>> >> table invalidation through the mmap_sem or other MM locks?
>>>>> >
>>>>> > No and yes. We don't hold any other locks while doing command
>>>>> > submission, but I expect that HMM has its own mechanism to
>>>>> > prevent that.
>>>>> >
>>>>> > Since we don't modify amdgpu_mn_lock()/amdgpu_mn_unlock() we
>>>>> > are certainly not using this mechanism correctly.
>>>>>
>>>>> The existing amdgpu_mn_lock/unlock should block the
>>>>> MMU notifier while a command submission is in
>>>>> progress. It should also block command submission
>>>>> while an MMU notifier is in progress.
>>>>>
>>>>> What we lose with HMM is the ability to hold a
>>>>> read-lock for the entire duration of the
>>>>> invalidate_range_start until invalidate_range_end. As
>>>>> I understand it, that lock is meant to prevent new
>>>>> command submissions while the page tables are being
>>>>> updated by the kernel. But my point is that
>>>>> get_user_pages or hmm_vma_fault should do the same
>>>>> kind of thing. Before the command submission can go
>>>>> ahead, it needs to update the userptr addresses. If
>>>>> the page tables are still being updated, it will block
>>>>> there even without holding the amdgpu_mn_read_lock.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Felix
>>>>>
>>>>> *From:* Koenig, Christian
>>>>> *Sent:* Thursday, September 27, 2018 3:00 AM
>>>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>>>>> <mailto:Felix.Kuehling at amd.com>
>>>>> *Cc:* Yang, Philip <Philip.Yang at amd.com>
>>>>> <mailto:Philip.Yang at amd.com>;
>>>>> amd-gfx at lists.freedesktop.org
>>>>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>>>>> <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>>>>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror
>>>>> callback to replace mmu notifier v4
>>>>>
>>>>> No, that won't work. We would still run into lock
>>>>> inversion problems.
>>>>>
>>>>> What we could do with the scheduler is to turn
>>>>> submissions into dummies if we find that the page
>>>>> tables are now outdated.
>>>>>
>>>>> But that would be really hacky and I'm not sure if
>>>>> that would really work in all cases.
>>>>>
>>>>> Christian.
>>>>>
>>>>> On 27.09.2018 08:53, "Kuehling, Felix"
>>>>> <Felix.Kuehling at amd.com <mailto:Felix.Kuehling at amd.com>> wrote:
>>>>>
>>>>> I had a chat with Jerome yesterday. He pointed out
>>>>> that the new blockable parameter can be used to infer
>>>>> whether the MMU notifier is being called in a reclaim
>>>>> operation. So if blockable==true, it should even be
>>>>> safe to take the BO reservation lock without problems.
>>>>> I think with that we should be able to remove the
>>>>> read-write locking completely and go back to locking
>>>>> (or try-locking for blockable==false) the reservation
>>>>> locks in the MMU notifier?
>>>>>
>>>>> Regards,
>>>>> Felix
>>>>>
>>>>> -----Original Message-----
>>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org
>>>>> <mailto:amd-gfx-bounces at lists.freedesktop.org>> On
>>>>> Behalf Of Christian König
>>>>> Sent: Saturday, September 15, 2018 3:47 AM
>>>>> To: Kuehling, Felix <Felix.Kuehling at amd.com
>>>>> <mailto:Felix.Kuehling at amd.com>>; Yang, Philip
>>>>> <Philip.Yang at amd.com <mailto:Philip.Yang at amd.com>>;
>>>>> amd-gfx at lists.freedesktop.org
>>>>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>>>>> <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
>>>>> Subject: Re: [PATCH] drm/amdgpu: use HMM mirror
>>>>> callback to replace mmu notifier v4
>>>>>
>>>>> On 14.09.2018 at 22:21, Felix Kuehling wrote:
>>>>> > On 2018-09-14 01:52 PM, Christian König wrote:
>>>>> >> Am 14.09.2018 um 19:47 schrieb Philip Yang:
>>>>> >>> On 2018-09-14 03:51 AM, Christian König wrote:
>>>>> >>>> Am 13.09.2018 um 23:51 schrieb Felix Kuehling:
>>>>> >>>>> On 2018-09-13 04:52 PM, Philip Yang wrote:
>>>>> >>>>> [SNIP]
>>>>> >>>>>> + amdgpu_mn_read_unlock(amn);
>>>>> >>>>>> +
>>>>> >>>>> amdgpu_mn_read_lock/unlock support recursive locking for
>>>>> >>>>> multiple overlapping or nested invalidation ranges. But if
>>>>> >>>>> you're locking and unlocking in the same function, is that
>>>>> >>>>> still a concern?
>>>>> >>> I don't understand the possible recursive case, but
>>>>> >>> amdgpu_mn_read_lock() still supports recursive locking.
>>>>> >>>> Well the real problem is that unlocking them here won't
>>>>> >>>> work.
>>>>> >>>>
>>>>> >>>> We need to hold the lock until we are sure that the
>>>>> >>>> operation which updates the page tables is completed.
>>>>> >>>>
>>>>> >>> The reason for this change is that hmm mirror has an
>>>>> >>> invalidate_start callback but no invalidate_end callback.
>>>>> >>>
>>>>> >>> Check mmu_notifier.c and hmm.c again; below is the entire
>>>>> >>> logic to update CPU page tables and the callbacks:
>>>>> >>>
>>>>> >>> The mn lock amn->lock is used to protect interval tree access
>>>>> >>> because the user may submit/register a new userptr anytime.
>>>>> >>> This is the same for the old and the new way.
>>>>> >>>
>>>>> >>> Step 2 guarantees the GPU operation is done before updating
>>>>> >>> the CPU page table.
>>>>> >>>
>>>>> >>> So I think the change is safe. We don't need to hold the mn
>>>>> >>> lock until the CPU page table update is completed.
>>>>> >> No, that isn't even remotely correct. The lock doesn't protect
>>>>> >> the interval tree.
>>>>> >>
>>>>> >>> Old:
>>>>> >>> 1. down_read_non_owner(&amn->lock)
>>>>> >>> 2. loop to handle BOs from node->bos through interval tree
>>>>> >>>    amn->object nodes
>>>>> >>>    gfx: wait for pending BO fence operations done, mark user
>>>>> >>>    pages dirty
>>>>> >>>    kfd: evict user queues of the process, wait for queue
>>>>> >>>    unmap/map operation done
>>>>> >>> 3. update CPU page tables
>>>>> >>> 4. up_read(&amn->lock)
>>>>> >>>
>>>>> >>> New, switch steps 3 and 4:
>>>>> >>> 1. down_read_non_owner(&amn->lock)
>>>>> >>> 2. loop to handle BOs from node->bos through interval tree
>>>>> >>>    amn->object nodes
>>>>> >>>    gfx: wait for pending BO fence operations done, mark user
>>>>> >>>    pages dirty
>>>>> >>>    kfd: evict user queues of the process, wait for queue
>>>>> >>>    unmap/map operation done
>>>>> >>> 3. up_read(&amn->lock)
>>>>> >>> 4. update CPU page tables
>>>>> >> The lock is there to make sure that we serialize page table
>>>>> >> updates with command submission.
>>>>> > As I understand it, the idea is to prevent command submission
>>>>> > (adding new fences to BOs) while a page table invalidation is in
>>>>> > progress.
>>>>>
>>>>> Yes, exactly.
>>>>>
>>>>> > But do we really need another lock for this? Wouldn't the
>>>>> > re-validation of userptr BOs (currently calling get_user_pages)
>>>>> > force synchronization with the ongoing page table invalidation
>>>>> > through the mmap_sem or other MM locks?
>>>>>
>>>>> No and yes. We don't hold any other locks while doing
>>>>> command submission, but I expect that HMM has its own
>>>>> mechanism to prevent that.
>>>>>
>>>>> Since we don't modify
>>>>> amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly
>>>>> not using this mechanism correctly.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx at lists.freedesktop.org
>>>>> <mailto:amd-gfx at lists.freedesktop.org>
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>
>>>>
>>>
>>
>>
>>
>