[PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4
Philip Yang
philip.yang at amd.com
Thu Sep 27 20:17:54 UTC 2018
I was trying to understand the way how HMM handle this concurrent issue
and how we handle it in amdgpu_ttm_tt_userptr_needs_pages() and
amdgpu_ttm_tt_affect_userptr(). HMM uses range->valid flag, we use
gtt->mmu_invalidations and gtt->last_set_pages. Both use the same lock
plus flag idea actually.
Thanks for the information, now I understand fence
ttm_eu_fence_buffer_objects() put to BOs will block CPU page table
update. This is another side of this concurrent issue I didn't know.
I had same worry that it has issue without invalidate_range_end()
callback as the calling sequence Felix lists. Now I think it's fine
after taking a look again today because of mm->mmap_sem usage, this is
my understanding:
A path:
down_write(&mm->mmap_sem);
mmu_notifier_invalidate_range_start()
take_lock()
release_lock()
CPU page table update
mmu_notifier_invalidate_range_end()
up_write(&mm->mmap_sem);
B path:
again:
down_read(&mm->mmap_sem);
hmm_vma_get_pfns()
up_read(&mm->mmap_sem);
....
....
take_lock()
if (!hmm_vma_range_done()) {
release_lock()
goto again
}
submit command job...
release_lock()
If you agree, I will submit patch v5 with some minor changes, and submit
another patch to replace get_user_page() with HMM.
Regards,
Philip
On 2018-09-27 11:36 AM, Christian König wrote:
> Yeah, I've read that as well.
>
> My best guess is that we just need to add a call to
> hmm_vma_range_done() after taking the lock and also replace
> get_user_pages() with hmm_vma_get_pfns().
>
> But I'm still not 100% sure how all of that is supposed to work together.
>
> Regards,
> Christian.
>
> Am 27.09.2018 um 16:50 schrieb Kuehling, Felix:
>>
>> I think the answer is here:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216
>>
>> Regards,
>>
>> Felix
>>
>> *From:*Koenig, Christian
>> *Sent:* Thursday, September 27, 2018 10:30 AM
>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>> *Cc:* j.glisse at gmail.com; Yang, Philip <Philip.Yang at amd.com>;
>> amd-gfx at lists.freedesktop.org
>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace
>> mmu notifier v4
>>
>> At least with get_user_pages() that is perfectly possible.
>>
>> For HMM it could be that this is prevented somehow.
>>
>> Christian.
>>
>> Am 27.09.2018 16:27 schrieb "Kuehling, Felix" <Felix.Kuehling at amd.com
>> <mailto:Felix.Kuehling at amd.com>>:
>>
>> > In this case you can end up accessing pages which are invalidated
>> while get_user_pages is in process.
>>
>> What’s the sequence of events you have in mind? Something like this?
>>
>> * Page table is updated and triggers MMU notifier
>> * amdgpu MMU notifier runs and waits for pending CS to finish while
>> holding the read lock
>> * New CS starts just after invalidate_range_start MMU notifier
>> finishes but before the page table update is done
>> * get_user_pages returns outdated physical addresses
>>
>> I hope that’s not actually possible and that get_user_pages or
>> hmm_vma_fault would block until the page table update is done. That
>> is, invalidate_range_start marks the start of a page table update,
>> and while that update is in progress, get_user_pages or hmm_vma_fault
>> block. Jerome, can you comment on that?
>>
>> Thanks,
>> Felix
>>
>> *From:*Koenig, Christian
>> *Sent:* Thursday, September 27, 2018 9:59 AM
>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com
>> <mailto:Felix.Kuehling at amd.com>>
>> *Cc:* Yang, Philip <Philip.Yang at amd.com
>> <mailto:Philip.Yang at amd.com>>; amd-gfx at lists.freedesktop.org
>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>> <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace
>> mmu notifier v4
>>
>> Yeah I understand that, but again that won't work.
>>
>> In this case you can end up accessing pages which are invalidated
>> while get_user_pages is in process.
>>
>> Christian.
>>
>> Am 27.09.2018 15:41 schrieb "Kuehling, Felix" <Felix.Kuehling at amd.com
>> <mailto:Felix.Kuehling at amd.com>>:
>>
>> > I’m not planning to change that. I don’t think there is any need to
>> change it.
>>
>> >
>> > Yeah, but when HMM doesn't provide both the start and the end hock
>> of the invalidation this way won't work any more.
>> >
>> > So we need to find a solution for this,
>> > Christian.
>>
>> My whole argument is that you don’t need to hold the read lock until
>> the invalidate_range_end. Just read_lock and read_unlock in the
>> invalidate_range_start function.
>>
>> Regards,
>>
>> Felix
>>
>> *From:*Koenig, Christian
>> *Sent:* Thursday, September 27, 2018 9:22 AM
>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com
>> <mailto:Felix.Kuehling at amd.com>>
>> *Cc:* Yang, Philip <Philip.Yang at amd.com
>> <mailto:Philip.Yang at amd.com>>; amd-gfx at lists.freedesktop.org
>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>> <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to replace
>> mmu notifier v4
>>
>> Am 27.09.2018 um 15:18 schrieb Kuehling, Felix:
>>
>> > The problem is here:
>> >
>>
>> > ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>>
>> > amdgpu_mn_unlock(p->mn);
>>
>> >
>> > We need to hold the lock until the fence is added to the
>> reservation object.
>> >
>> > Otherwise somebody could have changed the page tables just in
>> the moment between the check of
>> amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to the
>> reservation object.
>>
>> I’m not planning to change that. I don’t think there is any need
>> to change it.
>>
>>
>> Yeah, but when HMM doesn't provide both the start and the end hock of
>> the invalidation this way won't work any more.
>>
>> So we need to find a solution for this,
>> Christian.
>>
>> Regards,
>>
>> Felix
>>
>> *From:*Koenig, Christian
>> *Sent:* Thursday, September 27, 2018 7:24 AM
>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>> <mailto:Felix.Kuehling at amd.com>
>> *Cc:* Yang, Philip <Philip.Yang at amd.com>
>> <mailto:Philip.Yang at amd.com>; amd-gfx at lists.freedesktop.org
>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>> <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>> replace mmu notifier v4
>>
>> Am 27.09.2018 um 13:08 schrieb Kuehling, Felix:
>>
>> > We double check that there wasn't any page table
>> modification while we prepared the submission and restart the
>> whole process when there actually was some update.
>> >
>> > The reason why we need to do this is here:
>> >
>>
>> > ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>> p->fence);
>> > amdgpu_mn_unlock(p->mn);
>>
>> >
>> > Only after the new fence is added to the buffer object we
>> can release the lock so that any invalidation will now block
>> on our command submission to finish before it modifies the
>> page table.
>>
>> I don’t see why this requires holding the read-lock until
>> invalidate_range_end. amdgpu_ttm_tt_affect_userptr gets
>> called while the mn read-lock is held in
>> invalidate_range_start notifier.
>>
>>
>> That's not related to amdgpu_ttm_tt_affect_userptr(), this
>> function could actually be called outside the lock.
>>
>> The problem is here:
>>
>> ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>>
>> amdgpu_mn_unlock(p->mn);
>>
>>
>> We need to hold the lock until the fence is added to the
>> reservation object.
>>
>> Otherwise somebody could have changed the page tables just in the
>> moment between the check of amdgpu_ttm_tt_userptr_needs_pages()
>> and adding the fence to the reservation object.
>>
>> Regards,
>> Christian.
>>
>>
>> Regards,
>>
>> Felix
>>
>> *From:*Koenig, Christian
>> *Sent:* Thursday, September 27, 2018 5:27 AM
>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>> <mailto:Felix.Kuehling at amd.com>
>> *Cc:* Yang, Philip <Philip.Yang at amd.com>
>> <mailto:Philip.Yang at amd.com>; amd-gfx at lists.freedesktop.org
>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>> <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>> *Subject:* Re: [PATCH] drm/amdgpu: use HMM mirror callback to
>> replace mmu notifier v4
>>
>> That is correct, but take a look what we do when after
>> calling the amdgpu_mn_read_lock():
>>
>>
>> /* No memory allocation is allowed while holding
>> the mn lock */
>> amdgpu_mn_lock(p->mn);
>> amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
>> struct amdgpu_bo *bo =
>> ttm_to_amdgpu_bo(e->tv.bo);
>>
>> if
>> (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm)) {
>> r = -ERESTARTSYS;
>> goto error_abort;
>> }
>> }
>>
>>
>> We double check that there wasn't any page table modification
>> while we prepared the submission and restart the whole
>> process when there actually was some update.
>>
>> The reason why we need to do this is here:
>>
>> ttm_eu_fence_buffer_objects(&p->ticket, &p->validated,
>> p->fence);
>> amdgpu_mn_unlock(p->mn);
>>
>>
>> Only after the new fence is added to the buffer object we can
>> release the lock so that any invalidation will now block on
>> our command submission to finish before it modifies the page
>> table.
>>
>> The only other option would be to add the fence first and
>> then check if there was any update to the page tables.
>>
>> The issue with that approach is that adding a fence can't be
>> made undone, so if we find that there actually was an update
>> to the page tables we would need to somehow turn the CS into
>> a dummy (e.g. overwrite all IBs with NOPs or something like
>> that) and still submit it.
>>
>> Not sure if that is actually possible.
>>
>> Regards,
>> Christian.
>>
>> Am 27.09.2018 um 10:47 schrieb Kuehling, Felix:
>>
>> So back to my previous question:
>>
>> >> But do we really need another lock for this? Wouldn't the
>>
>> >> re-validation of userptr BOs (currently calling
>> get_user_pages) force
>>
>> >> synchronization with the ongoing page table
>> invalidation through the
>>
>> >> mmap_sem or other MM locks?
>>
>> >
>>
>> > No and yes. We don't hold any other locks while doing
>> command submission, but I expect that HMM has its own
>> mechanism to prevent that.
>>
>> >
>>
>> > Since we don't modify
>> amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly not
>> using this mechanism correctly.
>>
>> The existing amdgpu_mn_lock/unlock should block the MMU
>> notifier while a command submission is in progress. It
>> should also block command submission while an MMU
>> notifier is in progress.
>>
>> What we lose with HMM is the ability to hold a read-lock
>> for the entire duration of the invalidate_range_start
>> until invalidate_range_end. As I understand it, that lock
>> is meant to prevent new command submissions while the
>> page tables are being updated by the kernel. But my point
>> is, that get_user_pages or hmm_vma_fault should do the
>> same kind of thing. Before the command submission can go
>> ahead, it needs to update the userptr addresses. If the
>> page tables are still being updated, it will block there
>> even without holding the amdgpu_mn_read_lock.
>>
>> Regards,
>>
>> Felix
>>
>> *From:* Koenig, Christian
>> *Sent:* Thursday, September 27, 2018 3:00 AM
>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>
>> <mailto:Felix.Kuehling at amd.com>
>> *Cc:* Yang, Philip <Philip.Yang at amd.com>
>> <mailto:Philip.Yang at amd.com>;
>> amd-gfx at lists.freedesktop.org
>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>> <j.glisse at gmail.com> <mailto:j.glisse at gmail.com>
>> *Subject:* RE: [PATCH] drm/amdgpu: use HMM mirror
>> callback to replace mmu notifier v4
>>
>> No, that won't work. We would still run into lock
>> inversion problems.
>>
>> What we could do with the scheduler is to turn
>> submissions into dummies if we find that the page tables
>> are now outdated.
>>
>> But that would be really hacky and I'm not sure if that
>> would really work in all cases.
>>
>> Christian.
>>
>> Am 27.09.2018 08:53 schrieb "Kuehling, Felix"
>> <Felix.Kuehling at amd.com <mailto:Felix.Kuehling at amd.com>>:
>>
>> I had a chat with Jerome yesterday. He pointed out that
>> the new blockable parameter can be used to infer whether
>> the MMU notifier is being called in a reclaim operation.
>> So if blockable==true, it should even be safe to take the
>> BO reservation lock without problems. I think with that
>> we should be able to remove the read-write locking
>> completely and go back to locking (or try-locking for
>> blockable==false) the reservation locks in the MMU notifier?
>>
>> Regards,
>> Felix
>>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org
>> <mailto:amd-gfx-bounces at lists.freedesktop.org>> On Behalf
>> Of Christian König
>> Sent: Saturday, September 15, 2018 3:47 AM
>> To: Kuehling, Felix <Felix.Kuehling at amd.com
>> <mailto:Felix.Kuehling at amd.com>>; Yang, Philip
>> <Philip.Yang at amd.com <mailto:Philip.Yang at amd.com>>;
>> amd-gfx at lists.freedesktop.org
>> <mailto:amd-gfx at lists.freedesktop.org>; Jerome Glisse
>> <j.glisse at gmail.com <mailto:j.glisse at gmail.com>>
>> Subject: Re: [PATCH] drm/amdgpu: use HMM mirror callback
>> to replace mmu notifier v4
>>
>> Am 14.09.2018 um 22:21 schrieb Felix Kuehling:
>> > On 2018-09-14 01:52 PM, Christian König wrote:
>> >> Am 14.09.2018 um 19:47 schrieb Philip Yang:
>> >>> On 2018-09-14 03:51 AM, Christian König wrote:
>> >>>> Am 13.09.2018 um 23:51 schrieb Felix Kuehling:
>> >>>>> On 2018-09-13 04:52 PM, Philip Yang wrote:
>> >>>>> [SNIP]
>> >>>>>> + amdgpu_mn_read_unlock(amn);
>> >>>>>> +
>> >>>>> amdgpu_mn_read_lock/unlock support recursive
>> locking for multiple
>> >>>>> overlapping or nested invalidation ranges. But if
>> you'r locking
>> >>>>> and unlocking in the same function. Is that still a
>> concern?
>> >>> I don't understand the possible recursive case, but
>> >>> amdgpu_mn_read_lock() still support recursive locking.
>> >>>> Well the real problem is that unlocking them here
>> won't work.
>> >>>>
>> >>>> We need to hold the lock until we are sure that the
>> operation which
>> >>>> updates the page tables is completed.
>> >>>>
>> >>> The reason for this change is because hmm mirror has
>> >>> invalidate_start callback, no invalidate_end callback
>> >>>
>> >>> Check mmu_notifier.c and hmm.c again, below is entire
>> logic to
>> >>> update CPU page tables and callback:
>> >>>
>> >>> mn lock amn->lock is used to protect interval tree
>> access because
>> >>> user may submit/register new userptr anytime.
>> >>> This is same for old and new way.
>> >>>
>> >>> step 2 guarantee the GPU operation is done before
>> updating CPU page
>> >>> table.
>> >>>
>> >>> So I think the change is safe. We don't need hold mn
>> lock until the
>> >>> CPU page tables update is completed.
>> >> No, that isn't even remotely correct. The lock doesn't
>> protects the
>> >> interval tree.
>> >>
>> >>> Old:
>> >>> 1. down_read_non_owner(&amn->lock)
>> >>> 2. loop to handle BOs from node->bos through
>> interval tree
>> >>> amn->object nodes
>> >>> gfx: wait for pending BOs fence operation
>> done, mark user
>> >>> pages dirty
>> >>> kfd: evict user queues of the process, wait
>> for queue
>> >>> unmap/map operation done
>> >>> 3. update CPU page tables
>> >>> 4. up_read(&amn->lock)
>> >>>
>> >>> New, switch step 3 and 4
>> >>> 1. down_read_non_owner(&amn->lock)
>> >>> 2. loop to handle BOs from node->bos through
>> interval tree
>> >>> amn->object nodes
>> >>> gfx: wait for pending BOs fence operation
>> done, mark user
>> >>> pages dirty
>> >>> kfd: evict user queues of the process, wait
>> for queue
>> >>> unmap/map operation done
>> >>> 3. up_read(&amn->lock)
>> >>> 4. update CPU page tables
>> >> The lock is there to make sure that we serialize page
>> table updates
>> >> with command submission.
>> > As I understand it, the idea is to prevent command
>> submission (adding
>> > new fences to BOs) while a page table invalidation is
>> in progress.
>>
>> Yes, exactly.
>>
>> > But do we really need another lock for this? Wouldn't the
>> > re-validation of userptr BOs (currently calling
>> get_user_pages) force
>> > synchronization with the ongoing page table
>> invalidation through the
>> > mmap_sem or other MM locks?
>>
>> No and yes. We don't hold any other locks while doing
>> command submission, but I expect that HMM has its own
>> mechanism to prevent that.
>>
>> Since we don't modify amdgpu_mn_lock()/amdgpu_mn_unlock()
>> we are certainly not using this mechanism correctly.
>>
>> Regards,
>> Christian.
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> <mailto:amd-gfx at lists.freedesktop.org>
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180927/eec56798/attachment-0001.html>
More information about the amd-gfx
mailing list