<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Christian,<br>
<br>
Yes, I agree. I am working on patch 2 to replace get_user_page
with HMM. One problem is in current gfx path, we check if
mmu_invalidation multiple times in amdgpu_cs_ioctl() path after
get_user_page(), amdgpu_cs_parser_bos(),
amdgpu_cs_list_validate(), and amdgpu_cs_submit(). For HMM,
hmm_vma_range_done() has to be called once and only once after
hmm_vma_get_pfns()/hmm_vma_fault(), so I will call
hmm_vma_range_done() inside amdgpu_cs_submit after holding the mn
lock. Is my understanding correct?<br>
<br>
Philip<br>
<br>
On 2018-10-02 11:05 AM, Christian König wrote:<br>
</div>
<blockquote type="cite"
cite="mid:09916f9a-3f5f-27ab-01e6-6d77303cf052@gmail.com">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<div class="moz-cite-prefix">Checking more code and documentation
and thinking about it over my vacation I think I have some new
conclusions here.<br>
<br>
Currently we are using get_user_pages() together with an MMU
notifier to guarantee coherent address space view, because
get_user_pages() works by grabbing a reference to the pages and
ignoring concurrent page table updates.<br>
<br>
But HMM uses a different approach by checking the address space
for modifications using hmm_vma_range_done() and re-trying when
the address space has changed.<br>
<br>
Now what you are trying to do is to change that into
get_user_pages() and HMM callback and this is what won't work.
We can either use get_user_pages() with MMU notifier or we can
use HMM for the work, but we can't mix and match.<br>
<br>
So my initial guess was correct that we just need to change both
sides of the implementation at the same time.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 28.09.2018 um 17:13 schrieb Koenig, Christian:<br>
</div>
<blockquote type="cite"
cite="mid:b8686e6b-0c3e-4feb-afbd-80397aac31a0@email.android.com">
<meta content="text/html; charset=Windows-1252">
<div dir="auto">No it definitely isn't.
<div dir="auto"><br>
</div>
<div dir="auto">We have literally worked month on this with
the core MM developers.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Making sure that we have a consistent page
array is absolutely vital for correct operation.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Please also check Jerome's presentation from
XDC it also perfectly explains why this approach won't work
correctly.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Christian.</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">Am 28.09.2018 17:07 schrieb "Yang,
Philip" <a class="moz-txt-link-rfc2396E"
href="mailto:Philip.Yang@amd.com" moz-do-not-send="true"><Philip.Yang@amd.com></a>:<br
type="attribution">
</div>
</div>
<div>
<div class="moz-cite-prefix">For B path, we take
mm->mmap_sem, then call hmm_vma_get_pfns() or
get_user_pages(). This is obvious.<br>
<br>
For A path, mmu notifier
mmu_notifier_invalidate_range_start()/mmu_notifier_invalidate_range_end()
is called in many places, and the calling path is quit
complicated inside mm, it's not obvious. I checked many of
the them, for example:<br>
<br>
do_munmap() <br>
down_write(&mm->mmap_sem)<br>
arch_unmap()<br>
mpx_notify_unmap()...<br>
zap_bt_entries_mapping()<br>
zap_page_range()<br>
up_write(&mm->mmap_sem)<br>
<br>
void zap_page_range(struct vm_area_struct *vma, unsigned
long start,<br>
unsigned long size)<br>
{<br>
struct mm_struct *mm = vma->vm_mm;<br>
struct mmu_gather tlb;<br>
unsigned long end = start + size;<br>
<br>
lru_add_drain();<br>
tlb_gather_mmu(&tlb, mm, start, end);<br>
update_hiwater_rss(mm);<br>
mmu_notifier_invalidate_range_start(mm, start, end);<br>
for ( ; vma && vma->vm_start < end; vma =
vma->vm_next)<br>
unmap_single_vma(&tlb, vma, start, end, NULL);<br>
mmu_notifier_invalidate_range_end(mm, start, end);<br>
tlb_finish_mmu(&tlb, start, end);<br>
}<br>
<br>
So AFAIK it's okay without invalidate_range_end() callback.
<br>
<br>
Regards,<br>
Philip<br>
<br>
On 2018-09-28 01:25 AM, Koenig, Christian wrote:<br>
</div>
<blockquote type="cite">
<meta content="text/html; charset=utf-8">
<div dir="auto">No, that is incorrect as well :)
<div dir="auto"><br>
</div>
<div dir="auto">The mmap_sem isn't necessary taken during
page table updates.</div>
<div dir="auto"><br>
</div>
<div dir="auto">What you could do is replace
get_user_pages() directly with HMM. If I'm not
completely mistaken that should work as expected.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Christian.</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">Am 27.09.2018 22:18 schrieb
"Yang, Philip" <a class="moz-txt-link-rfc2396E"
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true"> <Philip.Yang@amd.com></a>:<br
type="attribution">
</div>
</div>
<div>
<div class="moz-cite-prefix">I was trying to understand
the way how HMM handle this concurrent issue and how we
handle it in amdgpu_ttm_tt_userptr_needs_pages() and
amdgpu_ttm_tt_affect_userptr(). HMM uses range->valid
flag, we use gtt->mmu_invalidations and
gtt->last_set_pages. Both use the same lock plus flag
idea actually.<br>
<br>
Thanks for the information, now I understand fence
ttm_eu_fence_buffer_objects() put to BOs will block CPU
page table update. This is another side of this
concurrent issue I didn't know.<br>
<br>
I had same worry that it has issue without
invalidate_range_end() callback as the calling sequence
Felix lists. Now I think it's fine after taking a look
again today because of mm->mmap_sem usage, this is my
understanding:<br>
<br>
A path:<br>
<br>
down_write(&mm->mmap_sem);<br>
mmu_notifier_invalidate_range_start()<br>
take_lock()<br>
release_lock()<br>
CPU page table update<br>
mmu_notifier_invalidate_range_end()<br>
up_write(&mm->mmap_sem);<br>
<br>
B path:<br>
<br>
again: <br>
down_read(&mm->mmap_sem);<br>
hmm_vma_get_pfns()<br>
up_read(&mm->mmap_sem);<br>
....<br>
....<br>
take_lock()<br>
if (!hmm_vma_range_done()) {<br>
release_lock()<br>
goto again<br>
}<br>
submit command job...<br>
release_lock()<br>
<br>
If you agree, I will submit patch v5 with some minor
changes, and submit another patch to replace
get_user_page() with HMM.<br>
<br>
Regards,<br>
Philip <br>
<br>
On 2018-09-27 11:36 AM, Christian König wrote:<br>
</div>
<blockquote type="cite">
<div class="moz-cite-prefix">Yeah, I've read that as
well.<br>
<br>
My best guess is that we just need to add a call to
hmm_vma_range_done() after taking the lock and also
replace get_user_pages() with hmm_vma_get_pfns().<br>
<br>
But I'm still not 100% sure how all of that is
supposed to work together.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 27.09.2018 um 16:50 schrieb Kuehling, Felix:<br>
</div>
<blockquote type="cite">
<meta name="Generator" content="Microsoft Word 15
(filtered medium)">
<style>
<!--
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:Calibri}
@font-face
{font-family:Consolas}
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black}
p.msonormal0, li.msonormal0, div.msonormal0
{margin-right:0in;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif}
span.PlainTextChar
{font-family:Consolas;
color:black}
p.msonormal00, li.msonormal00, div.msonormal00
{margin-right:0in;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif}
p.msonormal000, li.msonormal000, div.msonormal000
{margin-right:0in;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black}
p.emailquote, li.emailquote, div.emailquote
{margin-right:0in;
margin-left:1.0pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black}
p.msochpdefault, li.msochpdefault, div.msochpdefault
{margin-right:0in;
margin-left:0in;
font-size:10.0pt;
font-family:"Calibri",sans-serif}
span.plaintextchar0
{font-family:Consolas;
color:black}
span.plaintextchar00
{font-family:"Calibri",sans-serif}
span.emailstyle22
{font-family:"Calibri",sans-serif;
color:windowtext}
span.emailstyle23
{font-family:"Calibri",sans-serif;
color:windowtext}
span.emailstyle24
{font-family:"Calibri",sans-serif;
color:windowtext}
span.emailstyle25
{font-family:"Calibri",sans-serif;
color:windowtext}
span.emailstyle30
{font-family:"Calibri",sans-serif;
color:windowtext}
span.EmailStyle33
{font-family:"Calibri",sans-serif;
color:windowtext}
.MsoChpDefault
{font-size:10.0pt}
@page WordSection1
{margin:1.0in 1.0in 1.0in 1.0in}
ol
{margin-bottom:0in}
ul
{margin-bottom:0in}
-->
</style>
<div class="WordSection1">
<p class="MsoNormal"><span style="color:windowtext">I
think the answer is here: <a
href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216"
moz-do-not-send="true">
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216</a></span></p>
<p class="MsoNormal"><span style="color:windowtext"> </span></p>
<p class="MsoNormal"><span style="color:windowtext">Regards,</span></p>
<p class="MsoNormal"><span style="color:windowtext">
Felix</span></p>
<p class="MsoNormal"><span style="color:windowtext"> </span></p>
<div>
<div style="border:none; border-top:solid #E1E1E1
1.0pt; padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="color:windowtext">From:</span></b><span
style="color:windowtext"> Koenig, Christian
<br>
<b>Sent:</b> Thursday, September 27, 2018
10:30 AM<br>
<b>To:</b> Kuehling, Felix <a
class="moz-txt-link-rfc2396E"
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">
<Felix.Kuehling@amd.com></a><br>
<b>Cc:</b> <a
class="moz-txt-link-abbreviated"
href="mailto:j.glisse@gmail.com"
moz-do-not-send="true">j.glisse@gmail.com</a>;
Yang, Philip <a
class="moz-txt-link-rfc2396E"
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true"><Philip.Yang@amd.com></a>;
<a class="moz-txt-link-abbreviated"
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
<b>Subject:</b> RE: [PATCH] drm/amdgpu: use
HMM mirror callback to replace mmu notifier
v4</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal"><span
style="color:windowtext">At least with
get_user_pages() that is perfectly possible. </span></p>
<div>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext">For HMM it could be
that this is prevented somehow.</span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext">Christian.</span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<div>
<p class="MsoNormal"><span
style="color:windowtext">Am 27.09.2018 16:27
schrieb "Kuehling, Felix" <<a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</span></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span
style="color:windowtext">> In this case
you can end up accessing pages which are
invalidated while get_user_pages is in
process.</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<p class="MsoNormal"><span
style="color:windowtext">What’s the sequence
of events you have in mind? Something like
this?</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<ul style="margin-top:0in" type="disc">
<li class="MsoListParagraph"
style="color:windowtext; margin-left:0in">Page
table is updated and triggers MMU notifier </li>
<li class="MsoListParagraph"
style="color:windowtext; margin-left:0in">amdgpu
MMU notifier runs and waits for pending CS
to finish while holding the read lock </li>
<li class="MsoListParagraph"
style="color:windowtext; margin-left:0in">New
CS starts just after invalidate_range_start
MMU notifier finishes but before the page
table update is done </li>
<li class="MsoListParagraph"
style="color:windowtext; margin-left:0in">get_user_pages
returns outdated physical addresses </li>
</ul>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<p class="MsoNormal"><span
style="color:windowtext">I hope that’s not
actually possible and that get_user_pages or
hmm_vma_fault would block until the page
table update is done. That is,
invalidate_range_start marks the start of a
page table update, and while that update is
in progress, get_user_pages or hmm_vma_fault
block. Jerome, can you comment on that?</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<p class="MsoNormal"><span
style="color:windowtext">Thanks,<br>
Felix</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<div>
<div style="border:none; border-top:solid
#E1E1E1 1.0pt; padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="color:windowtext">From:</span></b><span
style="color:windowtext"> Koenig,
Christian <br>
<b>Sent:</b> Thursday, September 27,
2018 9:59 AM<br>
<b>To:</b> Kuehling, Felix <<a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
<b>Cc:</b> Yang, Philip <<a
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true">Philip.Yang@amd.com</a>>;
<a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
Jerome Glisse <<a
href="mailto:j.glisse@gmail.com"
moz-do-not-send="true">j.glisse@gmail.com</a>><br>
<b>Subject:</b> RE: [PATCH] drm/amdgpu:
use HMM mirror callback to replace mmu
notifier v4</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal"><span
style="color:windowtext">Yeah I understand
that, but again that won't work. </span></p>
<div>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext">In this case
you can end up accessing pages which are
invalidated while get_user_pages is in
process.</span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext">Christian.</span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<div>
<p class="MsoNormal"><span
style="color:windowtext">Am 27.09.2018
15:41 schrieb "Kuehling, Felix" <<a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</span></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span
style="color:windowtext">> I’m not
planning to change that. I don’t think
there is any need to change it.</span></p>
<p class="MsoNormal">><br>
> Yeah, but when HMM doesn't provide
both the start and the end hock of the
invalidation this way won't work any more.<br>
><br>
> So we need to find a solution for
this,<br>
> Christian.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">My whole argument is
that you don’t need to hold the read lock
until the invalidate_range_end. Just
read_lock and read_unlock in the
invalidate_range_start function.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">Regards,</p>
<p class="MsoNormal"
style="margin-bottom:12.0pt"> Felix</p>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<div>
<div style="border:none; border-top:solid
#E1E1E1 1.0pt; padding:3.0pt 0in 0in
0in">
<p class="MsoNormal"><b><span
style="color:windowtext">From:</span></b><span
style="color:windowtext"> Koenig,
Christian <br>
<b>Sent:</b> Thursday, September 27,
2018 9:22 AM<br>
<b>To:</b> Kuehling, Felix <<a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
<b>Cc:</b> Yang, Philip <<a
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true">Philip.Yang@amd.com</a>>;
<a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
Jerome Glisse <<a
href="mailto:j.glisse@gmail.com"
moz-do-not-send="true">j.glisse@gmail.com</a>><br>
<b>Subject:</b> Re: [PATCH]
drm/amdgpu: use HMM mirror callback
to replace mmu notifier v4</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal">Am 27.09.2018 um
15:18 schrieb Kuehling, Felix:</p>
</div>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal">> The problem is
here:<br>
></p>
<p class="MsoNormal">>
ttm_eu_fence_buffer_objects(&p->ticket,
&p->validated, p->fence);</p>
<p class="MsoNormal">>
amdgpu_mn_unlock(p->mn); </p>
<p class="MsoNormal"
style="margin-bottom:12.0pt">><br>
> We need to hold the lock until the
fence is added to the reservation
object.<br>
><br>
> Otherwise somebody could have
changed the page tables just in the
moment between the check of
amdgpu_ttm_tt_userptr_needs_pages() and
adding the fence to the reservation
object.</p>
<p class="MsoNormal"><span
style="color:windowtext">I’m not
planning to change that. I don’t think
there is any need to change it.</span></p>
</blockquote>
<p class="MsoNormal"
style="margin-bottom:12.0pt"><br>
Yeah, but when HMM doesn't provide both
the start and the end hock of the
invalidation this way won't work any more.<br>
<br>
So we need to find a solution for this,<br>
Christian.</p>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<p class="MsoNormal"><span
style="color:windowtext">Regards,</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> Felix</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<div>
<div style="border:none;
border-top:solid #E1E1E1 1.0pt;
padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="color:windowtext">From:</span></b><span
style="color:windowtext"> Koenig,
Christian <br>
<b>Sent:</b> Thursday, September
27, 2018 7:24 AM<br>
<b>To:</b> Kuehling, Felix <a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
<b>Cc:</b> Yang, Philip <a
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true"><Philip.Yang@amd.com></a>;
<a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
Jerome Glisse <a
href="mailto:j.glisse@gmail.com"
moz-do-not-send="true"><j.glisse@gmail.com></a><br>
<b>Subject:</b> Re: [PATCH]
drm/amdgpu: use HMM mirror
callback to replace mmu notifier
v4</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal">Am 27.09.2018 um
13:08 schrieb Kuehling, Felix:</p>
</div>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal">> We double
check that there wasn't any page table
modification while we prepared the
submission and restart the whole
process when there actually was some
update.<br>
><br>
> The reason why we need to do this
is here:<br>
></p>
<p class="MsoNormal">>
ttm_eu_fence_buffer_objects(&p->ticket,
&p->validated, p->fence);<br>
>
amdgpu_mn_unlock(p->mn);</p>
<p class="MsoNormal"
style="margin-bottom:12.0pt">><br>
> Only after the new fence is added
to the buffer object we can release
the lock so that any invalidation will
now block on our command submission to
finish before it modifies the page
table.<br>
<br>
</p>
<p class="MsoNormal"><span
style="color:windowtext">I don’t see
why this requires holding the
read-lock until
invalidate_range_end.
amdgpu_ttm_tt_affect_userptr gets
called while the mn read-lock is
held in invalidate_range_start
notifier.</span></p>
</blockquote>
<p class="MsoNormal"
style="margin-bottom:12.0pt"><br>
That's not related to
amdgpu_ttm_tt_affect_userptr(), this
function could actually be called
outside the lock.<br>
<br>
The problem is here:</p>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal">ttm_eu_fence_buffer_objects(&p->ticket,
&p->validated, p->fence);</p>
</blockquote>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal">amdgpu_mn_unlock(p->mn);
</p>
</blockquote>
<p class="MsoNormal"
style="margin-bottom:12.0pt"><br>
We need to hold the lock until the fence
is added to the reservation object.<br>
<br>
Otherwise somebody could have changed
the page tables just in the moment
between the check of
amdgpu_ttm_tt_userptr_needs_pages() and
adding the fence to the reservation
object.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<br>
</p>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<p class="MsoNormal"><span
style="color:windowtext">Regards,</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> Felix</span></p>
<p class="MsoNormal"><span
style="color:windowtext"> </span></p>
<div>
<div style="border:none;
border-top:solid #E1E1E1 1.0pt;
padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="color:windowtext">From:</span></b><span
style="color:windowtext">
Koenig, Christian <br>
<b>Sent:</b> Thursday, September
27, 2018 5:27 AM<br>
<b>To:</b> Kuehling, Felix <a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
<b>Cc:</b> Yang, Philip <a
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true"><Philip.Yang@amd.com></a>;
<a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
Jerome Glisse <a
href="mailto:j.glisse@gmail.com"
moz-do-not-send="true"><j.glisse@gmail.com></a><br>
<b>Subject:</b> Re: [PATCH]
drm/amdgpu: use HMM mirror
callback to replace mmu notifier
v4</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal"
style="margin-bottom:12.0pt">That is
correct, but take a look what we do
when after calling the
amdgpu_mn_read_lock():<br>
<br>
<br>
</p>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal"> /* No
memory allocation is allowed while
holding the mn lock */<br>
amdgpu_mn_lock(p->mn);<br>
amdgpu_bo_list_for_each_userptr_entry(e,
p->bo_list) {<br>
struct amdgpu_bo
*bo =
ttm_to_amdgpu_bo(e->tv.bo);<br>
<br>
if
(amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm))
{<br>
r =
-ERESTARTSYS;<br>
goto
error_abort;<br>
}<br>
}</p>
</blockquote>
<p class="MsoNormal"
style="margin-bottom:12.0pt"><br>
We double check that there wasn't
any page table modification while we
prepared the submission and restart
the whole process when there
actually was some update.<br>
<br>
The reason why we need to do this is
here:<br>
<br>
</p>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal">
ttm_eu_fence_buffer_objects(&p->ticket,
&p->validated,
p->fence);<br>
amdgpu_mn_unlock(p->mn);</p>
</blockquote>
<p class="MsoNormal"><br>
Only after the new fence is added to
the buffer object we can release the
lock so that any invalidation will
now block on our command submission
to finish before it modifies the
page table.<br>
<br>
The only other option would be to
add the fence first and then check
if there was any update to the page
tables.<br>
<br>
The issue with that approach is that
adding a fence can't be made undone,
so if we find that there actually
was an update to the page tables we
would need to somehow turn the CS
into a dummy (e.g. overwrite all IBs
with NOPs or something like that)
and still submit it.<br>
<br>
Not sure if that is actually
possible.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 27.09.2018 um 10:47 schrieb
Kuehling, Felix:</p>
</div>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<p class="MsoNormal">So back to my
previous question:</p>
<p class="MsoPlainText"> </p>
<p class="MsoPlainText">>> But
do we really need another lock for
this? Wouldn't the </p>
<p class="MsoPlainText">>>
re-validation of userptr BOs
(currently calling get_user_pages)
force </p>
<p class="MsoPlainText">>>
synchronization with the ongoing
page table invalidation through the
</p>
<p class="MsoPlainText">>>
mmap_sem or other MM locks?</p>
<p class="MsoPlainText">> </p>
<p class="MsoPlainText">> No and
yes. We don't hold any other locks
while doing command submission, but
I expect that HMM has its own
mechanism to prevent that.</p>
<p class="MsoPlainText">> </p>
<p class="MsoPlainText">> Since we
don't modify
amdgpu_mn_lock()/amdgpu_mn_unlock()
we are certainly not using this
mechanism correctly.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">The existing
amdgpu_mn_lock/unlock should block
the MMU notifier while a command
submission is in progress. It should
also block command submission while
an MMU notifier is in progress.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">What we lose with
HMM is the ability to hold a
read-lock for the entire duration of
the invalidate_range_start until
invalidate_range_end. As I
understand it, that lock is meant to
prevent new command submissions
while the page tables are being
updated by the kernel. But my point
is, that get_user_pages or
hmm_vma_fault should do the same
kind of thing. Before the command
submission can go ahead, it needs to
update the userptr addresses. If the
page tables are still being updated,
it will block there even without
holding the amdgpu_mn_read_lock.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">Regards,</p>
<p class="MsoNormal"> Felix</p>
<p class="MsoNormal"> </p>
<div>
<div style="border:none;
border-top:solid #E1E1E1 1.0pt;
padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b>
Koenig, Christian <br>
<b>Sent:</b> Thursday, September
27, 2018 3:00 AM<br>
<b>To:</b> Kuehling, Felix <a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
<b>Cc:</b> Yang, Philip <a
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true"><Philip.Yang@amd.com></a>;
<a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
Jerome Glisse <a
href="mailto:j.glisse@gmail.com"
moz-do-not-send="true"><j.glisse@gmail.com></a><br>
<b>Subject:</b> RE: [PATCH]
drm/amdgpu: use HMM mirror
callback to replace mmu notifier
v4</p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<div>
<p class="MsoNormal">No, that
won't work. We would still run
into lock inversion problems.</p>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">What we
could do with the scheduler is
to turn submissions into
dummies if we find that the
page tables are now outdated.</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">But that
would be really hacky and I'm
not sure if that would really
work in all cases.</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">Christian.</p>
</div>
</div>
<div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal">Am
27.09.2018 08:53 schrieb
"Kuehling, Felix" <<a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</p>
</div>
</div>
</div>
<div>
<p class="MsoNormal">I had a chat
with Jerome yesterday. He pointed
out that the new blockable
parameter can be used to infer
whether the MMU notifier is being
called in a reclaim operation. So
if blockable==true, it should even
be safe to take the BO reservation
lock without problems. I think
with that we should be able to
remove the read-write locking
completely and go back to locking
(or try-locking for
blockable==false) the reservation
locks in the MMU notifier?<br>
<br>
Regards,<br>
Felix<br>
<br>
-----Original Message-----<br>
From: amd-gfx <<a
href="mailto:amd-gfx-bounces@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx-bounces@lists.freedesktop.org</a>>
On Behalf Of Christian König<br>
Sent: Saturday, September 15, 2018
3:47 AM<br>
To: Kuehling, Felix <<a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
Yang, Philip <<a
href="mailto:Philip.Yang@amd.com"
moz-do-not-send="true">Philip.Yang@amd.com</a>>;
<a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
Jerome Glisse <<a
href="mailto:j.glisse@gmail.com"
moz-do-not-send="true">j.glisse@gmail.com</a>><br>
Subject: Re: [PATCH] drm/amdgpu:
use HMM mirror callback to replace
mmu notifier v4<br>
<br>
Am 14.09.2018 um 22:21 schrieb
Felix Kuehling:<br>
> On 2018-09-14 01:52 PM,
Christian König wrote:<br>
>> Am 14.09.2018 um 19:47
schrieb Philip Yang:<br>
>>> On 2018-09-14 03:51
AM, Christian König wrote:<br>
>>>> Am 13.09.2018 um
23:51 schrieb Felix Kuehling:<br>
>>>>> On 2018-09-13
04:52 PM, Philip Yang wrote:<br>
>>>>> [SNIP]<br>
>>>>>> +
amdgpu_mn_read_unlock(amn);<br>
>>>>>> +<br>
>>>>>
amdgpu_mn_read_lock/unlock support
recursive locking for multiple <br>
>>>>> overlapping
or nested invalidation ranges. But
if you'r locking <br>
>>>>> and unlocking
in the same function. Is that
still a concern?<br>
>>> I don't understand
the possible recursive case, but<br>
>>> amdgpu_mn_read_lock()
still support recursive locking.<br>
>>>> Well the real
problem is that unlocking them
here won't work.<br>
>>>><br>
>>>> We need to hold
the lock until we are sure that
the operation which <br>
>>>> updates the page
tables is completed.<br>
>>>><br>
>>> The reason for this
change is because hmm mirror has <br>
>>> invalidate_start
callback, no invalidate_end
callback<br>
>>><br>
>>> Check mmu_notifier.c
and hmm.c again, below is entire
logic to <br>
>>> update CPU page
tables and callback:<br>
>>><br>
>>> mn lock amn->lock
is used to protect interval tree
access because <br>
>>> user may
submit/register new userptr
anytime.<br>
>>> This is same for old
and new way.<br>
>>><br>
>>> step 2 guarantee the
GPU operation is done before
updating CPU page <br>
>>> table.<br>
>>><br>
>>> So I think the change
is safe. We don't need hold mn
lock until the <br>
>>> CPU page tables
update is completed.<br>
>> No, that isn't even
remotely correct. The lock doesn't
protects the <br>
>> interval tree.<br>
>><br>
>>> Old:<br>
>>> 1.
down_read_non_owner(&amn->lock)<br>
>>> 2. loop to handle
BOs from node->bos through
interval tree<br>
>>> amn->object nodes<br>
>>> gfx: wait for
pending BOs fence operation done,
mark user <br>
>>> pages dirty<br>
>>> kfd: evict
user queues of the process, wait
for queue <br>
>>> unmap/map operation
done<br>
>>> 3. update CPU
page tables<br>
>>> 4.
up_read(&amn->lock)<br>
>>><br>
>>> New, switch step 3
and 4<br>
>>> 1.
down_read_non_owner(&amn->lock)<br>
>>> 2. loop to handle
BOs from node->bos through
interval tree<br>
>>> amn->object nodes<br>
>>> gfx: wait for
pending BOs fence operation done,
mark user <br>
>>> pages dirty<br>
>>> kfd: evict
user queues of the process, wait
for queue <br>
>>> unmap/map operation
done<br>
>>> 3.
up_read(&amn->lock)<br>
>>> 4. update CPU
page tables<br>
>> The lock is there to make
sure that we serialize page table
updates <br>
>> with command submission.<br>
> As I understand it, the idea
is to prevent command submission
(adding <br>
> new fences to BOs) while a
page table invalidation is in
progress.<br>
<br>
Yes, exactly.<br>
<br>
> But do we really need another
lock for this? Wouldn't the <br>
> re-validation of userptr BOs
(currently calling get_user_pages)
force <br>
> synchronization with the
ongoing page table invalidation
through the <br>
> mmap_sem or other MM locks?<br>
<br>
No and yes. We don't hold any
other locks while doing command
submission, but I expect that HMM
has its own mechanism to prevent
that.<br>
<br>
Since we don't modify
amdgpu_mn_lock()/amdgpu_mn_unlock()
we are certainly not using this
mechanism correctly.<br>
<br>
Regards,<br>
Christian.<br>
_______________________________________________<br>
amd-gfx mailing list<br>
<a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
<a
href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a></p>
</div>
</blockquote>
<p class="MsoNormal"> </p>
</blockquote>
<p class="MsoNormal"> </p>
</blockquote>
<p class="MsoNormal"> </p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</blockquote>
<br>
</div>
</blockquote>
<br>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>