<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 2016-11-22 04:03 PM, Daniel Vetter
wrote:<br>
</div>
<blockquote
cite="mid:CAKMK7uF+k5LvcPEHvtdcXQFrpKVbFxwZ32EexoU3rZ9LFhVSow@mail.gmail.com"
type="cite">
<pre wrap="">On Tue, Nov 22, 2016 at 9:35 PM, Serguei Sagalovitch
<a class="moz-txt-link-rfc2396E" href="mailto:serguei.sagalovitch@amd.com">&lt;serguei.sagalovitch@amd.com&gt;</a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
On 2016-11-22 03:10 PM, Daniel Vetter wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
On Tue, Nov 22, 2016 at 9:01 PM, Dan Williams <a class="moz-txt-link-rfc2396E" href="mailto:dan.j.williams@intel.com">&lt;dan.j.williams@intel.com&gt;</a>
wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
On Tue, Nov 22, 2016 at 10:59 AM, Serguei Sagalovitch
<a class="moz-txt-link-rfc2396E" href="mailto:serguei.sagalovitch@amd.com">&lt;serguei.sagalovitch@amd.com&gt;</a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
I personally like the "device-DAX" idea, but my concerns are:
- How well will it co-exist with the DRM infrastructure /
implementations, in the parts dealing with CPU pointers?
</pre>
</blockquote>
<pre wrap="">
Inside the kernel a device-DAX range is "just memory" in the sense
that you can perform pfn_to_page() on it and issue I/O, but the vma is
not migratable. To be honest I do not know how well that co-exists
with drm infrastructure.
</pre>
<blockquote type="cite">
<pre wrap="">- How well will we be able to handle the case where we need to
"move"/"evict" memory/data to a new location, so that the CPU pointer
must point to the new physical location/address
(which may not be in PCI device memory at all)?
</pre>
</blockquote>
<pre wrap="">
So, device-DAX deliberately avoids support for in-kernel migration or
overcommit. Those cases are left to the core mm or drm. The device-dax
interface is for cases where all that is needed is a direct-mapping to
a statically-allocated physical-address range be it persistent memory
or some other special reserved memory range.
</pre>
</blockquote>
<pre wrap="">
For some of the fancy use-cases (e.g. to be comparable to what HMM can
pull off) I think we want all the magic in core mm, i.e. migration and
overcommit. At least that seems to be the very strong drive in all
general-purpose gpu abstractions and implementations, where memory is
allocated with malloc, and then mapped/moved into vram/gpu address
space through some magic,
</pre>
</blockquote>
<pre wrap="">
It could also be the other way around: memory is requested to be
allocated and should be kept in vram for performance reasons, but due
to a possible overcommit case we need, at least temporarily, to "move" such
an allocation to system memory.
</pre>
</blockquote>
<pre wrap="">
With migration I meant migrating both ways of course. And with stuff
like numactl we can also influence where exactly the malloc'ed memory
is allocated originally, at least if we'd expose the vram range as a
very special numa node that happens to be far away and not hold any
cpu cores.
-Daniel
</pre>
</blockquote>
One additional item to consider: this is not only the "plain" NUMA case,<br>
where access performance may differ. It is also possible that we will<br>
have no access at all (or write-only access), particularly if the PCIe<br>
devices belong to different root complexes. I must admit that I do not<br>
know how to reliably detect such cases in the kernel.<br>
</body>
</html>