<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 2016-11-23 02:32 PM, Jason Gunthorpe
wrote:<br>
</div>
<blockquote cite="mid:20161123193228.GC12146@obsidianresearch.com"
type="cite">
<pre wrap="">On Wed, Nov 23, 2016 at 02:14:40PM -0500, Serguei Sagalovitch wrote:
</pre>
<blockquote type="cite">
<pre wrap="">
On 2016-11-23 02:05 PM, Jason Gunthorpe wrote:
</pre>
</blockquote>
<pre wrap="">
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">As Bart says, it would be best to be combined with something like
Mellanox's ODP MRs, which allows a page to be evicted and then trigger
a CPU interrupt if a DMA is attempted so it can be brought back.
</pre>
</blockquote>
</blockquote>
<pre wrap="">
</pre>
<blockquote type="cite">
<pre wrap="">Please note that in the general case (including the MR one) we could have
a "page fault" from a different PCIe device, so all PCIe devices must
be synchronized.
</pre>
</blockquote>
<pre wrap="">
Standard RDMA MRs require pinned pages, the DMA address cannot change
while the MR exists (there is no hardware support for this at all), so
page faulting from any other device is out of the question while they
exist. This is the same requirement as typical simple driver DMA which
requires pages pinned until the simple device completes DMA.
ODP RDMA MRs do not require that; they just page fault like the CPU (or
really anything else) and the kernel has to make sense of concurrent page
faults from multiple sources.
The upshot is that GPU scenarios that rely on highly dynamic
virtual->physical translation cannot sanely be combined with standard
long-life RDMA MRs.</pre>
</blockquote>
We do not want highly dynamic translation because of the performance
cost. <br>
We need to support "overcommit" but would like to minimize its impact.<br>
<br>
To support RDMA MRs for GPU/VRAM/PCIe device memory (which is a must)
<br>
we need to either globally force pinning for the scope of <br>
get_user_pages() / put_page(), or have special handling for RDMA MRs
and <br>
similar cases. Generally it could be difficult to correctly handle
"DMA in progress"<br>
because (a) DMA could originate from numerous PCIe devices<br>
simultaneously, including requests to receive network data, and (b) in
the HSA case DMA could<br>
originate from user space without the kernel driver's knowledge. <br>
So without corresponding h/w support everywhere I do not see how it
could<br>
be solved effectively.<br>
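<br>
For clarity, the "pin for DMA" lifecycle under discussion looks roughly
like the sketch below (based on the ~4.9-era kernel API; an illustrative
fragment, not complete or compilable driver code):<br>

```c
/* Sketch only: the classic pin-for-DMA pattern that standard
 * (non-ODP) MR registration relies on.  Error handling omitted. */
down_read(&current->mm->mmap_sem);
npinned = get_user_pages(user_addr, nr_pages, FOLL_WRITE, pages, NULL);
up_read(&current->mm->mmap_sem);

dma_addr = dma_map_page(dev, pages[0], 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
/* ... the device DMAs to/from dma_addr; the translation must not
 * change for as long as the MR (or any in-flight DMA) exists ... */
dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
put_page(pages[0]);
```

Anything VRAM-backed would need an equivalent "pin" operation in the
GPU driver, which is exactly what conflicts with overcommit.<br>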
<blockquote cite="mid:20161123193228.GC12146@obsidianresearch.com"
type="cite">
<pre wrap="">
Certainly, any solution for GPUs must follow the typical page pinning
semantics, changing the DMA address of a page must be blocked while
any DMA is in progress.
</pre>
</blockquote>
<blockquote cite="mid:20161123193228.GC12146@obsidianresearch.com"
type="cite">
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">Does HMM solve the peer-peer problem? Does it do it generically or
only for drivers that are mirroring translation tables?
</pre>
</blockquote>
</blockquote>
<pre wrap="">
</pre>
<blockquote type="cite">
<pre wrap="">In its current form HMM doesn't solve the peer-to-peer problem. Currently
it allows "mirroring" of "malloc" memory on the GPU, which is not always
what is needed. Additionally there is a need to be able to share VRAM
allocations between different processes.
</pre>
</blockquote>
<pre wrap="">
Humm, so it can be removed from Alexander's list then :\</pre>
</blockquote>
HMM is very useful for some types of scenarios, and it could
significantly <br>
simplify (for performance) the implementation of some features, e.g.
OpenCL 2.0 SVM.<br>
<blockquote cite="mid:20161123193228.GC12146@obsidianresearch.com"
type="cite">
<pre wrap="">
As Dan suggested, maybe we need to do both. Some kind of fix for
get_user_pages() for smaller mappings (eg ZONE_DEVICE) and a mandatory
API conversion to get_user_dma_sg() for other cases?
Jason
</pre>
</blockquote>
<br>
<div class="moz-signature">
<address><font face="DejaVu Sans">Sincerely yours,<br>
Serguei Sagalovitch<br>
</font></address>
</div>
</body>
</html>