<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
On 06.11.23 at 15:11, Danilo Krummrich wrote:<br>
<blockquote type="cite" cite="mid:ZUj0DdYZUgjhcvf5@pollux">
<pre class="moz-quote-pre" wrap="">On Mon, Nov 06, 2023 at 02:05:13PM +0100, Christian König wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">On 06.11.23 at 13:16, Danilo Krummrich wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">[SNIP]
This reference count just prevents the VM from being freed as long as other
resources are attached to it that carry a VM pointer, such as mappings and
VM_BOs. The motivation for that is VM_BOs. For mappings it's indeed a bit
paranoid, but it doesn't hurt either and keeps it consistent.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Ah! Yeah, we have similar semantics in amdgpu as well.
But we keep the reference to the root GEM object and not the VM.
Ok, that makes much more sense than keeping one reference for each mapping.
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Because of this the mapping should *never* have a reference to the VM, but
rather the VM destroys all mappings when it is destroyed itself.
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Hence, if the VM is still alive at a point where you don't expect it to
be, then it's simply a driver bug.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">Driver bugs are exactly what I am trying to prevent here. When individual
mappings keep the VM structure alive, drivers are responsible for cleaning them
up; if the VM cleans up after itself, we don't need to worry about it in the
driver.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">Drivers are *always* responsible for that. This has nothing to do with whether
the VM is reference counted or not. GPUVM can't clean up mappings after itself.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Why not?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
I feel like we're talking past each other here, at least to some extent.
However, I can't yet see where exactly the misunderstanding resides.</pre>
</blockquote>
<br>
+1<br>
<br>
<span style="white-space: pre-wrap">
</span>
<blockquote type="cite" cite="mid:ZUj0DdYZUgjhcvf5@pollux">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">At least in amdgpu we have it exactly like that. E.g. the higher level can
clean up the BO_VM structure at any time, even when there are
mappings.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
What do you mean by "clean up the VM_BO structure" exactly?
The VM_BO structure keeps track of all the mappings mapped in the VM_BO's VM
being backed by the VM_BO's GEM object. And the GEM object keeps a list of
the corresponding VM_BOs.
Hence, as long as there are mappings that this VM_BO keeps track of, this VM_BO
should stay alive.</pre>
</blockquote>
<br>
No, exactly the other way around. When the VM_BO structure is
destroyed, the mappings are destroyed with it.<br>
<br>
Otherwise you would need to destroy each individual mapping
separately before teardown, which is quite inefficient.<br>
<br>
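The ownership model described here can be sketched in plain C. This is only an illustration under assumptions: every `sketch_*` name is hypothetical and is neither the drm_gpuvm nor the amdgpu API, and page table invalidation and locking are left out entirely. The VM_BO holds the single VM reference, and destroying it releases all of its mappings in one pass.<br>
<br>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types; not the real drm_gpuvm or amdgpu structures. */
struct sketch_vm {
    unsigned int refcount;          /* creator's reference plus one per VM_BO */
};

struct sketch_mapping {
    struct sketch_mapping *next;
    unsigned long start;
    unsigned long size;
};

struct sketch_vm_bo {
    struct sketch_vm *vm;            /* the only VM reference taken for this BO */
    struct sketch_mapping *mappings; /* singly linked list owned by the VM_BO */
    unsigned int num_mappings;
};

/* Create a VM_BO; this is the only place that takes a VM reference. */
static struct sketch_vm_bo *sketch_vm_bo_create(struct sketch_vm *vm)
{
    struct sketch_vm_bo *vm_bo = calloc(1, sizeof(*vm_bo));

    if (!vm_bo)
        return NULL;
    vm->refcount++;
    vm_bo->vm = vm;
    return vm_bo;
}

/* Add a mapping; note that the VM reference count is not touched. */
static int sketch_vm_bo_map(struct sketch_vm_bo *vm_bo,
                            unsigned long start, unsigned long size)
{
    struct sketch_mapping *m = malloc(sizeof(*m));

    if (!m)
        return -1;
    m->start = start;
    m->size = size;
    m->next = vm_bo->mappings;
    vm_bo->mappings = m;
    vm_bo->num_mappings++;
    return 0;
}

/* Destroy the VM_BO together with all of its mappings.
 * Returns the number of mappings that were released. */
static unsigned int sketch_vm_bo_destroy(struct sketch_vm_bo *vm_bo)
{
    unsigned int freed = 0;

    while (vm_bo->mappings) {
        struct sketch_mapping *m = vm_bo->mappings;

        vm_bo->mappings = m->next;
        free(m);
        freed++;
    }
    assert(freed == vm_bo->num_mappings);
    vm_bo->vm->refcount--;           /* drop the single per-VM_BO reference */
    free(vm_bo);
    return freed;
}
```

In this sketch the VM reference count grows with the number of VM_BOs only, no matter how many mappings userspace creates.<br>
<br>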
<span style="white-space: pre-wrap">
</span>
<blockquote type="cite" cite="mid:ZUj0DdYZUgjhcvf5@pollux">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">The VM then keeps track of which areas still need to be invalidated
in the physical representation of the page tables.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
And the VM does that through its tree of mappings (struct drm_gpuva). Hence, if
the VM just removed those structures on cleanup by itself, you'd lose the
ability to clean up the page tables. Unless you track this separately, which
would make the whole tracking of GPUVM itself kinda pointless.</pre>
</blockquote>
<br>
But how do you then keep track of areas which are freed and need to
be updated so that nobody can access the underlying memory any more?<br>
<br>
<span style="white-space: pre-wrap">
</span>
<blockquote type="cite" cite="mid:ZUj0DdYZUgjhcvf5@pollux">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">I would expect that the generalized GPU VM handling would need something
similar. If we leave that to the driver then each driver would have to
implement that stuff on its own again.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Similar to what? What exactly do you think can be generalized here?</pre>
</blockquote>
<br>
Similar to how amdgpu works.<br>
<br>
From what I can see you are basically re-inventing everything we
already have in there and asking the same questions we stumbled over
years ago.<br>
<br>
<span style="white-space: pre-wrap">
</span>
<blockquote type="cite" cite="mid:ZUj0DdYZUgjhcvf5@pollux">
<blockquote type="cite">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">If the driver left mappings behind, GPUVM would just leak them without a
reference count. It doesn't know about the driver's surrounding structures, nor
does it know about attached resources such as PT(E)s.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
What do we mean by the word "mapping"? The BO_VM structure? Or each
individual mapping?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
An individual mapping represented by struct drm_gpuva.</pre>
</blockquote>
<br>
Yeah, then that certainly doesn't work. See below.<br>
<br>
<span style="white-space: pre-wrap">
</span>
<blockquote type="cite" cite="mid:ZUj0DdYZUgjhcvf5@pollux">
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">E.g. what we need to prevent is that the VM structure (or the root GEM object)
is released while VM_BOs are still around. That's what I totally agree on.
But each individual mapping is a different story. Userspace can create so
many of them that we probably could even overrun a 32bit counter quite
easily.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
REFCOUNT_MAX is specified as 0x7fff_ffff. I agree there can be a lot of
mappings, but (including the VM_BO references) more than 2,147,483,647 per VM?</pre>
</blockquote>
<br>
IIRC on amdgpu we can create something like 100k mappings per second
and each takes ~64 bytes.<br>
<br>
So you just need 128GiB of memory and approx 21,500 seconds (about six
hours) at that rate to let the kernel run into a refcount overrun.<br>
<br>
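As a rough sanity check on these figures, here is a small C sketch of the arithmetic. The helper names are made up for illustration, the limit is the 0x7fff_ffff value quoted above, and the 64 bytes per mapping and 100k mappings per second are the recalled estimates, not measured values.<br>
<br>

```c
#include <assert.h>
#include <stdint.h>

/* Limit quoted in the thread for REFCOUNT_MAX. */
#define REFCOUNT_MAX_QUOTED 0x7fffffffULL

/* Bytes of mapping structures needed to push the counter past the limit. */
static uint64_t bytes_to_saturate(uint64_t bytes_per_mapping)
{
    return (REFCOUNT_MAX_QUOTED + 1) * bytes_per_mapping;
}

/* Whole seconds needed to create that many mappings at a given rate. */
static uint64_t seconds_to_saturate(uint64_t mappings_per_second)
{
    assert(mappings_per_second > 0);
    return (REFCOUNT_MAX_QUOTED + 1) / mappings_per_second;
}
```

With 64 bytes per mapping, `bytes_to_saturate(64)` gives 2^37 bytes, i.e. 128 GiB. At 100,000 mappings per second, `seconds_to_saturate(100000)` gives 21474 seconds, which is roughly six hours.<br>
<br>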
The worst I've seen in a real world game was around 19k mappings,
but that doesn't mean that this here can't be exploited.<br>
<br>
What can be done is to keep one reference per VM_BO structure, but I
think per mapping is rather unrealistic.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<br>
<br>
</body>
</html>