<div dir="auto"><div><br><div class="gmail_extra"><br><div class="gmail_quote">On May 18, 2017 10:17 AM, "Michel Dänzer" <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>> wrote:<br type="attribution"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="quoted-text">On 17/05/17 09:35 PM, Marek Olšák wrote:<br>
> On May 16, 2017 3:57 AM, "Michel Dänzer" <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a><br>
</div><div class="quoted-text">> <mailto:<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>>> wrote:<br>
> On 15/05/17 07:11 PM, Marek Olšák wrote:<br>
> > On May 15, 2017 4:29 AM, "Michel Dänzer" <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a><br>
> <mailto:<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>><br>
</div><div class="quoted-text">> > <mailto:<a href="mailto:michel@daenzer.net">michel@daenzer.net</a> <mailto:<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>>>> wrote:<br>
> ><br>
> > I think the next step should be to make radeonsi keep track of<br>
> how much<br>
> > VRAM it's trying to use that's expected to be accessed by the<br>
> CPU, and<br>
> > to use GTT instead when that exceeds a threshold (probably<br>
> derived from<br>
> > vram_vis_size).<br>
> ><br>
> > That's difficult to estimate. There are apps with 600MB of mapped VRAM<br>
> > and don't experience any performance issues. And some apps with<br>
> 300MB of<br>
> > mapped VRAM do. It only depends on the CPU access pattern, not what<br>
> > radeonsi sees.<br>
><br>
> What I mean is keeping track of the total size of resources which have<br>
> RADEON_DOMAIN_VRAM and RADEON_FLAG_CPU_ACCESS set, and if it exceeds a<br>
> threshold, create new ones having those flags in GTT instead. Even<br>
> though this might not be strictly necessary with amdgpu in the long run,<br>
> it probably is for radeon anyway, and in the short term it might help<br>
> even with amdgpu.<br>
><br>
><br>
> That might hurt us more than it can help.<br>
<br>
</div>You may be right, but I think I'll play with that idea a little anyway<br>
to see how it goes. :)<br>
<div class="quoted-text"><br>
> All mappable buffers have the CPU access flag set, but many of them are<br>
> immutable.<br>
<br>
</div>You mean they're only written to once by the CPU? We shouldn't set the<br>
RADEON_FLAG_CPU_ACCESS flag for BOs where we expect that, because it<br>
will currently prevent them from being in the CPU invisible part of VRAM.<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">The only thing I can do is set the CPU access flag for persistently mapped buffers only. We certainly want buffers to go to the invisible part of VRAM if there is no CPU access for a certain timeframe. So maybe we shouldn't set the flag at all. What do you thing?</div><div dir="auto"><br></div><div dir="auto">The truth is we have no way to know what apps intend to do with any buffers.</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="quoted-text"><br>
<br>
> The only place where this can be handled is the kernel.<br>
<br>
</div>Ideally, the placement of a BO should be determined based on how it's<br>
actually being used by the GPU vs CPU. But I'm not sure how to determine<br>
that in a useful way.<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">CPU page faults are the only way to determine that CPU access is happening.</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="quoted-text"><br>
> Even if it's as simple as: if (bo->numcpufaults > 10) domain = GTT_WC;<br>
<br>
</div>I'm skeptical about the number of CPU page faults per se being a useful<br>
metric. It doesn't tell us much about how the BO is used even by the<br>
CPU, let alone the GPU. But let's see where this leads you.<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">It tells us more than what Mesa can ever know, which is nothing.</div><div dir="auto"><br></div><div dir="auto">Marek</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
One thing that might help would be if we could swap individual memory<br>
nodes between visible and invisible VRAM for CPU page faults, instead of<br>
moving/evicting whole BOs. Christian, do you think something like that<br>
would be possible?<br>
<br>
<br>
Another idea (to avoid issues such as the recent one with Rocket League)<br>
was to make VRAM CPU mappings write-only, and move the BO to GTT if<br>
there's a read fault. But not sure if this is possible at all, or how<br>
much effort it would be.<br>
<div class="elided-text"><br>
<br>
--<br>
Earthling Michel Dänzer | <a href="http://www.amd.com" rel="noreferrer" target="_blank">http://www.amd.com</a><br>
Libre software enthusiast | Mesa and X developer<br>
<br>
</div></blockquote></div><br></div></div></div>