<div dir="auto"><div><br><div class="gmail_extra"><br><div class="gmail_quote">On May 15, 2017 4:29 AM, "Michel Dänzer" <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>> wrote:<br type="attribution"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="elided-text">On 14/05/17 06:31 AM, Marek Olšák wrote:<br>
> On Mon, Apr 17, 2017 at 11:55 AM, Michel Dänzer <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>> wrote:<br>
>> On 17/04/17 07:58 AM, Marek Olšák wrote:<br>
>>> On Fri, Apr 14, 2017 at 12:14 PM, Michel Dänzer <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>> wrote:<br>
>>>> On 04/04/17 05:11 AM, Marek Olšák wrote:<br>
>>>>> On Fri, Mar 31, 2017 at 5:24 AM, Michel Dänzer <<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>> wrote:<br>
>>>>>> On 30/03/17 07:03 PM, Michel Dänzer wrote:<br>
>>>>>>> On 25/03/17 01:33 AM, Marek Olšák wrote:<br>
>>>>>>>> Hi,<br>
>>>>>>>><br>
>>>>>>>> I'm sharing this idea here, because it's something that has been<br>
>>>>>>>> decreasing our performance a lot recently, for example:<br>
>>>>>>>> <a href="http://openbenchmarking.org/prospect/1703011-RI-RADEONDIR06/7b7668cfc109d1c3dc27e871c8aea71ca13f23fa" rel="noreferrer" target="_blank">http://openbenchmarking.org/<wbr>prospect/1703011-RI-<wbr>RADEONDIR06/<wbr>7b7668cfc109d1c3dc27e871c8aea7<wbr>1ca13f23fa</a><br>
>>>>>>><br>
>>>>>>> The attached proof-of-concept patch (on top of Christian's "CPU mapping<br>
>>>>>>> of split VRAM buffers" series, ported from radeon) results in 145.05 fps<br>
>>>>>>> on my Tonga.<br>
>>>>>><br>
>>>>>> I get the same result without my or Christian's patches though, with<br>
>>>>>> 4.11 based DRM or amd-staging-4.9. So I guess I just can't reproduce the<br>
>>>>>> problem with this test. Are there any other tests for it?<br>
>>>>><br>
>>>>> It's random. Sometimes the benchmark runs OK, other times it's slow.<br>
>>>>> You can easily see the difference by observing how smooth it is. The<br>
>>>>> visible VRAM evictions result in constant 100-200ms stalls but not<br>
>>>>> every frame, which feels like the frame rate is much lower than it<br>
>>>>> actually is.<br>
>>>>><br>
>>>>> Make sure your graphics details are maxed out. The best score I can<br>
>>>>> get with my rig is 70 fps. (Fiji & Core i5 3570)<br>
>>>><br>
>>>> I'm getting around 53-54 fps at Ultra with Tonga, both with Mesa 13.0.6<br>
>>>> and Git.<br>
>>>><br>
>>>> Have you tried if Christian's patches for CPU access to split VRAM<br>
>>>> buffers help? I can imagine that forcing contiguous VRAM buffers for CPU<br>
>>>> access could cause lots of other BOs to be unnecessarily evicted from<br>
>>>> VRAM, if at least one of their fragments happens to be in the CPU<br>
>>>> visible part of VRAM.<br>
>>><br>
>>> I've finally tested latest amd-staging-4.9 and I'm very pleased. For<br>
>>> the first time, the Deus Ex benchmark has almost no hiccups. I've<br>
>>> never seen it so smooth. At one point, the BO move rate increased<br>
>>> to 200MB/s, stayed there for a couple of seconds, and then it dropped<br>
>>> to 0 again. The frame rate was OK-ish, so I guess the moves didn't<br>
>>> happen all at once. I also tested DiRT Rally and I haven't been able<br>
>>> to reproduce the low FPS with the consistently-high BO move rate that<br>
>>> I saw several months ago.<br>
>>><br>
>>> We could do some move throttling there for sure, but it's much better<br>
>>> than it ever was.<br>
>><br>
>> That's great to hear. If you get a chance, it would be interesting to see<br>
>> whether the attached updated patch improves things even more for you. (The patch<br>
>> I attached previously couldn't work as intended, this one at least might :)<br>
><br>
> Frogging101 on IRC noticed that we get a ton of TTM BO moves due to<br>
> visible VRAM thrashing and Michel's patch doesn't help. His kernel is<br>
> up to date with amd-staging. It looks like the only option left is my<br>
> original plan: BO move throttling for visible VRAM by redirecting<br>
> mapped buffers to GTT and not allowing them to go back to VRAM if some<br>
> counter is too high.<br>
><br>
> Opinions?<br>
<br>
</div>I think the next step should be to make radeonsi keep track of how much<br>
VRAM it's trying to use that's expected to be accessed by the CPU, and<br>
to use GTT instead when that exceeds a threshold (probably derived from<br>
vram_vis_size).<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">That's difficult to estimate. Some apps have 600MB of mapped VRAM and don't experience any performance issues, while others with only 300MB of mapped VRAM do. It depends only on the CPU access pattern, not on what radeonsi sees.</div><div dir="auto"><br></div><div dir="auto">Marek</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="elided-text"><br>
<br>
--<br>
Earthling Michel Dänzer | <a href="http://www.amd.com" rel="noreferrer" target="_blank">http://www.amd.com</a><br>
Libre software enthusiast | Mesa and X developer<br>
</div></blockquote></div><br></div></div></div>