Plan: BO move throttling for visible VRAM evictions

Marek Olšák maraeo at gmail.com
Tue Mar 28 10:47:35 UTC 2017


On Mar 28, 2017 10:41 AM, "Christian König" <deathsimple at vodafone.de> wrote:

On 28.03.2017 at 10:35, Michel Dänzer wrote:

> On 28/03/17 05:29 PM, Christian König wrote:
>
>> On 28.03.2017 at 08:00, Michel Dänzer wrote:
>>
>>> On 28/03/17 12:50 PM, zhoucm1 wrote:
>>>
>>>> On 2017-03-28 10:40, Michel Dänzer wrote:
>>>>
>>>>> On 27/03/17 04:53 PM, Zhou, David(ChunMing) wrote:
>>>>>
>>>>>> For the APU special case, can we prevent evictions from happening
>>>>>> between VRAM <----> GTT?
>>>>>>
>>>>> We can, if we can close the performance gap between VRAM and GTT. We
>>>>> measured a gap of around 30% a while ago, though right now I'm only
>>>>> measuring ~5%, but the test system has slower RAM now (still dual
>>>>> channel though).
>>>>>
>>>> My impression is that VRAM and GTT don't differ much in the APU case.
>>>> If I'm wrong, please correct me.
>>>>
>>> The Mesa patch below makes radeonsi use mostly GTT instead of mostly
>>> VRAM, and slows down Unigine Valley by about 5% on my desktop Kaveri.
>>> You can try it for yourself.
>>>
>> In addition to that, you still need the stolen VRAM on APUs for page
>> tables and DCE.
>>
>> So we need to keep eviction from VRAM to GTT enabled, but what we
>> don't do is swap them back in, because Marek added the GTT flag on
>> APUs as an extra domain to look into.
>>
> As long as there's a performance gap between VRAM and GTT, this means
> that performance of long-running apps (e.g. Xorg or the compositor) will
> degrade over time, or after e.g. a suspend-resume cycle.
>
> OTOH, if we can close the gap, we can stop trying to put most BOs in
> VRAM in the first place with APUs.
>

Yeah, John and I are already working on this (but mostly for GFX9).

The difference is that VRAM allocations are mostly contiguous, while GTT
allocations are scattered. So you get more TLB pressure with GTT.


Another aspect is that GART has smaller pages, so the translation cache has
to fetch more of the page directory. The cache is also finite, meaning
that it can be thrashed more easily with small pages.
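
To put rough numbers on the page-size argument, here is a back-of-the-envelope
sketch. All sizes are illustrative assumptions, not measured hardware values:
a 4 KiB GTT page, a hypothetical 64 KiB contiguous VRAM fragment covered by
one translation, a hypothetical 64-entry translation cache, and a 16 MiB
buffer. The helper names are made up for this example.

```python
def translations_needed(buffer_bytes: int, page_bytes: int) -> int:
    """Distinct page translations needed to touch every byte of a buffer."""
    return -(-buffer_bytes // page_bytes)  # ceiling division


def cache_reach(entries: int, page_bytes: int) -> int:
    """Bytes of address space a fully populated translation cache covers."""
    return entries * page_bytes


KIB = 1024
MIB = 1024 * KIB

# Illustrative assumptions only; real values depend on the ASIC and driver.
BUFFER = 16 * MIB         # hypothetical buffer object size
GTT_PAGE = 4 * KIB        # assumed small system-memory page
VRAM_FRAGMENT = 64 * KIB  # assumed contiguous VRAM range per translation
CACHE_ENTRIES = 64        # hypothetical translation cache size

gtt_needed = translations_needed(BUFFER, GTT_PAGE)
vram_needed = translations_needed(BUFFER, VRAM_FRAGMENT)
gtt_reach = cache_reach(CACHE_ENTRIES, GTT_PAGE)
vram_reach = cache_reach(CACHE_ENTRIES, VRAM_FRAGMENT)

print(f"GTT:  {gtt_needed} translations, cache reach {gtt_reach // KIB} KiB")
print(f"VRAM: {vram_needed} translations, cache reach {vram_reach // KIB} KiB")
```

Under these assumptions the same buffer needs 16 times as many translations
in GTT, and a full cache covers 16 times less address space, so a linear walk
over the buffer evicts entries far sooner.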

Marek



Christian.