[PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

Christian König deathsimple at vodafone.de
Wed May 24 11:27:27 UTC 2017


On 24.05.2017 at 13:03, Marek Olšák wrote:
> On Wed, May 24, 2017 at 9:56 AM, Michel Dänzer <michel at daenzer.net> wrote:
>> On 23/05/17 07:38 PM, Marek Olšák wrote:
>>> On Tue, May 23, 2017 at 2:45 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>>> On 22/05/17 07:09 PM, Marek Olšák wrote:
>>>>> On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer <michel at daenzer.net> wrote:
>>>>>> On 20/05/17 06:26 PM, Marek Olšák wrote:
>>>>>>> On May 20, 2017 3:26 AM, "Michel Dänzer" <michel at daenzer.net> wrote:
>>>>>>>
>>>>>>>      On 20/05/17 01:14 AM, Marek Olšák wrote:
>>>>>>>      > Hi Michel,
>>>>>>>      >
>>>>>>>      > I've applied your series
>>>>>>>
>>>>>>>      Thanks for testing it.
>>>>>>>
>>>>>>>      > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
>>>>>>>      > buffer moves at 800MB/s and many VRAM page faults.
>>>>>>>
>>>>>>>      Did you see this:
>>>>>>>
>>>>>>>      >> Note that there's only little if any improvement of the average
>>>>>>>      >> framerate reported, but the minimum framerate as seen on the HUD
>>>>>>>      >> goes from ~10 fps to ~17.
>>>>>>>
>>>>>>>      I.e. it mostly affects the minimum framerate and smoothness for me
>>>>>>>      as well.
>>>>>>>
>>>>>>>
>>>>>>> Without the series, I get 70 average fps. With the series, I get 30
>>>>>>> average fps. That might just be random bad luck. I don't know.
>>>>>> Hmm, yeah, maybe that was just one of the random slowdowns you've been
>>>>>> talking about in other threads and on IRC?
>>>>>>
>>>>>> I can't reproduce any slowdown with these patches, even leaving visible
>>>>>> VRAM size at 256 MB.
>>>>> The random slowdowns with Dirt Rally are only caused by the pressure
>>>>> on visible VRAM. This whole thread is about those random slowdowns.
>>>> No, this thread is about the scenario described in the cover letter of
>>>> this patch series.
>>>>
>>>>
>>>>> If you're saying "maybe it was just one of the random slowdowns", you're
>>>>> saying "maybe it was just the visible VRAM pressure". It's only
>>>>> random with Dirt Rally, which makes it difficult to believe statements
>>>>> such as "I can't reproduce any slowdown".
>>>> I could say the same thing about you seeing random slowdowns... I've
>>>> never seen that, I had to artificially limit the size of visible VRAM to
>>>> 64 MB to make it significantly affect the benchmark result.
>>>>
>>>> How many times do you need to run the benchmark on average to hit a
>>>> random slowdown? Which desktop environment and other X clients are
>>>> running during the benchmark? Which tab is active in the Steam window
>>>> while the benchmark runs?
>>>>
>>>> In my case, it's only xfwm4, xterm and steam on the Dirt Rally page in
>>>> the library.
>>> Ubuntu Unity, Steam small mode (there are no tabs), Ultra settings in
>>> Dirt Rally.
>>>
>>> Every single time I run the game with this series, I get 700-1000MB/s
>>> of TTM BO moves. There doesn't seem to be any randomness.
>>>
>>> It was better without this series. (meaning it was sometimes OK, sometimes bad)
>> Thanks for the additional details. I presume that in the bad case there
>> are some BOs lying around in visible VRAM (e.g. from Unity), which
>> causes some of Dirt Rally's BOs to go back and forth between GTT on CPU
>> page faults and VRAM on GPU usage.
>>
>> This means at least patch 2 goes out the window. I'll see if I can
>> salvage something out of patch 3.
> I think the final solution (done in fault_reserve_notify) should be:
> if (bo->num_cpu_page_faults++ > 20)
>     bo->preferred_domain = GTT_WC;

I more or less agree on that, but setting preferred_domain permanently 
to GTT_WC is what worries me a bit.

E.g. imagine you alt+tab from a game to your browser and back, and the
game now runs way slower because its BOs are never moved back to VRAM.

What we need is a global limit on the number of bytes transferred per
second for swap operations or something like that.
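
As a rough illustration of that idea, here is a minimal userspace sketch
in C of a global per-second byte budget. The struct and function names
are hypothetical and are not existing amdgpu or TTM interfaces; the
current time is passed in as milliseconds instead of being read from
jiffies, and the fixed one-second window is just one possible choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global budget for BO moves; not an existing TTM structure. */
struct move_budget {
	uint64_t window_start_ms;   /* start of the current 1 s window */
	uint64_t bytes_in_window;   /* bytes already moved in this window */
	uint64_t max_bytes_per_sec; /* assumed tunable limit */
};

/*
 * Try to account for a move of 'size' bytes at time 'now_ms'.
 * Returns true if the move fits into the budget of the current window,
 * false if the caller should leave the BO where it is for now.
 */
static bool move_budget_try_charge(struct move_budget *b,
				   uint64_t now_ms, uint64_t size)
{
	/* Start a fresh window once a full second has passed. */
	if (now_ms - b->window_start_ms >= 1000) {
		b->window_start_ms = now_ms;
		b->bytes_in_window = 0;
	}

	/* bytes_in_window never exceeds the limit, so this cannot underflow. */
	if (size > b->max_bytes_per_sec - b->bytes_in_window)
		return false;

	b->bytes_in_window += size;
	return true;
}
```

A caller would check move_budget_try_charge() before migrating a BO and
skip the migration when it returns false, so total move traffic stays
bounded no matter how many BOs are bouncing.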

Or maybe a timeout: once a BO has been moved (either by swapping it out
or by a CPU page fault), only move it back after n jiffies or something
like that.
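
The timeout variant could be sketched like this, again as plain
userspace C with hypothetical names (bo_move_state, bo_note_move and
bo_may_move_back are made up for illustration). Ticks stand in for
jiffies, and the hold-off length is an assumed parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-BO bookkeeping; not a field of any real BO struct. */
struct bo_move_state {
	uint64_t last_move; /* tick at which the BO was last moved */
	bool moved_once;    /* false until the first recorded move */
};

/* Record that the BO was just moved (swap-out or CPU page fault). */
static void bo_note_move(struct bo_move_state *s, uint64_t now)
{
	s->last_move = now;
	s->moved_once = true;
}

/*
 * Returns true if at least 'holdoff' ticks have passed since the last
 * move, i.e. the BO may be migrated back. Unsigned subtraction keeps
 * the comparison correct across counter wraparound.
 */
static bool bo_may_move_back(const struct bo_move_state *s,
			     uint64_t now, uint64_t holdoff)
{
	if (!s->moved_once)
		return true;
	return now - s->last_move >= holdoff;
}
```

Unlike a permanent switch of the preferred domain, this only delays the
move back, so after the alt+tab case above the BOs could return to VRAM
once the hold-off has expired.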

Christian.

>
> Otherwise I think we'll be just going in circles and not get anywhere.
>
> Marek
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx



