[Mesa-dev] [PATCH] gallium/radeon: add a new HUD query for the number of mapped buffers

Wed Jan 25 14:55:16 UTC 2017

On 25.01.2017 at 15:19, Samuel Pitoiset wrote:
>
>
> On 01/25/2017 03:56 AM, Michel Dänzer wrote:
>> On 25/01/17 12:05 AM, Marek Olšák wrote:
>>> On Tue, Jan 24, 2017 at 2:17 PM, Christian König
>>> <deathsimple at vodafone.de> wrote:
>>>> On 24.01.2017 at 11:44, Samuel Pitoiset wrote:
>>>>> On 01/24/2017 11:38 AM, Nicolai Hähnle wrote:
>>>>>> On 24.01.2017 11:34, Samuel Pitoiset wrote:
>>>>>>> On 01/24/2017 11:31 AM, Nicolai Hähnle wrote:
>>>>>>>> On 24.01.2017 11:25, Samuel Pitoiset wrote:
>>>>>>>>> On 01/24/2017 07:39 AM, Michel Dänzer wrote:
>>>>>>>>>> On 24/01/17 05:44 AM, Samuel Pitoiset wrote:
>>>>>>>>>>>
>>>>>>>>>>> Useful when debugging applications which map too much VRAM.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Is the number of mapped buffers really useful, as opposed to the
>>>>>>>>>> total size of buffer mappings? Even if it was the latter though, it
>>>>>>>>>> doesn't show which mappings are for BOs in VRAM vs GTT, does it?
>>>>>>>>>> Also, even the total size of mappings of BOs currently in VRAM
>>>>>>>>>> doesn't directly reflect the pressure on the CPU visible part of
>>>>>>>>>> VRAM: only the BOs which are actively being accessed by the CPU
>>>>>>>>>> contribute to that.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It's actually useful to know the number of mapped buffers, but
>>>>>>>>> maybe it would be better to have two separate counters for GTT and
>>>>>>>>> VRAM, although the number of mapped buffers in VRAM is most of the
>>>>>>>>> time very high compared to GTT AFAIK.
>>>>>>>>>
>>>>>>>>> I will submit a follow-up patch that reduces the number of mapped
>>>>>>>>> buffers in VRAM (when a BO has been mapped only once). This new
>>>>>>>>> counter helped me with that.
>>>>>>>>
>>>>>>>>
>>>>>>>> Michel's point probably means that reducing the number/size of
>>>>>>>> mapped VRAM buffers isn't actually that important though.
>>>>>>>
>>>>>>>
>>>>>>> It seems useful for apps which map more than 256MB of VRAM.
>>>>>>
>>>>>>
>>>>>> True, if all of that range is actually used by the CPU (which may
>>>>>> well happen, of course). If I understand Michel correctly (and this
>>>>>> was news to me as well), if 1GB of VRAM is mapped, but only 64MB of
>>>>>> that is regularly accessed by the CPU, then the kernel will migrate
>>>>>> all of the rest into non-visible VRAM.
>>>>>
>>>>>
>>>>> And this can hurt us: for example, DXMD maps over 500MB of VRAM, and
>>>>> a bunch of BOs are only mapped once.
>>>>
>>>>
>>>> But when they are mapped once that won't be a problem.
>>>>
>>>> Again, as Michel noted, a mapped VRAM buffer is migrated into the
>>>> visible part of VRAM on access, not on mapping.
>>>>
>>>> In other words, you can map all your VRAM buffers and keep them
>>>> mapped, and that won't hurt anybody.
>>>
>>> Are you saying that I can map 2 GB of VRAM and it will all stay in
>>> VRAM and I'll get maximum performance if it's not accessed by the CPU
>>> too much?
>>
>> Yes, that's how it's supposed to work.
>>
>>
>>> Are you sure it won't have any adverse effects on anything?
>>
>> That's a pretty big statement. :) Bugs happen.
>>
>>
>>> Having useless memory mappings certainly must have some negative
>>> effect on something. It doesn't seem like a good idea to have a lot of
>>> mapped memory that doesn't have to be mapped.
>>
>> I guess e.g. the bookkeeping overhead might become significant with
>> large numbers of mappings. Maybe the issue Sam has been looking into is
>> actually related to something like that, not to VRAM?
>
> Well, with some games that new query can report more than 6.8k mapped 
> buffers (both VRAM/GTT) but a bunch are for VRAM. And more than 1GB of 
> mapped VRAM.
>
> When I look at the number of bytes moved by TTM, the counter is also 
> very high in these apps and most likely tied to the slowdowns. The 
> kernel memory manager is moving data almost all the time... Presumably 
> it's because of that aperture limit of 256MB.

That is most likely an incorrect assumption. From experience I would 
rather expect that we move buffers in/out of VRAM because we run out of 
it during command submission.

You should take a look at amdgpu_vram_mm and confirm that the visible
usage is around its maximum (256MB).
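
To illustrate the difference between mapped and CPU-accessed VRAM discussed
in this thread, here is a toy Python model. It is only a sketch under stated
assumptions: the Buffer class, its fields and both helper functions are
hypothetical names made up for illustration, and it does not reflect how TTM
or amdgpu actually track placement. The 256MB aperture and the "1GB mapped,
64MB accessed" figures are taken from the messages quoted above.

```python
from dataclasses import dataclass

MIB = 1024 * 1024
# Size of the CPU-visible VRAM aperture mentioned in this thread.
VISIBLE_VRAM_BYTES = 256 * MIB


@dataclass
class Buffer:
    """Hypothetical stand-in for a VRAM buffer object (BO)."""
    size: int           # size of the BO in bytes
    mapped: bool        # userspace holds a CPU mapping of the BO
    cpu_accessed: bool  # the CPU actually touches the mapping regularly


def mapped_bytes(buffers):
    """Total size of all mapped BOs (what a 'mapped VRAM' counter shows)."""
    return sum(b.size for b in buffers if b.mapped)


def visible_demand_bytes(buffers):
    """Size of the BOs that need to sit in CPU-visible VRAM.

    Assumption taken from the thread: a BO is migrated into visible VRAM
    on CPU access, not at map time, so only accessed BOs count here.
    """
    return sum(b.size for b in buffers if b.mapped and b.cpu_accessed)


# 1GB mapped in total, of which only 64MB is regularly accessed by the CPU.
buffers = [Buffer(64 * MIB, True, True)]
buffers += [Buffer(64 * MIB, True, False) for _ in range(15)]

fits_in_aperture = visible_demand_bytes(buffers) <= VISIBLE_VRAM_BYTES
```

In this model the mapped total is far above the aperture size while the
visible demand stays well below it, which is why a high mapped-buffer count
alone does not have to mean pressure on visible VRAM.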

Regards,
Christian.

> I would like to approach the problem by reducing the amount of VRAM
> needed by userspace in order to prevent TTM from moving a lot of data...
>
> Anyway, I'm going to push this patch.
>