[Mesa-dev] [PATCH 15/18] radeonsi: upload constants into VRAM instead of GTT

Thu Feb 16 22:36:22 UTC 2017

On Thu, Feb 16, 2017 at 4:21 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 16.02.2017 13:53, Marek Olšák wrote:
>>
>> From: Marek Olšák <marek.olsak at amd.com>
>>
>> This lowers lgkm wait cycles by 30% on VI under normal conditions.
>> There might be a measurable improvement when CE is disabled (radeon)
>> or under L2 thrashing.
>
>
> Good idea. I'm just wondering if all the users of const upload end up as
> streaming writes? I hope we don't accidentally hit some place where writes
> from the CPU end up extremely slow, e.g. where st/mesa uploads some
> structures.

I think constant buffers always benefit from being in VRAM. If every
CU loads a value from a constant buffer, you'll get at least 16 TC L2
read requests on Fiji (each group of 4 CUs submits one), which can be
misses under thrashing. This is very different from "streaming" where
you expect to get exactly 1 read request for each piece of data.
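
As a rough sketch of that arithmetic in C: the helper name below is
hypothetical, and the count of 64 CUs on Fiji is an assumption not stated
above. The "one request per group of 4 CUs" rule is taken from the
paragraph above.

```c
/* Hypothetical helper, not part of Mesa: estimates how many TC L2 read
 * requests are issued when every CU loads the same constant, assuming
 * each group of CUs submits exactly one request. */
static unsigned l2_read_requests(unsigned num_cus, unsigned cus_per_group)
{
    /* Round up so a partially filled group still counts as one request. */
    return (num_cus + cus_per_group - 1) / cus_per_group;
}
```

With 64 CUs and groups of 4 this gives the 16 requests mentioned above,
whereas a pure streaming access pattern would be expected to issue 1.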

The small problem with VRAM uploads may be write combining. I don't
know the alignment at which it operates and how exactly it works. E.g.
if we get two 16-byte uploads, each aligned to 32 bytes, there is an
untouched hole of 16 bytes between them. Does the hole have any effect
on upload performance? u_upload_mgr could fill all holes if it were a
problem.
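
A minimal sketch of what "filling the holes" could look like, in C. This
is not u_upload_mgr's actual code or API: the function names, the
CPU-mapped destination pointer and the power-of-two alignment are all
assumptions made for illustration. Whether this helps write combining
at all is exactly the open question above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round value up to the next multiple of alignment.
 * Assumption: alignment is a power of two. */
static size_t align_up(size_t value, size_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: copy one upload into a CPU-mapped buffer and
 * zero-fill the gap up to the next alignment boundary, so that the
 * whole aligned range is written with no untouched hole.
 * The caller must ensure the buffer extends to the returned offset.
 * Returns the offset at which the next upload would start. */
static size_t upload_and_fill_hole(uint8_t *map, size_t offset,
                                   const void *data, size_t size,
                                   size_t alignment)
{
    size_t end = offset + size;
    size_t next = align_up(end, alignment);

    memcpy(map + offset, data, size);
    memset(map + end, 0, next - end); /* the former "hole" */
    return next;
}
```

For a 16-byte upload at offset 0 with 32-byte alignment, this writes
bytes 0..31 and returns 32, so the second upload starts right after a
fully written range.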

Also, Feral's games upload directly to VRAM all the time. This patch
is nothing compared to what they're doing.

Marek