[Mesa-dev] [PATCH v2 00/16] radeonsi: improve handling of temporary arrays

Thu Aug 11 09:44:04 UTC 2016

On 11.08.2016 at 11:29, Nicolai Hähnle wrote:
> On 10.08.2016 23:36, Marek Olšák wrote:
>> On Wed, Aug 10, 2016 at 9:23 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>>> Hi,
>>>
>>> this is a respin of the series which scans the shader's TGSI to determine
>>> which channels of an array are actually written to. Most of the st/mesa
>>> changes have become unnecessary. Most of the radeon-specific part stays
>>> the same.
>>>
>>> For one F1 2015 shader, it reduces the scratch size from 132096 to 26624
>>> bytes, which is bound to be much nicer on the texture cache.
>>
>> This has been bugging me... is there something we can do to move
>> temporary arrays to registers?
>>
>> F1 2015 is the only game that doesn't "spill VGPRs", yet has the
>> highest scratch usage per shader. (without this series)
>>
>> If a shader uses 32 VGPRs and a *ton* of scratch space, you know
>> something is wrong.
>
> We actually already do that partially: in emit_declaration, we check 
> the size of the array, and if it's below a certain threshold (<= 16 
> currently) it is lowered to LLVM IR that becomes registers. In 
> particular, that one shader has:
>
> Before: Shader Stats: SGPRS: 40 VGPRS: 32 Code Size: 3316 LDS: 0 Scratch: 132096 Max Waves: 8 Spilled SGPRs: 0 Spilled VGPRs: 0
> After:  Shader Stats: SGPRS: 32 VGPRS: 60 Code Size: 3068 LDS: 0 Scratch: 26624 Max Waves: 4 Spilled SGPRs: 0 Spilled VGPRs: 0
>
> Looks like some of the arrays now land in VGPRs since they have become 
> smaller with that series.
>
> There are still a _lot_ of weaknesses in all of this, and they mostly 
> have to do with limitations that are rather deeply baked into 
> assumptions of LLVM's codegen architecture.
>
> The biggest problem is that an array in VGPRs needs to be represented 
> somehow in the codegen, and it is currently being represented as one 
> of the VGPR vector register classes, which go up to VReg_512, i.e. 16 
> registers. Two problems with that:
>
> 1. The granularity sucks. If you have an array of 10 entries, it'll 
> end up effectively using 16 registers anyway.
>
> 2. You can't go above arrays of size 16. (Though to be fair, once you 
> reach that size, you should probably start worrying about VGPR pressure.)
>
> Some other issues are that
>
> 3. It should really be LLVM that decides how to lower an array, not 
> Mesa. Ideally, LLVM should be able to make an intelligent decision 
> based on the overall register pressure.
>
> 4. We currently don't use LDS for shaders. This was disabled because 
> LLVM needs to be taught about interactions with other LDS uses, 
> especially in tessellation.
>
> I think fixing point 4 is the thing with the highest impact/effort 
> ratio right now.
>
> For point 3, perhaps we could actually extend the alloca lowering even 
> further so that it lowers allocas into VGPRs 
> _after_register_allocation_. But there's a whole can of worms 
> associated with this.
>
> (Oh, another thing to keep in mind: we cannot do non-uniform relative 
> indexing of VGPR arrays. This is emulated by a loop in the shader. So 
> depending on the access patterns into arrays, LDS or in extreme cases 
> even scratch space can actually be faster than VGPR.)
>
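
To make the cost of non-uniform indexing concrete, here is a minimal
Python sketch of the kind of loop described in the parenthetical above.
The thread only says that a loop is used; the exact structure below
(repeatedly picking the index of the first pending lane and serving all
lanes that share it) is an assumption, and waterfall_read is a
hypothetical name, not a Mesa or LLVM function.

```python
def waterfall_read(vgpr_array, lane_indices):
    """Emulate a per-lane indexed read from an array held in registers.

    vgpr_array[r][lane] is the value register r holds for a given lane.
    lane_indices[lane] is the array index that lane wants to read.
    Returns (per-lane results, number of loop iterations).
    """
    num_lanes = len(lane_indices)
    results = [None] * num_lanes
    pending = set(range(num_lanes))  # lanes that still need their value
    iterations = 0
    while pending:
        # Take the index of the first pending lane and treat it as uniform.
        current = lane_indices[min(pending)]
        # Serve every pending lane that asked for the same index.
        matching = {lane for lane in pending if lane_indices[lane] == current}
        for lane in matching:
            results[lane] = vgpr_array[current][lane]
        pending -= matching
        iterations += 1
    return results, iterations
```

In this model the loop runs once per distinct index in the wave: one
iteration if all lanes use the same index, and as many iterations as
there are array entries in use if the lanes diverge fully. That is why
the access pattern decides whether VGPRs are actually the fastest home
for an array.
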
> 1+2 are a serious headache. I'm not deeply enough into all the 
> GlobalISel work going on in LLVM, though I've read some things that 
> make me hopeful that CodeGen based on GlobalISel could help (because 
> it generally makes the process of register assignment more flexible 
> and configurable).

When I initially implemented support for arrays in radeonsi, one of the
fundamental problems was that LLVM couldn't handle anything other than
power-of-two vectors in its instruction selection.

I looked a bit into fixing this, but never completed it. It sounds like
nobody has worked on this since then, and yes, I completely agree with
your points.
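
As a rough illustration of the granularity problem (point 1 above), here
is a minimal Python sketch. The list of class widths is a simplifying
assumption based on the power-of-two vector classes up to VReg_512
discussed in this thread, and vgprs_for_array is a hypothetical helper,
not something that exists in Mesa or LLVM.

```python
# Widths, in 32-bit registers, of the VGPR vector register classes.
# Assumption: only power-of-two widths up to VReg_512 (16 registers);
# the real backend may define other classes.
VGPR_CLASS_WIDTHS = (1, 2, 4, 8, 16)


def vgprs_for_array(num_elements):
    """Return the number of registers an array of this size would occupy,
    or None if it is larger than the widest class and cannot live in VGPRs."""
    if num_elements < 1:
        raise ValueError("array must have at least one element")
    for width in VGPR_CLASS_WIDTHS:
        if num_elements <= width:
            return width  # smallest class that still fits the array
    return None
```

With these assumptions, vgprs_for_array(10) returns 16, so six registers
are wasted, and vgprs_for_array(17) returns None, which corresponds to
the array staying in scratch.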

Regards,
Christian.

>
> Cheers,
> Nicolai