[Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

Sun Apr 29 20:54:27 UTC 2018

On 29.04.2018 at 21:44, Gert Wollny wrote:
> Hello Benedikt,
> 
> thanks for all the testing!

thanks for all the developing ;)

> 
> On 29.04.2018 12:12, Benedikt Schemmer wrote:
>>> What are the names of these tests? I'd like to check this on r600,
>>> because I didn't see any regressions there the last time I checked.
>>>
>> might of course be different on r600 (is bindless available?),
>> also shader-db is more sensitive to problems than piglit
>>
>> 1. tests/spec/arb_bindless_texture/compiler/images/arrays-of-struct.frag
>> 2. tests/spec/arb_bindless_texture/compiler/samplers/arrays-of-struct.frag
> Indeed, bindless textures are not available on r600, so it is quite
> difficult for me to test this. I would guess that parameters related to
> this might be stored in the TGSI declaration, which I currently don't check.
> 
> If you have time for it, could you send me a TGSI dump of one of these
> shaders?
> With "ST_DEBUG=tgsi" this should be possible.

I created 'R600_DEBUG=merge' so I can switch without having to recompile.

$ ST_DEBUG=tgsi ./run wollny/

ATTENTION: default value of option allow_glsl_extension_directive_midshader overridden by environment.
FRAG
DCL TEMP[0..55], ARRAY(1), LOCAL
IMM[0] INT32 {0, 0, 0, 0}
IMM[1] FLT32 {    1.0000,     2.0000,     3.0000,     4.0000}
  0: STORE TEMP[0], IMM[0].xxxx, IMM[1], 2D
  1: END

wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test - LLVM diagnostic (remark): <unknown>:0:0: 12 instructions in function
wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test - Shader Stats: SGPRS: 16 VGPRS: 16 Code Size: 80 LDS: 0 Scratch: 0 Max Waves: 8 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
FRAG
DCL OUT[0], COLOR
DCL TEMP[0..1], LOCAL
DCL TEMP[2..57], ARRAY(1), LOCAL
IMM[0] FLT32 {    0.0000,     0.0000,     0.0000,     0.0000}
  0: MOV TEMP[0].xy, IMM[0].xxxx
  1: TEX TEMP[1], TEMP[0], TEMP[2].xyxy, 2D
  2: MOV OUT[0], TEMP[1]
  3: END

wollny/a115868a349cd666b842a0e70f47451b4463903a_2.shader_test - LLVM diagnostic (remark): <unknown>:0:0: 11 instructions in function
wollny/a115868a349cd666b842a0e70f47451b4463903a_2.shader_test - Shader Stats: SGPRS: 24 VGPRS: 16 Code Size: 72 LDS: 0 Scratch: 0 Max Waves: 8 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Thread 0 took 0.13 seconds and compiled 2 shaders (not including SIMD16) with 1 GL context switches

$ R600_DEBUG=merge ST_DEBUG=tgsi ./run wollny/

ATTENTION: default value of option allow_glsl_extension_directive_midshader overridden by environment.
run: state_tracker/st_glsl_to_tgsi.cpp:5783: ureg_dst dst_register(st_translate*, gl_register_file, unsigned int, unsigned int): Assertion `array_id && array_id <= t->num_temp_arrays' failed.

 => CRASHED <= while processing these shaders:

    wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test
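
The assertion in the crash output above requires a temporary array ID to be non-zero and no larger than the number of declared temporary arrays. A minimal sketch of that invariant follows; the function name and the plain-integer arguments are hypothetical and only mirror the shape of the asserted expression, not the actual code in st_glsl_to_tgsi.cpp.

```python
def temp_array_id_is_valid(array_id: int, num_temp_arrays: int) -> bool:
    """Hypothetical stand-in for the asserted condition
    `array_id && array_id <= t->num_temp_arrays`.

    Array IDs are treated as 1-based: 0 means "no array", and any ID
    above the declared array count is out of range.
    """
    return array_id != 0 and array_id <= num_temp_arrays


if __name__ == "__main__":
    # The dump above declares a single array, ARRAY(1).
    print(temp_array_id_is_valid(1, 1))  # ID 1 with one declared array
    print(temp_array_id_is_valid(2, 1))  # ID beyond the declared count
    print(temp_array_id_is_valid(0, 1))  # ID 0, i.e. no array
```

If the condition is false for any register that reaches this point, the assertion fires. The sketch says nothing about which of the two values is wrong in the crashing run.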

>>
>>> For radeonsi my guess would be that the llvm optimizer works better
>>> when the registers are not yet merged, and that would be the reason why
>>> register_merge is disabled. 
>> well at least sometimes it doesn't, low hanging fruit maybe?
> Unfortunately, I can't test on radeonsi

I can, if you don't mind waiting for an answer sometimes.

But maybe even easier: is there an implicit/explicit magic number I can play with to see if it changes anything?

At the moment it seems like your code improves half of the shaders it affects a lot and hurts the other half badly, as if it hits an invisible wall.
I think one problem could be the relationship between the number of VGPRs and SGPRs used and the maximum number of wavefronts achieved.
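
To illustrate why register counts could produce such a wall, here is a rough sketch of how the wave limit may depend on VGPR and SGPR usage. The budgets (256 VGPRs, 800 SGPRs, at most 10 waves per SIMD) are assumptions for illustration only, the function name is hypothetical, and allocation granularity and other limits such as LDS are ignored, so the result need not match the "Max Waves" value that the driver reports.

```python
def estimate_max_waves(vgprs: int, sgprs: int, *,
                       vgpr_budget: int = 256,   # assumed VGPRs per SIMD lane
                       sgpr_budget: int = 800,   # assumed SGPRs per SIMD
                       hw_wave_limit: int = 10   # assumed hardware cap
                       ) -> int:
    """Estimate how many waves fit on one SIMD given register usage.

    Each wave needs its own registers, so the limit is the budget
    divided by the per-wave usage, capped by the hardware limit.
    """
    limit = hw_wave_limit
    if vgprs > 0:
        limit = min(limit, vgpr_budget // vgprs)
    if sgprs > 0:
        limit = min(limit, sgpr_budget // sgprs)
    return limit


if __name__ == "__main__":
    # One extra VGPR can cost a whole wave once a threshold is crossed.
    for vgprs in (64, 65, 84, 85, 128, 129):
        print(vgprs, estimate_max_waves(vgprs, sgprs=16))
```

Under these assumptions the limit is a step function: going from 64 to 65 VGPRs drops the estimate from 4 waves to 3, while changes between the thresholds have no effect on it.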

This is somewhat similar to NIR although that changes things all over the place.

> 
> Best,
> Gert
>