[Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

Benedikt Schemmer ben at besd.de
Sun Apr 29 08:43:11 UTC 2018


Hi Gert,

I couldn't resist at least trying what happens if I enable register merge for radeonsi:

 PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
 piglit                 80732   -0.16 %   -0.02 %     .         .        0.87 %    0.86 %    0.04 %     .         .
 ----------------------------------------------------------------------------------------------------------------------
 All affected             513  -17.58 %   -2.30 %     .         .        4.12 %    5.87 %    1.73 %    0.10 %     .
 ----------------------------------------------------------------------------------------------------------------------
 Total                  80732   -0.16 %   -0.02 %     .         .        0.87 %    0.86 %    0.04 %     .         .

I had already removed the defines around the debug code, so that's also happily outputting data.

It fails with two piglit shaders:

<code>
[require]
GLSL >= 3.30

[fragment shader]
// [config]
// expect_result: pass
// glsl_version: 3.30
// require_extensions: GL_ARB_bindless_texture GL_ARB_shader_image_load_store
// [end config]

#version 330
#extension GL_ARB_bindless_texture: require
#extension GL_ARB_shader_image_load_store: enable
#extension GL_ARB_arrays_of_arrays: enable

struct s {
	writeonly image2D img[3][2];
	int y;
};

void main()
{
	s a[2][4];
	imageStore(a[0][0].img[0][0], ivec2(0, 0), vec4(1, 2, 3, 4));
}
</code>

and

<code>
[require]
GLSL >= 3.30

[fragment shader]
// [config]
// expect_result: pass
// glsl_version: 3.30
// require_extensions: GL_ARB_bindless_texture
// [end config]

#version 330
#extension GL_ARB_bindless_texture: require
#extension GL_ARB_arrays_of_arrays: enable

struct s {
	sampler2D tex[3][2];
	int y;
};

out vec4 color;

void main()
{
	s a[2][4];
	color = texture2D(a[0][0].tex[0][0], vec2(0, 0));
}
</code>

Real-world shaders are a little different:

Max Increase:

SGPRS: 72 -> 96 (33.33 %) (in shaders/cat/1787.shader_test)
VGPRS: 64 -> 84 (31.25 %) (in shaders/dirtrally/0859b69789591d7046e211400b1edd9a7cfca734_742.shader_test)
Spilled SGPRs: 0 -> 16 (0.00 %) (in shaders/deusex_mankind/d64e2084204e29749639e8fbd9a1e507c7e5e1dd_6840.shader_test)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 24 -> 32 (33.33 %) (in shaders/deusex_mankind/28cac87049d8c833e72296a5a02ea6118f1144e5_5876.shader_test)
Scratch size: 28 -> 36 (28.57 %) dwords per thread (in shaders/deusex_mankind/28cac87049d8c833e72296a5a02ea6118f1144e5_5876.shader_test)
Code Size: 6988 -> 8036 (15.00 %) bytes (in shaders/cat/1847.shader_test)
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 5 -> 7 (40.00 %) (in shaders/ruiner/0967c5fce7fc456496b1cfa25fbb1d1c4dcf9bed_2958.shader_test)
Wait states: 0 -> 0 (0.00 %)

Max Decrease:

SGPRS: 104 -> 64 (-38.46 %) (in shaders/deusex_mankind/480ddf21b1076d36f9ffd9911389656b5d8e12cb_2878.shader_test)
VGPRS: 44 -> 36 (-18.18 %) (in shaders/ruiner/0967c5fce7fc456496b1cfa25fbb1d1c4dcf9bed_2958.shader_test)
Spilled SGPRs: 19 -> 0 (-100.00 %) (in shaders/deusex_mankind/0749c9ae23417f918c7286fe502ff5de4cb8e1a0_3276.shader_test)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 17576 -> 17276 (-1.71 %) bytes (in shaders/ruiner/75b96ff36f5328b9ff9366f0d0fd58a1046f51bc_3053.shader_test)
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 8 -> 5 (-37.50 %) (in shaders/deusex_mankind/8dabec49e5b6c3b1cbcbaee194eff69f164d72f4_3968.shader_test)
Wait states: 0 -> 0 (0.00 %)
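
The percentages in the two lists above are consistent with a plain relative change against the unpatched value. A minimal sketch of that calculation, assuming that a zero baseline is simply reported as 0.00 % (which would explain the "0 -> 16 (0.00 %)" line); `percent_change` is a made-up name for illustration, not a function of the report script:

```python
def percent_change(before, after):
    """Relative change in percent against the 'before' value.

    Assumption: a zero baseline yields 0.0, because the relative
    change from zero is undefined.
    """
    if before == 0:
        return 0.0
    return 100.0 * (after - before) / before


# Values taken from the Max Increase / Max Decrease lists.
print(f"{percent_change(72, 96):.2f} %")    # SGPRs, shaders/cat/1787
print(f"{percent_change(104, 64):.2f} %")   # SGPRs, deusex_mankind ..._2878
print(f"{percent_change(0, 16):.2f} %")     # spilled SGPRs, zero baseline
```

With that convention a shader that starts spilling from zero shows up as 0.00 %, so the absolute numbers are the ones to look at in those lines.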


 PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
 0ad                        6     .         .         .         .         .         .         .         .         .
 aer                      590     .        0.26 %  -20.00 %     .         .         .        0.34 %     .         .
 alien_isolation         1414     .         .         .         .         .         .         .         .         .
 anholt                    10     .         .         .         .         .         .         .         .         .
 bioshock_infinite       2581   -0.02 %    0.03 %     .         .         .         .        0.13 %     .         .
 blackmesa                584     .         .         .         .         .         .         .         .         .
 cat                      573   -0.06 %   -0.12 %     .         .         .         .        0.20 %    0.05 %     .
 csgo                    1392     .         .       -0.88 %     .         .         .       -0.03 %     .         .
 deadisland_definitive   1776    0.06 %     .         .         .         .         .        0.15 %    0.01 %     .
 deadisland_original    11602     .         .         .         .         .         .        0.05 %     .         .
 deadisland_riptide_..    293   -0.06 %    0.06 %     .         .         .         .        0.32 %     .         .
 deusex_mankind          5051    0.08 %     .       -6.14 %     .       33.33 %   28.57 %    0.19 %   -0.01 %     .
 dirtrally                787     .        0.64 %    0.62 %     .         .         .        0.30 %   -0.31 %     .
 dolphin                   22     .         .         .         .         .         .         .         .         .
 dyinglight              4012     .        0.05 %     .         .         .         .        0.34 %   -0.01 %     .
 eurotruck2               216     .         .         .         .         .         .         .         .         .
 f1_2015                  746   -0.04 %   -0.02 %    2.72 %     .         .         .        0.14 %     .         .
 glamor                    16   -2.33 %     .         .         .         .         .        3.97 %     .         .
 hl2ep1                   294     .         .         .         .         .         .         .         .         .
 hl2ep2                   154     .         .         .         .         .         .         .         .         .
 hl2lostcoast              66     .         .         .         .         .         .         .         .         .
 hlsl3                    582     .         .         .         .         .         .       -0.14 %     .         .
 humus-celshading           4     .         .         .         .         .         .         .         .         .
 humus-domino               6     .         .         .         .         .         .         .         .         .
 humus-dynamicbranching    24     .         .         .         .         .         .         .         .         .
 humus-hdr                 10     .         .         .         .         .         .         .         .         .
 humus-portals              2     .         .         .         .         .         .         .         .         .
 humus-volumetricfog..      6     .         .         .         .         .         .         .         .         .
 kerbal                  1016     .        0.11 %     .         .         .         .        0.31 %     .         .
 larago                   664     .         .         .         .         .         .        0.01 %     .         .
 madmax                   354    0.04 %   -0.08 %     .         .         .         .       -0.02 %    0.04 %     .
 metro2033redux          4410     .        0.05 %     .         .         .         .        0.06 %   -0.04 %     .
 nexuiz                    80     .         .         .         .         .         .         .         .         .
 ruiner                   685   -0.10 %   -0.09 %     .         .         .         .        0.09 %    0.04 %     .
 sauerbraten                7     .         .         .         .         .         .         .         .         .
 serioussam2017           736    0.03 %   -0.07 %    7.09 %     .         .         .        0.05 %    0.06 %     .
 soma                     436     .         .         .         .         .         .         .         .         .
 specops                 1814     .         .         .         .         .         .        0.35 %     .         .
 stellaris                434     .         .         .         .         .         .        0.11 %     .         .
 supertuxkart               4     .         .         .         .         .         .         .         .         .
 talos                    762   -0.02 %     .        0.09 %     .         .         .        0.01 %     .         .
 tesseract                430     .         .         .         .         .         .         .         .         .
 tombraider              1012    0.21 %    0.31 %     .         .         .         .        0.22 %   -0.16 %     .
 total_war_shogun_2       176   -0.21 %     .       -2.10 %     .         .         .       -0.05 %     .         .
 total_war_warhammer      218     .        0.06 %     .         .         .         .        0.72 %   -0.06 %     .
 ubershaders               54   -2.04 %    0.20 %     .         .         .         .        1.38 %     .         .
 ug_gettysburg            149     .         .         .         .         .         .         .         .         .
 unigine_heaven           226     .         .         .         .         .         .         .         .         .
 unigine_superposition    733   -0.08 %    0.04 %     .         .         .         .        0.02 %     .         .
 unigine_valley           288     .         .         .         .         .         .         .         .         .
 unity                     72     .         .         .         .         .         .        0.04 %     .         .
 w40kdawn2                421     .         .         .         .         .         .       -0.20 %     .         .
 w40kdawn3                164    0.36 %     .         .         .         .         .         .         .         .
 warsow                   176     .         .         .         .         .         .         .         .         .
 warzone2100                4     .         .         .         .         .         .         .         .         .
 witcher2                 928   -0.07 %    0.06 %     .         .         .         .        0.04 %     .         .
 x3_albion                641     .         .         .         .         .         .         .         .         .
 xblades                  208     .         .         .         .         .         .        0.42 %     .         .
 xcom                    1020   -0.10 %     .         .         .         .         .        0.28 %     .         .
 xcom2                   1439     .         .         .         .         .         .         .         .         .
 yofrankie                 82     .         .         .         .         .         .         .         .         .
 ----------------------------------------------------------------------------------------------------------------------
 All affected            6394    0.04 %    0.16 %    0.46 %     .        7.41 %    6.67 %    0.51 %   -0.09 %     .
 ----------------------------------------------------------------------------------------------------------------------
 Total                  52662     .        0.03 %    0.26 %     .        1.34 %    1.09 %    0.13 %   -0.01 %     .


If there's an easy way to figure out when your code makes things worse and when it's an improvement, that would be really nice.
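
One rough way to look at this per shader, as a sketch only: compare the before/after stats and label each shader. The metric names and the `classify` helper are made up for illustration and are not part of shader-db; the sample numbers are the ruiner ..._2958 and deusex_mankind ..._5876 entries from the lists above.

```python
# Hypothetical metric names; lower is better for all of these ...
LOWER_IS_BETTER = {"sgprs", "vgprs", "spilled_sgprs", "spilled_vgprs",
                   "private_vgprs", "scratch_size", "code_size"}
# ... while more waves in flight is better.
HIGHER_IS_BETTER = {"max_waves"}


def classify(before, after):
    """Label one shader as improved, regressed, mixed or unchanged."""
    better = worse = 0
    for key in before.keys() & after.keys():
        delta = after[key] - before[key]
        if key in HIGHER_IS_BETTER:
            delta = -delta   # flip the sign so that negative always means better
        elif key not in LOWER_IS_BETTER:
            continue         # unknown metric, ignore it
        if delta < 0:
            better += 1
        elif delta > 0:
            worse += 1
    if better and worse:
        return "mixed"
    if better:
        return "improved"
    if worse:
        return "regressed"
    return "unchanged"


# ruiner ..._2958: VGPRs 44 -> 36, Max Waves 5 -> 7
print(classify({"vgprs": 44, "max_waves": 5}, {"vgprs": 36, "max_waves": 7}))
# deusex_mankind ..._5876: private VGPRs 24 -> 32, scratch 28 -> 36
print(classify({"private_vgprs": 24, "scratch_size": 28},
               {"private_vgprs": 32, "scratch_size": 36}))
```

This only counts the direction of each change and ignores its size, so a "mixed" shader still needs a look by hand.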

Really interesting.

Cheers, Benedikt

On 29.04.2018 at 09:55, Gert Wollny wrote:
> Hello Benedikt, 
> 
> On Sunday, 29.04.2018, at 00:06 +0200, Benedikt Schemmer wrote:
>> Hi Gert
>>
>> On 28.04.2018 at 23:51, Gert Wollny wrote:
>>> On Saturday, 28.04.2018, at 22:43 +0200, Benedikt Schemmer wrote:
>>>> The patches apply cleanly, however I just did a shader-db test run
>>>> and can't find a difference with your patch applied, am I doing
>>>> something wrong?
>>>
>>> AFAIK radeonsi doesn't use the register-merge optimizer in TGSI.
>>>
>>
>> Ah, ok. Was wondering why your debug code doesn't output anything.
>> Makes sense now ;)
> Not exactly, the reason there is no output is because -DNDEBUG is set.
> Without it the statistics should also be printed out on radeonsi, but
> thinking of it I should probably disable it when register_merge is not
> accessed, because without this the numbers will be inflated and don't
> have much meaning.
> 
>> So is this useless on radeonsi?
> Indeed. 
> 
>> Seemed interesting to me.
> :) it certainly helps on r600 
> 
> 
>>>> compile times went up though:
>>>
>>> This is strange, because "see above". Did you compile with debug
>>> information and C++11 or higher enabled?
> ...
>>>   
>>>
>> not intentionally:
> 
> Then you should actually not run any code that this series adds to
> mesa. I checked again, apart from the debugging output nothing will
> ever be run for drivers that report
> PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS != 0 (as does radeonsi). 
> 
> Best, 
> Gert 
> 

