[Mesa-dev] [PATCH v2 7/7] i965: make use of nir linking
Jason Ekstrand
jason at jlekstrand.net
Tue Sep 26 14:00:00 UTC 2017
On September 26, 2017 4:05:04 AM Timothy Arceri <tarceri at itsqueeze.com> wrote:
>
> On 26/09/17 17:50, Kenneth Graunke wrote:
>> On Monday, September 25, 2017 6:23:13 PM PDT Timothy Arceri wrote:
>>> For now linking is just removing unused varyings between stages.
>>>
>>> shader-db results BDW:
>>>
>>> total instructions in shared programs: 13198288 -> 13191693 (-0.05%)
>>> instructions in affected programs: 48325 -> 41730 (-13.65%)
>>> helped: 473
>>> HURT: 0
>>>
>>> total cycles in shared programs: 541184926 -> 541159260 (-0.00%)
>>> cycles in affected programs: 213238 -> 187572 (-12.04%)
>>> helped: 435
>>> HURT: 8
>>>
>>> V2:
>>> - lower indirects on demoted inputs as well as outputs.
>>> ---
>>> src/mesa/drivers/dri/i965/brw_link.cpp | 56 ++++++++++++++++++++++++++++++++++
>>> 1 file changed, 56 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp
>>> b/src/mesa/drivers/dri/i965/brw_link.cpp
>>> index b7fab8d7a25..9ddf0230183 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_link.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
>>> @@ -253,6 +253,62 @@ brw_link_shader(struct gl_context *ctx, struct
>>> gl_shader_program *shProg)
>>> compiler->scalar_stage[stage]);
>>> }
>>>
>>> + /* Determine first and last stage. */
>>> + unsigned first = MESA_SHADER_STAGES;
>>> + unsigned last = 0;
>>> + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
>>> + if (!shProg->_LinkedShaders[i])
>>> + continue;
>>> + if (first == MESA_SHADER_STAGES)
>>> + first = i;
>>> + last = i;
>>> + }
>>> +
>>> + /* Linking the stages in the opposite order (from fragment to vertex)
>>> + * ensures that inter-shader outputs written to in an earlier stage
>>> + * are eliminated if they are (transitively) not used in a later
>>> + * stage.
>>> + */
>>> + if (first != last) {
>>> + int next = last;
>>> + for (int i = next - 1; i >= 0; i--) {
>>> + if (shProg->_LinkedShaders[i] == NULL)
>>> + continue;
>>> +
>>> + nir_shader *producer = shProg->_LinkedShaders[i]->Program->nir;
>>> + nir_shader *consumer = shProg->_LinkedShaders[next]->Program->nir;
>>> +
>>> + nir_remove_dead_variables(producer, nir_var_shader_out);
>>> + nir_remove_dead_variables(consumer, nir_var_shader_in);
>>> +
>>> + if (nir_remove_unused_varyings(producer, consumer)) {
>>> + nir_lower_global_vars_to_local(producer);
>>> + nir_lower_global_vars_to_local(consumer);
>>> +
>>> + nir_variable_mode indirect_mask = (nir_variable_mode) 0;
>>> + if (compiler->glsl_compiler_options[i].EmitNoIndirectTemp)
>>> + indirect_mask = (nir_variable_mode) nir_var_local;
>>> +
>>> + /* The backend might not be able to handle indirects on
>>> + * temporaries so we need to lower indirects on any of the
>>> + * varyings we have demoted here.
>>> + */
>>> + nir_lower_indirect_derefs(producer, indirect_mask);
>>> + nir_lower_indirect_derefs(consumer, indirect_mask);
>>> +
>>> + const bool p_is_scalar =
>>> compiler->scalar_stage[producer->stage];
>>> + shProg->_LinkedShaders[i]->Program->nir =
>>> + brw_nir_optimize(producer, compiler, p_is_scalar);
>>> +
>>> + const bool c_is_scalar =
>>> compiler->scalar_stage[producer->stage];
>>> + shProg->_LinkedShaders[next]->Program->nir =
>>> + brw_nir_optimize(consumer, compiler, c_is_scalar);
>>> + }
>>> +
>>> + next = i;
>>> + }
>>> + }
>>> +
>>> for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
>>> struct gl_linked_shader *shader = shProg->_LinkedShaders[stage];
>>> if (!shader)
>>>
>>
>> A couple more thoughts:
>>
>> 1. We're calling brw_nir_optimize even more now...it might make sense to
>> drop it from brw_preprocess_nir, drop it here, and put it once in the
>> final (third) loop, just before brw_shader_gather_info. At least,
>> it'd be interesting how that affects compile times / quality...
>
> Yeah I can have a play around. As is it didn't really change shader-db
> compile times on my BDW so I wasn't too concerned.
>
>>
>> 2. It would be great to use this in anv as well, especially given that
>> it has no GLSL IR linking to do this sort of stuff. Plus, the SPIR-V
>> might be compiled independently...and when you build the pipeline,
>> you have all the stages and can do cross-stage linking optimizations
>> like this.
>
> Yeah the reason for doing this was to eventually add it to radv. i965
> was just a nicer place to test it, same goes for the passes that will
> follow. Right now the biggest blocked is justifying the added complexity
> as there isn't a standard shader-db like tool, and it's not making any
> noticeable differences in any benchmarks.
>
> As far as anv I won't be doing any work on that directly, only radv.
That's fine. I'll add it in one of these days.
--Jason
More information about the mesa-dev
mailing list