[Mesa-dev] [PATCH v2 2/3] nir: Add a discard optimization pass

Wed Jul 4 22:47:51 UTC 2018

On July 4, 2018 15:35:15 Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl> wrote:

> On Wed, Jul 4, 2018 at 11:00 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Wed, Jul 4, 2018 at 1:20 PM, Francisco Jerez <currojerez at riseup.net>
>> wrote:
>>>
>>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>>
>>>> Many fragment shaders do a discard using relatively little information
>>>> but still put the discard fairly far down in the shader for no good
>>>> reason.  If the discard is moved higher up, we can possibly avoid doing
>>>> some or almost all of the work in the shader.  When this lets us skip
>>>> texturing operations, it's an especially high win.
>>>>
>>>> One of the biggest offenders here is DXVK.  The D3D APIs have different
>>>> rules for discards than OpenGL and Vulkan.  One effective way (which is
>>>> what DXVK uses) to implement DX behavior on top of GL or Vulkan is to
>>>> wait until the very end of the shader to discard.  This ends up in the
>>>> pessimal case where we always do all of the work before discarding.
>>>> This pass helps some DXVK shaders significantly.
>>>
>>> One thing to keep in mind is that this sort of transformation is trading
>>> off run-time of fragment shader invocations that don't call discard (or
>>> do so non-uniformly, which means that the code the discard jump is
>>> protecting will be executed anyway, so doing this can actually increase
>>> the critical path of the program) in favour of invocations that call
>>> discard uniformly (so executing discard early will effectively terminate
>>> the program early).
>>
>>
>> It's not really a uniform vs. non-uniform thing.  Even if a shader only
>> discards some of the fragments, it sill reduces the number of live channels
>> which reduces the cost of later non-uniform control-flow.
>>
>>>
>>> Optimizing for the latter case is an essentially
>>> heuristic assumption that needs to be verified experimentally.  Have you
>>> tested the effect of this pass on non-DX workloads extensively?
>>
>>
>> Yes, it is a trade-off.  No, I have not done particularly extensive testing.
>> We do, however, know of non-DXVK workloads that would benefit from this.  I
>> believe Manhattan is one such example though I have not yet benchmarked it.
>
> Out of curiosity, what is the performance trade-off here? What extra
> costs could we
> get by discarding early?

It's typically not high but there may be some cost from extra stalls from 
less latency hiding.  For instance, if you put the discard right after a 
texture, you can't process any more of the shader until it returns so you 
know whether or not to discard.  Also, you could end up increasing register 
pressure if you end up moving something to the top that's uses both in the 
discard and in something else.  It's the usual set of trade-offs you get 
every time you move instructions made possibly worse by how aggressive this 
pass is in making the only instructions before the discard the ones that 
are explicitly needed.

>
>>
>>>
>>>> v2 (Jason Ekstrand):
>>>> - Fix a couple of typos (Grazvydas, Ian)
>>>> - Use the new nir_instr_move helper
>>>> - Find all movable discards before moving anything so we don't
>>>> accidentally re-order anything and break dependencies
>>>> ---
>>>> src/compiler/Makefile.sources      |   1 +
>>>> src/compiler/nir/meson.build       |   1 +
>>>> src/compiler/nir/nir.h             |  10 +
>>>> src/compiler/nir/nir_opt_discard.c | 396 +++++++++++++++++++++++++++++
>>>> 4 files changed, 408 insertions(+)
>>>> create mode 100644 src/compiler/nir/nir_opt_discard.c
>>>>
>>>> diff --git a/src/compiler/Makefile.sources
>>>> b/src/compiler/Makefile.sources
>>>> index 9e3fbdc2612..8600ce81281 100644
>>>> --- a/src/compiler/Makefile.sources
>>>> +++ b/src/compiler/Makefile.sources
>>>> @@ -271,6 +271,7 @@ NIR_FILES = \
>>>> nir/nir_opt_cse.c \
>>>> nir/nir_opt_dce.c \
>>>> nir/nir_opt_dead_cf.c \
>>>> +     nir/nir_opt_discard.c \
>>>> nir/nir_opt_gcm.c \
>>>> nir/nir_opt_global_to_local.c \
>>>> nir/nir_opt_if.c \
>>>> diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
>>>> index 28aa8de7014..e339258bb94 100644
>>>> --- a/src/compiler/nir/meson.build
>>>> +++ b/src/compiler/nir/meson.build
>>>> @@ -156,6 +156,7 @@ files_libnir = files(
>>>> 'nir_opt_cse.c',
>>>> 'nir_opt_dce.c',
>>>> 'nir_opt_dead_cf.c',
>>>> +  'nir_opt_discard.c',
>>>> 'nir_opt_gcm.c',
>>>> 'nir_opt_global_to_local.c',
>>>> 'nir_opt_if.c',
>>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>>> index c40a88c8ccc..dac019c17e8 100644
>>>> --- a/src/compiler/nir/nir.h
>>>> +++ b/src/compiler/nir/nir.h
>>>> @@ -2022,6 +2022,13 @@ typedef struct nir_shader_compiler_options {
>>>> */
>>>> bool vs_inputs_dual_locations;
>>>>
>>>> +   /**
>>>> +    * Whether or not derivatives are still a safe operation after a
>>>> discard
>>>> +    * has occurred.  Optimization passes may be able to be a bit more
>>>> +    * agressive if this is true.
>>>> +    */
>>>> +   bool derivatives_safe_after_discard;
>>>> +
>>>
>>> It's worth noting in the comment above that any driver that is in
>>> position to enable this option (e.g. i965) is strictly speaking
>>> non-compliant with GLSL and SPIR-V, whether or not this optimization
>>> pass is used.  The reason is that derivatives being safe after a
>>> non-uniform discard implies that any invocations involved in derivative
>>> computations must be executed even though they aren't supposed to
>>> according to the spec, and even though doing so might lead to undefined
>>> behaviour that wasn't present in the original program, e.g.:
>>>
>>> | int delta = non_uniform_computation();
>>> | if (delta == 0)
>>> |      discard;
>>> |
>>> | for (int i = 0; i < N; i += delta) {
>>> |   // Will loop forever if discarded fragments are incorrectly executed
>>> |   // by the back-end.
>>> | }
>>>
>>> The above shader is specified to terminate if the semantics of discard
>>> are as defined by GLSL or SPIRV, but not necessarily as defined by DX.
>>
>>
>> That is an interesting point.  One possible solution would be to define the
>> NIR discard intrinsic to have the DX behavior and then make the SPIR-V and
>> GLSL to NIR converters emit a discard followed immediately by a return (or
>> possibly a stronger "halt" instruction which could be called inside a helper
>> function and still kill the whole program).  A pass such as this could then
>> freely move the discard higher while leaving the control-flow effects in
>> place.  In any case, that doesn't really affect the correctness of this
>> pass, just the currectness of our current handling of discard.
>>
>>>
>>> This makes me think that DXVK is in a privileged position to decide
>>> where the discard jump should end up at, since it can make assumptions
>>> about code lexically after a discard being well-defined even if the
>>> discard condition evaluates to true.  It's unfortunate that it behaves
>>> so suboptimally currently that you need to work around it here.
>>
>>
>> You can make all sorts of arguments about where the optimal place to put a
>> pass is.  Really, the "optimal" thing would be for people to hand-write
>> their shaders in carefully optimized GEN assembly but that isn't going to
>> happen. :-)  The reality is that we get lots of garbage from shaders and we
>> have to be able to clean it up.  This particular bit of garbage clean-up is
>> needed for DXVK but it's also needed for other clients so having a pass is
>> useful regardless of whether or not the problem could be solved by the
>> client providing better shader code.
>
> I think dxvk is not in the position to do this earlier typically
> because derivatives after a vulkan discard are not defined and dxvk
> needs them defined. so in the presence of typical texture instructions
> it has to place the vulkan discard at the end.

Yes, that is the problem they are solving.

> I wonder if makes sense to only enable keeping the helper invocations
> if there are no loops after the discard, would that get enough shaders
> to be worth it?

Possibly. There may be other ways to affect dead channels; loops are just 
over fairly obvious example.

It's worth noting that, in the Skyrim shader I was looking at which 
motivated this pass, DXVK doing "the right thing" and putting the SPIR-V 
OpKill at the same location in the shader as the DX kill opcode wouldn't 
actually solve the issue.  The shader has the DX kill very late in the 
program even though the condition is just a couple of fmas of an 
interpolated input and a couple of uniforms.  In order to fix this, DXVK 
would basically need to implement exactly this optimization pass.

--Jason

>
>>
>>>
>>>> unsigned max_unroll_iterations;
>>>> } nir_shader_compiler_options;
>>>>
>>>> @@ -2901,6 +2908,9 @@ bool nir_opt_dce(nir_shader *shader);
>>>>
>>>> bool nir_opt_dead_cf(nir_shader *shader);
>>>>
>>>> +bool nir_opt_discard_if(nir_shader *shader);
>>>> +bool nir_opt_move_discards_to_top(nir_shader *shader);
>>>> +
>>>> bool nir_opt_gcm(nir_shader *shader, bool value_number);
>>>>
>>>> bool nir_opt_if(nir_shader *shader);
>>>> diff --git a/src/compiler/nir/nir_opt_discard.c
>>>> b/src/compiler/nir/nir_opt_discard.c
>>>> new file mode 100644
>>>> index 00000000000..c61af163707
>>>> --- /dev/null
>>>> +++ b/src/compiler/nir/nir_opt_discard.c
>>>> @@ -0,0 +1,396 @@
>>>> +/*
>>>> + * Copyright © 2018 Intel Corporation
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person
>>>> obtaining a
>>>> + * copy of this software and associated documentation files (the
>>>> "Software"),
>>>> + * to deal in the Software without restriction, including without
>>>> limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute,
>>>> sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to whom
>>>> the
>>>> + * Software is furnished to do so, subject to the following conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice (including the
>>>> next
>>>> + * paragraph) shall be included in all copies or substantial portions
>>>> of the
>>>> + * Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>>>> EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>>>> MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>>>> SHALL
>>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>>>> OTHER
>>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>>>> ARISING
>>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>>>> DEALINGS
>>>> + * IN THE SOFTWARE.
>>>> + */
>>>> +
>>>> +#include "nir.h"
>>>> +#include "nir_builder.h"
>>>> +#include "nir_control_flow.h"
>>>> +#include "nir_worklist.h"
>>>> +
>>>> +static bool
>>>> +block_has_only_discard(nir_block *block)
>>>> +{
>>>> +   nir_instr *instr = nir_block_first_instr(block);
>>>> +   if (instr == NULL || instr != nir_block_last_instr(block))
>>>> +      return false;
>>>> +
>>>> +   if (instr->type != nir_instr_type_intrinsic)
>>>> +      return false;
>>>> +
>>>> +   nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>>>> +   return intrin->intrinsic == nir_intrinsic_discard;
>>>> +}
>>>> +
>>>> +static bool
>>>> +opt_discard_if_impl(nir_function_impl *impl)
>>>> +{
>>>> +   bool progress = false;
>>>> +
>>>> +   nir_builder b;
>>>> +   nir_builder_init(&b, impl);
>>>> +
>>>> +   nir_foreach_block(block, impl) {
>>>> +      nir_if *nif = nir_block_get_following_if(block);
>>>> +      if (!nif)
>>>> +         continue;
>>>> +
>>>> +      bool discard_in_then;
>>>> +      if (block_has_only_discard(nir_if_first_then_block(nif)))
>>>> +         discard_in_then = true;
>>>> +      else if (block_has_only_discard(nir_if_first_else_block(nif)))
>>>> +         discard_in_then = false;
>>>> +      else
>>>> +         continue;
>>>> +
>>>> +      b.cursor = nir_after_block(block);
>>>> +      nir_ssa_def *cond = nir_ssa_for_src(&b, nif->condition, 1);
>>>> +      if (!discard_in_then)
>>>> +         cond = nir_inot(&b, cond);
>>>> +
>>>> +      nir_intrinsic_instr *discard_if =
>>>> +         nir_intrinsic_instr_create(b.shader,
>>>> nir_intrinsic_discard_if);
>>>> +      discard_if->src[0] = nir_src_for_ssa(cond);
>>>> +      nir_builder_instr_insert(&b, &discard_if->instr);
>>>> +
>>>> +      nir_lower_phis_to_regs_block(nir_cf_node_as_block(
>>>> +                                   nir_cf_node_next(&nif->cf_node)));
>>>> +
>>>> +      nir_cf_list list;
>>>> +      if (discard_in_then)
>>>> +         nir_cf_list_extract(&list, &nif->else_list);
>>>> +      else
>>>> +         nir_cf_list_extract(&list, &nif->then_list);
>>>> +      nir_cf_reinsert(&list, nir_after_instr(&discard_if->instr));
>>>> +
>>>> +      nir_cf_node_remove(&nif->cf_node);
>>>> +
>>>> +      progress = true;
>>>> +   }
>>>> +
>>>> +   /* If we modified control-flow, metadata is toast.  Also, we may
>>>> have
>>>> +    * lowered some phis to registers so we need to back into SSA.
>>>> +    */
>>>> +   if (progress) {
>>>> +      nir_metadata_preserve(impl, 0);
>>>> +      nir_lower_regs_to_ssa_impl(impl);
>>>> +   }
>>>> +
>>>> +   return progress;
>>>> +}
>>>> +
>>>> +bool
>>>> +nir_opt_discard_if(nir_shader *shader)
>>>> +{
>>>> +   assert(shader->info.stage == MESA_SHADER_FRAGMENT);
>>>> +
>>>> +   bool progress = false;
>>>> +
>>>> +   nir_foreach_function(function, shader) {
>>>> +      if (function->impl &&
>>>> +          opt_discard_if_impl(function->impl))
>>>> +         progress = true;
>>>> +   }
>>>> +
>>>> +   return progress;
>>>> +}
>>>> +
>>>> +static bool
>>>> +nir_variable_mode_is_read_only(nir_variable_mode mode)
>>>> +{
>>>> +   return mode == nir_var_shader_in ||
>>>> +          mode == nir_var_uniform ||
>>>> +          mode == nir_var_system_value;
>>>> +}
>>>> +
>>>> +static bool
>>>> +nir_op_is_derivative(nir_op op)
>>>> +{
>>>> +   return op == nir_op_fddx ||
>>>> +          op == nir_op_fddy ||
>>>> +          op == nir_op_fddx_fine ||
>>>> +          op == nir_op_fddy_fine ||
>>>> +          op == nir_op_fddx_coarse ||
>>>> +          op == nir_op_fddy_coarse;
>>>> +}
>>>> +
>>>> +static bool
>>>> +nir_texop_implies_derivative(nir_texop op)
>>>> +{
>>>> +   return op == nir_texop_tex ||
>>>> +          op == nir_texop_txb ||
>>>> +          op == nir_texop_lod;
>>>> +}
>>>> +
>>>> +static bool
>>>> +nir_intrinsic_writes_external_memory(nir_intrinsic_op intrin)
>>>> +{
>>>> +   switch (intrin) {
>>>> +   case nir_intrinsic_store_deref:
>>>> +   case nir_intrinsic_copy_deref:
>>>> +   case nir_intrinsic_deref_atomic_add:
>>>> +   case nir_intrinsic_deref_atomic_imin:
>>>> +   case nir_intrinsic_deref_atomic_umin:
>>>> +   case nir_intrinsic_deref_atomic_imax:
>>>> +   case nir_intrinsic_deref_atomic_umax:
>>>> +   case nir_intrinsic_deref_atomic_and:
>>>> +   case nir_intrinsic_deref_atomic_or:
>>>> +   case nir_intrinsic_deref_atomic_xor:
>>>> +   case nir_intrinsic_deref_atomic_exchange:
>>>> +   case nir_intrinsic_deref_atomic_comp_swap:
>>>> +      /* If we ever start using variables for SSBO ops, we'll need to
>>>> do
>>>> +       * something here.  For now, they're safe.
>>>> +       */
>>>> +      return false;
>>>> +
>>>> +   case nir_intrinsic_store_ssbo:
>>>> +   case nir_intrinsic_ssbo_atomic_add:
>>>> +   case nir_intrinsic_ssbo_atomic_imin:
>>>> +   case nir_intrinsic_ssbo_atomic_umin:
>>>> +   case nir_intrinsic_ssbo_atomic_imax:
>>>> +   case nir_intrinsic_ssbo_atomic_umax:
>>>> +   case nir_intrinsic_ssbo_atomic_and:
>>>> +   case nir_intrinsic_ssbo_atomic_or:
>>>> +   case nir_intrinsic_ssbo_atomic_xor:
>>>> +   case nir_intrinsic_ssbo_atomic_exchange:
>>>> +   case nir_intrinsic_ssbo_atomic_comp_swap:
>>>> +      return true;
>>>> +
>>>> +   case nir_intrinsic_image_deref_store:
>>>> +   case nir_intrinsic_image_deref_atomic_add:
>>>> +   case nir_intrinsic_image_deref_atomic_min:
>>>> +   case nir_intrinsic_image_deref_atomic_max:
>>>> +   case nir_intrinsic_image_deref_atomic_and:
>>>> +   case nir_intrinsic_image_deref_atomic_or:
>>>> +   case nir_intrinsic_image_deref_atomic_xor:
>>>> +   case nir_intrinsic_image_deref_atomic_exchange:
>>>> +   case nir_intrinsic_image_deref_atomic_comp_swap:
>>>> +      return true;
>>>> +
>>>> +   default:
>>>> +      return false;
>>>> +   }
>>>> +}
>>>> +
>>>> +static bool
>>>> +add_src_instr_to_worklist(nir_src *src, void *wl)
>>>> +{
>>>> +   if (!src->is_ssa)
>>>> +      return false;
>>>> +
>>>> +   nir_instr_worklist_push_tail(wl, src->ssa->parent_instr);
>>>> +   return true;
>>>> +}
>>>> +
>>>> +/** Try to mark a discard instruction for moving
>>>> + *
>>>> + * This function does two things.  One is that it searches through the
>>>> + * dependency chain to see if this discard is an instruction that we
>>>> can move
>>>> + * up to the top.  Second, if the discard is one we can move, it adds
>>>> the
>>>> + * discard and its dependencies to discards_and_deps.
>>>> + */
>>>> +static void
>>>> +try_move_discard(nir_intrinsic_instr *discard,
>>>> +                 struct set *discards_and_deps)
>>>> +{
>>>> +   /* We require the discard to be in the top level of control flow.
>>>> We
>>>> +    * could, in theory, move discards that are inside ifs or loops but
>>>> that
>>>> +    * would be a lot more work.
>>>> +    */
>>>> +   if (discard->instr.block->cf_node.parent->type !=
>>>> nir_cf_node_function)
>>>> +      return;
>>>> +
>>>> +   /* Build the set of all instructions discard depends on.  We'll
>>>> union this
>>>> +    * one later with discard_and_deps if the discard is movable.
>>>> +    */
>>>> +   struct set *instrs = _mesa_set_create(NULL, _mesa_hash_pointer,
>>>> +                                         _mesa_key_pointer_equal);
>>>> +   nir_instr_worklist *work = nir_instr_worklist_create();
>>>> +
>>>> +   _mesa_set_add(instrs, &discard->instr);
>>>> +   add_src_instr_to_worklist(&discard->src[0], work);
>>>> +
>>>> +   bool can_move_discard = true;
>>>> +   nir_foreach_instr_in_worklist(instr, work) {
>>>> +      /* Don't process an instruction twice */
>>>> +      if (_mesa_set_search(instrs, instr))
>>>> +         continue;
>>>> +
>>>> +      _mesa_set_add(instrs, instr);
>>>> +
>>>> +      /* Phi instructions can't be moved at all.  Also, if we're
>>>> dependent on
>>>> +       * a phi then we are dependent on some other bit of control flow
>>>> and
>>>> +       * it's hard to figure out the proper condition.
>>>> +       */
>>>> +      if (instr->type == nir_instr_type_phi) {
>>>> +         can_move_discard = false;
>>>> +         break;
>>>> +      }
>>>> +
>>>> +      if (instr->type == nir_instr_type_intrinsic) {
>>>> +         nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>>>> +         if (intrin->intrinsic == nir_intrinsic_load_deref) {
>>>> +            nir_deref_instr *deref = nir_src_as_deref(intrin->src[0]);
>>>> +            if (!nir_variable_mode_is_read_only(deref->mode)) {
>>>> +               can_move_discard = false;
>>>> +               break;
>>>> +            }
>>>> +         } else if (!(nir_intrinsic_infos[intrin->intrinsic].flags &
>>>> +                      NIR_INTRINSIC_CAN_REORDER)) {
>>>> +            can_move_discard = false;
>>>> +            break;
>>>> +         }
>>>> +      }
>>>> +
>>>> +      if (!nir_foreach_src(instr, add_src_instr_to_worklist, work)) {
>>>> +         can_move_discard = false;
>>>> +         break;
>>>> +      }
>>>> +   }
>>>> +
>>>> +   if (can_move_discard) {
>>>> +      struct set_entry *entry;
>>>> +      set_foreach(instrs, entry)
>>>> +         _mesa_set_add(discards_and_deps, entry->key);
>>>> +   }
>>>> +
>>>> +   nir_instr_worklist_destroy(work);
>>>> +   _mesa_set_destroy(instrs, NULL);
>>>> +}
>>>> +
>>>> +static bool
>>>> +opt_move_discards_to_top_impl(nir_function_impl *impl)
>>>> +{
>>>> +   const nir_shader_compiler_options *options =
>>>> impl->function->shader->options;
>>>> +
>>>> +   /* This optimization only operates on discard_if.  Run the
>>>> discard_if
>>>> +    * optimization (it's very cheap if it doesn't make progress) so
>>>> that we
>>>> +    * have some hope of move_discards_to_top making progress.
>>>> +    */
>>>> +   bool progress = opt_discard_if_impl(impl);
>>>> +
>>>> +   struct set *move_instrs = _mesa_set_create(NULL, _mesa_hash_pointer,
>>>> +                                              _mesa_key_pointer_equal);
>>>> +
>>>> +   /* Walk through the instructions and look for a discard that we can
>>>> move
>>>> +    * to the top of the program.  If we hit any operation along the way
>>>> that
>>>> +    * we cannot safely move a discard above, break out of the loop and
>>>> stop
>>>> +    * trying to move any more discards.
>>>> +    */
>>>> +   nir_foreach_block(block, impl) {
>>>> +      nir_foreach_instr_safe(instr, block) {
>>>> +         switch (instr->type) {
>>>> +         case nir_instr_type_alu: {
>>>> +            nir_alu_instr *alu = nir_instr_as_alu(instr);
>>>> +            if (nir_op_is_derivative(alu->op) &&
>>>> +                !options->derivatives_safe_after_discard)
>>>> +               goto break_all;
>>>> +            continue;
>>>> +         }
>>>> +
>>>> +         case nir_instr_type_deref:
>>>> +         case nir_instr_type_load_const:
>>>> +         case nir_instr_type_ssa_undef:
>>>> +         case nir_instr_type_phi:
>>>> +            /* These are all safe */
>>>> +            continue;
>>>> +
>>>> +         case nir_instr_type_call:
>>>> +            /* We don't know what the function will do */
>>>> +            goto break_all;
>>>> +
>>>> +         case nir_instr_type_tex: {
>>>> +            nir_tex_instr *tex = nir_instr_as_tex(instr);
>>>> +            if (nir_texop_implies_derivative(tex->op) &&
>>>> +                !options->derivatives_safe_after_discard)
>>>> +               goto break_all;
>>>> +            continue;
>>>> +         }
>>>> +
>>>> +         case nir_instr_type_intrinsic: {
>>>> +            nir_intrinsic_instr *intrin =
>>>> nir_instr_as_intrinsic(instr);
>>>> +            if
>>>> (nir_intrinsic_writes_external_memory(intrin->intrinsic))
>>>> +               goto break_all;
>>>> +
>>>> +            if (intrin->intrinsic == nir_intrinsic_discard_if)
>>>> +               try_move_discard(intrin, move_instrs);
>>>> +            continue;
>>>> +         }
>>>> +
>>>> +         case nir_instr_type_jump: {
>>>> +            nir_jump_instr *jump = nir_instr_as_jump(instr);
>>>> +            /* A return would cause the discard to not get executed */
>>>> +            if (jump->type == nir_jump_return)
>>>> +               goto break_all;
>>>> +            continue;
>>>> +         }
>>>> +
>>>> +         case nir_instr_type_parallel_copy:
>>>> +            unreachable("Unhanded instruction type");
>>>> +         }
>>>> +      }
>>>> +   }
>>>> +break_all:
>>>> +
>>>> +   if (move_instrs->entries) {
>>>> +      /* Walk the list of instructions and move the discard and
>>>> everything it
>>>> +       * depends on to the top.  We walk the instruction list here
>>>> because it
>>>> +       * ensures that everything stays in its original order.  This
>>>> provides
>>>> +       * stability for the algorithm and ensures that we don't
>>>> accidentally
>>>> +       * get dependencies out-of-order.
>>>> +       */
>>>> +      nir_cursor cursor = nir_before_block(nir_start_block(impl));
>>>> +      nir_foreach_block(block, impl) {
>>>> +         nir_foreach_instr_safe(instr, block) {
>>>> +            if (_mesa_set_search(move_instrs, instr)) {
>>>> +               nir_instr_move(cursor, instr);
>>>> +               cursor = nir_after_instr(instr);
>>>> +            }
>>>> +         }
>>>> +      }
>>>> +      progress = true;
>>>> +   }
>>>> +
>>>> +   _mesa_set_destroy(move_instrs, NULL);
>>>> +
>>>> +   if (progress) {
>>>> +      nir_metadata_preserve(impl, nir_metadata_block_index |
>>>> +                                  nir_metadata_dominance);
>>>> +   }
>>>> +
>>>> +   return progress;
>>>> +}
>>>> +
>>>> +bool
>>>> +nir_opt_move_discards_to_top(nir_shader *shader)
>>>> +{
>>>> +   assert(shader->info.stage == MESA_SHADER_FRAGMENT);
>>>> +
>>>> +   bool progress = false;
>>>> +
>>>> +   nir_foreach_function(function, shader) {
>>>> +      if (function->impl &&
>>>> +          opt_move_discards_to_top_impl(function->impl))
>>>> +         progress = true;
>>>> +   }
>>>> +
>>>> +   return progress;
>>>> +}
>>>> --
>>>> 2.17.1
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev