[Mesa-dev] [PATCH 2/2] i965: Make DCE set null destinations on messages with side effects.
Francisco Jerez
currojerez at riseup.net
Tue Jan 17 19:40:33 UTC 2017
Kenneth Graunke <kenneth at whitecape.org> writes:
> (Co-authored by Matt Turner.)
>
> Image atomics, for example, return a value - but the shader may not
> want to use it. We assigned a useless VGRF destination. This seemed
> harmless, but it can actually be quite harmful. The register allocator
> has to assign that VGRF to a real register. It may assign the same
> actual GRF to the destination of an instruction that follows soon after.
>
> This results in a write-after-write (WAW) dependency, and stall.
>
> A number of "Deus Ex: Mankind Divided" shaders use image atomics, but
> don't use the return value. Several of these were hitting WAW stalls
> for nearly 14,000 (poorly estimated) cycles a pop. Making dead code
> elimination null out the destination avoids this issue.
>
> This patch cuts one shader's estimated cycles by -98.39%! Removing the
> message response should also help with data cluster bandwidth.
>
> On Skylake:
>
> total instructions in shared programs: 13366907 -> 13363051 (-0.03%)
> instructions in affected programs: 49635 -> 45779 (-7.77%)
> helped: 133
> HURT: 0
>
> total cycles in shared programs: 255433388 -> 248081818 (-2.88%)
> cycles in affected programs: 12370702 -> 5019132 (-59.43%)
> helped: 100
> HURT: 24
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> .../dri/i965/brw_fs_dead_code_eliminate.cpp | 56 ++++++++++++++++------
> 1 file changed, 41 insertions(+), 15 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> index 930dc733b45..885ae2638a8 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> @@ -34,6 +34,43 @@
> * yet in the tail end of this block.
> */
>
> +static bool
> +can_eliminate(const fs_inst *inst, BITSET_WORD *flag_live)
> +{
> + return inst->opcode != BRW_OPCODE_IF &&
> + inst->opcode != BRW_OPCODE_WHILE &&
Isn't this missing a bunch of other control flow instructions that
shouldn't be eliminated? What's going on here, are they also being
DCE'ed? Shouldn't this be !inst->is_control_flow()?
> + !inst->has_side_effects() &&
> + !(flag_live[0] & inst->flags_written()) &&
> + !inst->writes_accumulator;
> +}
> +
> +static bool
> +can_omit_write(const fs_inst *inst, BITSET_WORD *flag_live)
> +{
> + switch (inst->opcode) {
> + case SHADER_OPCODE_UNTYPED_ATOMIC:
> + case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL:
> + case SHADER_OPCODE_TYPED_ATOMIC:
> + case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL:
> + return true;
> + default:
> + /* If we're going to eliminate the instruction entirely, omitting the
> + * write is always safe.
> + */
> + if (can_eliminate(inst, flag_live))
> + return true;
I think it would be cleaner to make this a predicate of the instruction
alone meaning 'can the instruction accept a null register as
destination?' (which is independent of the context it occurs in, and
can_eliminate() doesn't have any influence on, it's just that we're okay
with the pass below putting invalid instructions in the program as long
as we have the guarantee that they're going to be removed later on), and
do 'can_omit_write(inst) || can_eliminate(inst, flag_live)' below.
> +
> + /* We can eliminate the destination write for ordinary instructions,
> + * but not most SENDs.
> + */
> + if (inst->opcode < 128 && inst->mlen == 0)
> + return true;
> +
This is going to miss ALU-like virtual instructions, but I guess they
could be enumerated in the 'return true' case above together with
surface atomics. [Not saying that you necessarily need to do that
now ;)]
> + /* It might not be safe for other virtual opcodes. */
> + return false;
> + }
> +}
> +
> bool
> fs_visitor::dead_code_eliminate()
> {
> @@ -52,31 +89,20 @@ fs_visitor::dead_code_eliminate()
> sizeof(BITSET_WORD));
>
> foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
> - if (inst->dst.file == VGRF && !inst->has_side_effects()) {
> + if (inst->dst.file == VGRF) {
> const unsigned var = live_intervals->var_from_reg(inst->dst);
> bool result_live = false;
>
> for (unsigned i = 0; i < regs_written(inst); i++)
> result_live |= BITSET_TEST(live, var + i);
>
> - if (!result_live) {
> + if (!result_live && can_omit_write(inst, flag_live)) {
> + inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
> progress = true;
> -
> - if (inst->writes_accumulator || inst->flags_written()) {
> - inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
> - } else {
> - inst->opcode = BRW_OPCODE_NOP;
> - }
> }
> }
>
> - if ((inst->opcode != BRW_OPCODE_IF &&
> - inst->opcode != BRW_OPCODE_WHILE) &&
> - inst->dst.is_null() &&
> - !inst->has_side_effects() &&
> - !(flag_live[0] & inst->flags_written()) &&
> - !inst->flags_written() &&
> - !inst->writes_accumulator) {
> + if (inst->dst.is_null() && can_eliminate(inst, flag_live)) {
> inst->opcode = BRW_OPCODE_NOP;
> progress = true;
> }
> --
> 2.11.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170117/f01057bb/attachment.sig>
More information about the mesa-dev
mailing list