[Mesa-dev] [RFC v2 10/11] glsl: teach copy_propagation_elements to deal with whole variables
Eric Anholt
eric at anholt.net
Tue Jul 10 22:16:27 UTC 2018
Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com> writes:
> Keep information in acp_entry whether the entry is full or not, and
> use the ACP in more nodes when visiting the instructions:
>
> - add_copy: write whole variables to the ACP state (regardless the
> type).
>
> - visit(ir_dereference_variable *): perform the propagation here if we have a
> full candidate. Element-wise here doesn't apply because the mask
> isn't available at this point.
>
> - visit_leave(ir_assignment *): process beyond scalar and vector, as
> the full variables might have other types.
>
> Also import an improvement from opt_copy_propagation.cpp: if ir_call
> is an intrinsic, we know the variables affected, so keep going.
> ---
> .../glsl/opt_copy_propagation_elements.cpp | 142 ++++++++++++++----
> 1 file changed, 113 insertions(+), 29 deletions(-)
>
> diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp b/src/compiler/glsl/opt_copy_propagation_elements.cpp
> index d2c24697203..701cce2dd30 100644
> --- a/src/compiler/glsl/opt_copy_propagation_elements.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp
> @@ -27,18 +27,9 @@
> * Replaces usage of recently-copied components of variables with the
> * previous copy of the variable.
> *
> - * This pass can be compared with opt_copy_propagation, which operands
> - * on arbitrary whole-variable copies. However, in order to handle
> - * the copy propagation of swizzled variables or writemasked writes,
> - * we want to track things on a channel-wise basis. I found that
> - * trying to mix the swizzled/writemasked support here with the
> - * whole-variable stuff in opt_copy_propagation.cpp just made a mess,
> - * so this is separate despite the ACP handling being somewhat
> - * similar.
> - *
> * This should reduce the number of MOV instructions in the generated
> - * programs unless copy propagation is also done on the LIR, and may
> - * help anyway by triggering other optimizations that live in the HIR.
> + * programs and help triggering other optimizations that live in GLSL
> + * level.
> */
>
> #include "ir.h"
> @@ -58,6 +49,7 @@ class acp_entry
> public:
> DECLARE_LINEAR_ZALLOC_CXX_OPERATORS(acp_entry)
>
A comment would have helped me:
/* If set, rhs_full indicates that this ACP entry represents a
* whole-variable copy. The rhs_element[] array will still be filled,
* to allow the swizzling from its components in case the variable
* was a vector (and to simplify some of the erase() and write_vector()
* logic).
*/
> + ir_variable *rhs_full;
> ir_variable *rhs_element[4];
> unsigned rhs_channel[4];
>
> - void write(ir_variable *lhs, ir_variable *rhs, unsigned write_mask, int swizzle[4])
> + void write_elements(ir_variable *lhs, ir_variable *rhs, unsigned write_mask, int swizzle[4])
> {
> acp_entry *lhs_entry = pull_acp(lhs);
> + lhs_entry->rhs_full = NULL;
> + bool already_reference_rhs = false;
>
> for (int i = 0; i < 4; i++) {
> + if (lhs_entry->rhs_element[i] == rhs)
> + already_reference_rhs = true;
> +
> if ((write_mask & (1 << i)) == 0)
> continue;
> ir_variable *to_remove = lhs_entry->rhs_element[i];
> @@ -160,8 +160,38 @@ public:
> }
> }
>
> + if (!already_reference_rhs) {
> + acp_entry *rhs_entry = pull_acp(rhs);
> + _mesa_set_add(rhs_entry->dsts, lhs);
> + }
> + }
Though I think the logic would be simpler if we skip the checking that
we haven't added to the ACP and just _mesa_set_add() no matter what.
Optional, though. Either way:
Reviewed-by: Eric Anholt <eric at anholt.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180710/7bd0f7d5/attachment.sig>
More information about the mesa-dev
mailing list