[Mesa-dev] [PATCH 2/2] prog_optimize: Add reads without writes optimization pass
Jerome Glisse
j.glisse at gmail.com
Tue Mar 29 11:11:01 PDT 2011
On Tue, Mar 29, 2011 at 1:59 AM, Tom Stellard <tstellar at gmail.com> wrote:
> This pass scans programs for instructions that read registers that have
> not been written and replaces these reads with a read from a constant
> register with the value of zero. This pass prevents phantom
> dependencies from being introduced by the register allocator and
> improves the efficiency of subsequent optimization passes.
> ---
> src/mesa/program/prog_optimize.c | 81 ++++++++++++++++++++++++++++++++++++++
> 1 files changed, 81 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/program/prog_optimize.c b/src/mesa/program/prog_optimize.c
> index b5f0fb3..bc698fb 100644
> --- a/src/mesa/program/prog_optimize.c
> +++ b/src/mesa/program/prog_optimize.c
> @@ -30,6 +30,7 @@
> #include "program.h"
> #include "prog_instruction.h"
> #include "prog_optimize.h"
> +#include "prog_parameter.h"
> #include "prog_print.h"
>
>
> @@ -1225,6 +1226,85 @@ print_it(struct gl_context *ctx, struct gl_program *program, const char *txt) {
> }
> #endif
>
> +/** This pass searches for registers that are read before they are written
> + * and replaces reads from these registers with a read from a constant
> + * register with the value of zero. This pass will not change the program
> + * if it has already been run, so it only needs to be run once per program.
> + *
> + * When CMP instructions are translated from GLSL IR to Mesa IR, usually
> + * source register 1 or source register 2 is set to value of the destination
> + * register. When the registers are reallocated by
> + * _mesa_reallocate_registers() there is the possibility of creating phantom
> + * dependencies where a source register is remapped so that it reads from a
> + * register that has been written by an instruction that is no longer live.
> + * Here is an example:
> + *
> + * 0: MUL TEMP[0], CONST[0], IN[0]
> + * 1: RCP TEMP[1], TEMP[0]
> + * 2: CMP TEMP[2], TEMP[1], CONST[0], TEMP[2]
> + * ...
> + *
> + * _mesa_reallocate_registers will remap registers 0->0, 1->1, 2->0 and
> + * the program will look like this:
> + *
> + * 0: MUL TEMP[0], CONST[0], IN[0]
> + * 1: RCP TEMP[1], TEMP[0]
> + * 2: CMP TEMP[0], TEMP[1], CONST[0], TEMP[0]
> + * ...
> + *
> + * This creates a phantom dependency, because instruction 2 now depends
> + * on the result of instruction 0 which was not the case in the original
> + * program.
> + */
> +static void
> +_mesa_reads_without_writes(struct gl_program * program)
> +{
> + GLfloat zeroArray[4] = {0.0f, 0.0f, 0.0f, 0.0f};
> + GLuint zeroSwizzle;
> + struct prog_src_register zeroReg;
> + GLuint regWrites[REG_ALLOCATE_MAX_PROGRAM_TEMPS];
> + GLuint i;
> +
> + if (dbg) {
> + printf("Optimize: Begin reads without writes\n");
> + _mesa_print_program(program);
> + }
> +
> + for (i = 0; i < REG_ALLOCATE_MAX_PROGRAM_TEMPS; i++) {
> + regWrites[i] = 0;
> + }
> +
> + memset(&zeroReg, 0, sizeof(zeroReg));
> + zeroReg.File = PROGRAM_CONSTANT;
> + zeroReg.Index = _mesa_add_unnamed_constant(program->Parameters, zeroArray,
> + 1, &zeroSwizzle);
> + zeroReg.Swizzle = zeroSwizzle;
> +
> + for (i = 0; i < program->NumInstructions; i++) {
> + struct prog_instruction *inst = program->Instructions + i;
> + GLuint numSrc = _mesa_num_inst_src_regs(inst->Opcode);
> + GLuint j;
> + for (j = 0; j < numSrc; j++) {
> + if (inst->SrcReg[j].File == PROGRAM_TEMPORARY) {
> + const GLuint index = inst->SrcReg[j].Index;
> + if (!inst->SrcReg[j].RelAddr
> + && !(regWrites[index] & get_src_arg_mask(inst, j, NO_MASK))) {
> + inst->SrcReg[j] = zeroReg;
> + }
> + }
> + }
> + if (inst->DstReg.File == PROGRAM_TEMPORARY) {
> + if (inst->DstReg.RelAddr) {
> + return;
> + }
> + regWrites[inst->DstReg.Index] |= inst->DstReg.WriteMask;
> + }
> + }
> + if (dbg) {
> + printf("Optimize: End reads without writes\n");
> + _mesa_print_program(program);
> + }
> +}
>
> /**
> * Apply optimizations to the given program to eliminate unnecessary
> @@ -1235,6 +1315,7 @@ _mesa_optimize_program(struct gl_context *ctx, struct gl_program *program)
> {
> GLboolean any_change;
>
> + _mesa_reads_without_writes(program);
> /* Stop when no modifications were output */
> do {
> any_change = GL_FALSE;
> --
> 1.7.3.4
>
It has been a long time since I looked into the Mesa shader code, but
your pass seems to ignore program flow completely and thus might
produce false positives. For instance:
Consider a flow graph with two different blocks, B0 and B1, where B0
dominates B1 (i.e. every execution path that ends up in B1 goes
through B0 first). Now consider an instruction in B1 that uses temp 10
and an instruction in B0 that defines temp 10, with temp 10 never
being used outside B0 and B1. If B1's instructions come first in the
gl_program instruction array, your algorithm will wrongly conclude
that temp 10 is never written before the instruction that uses it.
IIRC, given how Mesa IR works, this case can happen.
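To make the scenario concrete, here is a minimal sketch in C. It uses
a toy IR that is purely hypothetical (toy_inst, TOY_DEF, TOY_USE,
TOY_JMP and TOY_END are invented for illustration and are not Mesa's
prog_instruction or its opcodes), and it assumes unconditional jumps
so that array order and execution order can differ. It compares a scan
in array order, as the pass does, with a walk in execution order:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR, NOT Mesa's prog_instruction. */
enum toy_op { TOY_DEF, TOY_USE, TOY_JMP, TOY_END };

struct toy_inst {
    enum toy_op op;
    int arg; /* temp index for DEF/USE, target instruction for JMP */
};

#define TOY_MAX_TEMPS 16

/* Scan in array order, as the proposed pass does. Returns true if a
 * USE appears before any DEF of the same temp in the array. */
static bool
linear_scan_flags_read(const struct toy_inst *prog, size_t n)
{
    bool written[TOY_MAX_TEMPS] = { false };
    for (size_t i = 0; i < n; i++) {
        if (prog[i].op == TOY_DEF)
            written[prog[i].arg] = true;
        else if (prog[i].op == TOY_USE && !written[prog[i].arg])
            return true;
    }
    return false;
}

/* Walk the program in execution order. Returns true only if a USE is
 * really executed before a DEF of the same temp. The step limit only
 * guards against endless loops in malformed toy programs. */
static bool
execution_reads_unwritten(const struct toy_inst *prog, size_t n)
{
    bool written[TOY_MAX_TEMPS] = { false };
    size_t pc = 0;
    for (size_t steps = 0; pc < n && steps < 4 * n; steps++) {
        const struct toy_inst *inst = &prog[pc];
        switch (inst->op) {
        case TOY_DEF:
            written[inst->arg] = true;
            pc++;
            break;
        case TOY_USE:
            if (!written[inst->arg])
                return true;
            pc++;
            break;
        case TOY_JMP:
            pc = (size_t)inst->arg;
            break;
        case TOY_END:
            return false;
        }
    }
    return false;
}

/* B1 is laid out before B0 in the array, but B0 dominates B1. */
static const struct toy_inst toy_prog[] = {
    { TOY_JMP, 3 },  /* 0: entry, jump to B0 */
    { TOY_USE, 10 }, /* 1: B1, uses temp 10 */
    { TOY_END, 0 },  /* 2: end of program */
    { TOY_DEF, 10 }, /* 3: B0, defines temp 10 */
    { TOY_JMP, 1 },  /* 4: jump to B1 */
};
```

On toy_prog the array-order scan flags the use of temp 10 as a read
without a write, while the execution-order walk does not. That
mismatch is the false positive described above.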
Cheers,
Jerome