[Mesa-dev] [PATCH 06/20] i965: Preserve the CFG in instruction scheduling.

Wed Sep 17 00:24:33 PDT 2014

On Mon, Sep 08, 2014 at 10:46:33AM -0700, Matt Turner wrote:
> On Mon, Sep 8, 2014 at 5:43 AM, Pohjolainen, Topi
> <topi.pohjolainen at intel.com> wrote:
> > On Tue, Sep 02, 2014 at 09:34:17PM -0700, Matt Turner wrote:
> >> ---
> >>  .../drivers/dri/i965/brw_schedule_instructions.cpp | 74 ++++++++++++----------
> >>  1 file changed, 42 insertions(+), 32 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> >> index 04ac242..bac0d55 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> >> @@ -27,6 +27,8 @@
> >>
> >>  #include "brw_fs.h"
> >>  #include "brw_vec4.h"
> >> +#include "brw_cfg.h"
> >> +#include "brw_shader.h"
> >>  #include "glsl/glsl_types.h"
> >>  #include "glsl/ir_optimization.h"
> >>
> >> @@ -411,6 +413,7 @@ public:
> >>           this->remaining_grf_uses = NULL;
> >>           this->grf_active = NULL;
> >>        }
> >> +      v->calculate_cfg();
> >>     }
> >>
> >>     ~instruction_scheduler()
> >> @@ -421,8 +424,8 @@ public:
> >>     void add_dep(schedule_node *before, schedule_node *after, int latency);
> >>     void add_dep(schedule_node *before, schedule_node *after);
> >>
> >> -   void run(exec_list *instructions);
> >> -   void add_inst(backend_instruction *inst);
> >> +   void run(cfg_t *cfg);
> >> +   void add_insts_from_block(bblock_t *block);
> >>     void compute_delay(schedule_node *node);
> >>     virtual void calculate_deps() = 0;
> >>     virtual schedule_node *choose_instruction_to_schedule() = 0;
> >> @@ -440,7 +443,7 @@ public:
> >>     virtual void update_register_pressure(backend_instruction *inst) = 0;
> >>     virtual int get_register_pressure_benefit(backend_instruction *inst) = 0;
> >>
> >> -   void schedule_instructions(backend_instruction *next_block_header);
> >> +   void schedule_instructions(bblock_t *block);
> >>
> >>     void *mem_ctx;
> >>
> >> @@ -624,17 +627,28 @@ schedule_node::schedule_node(backend_instruction *inst,
> >>  }
> >>
> >>  void
> >> -instruction_scheduler::add_inst(backend_instruction *inst)
> >> +instruction_scheduler::add_insts_from_block(bblock_t *block)
> >>  {
> >> -   schedule_node *n = new(mem_ctx) schedule_node(inst, this);
> >> +   /* Removing the last instruction from a basic block removes the block as
> >> +    * well, so put a NOP at the end to keep it alive.
> >> +    */
> >
> > Initially using the nop-instruction to keep the block "non-empty" seemed a
> > little artifical. Then I started thinking alternatives.
> >
> > I started wondering why we couldn't simply copy the incoming block::start and
> > block::end into the instance of instruction_scheduler here instead of manually
> > creating a new instruction list. And then just set the original block::start
> > and block::end to NULL. If we named the new members of instruction_scheduler
> > the same as in bblock_t (start, end), we could even use the existing macros
> > for iterating (the macros are not concerned about the data type as long as it
> > has members 'start' and 'end').
> 
> That sounds like a good idea, except that we have to leave control
> flow instructions in place in add_insts_from_block(), since we cannot
> move them. I guess we could just remove their schedule_node in
> schedule_instructions(bblock_t *)?
> 
> So we'd be pulling instructions out of the list and reinserting them
> immediately before the last control flow instruction?
> 
> I haven't thought through all of the logic required to do that, but if
> we were to do it I'd want it to be as a separate patch.

This is fine with me - this patch is a lot safer to start with.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>

> 
> > I understood that the idea is to store the instructions represented by the
> > block somewhere, set the block empty and re-insert the instructions in possibly
> > different order back into the block, right?
> 
> Right.
> 
> >> +   if (!block->end->is_control_flow()) {
> >> +      backend_instruction *nop = new(mem_ctx) backend_instruction();
> >> +      nop->opcode = BRW_OPCODE_NOP;
> >> +      block->end->insert_after(block, nop);
> >> +   }
> >>
> >> -   assert(!inst->is_head_sentinel());
> >> -   assert(!inst->is_tail_sentinel());
> >> +   foreach_inst_in_block_safe(backend_instruction, inst, block) {
> >> +      if (inst->opcode == BRW_OPCODE_NOP || inst->is_control_flow())
> >> +         continue;
> >
> > Shouldn't this simply return? Original logic below simply quits iterating
> > when it finds a control flow instruction - there can be in the maximum one
> > control flow instruction per block, and it is always the last, right?
> 
> No, endif and do instructions begin a basic block, so we need to leave
> them in place, since we don't want to schedule control flow
> instructions. So two control flow instructions can be in a basic
> block. E.g., one starts with an endif and ends in an if.

I keep on forgetting that, sorry for asking over and over again.