[Mesa-dev] [PATCH 1/2] i965: Try not to reverse-schedule things when doing LIFO scheduling.

Tue Oct 22 22:46:24 CEST 2013

The LIFO plan was simple: Take the most recently made available
instructions, and pick those first.

But because of the order in which we pushed things onto our list of
available-to-schedule instructions, when a set of instructions was made
available at the same time (for example, everything at the start of the
program that wasn't dependent on other instructions), we'd schedule them
in reverse order.

If you had 10 texture calls in a row in your program, each with
independent argument setup, we'd set up the last texture call's args and
execute it first, even though we wouldn't be able to consume its results
until we'd finished the other 9 texture calls (assuming consumption of
texture results happens near each texture call, and combines it with
another texture result, which is normal for a convolution shader).

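To fix this, walk the list for LIFO scheduling in the order that
instructions were originally generated in the program (head to tail),
and push newly-made-available instructions onto the head of the list
instead of the tail, iterating over a node's children in reverse so
that the first child ends up closest to the head.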

total instructions in shared programs: 1587242 -> 1586290 (-0.06%)
instructions in affected programs:     7801 -> 6849 (-12.20%)
GAINED:                                76
LOST:                                  67
---

Note: This pair of patches replaces the previous patch I'd proposed,
since it does better (and the previous patch, on top of these two,
makes things worse).
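
For illustration only, here is a minimal standalone sketch of the
ordering change described in the commit message. It is not the Mesa
code: Program, schedule_old, schedule_new and three_texture_program are
hypothetical names, the available list is modelled with std::deque
rather than the driver's list type, and the scheduler is assumed to
simply take the first node it visits (the special handling of sampler
messages is ignored).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical model: prog[i] lists the ids that become available once
// instruction i has been scheduled (each id has exactly one parent here).
using Program = std::vector<std::vector<int>>;

// Old behaviour: newly available nodes go to the tail, and the walk
// starts at the tail, so simultaneous arrivals come out reversed.
std::vector<int> schedule_old(const Program &prog, const std::vector<int> &roots)
{
    std::deque<int> avail(roots.begin(), roots.end());
    std::vector<int> order;
    while (!avail.empty()) {
        int chosen = avail.back();
        avail.pop_back();
        assert(chosen >= 0 && (std::size_t)chosen < prog.size());
        order.push_back(chosen);
        for (std::size_t i = 0; i < prog[chosen].size(); i++)
            avail.push_back(prog[chosen][i]);
    }
    return order;
}

// New behaviour: the walk starts at the head, and children are pushed
// onto the head in reverse, so the first child is visited first.
std::vector<int> schedule_new(const Program &prog, const std::vector<int> &roots)
{
    std::deque<int> avail(roots.begin(), roots.end());
    std::vector<int> order;
    while (!avail.empty()) {
        int chosen = avail.front();
        avail.pop_front();
        assert(chosen >= 0 && (std::size_t)chosen < prog.size());
        order.push_back(chosen);
        for (int i = (int)prog[chosen].size() - 1; i >= 0; i--)
            avail.push_front(prog[chosen][i]);
    }
    return order;
}

// Three independent "argument setup -> texture call" pairs:
// 0 -> 1, 2 -> 3, 4 -> 5, with the setups (0, 2, 4) available at start.
Program three_texture_program()
{
    return {{1}, {}, {3}, {}, {5}, {}};
}
```

With roots {0, 2, 4}, schedule_old yields 4 5 2 3 0 1 (the last texture
call is set up and issued first), while schedule_new yields 0 1 2 3 4 5.
Both remain LIFO in the sense that a just-unblocked child is taken
before older entries; only the order among simultaneous arrivals changes.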

 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 84b74ff..99538bd 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -964,9 +964,7 @@ fs_instruction_scheduler::choose_instruction_to_schedule()
        * but also the MRF setup for the next sampler message, which in turn
        * unblocks the next sampler message).
        */
-      for (schedule_node *node = (schedule_node *)instructions.get_tail();
-           node != instructions.get_head()->prev;
-           node = (schedule_node *)node->prev) {
+      foreach_list(node, &instructions) {
          schedule_node *n = (schedule_node *)node;
          fs_inst *inst = (fs_inst *)n->inst;
 
@@ -1059,7 +1057,7 @@ instruction_scheduler::schedule_instructions(backend_instruction *next_block_hea
        * be scheduled.  Update the children's unblocked time for this
        * DAG edge as we do so.
        */
-      for (int i = 0; i < chosen->child_count; i++) {
+      for (int i = chosen->child_count - 1; i >= 0; i--) {
 	 schedule_node *child = chosen->children[i];
 
 	 child->unblocked_time = MAX2(child->unblocked_time,
@@ -1075,7 +1073,7 @@ instruction_scheduler::schedule_instructions(backend_instruction *next_block_hea
             if (debug) {
                printf("\t\tnow available\n");
             }
-	    instructions.push_tail(child);
+	    instructions.push_head(child);
 	 }
       }
 
-- 
1.8.4.rc3