<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Aug 16, 2016 at 1:54 PM, Francisco Jerez <<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This uses the unblocked time of the exit assigned to each available node to attempt to unblock exit nodes as early as possible, potentially reducing the runtime of the shader when an exit branch is taken. There is a natural trade-off between terminating the program as early as possible and reducing the worst-case latency of the program as a whole (since this will typically move exit-unblocking nodes closer to their dependencies, potentially causing additional stalls of the execution pipeline), but in practice the bandwidth and ALU cycle savings from terminating the program earlier tend to outweigh the slight increase in worst-case program execution latency, so it makes sense to prefer nodes likely to unblock an earlier exit regardless of the latency benefits of other available nodes. I haven't observed any benchmark regressions from this change after testing on VLV, HSW, BDW, BSW and SKL. The FPS of the GfxBench Manhattan benchmark increases by 10%-20% and the FPS of Unigine Valley improves by roughly 5% depending on the platform and settings. </blockquote><div> </div><div>Thanks for working on this! We've known about it for a while and it's nice to finally get some progress. </div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> The change to the register pressure-sensitive heuristic is rather conservative and gives precedence to the existing heuristic in order to avoid increasing register pressure and causing spill count and SIMD width regressions in shader-db. 
It may make sense to revisit this with additional performance data.
---
 .../drivers/dri/i965/brw_schedule_instructions.cpp | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 96562cf..dfcaa80 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -1407,11 +1407,15 @@ fs_instruction_scheduler::choose_instruction_to_schedule()
    if (mode == SCHEDULE_PRE || mode == SCHEDULE_POST) {
       int chosen_time = 0;

-      /* Of the instructions ready to execute or the closest to
-       * being ready, choose the oldest one.
+      /* Of the instructions ready to execute or the closest to being ready,
+       * choose the one most likely to unblock an early program exit, or
+       * otherwise the oldest one.
        */
       foreach_in_list(schedule_node, n, &instructions) {
-         if (!chosen || n->unblocked_time < chosen_time) {
+         if (!chosen ||
+             exit_unblocked_time(n) < exit_unblocked_time(chosen) ||
+             (exit_unblocked_time(n) == exit_unblocked_time(chosen) &&
+              n->unblocked_time < chosen_time)) {
             chosen = n;
             chosen_time = n->unblocked_time;
          }
@@ -1500,6 +1504,15 @@ fs_instruction_scheduler::choose_instruction_to_schedule()
             continue;
          }

+         /* Prefer the node most likely to unblock an early program exit.
+          */
+         if (exit_unblocked_time(n) < exit_unblocked_time(chosen)) {
+            chosen = n;
+            continue;
+         } else if (exit_unblocked_time(n) > exit_unblocked_time(chosen)) {
+            continue;
+         }
+
          /* If all other metrics are equal, we prefer the first instruction in
           * the list (program execution).
          */
--
2.9.0
</blockquote></div> </div></div>
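The selection rule introduced by the first hunk can be sketched in isolation. This is only an illustration under stated assumptions, not the Mesa code: `Node` and `choose_node` are hypothetical names, and `exit_unblocked_time` is modelled as a plain field rather than the function the patch calls. The node with the earliest exit-unblocked time wins, ties fall back to the node's own unblocked time, and remaining ties keep the first node in list order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for schedule_node; only the two values used by the
// comparison are modelled here.
struct Node {
    int unblocked_time;       // time at which the node itself becomes ready
    int exit_unblocked_time;  // unblocked time of the exit assigned to the node
};

// Returns the index of the chosen node, or -1 if the list is empty.
int choose_node(const std::vector<Node> &nodes)
{
    int chosen = -1;
    for (std::size_t i = 0; i < nodes.size(); i++) {
        const Node &n = nodes[i];
        // Prefer the earlier exit; on a tie, prefer the node that is ready
        // sooner; on a full tie, keep the node seen first.
        if (chosen < 0 ||
            n.exit_unblocked_time < nodes[chosen].exit_unblocked_time ||
            (n.exit_unblocked_time == nodes[chosen].exit_unblocked_time &&
             n.unblocked_time < nodes[chosen].unblocked_time)) {
            chosen = static_cast<int>(i);
        }
    }
    return chosen;
}
```

Because the exit-unblocked time is compared first, a node that is ready later can still be picked over one that is ready earlier, which is the trade-off the commit message describes.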