[Mesa-dev] [PATCH 3/7] i965/fs: Before reg alloc, schedule instructions to reduce live ranges.
Kenneth Graunke
kenneth at whitecape.org
Sun Dec 9 14:23:45 PST 2012
On 12/07/2012 02:58 PM, Eric Anholt wrote:
> This came from an idea by Ben Segovia. 16-wide pixel shaders are very
> important for latency hiding on i965, so we want to try really hard to
> get them. If scheduling an instruction makes some set of instructions
> available, those are probably the ones that make the instruction's
> result dead. By choosing those first, we'll have a tendency to reduce
> the amount of live data as opposed to creating more.
>
> Previously, we were sometimes getting this behavior out of the
> scheduler, which was what produced the scheduler's original performance
> wins on lightsmark. Unfortunately, that was mostly an accident of the
> lame instruction latency information that I had, which made it
> impossible to fix the actual scheduling for performance. Now that we've
> fixed the scheduling for setup for register allocation, we can safely
> update the latency parameters for the final schedule.
>
> In shader-db, we lose 37 16-wide shaders, but gain 90 new ones. 4
> shaders that were spilling change how many registers spill, for a
> reduction of 70/3899 instructions.
> ---
> .../dri/i965/brw_fs_schedule_instructions.cpp | 49 +++++++++++++++++---
> 1 file changed, 43 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_fs_schedule_instructions.cpp
> index d48ad1e..3941eac 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_schedule_instructions.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_schedule_instructions.cpp
> @@ -496,13 +496,50 @@ instruction_scheduler::schedule_instructions(fs_inst *next_block_header)
> schedule_node *chosen = NULL;
> int chosen_time = 0;
>
> - foreach_list(node, &instructions) {
> - schedule_node *n = (schedule_node *)node;
> + if (post_reg_alloc) {
> + /* Of the instructions closest ready to execute or the closest to
> + * being ready, choose the oldest one.
> + */
> + foreach_list(node, &instructions) {
> + schedule_node *n = (schedule_node *)node;
> +
> + if (!chosen || n->unblocked_time < chosen_time) {
> + chosen = n;
> + chosen_time = n->unblocked_time;
> + }
> + }
> + } else {
> + /* Before register allocation, we don't care about the latencies of
> + * instructions. All we care about is reducing live intervals of
> + * variables so that we can avoid register spilling, or get 16-wide
> + * shaders which naturally do a better job of hiding instruction
> + * latency.
> + *
> + * To do so, schedule our instructions in a roughly LIFO/depth-first
> + * order: when new instructions become available as a result of
> + * scheduling something, choose those first so that our result
> + * hopefully is consumed quickly.
> + *
> + * The exception is messages that generate more than one result
> + * register (AKA texturing). In those cases, the LIFO search would
> + * normally tend to choose them quickly (because scheduling the
> + * previous message not only unblocked the children using its result,
> + * but also the MRF setup for the next sampler message, which in turn
> + * unblocks the next sampler message).
> + */
> + for (schedule_node *node = (schedule_node *)instructions.get_tail();
> + node != instructions.get_head()->prev;
> + node = (schedule_node *)node->prev) {
> + schedule_node *n = (schedule_node *)node;
> +
> + if (!chosen || chosen->inst->regs_written() > 1) {
> + chosen = n;
> + if (chosen->inst->regs_written() <= 1)
> + break;
> + }
I don't think the if condition is necessary here. Just doing
for (...) {
chosen = (schedule_node *) node;
if (chosen->inst->regs_written() <= 1)
break;
}
should accomplish the same thing.
> + }
>
> - if (!chosen || n->unblocked_time < chosen_time) {
> - chosen = n;
> - chosen_time = n->unblocked_time;
> - }
> + chosen_time = chosen->unblocked_time;
It seems plausible that there could be no nodes to schedule...which
means chosen would be NULL here. Perhaps just move chosen_time =
chosen->unblocked_time into the if...break above.
> }
>
> /* Schedule this instruction. */
More information about the mesa-dev
mailing list