[Mesa-dev] [PATCH 4/5] i965/vec4: Don't emit scratch reads for a spilled register we have just written
Iago Toral
itoral at igalia.com
Sun Jul 26 23:14:19 PDT 2015
On Fri, 2015-07-24 at 16:18 +0300, Francisco Jerez wrote:
> Iago Toral Quiroga <itoral at igalia.com> writes:
>
> > When we have code such as this:
> >
> > mov vgrf1.0.x:F, vgrf2.xxxx:F
> > mov vgrf3.0.x:F, vgrf1.xxxx:F
> > ...
> > mov vgrf3.0.x:F, vgrf1.xxxx:F
> >
> > And vgrf1 is chosen for spilling, we can emit this:
> >
> > mov vgrf1.0.x:F, vgrf2.xxxx:F
> > gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D
> > mov vgrf3.0.x:F, vgrf1.xxxx:F
> > ...
> > gen4_scratch_read vgrf4.0.x:F, 22D
> > mov vgrf3.0.x:F, vgrf4.xxxx:F
> >
> > Instead of this:
> >
> > mov vgrf1.0.x:F, vgrf2.xxxx:F
> > gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D
> > gen4_scratch_read vgrf4.0.x:F, 22D
> > mov vgrf3.0.x:F, vgrf4.xxxx:F
> > ...
> > gen4_scratch_read vgrf5.0.x:F, 22D
> > mov vgrf3.0.x:F, vgrf5.xxxx:F
> >
> > And save one scratch read while still preserving the benefits of
> > spilling the register.
>
> This sounds reasonable to me in principle. I guess that there is in
> general a trade-off between the number of spills/fills you omit and the
> number of interference edges you eliminate. It may also be worth
> checking whether you can extend the same principle to cache the value of
> the variable in a GRF until the next instruction regardless of whether
> it was written or read (e.g. so you don't unspill the same register in
> two adjacent instructions).
That makes sense, I'll send a v2 with that chage.
> In either case it seems like the overall cost of spilling a register
> would be decreased in cases where this heuristic can be applied, would
> it make sense to update the cost metric accordingly?
Yeah, I guess so. I'll do that too.
> One more comment inline.
>
> > ---
> > .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 39 +++++++++++++++++++++-
> > 1 file changed, 38 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > index 80ab813..5fed2f9 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > @@ -334,6 +334,18 @@ vec4_visitor::choose_spill_reg(struct ra_graph *g)
> > return ra_get_best_spill_node(g);
> > }
> >
> > +static bool
> > +writemask_matches_swizzle(unsigned writemask, unsigned swizzle)
> > +{
> > + for (int i = 0; i < 4; i++) {
> > + unsigned channel = 1 << BRW_GET_SWZ(swizzle, i);
> > + if (!(writemask & channel))
> > + return false;
> > + }
> > +
> > + return true;
> > +}
> > +
> > void
> > vec4_visitor::spill_reg(int spill_reg_nr)
> > {
> > @@ -341,11 +353,33 @@ vec4_visitor::spill_reg(int spill_reg_nr)
> > unsigned int spill_offset = last_scratch++;
> >
> > /* Generate spill/unspill instructions for the objects being spilled. */
> > + vec4_instruction *spill_write_inst = NULL;
> > foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
> > + /* We don't spill registers used for scratch */
> > + if (inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_READ ||
> > + inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE)
> > + continue;
> > +
> > int scratch_reg = -1;
> > for (unsigned int i = 0; i < 3; i++) {
> > if (inst->src[i].file == GRF && inst->src[i].reg == spill_reg_nr) {
> > - if (scratch_reg == -1) {
> > + /* If we are reading the spilled register right after writing
> > + * to it we can skip the scratch read and use directly the
> > + * register we used as source for the scratch write. For this
> > + * to work we must check that:
> > + *
> > + * 1) The write is inconditional, that is, it is not predicated or
> > + it is a SEL.
> > + * 2) All the channels that we read have been written in that
> > + * last write instruction.
> > + */
> > + if (spill_write_inst &&
> > + (!spill_write_inst->predicate ||
> > + spill_write_inst->opcode == BRW_OPCODE_SEL) &&
> > + writemask_matches_swizzle(spill_write_inst->dst.writemask,
> > + inst->src[i].swizzle)) {
>
> brw_mask_for_swizzle() returns the mask of components accessed by a
> swizzle, you could just AND it with ~spill_write_inst->dst.writemask to
> find out whether it's contained in the destination of the previous
> instruction.
Ah nice, thanks for the tip!
Iago
> > + scratch_reg = spill_write_inst->dst.reg;
> > + } else if (scratch_reg == -1) {
> > scratch_reg = alloc.allocate(1);
> > src_reg temp = inst->src[i];
> > temp.reg = scratch_reg;
> > @@ -358,6 +392,9 @@ vec4_visitor::spill_reg(int spill_reg_nr)
> >
> > if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) {
> > emit_scratch_write(block, inst, spill_offset);
> > + spill_write_inst = inst;
> > + } else {
> > + spill_write_inst = NULL;
> > }
> > }
> >
> > --
> > 1.9.1
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list