[Mesa-dev] [PATCH v2 0/6] Improvements to the vec4 spilling code

Wed Jul 29 03:23:10 PDT 2015

Iago Toral <itoral at igalia.com> writes:

> On Tue, 2015-07-28 at 18:17 +0300, Francisco Jerez wrote:
>> Iago Toral Quiroga <itoral at igalia.com> writes:
>> 
>> > Link to v1:
>> > http://lists.freedesktop.org/archives/mesa-dev/2015-July/089766.html
>> >
>> > Changes after review (Curro)
>> >   - Drop the patch that asserted that the reg size should always be 1
>> >   - Expand this so that we do not unspill a register if we have just
>> >     unspilled it as well
>> >   - Use brw_mask_for_swizzle
>> >   - Update spilling costs accordingly
>> >
>> > New changes:
>> >
>> >   - Expand the optimizations that are based on caching the spilled/unspilled
>> >     register so we keep using the cached register for as long as consecutive instructions
>> >     keep reading the register (the previous version would only do this for one
>> >     instruction). This is because we only see benefits for register allocation
>> >     when there are gaps in the life span of a register where it is not used
>> >     (because these are the only instances in which we can use that reg for a
>> >     different purpose), so as long as consecutive instructions keep reading a
>> >     register we have just spilled or unspilled, we don't have to unspill it
>> >     again.
>> >
>> I think this may be a good idea (assuming you've managed to measure an
>> improvement in practice), but I don't think that the explanation is
>> strictly speaking correct.  It *may* be beneficial to, say, unspill a
>> variable for instruction i and then do it again for instruction i+1,
>> because the set of variables live at instruction i may not be exactly
>> the same as in instruction i+1, and by caching the value between both
>> instructions you cause the temporary to interfere with the union of both
>> sets simultaneously, which may increase the total number of registers
>> required to register-allocate the program.
>
> This is true, although you also need to allocate a register for the new
> vgrf used to unspill, so I think the chances of this being beneficial in
> practice are very low. I'll make sure to update the comment to be more
> precise though.
>
The difference is that if you split it into two temporaries, each one may
interfere with only a subset of the nodes they'd have interfered with if
they were a single variable, while they won't interfere with each other
(because the live range of the first ends after the end of instruction i,
which is where the live range of the second starts), which means that the
register allocator is still free to allocate the same physical register
to both if need be -- or not, so it can only possibly decrease register
usage, never increase it.
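
To illustrate the interference argument, here is a minimal sketch in
Python. It is a toy model, not Mesa code: live ranges are assumed to be
half-open intervals over instruction indices, and the names a, b, t, t1
and t2 are made up for the example. It compares the neighbours of one
cached temporary spanning instructions i and i+1 against those of two
separate temporaries, one per instruction.

```python
# Toy model (hypothetical, not Mesa code): a live range is a half-open
# interval [start, end) over instruction indices.

def interferes(a, b):
    """Two live ranges interfere when their intervals overlap."""
    return a[0] < b[1] and b[0] < a[1]

def neighbours(name, ranges):
    """Names of all other nodes whose live range overlaps that of `name`."""
    return {other for other in ranges
            if other != name and interferes(ranges[name], ranges[other])}

i = 10
# 'a' is live up to and including instruction i, 'b' from instruction i+1 on.
others = {"a": (0, i + 1), "b": (i + 1, 20)}

# One temporary cached across instructions i and i+1.
cached = dict(others, t=(i, i + 2))
# Two temporaries, one per instruction; the first ends where the second starts.
split = dict(others, t1=(i, i + 1), t2=(i + 1, i + 2))

cached_t = neighbours("t", cached)   # interferes with both 'a' and 'b'
split_t1 = neighbours("t1", split)   # interferes with 'a' only
split_t2 = neighbours("t2", split)   # interferes with 'b' only
```

In this model t1 and t2 do not interfere with each other, and each one
interferes with a subset of what the cached temporary interferes with, so
an allocator could still give both the same physical register.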

>> That said I think that this may still be a good idea because the
>> register-pressure benefit from separating the live ranges of temporaries
>> used in consecutive instructions is typically likely to be tiny, and the
>> program is likely to have other spilling candidates which may simplify
>> the interference graph drastically for the same amount of fill/spill
>> bandwidth invested, so I think you're right that in most cases it's
>> going to be silly to re-spill/fill the same variable in consecutive
>> instructions.
>
> Right. The way I would expect this to work in practice is that we start
> by spilling registers with the best benefit / cost ratio. That should be
> registers that have a long life-span and usage gaps where the main
> benefit for allocation comes from being able to allocate the register
> for a different purpose during these gaps, so we should lose very
> little for register allocation by doing this (if anything at all).
>
Yeah, agreed.
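
For reference, the caching scheme under discussion could be sketched
roughly as follows. This is a simplified Python model rather than the
actual vec4 code: the function name plan_scratch_reads and the
representation of a program as a list of sets of register names are
assumptions made for the example, and writes to the spilled register are
ignored. A scratch read is only planned when the previous instruction did
not read the spilled register, i.e. after a gap in its uses.

```python
def plan_scratch_reads(reads_per_instruction, spilled):
    """Return the instruction indices that need a scratch read of `spilled`.

    reads_per_instruction: list of sets, one per instruction, holding the
    names of the registers that instruction reads (hypothetical format).
    """
    fills = []
    cached = False  # True while the previous instruction read `spilled`
    for ip, reads in enumerate(reads_per_instruction):
        if spilled in reads:
            if not cached:
                # First read after a gap: the temporary has to be refilled.
                fills.append(ip)
            cached = True
        else:
            # A gap in the uses: drop the cached temporary so that its
            # register can be reused for something else in the meantime.
            cached = False
    return fills

# 'v' is read by instructions 0 and 1, skipped by 2, and read again by 3.
program = [{"v"}, {"v", "w"}, {"w"}, {"v"}]
fills = plan_scratch_reads(program, "v")
```

With this input a scratch read is planned before instructions 0 and 3
only; instruction 1 reuses the temporary filled for instruction 0.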

>> In the future it may also be worth checking whether the heuristic can be
>> refined to use some sort of register pressure-sensitive distance between
>> uses of the same spilled variable as metric to decide whether the
>> variable is worth re-spilling or if it makes sense for it to be cached
>> between a pair of potentially non-consecutive uses.
>> 
>> Anyway I'll have a closer look at the rest of your series soon-ish.
>
> Thanks Curro!
>
>> > Other  
>> > Iago Toral Quiroga (6):
>> >   i965/vec4: Only emit one scratch read per instruction for spilled
>> >     registers
>> >   i965/vec4: Remove checks for reladdr when checking for spillable
>> >     registers
>> >   i965/vec4: Don't emit scratch reads for a spilled register we have
>> >     just written
>> >   i965/vec4: Don't emit scratch reads for a register we have just
>> >     unspilled
>> >   i965/vec4: Adjust spilling cost for consecutive instructions
>> >   i965: Add a debug option for spilling everything in vec4 code
>> >
>> >  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   2 +-
>> >  src/mesa/drivers/dri/i965/brw_vec4.cpp             |   2 +-
>> >  .../drivers/dri/i965/brw_vec4_reg_allocate.cpp     | 145 +++++++++++++++++++--
>> >  src/mesa/drivers/dri/i965/intel_debug.c            |   3 +-
>> >  src/mesa/drivers/dri/i965/intel_debug.h            |   5 +-
>> >  5 files changed, 139 insertions(+), 18 deletions(-)
>> >
>> > -- 
>> > 1.9.1