[Mesa-dev] [PATCH v2 0/6] Improvements to the vec4 spilling code
Francisco Jerez
currojerez at riseup.net
Wed Jul 29 03:23:10 PDT 2015
Iago Toral <itoral at igalia.com> writes:
> On Tue, 2015-07-28 at 18:17 +0300, Francisco Jerez wrote:
>> Iago Toral Quiroga <itoral at igalia.com> writes:
>>
>> > Link to v1:
>> > http://lists.freedesktop.org/archives/mesa-dev/2015-July/089766.html
>> >
>> > Changes after review (Curro)
>> > - Drop the patch that asserted that the reg size should always be 1
>> > - Expand this so that we do not unspill a register if we have just
>> > unspilled it as well
>> > - Use brw_mask_for_swizzle
>> > - Update spilling costs accordingly
>> >
>> > New changes:
>> >
>> > - Expand the optimizations that are based on caching the spilled/unspilled
>> > so we keep using the cached register for as long as consecutive instructions
>> > keep reading the register (the previous version would only do this for one
>> > instruction). This is because we only see benefits for register allocation
>> > when there are gaps in the life span of a register where it is not used
>> > (because these are the only instances in which we can use that reg for a
>> > different purpose), so as long as consecutive instructions keep reading a
>> > register we have just spilled or unspilled, we don't have to unspill it
>> > again.
>> >
>> I think this may be a good idea (assuming you've managed to measure an
>> improvement in practice), but I don't think that the explanation is
>> strictly speaking correct. It *may* be beneficial to, say, unspill a
>> variable for instruction i and then do it again for instruction i+1,
>> because the set of variables live at instruction i may not be exactly
>> the same as in instruction i+1, and by caching the value between both
>> instructions you cause the temporary to interfere with the union of both
>> sets simultaneously, what may increase the total number of registers
>> required to register-allocate the program.
>
> This is true, although you also need to allocate a register for the new
> vgrf used to unspill, so I think the chances of this being beneficial in
> practice are very low. I'll make sure to update the comment to be more
> precise though.
>
The difference is that if you split it into two temporaries each one may
interfere with a subset of the nodes they'd have interfered with if they
were a single variable, while they won't interfere with each other
(because the live range of the first ends after the end of instruction i
which is were the live range of the second starts), what means that the
register allocator is still free to allocate the same physical register
to both if need be -- or not, so it can only possibly decrease register
usage never increase it.
>> That said I think that this may still be a good idea because the
>> register-pressure benefit from separating the live ranges of temporaries
>> used in consecutive instructions is likely to be tiny typically, the
>> program is likely to have other spilling candidates which may simplify
>> the interference graph drastically for the same amount of fill/spill
>> bandwidth invested, so I think you're right that in most cases it's
>> going to be silly to re-spill/fill the same variable in consecutive
>> instructions.
>
> Right. The way I would expect this to work in practice is that we start
> by spilling registers with the best benefit / cost ratio. That should be
> registers that have a long life-span and usage gaps where the main
> benefit for allocation comes from being able to allocate the register
> for a different purpose during these gaps, so there should lose very
> little for register allocation by doing this (if anything at all).
>
Yeah, agreed.
>> In the future it may also be worth checking whether the heuristic can be
>> refined to use some sort of register pressure-sensitive distance between
>> uses of the same spilled variable as metric to decide whether the
>> variable is worth re-spilling or if it makes sense for it to be cached
>> between a pair of potentially non-consecutive uses.
>>
>> Anyway I'll have a closer look at the rest of your series soon-ish.
>
> Thanks Curro!
>
>> > Other
>> > Iago Toral Quiroga (6):
>> > i965/vec4: Only emit one scratch read per instruction for spilled
>> > registers
>> > i965/vec4: Remove checks for reladdr when checking for spillable
>> > registers
>> > i965/vec4: Don't emit scratch reads for a spilled register we have
>> > just written
>> > i965/vec4: Don't emit scratch reads for a register we have just
>> > unspilled
>> > i965/vec4: Adjust spilling cost for consecutive instructions
>> > i965: Add a debug option for spilling everything in vec4 code
>> >
>> > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 2 +-
>> > src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +-
>> > .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 145 +++++++++++++++++++--
>> > src/mesa/drivers/dri/i965/intel_debug.c | 3 +-
>> > src/mesa/drivers/dri/i965/intel_debug.h | 5 +-
>> > 5 files changed, 139 insertions(+), 18 deletions(-)
>> >
>> > --
>> > 1.9.1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150729/0fea3fad/attachment.sig>
More information about the mesa-dev
mailing list