[Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written
Francisco Jerez
currojerez at riseup.net
Fri Jul 31 03:13:40 PDT 2015
Iago Toral <itoral at igalia.com> writes:
> On Thu, 2015-07-30 at 17:14 +0300, Francisco Jerez wrote:
>> Francisco Jerez <currojerez at riseup.net> writes:
>>
>> > Iago Toral Quiroga <itoral at igalia.com> writes:
>> >
>> >> When we have code such as this:
>> >>
>> >> mov vgrf1.0.x:F, vgrf2.xxxx:F
>> >> mov vgrf3.0.x:F, vgrf1.xxxx:F
>> >> ...
>> >> mov vgrf3.0.x:F, vgrf1.xxxx:F
>> >>
>> >> And vgrf1 is chosen for spilling, we can emit this:
>> >>
>> >> mov vgrf1.0.x:F, vgrf2.xxxx:F
>> >> gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D
>> >> mov vgrf3.0.x:F, vgrf1.xxxx:F
>> >> ...
>> >> gen4_scratch_read vgrf4.0.x:F, 22D
>> >> mov vgrf3.0.x:F, vgrf4.xxxx:F
>> >>
>> >> Instead of this:
>> >>
>> >> mov vgrf1.0.x:F, vgrf2.xxxx:F
>> >> gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D
>> >> gen4_scratch_read vgrf4.0.x:F, 22D
>> >> mov vgrf3.0.x:F, vgrf4.xxxx:F
>> >> ...
>> >> gen4_scratch_read vgrf5.0.x:F, 22D
>> >> mov vgrf3.0.x:F, vgrf5.xxxx:F
>> >>
>> >> And save one scratch read while still preserving the benefits of
>> >> spilling the register.
>> >>
>> >> In general, we avoid emitting scratch reads for as long as the next instruction
>> >> keeps reading the spilled register. This should not harm the benefit of
>> >> spilling the register because gains for register allocation only come when we
>> >> have chunks of program code where the register is alive but not really used
>> >> (because these are the points where we could effectively use that register for
>> >> another purpose if we spilled it), so as long as consecutive instructions use
>> >> that register we can avoid the scratch reads without losing anything.
>> >> ---
>> >> .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 37 +++++++++++++++++++++-
>> >> 1 file changed, 36 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
>> >> index cff5406..fd56dae 100644
>> >> --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
>> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
>> >> @@ -340,11 +340,43 @@ vec4_visitor::spill_reg(int spill_reg_nr)
>> >> unsigned int spill_offset = last_scratch++;
>> >>
>> >> /* Generate spill/unspill instructions for the objects being spilled. */
>> >> + vec4_instruction *spill_write_inst = NULL;
>> >> foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
>> >> + /* We don't spill registers used for scratch */
>> >> + if (inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_READ ||
>> >> + inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE)
>> >> + continue;
>> >> +
>> >> int scratch_reg = -1;
>> >> + bool spill_reg_was_read = false;
>> >> for (unsigned int i = 0; i < 3; i++) {
>> >> if (inst->src[i].file == GRF && inst->src[i].reg == spill_reg_nr) {
>> >> - if (scratch_reg == -1) {
>> >> + if (!spill_reg_was_read) {
>> >> + spill_reg_was_read = (!inst->predicate ||
>> >> + inst->opcode == BRW_OPCODE_SEL);
>> >> + }
>> >> +
>> >> + /* If we are reading the spilled register right after writing
>> >> + * to it we can skip the scratch read and use directly the
>> >> + * register we used as source for the scratch write. For this
>> >> + * to work we must check that:
>> >> + *
>> >> + * 1) The write is inconditional, that is, it is not predicated or
>> >> + * it is a SEL.
>> >> + * 2) All the channels that we read have been written in that
>> >> + * last write instruction.
>> >> + *
>> >> + * We keep doing this for as long as the next instruction
>> >> + * keeps reading the spilled register and break as soon as we
>> >> + * find an instruction that doesn't.
>> >> + */
>> >> + if (spill_write_inst &&
>> >> + (!spill_write_inst->predicate ||
>> >> + spill_write_inst->opcode == BRW_OPCODE_SEL) &&
>> >> + ((brw_mask_for_swizzle(inst->src[i].swizzle) &
>> >> + ~spill_write_inst->dst.writemask) == 0)) {
>> >> + scratch_reg = spill_write_inst->dst.reg;
>> >> + } else if (scratch_reg == -1) {
>> >
>> > One suggestion: You could factor out the rather complex caching logic
>> > into a separate function (e.g. 'bool can_reuse_scratch_for_source(const
>> > vec4_instruction *, unsigned i, unsigned scratch_reg)'). The function
>> > would simply compare scratch_reg with the sources of the current
>> > instruction (up to src) and the sources and destination of the previous
>>
>> Huh, of course "up to i" is what I intended to write.
>
> Up to i? Shouldn't we just check source i only? The way I understand
> this is that we would have a loop for each source in the current
> instruction and call this function for each one to see if it is cached.
>
The purpose is to check whether the spill candidate has been used by the
same instruction already, what would give roughly the same effect as
your PATCH 1 using the same approach.
>> > non-scratch_read/write instruction. If there's a match it would check
>> > that the regioning is compatible with the i-th source and return true in
>> > that case. This would have several benefits:
>> >
>> > - It would keep ::spill_reg() simple and the caching heuristic factored
>> > out. The only thing spill_reg() would still need to take care of is
>> > keep track of the last spilling temporary (e.g. as you're doing it
>> > now with the scratch_reg variable) and pass it to
>> > can_reuse_scratch_for_source(), but it would no longer be necessary
>> > to reset it to -1 when reuse is not possible.
>> >
>> > - It would allow you to use a single implementation of the caching
>> > policy to implement both spill_reg() and evaluate_spill_costs(), what
>> > would make sure that things don't go out of sync in the likely case
>> > that we want to change the caching heuristic in the future.
>> >
>> In case it's not obvious from my previous explanation
>> evaluate_spill_costs() could invoke the function I proposed like
>> "can_reuse_scratch_for_source(inst, i, inst->src[i].reg)" to find out if
>> the register would be cached by spill_reg().
>
> Sounds like a good idea, I'll give it a try. Thanks for having a look at
> the patches by the way.
>
> Iago
>
>> >> scratch_reg = alloc.allocate(1);
>> >> src_reg temp = inst->src[i];
>> >> temp.reg = scratch_reg;
>> >> @@ -357,6 +389,9 @@ vec4_visitor::spill_reg(int spill_reg_nr)
>> >>
>> >> if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) {
>> >> emit_scratch_write(block, inst, spill_offset);
>> >> + spill_write_inst = inst;
>> >> + } else if (spill_write_inst && !spill_reg_was_read) {
>> >> + spill_write_inst = NULL;
>> >> }
>> >> }
>> >>
>> >> --
>> >> 1.9.1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150731/ba912b9b/attachment.sig>
More information about the mesa-dev
mailing list