[Mesa-dev] [PATCH 06/23] i965/fs: fix copy/constant propagation regioning checks
Iago Toral
itoral at igalia.com
Wed May 11 08:45:42 UTC 2016
On Tue, 2016-05-10 at 16:53 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
>
> > From: Iago Toral Quiroga <itoral at igalia.com>
> >
> > We were not accounting for reg_suboffset in the check for the start
> > of the region. This meant that would allow copy-propagation even if
> > the dst wrote to sub_regoffset 4 and our source read from
> > sub_regoffset 0, which is not correct. This was observed in fp64 code,
> > since there we use reg_suboffset to select the high 32-bit of a double.
> >
> I don't think this paragraph is accurate, copy instructions with
> non-zero destination subreg offset are currently considered partial
> writes and should never have been added to the ACP hash table in the
> first place.
Right, I think I wrote this patch before the one where I fixed
is_partial_write() to consider any write to subreg_offset > 0 partial.
> > Also, fs_reg::regs_read() already takes the stride into account, so we
> > should not multiply its result by the stride again. This was making
> > copy-propagation fail to copy-propagate cases that would otherwise be
> > safe to copy-propagate. Again, this was observed in fp64 code, since
> > there we use stride > 1 often.
> >
> > Incidentally, these fixes open up more possibilities for copy propagation
> > which uncovered new bugs in copy-propagation. The folowing patches address
> > each of these new issues.
>
> Oh man, that sucks...
>
> > ---
> > .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 21 +++++++++++++--------
> > 1 file changed, 13 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> > index 5fae10f..23df877 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned stride,
> > return true;
> > }
> >
> > +static inline bool
> > +region_match(fs_reg src, unsigned regs_read,
> > + fs_reg dst, unsigned regs_written)
>
> How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I
> personally wouldn't mind 'region_match' but
> 'write_region_contains_read_region' sounds a bit too long for my taste).
>
> > +{
> > + return src.reg_offset >= dst.reg_offset &&
> > + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) &&
> > + src.subreg_offset >= dst.subreg_offset;
>
> This works under the assumption that src.subreg_offset is strictly less
> than the reg_offset unit -- Which *should* be the case unless we've
> messed up that restriction in some place (we have in the past :P). To
> be on the safe side you could do something like following, if you like:
>
> | return (src.reg_offset * REG_SIZE + src.subreg_offset >=
> | dst.reg_offset * REG_SIZE + dst.subreg_offset) &&
> | src.reg_offset + regs_read <= dst.reg_offset + regs_written;
I understand that even if we discard writes with dst.subreg_offset > 0,
you still want the subreg_offset check here to be safe exactly in that
scenario (since we would not need this for the case I originally wrote
it for).
> With the above taken into account:
>
> Reviewed-by: Francisco Jerez <currojerez at riseup.net>
Thanks!
> > +}
> > +
> > bool
> > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
> > {
> > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
> > /* Bail if inst is reading a range that isn't contained in the range
> > * that entry is writing.
> > */
> > - if (inst->src[arg].reg_offset < entry->dst.reg_offset ||
> > - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset +
> > - inst->regs_read(arg) * inst->src[arg].stride * 32) >
> > - (entry->dst.reg_offset + entry->regs_written) * 32)
> > + if (!region_match(inst->src[arg], inst->regs_read(arg),
> > + entry->dst, entry->regs_written))
> > return false;
> >
> > /* we can't generally copy-propagate UD negations because we
> > @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry)
> > /* Bail if inst is reading a range that isn't contained in the range
> > * that entry is writing.
> > */
> > - if (inst->src[i].reg_offset < entry->dst.reg_offset ||
> > - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset +
> > - inst->regs_read(i) * inst->src[i].stride * 32) >
> > - (entry->dst.reg_offset + entry->regs_written) * 32)
> > + if (!region_match(inst->src[i], inst->regs_read(i),
> > + entry->dst, entry->regs_written))
> > continue;
> >
> > fs_reg val = entry->src;
> > --
> > 2.5.0
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list