[Mesa-dev] [PATCH 05/23] i965/fs: fix copy-propagation with suboffset from constants
Francisco Jerez
currojerez at riseup.net
Wed May 11 05:18:59 UTC 2016
Iago Toral <itoral at igalia.com> writes:
> On Tue, 2016-05-03 at 16:21 -0700, Jordan Justen wrote:
>> On 2016-05-03 05:21:54, Samuel Iglesias Gonsálvez wrote:
>> > From: Iago Toral Quiroga <itoral at igalia.com>
>> >
>> > The current code ignores the suboffet in the instruction's source
>> > and just uses the one from the constant. This is not correct
>> > when the instruction's source is accessing the constant with a
>> > different type and using the suboffset to select a specific
>> > chunk of the constant. We generate this kind of code in fp64
>> > when we want to select only the high 32-bit of a particular
>> > double constant.
>> >
>> > Instead, we should add any existing suboffset in the
>> > instruction's source (modulo the size of the entry's type)
>> > to the suboffset in the constant so we can preserve the orinal
>> > semantics.
>> >
>> > Prevents that we turn this:
>> >
>> > mov(8) vgrf5:DF, u2<0>:DF
>> > mov(8) vgrf7:UD, vgrf5+0.4<2>:UD
>> >
>> > Into:
>> >
>> > mov(8) vgrf7:UD, u2<0>:UD
>> >
>> > And instead, with this patch, we produce:
>> >
>> > mov(8) vgrf7:UD, u2+0.4<0>:UD
>> > ---
>> > .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 23 ++++++++++++++++++++--
>> > 1 file changed, 21 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
>> > index aa4c9c9..5fae10f 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
>> > @@ -445,8 +445,27 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
>> > case BAD_FILE:
>> > case ARF:
>> > case FIXED_GRF:
>> > - inst->src[arg].reg_offset = entry->src.reg_offset;
>> > - inst->src[arg].subreg_offset = entry->src.subreg_offset;
>> > + {
>> > + inst->src[arg].reg_offset = entry->src.reg_offset;
>> > + inst->src[arg].subreg_offset = entry->src.subreg_offset;
>> > +
>> > + /* If we copy propagate from a larger type we have to be aware that
>> > + * the instruction might be using subreg_offset to select a particular
>> > + * chunk of the data in the entry. For example:
>> > + *
>> > + * mov(8) vgrf5:DF, u2<0>:DF
>> > + * mov(8) vgrf7:UD, vgrf5+0.4<2>:UD
>> > + *
>> > + * vgrf5+0.4<2>:UD is actually reading the high 32-bit of u2.0, so if
>> > + * we want to copy propagate here we have to do it from u2+0.4.
>> > + */
>> > + int type_sz_src = type_sz(inst->src[arg].type);
>> > + int type_sz_entry = type_sz(entry->src.type);
>> > + if (type_sz_entry > type_sz_src) {
>> > + inst->src[arg].subreg_offset +=
>> > + inst->src[arg].subreg_offset % type_sz_entry;
>>
>> Seeing 'inst->src[arg].subreg_offset' on both sides of this += seems
>> strange. Is this correct?
>
> Yes, this looks wrong. I'll fix it, thanks!
>
I just had a look at the version of this patch you have in your tree
(not sure I took the right one), which does:
| @@ -445,8 +445,24 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
| case BAD_FILE:
| case ARF:
| case FIXED_GRF:
| - inst->src[arg].reg_offset = entry->src.reg_offset;
| - inst->src[arg].subreg_offset = entry->src.subreg_offset;
| + {
| + inst->src[arg].reg_offset = entry->src.reg_offset;
| +
| + /* If we copy propagate from a larger type we have to be aware that
| + * the instruction might be using subreg_offset to select a particular
| + * chunk of the data in the entry. For example:
| + *
| + * mov(8) vgrf5:DF, u2<0>:DF
| + * mov(8) vgrf7:UD, vgrf5+0.4<2>:UD
| + *
| + * vgrf5+0.4<2>:UD is actually reading the high 32-bit of u2.0, so if
| + * we want to copy propagate here we have to do it from u2+0.4.
| + */
| + int type_sz_entry = type_sz(entry->src.type);
| + inst->src[arg].subreg_offset =
| + (inst->src[arg].subreg_offset +
| + entry->src.subreg_offset) % type_sz_entry;
| + }
| break;
| case ATTR:
| case VGRF:
I'm afraid the calculation is still wrong, overflow of subreg_offset is
not handled properly by taking the original value modulo 4 and then
updating reg_offset accordingly (oh man there should *really* just be a
single reg_offset field expressed in bytes), and you're applying the
entry->src.subreg_offset offset *before* taking the remainder as if it
applied to the source region of the instruction instead of the source
region of the copy, which doesn't look correct to me.
Really, fixing this properly amounts to duplicating half of the logic
From the VGRF case below (but replacing REG_SIZE with 4 for uniforms
because of the inconsistent units of reg_offset). I wonder why the
switch-case statement is even here -- How about the fix below? (applies
on top of the patch I proposed as replacement for PATCH 2)
>> -Jordan
>>
>> > + }
>> > + }
>> > break;
>> > case ATTR:
>> > case VGRF:
>> > --
>> > 2.5.0
>> >
>> > _______________________________________________
>> > mesa-dev mailing list
>> > mesa-dev at lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-i965-fs-Simplify-and-fix-register-offset-calculation.patch
Type: text/x-diff
Size: 3829 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160510/c31f77cc/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-i965-fs-Reindent-register-offset-calculation-of-try_.patch
Type: text/x-diff
Size: 3649 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160510/c31f77cc/attachment-0001.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160510/c31f77cc/attachment.sig>
More information about the mesa-dev
mailing list