[Mesa-dev] [PATCH 14/32] i965/fs: Fix register coalesce not to lose track of the second half of 16-wide moves.
Francisco Jerez
currojerez at riseup.net
Mon Jul 27 07:51:15 PDT 2015
Francisco Jerez <currojerez at riseup.net> writes:
> Matt Turner <mattst88 at gmail.com> writes:
>
>> On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> Fixes rewrite by the register coalesce pass of references to
>>> individual halves of 16-wide coalesced registers.
>>> ---
>>> src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 8 ++++++--
>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>>> index 09f0fad..2a26a46 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>>> @@ -211,9 +211,13 @@ fs_visitor::register_coalesce()
>>>              continue;
>>>           }
>>>           reg_to_offset[offset] = inst->dst.reg_offset;
>>> -         if (inst->src[0].width == 16)
>>> -            reg_to_offset[offset + 1] = inst->dst.reg_offset + 1;
>>>           mov[offset] = inst;
>>> +
>>> +         if (inst->exec_size * type_sz(inst->src[0].type) > REG_SIZE) {
>>> +            reg_to_offset[offset + 1] = inst->dst.reg_offset + 1;
>>> +            mov[offset + 1] = inst;
>>> +         }
>>> +
>>>           channels_remaining -= inst->regs_written;
>>>        }
>>>
>>> --
>>> 2.1.3
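
The new condition in the patch above measures the move in bytes rather than testing the source width directly. A minimal sketch of that arithmetic follows, assuming a 32-byte register (the value REG_SIZE is believed to have in the i965 backend); spans_second_register is a hypothetical helper written for illustration, not a function from Mesa.

```cpp
#include <cassert>

// Assumption: one GRF register holds 32 bytes, mirroring REG_SIZE in i965.
constexpr unsigned REG_SIZE = 32;

// Hypothetical helper: true when a MOV with the given execution size and
// source element size (in bytes) covers more than one register, which is
// when the coalesce pass would also need to track the offset + 1 half.
constexpr bool spans_second_register(unsigned exec_size, unsigned type_size)
{
   return exec_size * type_size > REG_SIZE;
}

// 16 channels of 4-byte floats = 64 bytes: two registers.
static_assert(spans_second_register(16, 4), "SIMD16 float spans two registers");
// 8 channels of 4-byte floats = 32 bytes: exactly one register.
static_assert(!spans_second_register(8, 4), "SIMD8 float fits in one register");
// 16 channels of 2-byte words = 32 bytes: still one register.
static_assert(!spans_second_register(16, 2), "SIMD16 word fits in one register");
```

Under these assumptions a 16-wide move of a 2-byte type does not reach a second register, which a byte-based test accounts for.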
>>
>> I can believe it. It would help me to understand if we had an example
>> of a sequence of instructions that this code didn't handle properly.
>
> The problem is in the "rewrite" phase of the register coalesce pass
> (roughly lines 264-283). It won't fix up instructions that reference
> some specific offset of the coalesced register if mov[i] is NULL for
> that offset, as is the case for the second half of a 16-wide move. For
> example:
>
> | ADD (16) vgrf0:f, vgrf0:f, 1.0:f
> | MOV (16) vgrf1:f, vgrf0:f
> | MOV (8) vgrf2:f, vgrf0+1:f { sechalf }
>
> will get incorrectly register-coalesced into:
>
> | ADD (16) vgrf1:f, vgrf1:f, 1.0:f
> | MOV (8) vgrf2:f, vgrf0+1:f { sechalf }

Ping. The SIMD lowering pass emits this kind of code, so without this fix
it will cause actual piglit regressions
(e.g. tests/spec/arb_shader_texture_lod/execution/glsl-fs-shadow2DGradARB-07.shader_test).
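
The failure mode in the quoted example can be reproduced with a small standalone model of the rewrite phase. This is only a sketch under simplifying assumptions: Ref, Inst, rewrite and example_program are hypothetical stand-ins rather than the real fs_inst/fs_reg types, and the mov_recorded array plays the role of the mov[i] != NULL check.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the backend IR.
struct Ref { int vgrf; int offset; };   // virtual register + register offset
struct Inst { Ref dst; Ref src; };

// Rewrite references to `from` into `to`, but only for offsets that have a
// MOV recorded. This mimics the rewrite phase skipping any offset whose
// mov[] entry is NULL.
void rewrite(std::vector<Inst> &prog, int from, int to,
             const std::array<bool, 2> &mov_recorded)
{
   auto fix = [&](Ref &r) {
      if (r.vgrf == from && mov_recorded[r.offset])
         r.vgrf = to;
   };
   for (Inst &inst : prog) {
      fix(inst.dst);
      fix(inst.src);
   }
}

// The ADD and the second-half MOV from the example, with the coalesced
// 16-wide MOV already dropped.
std::vector<Inst> example_program()
{
   return {
      Inst{Ref{0, 0}, Ref{0, 0}},   // ADD (16) vgrf0, vgrf0, 1.0
      Inst{Ref{2, 0}, Ref{0, 1}},   // MOV (8)  vgrf2, vgrf0+1 { sechalf }
   };
}
```

Calling rewrite(prog, 0, 1, ...) with only the first offset recorded leaves the MOV (8) source pointing at vgrf0+1, as in the miscompiled output above. With both offsets recorded, which is what the patch arranges for 16-wide moves, the same reference becomes vgrf1+1.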