[Mesa-dev] [PATCH 05/10] i965/vec4: Mark TCS_OPCODE_SRC0_010_IS_ZERO as writing the flag.
Francisco Jerez
currojerez at riseup.net
Tue Mar 15 20:16:53 UTC 2016
Matt Turner <mattst88 at gmail.com> writes:
> On Mon, Mar 14, 2016 at 9:32 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
>> On Monday, March 14, 2016 5:25:58 PM PDT Matt Turner wrote:
>>> On Mon, Mar 14, 2016 at 5:22 PM, Matt Turner <mattst88 at gmail.com> wrote:
>>> > On Mon, Mar 14, 2016 at 5:10 PM, Francisco Jerez <currojerez at riseup.net>
>> wrote:
>>> >> Matt Turner <mattst88 at gmail.com> writes:
>>> >>
>>> >>> Missing this causes an assertion failure in the scheduler with the next
>>> >>> patch.
>>> >>> ---
>>> >>> src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++-
>>> >>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>> >>>
>>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/
>> dri/i965/brw_ir_vec4.h
>>> >>> index 660beca..7cedf8e 100644
>>> >>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>>> >>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>>> >>> @@ -201,7 +201,8 @@ public:
>>> >>> {
>>> >>> return (conditional_mod && (opcode != BRW_OPCODE_SEL &&
>>> >>> opcode != BRW_OPCODE_IF &&
>>> >>> - opcode != BRW_OPCODE_WHILE));
>>> >>> + opcode != BRW_OPCODE_WHILE)) ||
>>> >>> + opcode == TCS_OPCODE_SRC0_010_IS_ZERO;
>>> >>
>>> >> Meh... Any reason this weird instruction doesn't have the
>>> >> conditional_mod set on creation? Having the generator set it implicitly
>>> >> makes the instruction *less* useful and is the only reason we need to
>>> >> introduce these hacks.
>>> >
>>> > None that I know of. I'd be happy with that fix as a replacement for this
>> patch.
>>
>> Setting it in the visitor seems totally reasonable. I like that better.
>>
>>> >> AFAICT the TCS_OPCODE_SRC0_010_IS_ZERO opcode is actually redundant and
>>> >> equivalent to:
>>> >>
>>> >> broadcast.nz.f0 null.xyzw, src.xxxx, 0UD
>>> >
>>> > I think there's a difference -- that broadcast instruction treats the
>>> > two vec4 halves independently, reading a component from each of them.
>>> > The SRC0_010_IS_ZERO reads a single component of the whole vec8 (I
>>> > don't see calls that would set it, but that region implies it's an
>>> > align1 instruction). Does that seem right?
>>
>> Yeah, that's the point - we want to make both SIMD4x2 halves read the
>> same 32-bit scalar value.
>>
>> I didn't realize that broadcast existed in the vec4 backend. I don't
>> think it does that, though...?
>>
>>>
>>> Uh, this seems weird. That apparently generates:
>>>
>>> mov.z.f0(8) null<1>F g5<0,4,1>.xUD { align16 1Q
>> };
>>>
>>> in arb_tessellation_shader/execution/nop.shader_test.
>>>
>>> I wonder if Ken intended that instruction to be align1?
>>
>> I don't recall. Does it matter? Aren't
>>
>> mov.z.f0(8) null<1>F g5<0,4,1>.xUD { align16 1Q };
>>
>> and
>>
>> mov.z.f0(8) null<1>F g5<0,1,0>UD { align1 1Q };
>>
>> both equivalent, anyway? I think I named the opcode "010" because
>> that's effectively what it does, even if I did it in align16 mode.
>
> Yes, they're equivalent. I was just confused by setting stride(src[0],
> 0, 1, 0) in an instruction that isn't capable of a <0,1,0> stride.
> Actually, I still don't understand how that produces what it does.
I guess it works by luck, brw_eu_emit.c (hopefully always) ignores the
width and horizontal stride in Align16 mode so it ends up emitting the
correct region <0,4,1>. We should probably be setting the source region
to the correct value from the start.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160315/687f268b/attachment.sig>
More information about the mesa-dev
mailing list