[Mesa-dev] [PATCH 05/10] i965/vec4: Mark TCS_OPCODE_SRC0_010_IS_ZERO as writing the flag.

Matt Turner mattst88 at gmail.com
Tue Mar 15 19:18:34 UTC 2016


On Mon, Mar 14, 2016 at 9:32 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Monday, March 14, 2016 5:25:58 PM PDT Matt Turner wrote:
>> On Mon, Mar 14, 2016 at 5:22 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> > On Mon, Mar 14, 2016 at 5:10 PM, Francisco Jerez <currojerez at riseup.net>
> wrote:
>> >> Matt Turner <mattst88 at gmail.com> writes:
>> >>
>> >>> Missing this causes an assertion failure in the scheduler with the next
>> >>> patch.
>> >>> ---
>> >>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++-
>> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> >>>
>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/
> dri/i965/brw_ir_vec4.h
>> >>> index 660beca..7cedf8e 100644
>> >>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> >>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> >>> @@ -201,7 +201,8 @@ public:
>> >>>     {
>> >>>        return (conditional_mod && (opcode != BRW_OPCODE_SEL &&
>> >>>                                    opcode != BRW_OPCODE_IF &&
>> >>> -                                  opcode != BRW_OPCODE_WHILE));
>> >>> +                                  opcode != BRW_OPCODE_WHILE)) ||
>> >>> +             opcode == TCS_OPCODE_SRC0_010_IS_ZERO;
>> >>
>> >> Meh...  Any reason this weird instruction doesn't have the
>> >> conditional_mod set on creation?  Having the generator set it implicitly
>> >> makes the instruction *less* useful and is the only reason we need to
>> >> introduce these hacks.
>> >
>> > None that I know of. I'd be happy with that fix as a replacement for this
> patch.
>
> Setting it in the visitor seems totally reasonable.  I like that better.
>
>> >> AFAICT the TCS_OPCODE_SRC0_010_IS_ZERO opcode is actually redundant and
>> >> equivalent to:
>> >>
>> >>  broadcast.nz.f0 null.xyzw, src.xxxx, 0UD
>> >
>> > I think there's a difference -- that broadcast instruction treats the
>> > two vec4 halves independently, reading a component from each of them.
>> > The SRC0_010_IS_ZERO reads a single component of the whole vec8 (I
>> > don't see calls that would set it, but that region implies it's an
>> > align1 instruction). Does that seem right?
>
> Yeah, that's the point - we want to make both SIMD4x2 halves read the
> same 32-bit scalar value.
>
> I didn't realize that broadcast existed in the vec4 backend.  I don't
> think it does that, though...?
>
>>
>> Uh, this seems weird. That apparently generates:
>>
>> mov.z.f0(8)     null<1>F        g5<0,4,1>.xUD                   { align16 1Q
> };
>>
>> in arb_tessellation_shader/execution/nop.shader_test.
>>
>> I wonder if Ken intended that instruction to be align1?
>
> I don't recall.  Does it matter?  Aren't
>
>     mov.z.f0(8) null<1>F g5<0,4,1>.xUD { align16 1Q };
>
> and
>
>     mov.z.f0(8) null<1>F g5<0,1,0>UD { align1 1Q };
>
> both equivalent, anyway?  I think I named the opcode "010" because
> that's effectively what it does, even if I did it in align16 mode.

Yes, they're equivalent. I was just confused by setting stride(src[0],
0, 1, 0) in an instruction that isn't capable of a <0,1,0> stride.
Actually, I still don't understand how that produces what it does.


More information about the mesa-dev mailing list