[Mesa-dev] [PATCH 05/10] i965/vec4: Mark TCS_OPCODE_SRC0_010_IS_ZERO as writing the flag.

Kenneth Graunke kenneth at whitecape.org
Tue Mar 15 04:32:32 UTC 2016


On Monday, March 14, 2016 5:25:58 PM PDT Matt Turner wrote:
> On Mon, Mar 14, 2016 at 5:22 PM, Matt Turner <mattst88 at gmail.com> wrote:
> > On Mon, Mar 14, 2016 at 5:10 PM, Francisco Jerez <currojerez at riseup.net> 
wrote:
> >> Matt Turner <mattst88 at gmail.com> writes:
> >>
> >>> Missing this causes an assertion failure in the scheduler with the next
> >>> patch.
> >>> ---
> >>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/
dri/i965/brw_ir_vec4.h
> >>> index 660beca..7cedf8e 100644
> >>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> >>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> >>> @@ -201,7 +201,8 @@ public:
> >>>     {
> >>>        return (conditional_mod && (opcode != BRW_OPCODE_SEL &&
> >>>                                    opcode != BRW_OPCODE_IF &&
> >>> -                                  opcode != BRW_OPCODE_WHILE));
> >>> +                                  opcode != BRW_OPCODE_WHILE)) ||
> >>> +             opcode == TCS_OPCODE_SRC0_010_IS_ZERO;
> >>
> >> Meh...  Any reason this weird instruction doesn't have the
> >> conditional_mod set on creation?  Having the generator set it implicitly
> >> makes the instruction *less* useful and is the only reason we need to
> >> introduce these hacks.
> >
> > None that I know of. I'd be happy with that fix as a replacement for this 
patch.

Setting it in the visitor seems totally reasonable.  I like that better.

> >> AFAICT the TCS_OPCODE_SRC0_010_IS_ZERO opcode is actually redundant and
> >> equivalent to:
> >>
> >>  broadcast.nz.f0 null.xyzw, src.xxxx, 0UD
> >
> > I think there's a difference -- that broadcast instruction treats the
> > two vec4 halves independently, reading a component from each of them.
> > The SRC0_010_IS_ZERO reads a single component of the whole vec8 (I
> > don't see calls that would set it, but that region implies it's an
> > align1 instruction). Does that seem right?

Yeah, that's the point - we want to make both SIMD4x2 halves read the
same 32-bit scalar value.

I didn't realize that broadcast existed in the vec4 backend.  I don't
think it does that, though...?

> 
> Uh, this seems weird. That apparently generates:
> 
> mov.z.f0(8)     null<1>F        g5<0,4,1>.xUD                   { align16 1Q 
};
> 
> in arb_tessellation_shader/execution/nop.shader_test.
> 
> I wonder if Ken intended that instruction to be align1?

I don't recall.  Does it matter?  Aren't

    mov.z.f0(8) null<1>F g5<0,4,1>.xUD { align16 1Q };

and

    mov.z.f0(8) null<1>F g5<0,1,0>UD { align1 1Q };

both equivalent, anyway?  I think I named the opcode "010" because
that's effectively what it does, even if I did it in align16 mode.

I never wanted to make this opcode - it's weird and stupid - but we
don't have great regioning controls in the vec4 backend today, and
it does the job.  I don't see much point in trying to pretty it
(other than fixing the conditional_mod thing!).

--Ken
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160314/3b989219/attachment.sig>


More information about the mesa-dev mailing list