[Mesa-dev] [PATCH 05/10] i965/vec4: Mark TCS_OPCODE_SRC0_010_IS_ZERO as writing the flag.

Matt Turner mattst88 at gmail.com
Tue Mar 15 00:38:50 UTC 2016


On Mon, Mar 14, 2016 at 5:35 PM, Francisco Jerez <currojerez at riseup.net> wrote:
> Matt Turner <mattst88 at gmail.com> writes:
>
>> On Mon, Mar 14, 2016 at 5:10 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> Matt Turner <mattst88 at gmail.com> writes:
>>>
>>>> Missing this causes an assertion failure in the scheduler with the next
>>>> patch.
>>>> ---
>>>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>>>> index 660beca..7cedf8e 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>>>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>>>> @@ -201,7 +201,8 @@ public:
>>>>     {
>>>>        return (conditional_mod && (opcode != BRW_OPCODE_SEL &&
>>>>                                    opcode != BRW_OPCODE_IF &&
>>>> -                                  opcode != BRW_OPCODE_WHILE));
>>>> +                                  opcode != BRW_OPCODE_WHILE)) ||
>>>> +             opcode == TCS_OPCODE_SRC0_010_IS_ZERO;
>>>
>>> Meh...  Any reason this weird instruction doesn't have the
>>> conditional_mod set on creation?  Having the generator set it implicitly
>>> makes the instruction *less* useful and is the only reason we need to
>>> introduce these hacks.
>>
>> None that I know of. I'd be happy with that fix as a replacement for this patch.
>>
>>> AFAICT the TCS_OPCODE_SRC0_010_IS_ZERO opcode is actually redundant and
>>> equivalent to:
>>>
>>>  broadcast.nz.f0 null.xyzw, src.xxxx, 0UD
>>
>> I think there's a difference -- that broadcast instruction treats the
>> two vec4 halves independently, reading a component from each of them.
>> The SRC0_010_IS_ZERO reads a single component of the whole vec8 (I
>> don't see calls that would set it, but that region implies it's an
>> align1 instruction). Does that seem right?
>>
> broadcast is supposed to be about taking the value from a given channel
> (which will be a vec4 in Align16 and a scalar in Align1) and
> "broadcasting" it to all channels of the destination.  In Align16 mode
> and if the second source is an immediate it basically boils down to:

Yep, you're right. I had to look at what broadcast did again, and it
looks like it generates equivalent code (though in two instructions).

>>  brw_MOV(p, dst, stride(suboffset(src, 4 * i), 0, 4, 1));
>
> (from brw_eu_emit.c:3365), which seems pretty close to what TCS want.
> Anyway I'm fine with this patch landing as-is, this can always be
> cleaned up later.

Okay. I do like the suggestion to just set condmod at a higher level.
I'll send a patch that does that to replace this one. Thanks!


More information about the mesa-dev mailing list