[Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

Christoph Bumiller e0425955 at student.tuwien.ac.at
Thu Apr 4 08:41:18 PDT 2013


On 04.04.2013 17:23, Jose Fonseca wrote:
>
> ----- Original Message -----
>> On 04.04.2013 17:01, Jose Fonseca wrote:
>>> ----- Original Message -----
>>>>> On 04.04.2013 03:45, Zack Rusin wrote:
>>>>>> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
>>>>>> painful to emulate without branching. Most hardware
>>>>>> supports it natively and even llvm has a 'select' opcode
>>>>>> which can handle it without too much hassle.
>>>>>>
>>>>>> diff --git a/src/gallium/docs/source/tgsi.rst
>>>>>> b/src/gallium/docs/source/tgsi.rst
>>>>>> index 28308cb..6c5a02b 100644
>>>>>> --- a/src/gallium/docs/source/tgsi.rst
>>>>>> +++ b/src/gallium/docs/source/tgsi.rst
>>>>>> @@ -72,6 +72,17 @@ used.
>>>>>>  
>>>>>>    dst.w = src.w
>>>>>>  
>>>>>> +.. opcode:: MOVC - Conditional move
>>>>>> +
>>>>>> +.. math::
>>>>>> +
>>>>>> +  dst.x = src0.x ? src1.x : src2.x
>>>>>> +
>>>>>> +  dst.y = src0.y ? src1.y : src2.y
>>>>>> +
>>>>>> +  dst.z = src0.z ? src1.z : src2.z
>>>>>> +
>>>>>> +  dst.w = src0.w ? src1.w : src2.w
>>>>>>  
>>>>> I think we already have that:
>>>>>
>>>>> .. opcode:: UCMP - Integer Conditional Move
>>>>>
>>>>> .. math::
>>>>>
>>>>>   dst.x = src0.x ? src1.x : src2.x
>>>>>
>>>>>   dst.y = src0.y ? src1.y : src2.y
>>>>>
>>>>>   dst.z = src0.z ? src1.z : src2.z
>>>>>
>>>>>   dst.w = src0.w ? src1.w : src2.w
>>>>>
>>>>>
>>>>> No difference apart from the source ordering (the "integer" just implies
>>>>> that any non-zero value counts as true, i.e. also inf, nan and -0).
>>>> That's really broken. UCMP needs to be a an unsigned version of the CMP
>>>> instruction which does
>>>> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
>>>> not a whole new instruction. It's what everyone implements anyway. So if
>>>> st_glsl_to_tgsi needs
>>>> a conditional move we need to add the above patch and change it to use it.
>>> Yes, it doesn't seem that any of the TGSI_OPCODE_UCMP implementation does
>>> that the spec says it supposedly does -- it seems everybody implements it
>>> as an unsigned version of CMP. That is, it seems UCMP's description needs
>>> to be fixed.
>> Erm, unsigned < 0 doesn't make sense.
> Ah indeed!
>
>> Definitely what the description says:
>> static void
>> micro_ucmp(union tgsi_exec_channel *dst,
>>            const union tgsi_exec_channel *src0,
>>            const union tgsi_exec_channel *src1,
>>            const union tgsi_exec_channel *src2)
>> {
>>    dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
>>    dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
>>    dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
>>    dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
>> }
>>
>> or
>>
>>    case TGSI_OPCODE_UCMP:
>>    case TGSI_OPCODE_CMP:
>>       FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
>>          src0 = fetchSrc(0, c);
>>          src1 = fetchSrc(1, c);
>>          src2 = fetchSrc(2, c);
>>          if (src1 == src2)
>>             mkMov(dst0[c], src1);
>>          else
>>             mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
>> CC_NE(not equal 0),
>>                   srcTy, dst0[c], src1, src2, src0);
>>       }
>>
> But odd enough, the implementations I happend to look at seemed to do "foo >= 0":

Well, some people can't read documentation ... or they rely on the
condition value always being a glsl-to-tgsi boolean which is only either
0 or ~0/-1.

> src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c has:
>
> static void emit_ucmp(
>                 const struct lp_build_tgsi_action * action,
>                 struct lp_build_tgsi_context * bld_base,
>                 struct lp_build_emit_data * emit_data)
> {
>         LLVMBuilderRef builder = bld_base->base.gallivm->builder;
>
>         LLVMValueRef v = LLVMBuildFCmp(builder, LLVMRealUGE,
>                         emit_data->args[0], lp_build_const_float(bld_base->base.gallivm, 0.), "");
>
>         emit_data->output[emit_data->chan] = LLVMBuildSelect(builder, v, emit_data->args[2], emit_data->args[1], "");
> }
>
> (it doesn't even seem to do integers at all)
>
> src/gallium/drivers/r600/r600_shader.c:
>
> static int tgsi_ucmp(struct r600_shader_ctx *ctx)
> {
> 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
> 	struct r600_bytecode_alu alu;
> 	int i, r;
> 	int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
>
> 	for (i = 0; i < lasti + 1; i++) {
> 		if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
> 			continue;
>
> 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> 		alu.op = ALU_OP3_CNDGE_INT;
> 		r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
> 		r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
> 		r600_bytecode_src(&alu.src[2], &ctx->src[1], i);
> 		tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
> 		alu.dst.chan = i;
> 		alu.dst.write = 1;
> 		alu.is_op3 = 1;
> 		if (i == lasti)
> 			alu.last = 1;
> 		r = r600_bytecode_add_alu(ctx->bc, &alu);
> 		if (r)
> 			return r;
> 	}
> 	return 0;
> }
>



More information about the mesa-dev mailing list