[Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction
Christoph Bumiller
e0425955 at student.tuwien.ac.at
Thu Apr 4 08:41:18 PDT 2013
On 04.04.2013 17:23, Jose Fonseca wrote:
>
> ----- Original Message -----
>> On 04.04.2013 17:01, Jose Fonseca wrote:
>>> ----- Original Message -----
>>>>> On 04.04.2013 03:45, Zack Rusin wrote:
>>>>>> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
>>>>>> painful to emulate without branching. Most hardware
>>>>>> supports it natively and even llvm has a 'select' opcode
>>>>>> which can handle it without too much hassle.
>>>>>>
>>>>>> diff --git a/src/gallium/docs/source/tgsi.rst
>>>>>> b/src/gallium/docs/source/tgsi.rst
>>>>>> index 28308cb..6c5a02b 100644
>>>>>> --- a/src/gallium/docs/source/tgsi.rst
>>>>>> +++ b/src/gallium/docs/source/tgsi.rst
>>>>>> @@ -72,6 +72,17 @@ used.
>>>>>>
>>>>>> dst.w = src.w
>>>>>>
>>>>>> +.. opcode:: MOVC - Conditional move
>>>>>> +
>>>>>> +.. math::
>>>>>> +
>>>>>> + dst.x = src0.x ? src1.x : src2.x
>>>>>> +
>>>>>> + dst.y = src0.y ? src1.y : src2.y
>>>>>> +
>>>>>> + dst.z = src0.z ? src1.z : src2.z
>>>>>> +
>>>>>> + dst.w = src0.w ? src1.w : src2.w
>>>>>>
>>>>> I think we already have that:
>>>>>
>>>>> .. opcode:: UCMP - Integer Conditional Move
>>>>>
>>>>> .. math::
>>>>>
>>>>> dst.x = src0.x ? src1.x : src2.x
>>>>>
>>>>> dst.y = src0.y ? src1.y : src2.y
>>>>>
>>>>> dst.z = src0.z ? src1.z : src2.z
>>>>>
>>>>> dst.w = src0.w ? src1.w : src2.w
>>>>>
>>>>>
>>>>> No difference apart from the source ordering (the "integer" just implies
>>>>> that any non-zero value counts as true, i.e. also inf, nan and -0).
>>>> That's really broken. UCMP needs to be a an unsigned version of the CMP
>>>> instruction which does
>>>> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
>>>> not a whole new instruction. It's what everyone implements anyway. So if
>>>> st_glsl_to_tgsi needs
>>>> a conditional move we need to add the above patch and change it to use it.
>>> Yes, it doesn't seem that any of the TGSI_OPCODE_UCMP implementation does
>>> that the spec says it supposedly does -- it seems everybody implements it
>>> as an unsigned version of CMP. That is, it seems UCMP's description needs
>>> to be fixed.
>> Erm, unsigned < 0 doesn't make sense.
> Ah indeed!
>
>> Definitely what the description says:
>> static void
>> micro_ucmp(union tgsi_exec_channel *dst,
>> const union tgsi_exec_channel *src0,
>> const union tgsi_exec_channel *src1,
>> const union tgsi_exec_channel *src2)
>> {
>> dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
>> dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
>> dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
>> dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
>> }
>>
>> or
>>
>> case TGSI_OPCODE_UCMP:
>> case TGSI_OPCODE_CMP:
>> FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
>> src0 = fetchSrc(0, c);
>> src1 = fetchSrc(1, c);
>> src2 = fetchSrc(2, c);
>> if (src1 == src2)
>> mkMov(dst0[c], src1);
>> else
>> mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
>> CC_NE(not equal 0),
>> srcTy, dst0[c], src1, src2, src0);
>> }
>>
> But odd enough, the implementations I happend to look at seemed to do "foo >= 0":
Well, some people can't read documentation ... or they rely on the
condition value always being a glsl-to-tgsi boolean which is only either
0 or ~0/-1.
> src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c has:
>
> static void emit_ucmp(
> const struct lp_build_tgsi_action * action,
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
> {
> LLVMBuilderRef builder = bld_base->base.gallivm->builder;
>
> LLVMValueRef v = LLVMBuildFCmp(builder, LLVMRealUGE,
> emit_data->args[0], lp_build_const_float(bld_base->base.gallivm, 0.), "");
>
> emit_data->output[emit_data->chan] = LLVMBuildSelect(builder, v, emit_data->args[2], emit_data->args[1], "");
> }
>
> (it doesn't even seem to do integers at all)
>
> src/gallium/drivers/r600/r600_shader.c:
>
> static int tgsi_ucmp(struct r600_shader_ctx *ctx)
> {
> struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
> struct r600_bytecode_alu alu;
> int i, r;
> int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
>
> for (i = 0; i < lasti + 1; i++) {
> if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
> continue;
>
> memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> alu.op = ALU_OP3_CNDGE_INT;
> r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
> r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
> r600_bytecode_src(&alu.src[2], &ctx->src[1], i);
> tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
> alu.dst.chan = i;
> alu.dst.write = 1;
> alu.is_op3 = 1;
> if (i == lasti)
> alu.last = 1;
> r = r600_bytecode_add_alu(ctx->bc, &alu);
> if (r)
> return r;
> }
> return 0;
> }
>
More information about the mesa-dev
mailing list