[Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction
Jose Fonseca
jfonseca at vmware.com
Thu Apr 4 08:23:22 PDT 2013
----- Original Message -----
> On 04.04.2013 17:01, Jose Fonseca wrote:
> >
> > ----- Original Message -----
> >>> On 04.04.2013 03:45, Zack Rusin wrote:
> >>>> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
> >>>> painful to emulate without branching. Most hardware
> >>>> supports it natively and even llvm has a 'select' opcode
> >>>> which can handle it without too much hassle.
> >>>>
> >>>> diff --git a/src/gallium/docs/source/tgsi.rst
> >>>> b/src/gallium/docs/source/tgsi.rst
> >>>> index 28308cb..6c5a02b 100644
> >>>> --- a/src/gallium/docs/source/tgsi.rst
> >>>> +++ b/src/gallium/docs/source/tgsi.rst
> >>>> @@ -72,6 +72,17 @@ used.
> >>>>
> >>>> dst.w = src.w
> >>>>
> >>>> +.. opcode:: MOVC - Conditional move
> >>>> +
> >>>> +.. math::
> >>>> +
> >>>> + dst.x = src0.x ? src1.x : src2.x
> >>>> +
> >>>> + dst.y = src0.y ? src1.y : src2.y
> >>>> +
> >>>> + dst.z = src0.z ? src1.z : src2.z
> >>>> +
> >>>> + dst.w = src0.w ? src1.w : src2.w
> >>>>
> >>> I think we already have that:
> >>>
> >>> .. opcode:: UCMP - Integer Conditional Move
> >>>
> >>> .. math::
> >>>
> >>> dst.x = src0.x ? src1.x : src2.x
> >>>
> >>> dst.y = src0.y ? src1.y : src2.y
> >>>
> >>> dst.z = src0.z ? src1.z : src2.z
> >>>
> >>> dst.w = src0.w ? src1.w : src2.w
> >>>
> >>>
> >>> No difference apart from the source ordering (the "integer" just implies
> >>> that any non-zero value counts as true, i.e. also inf, nan and -0).
> >> That's really broken. UCMP needs to be a an unsigned version of the CMP
> >> instruction which does
> >> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
> >> not a whole new instruction. It's what everyone implements anyway. So if
> >> st_glsl_to_tgsi needs
> >> a conditional move we need to add the above patch and change it to use it.
> > Yes, it doesn't seem that any of the TGSI_OPCODE_UCMP implementation does
> > that the spec says it supposedly does -- it seems everybody implements it
> > as an unsigned version of CMP. That is, it seems UCMP's description needs
> > to be fixed.
>
> Erm, unsigned < 0 doesn't make sense.
Ah indeed!
> Definitely what the description says:
> static void
> micro_ucmp(union tgsi_exec_channel *dst,
> const union tgsi_exec_channel *src0,
> const union tgsi_exec_channel *src1,
> const union tgsi_exec_channel *src2)
> {
> dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> }
>
> or
>
> case TGSI_OPCODE_UCMP:
> case TGSI_OPCODE_CMP:
> FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> src0 = fetchSrc(0, c);
> src1 = fetchSrc(1, c);
> src2 = fetchSrc(2, c);
> if (src1 == src2)
> mkMov(dst0[c], src1);
> else
> mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
> CC_NE(not equal 0),
> srcTy, dst0[c], src1, src2, src0);
> }
>
But odd enough, the implementations I happend to look at seemed to do "foo >= 0":
src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c has:
static void emit_ucmp(
const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
LLVMValueRef v = LLVMBuildFCmp(builder, LLVMRealUGE,
emit_data->args[0], lp_build_const_float(bld_base->base.gallivm, 0.), "");
emit_data->output[emit_data->chan] = LLVMBuildSelect(builder, v, emit_data->args[2], emit_data->args[1], "");
}
(it doesn't even seem to do integers at all)
src/gallium/drivers/r600/r600_shader.c:
static int tgsi_ucmp(struct r600_shader_ctx *ctx)
{
struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
struct r600_bytecode_alu alu;
int i, r;
int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
for (i = 0; i < lasti + 1; i++) {
if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
continue;
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = ALU_OP3_CNDGE_INT;
r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
r600_bytecode_src(&alu.src[2], &ctx->src[1], i);
tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
alu.dst.chan = i;
alu.dst.write = 1;
alu.is_op3 = 1;
if (i == lasti)
alu.last = 1;
r = r600_bytecode_add_alu(ctx->bc, &alu);
if (r)
return r;
}
return 0;
}
More information about the mesa-dev
mailing list