[Mesa-dev] [PATCH] i965/fs: Reimplement nir_op_uadd_carry and _usub_borrow without accumulator.
Francisco Jerez
currojerez at riseup.net
Thu Jul 9 13:11:43 PDT 2015
Ilia Mirkin <imirkin at alum.mit.edu> writes:
> FYI there's already a lowering pass that does this in the GLSL IR
> (CARRY_TO_ARITH in lower_instructions). Perhaps the right place to do
> this is NIR though, just wanted to let you know.
>
Ah, I wasn't aware of that flag; that seems even better. I just tried
it and it seems to generate one more instruction per op than my assembly
code (apparently because our implementation of b2i is suboptimal, which
could probably be fixed), but it would also get rid of the no16()
calls, which is all I care about right now.
I'll resend using your approach tomorrow.
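
For reference, here is a rough scalar sketch of the arithmetic that both
the patch below and an arithmetic lowering rely on. It is not driver code:
the helper names are made up for illustration, and it assumes the
comparison yields an all-ones mask when true (as the CMP in the patch
does), so negating the mask gives the 0/1 result.

```cpp
#include <cstdint>

// Hypothetical helper: models CMP followed by MOV with a negate source
// modifier. The comparison is assumed to produce ~0u for true and 0 for
// false; negating that mask yields 1 or 0.
constexpr uint32_t mask_to_bit(bool cond)
{
   const uint32_t mask = cond ? ~0u : 0u;
   return 0u - mask;
}

// Carry out of a 32-bit unsigned addition: the wrapped sum is smaller
// than either operand exactly when the addition overflowed.
constexpr uint32_t uadd_carry(uint32_t a, uint32_t b)
{
   const uint32_t sum = a + b;   // wraps modulo 2^32
   return mask_to_bit(sum < a);
}

// Borrow of a 32-bit unsigned subtraction: a - b underflows exactly
// when a < b, so no subtraction is needed at all.
constexpr uint32_t usub_borrow(uint32_t a, uint32_t b)
{
   return mask_to_bit(a < b);
}

static_assert(uadd_carry(0xffffffffu, 1u) == 1u, "overflow sets carry");
static_assert(uadd_carry(1u, 2u) == 0u, "no overflow, no carry");
static_assert(usub_borrow(1u, 2u) == 1u, "underflow sets borrow");
static_assert(usub_borrow(2u, 1u) == 0u, "no underflow, no borrow");
```

The point is that neither result depends on the accumulator; each is a
plain comparison, which is why the no16() fall-backs can go away.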
> On Thu, Jul 9, 2015 at 3:51 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>> This gets rid of two no16() fall-backs and should allow better
>> scheduling of the generated IR. There are no uses of usubBorrow() or
>> uaddCarry() in shader-db so no changes are expected. However the
>> "arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and
>> "arb_gpu_shader5/execution/built-in-functions/fs-uaddCarry" piglit
>> tests go from 40 to 28 instructions. The reason is that the plain ADD
>> instruction can easily be CSE'ed with the original addition, and the
>> negation can easily be propagated into the source modifier of another
>> instruction, so effectively both operations can be performed with just
>> one instruction.
>>
>> No piglit regressions.
>> ---
>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 33 +++++++++++++-------------------
>> 1 file changed, 13 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 6d9e9d3..3b6aa0a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -829,29 +829,22 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
>>        bld.emit(SHADER_OPCODE_INT_QUOTIENT, result, op[0], op[1]);
>>        break;
>>
>> -   case nir_op_uadd_carry: {
>> -      if (devinfo->gen >= 7)
>> -         no16("SIMD16 explicit accumulator operands unsupported\n");
>> -
>> -      struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
>> -                                  BRW_REGISTER_TYPE_UD);
>> -
>> -      bld.ADDC(bld.null_reg_ud(), op[0], op[1]);
>> -      bld.MOV(result, fs_reg(acc));
>> +   case nir_op_uadd_carry:
>> +      /* Use signed operands for the ADD to be easily CSE'ed with the original
>> +       * addition (e.g. in case we're implementing the uaddCarry() GLSL
>> +       * built-in).
>> +       */
>> +      bld.ADD(result, retype(op[0], BRW_REGISTER_TYPE_D),
>> +              retype(op[1], BRW_REGISTER_TYPE_D));
>> +      bld.CMP(result, retype(result, BRW_REGISTER_TYPE_UD), op[0],
>> +              BRW_CONDITIONAL_L);
>> +      bld.MOV(result, negate(result));
>>        break;
>> -   }
>>
>> -   case nir_op_usub_borrow: {
>> -      if (devinfo->gen >= 7)
>> -         no16("SIMD16 explicit accumulator operands unsupported\n");
>> -
>> -      struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
>> -                                  BRW_REGISTER_TYPE_UD);
>> -
>> -      bld.SUBB(bld.null_reg_ud(), op[0], op[1]);
>> -      bld.MOV(result, fs_reg(acc));
>> +   case nir_op_usub_borrow:
>> +      bld.CMP(result, op[0], op[1], BRW_CONDITIONAL_L);
>> +      bld.MOV(result, negate(result));
>>        break;
>> -   }
>>
>>     case nir_op_umod:
>>        bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]);
>> --
>> 2.4.3
>>