[Mesa-dev] [PATCH 16/26] i965: Implement the new imod and irem opcodes
Jason Ekstrand
jason at jlekstrand.net
Mon Apr 4 19:00:36 UTC 2016
On Fri, Apr 1, 2016 at 9:56 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>
>
> On Tue, Mar 29, 2016 at 8:49 PM, Matt Turner <mattst88 at gmail.com> wrote:
>
>> On Fri, Mar 25, 2016 at 4:12 PM, Jason Ekstrand <jason at jlekstrand.net>
>> wrote:
>> > ---
>> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32
>> ++++++++++++++++++++++++++++++
>> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 32
>> ++++++++++++++++++++++++++++++
>> > 2 files changed, 64 insertions(+)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> > index 14480fb..131f50e 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> > @@ -844,8 +844,40 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
>> nir_alu_instr *instr)
>> > unreachable("Should have been lowered by borrow_to_arith().");
>> >
>> > case nir_op_umod:
>> > + case nir_op_irem:
>> > + /* According to the sign table for INT DIV in the Ivy Bridge
>> PRM, it
>> > + * appears that our hardware just does the right thing for signed
>> > + * remainder.
>> > + */
>> > + bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]);
>> > + break;
>> > +
>> > + case nir_op_imod: {
>> > + /* Get a regular C-style remainder. If a % b == 0, set the
>> predicate. */
>> > bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]);
>> > +
>> > + /* Math instructions don't support conditional mod */
>> > + inst = bld.MOV(bld.null_reg_d(), result);
>> > + inst->conditional_mod = BRW_CONDITIONAL_NZ;
>> > +
>> > + /* Now, we need to determine if signs of the sources are
>> different.
>> > + * When we XOR the sources, the top bit is 0 if they are the
>> same and 1
>> > + * if they are different. We can then use a conditional
>> modifier to
>> > + * turn that into a predicate. This leads us to an XOR.l
>> instruction.
>> > + */
>> > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_D);
>> > + inst = bld.XOR(tmp, op[0], op[1]);
>> > + inst->predicate = BRW_PREDICATE_NORMAL;
>> > + inst->conditional_mod = BRW_CONDITIONAL_L;
>>
>> This goes against the PRM:
>>
>> "This operation does not produce sign or overflow conditions. Only the
>> .e/.z or .ne/.nz conditional modifiers should be used."
>>
>
> So, interesting news: I wrote a Vulkan CTS test for both imod and irem
> (they weren't tested by the CTS before) to see what the hardware does.
> And, contrary to what the PRM might lead you to believe, it seems to work
> just fine. I've only tested on SKL so far but I have tested in both FS and
> vec4. I'll try it on BDW and HSW before declaring victory, but it looks
> like XOR.l might be a well-defined thing after all.
>
More data: It appears to also work on Ivy Bridge.
> At the very least, we should add a comment with the PRM citation and the
> empirical results.
> --Jason
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160404/4414d970/attachment.html>
More information about the mesa-dev
mailing list