<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 29, 2016 at 8:49 PM, Matt Turner <span dir="ltr"><<a href="mailto:mattst88@gmail.com" target="_blank">mattst88@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Fri, Mar 25, 2016 at 4:12 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br> > ---<br> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32 ++++++++++++++++++++++++++++++<br> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 32 ++++++++++++++++++++++++++++++<br> > 2 files changed, 64 insertions(+)<br> ><br> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp<br> > index 14480fb..131f50e 100644<br> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp<br> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp<br> > @@ -844,8 +844,40 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)<br> > unreachable("Should have been lowered by borrow_to_arith().");<br> ><br> > case nir_op_umod:<br> > + case nir_op_irem:<br> > + /* According to the sign table for INT DIV in the Ivy Bridge PRM, it<br> > + * appears that our hardware just does the right thing for signed<br> > + * remainder.<br> > + */<br> > + bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]);<br> > + break;<br> > +<br> > + case nir_op_imod: {<br> > + /* Get a regular C-style remainder. If a % b == 0, set the predicate. */<br> > bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]);<br> > +<br> > + /* Math instructions don't support conditional mod */<br> > + inst = bld.MOV(bld.null_reg_d(), result);<br> > + inst->conditional_mod = BRW_CONDITIONAL_NZ;<br> > +<br> > + /* Now, we need to determine if signs of the sources are different.<br> > + * When we XOR the sources, the top bit is 0 if they are the same and 1<br> > + * if they are different. 
We can then use a conditional modifier to<br> > + * turn that into a predicate. This leads us to an XOR.l instruction.<br> > + */<br> > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_D);<br> > + inst = bld.XOR(tmp, op[0], op[1]);<br> > + inst->predicate = BRW_PREDICATE_NORMAL;<br> > + inst->conditional_mod = BRW_CONDITIONAL_L;<br> <br> </div></div>This goes against the PRM:<br> <br> "This operation does not produce sign or overflow conditions. Only the<br> .e/.z or .ne/.nz conditional modifiers should be used."<br> </blockquote></div><br></div><div class="gmail_extra">So, interesting news: I wrote a Vulkan CTS test for both imod and irem (they weren't tested by the CTS before) to see what the hardware does. And, contrary to what the PRM might lead you to believe, the .l conditional modifier on XOR seems to work just fine. So far I've only tested on SKL, but in both FS and vec4. I'll try it on BDW and HSW before declaring victory, but it looks like XOR.l might be a well-defined thing after all.<br><br></div><div class="gmail_extra">At the very least, we should add a comment with the PRM citation and the empirical results.<br></div><div class="gmail_extra">--Jason<br></div></div>
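For reference, a minimal scalar sketch of the semantics under discussion, assuming imod is meant to return a result with the sign of the divisor and irem a C-style remainder. The helper names (irem, imod, signs_differ) are hypothetical, and the final "add the divisor" step is an assumption, since the quoted part of the patch stops at the XOR. It is not the driver code and assumes b != 0 and no INT_MIN / -1 overflow.

```cpp
#include <cassert>
#include <cstdint>

// C-style remainder: the result takes the sign of the dividend.
static int32_t irem(int32_t a, int32_t b) { return a % b; }

// The top bit of a ^ b is set exactly when the signs differ, so the XOR
// compares as negative. This mirrors the "XOR.l" idea from the patch.
static bool signs_differ(int32_t a, int32_t b) { return (a ^ b) < 0; }

// Modulo whose result takes the sign of the divisor.
// Assumed fix-up: add the divisor when the remainder is nonzero
// and the operand signs differ.
static int32_t imod(int32_t a, int32_t b) {
    int32_t r = a % b;
    if (r != 0 && signs_differ(a, b))
        r += b;
    return r;
}
```

With these definitions, irem(-7, 3) gives -1 while imod(-7, 3) gives 2; the two only disagree when the remainder is nonzero and the operand signs differ, which is what the two predicates in the patch are checking for.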