[Mesa-dev] [PATCH] R600/SI: Fix fneg for 0.0

Mon Feb 3 23:16:09 CET 2014

On Mon, Feb 03, 2014 at 04:43:10PM +0900, Michel Dänzer wrote:
> On Don, 2014-01-30 at 10:43 -0500, Tom Stellard wrote:
> > On Wed, Jan 29, 2014 at 06:23:00PM +0900, Michel Dänzer wrote:
> > > From: Michel Dänzer <michel.daenzer at amd.com>
> > > 
> > > V_ADD_F32 with source modifier does not produce -0.0 for this. Just
> > > manipulate the sign bit directly instead.
> > 
> > That's strange, so does this mean we can never use these modifiers?
> 
> I think we could use them for folding fabs/fneg into other instructions
> using their results, as we're already doing for pre-SI.
> 
> The problem here is that adding -0.0 to 0.0 results in 0.0, not -0.0.
> 
> 
> > > Also add a pattern for (fneg (fabs ...)).
> > > 
> > > Fixes a bunch of bit encoding piglit tests with radeonsi.
> > > 
> > > Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
> > > ---
> > >  lib/Target/R600/SIInstructions.td | 11 +++++++----
> > >  1 file changed, 7 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
> > > index 912b59a..43fe63c 100644
> > > --- a/lib/Target/R600/SIInstructions.td
> > > +++ b/lib/Target/R600/SIInstructions.td
> > > @@ -1684,15 +1684,18 @@ def : Pat <
> > >  >;
> > >  
> > >  def : Pat <
> > > +  (fneg (fabs f32:$src)),
> > > +  (V_OR_B32_e32 $src, (V_MOV_B32_e32 0x80000000)) /* Set sign bit */
> > > +>;
> > > +
> > > +def : Pat <
> > >    (fabs f32:$src),
> > > -  (V_ADD_F32_e64 $src, (i32 0 /* SRC1 */),
> > > -   1 /* ABS */, 0 /* CLAMP */, 0 /* OMOD */, 0 /* NEG */)
> > > +  (V_AND_B32_e32 $src, (V_MOV_B32_e32 0x7fffffff)) /* Clear sign bit */
> > >  >;
> > >  
> > >  def : Pat <
> > >    (fneg f32:$src),
> > > -  (V_ADD_F32_e64 $src, (i32 0 /* SRC1 */),
> > > -   0 /* ABS */, 0 /* CLAMP */, 0 /* OMOD */, 1 /* NEG */)
> > > +  (V_XOR_B32_e32 $src, (V_MOV_B32_e32 0x80000000)) /* Toggle sign bit */
> > >  >;
> > 
> > I think you may be able to achieve the same results by marking
> > ISD::FNEG and ISD::FABS as Expand in SIISelLowering.
> 
> That seems to work as expected for the *-floatBitsToInt-neg(_abs) piglit
> tests, but the lit tests end up using V_SUB_F32 vX, -0.0, vY for fneg,
> and while fabs results in V_AND_B32 as expected for a single f32, it
> ends up using more complex comparisons and selects for v2f32 and v4f32.
> So I'm not sure what to do about the lit tests in that case.
> 

This sounds like a bug in the vector lowering code.

> 
> > Also, we have implemented isFAbsFree() and isFNegFree() in
> > AMDGPUISelLowering.cpp
> > We will need to move these implementations into R600ISelLowering.cpp
> > now that FAbs and FNeg are no longer free on SI.
> 
> FWIW, they're not really more expensive with this change than before. :)
> I think implementing these for SI is already wrong at this point.
> 

Yes, good point.

> 
> May I ask you to fix this in your preferred way?
> 

It's clear that there a few things that are wrong which are unrelated to this
patch, so I think it is fine as is.  Could you add a comment above the pattern
explaining why we need to manually toggle the sign bit and also a todo to fix FabsFree
and FNegFree.

If you add a lit test, then this patch has my r-b.

-Tom

> 
> -- 
> Earthling Michel Dänzer            |                  http://www.amd.com
> Libre software enthusiast          |                Mesa and X developer
>