[Mesa-dev] [PATCH] glsl: Fix type error when lowering integer divisions

Mon Aug 15 08:50:58 PDT 2011

----- Original Message -----
> On 13 August 2011 09:58, Kenneth Graunke <kenneth at whitecape.org>
> wrote:
> > On 08/12/2011 10:38 AM, Paul Berry wrote:
> >>
> >> This patch fixes a bug when lowering an integer division:
> >>
> >>   x/y
> >>
> >> to a multiplication by a reciprocal:
> >>
> >>   int(float(x)*reciprocal(float(y)))
> >>
> >> If x was a plain int and y was an ivecN, the lowering pass
> >> incorrectly assigned the type of the product to be float, when in
> >> fact it should be vecN.  This caused mesa to abort with an IR
> >> validation error.
> >>
> >> Fixes piglit tests {fs,vs}-op-div-int-ivec{2,3,4}.
> >
> > Good catch, Paul!  Thanks again for writing all these test cases.
> >
> > Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> >
> > Come to think of it, we may want to avoid this altogether on i965.
> > The mathbox has an INT DIV message that computes integer quotient
> > and remainder...so we can support it natively.
> >
> > I guess the question is "which is faster?".  My intuition says that
> > using INT DIV will be faster on Gen6+, possibly on Gen5, and slower
> > on Gen4/G45.  AFAICT on Gen5+ you can compute quotient & remainder
> > separately (3 or 4 rounds) while on Gen4 you always have to compute
> > both (3 + 4 = 7 rounds?).  Meanwhile RCP is 2 rounds.  Not only is
> > that more rounds, it means hogging the shared mathbox for longer.
> 
> Accuracy is also a question.  Our current technique of multiplying by
> the reciprocal doesn't work for some denominators because of rounding
> errors in computing the reciprocal.  For example, try to write a
> piglit test that computes 77/77.  On Gen5 hardware, at least, this
> produces zero.  The reason is that rounding errors in computing the
> floating point reciprocal mean that 77*reciprocal(77) is actually
> slightly less than 1.0, so it gets rounded down to zero when it's
> converted back to an int.  Note: I believe the smallest integer for
> which rounding errors cause n*reciprocal(n) to be less than 1.0 is
> n=25, which probably explains why we haven't noticed this bug before.
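
The truncation problem described in the quoted text can be reproduced on the CPU with a small sketch. The helper names (div_via_rcp, rcp_one_ulp_low) are made up for illustration, and the "one ulp too small" reciprocal is only an assumed stand-in for a hardware RCP that rounds low; it is not a model of any particular GPU.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an imprecise hardware RCP: compute 1/a, then
 * step the result one representable float toward zero by decrementing its
 * bit pattern.  Assumes a is finite and nonzero. */
static float rcp_one_ulp_low(float a)
{
    float r = 1.0f / a;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits -= 1;                      /* magnitude shrinks by one ulp */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* The lowering discussed in the thread: int(float(x) * reciprocal(float(y))).
 * The reciprocal is passed in so different approximations can be compared. */
static int div_via_rcp(int x, int y, float (*rcp)(float))
{
    return (int)((float)x * rcp((float)y));   /* conversion truncates */
}
```

With this assumed reciprocal, div_via_rcp(77, 77, rcp_one_ulp_low) yields 0 and div_via_rcp(8, 2, rcp_one_ulp_low) yields 3, because the product lands just below the exact quotient and the float-to-int conversion truncates.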

In places where you don't have native int division support, you could use one Newton-Raphson iteration step to get almost accurate results, as is suggested for improving the accuracy of SSE's RCPPS instruction. For reference, see the following llvmpipe comment:

 /**
 * Do one Newton-Raphson step to improve reciprocate precision:
 *
 *   x_{i+1} = x_i * (2 - a * x_i)
 *
 * XXX: Unfortunately this won't give IEEE-754 conformant results for 0 or
 * +/-Inf, giving NaN instead.  Certain applications rely on this behavior,
 * such as Google Earth, which does RCP(RSQRT(0.0) when drawing the Earth's
 * halo. It would be necessary to clamp the argument to prevent this.
 *
 * See also:
 * - http://en.wikipedia.org/wiki/Division_(digital)#Newton.E2.80.93Raphson_division
 * - http://softwarecommunity.intel.com/articles/eng/1818.htm
 */

The softwarecommunity.intel.com link is down, but the "Intel® 64 and IA-32 Architectures Optimization Reference Manual" also documents this.

As mentioned, the N-R iteration gives wrong results for the reciprocal of +/-inf, but that's guaranteed to never happen when the arguments are integers encoded as floats.
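
A minimal sketch of that refinement step in plain C, in case it helps. rcp_approx is a hypothetical stand-in for a low-precision reciprocal (it just clears the low 12 mantissa bits of 1/a), and rcp_refine and rcp_rel_error are likewise names made up here; none of this is the llvmpipe code itself.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical low-precision reciprocal: compute 1/a and clear the low
 * 12 mantissa bits, leaving roughly 11 bits of precision. */
static float rcp_approx(float a)
{
    float r = 1.0f / a;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= ~(uint32_t)0xFFF;
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* One Newton-Raphson step: x_{i+1} = x_i * (2 - a * x_i).
 * Produces NaN for a = 0 or +/-Inf, as the llvmpipe comment notes. */
static float rcp_refine(float a, float x)
{
    return x * (2.0f - a * x);
}

/* Relative error of x as an approximation of 1/a, measured in double. */
static double rcp_rel_error(float a, float x)
{
    double d = (double)a * (double)x - 1.0;
    return d < 0.0 ? -d : d;
}
```

For a = 77 the relative error of rcp_approx(a) is on the order of 1e-4, and one rcp_refine step brings it well below 1e-6. The step roughly squares the relative error, but it does not guarantee a correctly rounded result, so a product like n*rcp(n) can in principle still land just below 1.0 and truncate to zero.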

Jose