[Mesa-dev] [PATCH] glsl: Fix type error when lowering integer divisions

Mon Aug 15 09:48:17 PDT 2011

On 15 August 2011 08:50, Jose Fonseca <jfonseca at vmware.com> wrote:
>
> In places you don't have native int division support, you could use one Newton-Raphson iteration step for almost accurate results, assuggested accuracy of SSE2's RCPPS instructions. See for reference the following llvmpipe comment:
>
>  /**
>  * Do one Newton-Raphson step to improve reciprocate precision:
>  *
>  *   x_{i+1} = x_i * (2 - a * x_i)
>  *
>  * XXX: Unfortunately this won't give IEEE-754 conformant results for 0 or
>  * +/-Inf, giving NaN instead.  Certain applications rely on this behavior,
>  * such as Google Earth, which does RCP(RSQRT(0.0) when drawing the Earth's
>  * halo. It would be necessary to clamp the argument to prevent this.
>  *
>  * See also:
>  * - http://en.wikipedia.org/wiki/Division_(digital)#Newton.E2.80.93Raphson_division
>  * - http://softwarecommunity.intel.com/articles/eng/1818.htm
>  */
>
> The softwarecommunity.intel.com link is down, but the "Intel® 64 and IA-32 Architectures Optimization Reference Manual" also documents this.
>
>
> As mentioned, the N-R iteration gives wrong results for the reciprocate of +/-inf, but that's guaranteed to never happen when the arguments are integers encoded as floats.
>
> Jose
>

Thanks for the reference, Jose.  My understanding is that applying
Newton-Raphson to the reciprocal operation won't help directly in this
case (since the problem is due to the fundamental precision limit of
32-bit floats, and happens even if the reciprocal is computed
perfectly), but we may be able to cook up a variation on N-R that
works in this case.

Another idea I've been toying around with would be to implement the
equivalent of the following GLSL:

int divide(int x, int y)
{
  int quotient = int(float(x)*reciprocal(float(y)));
  if ((quotient+1)*y == x) {
    // Fix the case where y divides x exactly, and rounding errors cause
    // us to compute the wrong quotient.
    quotient = quotient + 1;
  }
  return quotient;
}

(the actual routine would have to be slightly more complex than this
to make sure to do the right thing for negative values).

It's kind of an ugly hack, but it wouldn't cost too many GPU
instructions, and it would give us sufficient accuracy for pre-GLSL
1.30.