[Mesa-dev] [PATCH] glsl: Fix type error when lowering integer divisions

Mon Aug 15 10:11:44 PDT 2011

On 08/15/2011 09:48 AM, Paul Berry wrote:
> On 15 August 2011 08:50, Jose Fonseca<jfonseca at vmware.com>  wrote:
>> In places you don't have native int division support, you could use one Newton-Raphson iteration step for almost accurate results, assuggested accuracy of SSE2's RCPPS instructions. See for reference the following llvmpipe comment:
>>
>>   /**
>>   * Do one Newton-Raphson step to improve reciprocate precision:
>>   *
>>   *   x_{i+1} = x_i * (2 - a * x_i)
>>   *
>>   * XXX: Unfortunately this won't give IEEE-754 conformant results for 0 or
>>   * +/-Inf, giving NaN instead.  Certain applications rely on this behavior,
>>   * such as Google Earth, which does RCP(RSQRT(0.0) when drawing the Earth's
>>   * halo. It would be necessary to clamp the argument to prevent this.
>>   *
>>   * See also:
>>   * - http://en.wikipedia.org/wiki/Division_(digital)#Newton.E2.80.93Raphson_division
>>   * - http://softwarecommunity.intel.com/articles/eng/1818.htm
>>   */
>>
>> The softwarecommunity.intel.com link is down, but the "Intel® 64 and IA-32 Architectures Optimization Reference Manual" also documents this.
>>
>>
>> As mentioned, the N-R iteration gives wrong results for the reciprocate of +/-inf, but that's guaranteed to never happen when the arguments are integers encoded as floats.
>>
>> Jose
>>
> Thanks for the reference, Jose.  My understanding is that applying
> Newton-Raphson to the reciprocal operation won't help directly in this
> case (since the problem is due to the fundamental precision limit of
> 32-bit floats, and happens even if the reciprocal is computed
> perfectly), but we may be able to cook up a variation on N-R that
> works in this case.
>
> Another idea I've been toying around with would be to implement the
> equivalent of the following GLSL:
>
> int divide(int x, int y)
> {
>    int quotient = int(float(x)*reciprocal(float(y)));
>    if ((quotient+1)*y == x) {
>      // Fix the case where y divides x exactly, and rounding errors cause
>      // us to compute the wrong quotient.
>      quotient = quotient + 1;
>    }
>    return quotient;
> }
>
> (the actual routine would have to be slightly more complex than this
> to make sure to do the right thing for negative values).
>
> It's kind of an ugly hack, but it wouldn't cost too many GPU
> instructions, and it would give us sufficient accuracy for pre-GLSL
> 1.30.
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

You might also want to consider implementing
     quotient = int((float(x) + 0.5 * float(y)) * reciprocal(float(y)));

This rounds the result to the nearest integer rather then flooring the 
result and is arguably faster (assuming that common subexpressions for 
float(y) are hoisted).

cheers, danm