[Mesa-dev] Improving precision of mod(x,y)
Glenn Kennard
glenn.kennard at gmail.com
Thu Jan 15 12:26:27 PST 2015
On Thu, 15 Jan 2015 15:32:59 +0100, Roland Scheidegger
<sroland at vmware.com> wrote:
> Am 15.01.2015 um 10:05 schrieb Iago Toral:
>> Hi,
>>
>> We have 16 deqp tests that fail, at least on i965, because of
>> insufficient precision of the mod GLSL function.
>>
>> Mesa lowers mod(x,y) to y * fract(x,y) so there can be some precision
>> lost due to fract operation. Since the result is multiplied by y the
>> total precision lost usually grows together with the value of y.
> Did you mean fract(x/y) here?
>
>>
>> Below are some examples to give an idea of the magnitude of this error.
>> The values on the right represent the precision error for each case:
>>
>> mod(-1.951171875, 1.9980468750) => 0.0000000447
>> mod(121.57, 13.29) => 0.0000023842
>> mod(3769.12, 321.99) => 0.0000762939
>> mod(3769.12, 1321.99) => 0.0001220703
>> mod(-987654.125, 123456.984375) => 0.0160663128
>> mod( 987654.125, 123456.984375) => 0.0312500000
>>
>> As you see, for large enough values, the precision error becomes
>> significant.
>>
>> This can be fixed by lowering mod(x,y) to x - y * floor(x/y) instead,
>> which is the suggested implementation in the GLSL docs. I have a local
>> patch in my tree that does this and it does indeed fix the problem. The
>> downside is that this implementation adds an extra ADD instruction to
>> the generated code (besides replacing fract with floor, which I guess
>> has a similar cost).
>>
>> Since this is a case where there is some trade-off to the fix, I wonder
>> if we are interested in doing this or not. Is the precision fix worth
>> the additional ADD?
>>
>
> Well I can tell you that llvmpipe implements frc(x) as x - floor(x), so
> this change looks good to me :-).
> On a more serious note though, it looks to me like the cost of this
> expression would be mostly dominated by the division, so one more
> add shouldn't be that bad. And if the test is legit, I don't think
> there's much choice (unless you could make this optional for some old
> glsl versions if they didn't require that much precision but even then
> it's probably not worth bothering imho).
>
FWIW, I just typed out the following little piglit test and tried it on
R600:
[require]
GLSL >= 3.30

[vertex shader passthrough]

[fragment shader]
uniform float a;
uniform float b;
out vec4 colour;

void
main(void)
{
    // colour = vec4(b * fract(a / b)); // current lowering of mod(x,y)
    colour = vec4(a - b * floor(a/b)); // proposed lowering
}

[test]
clear color 0.5 0.5 0.5 0.5
clear
uniform float a 4.2
uniform float b 3.5
draw rect -1 -1 2 2
probe rgba 1 1 0.7 0.7 0.7 0.7
Resulting R600 assembly:
// y * fract(x/y)
// KC0[0].x is x and KC0[1].x is y
1 t: RECIP_IEEE T0.x, KC0[1].x
2 x: MUL T0.x, KC0[0].x, T0.x
3 x: FRACT T0.x, T0.x
4 x: MUL R0.x, KC0[1].x, T0.x
EXPORT_DONE PIXEL 0 R0.xxxx EOP
// x - y * floor(x/y)
1 t: RECIP_IEEE T0.x, KC0[1].x
2 x: MUL T0.x, KC0[0].x, T0.x
3 x: FLOOR T0.x, T0.x
4 x: MULADD R0.x, KC0[1].x, -T0.x, KC0[0].x
EXPORT_DONE PIXEL 0 R0.xxxx EOP
Both methods have the same cycle count, dependency chain length, and ALU
pipe usage.
I'd expect most architectures that can do source negate with multiply-add
in a single operation to get similar results, with no extra cost for
the subtraction.
/Glenn