[Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

Fri Jan 25 09:55:47 PST 2013

On 25 January 2013 07:49, Paul Berry <stereotype441 at gmail.com> wrote:

> On 24 January 2013 19:44, Matt Turner <mattst88 at gmail.com> wrote:
>
>> Following this email are eight patches that add the 4x8 pack/unpack
>> operations that are the difference between what GLSL ES 3.0 and
>> ARB_shading_language_packing require.
>>
>> They require Chad's gles3-glsl-packing series and are available at
>>
>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>>
>> I've also added testing support on top of Chad's piglit patch. The
>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
>> spot why.
>>
>
> I had minor comments on patches 4/8 and 5/8.  The remainder is:
>
> Reviewed-by: Paul Berry <stereotype441 at gmail.com>
>
> I didn't spot anything that would explain the failure in unpackUnorm4x8
> tests.  I'll go have a look at your piglit tests now, and if I don't find
> anything there either, I'll fire up the simulator and see if I can see
> what's going wrong.
>

I found the problem.  On i965, floating point divisions are implemented as
multiplication by a reciprocal, whereas on the CPU there's a floating point
division instruction.  Therefore, unpackUnorm4x8's computation of "f /
255.0" doesn't yield consistent results when run on the CPU vs the
GPU--there is a tiny difference due to the accumulation of floating point
rounding errors.

That's why the "fs" and "vs" variants of the tests failed, and the "const"
variant passed--because Mesa does constant folding using the CPU's floating
point division instruction, which matches the Python test generator
perfectly, whereas the "fs" and "vs" variants use the actual GPU.

It's only by dumb luck that this rounding error issue didn't bite us until
now, because in principle it could equally well have occurred in the
unpack2x16 functions.

I believe we should relax the test to allow for these tiny rounding errors
(this is what the other test generators, such as
gen_builtin_uniform_tests.py, do).  As an experiment I modified
gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
fs_unpack_2x16_template, "actual == expect${j}" is replaced with
"distance(actual, expect${j}) < 0.00001".  With this change, the test
passes.

However, that change isn't good enough to commit to piglit, for two reasons:

(1) It should only be applied when testing the functions whose definition
includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
unpackSnorm2x16).  A properly functioning implementation ought to be able
to get exact answers with all the other packing functions, and we should
test that it does.

(2) IMHO we should test that ouput values of 0.0 and +/- 1.0 are produced
without error, since a shader author might conceivably write code that
relies on these values being exact.  That is, we should check that the
following conversions are exact, with no rounding error:

unpackUnorm4x8(0) == vec4(0.0)
unpackUnorm4x8(0xffffffff) == vec4(1.0)
unpackSnorm4x8(0) == vec4(0.0)
unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
unpackSnorm4x8(0x80808080) == vec4(-1.0)
unpackSnorm4x8(0x81818181) == vec4(-1.0)
unpackUnorm2x16(0) == vec2(0.0)
unpackUnorm2x16(0xffffffff) == vec4(1.0)
unpackSnorm2x16(0) == vec4(0.0)
unpackSnorm2x16(0x7fff7fff) == vec4(1.0)
unpackSnorm2x16(0x80008000) == vec4(-1.0)
unpackSnorm2x16(0x80018001) == vec4(-1.0)

My recommendation: address problem (1) by modifying the templates to accept
a new parameter that determines whether the test needs to be precise or
approximate (e.g. "func.precise").  Address problem (2) by hand-coding a
few shader_runner tests to check the cases above.  IMHO it would be ok to
leave the current patch as is (modulo my previous comments) and do a pair
of follow-on patches to address problems (1) and (2).

Chad, do you have any thoughts on this subject, since you're the original
author of this test generator?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130125/33bf894e/attachment.html>