[Piglit] [PATCH 1/2] ARB_shading_language_packing: Allow some imprecision in unpackUnorm4x8

Tue Feb 5 15:56:31 PST 2013

On Tue, Feb 5, 2013 at 12:56 PM, Paul Berry <stereotype441 at gmail.com> wrote:
> On 31 January 2013 22:51, Matt Turner <mattst88 at gmail.com> wrote:
>>
>> The unpack builtins are implemented using floating-point division, which
>> is implemented by reciprocal+multiply.
>
>
> This should probably say something like "Which is implemented by
> reciprocal+multiply on Mesa/i965".  Other drivers don't necessarily use
> reciprocal+multiply to do division.
>
>>
>> As it so happens, 1.0/255.0 falls
>> almost exactly between two representable 32-bit floats and leads to bad
>> rounding. None of the other divisors (127, 32767, 65535) have this
>> property.
>>
>>
>> So allow the result of unpackUnorm4x8 to be within some small epsilon
>> (0.00001) of the actual value. Some values like 0.0 and 1.0 should be
>> calculated exactly, so handle them specially.
>> ---
>>  generated_tests/gen_builtin_packing_tests.py | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/generated_tests/gen_builtin_packing_tests.py
>> b/generated_tests/gen_builtin_packing_tests.py
>> index 7c1074e..0f7510f 100644
>> --- a/generated_tests/gen_builtin_packing_tests.py
>> +++ b/generated_tests/gen_builtin_packing_tests.py
>> @@ -231,6 +231,8 @@ vs_unpack_template = Template(dedent("""\
>>
>>      uniform highp uint func_input;
>>
>> +    uniform bool exact;
>> +
>>      % for j in range(func.num_valid_outputs):
>>      uniform ${func.result_precision} ${func.vector_type} expect${j};
>>      % endfor
>> @@ -246,6 +248,9 @@ vs_unpack_template = Template(dedent("""\
>>
>>          if (false
>>              % for j in range(func.num_valid_outputs):
>> +            % if not func.exact:
>> +            || (!exact && distance(actual, expect${j}) < 0.00001)
>> +            %endif
>>              || actual == expect${j}
>>              % endfor
>
>
> I had a little trouble following the boolean logic here.  Would something
> like this be clearer?
>
> % for j in range(func.num_valid_outputs):
> || (exact ? actual == expect${j}
>
>           : distance(actual, expect${j}) < 0.00001)
> % endfor
>
> (I don't consider it a blocking issue, though.  The code you've written is
> correct, it just took me a while to puzzle out.)
>
> Note: since my suggested change would cause the shader to always reference
> the "exact" uniform, you'd have to change the code below to emit "uniform"
> commands even when func.exact is true.
>
>>
>>             ) {
>> @@ -274,6 +279,9 @@ vs_unpack_template = Template(dedent("""\
>>      [test]
>>      % for io in func.inout_seq:
>>      uniform uint func_input ${io.input}
>> +    % if not func.exact:
>> +    uniform int exact ${int(int(io.input[:-1]) in (0x0, 0xffffffff))}
>> +    % endif
>>      % for j in range(func.num_valid_outputs):
>>      uniform ${func.vector_type} expect${j} ${"
>> ".join(io.valid_outputs[j])}
>>      % endfor
>> @@ -370,6 +378,8 @@ fs_unpack_template = Template(dedent("""\
>>
>>      uniform highp uint func_input;
>>
>> +    uniform bool exact;
>> +
>>      % for i in range(func.num_valid_outputs):
>>      uniform ${func.result_precision} ${func.vector_type} expect${i};
>>      % endfor
>> @@ -382,6 +392,9 @@ fs_unpack_template = Template(dedent("""\
>>
>>          if (false
>>              % for i in range(func.num_valid_outputs):
>> +            % if not func.exact:
>> +            || (!exact && distance(actual, expect${i}) < 0.00001)
>> +            %endif
>>              || actual == expect${i}
>>              % endfor
>>             ) {
>> @@ -401,6 +414,9 @@ fs_unpack_template = Template(dedent("""\
>>      [test]
>>      % for io in func.inout_seq:
>>      uniform uint func_input ${io.input}
>> +    % if not func.exact:
>> +    uniform int exact ${int(int(io.input[:-1]) in (0x0, 0xffffffff))}:
>> +    % endif
>>      % for i in range(func.num_valid_outputs):
>>      uniform ${func.vector_type} expect${i} ${"
>> ".join(io.valid_outputs[i])}
>>      % endfor
>> @@ -1271,6 +1287,9 @@ class FuncInfo:
>>
>>      - requirements: A set of API/extension requirments to be listed in
>> the
>>        .shader_test's [requires] section.
>> +
>> +    - exact: Whether the generated results must be exact (e.g., 0.0 and
>> 1.0
>> +      should always be converted exactly).
>>      """
>>
>>      def __init__(self, name, requirements):
>> @@ -1279,6 +1298,7 @@ class FuncInfo:
>>          self.inout_seq = inout_table[name]
>>          self.num_valid_outputs = len(self.inout_seq[0].valid_outputs)
>>          self.requirements = requirements
>> +        self.exact = not name.endswith("unpackUnorm4x8")
>
>
> I'm not comfortable with only applying this change to unpackUnorm4x8.
> Although we've only seen the failure in unpackUnorm4x8, the reason for that
> is highly specific to Mesa/i965.  Other implementations might conceivably
> have different sources of rounding errors, and IMHO we shouldn't make an
> exception that's so specific to i965.
>
> I believe the only unpack function we should require to be exact is
> unpackHalf2x16.

I committed a patch that allows inexact results like Paul suggested.

For anyone who cares, here's what I wrote earlier about why I don't
think allowing inexact results in other functions is necessary.

"""
The problem that you discovered was that 1.0/255.0 just so happens to
appear between representable float32s. Combining that with recip/mul
instead of floating-point division leads to the problem. I think it's
okay to make unpackUnorm4x8 a special case precisely because 1.0/255.0
is a special case. To be clear: this isn't a case specific to i965. It
actually seems that every Gallium driver also implements
floating-point division as recip/mul. llvmpipe is the only exception,
because SSE's rcp instructions are so imprecise that they don't even
give exact results for rcp(1.0).

Conceivably other implementations could find new ways of being
imprecise, but I don't think that's something we can or should account
for.
"""