[Piglit] [PATCH] glsl-es-3.00: Generate tests for builtin packing functions (v2)
Chad Versace
chad.versace at linux.intel.com
Tue Jan 22 14:23:08 PST 2013
On 01/22/2013 10:24 AM, Paul Berry wrote:
> On 21 January 2013 15:24, Chad Versace <chad.versace at linux.intel.com> wrote:
>
>> Generate the following test files:
>> {const,vs,fs}-{pack,unpack}{Snorm,Unorm,Half}2x16.shader_test
>>
>> The tests are generated by a new Python script,
>> gen_builtin_packing_tests.py, and placed into directory
>> spec/glsl-es-3.00/execution/built-in-functions.
>>
>> v2: Add reduced_input_table. This allows us to generate a smaller set of
>> inputs for packHalf2x16 in order to avoid Linux's oom-killer.
>>
>
> The change looks good, thanks.
>
> I've included a couple of minor comments below, including some other ideas
> for speeding up performance if it's still slow, but I don't consider any of
> them blocking issues, so consider this patch:
>
> Reviewed-by: Paul Berry <stereotype441 at gmail.com>
>
>
>>
>> CC: Paul Berry <stereotype441 at gmail.com>
>> Signed-off-by: Chad Versace <chad.versace at linux.intel.com>
>> ---
>> generated_tests/CMakeLists.txt | 6 +-
>> generated_tests/gen_builtin_packing_tests.py | 1230
>> ++++++++++++++++++++++++++
>> 2 files changed, 1235 insertions(+), 1 deletion(-)
>> create mode 100644 generated_tests/gen_builtin_packing_tests.py
>> +# Test evaluation of constant pack2x16 expressions.
>> +const_pack_2x16_template = Template(dedent("""\
>> + [require]
>> + GL ES >= 3.0
>> + GLSL ES >= 3.00
>> +
>> + [vertex shader]
>> + const vec4 red = vec4(1, 0, 0, 1);
>> + const vec4 green = vec4(0, 1, 0, 1);
>> +
>> + in vec4 vertex;
>> + out vec4 vert_color;
>> +
>> + void main()
>> + {
>> + ${func.result_precision} uint actual;
>> +
>> + gl_Position = vertex;
>> +
>> + % for io in func.inout_seq:
>> + actual = ${func.name}(vec2(${io.input[0]}, ${io.input[1]}));
>> +
>> + if (true
>> + % for u in io.valid_outputs:
>>
>
> Not a huge deal, but you might get better performance out of the shader
> compiler if you change this to "% for u in sorted(set(io.valid_outputs))".
> (This drops duplicate entries from io.valid_outputs, which happen a lot
> because frequently the different rounding/flushing options don't affect the
> result.)
Execution time is troublesome on Sandybridge. So I'll use your suggestion.
>> + && actual != ${u}
>> + % endfor
>> + ) {
>> + vert_color = red;
>> + return;
>> + }
>> +
>> + % endfor
>> +
>> + vert_color = green;
>> + }
>>
>
> Another possible performance improvement would be to change the logic to:
>
> vert_color = green;
> % for io in func.inout_seq:
> actual = ...
> if (...) {
> vert_color = red;
> }
> % endfor
>
> In orther words, instead of using a "return" statement to shortcut the
> shader once a failure is found, set vert_color to green initially, and then
> only change it to red if a failure is found. The reason this might improve
> performance is because on architectures with limited JMP support, the
> version with the embedded return statements has to get rewritten to a very
> deeply-nested if tree.
Good idea. I'll do that.
>> +class FuncOpts:
>> + """Options that modify the evaluation of the GLSL pack/unpack
>> functions.
>> +
>> + Given an input and a pack/unpack function, there exist multiple valid
>> + outputs because the GLSL specs permit variation in the implementation
>> of
>> + the function. The actual output is dependent on the GLSL compiler's
>> and
>> + hardware's choice of rounding mode (for example, to even or to
>> nearest)
>> + and handling of subnormal (also called denormalized) floating point
>> + numbers.
>> +
>> + This class attempts to capture such permitted variations in
>> implementation
>> + behavior. To select a particular behavior, pass the appropriate enums
>> to
>> + the constructor.
>> +
>> + What follows is an explanation of the variations and how to select
>> them.
>> +
>> + Flushing subnormal floats to zero
>> + ---------------------------------
>> + The GLSL ES 3.00 and GLSL 4.10 specs allows implementations to
>> truncate
>> + subnormal floats to zero. From section 4.5.1 "Range and Precision" of
>> the
>> + two specs:
>> + Any subnormal (denormalized) value input into a shader or
>> + potentially generated by any operation in a shader can be
>> + flushed to 0.
>> +
>> + The constructor parameter 'flush_mode' selects the flushing behavior.
>> + Valid values are:
>> + - FLUSH_NONE: Flush no float to zero.
>> + - FLUSH_FLOAT32: Flush subnormal float32 values to zero.
>>
>
> Thinking about this some more, I have to admit some surprise that it's
> necessary to account for flushing of subnormal float32 values in this test
> generator. It seems to me that since all subnormal float32's are
> extraordinarily tiny (-2^-126 < x < 2^-126), they will always behave
> equivalently to zero considering the coarseness of the packing functions.
> Are you aware of a counterexample?
Why didn't I see that? I'll remove the float32 flushing before committing.
>> + else:
>> + # f32 lies in the range [max_normal16 + max_step16, inf), which is
>> + # outside the range of finite float16 values. The resultant
>> float16 is
>> + # infinite.
>>
>
> You might want to define max_step16 above.
A little mistake. I'll fix it.
>> +def make_inputs_for_pack_snorm_2x16():
>> + # The domain of packSnorm2x16 is [-inf, +inf]^2. The function clamps
>> + # its input into the range [-1, +1]^2.
>> + #
>> + # We test -0.0 in order to stress the implementation's handling of
>> zero.
>> + # The implementation should return a uint16 that encodes -0.0; that
>> is,
>> + # a uint16 # with the sign bit set.
>>
>
> Is there a reasoning error in this comment? uint16's are unsigned
> integers, so they don't have a sign bit and can't encode -0.0.
>
> I'm all for testing -0.0 as an input, but I think the only place we need to
> be careful to make sure that a -0.0 input yields a negative result is in
> packHalf2x16 and unpackHalf2x16.
>
> Having said that, I think the code is correct.
Right. Sometimes I'm surprised that I can write correct code with such
stupid comments.
>> +def make_inputs_for_pack_half_2x16():
>> + # The domain of packHalf2x16 is ([-inf, +inf] + {NaN})^2. The function
>> + # does not clamp its input.
>> + #
>> + # We test both -0.0 and +0.0 in order to stress the implementation's
>> + # handling of zero.
>> +
>> + subnormal_min = 2.0**(-14) * (1.0 / 2.0**10)
>> + subnormal_max = 2.0**(-24) * (1023.0 / 2.0**10)
>>
>
> Whoa, shouldn't that -24 be a -14? Fortunately it doesn't look like
> subnormal_max is used below.
Oops.
>> +def make_inputs_for_unpack_half_2x16():
>> + # For each of the two classes of float16 values, subnormal and
>> normalized,
>> + # below are listed the exponent and mantissa of the class's boundary
>> + # values some values slightly inside the bounds.
>>
>
> Missing the word "and" here?
Will fix.
More information about the Piglit
mailing list