[Piglit] [PATCH v3 2/3] arb_shader_precision: add framework for calculating tolerances for complex functions
Ilia Mirkin
imirkin at alum.mit.edu
Fri Mar 6 13:12:23 PST 2015
+    def __truediv__(self, other):
+        tol = numpy.float32(2.5)
+        a = self.high / (other.high + _ulpsize(other.high) * tol)
+        b = self.high / (other.low - _ulpsize(other.low) * tol)
+        c = self.low / (other.high + _ulpsize(other.high) * tol)
+        d = self.low / (other.low - _ulpsize(other.low) * tol)
That's not how I understand the tolerance... I understand it as a
"this operation can be off by this much" sort of thing, i.e. it should
be applied *after* the division. Also, due to signs, you may be moving
things in the wrong direction. You do this sort of thing in your other
ops as well. And the non-i* operators (__truediv__, as opposed to
__itruediv__) shouldn't modify self but rather return a new value.
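To make the point concrete, here is a minimal sketch of applying the
tolerance after the division rather than before (names like Interval and
the tol value are illustrative, not the patch's actual code; it also
assumes the divisor interval does not contain zero):

```python
import numpy as np

def _ulpsize(f):
    """Distance from f to the next representable float32 above it."""
    f = np.float32(f)
    return np.nextafter(f, np.float32(np.inf)) - f

class Interval:
    """Hypothetical low/high pair for interval arithmetic."""
    def __init__(self, low, high):
        self.low = np.float32(low)
        self.high = np.float32(high)

    def __truediv__(self, other):
        # assumption: other.low/other.high do not straddle zero
        tol = np.float32(2.5)
        # compute all four corner quotients first...
        corners = [self.low / other.low, self.low / other.high,
                   self.high / other.low, self.high / other.high]
        lo, hi = min(corners), max(corners)
        # ...then widen the *result* outward by the tolerance, so the
        # sign of the operands cannot move the bounds the wrong way.
        # A new Interval is returned; self is left untouched.
        return Interval(lo - _ulpsize(lo) * tol, hi + _ulpsize(hi) * tol)
```

With exact inputs the widened result strictly contains the true
quotient range, e.g. [1, 2] / [2, 4] contains [0.25, 1.0].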
Lastly, I wonder how this can deal with the uncertainty presented by
a*b+c being implemented as 2 separate ops vs. a single fused op that
skips the intermediate rounding step. Perhaps it should keep the
low/high around at 33-bit (or 64-bit, for simplicity) precision and use
that in adds/multiplies (and a non-add/multiply operation would "flush"
it to the 32-bit value)
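A rough sketch of that idea (the ImpreciseValue name comes from the
proposal in the quoted mail below; the flush() method and everything
else here is an illustrative assumption): track low/high as float64 so
that both the fused and two-op versions of a*b+c land inside the
interval, and round outward to float32 only when a flush is needed:

```python
import numpy as np

class ImpreciseValue:
    """Hypothetical low/high range, kept at float64 precision."""
    def __init__(self, low, high):
        self.low = np.float64(low)
        self.high = np.float64(high)

    def __mul__(self, other):
        # take the min/max over all four corner products
        corners = [self.low * other.low, self.low * other.high,
                   self.high * other.low, self.high * other.high]
        return ImpreciseValue(min(corners), max(corners))

    def __add__(self, other):
        return ImpreciseValue(self.low + other.low, self.high + other.high)

    def flush(self):
        # Round outward to float32: low end down, high end up, so the
        # 32-bit interval still contains the 64-bit one.
        lo = np.float32(self.low)
        hi = np.float32(self.high)
        if np.float64(lo) > self.low:
            lo = np.nextafter(lo, np.float32(-np.inf))
        if np.float64(hi) < self.high:
            hi = np.nextafter(hi, np.float32(np.inf))
        return ImpreciseValue(lo, hi)
```

Because adds and multiplies stay at float64, an intermediate rounding
step that a fused implementation would skip never narrows the interval
prematurely; the outward rounding happens once, at the flush.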
On Fri, Mar 6, 2015 at 3:49 PM, Micah Fedke <micah.fedke at collabora.co.uk> wrote:
> Ilia,
>
> Here is a POC for your interval arithmetic-like method:
>
> http://cgit.collabora.com/git/user/fedke.m/piglit.git/log/?h=complex_tolerances_ia
>
> Only normalize() and length() are implemented right now.
>
> Is that along the lines of what you were thinking?
>
>
> -mf
>
> On 02/28/2015 12:18 PM, Ilia Mirkin wrote:
>>
>> BTW, and this is a bigger concern... your approach appears to
>> basically be along the lines of
>>
>> 1. Perform calculation with ~infinite precision
>> 2. Perform calculation with ... "finite" precision
>> 3. Compare. If the diff is too big, don't bother; otherwise keep summing
>> ULPs based on that diff (rather than the allowable error).
>>
>> However this doesn't handle compounding that well. If I have an op
>> that can be off by 1000 ULPs, the computation with low precision won't
>> capture that. Let's say you define an op like 'fma' as a complex op,
>> and the mul part of it can be off by 10 ULPs but the 'add' part of it
>> can't be off by any ULPs. Will the error computed by your "complex"
>> path provide a tolerance of 10 ULPs in that case?
>>
>> I think that the better approach (albeit somewhat complex) would be to
>> define a class like "ImpreciseValue" which would keep both the "low"
>> and "high" values allowed (as np.float32/float64's). Then you can
>> define all operations to compute all 4 possible values, i.e. if it's
>> multiplication, do
>>
>> a.low * b.low
>> a.high * b.low
>> a.low * b.high
>> a.high * b.high
>>
>> And take the min/max of those, and if the multiplication operation has
>> a tolerance of, say, 10 ULPs, spread the min/max apart by that much.
>> Each operation would return one of these ImpreciseValue things, and
>> you should be able to determine an accurate tolerance. (I think the
>> low/high approach works for all the functions we have here... haven't
>> proven it out, but seems like it could be fine, esp over small ranges,
>> since in practice the highest error is with pow and that's only 16
>> ULPs and it's never compounded.) You could even add syntactic sugar
>> and override things like __add__ and so on so that you can keep using
>> +, -, etc.
>>
>> Thoughts?
>>
>> -ilia
>>
>>
>> On Fri, Feb 27, 2015 at 6:02 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>
>>> On Fri, Feb 27, 2015 at 5:36 PM, Micah Fedke
>>> <micah.fedke at collabora.co.uk> wrote:
>>>>
>>>> +def _gen_tolerance(name, rettype, args):
>>>> +    """Return the tolerance that should be allowed for a function for the
>>>> +    test vector passed in. Return -1 for any vectors that would push the
>>>> +    tolerance outside of acceptable bounds
>>>
>>>
>>> It seems to return either scalars or lists, depending on the path. Why?
>>>
>>>> +    """
>>>> +    if name in simple_fns:
>>>> +        return simple_fns[name]
>>>> +    elif name in complex_fns:
>>>> +        if name in componentwise:
>>>> +            # for componentwise functions, call the reference function
>>>> +            # for each component (scalar components multiplex here)
>>>> +            # eg. for fn(float a, vec2 b, vec2 c) call:
>>>> +            #   _fn_ref(a, b[0], c[0])
>>>> +            #   _fn_ref(a, b[1], c[1])
>>>> +            # and then capture the results in a list
>>>> +            errors = []
>>>> +            badlands = []
>>>> +            component_tolerances = []
>>>> +            for component in range(rettype.num_cols * rettype.num_rows):
>>>> +                current_args = []
>>>> +                for arg in args:
>>>> +                    # sanitize args so that all analyze operations can be performed on a list
>>>> +                    sanitized_arg = _listify(arg)
>>>> +                    current_args.append(sanitized_arg[component % len(sanitized_arg)])
>>>
>>>
>>> This is confusing. Does this ever happen on matrix "outputs"? When
>>> would component be > len(sanitized_arg)? And when it is, why does
>>> modding it by the length again make sense?
>>>
>>>> +                cur_errors, cur_badlands, cur_component_tolerances = _analyze_ref_fn(complex_fns[name], current_args)
>>>> +                errors.extend(cur_errors)
>>>> +                badlands.extend(cur_badlands)
>>>> +                component_tolerances.extend(cur_component_tolerances)
>>>> +        else:
>>>> +            # for non-componentwise functions, all args must be passed in in a single run
>>>> +            # sanitize args so that all analyze operations can be performed on a list
>>>
>>>
>>> Why did you choose to split these up? Why not just make the various
>>> functions just always take a list and operate on the whole list? It
>>> seems like that would save a lot of heartache and confusion...
>>>
>>>> +            current_args = map(_listify, args)
>>>> +            errors, badlands, component_tolerances = _analyze_ref_fn(complex_fns[name], current_args)
>>>> +        # results are in a list, and may have one or more members
>>>> +        return -1.0 if True in badlands else map(float, component_tolerances)
>>>
>>>
>>> If badlands is just a true/false (as it seems to be), it makes sense
>>> not to keep it as a list but just have it as a plain bool, right?
>>>
>>> Also the
>>>
>>> a if b else c
>>>
>>> syntax is generally meant for "a is the super-common case, but in
>>> these awkward situations, it may be c". Or "I _really_ _really_ want a
>>> single expression here". It doesn't seem like it'd reduce readability
>>> to just do the more common
>>>
>>> if True in badlands:
>>>     return -1
>>> else:
>>>     return map(float, component_tolerances)
>>>
>>> Which also makes it more obvious that something funky is going on
>>> since one path returns a scalar while the other returns a list.
>>>
>>>> +    elif name in mult_fns:
>>>> +        x_type = glsl_type_of(args[0])
>>>> +        y_type = glsl_type_of(args[1])
>>>> +        # if matrix types are involved, the multiplication will be matrix
>>>> +        # multiplication and we will have to pass the args through a
>>>> +        # reference function
>>>> +        if x_type.is_vector and y_type.is_matrix:
>>>> +            mult_func = _vec_times_mat_ref
>>>> +        elif x_type.is_matrix and y_type.is_vector:
>>>> +            mult_func = _mat_times_vec_ref
>>>> +        elif x_type.is_matrix and y_type.is_matrix:
>>>> +            mult_func = _mat_times_mat_ref
>>>> +        else:
>>>> +            # float*float, float*vec, vec*float or vec*vec are all done as
>>>> +            # normal multiplication and share a single tolerance
>>>> +            return mult_fns[name]
>>>> +        # sanitize args so that all analyze operations can be performed on a list
>>>> +        errors, badlands, component_tolerances = _analyze_ref_fn(mult_func, _listify(args))
>>>> +        # results are in a list, and may have one or more members
>>>> +        return -1.0 if True in badlands else map(float, component_tolerances)
>>>> +    else:
>>>> +        # default for any operations without specified tolerances
>>>> +        return 0.0
>>>
>>>
>>> We've had a few rounds of these reviews, and I'm still generally
>>> confused by the specifics of how this is done. I normally don't have
>>> this much trouble reading code... not sure what's going on. Feel free
>>> to find someone else to review if you're getting frustrated by my
>>> apparently irreparable state of confusion.
>>>
>>> Cheers,
>>>
>>> -ilia
>
>
> --
>
> Micah Fedke
> Collabora Ltd.
> +44 1223 362967
> https://www.collabora.com/
> https://twitter.com/collaboraltd