[Piglit] [PATCH v3 2/3] arb_shader_precision: add framework for calculating tolerances for complex functions

Micah Fedke micah.fedke at collabora.co.uk
Fri Mar 6 13:50:45 PST 2015


So use the "max/min of all permutations" method for all ops?  e.g.:

     def __mul__(self, other):
         # evaluate all four endpoint pairings; with signed endpoints
         # the extreme products can come from any of them
         a = self.high * other.high
         b = self.high * other.low
         c = self.low * other.high
         d = self.low * other.low
         high = numpy.float32(numpy.amax([a, b, c, d]))
         low = numpy.float32(numpy.amin([a, b, c, d]))
         return ValueInterval(high, low)
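
A quick worked example with made-up endpoints shows why no single
pairing suffices: the extreme products of [-2, 3] and [-1, 4] come from
different corners depending on sign.

     >>> lo, hi = numpy.float32(-2.0), numpy.float32(3.0)
     >>> olo, ohi = numpy.float32(-1.0), numpy.float32(4.0)
     >>> [hi * ohi, hi * olo, lo * ohi, lo * olo]
     [12.0, -3.0, -8.0, 2.0]

The minimum (-8) comes from low*high and the maximum (12) from
high*high, so neither endpoint pairing can be picked ahead of time.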

And tack the tolerance on at the end like this, for ops that have one?
Once high and low have been determined, widening each endpoint outward
by the tolerance should move things in the right direction, if I'm not
mistaken.

     def __truediv__(self, other):
         tol = numpy.float32(2.5)
         a = self.high / other.high
         b = self.high / other.low
         c = self.low / other.high
         d = self.low / other.low
         high = numpy.float32(numpy.amax([a, b, c, d]))
         low = numpy.float32(numpy.amin([a, b, c, d]))
         # widen outward by the per-op tolerance only after the true
         # high/low have been found, and return a new interval rather
         # than mutating self (__truediv__ is not an in-place op)
         high += _ulpsize(high) * tol
         low -= _ulpsize(low) * tol
         return ValueInterval(high, low)
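
(_ulpsize isn't defined anywhere in this thread. A minimal stand-in,
assuming it simply means the spacing between adjacent float32 values at
the argument's magnitude, could be built on numpy.spacing:)

     def _ulpsize(f):
         # size of one ULP at f's magnitude: the gap between f and the
         # next representable float32 away from zero
         return numpy.spacing(numpy.abs(numpy.float32(f)))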

As for manual fma's, that should work.  I wonder, though: a
double-rounded manual fma has the potential to produce more error than
a single-rounded one, and the spec allows either method, so don't we
want to bound against whichever option produces more error?
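
Something along these lines is what I have in mind (a sketch only, with
a hypothetical helper name): evaluate both implementations the spec
permits and keep whichever endpoint is worse.

     def _fma_bounds(a, b, c):
         # a, b, c are numpy.float32 scalars
         # double-rounded: round after the multiply, then again after
         # the add
         two_round = numpy.float32(numpy.float32(a * b) + c)
         # single-rounded: compute in float64 and round once at the end,
         # approximating a fused multiply-add
         one_round = numpy.float32(numpy.float64(a) * numpy.float64(b)
                                   + numpy.float64(c))
         return min(two_round, one_round), max(two_round, one_round)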


-mf

On 03/06/2015 03:12 PM, Ilia Mirkin wrote:
> +    def __truediv__(self, other):
> +        tol = numpy.float32(2.5)
> +        a = self.high / (other.high + _ulpsize(other.high) * tol)
> +        b = self.high / (other.low - _ulpsize(other.low) * tol)
> +        c = self.low / (other.high + _ulpsize(other.high) * tol)
> +        d = self.low / (other.low - _ulpsize(other.low) * tol)
>
> That's not how I understand the tolerance... I understand it as a
> "this operation can be off by this much" sort of thing, i.e. it should
> be applied *after* the division. Also, due to signs, you may be moving
> things in the wrong direction. You do this sort of thing in your other
> ops as well. Also, the non-i* operators shouldn't modify self but
> should rather return a new value.
>
> Lastly, I wonder how this can deal with the uncertainty presented by
> a*b+c being implemented as two separate ops vs. a fused op that skips
> the intermediate rounding step. Perhaps it should keep the low/high
> around at 33-bit (or 64-bit, for simplicity) precision and use that in
> adds/multiplies (and a non-add/multiply operation would "flush" it to
> the 32-bit value).
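
[A rough illustration of the flush idea above; the class and method
names are hypothetical, not from the patch. The endpoints live in
float64 so chained adds/multiplies accumulate no float32 rounding, and
any other operation first rounds them back to float32.]

     class WideInterval:
         def __init__(self, low, high):
             # endpoints held at float64 so a*b+c can defer rounding
             self.low = numpy.float64(low)
             self.high = numpy.float64(high)

         def flush(self):
             # round to float32 before any non-add/multiply op consumes
             # the value; a careful version would round low down and
             # high up to keep the interval conservative
             return WideInterval(numpy.float32(self.low),
                                 numpy.float32(self.high))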
>
>
> On Fri, Mar 6, 2015 at 3:49 PM, Micah Fedke <micah.fedke at collabora.co.uk> wrote:
>> Ilia,
>>
>> Here is a POC for your interval arithmetic-like method:
>>
>> http://cgit.collabora.com/git/user/fedke.m/piglit.git/log/?h=complex_tolerances_ia
>>
>> Only normalize() and length() are implemented right now.
>>
>> Is that along the lines of what you were thinking?
>>
>>
>> -mf
>>
>> On 02/28/2015 12:18 PM, Ilia Mirkin wrote:
>>>
>>> BTW, and this is a bigger concern... your approach appears to
>>> basically be along the lines of
>>>
>>> 1. Perform calculation with ~infinite precision
>>> 2. Perform calculation with ... "finite" precision
>>> 3. Compare. If the diff is too big, don't bother; otherwise keep
>>> summing ULPs based on that diff (rather than the allowable error).
>>>
>>> However this doesn't handle compounding that well. If I have an op
>>> that can be off by 1000 ULPs, the computation with low precision won't
>>> capture that. Let's say you define an op like 'fma' as a complex op,
>>> and the mul part of it can be off by 10 ULPs but the 'add' part of it
>>> can't be off by any ULPs. Will the error computed by your "complex"
>>> path provide a tolerance of 10 ULPs in that case?
>>>
>>> I think that the better approach (albeit somewhat complex) would be to
>>> define a class like "ImpreciseValue" which would keep both the "low"
>>> and "high" values allowed (as np.float32/float64's). Then you can
>>> define all operations to compute all 4 possible values, i.e. if it's
>>> multiplication, do
>>>
>>> a.low * b.low
>>> a.high * b.low
>>> a.low * b.high
>>> a.high * b.high
>>>
>>> And take the min/max of those, and if the multiplication operation has
>>> a tolerance of, say, 10 ULPs, spread the min/max apart by that much.
>>> Each operation would return one of these ImpreciseValue things, and
>>> you should be able to determine an accurate tolerance. (I think the
>>> low/high approach works for all the functions we have here... I haven't
>>> proven it out, but it seems like it could be fine, especially over small
>>> ranges, since in practice the highest error is with pow and that's only
>>> 16 ULPs and it's never compounded.) You could even add syntactic sugar
>>> and override things like __add__ and so on so that you can keep using
>>> +, -, etc.
>>>
>>> Thoughts?
>>>
>>>     -ilia
>>>
>>>
>>> On Fri, Feb 27, 2015 at 6:02 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>>
>>>> On Fri, Feb 27, 2015 at 5:36 PM, Micah Fedke
>>>> <micah.fedke at collabora.co.uk> wrote:
>>>>>
>>>>> +def _gen_tolerance(name, rettype, args):
>>>>> +    """Return the tolerance that should be allowed for a function for the
>>>>> +    test vector passed in.  Return -1 for any vectors that would push the
>>>>> +    tolerance outside of acceptable bounds
>>>>
>>>>
>>>> It seems to alternately return scalars or lists. Why?
>>>>
>>>>> +    """
>>>>> +    if name in simple_fns:
>>>>> +        return simple_fns[name]
>>>>> +    elif name in complex_fns:
>>>>> +        if name in componentwise:
>>>>> +            # for componentwise functions, call the reference function
>>>>> +            # for each component (scalar components multiplex here)
>>>>> +            # eg. for fn(float a, vec2 b, vec2 c) call:
>>>>> +            # _fn_ref(a, b[0], c[0])
>>>>> +            # _fn_ref(a, b[1], c[1])
>>>>> +            # and then capture the results in a list
>>>>> +            errors = []
>>>>> +            badlands = []
>>>>> +            component_tolerances = []
>>>>> +            for component in range(rettype.num_cols * rettype.num_rows):
>>>>> +                current_args = []
>>>>> +                for arg in args:
>>>>> +                    # sanitize args so that all analyze operations can be performed on a list
>>>>> +                    sanitized_arg = _listify(arg)
>>>>> +                    current_args.append(sanitized_arg[component % len(sanitized_arg)])
>>>>
>>>>
>>>> This is confusing. Does this ever happen on matrix "outputs"? When
>>>> would component be > len(sanitized_arg)? And when it is, why does
>>>> modding it by the length again make sense?
>>>>
>>>>> +                cur_errors, cur_badlands, cur_component_tolerances = _analyze_ref_fn(complex_fns[name], current_args)
>>>>> +                errors.extend(cur_errors)
>>>>> +                badlands.extend(cur_badlands)
>>>>> +                component_tolerances.extend(cur_component_tolerances)
>>>>> +        else:
>>>>> +            # for non-componentwise functions, all args must be passed in in a single run
>>>>> +            # sanitize args so that all analyze operations can be performed on a list
>>>>
>>>>
>>>> Why did you choose to split these up? Why not just make the various
>>>> functions just always take a list and operate on the whole list? It
>>>> seems like that would save a lot of heartache and confusion...
>>>>
>>>>> +            current_args = map(_listify, args)
>>>>> +            errors, badlands, component_tolerances = _analyze_ref_fn(complex_fns[name], current_args)
>>>>> +        # results are in a list, and may have one or more members
>>>>> +        return -1.0 if True in badlands else map(float, component_tolerances)
>>>>
>>>>
>>>> If badlands is just a true/false (as it seems to be), it makes sense
>>>> not to keep it as a list and just have it as a plain bool, right?
>>>>
>>>> Also the
>>>>
>>>> a if b else c
>>>>
>>>> syntax is generally meant for "a is the super-common case, but in
>>>> these awkward situations, it may be c". Or "I _really_ _really_ want a
>>>> single expression here". It doesn't seem like it'd reduce readability
>>>> to just do the more common
>>>>
>>>> if True in badlands:
>>>>     return -1
>>>> else:
>>>>     return map(float, component_tolerances)
>>>>
>>>> Which also makes it more obvious that something funky is going on
>>>> since one path returns a scalar while the other returns a list.
>>>>
>>>>> +    elif name in mult_fns:
>>>>> +        x_type = glsl_type_of(args[0])
>>>>> +        y_type = glsl_type_of(args[1])
>>>>> +        # if matrix types are involved, the multiplication will be matrix multiplication
>>>>> +        # and we will have to pass the args through a reference function
>>>>> +        if x_type.is_vector and y_type.is_matrix:
>>>>> +            mult_func = _vec_times_mat_ref
>>>>> +        elif x_type.is_matrix and y_type.is_vector:
>>>>> +            mult_func = _mat_times_vec_ref
>>>>> +        elif x_type.is_matrix and y_type.is_matrix:
>>>>> +            mult_func = _mat_times_mat_ref
>>>>> +        else:
>>>>> +            # float*float, float*vec, vec*float or vec*vec are all done as
>>>>> +            # normal multiplication and share a single tolerance
>>>>> +            return mult_fns[name]
>>>>> +        # sanitize args so that all analyze operations can be performed on a list
>>>>> +        errors, badlands, component_tolerances = _analyze_ref_fn(mult_func, _listify(args))
>>>>> +        # results are in a list, and may have one or more members
>>>>> +        return -1.0 if True in badlands else map(float, component_tolerances)
>>>>> +    else:
>>>>> +        # default for any operations without specified tolerances
>>>>> +        return 0.0
>>>>
>>>>
>>>> We've had a few rounds of these reviews, and I'm still generally
>>>> confused by the specifics of how this is done. I normally don't have
>>>> this much trouble reading code... not sure what's going on. Feel free
>>>> to find someone else to review if you're getting frustrated by my
>>>> apparently irreparable state of confusion.
>>>>
>>>> Cheers,
>>>>
>>>>     -ilia
>>
>>
>> --
>>
>> Micah Fedke
>> Collabora Ltd.
>> +44 1223 362967
>> https://www.collabora.com/
>> https://twitter.com/collaboraltd

-- 

Micah Fedke
Collabora Ltd.
+44 1223 362967
https://www.collabora.com/
https://twitter.com/collaboraltd

