[Piglit] [PATCH] CL: Fix check of ULP when probing float/double results

Micah Fedke micah.fedke at collabora.co.uk
Wed Jun 17 09:22:11 PDT 2015


Just throwing a few ideas/warnings your way :)  I ran into these same 
issues when implementing arb_shader_precision for GLSL.

It's not my intention to introduce more questions than answers here. 
This is all simply coming from a fellow dev who's been beaten down by 
float's guerrilla attacks, not a float expert by any means.

First, float is bees, let's get that out of the way.

nextafter does sound like the clearest solution.  I used a 
hand-generated method to convert to the binary representation for my 
purposes, which made the most sense to me:

def _floatToBits(f):
     s = struct.pack('>f', f)
     # capital 'L' is important - sign bit is already built into 754
     return struct.unpack('>L', s)[0]

def _bitsToFloat(b):
     s = struct.pack('>l', b)
     return struct.unpack('>f', s)[0]

However nextafter accomplishes the same thing.

I did find it easiest/clearest to move to the binary representation to 
do all ulp-boundary math and comparisons, and only convert back to float 
at the end.

There are many gotcha's here in addition to the ones you've already 
mentioned.  The first would be "who's ulp?"  I haven't checked the spec 
governing these CL tests in question, but arb_shader_precision never 
specified which ulp it was referring to.  Is it the value of an ulp at 
the expected value?  Or the value of an ulp at one of the inputs?  Or 
the largest of all those?  Also, it's very helpful when the spec 
specifies a range of values for which the operation can be held to the 
specified number of ulps.

Secondly the rounding issue you raised is valid and difficult to get 
right.  Operations submitted to the CPU will use the rounding mode it is 
set up for.  I believe the default on intel chips is round-to-nearest, 
ties-to-even, which seems to be the case for most GPUs as well, but I'm 
not an expert on this subject by any means.  Rounding doesn't just 
happen at the end, either - intermediate values in complex equations 
will be rounded as well - these rounding errors could multiply, or 
cancel, depending on the inputs.  The other thing to consider is if the 
equation that produced the expected value used python's round(), and it 
was run with Python2, then round() would use round-away-from-zero; 
python3's round uses round-to-nearest, ties-to-even.

Regarding the generation of expected values, it's important to be very 
cautious about equations that stack multiple floating point operations, 
as this is where errors have the potential to get out of hand quickly. 
The most difficult issue here, in my mind, is fma.  Unless you're 
testing 64-bit floating point on the GPU, it will be doing its 
calculations in 32-bit space, however, if the GPU compiler employs fma, 
this can cause intermediate operations to be done at higher precisions. 
  This means that the GPU may be jumping between low-precision and 
high-precision operations over the course of a single equation, where 
python may well be doing all operations at a uniform level of precision. 
  This can result in differences in expected results that are very hard 
to track down.  Sadly, it's nearly impossible to predict how a compiler 
might employ fma for any give equation.  You can use python's decimal, 
bigfloat or numpy (ugh) to clamp down the precision at which the 
operation is evaluated, but again, it will be difficult-to-impossible to 
predict if and where the GPU compiler employed fma.

Ultimately, you're attempting to get python to run an equation *in 
exactly the same way* that the shader code does, in order to produce an 
identical result.  This is a difficult thing.  Your ulps tolerance will 
likely, in some way, be accounting for these differences between the CPU 
and GPU generated results, in addition to its true purpose which is to 
identify real issues in the GPU compiler or hardware that introduce 
unintended floating point errors.

You will, ultimately, have to run all the test vectors through your 
chosen solution and carefully inspect any differences in the results. 
You may discover that certain test vectors are simply not appropriate.

In the end, good floating point tests use very carefully selected 
inputs.  Wisdom or deception?  Perhaps a bit of both.



On 06/15/2015 09:29 AM, Jan Vesely wrote:
>
>
> On Sun, Jun 14, 2015 at 7:48 PM, Aaron Watry <awatry at gmail.com
> <mailto:awatry at gmail.com>> wrote:
>
>
>
>     On Sun, Jun 14, 2015 at 4:19 PM, Jan Vesely <jan.vesely at rutgers.edu
>     <mailto:jan.vesely at rutgers.edu>> wrote:
>
>         On Sat, 2015-06-13 at 21:22 -0500, Aaron Watry wrote:
>         > Meh, this still feels broken.  Give me a bit longer.
>
>         and it is :). I don't think this can work based on abs(expected
>         - real),
>         since ULP depends on the magnitude of the numbers. This
>         information is
>         lost after subtraction.
>
>
>     Yeah, that's essentially the conclusion that I came to last night as
>     well.  Currently, we're saying that 3 ULP is 3x the smallest
>     representable floating point number, not 3x the interval between
>     adjacent float values for the given magnitude.
>
>     I was thinking that we might need to take the expected value, cast
>     to unsigned, add the ulp value, then convert back to float.  From
>     there, find the difference between expected and the expected+ulp,
>     and use that as a tolerance.  We could alternatively call
>     nextafterf/nextafter the required number of times from the expected
>     value in either direction and get a min/max allowed value and then
>     check that the result value is in that range.
>
>
> I considered casting to ints too. However, it runs into problems in some
> corner cases (like neg zero, crossing exponent boundaries, ...),
> nextafter offered an easy way out. although it's still not perfect. When
> the number sits on exponent boundary, going up or down gives different
> steps. one example is 2^24:
> 16777214: 16777214.000000
> 16777215: 16777215.000000
> 16777216: 16777216.000000
> 16777217: 16777216.000000
> 16777218: 16777218.000000
> 16777219: 16777220.000000
>
> we could take something like unit = nextafterf(expected) - expected;
> and then compare against ulp * unit, that way we can even subtract 0.5
> unit for python/piglit rounding.
> Although nextafterf(expected) - expected hits representability issues
> for very large numbers.
>
> however the problem whether to take nextaftef(x, 0.0f) or nextafterf(x,
> +/- inf) persits.
> I'm not sure which one we want. if python always rounded to/away from zero,
> we could always take the opposite, but it would cause problems for
> non-integer ulp (which we don't support at the moment anyway).
>
>
> Jan
>
>
>     Yes, we still run into cases where the tests themselves are expected
>     an incorrectly rounded value, but at least the ULP checking code
>     might be functioning correctly then.
>
>     Thoughts?
>
>     --Aaron
>
>
>         I had an idea some time back to implement this using nextafterf
>         in both
>         directions and checking whether the result falls in that interval.
>         However, there is still a problem. Some of the expected values are
>         already rounded (I'm not sure what rounding mode is used by
>         python by
>         default), but unless it always rounds in one direction, we'll
>         still get
>         slight differences based on whether the actual value was rounded
>         up or
>         down.
>
>         I have attached a small test program that shows the deficiencies of
>         fabsf based approaches.
>
>         regards,
>         Jan
>
>          >
>          > --Aaron
>          >
>          > On Sat, Jun 13, 2015 at 2:28 PM, Aaron Watry
>         <awatry at gmail.com <mailto:awatry at gmail.com>> wrote:
>          >
>          > > We need to actually check against the float value from the
>         union,
>          > > instead of just doing (diff > ulp), which seems to cast the
>         diff to
>          > > an int before checking against ulp.
>          > >
>          > > Signed-off-by: Aaron Watry <awatry at gmail.com
>         <mailto:awatry at gmail.com>>
>          > > CC: Tom Stellard <thomas.stellard at amd.com
>         <mailto:thomas.stellard at amd.com>>
>          > > CC: Jan Vesely <jan.vesely at rutgers.edu
>         <mailto:jan.vesely at rutgers.edu>>
>          > > ---
>          > >  tests/util/piglit-util-cl.c | 28 ++++++++++++++--------------
>          > >  1 file changed, 14 insertions(+), 14 deletions(-)
>          > >
>          > > diff --git a/tests/util/piglit-util-cl.c
>         b/tests/util/piglit-util-cl.c
>          > > index 47e0c7a..6cdd718 100644
>          > > --- a/tests/util/piglit-util-cl.c
>          > > +++ b/tests/util/piglit-util-cl.c
>          > > @@ -80,7 +80,7 @@ piglit_cl_probe_floating(float value,
>         float expect,
>          > > uint32_t ulp)
>          > >
>          > >         diff = fabsf(value - expect);
>          > >
>          > > -       if(diff > ulp || isnan(value)) {
>          > > +       if (diff > t.f || isnan(value)) {
>          > >                 printf("Expecting %f (0x%x) with tolerance
>         %f (%u ulps),
>          > > but got %f (0x%x)\n",
>          > >                        e.f, e.u, t.f, t.u, v.f, v.u);
>          > >                 return false;
>          > > @@ -108,7 +108,7 @@ piglit_cl_probe_double(double value,
>         double expect,
>          > > uint64_t ulp)
>          > >
>          > >         diff = fabsl(value - expect);
>          > >
>          > > -       if(diff > ulp || isnan(value)) {
>          > > +       if (diff > t.f || isnan(value)) {
>          > >                 printf("Expecting %f (0x%lx) with tolerance
>         %f (%lu ulps),
>          > > but got %f (0x%lx)\n",
>          > >                        e.f, e.u, t.f, t.u, v.f, v.u);
>          > >                 return false;
>          > > @@ -162,7 +162,7 @@
>         piglit_cl_get_platform_version(cl_platform_id platform)
>          > >         int scanf_count;
>          > >         int major;
>          > >         int minor;
>          > > -
>          > > +
>          > >         /*
>          > >          * Returned format:
>          > >          *
>          > >
>         OpenCL<space><major_version.minor_version><space><platform-specific
>          > > information>
>          > > @@ -353,7 +353,7 @@ piglit_cl_get_info(void* fn_ptr, void*
>         obj, cl_uint
>          > > param)
>          > >
>          > >         if(errNo == CL_SUCCESS) {
>          > >                 param_ptr = calloc(param_size, sizeof(char));
>          > > -
>          > > +
>          > >                 /* retrieve param */
>          > >                 if(fn_ptr == clGetPlatformInfo) {
>          > >                         errNo =
>         clGetPlatformInfo(*(cl_platform_id*)obj,
>          > > param,
>          > > @@ -463,7 +463,7 @@
>         piglit_cl_get_program_build_info(cl_program program,
>          > > cl_device_id device,
>          > >                 .program = program,
>          > >                 .device = device
>          > >         };
>          > > -
>          > > +
>          > >         return piglit_cl_get_info(clGetProgramBuildInfo,
>         &args, param);
>          > >  }
>          > >
>          > > @@ -479,7 +479,7 @@
>         piglit_cl_get_kernel_work_group_info(cl_kernel kernel,
>          > > cl_device_id device,
>          > >                 .kernel = kernel,
>          > >                 .device = device
>          > >         };
>          > > -
>          > > +
>          > >         return piglit_cl_get_info(clGetKernelWorkGroupInfo,
>         &args, param);
>          > >  }
>          > >
>          > > @@ -620,7 +620,7 @@ piglit_cl_get_device_ids(cl_platform_id
>         platform_id,
>          > > cl_device_type device_type,
>          > >
>           piglit_cl_get_error_name(errNo));
>          > >                                 return 0;
>          > >                         }
>          > > -
>          > > +
>          > >                         /* get device list */
>          > >                         if(device_ids != NULL &&
>         num_device_ids > 0) {
>          > >                                 *device_ids =
>         malloc(num_device_ids *
>          > > sizeof(cl_device_id));
>          > > @@ -761,7 +761,7 @@
>          > >
>         piglit_cl_build_program_with_source_extended(piglit_cl_context
>         context,
>          > >                         piglit_cl_get_error_name(errNo));
>          > >                 return NULL;
>          > >         }
>          > > -
>          > > +
>          > >         errNo = clBuildProgram(program,
>          > >                                context->num_devices,
>          > >                                context->device_ids,
>          > > @@ -788,7 +788,7 @@
>          > >
>         piglit_cl_build_program_with_source_extended(piglit_cl_context
>         context,
>          > >                         char* log =
>          > > piglit_cl_get_program_build_info(program,
>          > >
>          > >  context->device_ids[i],
>          > >
>          > >  CL_PROGRAM_BUILD_LOG);
>          > > -
>          > > +
>          > >                         printf("Build log for device %s:\n
>         -------- \n%s\n
>          > > -------- \n",
>          > >                                device_name,
>          > >                                log);
>          > > @@ -848,11 +848,11 @@
>          > >
>         piglit_cl_build_program_with_binary_extended(piglit_cl_context
>         context,
>          > >                 for(i = 0; i < context->num_devices; i++) {
>          > >                         char* device_name =
>          > > piglit_cl_get_device_info(context->device_ids[i],
>          > >
>          > > CL_DEVICE_NAME);
>          > > -
>          > > +
>          > >                         printf("Error for %s: %s\n",
>          > >                                device_name,
>          > >
>         piglit_cl_get_error_name(binary_status[i]));
>          > > -
>          > > +
>          > >                         free(device_name);
>          > >                 }
>          > >
>          > > @@ -860,7 +860,7 @@
>          > >
>         piglit_cl_build_program_with_binary_extended(piglit_cl_context
>         context,
>          > >                 return NULL;
>          > >         }
>          > >         free(binary_status);
>          > > -
>          > > +
>          > >         errNo = clBuildProgram(program,
>          > >                                context->num_devices,
>          > >                                context->device_ids,
>          > > @@ -884,11 +884,11 @@
>          > >
>         piglit_cl_build_program_with_binary_extended(piglit_cl_context
>         context,
>          > >                         char* log =
>          > > piglit_cl_get_program_build_info(program,
>          > >
>          > >  context->device_ids[i],
>          > >
>          > >  CL_PROGRAM_BUILD_LOG);
>          > > -
>          > > +
>          > >                         printf("Build log for device %s:\n
>         -------- \n%s\n
>          > > -------- \n",
>          > >                                device_name,
>          > >                                log);
>          > > -
>          > > +
>          > >                         free(device_name);
>          > >                         free(log);
>          > >                 }
>          > > --
>          > > 2.1.4
>          > >
>          > >
>
>
>         --
>         Jan Vesely <jan.vesely at rutgers.edu <mailto:jan.vesely at rutgers.edu>>
>
>
>
>
>
> _______________________________________________
> Piglit mailing list
> Piglit at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/piglit
>

-- 

Micah Fedke
Collabora Ltd.
+44 1223 362967
https://www.collabora.com/
https://twitter.com/collaboraltd


More information about the Piglit mailing list