<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Dec 29, 2015 at 2:28 PM, Ilia Mirkin <span dir="ltr"><<a href="mailto:imirkin@alum.mit.edu" target="_blank">imirkin@alum.mit.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="">On Tue, Dec 29, 2015 at 3:18 PM, Aaron Watry <<a href="mailto:awatry@gmail.com">awatry@gmail.com</a>> wrote:<br> > Regardless of whether 0.0 is the absolutely correct answer for<br> > cos(1.57079632679), we can hopefully all agree that -4.0 is NOT a valid<br> > answer for cosine of anything<br> <br> </span>Right so that's clearly wrong :) I was largely warning you about some<br> of the issues we ran into trying to compute ULPs, and especially<br> combining them with each other. The GL_ARB_shader_precision text is<br> quite precise and similarly difficult to test -- e.g. fma() is allowed<br> to be fused or non-fused, at the implementation's option. But that<br> causes all sorts of results to be way different.<br> <br> Perhaps you don't have these issues in the OpenCL specs.<br> <span class=""><font color="#888888"><br></font></span></blockquote><div><br></div><div>Answers, hopefully?<br></div><div><br><div>With regard to FMA(a,b,c):<br></div><div>CL 1.2, Section 6.12.2 says:<br></div><div>"Returns the correctly rounded floating-point<br>representation of the sum of c with the infinitely<br>precise product of a and b. Rounding of<br>intermediate products shall not occur. 
Edge case<br>behavior is per the IEEE 754-2008 standard."<br></div><br></div><div>Rounding mode is also called out in section 6.12.2:<br>"The built-in math functions are not affected by the prevailing rounding mode in the calling<br>environment, and always return the same value as they would if called with the round to nearest<br>even rounding mode."<br></div><div><br></div><div>CL 1.2, section 7.4:<br>"In this section we discuss the maximum relative error defined as ulp (units in the last place).<br>Addition, subtraction, multiplication, fused multiply-add and conversion between integer and a<br>single precision floating-point format are IEEE 754 compliant and are therefore correctly<br>rounded. Conversion between floating-point formats and explicit conversions specified in<br>section 6.2.3 must be correctly rounded."<br></div><div>&lt;SNIP&gt;<br>"The reference value used to compute the ULP value of an<br>arithmetic operation is the infinitely precise result."<br><br></div><div>So, we will still have to deal with the discrepancy between the infinitely precise result and the rounded "expected" result when calculating our ULP-based tolerances, but as already covered in the previous email, the absolute tolerance was definitely wrong.<br></div><div><br></div><div></div><div>CL has a set of native_* functions that allows the implementation to choose how to calculate the result.<br>Example: native_cos(x), which has implementation-defined precision, but allows the vendor to optimize as they see fit while possibly sacrificing accuracy. 
<br><br>Most of the time, I've seen the native_* functions used to call the hardware implementation of those instructions, while the non-native ones are implemented in CL C to achieve the required precision.<br><br></div><div>--Aaron<br></div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class=""><font color="#888888"> -ilia<br> </font></span></blockquote></div><br></div></div>
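<div><br>A rough illustration of the fused versus non-fused fma() point above, and of measuring ULP error against the infinitely precise result as section 7.4 describes. This is only a sketch in Python, not anything taken from the CL spec or an existing test suite. The helper names (to_f32, f32_neighbours, round_exact_to_f32, ulp_error) are made up for this example, and it assumes single-precision results that are finite, nonzero where noted, and well inside the normal range.<br></div>
<pre>
```python
import math
import struct
from fractions import Fraction


def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("=f", struct.pack("=f", x))[0]


def f32_neighbours(v):
    """Return v plus the float32 values one bit pattern either side of it.

    Assumption: v is a finite, nonzero float32 away from overflow.
    """
    bits = struct.unpack("=I", struct.pack("=f", v))[0]
    return [struct.unpack("=f", struct.pack("=I", b))[0]
            for b in (bits - 1, bits, bits + 1)]


def round_exact_to_f32(exact):
    """Round an exact Fraction to float32 (nearest, ties to even mantissa)."""
    if exact == 0:
        return 0.0
    guess = to_f32(float(exact))

    def key(v):
        bits = struct.unpack("=I", struct.pack("=f", v))[0]
        # Smallest distance wins; on a tie prefer the even bit pattern.
        return (abs(Fraction(v) - exact), bits % 2)

    return min(f32_neighbours(guess), key=key)


def ulp_error(result, exact):
    """Error of result in float32 ULPs, measured at the (nonzero) exact value."""
    exponent = math.frexp(float(exact))[1] - 1  # floor(log2(abs(exact)))
    ulp = Fraction(2) ** (exponent - 23)        # float32 has 23 fraction bits
    return abs(Fraction(result) - exact) / ulp


# Inputs chosen so that the product a*b lands exactly halfway between
# two float32 values; all three are exactly representable in float32.
a = b = 1.0 + 2.0 ** -12
c = -(1.0 + 2.0 ** -11)

# Infinitely precise reference, kept as an exact rational number.
exact = Fraction(a) * Fraction(b) + Fraction(c)

# Fused: a single rounding of the exact result.
fused = round_exact_to_f32(exact)

# Non-fused: round the product to float32 first, then round the sum.
unfused = to_f32(to_f32(a * b) + c)

print("fused  ", fused, ulp_error(fused, exact))
print("unfused", unfused, ulp_error(unfused, exact))
```
</pre>
<div>For these inputs the fused path returns 2^-24 exactly (0 ULP error), while the non-fused path rounds the product first and ends up at 0.0, which is 2^23 ULPs away from the reference. That is the "way different" kind of result Ilia mentioned, and it is why a spec that requires a fused fma() is easier to write tolerances for.<br></div>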