<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 13, 2016 at 2:14 PM, Matt Turner <span dir="ltr"><<a href="mailto:mattst88@gmail.com" target="_blank">mattst88@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Wed, Jan 13, 2016 at 1:46 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br> > On Wed, Jan 13, 2016 at 2:01 AM, Ian Romanick <<a href="mailto:idr@freedesktop.org">idr@freedesktop.org</a>> wrote:<br> >> On 01/12/2016 05:41 PM, Matt Turner wrote:<br> </span><span class="">>> > Section 8.3.2 of the OpenCL C 2.0 spec is also relevant, but doesn't<br> >> > touch directly on the issue at hand.<br> >> ><br> >> > I'm worried that what is specified is not implementable via a round<br> >> > trip through half-precision, because it's not the behavior other<br> >> > languages implement.<br> >> ><br> >> > If I had to guess, given the table in the IVB PRM and section 8.3.2,<br> >> > out-of-range single-precision floats are converted to the<br> >> > half-precision value with the largest magnitude.<br> >><br> >> You are correct, we should test it to be sure what the hardware really<br> >> does. This is not intended to be a performance operation. If we need to<br> >> use a different, more expensive expansion to meet the requirements, we<br> >> shouldn't lose any sleep over it.<br> ><br> ><br> > I haven't looked at it in bit-for-bit detail, but I did run it through a<br> > set of tests which explicitly hits denorms and the out-of-bounds cases in<br> > both directions. The tests seem to indicate that the hardware does what the<br> > opcode claims.<br> <br> </span>I checked out the tests you mention, and none of the cases touch on<br> what I'm saying (and this has nothing to do with denormal values). Let<br> me explain again.<br></blockquote><div><br></div><div>Right. Thanks for looking at it. 
I guess it only checks the explicit infinity case.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> The largest representable value in half-precision is<br> <br> 65504 == 2.0**15 * (1.0 + 1023.0 / 2.0**10)<br> <br> and the distance between representable integers at this range is 32.<br> Converting 65505.0f through 65519.0f (i.e., one less than half the<br> interval more than the largest representable value) to half-precision<br> should round to 65504.0. 65520.0f and larger should round to infinity.<br> <br> This is what piglit tests<br> (generated_tests/gen_builtin_packing_tests.py) and since we pass those<br> tests I believe this is what the hardware does.<br> <br> This is, unfortunately, *not* what the documentation you've cited<br> says. I expect that that's an oversight more than intentional<br> behavior. Maybe tomorrow we can figure out how to submit changes to<br> the spec and test suite?<br></blockquote><div><br></div><div>Yeah, we can look at that tomorrow. The objective of the opcode is to get the behavior that Ian mentioned where if you sprinkle enough of them in, you can emulate half-float precision. What happens if you do FLOAT_MAX + FLOAT_MAX? Maybe infinity is what's wanted. If that's the case, then we'll have to do some sort of absolute value range-check. It doesn't have to be efficient.<br></div><div>--Jason <br></div></div></div></div>
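<div>For reference, here is a rough sketch of the rounding behavior Matt describes above (round-to-nearest-even onto the half-precision grid, with anything at or beyond 65520.0 going to infinity). The function name <code>quantize_to_half</code> is made up for illustration; this is not the opcode's implementation or the piglit generator's code, only a model of the expected results under those assumptions.</div>

```python
import math

HALF_MAX = 65504.0  # 2.0**15 * (1.0 + 1023.0 / 2.0**10)


def quantize_to_half(x: float) -> float:
    """Hypothetical helper: round x to the nearest half-precision value.

    Assumes round-to-nearest-even and overflow to infinity, i.e. the
    behavior described in the message above, not the clamping behavior.
    """
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x
    mag = abs(x)
    # mag == m * 2**e with 0.5 <= m < 1
    _, e = math.frexp(mag)
    # Half floats carry 11 significant bits; the smallest subnormal
    # step is 2**-24, so the grid spacing never goes below that.
    ulp = math.ldexp(1.0, max(e - 11, -24))
    # Python's round() rounds exact ties to the even neighbor.
    q = round(mag / ulp) * ulp
    if q > HALF_MAX:
        q = math.inf
    return math.copysign(q, x)


if __name__ == "__main__":
    for value in (65504.0, 65505.0, 65519.0, 65520.0, -65520.0):
        print(value, "->", quantize_to_half(value))
```

<div>If the clamping behavior turns out to be what is wanted instead, the overflow branch would return <code>HALF_MAX</code> rather than infinity; which of the two is correct is exactly the open question in this thread.</div>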