[Mesa-dev] [PATCH 1/2] nir: Add a fquantize2f16 opcode

Tue Jan 12 18:11:32 PST 2016

On 13.01.2016 at 02:41, Matt Turner wrote:
> On Tue, Jan 12, 2016 at 4:10 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Tue, Jan 12, 2016 at 3:52 PM, Matt Turner <mattst88 at gmail.com> wrote:
>>>
>>> On Tue, Jan 12, 2016 at 3:35 PM, Jason Ekstrand <jason at jlekstrand.net>
>>> wrote:
>>>> This opcode simply takes a 32-bit floating-point value and reduces its
>>>> effective precision to 16 bits.
>>>> ---
>>>
>>> What's it supposed to do for values not representable in half-precision?
>>
>>
>> If they're in-range, round.  If they're out-of-range, the appropriate
>> infinity.
> 
> Are you sure that's the behavior hardware has? And by "are you sure" I
> mean "have you tested it"
> 
> The conversion table in the f32to16 documentation in the IVB PRM says:
> 
> single precision -> half precision
> ------------------------------------
> -finite -> -finite/-denorm/-0
> +finite -> +finite/+denorm/+0
> 
>> https://www.khronos.org/registry/spir-v/specs/1.0/SPIRV.html#OpQuantizeToF16
> 
>> Quantize a floating-point value to a what is expressible by a 16-bit floating-point value.
> 
> Erf, anyway,
> 
> ... and the "convert too-large values to inf" isn't the behavior of
> other languages like C [1] (and I don't think GLSL either, but I can't
> find anything on the matter in the spec) or OpenCL C [2].

At least for OpenGL, round-to-nearest (which implies that too-large
values overflow to infinity) seems to be preferred, but is not required.
GLSL generally operates according to some rounding mode, which is
undefined. (I don't know, however, whether rounding needs to be
consistent, that is, the same for all operations.) Thus, if you operate
with a rounding mode of truncate, mapping too-large values to the largest
finite value is the right thing to do; otherwise they round to infinity.
D3D10, OTOH, generally requires round-to-nearest for all operations and
conversions, with the very notable exception of all float-like
conversions, which for some reason operate in truncate mode (and hence
map too-large values to the largest finite value). This is likely why the
IVB PRM says this. However, the Mesa f16 conversion code (not the one in
Gallium) says it matches Intel hw, and it does round to infinity.
There's of course always the question of whether you'd want to do the
same for f32->f16 shader instructions and for conversion to f16 render
targets (from rendering or clearing) and whatnot... Intuitively I'd say
it would make sense to handle this the same everywhere, but maybe not...
That opcode as quoted also differs in general from what is done for at
least render target conversion (dunno about shader ops) or for sampling
such formats, since "usually" (it is required by D3D10) the hw will be
able to represent f16 denorms.
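
To make the two overflow behaviors concrete, here is a minimal sketch of
an f32->f16 conversion with a selectable rounding mode. It is not the
Mesa conversion code and not what any particular hw does; the names
f32_to_f16_bits and enum f16_round are made up for illustration, and it
assumes IEEE 754 binary32 floats. With round-to-nearest-even, too-large
values end up as infinity; with truncate, they end up as the largest
finite half (0x7bff, i.e. 65504). f16 denorms are kept rather than
flushed to zero.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical rounding-mode selector, for illustration only. */
enum f16_round { F16_ROUND_NEAREST_EVEN, F16_ROUND_TOWARD_ZERO };

/* Convert a 32-bit float to the bit pattern of a 16-bit float. */
static uint16_t
f32_to_f16_bits(float f, enum f16_round mode)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));

   uint16_t sign = (uint16_t)((u >> 16) & 0x8000);
   uint32_t mag = u & 0x7fffffff;

   if (mag > 0x7f800000)                /* NaN -> a quiet NaN */
      return (uint16_t)(sign | 0x7e00);
   if (mag == 0x7f800000)               /* infinity stays infinity */
      return (uint16_t)(sign | 0x7c00);
   if ((mag >> 23) == 0)                /* f32 zero/denorm: far below f16 range */
      return sign;

   int exp = (int)(mag >> 23) - 127;    /* unbiased exponent */

   /* Out of range: this is the case the thread is arguing about. */
   if (exp > 15)
      return (uint16_t)(sign | (mode == F16_ROUND_NEAREST_EVEN ? 0x7c00 : 0x7bff));

   uint32_t mant = (mag & 0x7fffff) | 0x800000;   /* add the implicit 1 */

   /* Normal halves drop 13 mantissa bits; f16 denorms drop more. */
   int shift = exp >= -14 ? 13 : 13 + (-14 - exp);
   if (shift > 24)                      /* below half the smallest f16 denorm */
      return sign;

   /* The implicit bit lands in the exponent field, hence exp + 14. */
   uint32_t base = exp >= -14 ? (uint32_t)(exp + 14) << 10 : 0;
   uint32_t r = base + (mant >> shift);

   if (mode == F16_ROUND_NEAREST_EVEN) {
      uint32_t rem = mant & ((1u << shift) - 1);
      uint32_t halfway = 1u << (shift - 1);
      /* A carry out of 0x7bff yields 0x7c00, i.e. infinity. */
      if (rem > halfway || (rem == halfway && (r & 1)))
         r++;
   }

   return (uint16_t)(sign | r);
}
```

For example, 65520.0f sits exactly halfway between 65504 and the first
unrepresentable value, so in this sketch it becomes infinity (0x7c00)
under round-to-nearest-even but 0x7bff under truncate.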

Roland

> 
> Section 8.3.2 of the OpenCL C 2.0 spec is also relevant, but doesn't
> touch directly on the issue at hand.
> 
> I'm worried that what is specified is not implementable via a round
> trip through half-precision, because it's not the behavior other
> languages implement.
> 
> If I had to guess, given the table in the IVB PRM and section 8.3.2,
> out-of-range single-precision floats are converted to the
> half-precision value with the largest magnitude.
> 
> [1] C99 spec, 6.3.1.5 says "If the value being converted is outside
> the range of values that can be represented, the behavior is
> undefined."
> [2] OpenCL C 2.0 spec 6.2.3.3 says to refer to C99 spec section 6.3.