[Mesa-dev] [PATCH] radeonsi: enable denorms for 64-bit and 16-bit floats

Mon Feb 8 23:53:32 UTC 2016

On 02/08/2016 03:37 PM, Roland Scheidegger wrote:
> Am 09.02.2016 um 00:02 schrieb Ian Romanick:
>> On 02/08/2016 12:38 PM, Marek Olšák wrote:
>>> On Mon, Feb 8, 2016 at 5:08 PM, Tom Stellard <tom at stellard.net> wrote:
>>>> On Sat, Feb 06, 2016 at 01:15:42PM +0100, Marek Olšák wrote:
>>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>>
>>>>> This fixes FP16 conversion instructions for VI, which has 16-bit floats,
>>>>> but not SI & CI, which can't disable denorms for those instructions.
>>>>
>>>> Do you know why this fixes FP16 conversions?  What does the OpenGL
>>>> spec say about denormal handling?
>>>
>>> Yes, I know why. The patch explains everything as far as I can see,
>>> though. What isn't clear?
>>>
>>> SI & CI: Don't support FP16. FP16 conversions are hardcoded to emit
>>> and accept FP16 denormals.
>>> VI: Supports FP16. FP16 denormal support is now configurable and
>>> affects FP16 conversions as well (shared setting with FP64).
>>>
>>> OpenGL doesn't require denormals. Piglit does. I think this is
>>> incorrect piglit behavior.
>>
>> I submitted a public spec bug for this issue:
>>
>> https://www.khronos.org/bugzilla/show_bug.cgi?id=1460
>>
>> I'm investigating whether a similar bug is needed for the SPIR-V
>> specification.
>>
>> I think an argument can be made for either the flush-to-zero or
>> non-flush-to-zero behavior in the case of unpackHalf2x16 and (possibly)
>> packHalf2x16.  The only place in the GLSL 4.50.5 specification that
>> mentions subnormal values is section 4.7.1 (Range and Precision).
>>
>>     "The precision of stored single- and double-precision floating-point
>>     variables is defined by the IEEE 754 standard for 32-bit and 64-bit
>>     floating-point numbers....Any denormalized value input into a
>>     shader or potentially generated by any operation in a shader can be
>>     flushed to 0."
>>
>> Since there is no half-precision type in desktop GLSL, there is no
>> mention of 16-bit subnormal values.  As Roland mentioned before, all
>> 16-bit subnormal values are 32-bit normal values.
>>
>> As I mentioned before, from the point of view of an application
>> developer, the flush-to-zero behavior for unpackHalf2x16 is both
>> surprising and awful. :)
>>
>> While I think an argument can be made for either behavior, I also think
>> the argument for the non-flush-to-zero behavior is slightly stronger.
>> The case for flush-to-zero based on the above spec quotation fails for
>> two reasons.  First, the "input into [the] shader" is not a subnormal
>> number.  It is an integer.  Second, the "[value] potentially generated
>> by [the] operation" is not subnormal in single-precision.
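
To make the "16-bit subnormal values are 32-bit normal values" point
concrete, here is a minimal sketch in C.  It is not Mesa or driver code;
the names half_to_float, half_to_float_ftz, float_bits and f32_is_normal
are made up for illustration.  It assumes IEEE 754 binary16 and binary32
layouts and contrasts a conversion that keeps half subnormals with one
that flushes them to zero.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* True if f is a normal binary32 value (not zero, subnormal, Inf or NaN). */
static int f32_is_normal(float f)
{
    uint32_t e = (float_bits(f) >> 23) & 0xffu;
    return e != 0 && e != 0xffu;
}

/* Hypothetical helper: binary16 bits -> float, preserving subnormals. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;
    uint32_t mant = h & 0x3ffu;
    uint32_t bits;
    float f;

    if (exp == 0x1fu) {
        /* Inf or NaN: keep the payload bits. */
        bits = sign | 0x7f800000u | (mant << 13);
    } else if (exp != 0) {
        /* Normal half: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        /* Signed zero. */
        bits = sign;
    } else {
        /* Subnormal half: renormalize.  The result always fits in a
         * normal binary32 value because 2^-24 is far above 2^-126. */
        exp = 113u; /* biased binary32 exponent for 2^-14 */
        while (!(mant & 0x400u)) {
            mant <<= 1;
            exp--;
        }
        bits = sign | (exp << 23) | ((mant & 0x3ffu) << 13);
    }

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Same conversion, but half subnormals are flushed to signed zero. */
static float half_to_float_ftz(uint16_t h)
{
    if ((h & 0x7c00u) == 0)
        return (h & 0x8000u) ? -0.0f : 0.0f;
    return half_to_float(h);
}
```

With this sketch, half_to_float(0x0001) yields 2^-24, which
f32_is_normal reports as a normal single-precision value, while
half_to_float_ftz(0x0001) yields 0.0f.  That difference is the one under
discussion for unpackHalf2x16.
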
> 
> I don't disagree with that, however OTOH you could make an argument that
> such a strong guarantee for packed half floats is inconsistent with
> what's required for them elsewhere in GL. In particular half float
> texture formats - these are still based on ARB_half_float_pixel. Which
> says denormals are optional, infs are optional, NaNs are optional -
> albeit that's not any different to ordinary floats...

Thanks for mentioning this. :)  The same issue had occurred to me, and I
was trying to find some relevant text in the GL spec.  I hadn't thought
to look in the extension spec.

> (And I still have the problem that d3d10 wants trunc behavior instead of
> round... fwiw the precedent there in GL is also for r11g11b10 format,
> which says round-to-nearest recommended but trunc allowed, and all too
> large finite numbers converted to max finite (which is inconsistent with
> nearest rounding). The spec is completely silent, in both GLSL and GL, on
> how rounding should be done for fp32 to fp16, albeit I don't disagree
> round-to-nearest seems the most reasonable.)

The GLSL spec isn't silent.  Section 4.7.1 explicitly says, "The
rounding mode cannot be set and is undefined."
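
For reference, the two rounding choices can give different fp16 bit
patterns for the same fp32 input.  The following is only a rough sketch
in C, not what any driver or spec mandates.  float_to_half and its
round_to_nearest flag are hypothetical names, and the overflow handling
(Inf for round-to-nearest-even, max finite for truncation) is an
assumption chosen to match the behaviors described above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: float -> binary16 bits.
 * round_to_nearest != 0: round to nearest, ties to even.
 * round_to_nearest == 0: truncate (round toward zero). */
static uint16_t float_to_half(float f, int round_to_nearest)
{
    uint32_t bits, mag, e, m, shift, base, result, rem, halfway;
    uint16_t sign;

    memcpy(&bits, &f, sizeof bits);
    sign = (uint16_t)((bits >> 16) & 0x8000u);
    mag  = bits & 0x7fffffffu;
    e    = mag >> 23;        /* biased binary32 exponent */
    m    = mag & 0x7fffffu;  /* binary32 mantissa field */

    if (mag > 0x7f800000u)
        return (uint16_t)(sign | 0x7e00u);  /* NaN -> quiet NaN */
    if (mag == 0x7f800000u)
        return (uint16_t)(sign | 0x7c00u);  /* Inf */
    if (e >= 143u)                          /* |f| >= 2^16: overflow */
        return (uint16_t)(sign | (round_to_nearest ? 0x7c00u : 0x7bffu));
    if (e < 102u)                           /* |f| < 2^-25: becomes zero */
        return sign;

    if (e >= 113u) {
        /* Result is a normal half: rebias and drop 13 mantissa bits. */
        base  = (e - 112u) << 10;
        shift = 13u;
    } else {
        /* Result is a subnormal half: make the implicit 1 explicit. */
        m    |= 0x800000u;
        base  = 0u;
        shift = 126u - e;
    }

    result  = base + (m >> shift);
    rem     = m & ((1u << shift) - 1u);
    halfway = 1u << (shift - 1u);

    /* A carry out of the mantissa correctly bumps the exponent,
     * including 0x7bff -> 0x7c00 (Inf). */
    if (round_to_nearest &&
        (rem > halfway || (rem == halfway && (result & 1u))))
        result++;

    return (uint16_t)(sign | result);
}
```

For example, 1.000732421875f (1 + 3 * 2^-12) converts to 0x3c01 with
round-to-nearest and to 0x3c00 with truncation, and 65520.0f converts
to 0x7c00 (Inf) versus 0x7bff (max finite).
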

> Roland
> 
>> We've already determined that NVIDIA closed-source drivers do not flush
>> to zero.  I'm curious to know what AMD's closed-source drivers do for
>> 16-bit subnormal values supplied to unpackHalf2x16.  If they do not
>> flush to zero, then you had better believe that applications depend on
>> that behavior... and that also means that it doesn't matter very much
>> what piglit does or the spec does (or does not) say.  This is the sort
>> of situation where the spec changes to match application expectations
>> and shipping implementations... and Mesa drivers change to follow.  This
>> isn't even close to the first time through that loop.