[Mesa-dev] [PATCH] radeonsi: enable 32-bit denormals on VI+

Roland Scheidegger sroland at vmware.com
Thu Jan 12 16:23:50 UTC 2017

On 12.01.2017 at 17:13, Roland Scheidegger wrote:
> On 12.01.2017 at 02:12, Marek Olšák wrote:
>> On Thu, Jan 12, 2017 at 12:33 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>> On Wed, Jan 11, 2017 at 4:00 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>> On 11.01.2017 at 21:08, Samuel Pitoiset wrote:
>>>>> On 01/11/2017 07:00 PM, Roland Scheidegger wrote:
>>>>>> I don't think there's any glsl, es or otherwise, specification which
>>>>>> would require denorms (since obviously lots of hw can't do it, d3d10
>>>>>> forbids them), with any precision qualifier. Hence these look like bugs
>>>>>> of the test suite to me?
>>>>>> (Irrespective if it's a good idea or not to enable denormals, which I
>>>>>> don't really know.)
>>>>> That test works on NVIDIA hw (both with blob and nouveau) and IIRC it
>>>>> also works on Intel hw. I don't think it's buggy there.
>>>> The question then is why it needs denorms on radeons...
>>> I spent some time with Samuel looking at this. So, this is pretty
>>> funny... (or at least feels that way after staring at floating point
>>> for a while)
>>> dEQP is, in fact, feeding denorms to the min/max functions. But it's
>>> smart enough to know that flushing denorms to 0 is OK, and so it
>>> treats a 0 as a pass. (And obviously it treats the "right answer" as a
>>> pass.) So that's why enabling denorm processing fixes it - that causes
>>> the hw to return the correct answer and all is well.
>>> However the issue is that without denorm processing, the hw is
>>> returning the *wrong* answer. At first I thought that max was being
>>> lowered into something like
>>> if (a > b) { x = a; } else { x = b; }
>>> which would end up with potentially wrong results if a and b are being
>>> flushed as inputs into the comparison but not into the assignments.
>>> But that's not (explicitly) what's happening - the v_max_f32_e32
>>> instruction is being used. Perhaps that's what it does internally? If
>>> so, that means that results of affected float functions in LLVM need
>>> explicit flushing before being stored into results.
>>> FWIW the specific values triggering the issue are:
>>> in0=-0x0.000002p-126, in1=-0x0.fffffep-126, out0=-0x0.fffffep-126 -> FAIL
>>> With denorm processing, it correctly reports out0=-0x0.000002p-126,
>>> while nouveau with denorm flushing enabled reports out0=0.0 which also
>>> passes.
>> The denorm configuration has 2 bits:
>> - flush (0) or allow (1) input denorms
>> - flush (0) or allow (1) output denorms
>> In the case of v_max, it looks like output denorms are not flushed and
>> it behaves almost like you said:
>> if (a >= b) { x = a; } else { x = b; }
>> Marek
> I think this makes perfect sense. Output denorms are usually flushed,
> but this often doesn't affect "mov-like" operations, only arithmetic
> ones. And in this case you have a comparison plus a cmov.
> FWIW I believe you'd get the same result on x86 SSE, such as with
> llvmpipe, if denormals-are-zero (DAZ) is set and flush-to-zero (FTZ) is
> enabled. The comparison will think both values are the same, but since
> there's no arithmetic involved in the selection there will be no denorm
> flush. That's not obvious from the manual, since it doesn't actually
> mention whether DAZ is honored, but I'd think it is, just like with all
> float operations: "IF ((SRC1 == 0.0) and (SRC2 == 0.0)) THEN DEST =
> SRC2;" That behavior may even be alright with d3d10, though that's far
> from obvious either, but in any case I didn't see problems due to that
> with conformance tests ("Min or max operations flush denorms for
> comparison, but the result may or may not be denorm flushed." -
> https://msdn.microsoft.com/en-us/library/windows/desktop/cc308050(v=vs.85).aspx).
> Doesn't the hw do denorm flushing on shader export though or does the
> test get back the result via some other means?
> In any case, probably no big deal - since the hw won't generate denorms
> on its own, they can only be passed in, and apps are unlikely to do that
> (and even if they do, a wrong answer probably shouldn't really hurt,
> since it's just a wrong denorm which generally won't matter).

Oh and btw is the wrong denorm result really wrong according to glsl
rules? I think that's at least debatable.

