[Mesa-dev] [PATCH] radeonsi: enable 32-bit denormals on VI+
Samuel Pitoiset
samuel.pitoiset at gmail.com
Thu Jan 12 13:12:02 UTC 2017
On 01/12/2017 12:55 PM, Marek Olšák wrote:
> I think v_mad always flushes denorms.
>
> I would just ignore this failure. It's not required to fix every silly
> test on the planet. If you opencode v_max, you'll have the same problem,
> and then you'd have to fix v_cmp. It's just silly.
Your call.
That test compares 60k values and only one is actually wrong (the one
mentioned by Ilia).
>
> Marek
>
> On Jan 12, 2017 11:59 AM, "Nicolai Hähnle" <nhaehnle at gmail.com> wrote:
>
> On 12.01.2017 09:24, Samuel Pitoiset wrote:
>
>
>
> On 01/12/2017 02:12 AM, Marek Olšák wrote:
>
> On Thu, Jan 12, 2017 at 12:33 AM, Ilia Mirkin
> <imirkin at alum.mit.edu> wrote:
>
> On Wed, Jan 11, 2017 at 4:00 PM, Roland Scheidegger
> <sroland at vmware.com> wrote:
>
> On 11.01.2017 at 21:08, Samuel Pitoiset wrote:
>
>
>
> On 01/11/2017 07:00 PM, Roland Scheidegger wrote:
>
> I don't think there's any glsl, es or otherwise, specification which
> would require denorms (since obviously lots of hw can't do it, d3d10
> forbids them), with any precision qualifier. Hence these look like
> bugs of the test suite to me?
> (Irrespective of whether it's a good idea to enable denormals, which
> I don't really know.)
>
>
> That test works on NVIDIA hw (both with blob and nouveau) and IIRC it
> also works on Intel hw. I don't think it's buggy there.
>
> The question then is why it needs denorms on radeons...
>
>
> I spent some time with Samuel looking at this. So, this is pretty
> funny... (or at least feels that way after staring at floating point
> for a while)
>
> dEQP is, in fact, feeding denorms to the min/max functions. But it's
> smart enough to know that flushing denorms to 0 is OK, and so it
> treats a 0 as a pass. (And obviously it treats the "right answer" as a
> pass.) So that's why enabling denorm processing fixes it - that causes
> the hw to return the correct answer and all is well.
>
> However the issue is that without denorm processing, the hw is
> returning the *wrong* answer. At first I thought that max was being
> lowered into something like
>
> if (a > b) { x = a; } else { x = b; }
>
> which would end up with potentially wrong results if a and b are being
> flushed as inputs into the comparison but not into the assignments.
> But that's not (explicitly) what's happening - the v_max_f32_e32
> instruction is being used. Perhaps that's what it does internally? If
> so, that means that results of affected float functions in LLVM need
> explicit flushing before being stored into results.
>
> FWIW the specific values triggering the issue are:
>
> in0=-0x0.000002p-126, in1=-0x0.fffffep-126, out0=-0x0.fffffep-126 ->
> FAIL
>
> With denorm processing, it correctly reports out0=-0x0.000002p-126,
> while nouveau with denorm flushing enabled reports out0=0.0 which also
> passes.
>
>
> The denorm configuration has 2 bits:
> - flush (0) or allow (1) input denorms
> - flush (0) or allow (1) output denorms
>
> In the case of v_max, it looks like output denorms are not flushed and
> it behaves almost like you said:
>
> if (a >= b) { x = a; } else { x = b; }
>
>
> Should we adjust the denorm mode with s_setreg for v_max_f32/v_min_f32?
>
>
> That might eliminate some optimization opportunities, so let's first
> see if another fix is possible?
>
> I haven't run the test, but from the description the most plausible
> explanation is that v_max_f32/v_min_f32 flushes input denorms, but
> doesn't flush output denorms for some stupid reason. Perhaps we
> could change the fp_denorm setting to
>
> - allow input denorms
> - flush output denorms
>
> Then min/max will preserve the denorms, but other operations will
> flush denorms to zero.
>
> Do you know how that affects v_mad_f32? If we just change the
> register without telling LLVM about it, LLVM will still happily emit
> v_mad_f32, and perhaps that produces incorrect results when denorms
> are passed in from uniforms?
>
> If this register setting doesn't work, then yes, looks like s_setreg
> may be needed. Unless there's a cheap way to flush denorms from
> loads as well? But I don't think there is.
>
> Nicolai
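
A rough sketch of the setting Nicolai proposes above (allow input denorms,
flush output denorms), written as a host-side model. The struct, the mode
constants and the function names are hypothetical, and the assumption that
max never flushes its result is taken from the hypothesis in this thread,
not from hardware documentation.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical mirror of the two denorm mode bits described above. */
struct denorm_mode {
    bool allow_input;   /* false: flush input denorms, true: allow them */
    bool allow_output;  /* false: flush output denorms, true: allow them */
};

static const struct denorm_mode MODE_FLUSH_ALL = { false, false };
static const struct denorm_mode MODE_ALLOW_IN_FLUSH_OUT = { true, false };

/* Flush a subnormal to a same-signed zero unless the mode bit allows it. */
static float flush_unless(bool allow, float x)
{
    if (!allow && fpclassify(x) == FP_SUBNORMAL)
        return copysignf(0.0f, x);
    return x;
}

/* Assumed model of max: compare per the input bit, never flush the result. */
static float model_max(struct denorm_mode m, float a, float b)
{
    return flush_unless(m.allow_input, a) > flush_unless(m.allow_input, b)
               ? a : b;
}

/* Assumed model of an ordinary operation (a multiply) honoring both bits. */
static float model_mul(struct denorm_mode m, float a, float b)
{
    float r = flush_unless(m.allow_input, a) * flush_unless(m.allow_input, b);
    return flush_unless(m.allow_output, r);
}
```

Under this model, MODE_ALLOW_IN_FLUSH_OUT makes max return the correct
denorm for the failing inputs, while a multiply whose result is a denorm
still produces zero. Whether v_mad_f32 follows the same rules is exactly
the open question in the message above, so it is not modelled here.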
>
>
>
>
>
> Marek
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>