[Mesa-dev] [PATCH 5/8] PowerPC: clear Altivec NJ bit
Roland Scheidegger
sroland at vmware.com
Wed Nov 28 07:54:34 PST 2012
On 28.11.2012 13:47, Adhemerval Zanella wrote:
> On 11/22/2012 07:33 PM, Roland Scheidegger wrote:
>> On 22.11.2012 21:34, Adhemerval Zanella wrote:
>>> Most PowerPC systems set the Altivec NJ bit to 1, so denormal numbers
>>> are handled as 0. Initially this was a performance setting, since
>>> denormal handling tended to be costly. However, that is no longer the
>>> case on more recent PowerPC chips (POWER6 and onwards).
>>>
>>> This patch forces the NJ bit in the Altivec VSCR register to be cleared
>>> so denormal numbers are handled as expected by the IEEE standard
>>> (more information in PowerISA 2.06, Section 6.3). This makes the
>>> half-float to float conversion and some rounding work correctly
>>> on an Altivec-enabled machine.
>>>
>>> Any tips, advice, comments?
>> Looks good to me, though I think ultimately we should be able to avoid
>> denorms - even for x86 we probably want to switch on the DAZ and/or FTZ
>> flags (I guess there's also the question of whether that interferes with
>> the callers of the jitted functions). Denorms ARE slow, and they typically
>> shouldn't be required for shaders (DX10, for instance, even though it
>> mostly conforms to IEEE 754, seems to mandate they are flushed to zero).
>>
>> Roland
>>
> Thanks for the review. I explained why I added this patch in the reply to
> Jose Fonseca's email. As for denorms being slow, that may be true on some
> platforms, but I didn't observe it on newer PPC (POWER7): in a synthetic
> benchmark, denormal multiplications/adds are as fast as with any other FP
> input. This is also true for the newer VSX instructions (ISA 2.06+).
>
OK, maybe that's the case for POWER7. I was going mostly by the Intel
architecture optimization manual, which still lists disabling denormals as
one of the top optimizations. I guess it could also make a difference
whether it's actually doubles or floats, at least on some hardware (since a
denormalized float is perfectly representable as a normalized double).
In any case, we're going to use DX10 rules at some point.
It is also possible that some tests require denormals right now.
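
To make the float/double remark concrete, here is a minimal sketch (Python is used only for illustration; the helper names float32_from_bits and is_normal_double are made up for this example). It builds the smallest and largest positive single-precision denormals from their bit patterns, widens them to double, and checks that both land in the normal range of a double. It says nothing about how any particular hardware actually executes the operation.

```python
import struct
import sys


def float32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single.

    The result is widened exactly to a Python float, which is a double.
    """
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Positive single-precision denormals have an exponent field of 0
# and a non-zero mantissa.
smallest_denormal = float32_from_bits(0x00000001)  # 2**-149
largest_denormal = float32_from_bits(0x007FFFFF)   # just below 2**-126

# Smallest positive *normal* double (2**-1022).
min_normal_double = sys.float_info.min


def is_normal_double(x: float) -> bool:
    """True if a non-zero finite x is at or above the normal double threshold."""
    return abs(x) >= min_normal_double


print(smallest_denormal == 2.0 ** -149)
print(is_normal_double(smallest_denormal), is_normal_double(largest_denormal))
```

Every single-precision denormal is therefore an ordinary normal number once it is held in a double, which is why the cost may differ between float and double code paths on some hardware.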
Roland
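
For readers wondering why flushing denormals can break half-float to float conversion, as the quoted patch description says: the sketch below uses one well-known bit trick (shift the half bits into a float, then multiply by 2**112). This is an assumption chosen for illustration, not necessarily the code path the patch is about. The flush-to-zero behaviour is only simulated in Python, and all names (half_to_float_magic, flush_denormal, half_reference) are hypothetical. Inf and NaN inputs are not handled.

```python
import struct

FLOAT32_MIN_NORMAL = 2.0 ** -126  # smallest positive normal single
MAGIC = 2.0 ** 112                # rebias factor: 2**(127 - 15)


def float32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def flush_denormal(x: float) -> float:
    """Simulate treating a single-precision denormal input as zero."""
    return 0.0 if 0.0 < abs(x) < FLOAT32_MIN_NORMAL else x


def half_to_float_magic(h: int, flush_to_zero: bool = False) -> float:
    """Convert finite half-float bits to a float with the multiply trick."""
    # Exponent and mantissa moved into float position. For half denormals
    # the exponent field is 0, so this intermediate is a float denormal.
    magnitude = float32_from_bits((h & 0x7FFF) << 13)
    if flush_to_zero:
        magnitude = flush_denormal(magnitude)
    value = magnitude * MAGIC
    return -value if h & 0x8000 else value


def half_reference(h: int) -> float:
    """Reference conversion using struct's native half-float format."""
    return struct.unpack("<e", struct.pack("<H", h))[0]


h = 0x0001  # smallest positive half denormal, 2**-24
print(half_to_float_magic(h) == half_reference(h))
print(half_to_float_magic(h, flush_to_zero=True))
```

With denormals honoured, the trick matches the reference for every finite half value. With the simulated flush, normal halves still convert correctly, but every half denormal collapses to zero because its intermediate float is itself a denormal.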