[Mesa-dev] [PATCH 5/8] PowerPC: clear Altivec NJ bit
Adhemerval Zanella
azanella at linux.vnet.ibm.com
Wed Nov 28 04:47:21 PST 2012
On 11/22/2012 07:33 PM, Roland Scheidegger wrote:
> Am 22.11.2012 21:34, schrieb Adhemerval Zanella:
>> Most PowerPC systems set the Altivec NJ bit to 1, so denormal numbers
>> are handled as 0. Initially this was a performance configuration, since
>> denormal handling tended to be costly. However, that is not the case on
>> more recent PowerPC chips (POWER6 and onwards).
>>
>> This patch forces the NJ bit in the Altivec VSCR register to be cleared so
>> denormal numbers are handled as expected by the IEEE standard
>> (more information in PowerISA 2.06, Section 6.3). This makes the
>> half-float to float conversion and some rounding work correctly
>> on an Altivec-enabled machine.
>>
>> Any tips, advice, or comments?
> Looks good to me, though I think ultimately we should be able to avoid
> denorms - even for x86 we probably want to switch on the DAZ and/or FTZ
> flags (I guess there's also the question if that interferes with the
> callers of the jited functions). Denorms ARE slow, and they typically
> shouldn't be required for shaders (dx10 for instance even though it
> mostly conforms to ieee754 seems to mandate they are flushed to zero).
>
> Roland
>
Thanks for the review. I commented on why I added this patch in my reply to
Jose Fonseca's email. As for denorms being slow, that may be true on some
platforms, but I didn't observe it on newer PPC (POWER7): in a synthetic
benchmark, multiplications and additions on denormals are as fast as on any
other FP input. This also holds for the newer VSX instructions (ISA 2.06+).
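
To illustrate why the half-float to float conversion is sensitive to this
setting, here is a minimal sketch in portable C. It is not code from the
patch: the names half_to_float_magic and half_denormals_preserved are
hypothetical, and the "magic multiply" trick is only one common way to do the
conversion. A half denormal is first shifted into a float denormal and then
rescaled by 2^112, so if denormal inputs are treated as 0 (as with NJ set to
1), every half denormal would come out as 0.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): convert an IEEE 754 binary16
 * bit pattern to float using the "magic multiply" trick. */
static float half_to_float_magic(uint16_t h)
{
    union { uint32_t u; float f; } v, magic, inf;

    magic.u = (254u - 15u) << 23;   /* 2^112: rebias the exponent from half to float */
    inf.u   = (127u + 16u) << 23;   /* 65536.0f: first value past the largest finite half */

    /* Move exponent and mantissa into float position. For a half denormal
     * the exponent field is 0, so this intermediate is a float denormal. */
    v.u = (uint32_t)(h & 0x7fffu) << 13;

    /* Rescale. This multiply only gives the right answer for half denormals
     * if the FPU honours denormal inputs instead of flushing them to 0. */
    v.f *= magic.f;

    if (v.f >= inf.f)               /* half Inf/NaN: force the float exponent to all ones */
        v.u |= 255u << 23;

    v.u |= (uint32_t)(h & 0x8000u) << 16;   /* restore the sign bit */
    return v.f;
}

/* Returns 1 if the smallest half denormal (2^-24) survives the conversion,
 * 0 if it was flushed to zero. volatile keeps the check at run time. */
static int half_denormals_preserved(void)
{
    volatile uint16_t smallest = 0x0001;
    return half_to_float_magic(smallest) == 1.0f / 16777216.0f;
}
```

Calling half_denormals_preserved() at startup would be one way to check
whether the current floating-point mode treats denormals as IEEE 754 expects.
Whether a compiler maps this scalar code onto Altivec or the scalar FPU depends
on the target and flags, so take it as an illustration of the arithmetic only.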