[Mesa-dev] [PATCH 11/28] util: added float to float16 conversions with RTZ and RTNE
Roland Scheidegger
sroland at vmware.com
Fri Dec 7 16:11:48 UTC 2018
On 07.12.18 at 05:22, Matt Turner wrote:
> On Thu, Dec 6, 2018 at 7:22 PM Roland Scheidegger <sroland at vmware.com> wrote:
>>
>> On 07.12.18 at 03:20, Matt Turner wrote:
>>> Since this is for an extension that will be BDW+ can we use the
>>> _cvtss_sh() intrinsic instead? It corresponds to an IVB+ instruction
>>> and even takes the rounding mode directly as an immediate argument.
>>
>> Not saying trying to use it isn't a good idea, but you'd need the right
>> compile flags, and you can't assume it's present, since even the latest
>> pentiums don't support avx (and by extension, f16c). (The same is true
>> for atoms too, of course).
>
> I'm not sure that AVX and F16C are related, but from a quick glance it
> seems that you're right that Atoms ("little core") doesn't support
> F16C. I had no idea :(
>
> As far as I can tell all "big cores" have F16C. That's what
> https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html indicates.
That also indicates SNB and up all have AVX. Despite that,
Pentiums/Celerons from those families definitely do not. (I suppose that
means cputype=ivybridge etc. can't be used if you target the
pentiums/celerons, at least not for gcc. I know this was a recurring
problem for llvm with autodetect of cpu type, when it would recognize a
newer core and then try to use avx / avx2 on pentiums, dying in a fire.)
That f16c is implicitly tied to avx seems beyond doubt, since
the instructions (VCVTPH2PS, VCVTPS2PH) only exist with VEX encoding.
You cannot issue VEX-encoded instructions without AVX (VEX-encoding _is_
AVX, regardless of whether you use the 128bit or 256bit variants).
If you don't like that pentiums don't support those, complain to intel
(as it's just disabled, of course). IMHO it's a bit silly nowadays...
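
As a rough sketch of the runtime check this implies (not Mesa's actual
CPU detection code; the struct and function names below are made up for
illustration), one could read the AVX and F16C feature bits from CPUID
leaf 1 and only take the _cvtss_sh() path when both are set, falling
back to the software conversion otherwise. This assumes gcc or clang
with <cpuid.h>, and it skips the OSXSAVE/XGETBV check that a complete
AVX test also needs:

```c
#include <stdbool.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical capability record, for illustration only. */
struct cpu_half_caps {
   bool has_avx;
   bool has_f16c;
};

/* Read the CPUID.1:ECX feature bits (bit 28 = AVX, bit 29 = F16C).
 * On non-x86 builds both flags simply stay false.
 * Note: a complete check would also verify OSXSAVE (bit 27) and use
 * XGETBV to confirm that the OS saves the YMM state. */
static struct cpu_half_caps
query_half_caps(void)
{
   struct cpu_half_caps caps = { false, false };
#if defined(__x86_64__) || defined(__i386__)
   unsigned int eax, ebx, ecx, edx;
   if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
      caps.has_avx  = (ecx >> 28) & 1;
      caps.has_f16c = (ecx >> 29) & 1;
   }
#endif
   return caps;
}

/* The F16C instructions are VEX-encoded, so require AVX as well. */
static bool
can_use_f16c(struct cpu_half_caps caps)
{
   return caps.has_avx && caps.has_f16c;
}
```

A caller would evaluate can_use_f16c(query_half_caps()) once at startup
and pick the conversion routine accordingly. The intrinsic path itself
would still have to live in a file built with the right compile flags.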
>
> If we've got to have the code, we might as well use it and not
> complicate it by using _cvtss_sh() then. Dang.
>
> (Unfortunately there seems to be bad information out there confusing
> things though... see https://communities.intel.com/thread/121635)
Quite sure this is blatantly false. Seems even intel is confused about
it :-).
Roland