[Mesa-dev] [PATCH 2/3] radeonsi: implement PK2H and UP2H opcodes

Wed Feb 3 17:29:26 UTC 2016

Am 03.02.2016 um 18:01 schrieb Marek Olšák:
> On Wed, Feb 3, 2016 at 5:37 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> Am 03.02.2016 um 10:38 schrieb Michel Dänzer:
>>> On 03.02.2016 18:29, Marek Olšák wrote:
>>>> On Wed, Feb 3, 2016 at 10:19 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>>>> On 03.02.2016 05:15, Marek Olšák wrote:
>>>>>> On Sat, Jan 30, 2016 at 12:46 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>>>>
>>>>>>> Based on a gallivm patch by Ilia Mirkin.
>>>>>>>
>>>>>>> +8 piglit regressions due to precision issues
>>>>>
>>>>> You're saying this patch causes 8 piglit tests to fail? What are the
>>>>> benefits we get in exchange for that?
>>>>
>>>> The tests are too strict and llvmpipe allegedly fails them too.
>>>
>>> Allegedly? You can easily test that. :)
>> That's not so easy. I'm not even entirely sure they are really too strict.
>> The GLSL wording leaves something to be desired: it says things such as
>> "rounding mode is undefined", yet it requires at least some operations
>> to be "correctly rounded".
>> FWIW the arb_shader_packing tests require either round-to-nearest-even
>> or round-to-nearest-trunc (both rounding finite values that are too
>> large to represent to infinity), whereas llvmpipe just truncates (which
>> rounds such values to the maximum finite value). (There's also the
>> question of fp16 denorms: llvmpipe will flush them to zero for pack but
>> handle them on unpack; again, GLSL doesn't really say anything about
>> that...) However, I wasn't brave enough to actually enable it for
>> llvmpipe, at least for now...
> 
> A simple test that checks if the results are within reasonable bounds
> should be enough in my opinion. We can't change the behavior of the
> hardware instructions anyway.
> 
> The current radeonsi setting for fp16, fp32, and fp64 is:
> - round to nearest even
> - flush input and output denorms
> 
> (FLOAT_MODE in PGM registers, described in the GCN3 ISA document:
> section 6.4, table 6.4, fp16 uses the fp64 setting IIRC)
> 
> Marek
> 

Are you sure, though, that the f16 conversion works according to that?
Pre-GCN3 you didn't have any fp16 instructions (except the conversion
one). That said, GCN did have a separate V_CVT_PKRTZ_F16_F32
instruction, whose name says it rounds toward zero, so I suppose the
other one does indeed honor the current rounding mode. And if it is
using round to nearest even, it should pass the piglit test (I think
the test was supposed to tolerate flush to zero).

Roland
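
For readers following the rounding discussion, here is a minimal sketch of how the variants mentioned in this thread differ when converting fp32 to fp16: round to nearest even (overflow goes to infinity), truncation (overflow clamps to the maximum finite value), and optional flushing of denormal results. This is not code from Mesa, llvmpipe, or the hardware; the function name `f32_to_f16_bits` and its parameters are made up for illustration, and whether any given driver matches it is an assumption.

```python
import struct

def f32_to_f16_bits(value, mode="rne", flush_denorms=False):
    """Return the 16-bit pattern of `value` converted from fp32 to fp16.

    mode="rne": round to nearest even, overflow becomes infinity.
    mode="rtz": truncate toward zero, overflow becomes max finite.
    (Hypothetical helper, for illustration only.)
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    sign = (bits >> 16) & 0x8000
    exp = (bits >> 23) & 0xFF
    man = bits & 0x7FFFFF

    if exp == 0xFF:                      # infinity or NaN
        return sign | 0x7C00 | (0x200 if man else 0)
    if exp == 0:                         # fp32 zero/denorm: far below fp16 range
        return sign

    e = exp - 127                        # unbiased exponent
    sig = man | 0x800000                 # 24-bit significand with implicit 1
    if e > 15:                           # too large for any finite fp16 value
        return sign | (0x7C00 if mode == "rne" else 0x7BFF)

    if e >= -14:                         # fp16 normal range
        shift, base = 13, (e + 14) << 10
    else:                                # fp16 denormal range
        shift, base = -e - 1, 0

    q = sig >> shift                     # kept bits (includes implicit 1 if normal)
    rem = sig & ((1 << shift) - 1)       # discarded bits
    half = 1 << (shift - 1)
    if mode == "rne" and (rem > half or (rem == half and (q & 1))):
        q += 1                           # round up; ties go to the even result

    result = base + q                    # a carry out of q bumps the exponent
    if result >= 0x7C00:                 # rounding up overflowed to infinity
        return sign | 0x7C00
    if flush_denorms and result < 0x0400:
        return sign                      # flush denormal result to signed zero
    return sign | result

# 65520.0 lies exactly halfway between the fp16 maximum (65504) and 2**16.
print(hex(f32_to_f16_bits(65520.0, "rne")))
print(hex(f32_to_f16_bits(65520.0, "rtz")))
```

With these definitions, the two modes agree on exactly representable values and differ only in the last bit for inexact ones, plus the infinity versus max-finite choice on overflow, which is the difference the piglit tests appear to be sensitive to.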