[Mesa-dev] [Piglit] [PATCH] Add (un)packHalf tests which don't fail on GCN

Fri Feb 5 14:56:57 UTC 2016

On 05.02.2016 at 15:44, Marek Olšák wrote:
> On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák <maraeo at gmail.com> wrote:
>> On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner <mattst88 at gmail.com> wrote:
>>> On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>
>>>> This is a subset of the generated tests which are known to fail
>>>> on everything except CPU emulation (AFAIK).
>>>> ---
>>>
>>> This is really awful. Committing a generated test, but with unknown
>>> bits chopped out is gross.
>>>
>>> If it were me, I'd want to understand why my hardware behaved
>>> differently -- not just hack up *different* tests and claim victory.
>>>
>>> FWIW, the generated tests pass on all Intel hardware exposing
>>> ARB_shading_language_packing. Gen7+ has native half-float support, and
>>> Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
>>> built-ins into a pile of instructions.
>>>
>>> If you can identify how AMD hardware behaves differently and can prove
>>> that the generator needs to be relaxed or something, that's cool. But
>>> as is, I hate this patch.
>>>
>>> I can't find anything in the AMD docs (I looked at GCN3) about
>>> half-precision support, so I can't check my theory that AMD hardware
>>> rounds towards zero instead of to-nearest/even like Intel.
>>
>> Since the tests only fail with very small numbers, I think the problem
>> is that denorms are disabled by radeonsi. I can try to confirm that.
>>
>> The hardware rounds to nearest even. The hw precision is:
>> - unpack functions: 0 ULP
>> - pack functions: 0.5 ULP
>> - input and output denorms are flushed to 0
> 
> Hey Matt,
> 
> I have just confirmed that I was right. After I enable denormals in
> hw, the original tests pass. This means that this patch tests the
> packing functions but skips denormals.
> 
> Not so awful now, is it? :)
> 
> Sadly, I can't enable denormals on all chips, because they are slow.
> 
> So if I add a "-no-denormals" suffix to the test names, I can push this, right?
> 

Can't you hack up the generator instead? By the looks of it
(gen_builtin_packing_tests.py), it has a list of inputs which produce
denorm f16 values (make_inputs_for_pack_half_2x16). Presumably you could
add a test there which uses a different list that leaves them out.
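
Something along these lines might work as the filter. This is only a
rough sketch and not the generator's actual code: the helper names
(f16_bits, is_f16_denorm, without_f16_denorms) and the sample input list
are made up for illustration, and I'm assuming Python 3.6+ so that
struct's "e" (half-float) format is available. It rounds each input to
f16 and drops those whose result has a zero exponent field and a
non-zero mantissa, i.e. the f16 denorms.

```python
import struct


def f16_bits(x):
    """Return the 16-bit pattern of x rounded to half precision."""
    # struct's "e" format converts a Python float to IEEE 754 binary16.
    return struct.unpack("<H", struct.pack("<e", x))[0]


def is_f16_denorm(x):
    """True if x becomes a non-zero denormal when converted to f16."""
    try:
        bits = f16_bits(x)
    except OverflowError:
        # Finite values too large for f16 are certainly not denormals.
        return False
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    return exponent == 0 and mantissa != 0


def without_f16_denorms(inputs):
    """Hypothetical helper: keep only inputs that do not map to f16 denorms."""
    return [x for x in inputs if not is_f16_denorm(x)]


# Illustrative inputs only; the real generator's list is not reproduced here.
sample_inputs = [0.0, 1.0, -2.5, 2.0**-14, 2.0**-15, 2.0**-24, 65504.0]
filtered = without_f16_denorms(sample_inputs)
```

The "-no-denormals" variant of the test could then be generated from the
filtered list, while the existing tests keep using the full one.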

(That said, I'm a bit surprised that your hw doesn't handle fp16 denorms
when converting to/from fp16; they'd be required by d3d(11) as well.)

Roland