[Mesa-dev] [PATCH 1/6] gallium: document PK2H/UP2H

Sun Jan 3 18:49:15 PST 2016

Am 04.01.2016 um 02:05 schrieb Ilia Mirkin:
> On Sun, Jan 3, 2016 at 7:51 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> Am 03.01.2016 um 21:32 schrieb Ilia Mirkin:
>>> On Sun, Jan 3, 2016 at 2:15 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>> On Sun, Jan 3, 2016 at 2:08 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>>>>> For the series (with the first point addressed either way, though a tgsi
>>>>>>> exec implementation, which should be trivial, wouldn't hurt either)
>>>>>>> Reviewed-by: Roland Scheidegger <sroland at vmware.com>
>>>>>>
>>>>>> Thanks! I'll do a patch for that shortly (tgsi_exec). Unfortunately I
>>>>>> won't be able to enable the cap since it will still use gallivm by
>>>>>> default for vertices. I have a gallivm implementation as well, but it
>>>>>> hits asserts on LLVM 3.5. I'm pretty sure I tested it at one point or
>>>>>> another, but it must have been on another box with a more recent LLVM.
>>>>>
>>>>> Ah right. f16 conversion is pretty annoying indeed, though I'd hope the
>>>>> helpers for that should work. In any case, I only really suggested that
>>>>> because I'd thought it would be trivial, so if it's not I don't consider
>>>>> that important...
>>>>
>>>> I'll send it out as a separate series, including my (semi?) broken
>>>> gallivm impl and leave it to you to fix it if you care, or ignore if
>>>> you don't. (I already have it, so might as well...) I understand
>>>> neither how LLVM works, nor how gallivm uses LLVM, which isn't a great
>>>> combination :)
>>>
>>> And of course the piglits expect out-of-bounds numbers to be
>>> represented as infinities, instead of the clamped value
>>
>> This is, imho, a bug, they should allow both. Because round-towards-zero
>> when converting is allowed by GL when converting floats to half, albeit
>> round-to-nearest-even is preferred. And the former gets you the clamped
>> values.
>>
>>> which is what util_float_to_half does :(
>> Yep. The reason both the util and gallivm code do round-towards-zero is
>> that for such conversions GL allows both, but d3d10 is deeply unhappy if
>> you do round-to-nearest-even (for float to float conversions), at
>> least for the clamp vs. infinity issue. As per the data conversion
>> rules:
>> https://msdn.microsoft.com/en-us/library/windows/desktop/dd607323%28v=vs.85%29.aspx
>> Although there are no specific half-float conversion instructions in d3d10
>> (only in d3d11), render target conversions etc. must honor these rules too.
>> I suspect most hw can do both without too much fuss (x86 f16c certainly
>> can).
> 
> Take it up with people who aren't me :)
>
> http://cgit.freedesktop.org/mesa/mesa/tree/src/glsl/lower_packing_builtins.cpp#n990

Yes, imho it is actually somewhat surprising that intel gpus can't even do
the round-toward-zero behavior natively (meaning they'd most likely have
to emulate that one way or the other for the d3d10 driver).

> FWIW the f32 -> f16 opcode this maps to on nvc0 has the same
> behaviour. Now it also has rounding mode flags which I don't set and
> perhaps one of them would yield the behaviour that you're talking
> about, but I don't know offhand how to get it. Curiously from the PTX
> ISA docs: "Conversions to floating-point that are beyond the range of
> floating-point numbers are represented with the maximum floating-point
> value (IEEE 754 Inf for f32 and f64, and ~131,000 for f16)."
Yes that's somewhat odd. imho if you set round-towards-zero you should
get the maxf value. With round-to-nearest(-even) you should get the
infinities. This is per standard ieee754 rules. Getting something like
round-to-nearest but the overflowing values still clamped to maxf (or
vice versa) doesn't really make all that much sense, if that is what's
somehow implied by this paragraph.
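
To make the overflow difference concrete, here is a minimal sketch in
Python. It is not the util_float_to_half or gallivm code; the function
name and the "rtz"/"rne" mode strings are made up for illustration. It
rounds a value onto the half-float grid and then applies the ieee754
overflow rule for the chosen mode.

```python
import math

HALF_MAX = 65504.0  # largest finite half-float value


def float_to_half_value(x, mode):
    """Return the half-float value x converts to, as a Python float.

    mode is "rtz" (round-toward-zero) or "rne" (round-to-nearest-even).
    Hypothetical helper for illustration only.
    """
    if mode not in ("rtz", "rne"):
        raise ValueError("mode must be 'rtz' or 'rne'")
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x
    a = abs(x)
    _, e = math.frexp(a)            # a = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, -14)           # clamp to the half subnormal exponent
    ulp = math.ldexp(1.0, exp - 10)  # spacing of half values near a
    q = a / ulp                     # exact: ulp is a power of two
    if mode == "rtz":
        n = math.floor(q)           # truncate the magnitude
    else:
        n = round(q)                # Python rounds ties to even
    result = n * ulp
    if result > HALF_MAX:
        # Overflow: rtz clamps to maxf, rne produces infinity.
        result = HALF_MAX if mode == "rtz" else math.inf
    return math.copysign(result, x)


print(float_to_half_value(1.0e5, "rtz"))
print(float_to_half_value(1.0e5, "rne"))
```

With this sketch an out-of-range input such as 1.0e5 comes out as
65504.0 under "rtz" and as inf under "rne", which is the clamp vs.
infinity difference the piglit tests trip over.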

> 
> If you get the piglit tests changed, I guess I'll poke around.

Hmm, that's quite a lot of python code, so I probably don't have time to
dig into that. What I can tell, though, is that the rounding mode
functions inside gen_builtin_packing_tests.py are (to me) somewhat
confusingly named: "round to nearest" and "round to even". Both are round
to nearest; one is just "round to nearest_even", the other is really
"round to nearest_away_from_zero".
But really, glsl just says "The rounding mode cannot be set and is
undefined." And that is true for ALL operations.
The section also says, though, for "implicit and explicit conversions
between types": "Correctly rounded". It is possible to interpret that as
meaning that while the rounding mode is undefined, it must be consistent
for all operations (in which case it would indeed not be legal to do
ordinary arithmetic with round-to-nearest but packHalf2x16 with
round-toward-zero). If that's true, we need some way to distinguish
between the two possible float->half conversions in gallium, which
sounds like quite a big mess (util functions can get triggered from
several places, and it's totally unclear how that information should be
propagated through, same for shaders where we'd need either two
functions or some other way to annotate the shader to account for such
differences).
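
As for the two "round to nearest" variants mentioned above, they only
differ on exact ties. A small sketch in Python, assuming inputs in the
normal half range; the function name and the "even"/"away" labels are
hypothetical and are not the names used in gen_builtin_packing_tests.py:

```python
import math


def round_to_half_grid(x, ties):
    """Round a finite x in the normal half range to the nearest half value.

    ties is "even" (nearest, ties to even mantissa) or "away"
    (nearest, ties away from zero). Illustration only.
    """
    a = abs(x)
    _, e = math.frexp(a)                 # a = m * 2**e with 0.5 <= m < 1
    ulp = math.ldexp(1.0, (e - 1) - 10)  # half has 10 mantissa bits
    q = a / ulp                          # exact: ulp is a power of two
    if ties == "even":
        n = round(q)                     # Python rounds ties to even
    elif ties == "away":
        n = math.floor(q + 0.5)          # ties go up in magnitude
    else:
        raise ValueError("ties must be 'even' or 'away'")
    return math.copysign(n * ulp, x)


tie = 1.0 + 2.0 ** -11   # exactly halfway between 1.0 and 1.0 + 2**-10
print(round_to_half_grid(tie, "even"))
print(round_to_half_grid(tie, "away"))
```

For the tie value above, "even" gives 1.0 and "away" gives 1.0 + 2**-10;
for any input that is not exactly halfway between two half values both
variants agree.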

Roland