[Mesa-dev] [PATCH 2/2] gallium: Disambiguate TGSI_OPCODE_IF.

Sun Apr 14 15:30:16 PDT 2013

On Sun, Apr 14, 2013 at 6:04 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> On 14.04.2013 23:44, Alex Deucher wrote:
>> On Sun, Apr 14, 2013 at 2:36 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>> The R600 ISA documentation only says that the DX10 variants of MIN and MAX
>>> use DX10 handling of NaNs. It does not say anything about the non-DX10
>>> variants.
>>
>> The difference is the NaN behavior.  The dx10 versions of MIN/MAX are
>> NaN safe.
> Yes, but what does that mean for the non-dx10 versions? What do they
> return in case one argument is a NaN? Obviously it can't just be random,
> otherwise you could always use the dx10 version...

I can't seem to find any finer details at the moment, but apparently
DX9 and DX10 have different rules for NaN propagation in min, max, and
clamping, and the opcodes implement those differences.  It looks like
DX10 (like IEEE) requires that NaN always be propagated while DX9 does
not.  I suppose the non-DX10 version does whatever is expected for NaN
on DX9.

Alex

>
> Roland
>
>
>
>> There are also DX10 and non-DX10 versions of the SET*
>> opcodes.  The difference there is in the result:
>>
>> SETE        A == B ? 1.0 : 0.0
>> SETE_DX10   A == B ?  -1 : 0
>> etc.
>>
>> Alex
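The two result conventions listed above can be sketched in plain C. This is only an illustration: `sete`, `sete_dx10`, and `float_bits` are hypothetical helper names, not R600 or Mesa API, and the sketch assumes a 32-bit IEEE 754 `float`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers that only model the two result conventions. */

/* SETE-style: the result is a float, 1.0f or 0.0f. */
static float sete(float a, float b)
{
    return a == b ? 1.0f : 0.0f;
}

/* SETE_DX10-style: the result is an integer mask, -1 (all bits set) or 0. */
static int32_t sete_dx10(float a, float b)
{
    return a == b ? -1 : 0;
}

/* Reinterpret a float's bits as an unsigned integer (assumes 32-bit float). */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

The "true" results differ at the bit level: `float_bits(sete(x, x))` is 0x3F800000 for a non-NaN `x`, while `sete_dx10(x, x)` is 0xFFFFFFFF when viewed as unsigned, which is why a consumer has to know which convention produced its input.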
>>
>>>
>>> Marek
>>>
>>>
>>> On Sun, Apr 14, 2013 at 8:16 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>>
>>>> On 14.04.2013 18:39, Marek Olšák wrote:
>>>>> On Sun, Apr 14, 2013 at 5:24 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>>>>
>>>>>     On 14.04.2013 10:12, jfonseca at vmware.com wrote:
>>>>>     > -  TBD
>>>>>     > +  Start an IF ... ELSE .. ENDIF block.  Condition evaluates to true if
>>>>>     > +
>>>>>     > +    src0.x != 0.0
>>>>>     > +
>>>>>     > +  where src0.x is interpreted as a floating point register.
>>>>>     Maybe should say something wrt evaluation of NaNs? I know we haven't
>>>>>     really established rules for comparisons etc. wrt NaNs but those
>>>>>     bools-as-float make me cry. I guess it is no different though than
>>>>>     other float opcodes, if we now really have a definition saying IF
>>>>>     takes _any_ float not just a bool-as-float which was loosely implied
>>>>>     before.
>>>>>
>>>>>
>>>>> I don't know where the term "bool-as-float" came from, but I'd rather
>>>>> not use it unless it's properly defined somewhere, and TGSI doesn't have
>>>>> bools anyway, so why bother? The GLSL compiler or glsl-to-tgsi is
>>>>> responsible for converting bools to either floats or ints and TGSI
>>>>> shouldn't need to care. Both r300g and r600g use (src0.x != 0.0) for IF
>>>>> and (src0.x != 0) for UIF (r600-only), so there is always the
>>>>> "not-equal-to" operator, which is also well defined for NaNs.
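The two conditions mentioned above, (src0.x != 0.0) for IF and (src0.x != 0) for UIF, can be modelled in C as follows. This is a sketch under stated assumptions: `if_cond` and `uif_cond` are hypothetical names, the register is modelled as a raw 32-bit value, and `float` is assumed to be IEEE 754 binary32 with no flush-to-zero mode in effect.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* IF-style condition: interpret the register bits as a float and
 * compare against 0.0 (hypothetical helper, not driver code). */
static bool if_cond(uint32_t src0_x)
{
    float f;
    memcpy(&f, &src0_x, sizeof f);
    return f != 0.0f;
}

/* UIF-style condition: interpret the register bits as an integer and
 * compare against 0 (hypothetical helper, not driver code). */
static bool uif_cond(uint32_t src0_x)
{
    return src0_x != 0;
}
```

The two agree on most inputs but not all. Negative zero (bits 0x80000000) compares equal to 0.0 as a float, so `if_cond` is false while `uif_cond` is true. A NaN (for example bits 0x7FC00000) is "not equal" to 0.0 under C's `!=`, so both conditions are true.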
>>>> That depends on your definition of "well defined". llvm for instance has
>>>> both "ordered not equal" and "unordered not equal" operators for
>>>> precisely this reason. But yes I guess ieee-754 has some defined
>>>> behavior there.
>>>> That "bool-as-float" essentially comes from state trackers, because the
>>>> languages they are translating from require bools as "if" inputs - hence
>>>> the input value always should have been the result of some comparison
>>>> (or similar) operation (which in turn returns these fake bools).
>>>> But I agree this was never really documented, so just clearly stating
>>>> you can pass in any float is just fine. It means that state trackers are
>>>> now explicitly allowed to omit the comparison for simple cases like this
>>>> one, "if(a != 0)...", if they can detect it; without documentation it
>>>> was not really obvious before whether that would be ok. So in that
>>>> sense nothing more needs to be said about NaNs, since they just adhere
>>>> to the same rules as in other places (meaning pretty much undefined for
>>>> most things, currently).
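The ordered/unordered distinction mentioned above (LLVM spells the two predicates `fcmp one` and `fcmp une`) can be reproduced in C99. The helper names `une` and `one` are made up for this sketch; `islessgreater` is the standard `<math.h>` macro.

```c
#include <math.h>
#include <stdbool.h>

/* Unordered not-equal: true if the operands differ OR either is NaN.
 * This is what C's built-in != operator does. */
static bool une(float a, float b)
{
    return a != b;
}

/* Ordered not-equal: true only if neither operand is NaN and they differ.
 * islessgreater() is the C99 macro with exactly this meaning. */
static bool one(float a, float b)
{
    return islessgreater(a, b);
}
```

The two only disagree when a NaN is involved: `une(NAN, 0.0f)` is true and `one(NAN, 0.0f)` is false. Under the (src0.x != 0.0) definition read as C's `!=`, a NaN condition would therefore take the IF branch.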
>>>>
>>>>>
>>>>> Also if you care about NaNs, we should start by defining how
>>>>> instructions should handle them, e.g. how relational operators handle
>>>>> NaNs, whether the multiplication operator follows the rule 0*anything =
>>>>> 0 (MUL, MAD, DP4, ...), etc.
>>>>>
>>>>> R600 has separate opcodes depending on what behavior you want, for
>>>>> example:
>>>>> - The MUL opcode follows the rule 0*anything = 0. (DX9)
>>>>> - The MUL_IEEE opcode follows the IEEE behavior.
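The difference between the two multiplication rules listed above can be sketched in C. `mul_dx9` and `mul_ieee` are hypothetical names; returning +0.0 regardless of operand signs in the DX9-style variant is an assumption of this sketch, since the thread does not say what sign the hardware produces.

```c
#include <math.h>

/* DX9-style rule from the list above: 0 * anything = 0, even when the
 * other operand is Inf or NaN. The sign of the zero result is an
 * assumption of this sketch. */
static float mul_dx9(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}

/* IEEE-style rule: plain IEEE 754 multiplication, so 0 * Inf and
 * 0 * NaN both give NaN. */
static float mul_ieee(float a, float b)
{
    return a * b;
}
```

For ordinary finite inputs both give the same result; they diverge only for inputs such as `(0.0f, INFINITY)` or `(0.0f, NAN)`, where `mul_dx9` returns 0 and `mul_ieee` returns NaN.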
>>>>>
>>>>> The other opcodes with both the DX9 and IEEE behavior are: MAD, DP4,
>>>>> EX2, LG2, RCP, RSQ. There are also separate MIN and MAX opcodes for DX9
>>>>> and DX10. We should choose our opcodes carefully depending on whether we
>>>>> are implementing a DX9, DX10, OpenGL, or OpenCL state tracker.
>>>>
>>>> Yes indeed. d3d10 has quite strict rules which are mostly ieee754 (or
>>>> ieee754r) but with some deviations. Other specs tend to be more lenient,
>>>> and requiring strict rules could add quite some overhead, so we might
>>>> want to introduce additional opcodes. How does MIN/MAX work for dx9 btw?
>>>> DX10 will require you to give back the non-NaN value if only one
>>>> argument is NaN (which seems to be ieee754r behavior). Unfortunately
>>>> that doesn't translate well to sse2 code: sse2 will just give you the
>>>> second source if there's a NaN in either src, so you have to use
>>>> cmp/select instead. You also have to be careful about which comparison
>>>> you use there, since the cpu doesn't support the full set of "ordered"
>>>> and "unordered" comparisons unless you've got avx, though presumably
>>>> llvm would take care of that if you use the right comparison ops.
>>>>
>>>> Roland
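The two MIN behaviors Roland contrasts above can be written out in C. This is a sketch, not driver code: `min_sse_like` models the "second source wins if either operand is NaN" behavior as a single compare/select, and `min_dx10_like` models the "return the non-NaN operand" rule with explicit NaN checks. Both names are made up for illustration.

```c
#include <math.h>

/* SSE-like min: a single ordered compare followed by a select. Any
 * comparison involving NaN is false, so the second source is returned
 * whenever either operand is NaN. */
static float min_sse_like(float a, float b)
{
    return a < b ? a : b;
}

/* DX10-like min as described in the thread: if exactly one operand is
 * NaN, return the other one. NaN is returned only if both are NaN. */
static float min_dx10_like(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a < b ? a : b;
}
```

The SSE-like version is not symmetric: `min_sse_like(NAN, 1.0f)` is 1.0 but `min_sse_like(1.0f, NAN)` is NaN, whereas the DX10-like version returns 1.0 in both cases. C99's `fminf` also returns the non-NaN operand when exactly one argument is NaN.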