[Mesa-dev] [PATCH 2/2] gallium: Desambiguate TGSI_OPCODE_IF.

Sun Apr 14 15:04:35 PDT 2013

On 14.04.2013 23:44, Alex Deucher wrote:
> On Sun, Apr 14, 2013 at 2:36 PM, Marek Olšák <maraeo at gmail.com> wrote:
>> The R600 ISA documentation only says that the DX10 variants of MIN and MAX
>> use DX10 handling of NaNs. It does not say anything about the non-DX10
>> variants.
> 
> The difference is the NaN behavior.  The dx10 versions of MIN/MAX are
> NaN safe.
Yes, but what does that mean for the non-DX10 versions? What do they return
when one argument is a NaN? Obviously the result can't just be arbitrary,
otherwise you could always use the DX10 version...

Roland


> There are also DX10 and non-DX10 versions of the SET*
> opcodes.  The difference there is in the result:
> 
> SETE             A == B ? 1.0 : 0.0
> SETE_DX10   A == B ?   -1 : 0
> etc.
> 
> Alex
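
To make the difference between the two result conventions concrete, here is a minimal C sketch of the semantics Alex describes. The function names are hypothetical and only model the table above; they are not taken from Mesa or from the R600 ISA documentation.

```c
#include <stdint.h>

/* Hypothetical model of SETE: float result, 1.0 for true, 0.0 for false. */
static float sete(float a, float b)
{
    return (a == b) ? 1.0f : 0.0f;
}

/* Hypothetical model of SETE_DX10: integer result, -1 (all bits set)
 * for true, 0 for false. */
static int32_t sete_dx10(float a, float b)
{
    return (a == b) ? -1 : 0;
}
```

The integer form can be used directly as a bit mask, while the float form can be fed into float arithmetic.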
> 
>>
>> Marek
>>
>>
>> On Sun, Apr 14, 2013 at 8:16 PM, Roland Scheidegger <sroland at vmware.com>
>> wrote:
>>>
>>> On 14.04.2013 18:39, Marek Olšák wrote:
>>>> On Sun, Apr 14, 2013 at 5:24 PM, Roland Scheidegger
>>>> <sroland at vmware.com> wrote:
>>>>
>>>>     On 14.04.2013 10:12, jfonseca at vmware.com wrote:
>>>>     > -  TBD
>>>>     > +  Start an IF ... ELSE .. ENDIF block.  Condition evaluates to true if
>>>>     > +
>>>>     > +    src0.x != 0.0
>>>>     > +
>>>>     > +  where src0.x is interpreted as a floating point register.
>>>>     Maybe this should say something about the evaluation of NaNs? I know
>>>>     we haven't really established rules for comparisons etc. with respect
>>>>     to NaNs, but those bools-as-float make me cry. I guess it is no
>>>>     different from other float opcodes, though, if we now really have a
>>>>     definition saying IF takes _any_ float, not just a bool-as-float,
>>>>     which was loosely implied before.
>>>>
>>>>
>>>> I don't know where the term "bool-as-float" came from, but I'd rather
>>>> not use it unless it's properly defined somewhere, and TGSI doesn't have
>>>> bools anyway, so why bother? The GLSL compiler or glsl-to-tgsi is
>>>> responsible for converting bools to either floats or ints and TGSI
>>>> shouldn't need to care. Both r300g and r600g use (src0.x != 0.0) for IF
>>>> and (src0.x != 0) for UIF (r600-only), so there is always the
>>>> "not-equal-to" operator, which is also well defined for NaNs.
>>> That depends on your definition of "well defined". llvm for instance has
>>> both "ordered not equal" and "unordered not equal" operators for
>>> precisely this reason. But yes I guess ieee-754 has some defined
>>> behavior there.
>>> That "bool-as-float" essentially comes from state trackers, because the
>>> languages they are translating from require bools as "if" inputs. Hence
>>> the input value should always have been the result of some comparison
>>> (or similar) operation, which in turn returns these fake bools.
>>> But I agree this was never really documented, so clearly stating that
>>> you can pass in any float is just fine. It means that state trackers are
>>> now explicitly allowed to omit the comparison for simple cases like
>>> "if(a != 0)...", if they can detect them; without documentation it was
>>> not really obvious whether that would be ok. So in that sense nothing
>>> more needs to be said about NaNs, since they just adhere to the same
>>> rules as in other places (meaning pretty much undefined for most things,
>>> currently).
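
The "ordered" versus "unordered" not-equal distinction mentioned above can be shown with a small C sketch. It assumes IEEE-754 float semantics (no fast-math style options), and the helper names are hypothetical. C's `!=` behaves like an unordered not-equal (true if either operand is NaN), while `islessgreater` from `<math.h>` behaves like an ordered not-equal (false if either operand is NaN).

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: IF condition as "unordered not equal".
 * A NaN input takes the IF branch. */
static bool if_cond_unordered_ne(float x)
{
    return x != 0.0f;
}

/* Hypothetical helper: IF condition as "ordered not equal".
 * A NaN input does not take the IF branch. */
static bool if_cond_ordered_ne(float x)
{
    return islessgreater(x, 0.0f);
}
```

The two helpers agree for every non-NaN input, including -0.0, and differ only when the input is NaN.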
>>>
>>>>
>>>> Also if you care about NaNs, we should start by defining how
>>>> instructions should handle them, e.g. how relational operators handle
>>>> NaNs, whether the multiplication operator follows the rule 0*anything =
>>>> 0 (MUL, MAD, DP4, ...), etc.
>>>>
>>>> R600 has separate opcodes depending on what behavior you want, for
>>>> example:
>>>> - The MUL opcode follows the rule 0*anything = 0. (DX9)
>>>> - The MUL_IEEE opcode follows the IEEE behavior.
>>>>
>>>> The other opcodes with both the DX9 and IEEE behavior are: MAD, DP4,
>>>> EX2, LG2, RCP, RSQ. There are also separate MIN and MAX opcodes for DX9
>>>> and DX10. We should choose our opcodes carefully depending on whether we
>>>> are implementing a DX9, DX10, OpenGL, or OpenCL state tracker.
>>>
>>> Yes indeed. d3d10 has quite strict rules which are mostly ieee754 (or
>>> ieee754r) but with some deviations. Other specs tend to be more lenient,
>>> and requiring strict rules could add quite some overhead, so we might
>>> want to introduce additional opcodes. How does MIN/MAX work for dx9 btw?
>>> DX10 requires you to return the non-NaN value if only one argument is
>>> NaN (which seems to be ieee754r behavior). Unfortunately that doesn't
>>> translate well to sse2 code: sse2 will just give you the second source
>>> if there's a NaN in either src, which means you have to use cmp/select
>>> instead. You also have to be careful about which comparison you use
>>> there, since the cpu doesn't support the full set of "ordered" and
>>> "unordered" comparisons unless you've got avx, though presumably llvm
>>> would take care of that if you use the right comparison ops.
>>>
>>> Roland
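
The MIN mismatch described above can be sketched in scalar C. Both functions are hypothetical models, assuming IEEE-754 floats: the first follows the DX10 rule as stated in this thread, the second mimics the "return the second source if either operand is NaN" behavior attributed to sse2.

```c
#include <math.h>

/* Hypothetical model of the DX10 rule: if exactly one operand is NaN,
 * return the other one. */
static float min_dx10(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return (a < b) ? a : b;
}

/* Hypothetical scalar model of the sse2-style min: the comparison is
 * false whenever a NaN is involved, so the second source is returned. */
static float min_sse2_like(float a, float b)
{
    return (a < b) ? a : b;
}
```

With a NaN in the first operand both models return the second operand, so the results only diverge when the NaN is in the second operand. That asymmetry is why an explicit compare and select is needed.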