<div dir="ltr"><div>The R600 ISA documentation only says that the DX10 variants of MIN and MAX use DX10 handling of NaNs. It does not say anything about the non-DX10 variants. </div>Marek </div><div class="gmail_extra"> <div class="gmail_quote">On Sun, Apr 14, 2013 at 8:16 PM, Roland Scheidegger <<a href="mailto:sroland@vmware.com" target="_blank">sroland@vmware.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Am 14.04.2013 18:39, schrieb Marek Olšák: <div class="im">> On Sun, Apr 14, 2013 at 5:24 PM, Roland Scheidegger <<a href="mailto:sroland@vmware.com">sroland@vmware.com</a> </div><div class="im">> <mailto:<a href="mailto:sroland@vmware.com">sroland@vmware.com</a>>> wrote: > > Am 14.04.2013 10:12, schrieb <a href="mailto:jfonseca@vmware.com">jfonseca@vmware.com</a> </div>> <mailto:<a href="mailto:jfonseca@vmware.com">jfonseca@vmware.com</a>>:> - TBD <div class="im">> > + Start an IF ... ELSE .. ENDIF block. Condition evaluates to > true if > > + > > + src0.x != 0.0 > > + > > + where src0.x is interpreted as a floating point register. > Maybe should say something wrt evaluation of NaNs? I know we haven't > really established rules for comparisons etc. wrt NaNs but those > bools-as-float make me cry. I guess it is no different though than other > float opcodes, if we now really have a definition saying IF takes _any_ > float not just a bool-as-float which was loosely implied before. > > > I don't know where the term "bool-as-float" came from, but I'd rather > not use it unless it's properly defined somewhere, and TGSI doesn't have > bools anyway, so why bother? The GLSL compiler or glsl-to-tgsi is > responsible for converting bools to either floats or ints and TGSI > shouldn't need to care. Both r300g and r600g use (src0.x != 0.0) for IF > and (src0.x != 0) for UIF (r600-only), so there is always the > "not-equal-to" operator, which is also well defined for NaNs. 
</div>That depends on your definition of "well defined". LLVM, for instance, has both "ordered not equal" and "unordered not equal" operators for precisely this reason. But yes, I guess IEEE 754 has some defined behavior there. That "bool-as-float" essentially comes from state trackers, because the languages they are translating from require bools as "if" inputs. Hence the input value should always have been the result of some comparison (or similar) operation, which in turn returns these fake bools. But I agree this was never really documented, so clearly stating that you can pass in any float is just fine. It means that state trackers are now explicitly allowed to omit the comparison for simple cases like this one, "if(a != 0)...", if they can detect it; without documentation it was not really obvious before whether that would be ok. So in that sense nothing more needs to be said about NaNs, since they just adhere to the same rules as in other places (meaning pretty much undefined for most things, currently). <div class="im"> > > Also if you care about NaNs, we should start by defining how > instructions should handle them, e.g. how relational operators handle > NaNs, whether the multiplication operator follows the rule 0*anything = > 0 (MUL, MAD, DP4, ...), etc. > > R600 have separate opcodes depending on what behavior you want, for example: > - The MUL opcode follows the rule 0*anything = 0. (DX9) > - The MUL_IEEE opcode follows the IEEE behavior. > > The other opcodes with both the DX9 and IEEE behavior are: MAD, DP4, > EX2, LG2, RCP, RSQ. There are also separate MIN and MAX opcodes for DX9 > and DX10. We should choose our opcodes carefully depending on whether we > are implementing a DX9, DX10, OpenGL, or OpenCL state tracker. </div>Yes indeed. D3D10 has quite strict rules, which are mostly IEEE 754 (or IEEE 754r) but with some deviations. 
Other specs tend to be more lenient, and requiring strict rules could add quite some overhead, so we might want to introduce additional opcodes. How does MIN/MAX work for DX9, by the way? DX10 requires you to return the non-NaN value if only one argument is NaN (which seems to be IEEE 754r behavior). Unfortunately that doesn't translate well to SSE2 code, for instance: SSE2 will just give you the second source if there's a NaN in either source, which means you have to use cmp/select instead. You also have to be careful about which comparison you use there, since the CPU doesn't support the full set of "ordered" and "unordered" comparisons unless you've got AVX, though presumably LLVM would take care of that if you use the right comparison ops. Roland </blockquote></div> </div>
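<div>For reference, the NaN behaviors discussed in this thread can be sketched in plain C. This is only an illustration under stated assumptions, not the R600, TGSI, or LLVM implementation. All function names are hypothetical. <code>mul_legacy</code> assumes the "0*anything = 0" rule yields +0.0 (the thread does not specify the sign), and <code>min_sse</code> models the scalar SSE min as "return the second source unless a &lt; b".</div>

```c
#include <math.h>
#include <stdbool.h>

/* IF condition as described in the thread: src0.x != 0.0.
 * C's != on floats is true when either operand is NaN
 * (an "unordered not equal"), so a NaN input takes the IF branch. */
static bool if_cond(float x)
{
    return x != 0.0f;
}

/* IEEE-style multiply: 0 * inf and 0 * NaN both produce NaN. */
static float mul_ieee(float a, float b)
{
    return a * b;
}

/* Legacy (DX9-style) multiply following the rule 0 * anything = 0.
 * Assumption: the result is +0.0 regardless of operand signs. */
static float mul_legacy(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}

/* DX10-style min: if only one operand is NaN, return the other one.
 * If both are NaN, the result is NaN. */
static float min_dx10(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a < b ? a : b;
}

/* SSE-style min: the second source is returned whenever a < b is
 * false, which includes every case where either operand is NaN. */
static float min_sse(float a, float b)
{
    return a < b ? a : b;
}
```

<div>With these definitions, <code>min_dx10(1.0f, NAN)</code> returns 1.0, while <code>min_sse(1.0f, NAN)</code> returns NaN, which is why an explicit compare/select sequence is needed to get the DX10 result on SSE2.</div>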