<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jan 11, 2017 at 1:12 PM, Erik Faye-Lund <span dir="ltr"><<a href="mailto:kusmabite@gmail.com" target="_blank">kusmabite@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Wed, Jan 11, 2017 at 9:49 PM, Erik Faye-Lund <<a href="mailto:kusmabite@gmail.com">kusmabite@gmail.com</a>> wrote:<br>
> On Wed, Jan 11, 2017 at 9:42 PM, Erik Faye-Lund <<a href="mailto:kusmabite@gmail.com">kusmabite@gmail.com</a>> wrote:<br>
>> On Wed, Jan 11, 2017 at 9:22 PM, Samuel Pitoiset<br>
>> <<a href="mailto:samuel.pitoiset@gmail.com">samuel.pitoiset@gmail.com</a>> wrote:<br>
>>><br>
>>><br>
>>> On 01/11/2017 07:34 PM, Erik Faye-Lund wrote:<br>
>>>><br>
>>>> On Wed, Jan 11, 2017 at 7:33 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br>
>>>> wrote:<br>
>>>>><br>
>>>>> On Wed, Jan 11, 2017 at 10:31 AM, Erik Faye-Lund <<a href="mailto:kusmabite@gmail.com">kusmabite@gmail.com</a>><br>
>>>>> wrote:<br>
>>>>>><br>
>>>>>><br>
>>>>>> On Wed, Jan 11, 2017 at 7:22 PM, Marek Olšák <<a href="mailto:maraeo@gmail.com">maraeo@gmail.com</a>> wrote:<br>
>>>>>>><br>
>>>>>>> On Wed, Jan 11, 2017 at 7:09 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br>
>>>>>>> wrote:<br>
>>>>>>>><br>
>>>>>>>> On Wed, Jan 11, 2017 at 9:32 AM, Samuel Pitoiset<br>
>>>>>>>> <<a href="mailto:samuel.pitoiset@gmail.com">samuel.pitoiset@gmail.com</a>><br>
>>>>>>>> wrote:<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> On 01/11/2017 05:32 PM, Marek Olšák wrote:<br>
>>>>>>>>>><br>
>>>>>>>>>><br>
>>>>>>>>>> On Wed, Jan 11, 2017 at 4:33 PM, Erik Faye-Lund<br>
>>>>>>>>>> <<a href="mailto:kusmabite@gmail.com">kusmabite@gmail.com</a>><br>
>>>>>>>>>> wrote:<br>
>>>>>>>>>>><br>
>>>>>>>>>>><br>
>>>>>>>>>>> On Wed, Jan 11, 2017 at 4:14 PM, Nicolai Hähnle<br>
>>>>>>>>>>> <<a href="mailto:nhaehnle@gmail.com">nhaehnle@gmail.com</a>><br>
>>>>>>>>>>> wrote:<br>
>>>>>>>>>>>><br>
>>>>>>>>>>>><br>
>>>>>>>>>>>> On 11.01.2017 13:17, Marek Olšák wrote:<br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>> On Tue, Jan 10, 2017 at 6:48 PM, Jason Ekstrand<br>
>>>>>>>>>>>>> <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br>
>>>>>>>>>>>>> wrote:<br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>> I'll be honest, I'm not a fan... Given that D3D10 has one<br>
>>>>>>>>>>>>>> defined<br>
>>>>>>>>>>>>>> behavior,<br>
>>>>>>>>>>>>>> D3D9 has another, and GL doesn't specify, I don't really think<br>
>>>>>>>>>>>>>> we<br>
>>>>>>>>>>>>>> should<br>
>>>>>>>>>>>>>> be<br>
>>>>>>>>>>>>>> making a global change to all drivers to do the D3D9 behavior<br>
>>>>>>>>>>>>>> just to<br>
>>>>>>>>>>>>>> fix<br>
>>>>>>>>>>>>>> one app. Sure, other apps probably have the same bug, but are<br>
>>>>>>>>>>>>>> we<br>
>>>>>>>>>>>>>> going<br>
>>>>>>>>>>>>>> to<br>
>>>>>>>>>>>>>> have apps that expect the D3D10 behavior that we've now<br>
>>>>>>>>>>>>>> explicitly<br>
>>>>>>>>>>>>>> made<br>
>>>>>>>>>>>>>> not<br>
>>>>>>>>>>>>>> work?<br>
>>>>>>>>>>>>>><br>
>>>>>>>>>>>>>> If we're going to hack around an app bug, I would really rather<br>
>>>>>>>>>>>>>> see<br>
>>>>>>>>>>>>>> it<br>
>>>>>>>>>>>>>> behind a driconf option rather than a global change to driver<br>
>>>>>>>>>>>>>> behavior.<br>
>>>>>>>>>>>>>> Even better, it'd be cool if we could see the app get fixed.<br>
>>>>>>>>>>>>>> (Yes, I<br>
>>>>>>>>>>>>>> know<br>
>>>>>>>>>>>>>> that's not likely).<br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>> I think we are not in a position to refuse this workaround, or<br>
>>>>>>>>>>>>> put<br>
>>>>>>>>>>>>> more precisely, to have a different behavior from everybody else.<br>
>>>>>>>>>>>>> By<br>
>>>>>>>>>>>>> "we", I mean i965, radeonsi, svga. All closed drivers use abs.<br>
>>>>>>>>>>>>> Many<br>
>>>>>>>>>>>>> Mesa drivers also use abs internally (r300, r600, nv30,<br>
>>>>>>>>>>>>> nv50/nvc0).<br>
>>>>>>>>>>>>> This is not really a workaround for a specific application, even<br>
>>>>>>>>>>>>> though it's strongly motivated by that. It's a fix to align the<br>
>>>>>>>>>>>>> few<br>
>>>>>>>>>>>>> remaining drivers with all others.<br>
>>>>>>>>>>>>><br>
>>>>>>>>>>>>> We talked with the publisher about this a very long time ago.<br>
>>>>>>>>>>>>> While I<br>
>>>>>>>>>>>>> don't remember the details (Nicolai?), I think they refused to<br>
>>>>>>>>>>>>> fix<br>
>>>>>>>>>>>>> it<br>
>>>>>>>>>>>>> because radeonsi appeared to be the only driver not doing abs.<br>
>>>>>>>>>>>><br>
>>>>>>>>>>>><br>
>>>>>>>>>>>><br>
>>>>>>>>>>>><br>
>>>>>>>>>>>> If I remember correctly, it wasn't so much a refusal as a lack of<br>
>>>>>>>>>>>> follow-through. They even had an option in their framework to add<br>
>>>>>>>>>>>> the<br>
>>>>>>>>>>>> abs(...) when translating shaders, but somehow didn't turn it on<br>
>>>>>>>>>>>> unconditionally for some reason...<br>
>>>>>>>>>>><br>
>>>>>>>>>>><br>
>>>>>>>>>>><br>
>>>>>>>>>>> VP even says so here:<br>
>>>>>>>>>>> <a href="https://github.com/virtual-programming/specops-linux/issues/20" rel="noreferrer" target="_blank">https://github.com/virtual-<wbr>programming/specops-linux/<wbr>issues/20</a><br>
>>>>>>>>>>><br>
>>>>>>>>>>> They recommend against patching mesa to do abs, though.<br>
>>>>>>>>>><br>
>>>>>>>>>><br>
>>>>>>>>>><br>
>>>>>>>>>> We should still patch Mesa to align the behavior with closed drivers<br>
>>>>>>>>>> and gallium drivers like r600g and nouveau. In other words, it's too<br>
>>>>>>>>>> late to tell us not to patch Mesa, because r600g and nouveau have<br>
>>>>>>>>>> been<br>
>>>>>>>>>> "patched" since the beginning.<br>
>>>>>>>>>><br>
>>>>>>>>>> We only need to decide whether we should do it in the GLSL compiler<br>
>>>>>>>>>> or<br>
>>>>>>>>>> radeonsi, i.e. whether we should exclude i965 and svga.<br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>><br>
>>>>>>>>> I do agree with that.<br>
>>>>>>>><br>
>>>>>>>><br>
>>>>>>>><br>
>>>>>>>> I tend to disagree but I've come to the conclusion that I won't stand<br>
>>>>>>>> in the<br>
>>>>>>>> way either. If both of the other desktop vendors do it and we've<br>
>>>>>>>> already<br>
>>>>>>>> decided that no implementation we care about will have its performance<br>
>>>>>>>> impacted, it seems like a valid spec-compliant thing to do. I would<br>
>>>>>>>> prefer<br>
>>>>>>>> it to be behind a driconf option, but if it's unconditional, oh well.<br>
>>>>>>>> My<br>
>>>>>>>> disagreement is mostly philosophical.<br>
>>>>>>>><br>
>>>>>>>> Over the last two years of working on Vulkan, I've been fighting<br>
>>>>>>>> broken<br>
>>>>>>>> tests and apps left and right. Vulkan has a huge amount of area<br>
>>>>>>>> where,<br>
>>>>>>>> if<br>
>>>>>>>> an app does something wrong, they get undefined behavior which is up<br>
>>>>>>>> to<br>
>>>>>>>> and<br>
>>>>>>>> including program termination. And basically all apps are broken in<br>
>>>>>>>> some<br>
>>>>>>>> way. Fortunately, the validation layers are finally starting to catch<br>
>>>>>>>> up to<br>
>>>>>>>> the point where I'm noticing very few bugs that the validation layers<br>
>>>>>>>> don't<br>
>>>>>>>> catch and things are getting into a better state. However, I've had<br>
>>>>>>>> more<br>
>>>>>>>> discussions than I can count with people where I have to explain to<br>
>>>>>>>> them<br>
>>>>>>>> that "No, the app is broken. It needs to be fixed. It's not my job<br>
>>>>>>>> to<br>
>>>>>>>> make<br>
>>>>>>>> it work." Once you start allowing brokenness, you can never stop<br>
>>>>>>>> allowing<br>
>>>>>>>> it and you paint yourself into a corner. Suddenly, you go to make a<br>
>>>>>>>> change,<br>
>>>>>>>> and your design decisions are not guided by the spec, they're guided<br>
>>>>>>>> by<br>
>>>>>>>> the<br>
>>>>>>>> spec *and* all of the broken apps that you have to keep working on<br>
>>>>>>>> your<br>
>>>>>>>> driver because you let something through.<br>
>>>>>>>><br>
>>>>>>>> In the world of GLES and OpenGL conformance, we fight the same fight.<br>
>>>>>>>> When<br>
>>>>>>>> people ask me how conformance is coming, I frequently answer with,<br>
>>>>>>>> "We've<br>
>>>>>>>> got a bunch of people fixing <insert test suite name here> so that our<br>
>>>>>>>> driver passes". It's not that mesa is particularly touchy, it's that<br>
>>>>>>>> a<br>
>>>>>>>> good<br>
>>>>>>>> chunk of the rest of the industry just hacks around everything inside<br>
>>>>>>>> their<br>
>>>>>>>> driver and doesn't bother to fix the tests. Sometimes the driver that<br>
>>>>>>>> passes the conformance suite isn't even the one they ship. If we're<br>
>>>>>>>> going<br>
>>>>>>>> to have a spec and hardware vendors (or the FOSS community) are going<br>
>>>>>>>> to<br>
>>>>>>>> implement it and apps are going to write to it, then we all need to<br>
>>>>>>>> agree on<br>
>>>>>>>> what it means and play by the rules. If an app doesn't play by the<br>
>>>>>>>> rules<br>
>>>>>>>> and does something with undefined behavior, then it's a broken app.<br>
>>>>>>>> If<br>
>>>>>>>> we<br>
>>>>>>>> say, "No, it's ok, you don't have to fix it. We'll just hack around<br>
>>>>>>>> it"<br>
>>>>>>>> we're enablers for their broken behavior and the broken behavior<br>
>>>>>>>> continues.<br>
>>>>>>>> In this particular case, we're dealing with a broken app. The only<br>
>>>>>>>> real<br>
>>>>>>>> issue is that all of the drivers that point out the issues were not<br>
>>>>>>>> drivers<br>
>>>>>>>> they tested on.<br>
>>>>>>>><br>
>>>>>>>> Another reason why I'm not a huge fan is that there is some momentum<br>
>>>>>>>> in<br>
>>>>>>>> the<br>
>>>>>>>> industry to make GLSL better defined with respect to NaN. I don't<br>
>>>>>>>> know<br>
>>>>>>>> that<br>
>>>>>>>> anything will ever come of it (because it may break apps) but if<br>
>>>>>>>> something<br>
>>>>>>>> does, we may find ourselves having to make SQRT and RSQ NaN-correct in<br>
>>>>>>>> the<br>
>>>>>>>> future and, hey look, it'll break apps.<br>
>>>>>>>><br>
>>>>>>>> Ok, rant over. Push it if you want. You can even put my nakked-by on<br>
>>>>>>>> it if<br>
>>>>>>>> you'd like. :-)<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> I agree with you completely, and I find it unfortunate too that we<br>
>>>>>>> have to add the workaround to GLSL or radeonsi to align its behavior<br>
>>>>>>> with closed drivers.<br>
>>>>>><br>
>>>>>><br>
>>>>>> Just for reference, I just tested what NVIDIA does on Windows, and<br>
>>>>>> they *don't* seem to do inversesqrt(abs(x)) on my HW/driver.<br>
>>>>><br>
>>>>><br>
>>>>><br>
>>>>> What about sqrt()? Do they do abs for one and not the other? Because<br>
>>>>> that<br>
>>>>> would be crazy but also possible.<br>
>>>><br>
>>>><br>
>>>> Not for sqrt either, it seems.<br>
>>><br>
>>><br>
>>> Huh? I'm sure I have seen NVIDIA doing rsq(abs()) one day. Maybe it was<br>
>>> specific to that application but I don't remember the name...<br>
>><br>
>> Could be. But negative inputs to inversesqrt() returns NaN for me when<br>
>> writing a GLSL-shader in Render Monkey.<br>
>><br>
>> This fragment shader produces red output:<br>
>><br>
>> ---8<---<br>
>> void main(void)<br>
>> {<br>
>> gl_FragColor = vec4(0.0);<br>
>> float tmp = inversesqrt(-1.0);<br>
>> if (tmp != tmp)<br>
>> gl_FragColor.x = 1.0;<br>
>> }<br>
>> ---8<---<br>
>> And just to be sure, I've verified that passing the -1.0 through a<br>
>> varying doesn't change the result, neither does negative non-constant<br>
>> values.<br>
>><br>
>> Now I've even tested on two different NVIDIA-GPUs, with two different<br>
>> driver versions, both give the same result. I even tried on Intel's<br>
>> Windows drivers, same result. I'm starting to think that this issue<br>
>> hasn't been diagnosed correctly.<br>
><br>
> Actually, the test on the Intel Windows-driver was a mistake (optimus<br>
> mixup); Intel seems to do abs on Windows.<br>
<br>
</div></div>Actually, the Intel Windows driver is even more confusing;<br>
inversesqrt(-1.0) from a constant does not produce NaN, but from a<br>
varying it does! So yeah, that driver seems to have its issues.<br>
</blockquote></div><br></div><div class="gmail_extra">I think this is a good argument for doing the workaround at the GLSL IR level if we choose to do it. GLSL IR, NIR, LLVM, they all constant-fold. If we want constant folding to match what happens when it hits hardwawre, we need to put the abs() in at the highest level.<br></div></div>