<p dir="ltr"><br>
On May 5, 2016 6:35 PM, "Connor Abbott" <<a href="mailto:cwabbott0@gmail.com">cwabbott0@gmail.com</a>> wrote:<br>
><br>
> On Thu, May 5, 2016 at 8:51 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
> > According to the GLSL spec, if the user uses fma directly and you have it<br>
> > in your hardware, you shouldn't split it. For a while now, we've been<br>
> > splitting all fma's up-front and then planned to fuse them later. The only<br>
> > reason why this possibly helped before was for ARB programs which is<br>
> > handled by the previous commit. This fixes rendering corruptions in Tomb<br>
> > Raider.<br>
><br>
> Where in the GLSL spec does it say this? In the definition of fma(), it says:<br>
><br>
> "Otherwise, in the absence of precise consumption, there are no<br>
> special constraints on the number of operations or difference in<br>
> precision between fma() and the expression<br>
> 'a * b + c'."<br>
><br>
> So in other words, splitting them and then opportunistically combining<br>
> them is fine. I think the real problem is that the lower_ffma<br>
> transform isn't marked as inexact, when doing that transform does<br>
> violate the GLSL spec. That is, we shouldn't be splitting ffma<br>
> instructions marked as exact.</p>
<p dir="ltr">Sure, but that isn't lowering. Drivers that need lowering want it split regardless. If we want to just split imprecise things then we need a new split_imprecise_ffma option.</p>
<p dir="ltr">I should at least update the comment.</p>
<p dir="ltr">> ><br>
> > Shader-db results on Haswell:<br>
> ><br>
> > total instructions in shared programs: 7560300 -> 7561510 (0.02%)<br>
> > instructions in affected programs: 56265 -> 57475 (2.15%)<br>
> > helped: 86<br>
> > HURT: 291<br>
> ><br>
> > The only shaders in the database that are affected are from Shadow of<br>
> > Mordor which is the first app in our database to use fma().<br>
> ><br>
> > Reported-by: Kenneth Graunke <<a href="mailto:kenneth@whitecape.org">kenneth@whitecape.org</a>><br>
> > ---<br>
> > src/mesa/drivers/dri/i965/brw_compiler.c | 1 -<br>
> > 1 file changed, 1 deletion(-)<br>
> ><br>
> > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c b/src/mesa/drivers/dri/i965/brw_compiler.c<br>
> > index 0ea5e8b..7969543 100644<br>
> > --- a/src/mesa/drivers/dri/i965/brw_compiler.c<br>
> > +++ b/src/mesa/drivers/dri/i965/brw_compiler.c<br>
> > @@ -72,7 +72,6 @@ shader_perf_log_mesa(void *data, const char *fmt, ...)<br>
> > * split all ffma instructions during opt_algebraic and we then re-combine \<br>
> > * them as a later step. \<br>
> > */ \<br>
> > - .lower_ffma = true, \<br>
> > .lower_sub = true, \<br>
> > .lower_fdiv = true, \<br>
> > .lower_scmp = true, \<br>
> > --<br>
> > 2.5.0.400.gff86faf<br>
> ><br>
> > _______________________________________________<br>
> > mesa-dev mailing list<br>
> > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</p>