<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 17, 2016 at 6:10 PM, Matt Turner <span dir="ltr"><<a href="mailto:mattst88@gmail.com" target="_blank">mattst88@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Thu, Mar 17, 2016 at 5:51 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
> ---<br>
> src/compiler/nir/nir.h | 11 +++++++++++<br>
> src/compiler/nir/nir_clone.c | 1 +<br>
> src/compiler/nir/nir_print.c | 2 ++<br>
> 3 files changed, 14 insertions(+)<br>
><br>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h<br>
> index 34f31eb..94b981b 100644<br>
> --- a/src/compiler/nir/nir.h<br>
> +++ b/src/compiler/nir/nir.h<br>
> @@ -671,6 +671,17 @@ extern const nir_op_info nir_op_infos[nir_num_opcodes];<br>
> typedef struct nir_alu_instr {<br>
> nir_instr instr;<br>
> nir_op op;<br>
> +<br>
> + /** Indicates that this ALU instruction generates an exact value<br>
> + *<br>
> + * This is kind-of a mixture of GLSL "precise" and "invariant" and not<br>
<br>
</span>"kind of" isn't hyphenated.<span class=""><br></span></blockquote><div><br></div><div>Thanks<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
> + * really equivalent to either. This indicates that the value generated by<br>
> + * this operation is high-precision and any code transformations that touch<br>
> + * it must ensure that the resulting value is bit-for-bit identical to the<br>
> + * original.<br>
<br>
</span>I think this is a lot of the problem -- we don't seem to have a good<br>
idea of what these keywords mean, concretely.<br>
<br>
Precise is more clear to me: don't optimize things in such a way as to<br>
change the result.<br>
<br>
Invariant is much less clear to me though. I've read the GLSL spec of<br>
course, but can someone give me an example?<br>
</blockquote></div><br></div><div class="gmail_extra">The best docs I've found are in the ES 3.2 spec. Basically, invariant means you're allowed to optimize the computation in an imprecise way as long as you always optimize it exactly the same way. One place this gets us in trouble is the fma peephole, where we decide whether to fuse a mul+add based on how many users the mul has and whether they're all adds. This means that if we have a mul+add in an invariant expression and another, unrelated user of the mul, it won't get fused. If you dead-code that other user of the mul, things change and we fuse them, so the same expression can produce a different value depending on the code around it. This is the kind of thing that's not allowed. Does that make more sense?<br><br></div><div class="gmail_extra">Unfortunately, invariant is horrifically difficult to think about. It's much easier to just implement it as precise, which is also a valid way to do it.<br></div><div class="gmail_extra">--Jason<br></div></div>