<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Thu, Jan 17, 2019 at 6:42 PM Matt Turner <<a href="mailto:mattst88@gmail.com">mattst88@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, Jan 15, 2019 at 5:54 AM Iago Toral Quiroga <<a href="mailto:itoral@igalia.com" target="_blank">itoral@igalia.com</a>> wrote:<br>
><br>
> NIR already has these so they are redundant. A run of shader-db confirms<br>
> that the only cases where these backend optimizations are activated<br>
> are some Tomb Raider shaders where the affected variables are qualified<br>
> as "precise", which is why NIR won't apply them and why the backend<br>
> shouldn't either (so it is actually a bug).<br>
<br>
Which of the six optimizations that you're removing were responsible<br>
for the change? I ask because...<br></blockquote><div><br></div><div>If it's one of the precise ones, we should port it to NIR...<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
><br>
> Suggested-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>><br>
> ---<br>
>  src/intel/compiler/brw_fs.cpp | 37 -----------------------------------<br>
>  1 file changed, 37 deletions(-)<br>
><br>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp<br>
> index 77c955ac435..e7f5a8822a3 100644<br>
> --- a/src/intel/compiler/brw_fs.cpp<br>
> +++ b/src/intel/compiler/brw_fs.cpp<br>
> @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()<br>
>              break;<br>
>           }<br>
>           break;<br>
> -      case BRW_OPCODE_LRP:<br>
> -         if (inst->src[1].equals(inst->src[2])) {<br>
> -            inst->opcode = BRW_OPCODE_MOV;<br>
> -            inst->src[0] = inst->src[1];<br>
> -            inst->src[1] = reg_undef;<br>
> -            inst->src[2] = reg_undef;<br>
> -            progress = true;<br>
> -            break;<br>
<br>
I'm not sure whether this is imprecise, and...<br></blockquote><div><br></div><div>Doesn't work for NaN or either inf, at least not unles inf - inf == 0 which I don't think it is.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> -         }<br>
> -         break;<br>
>        case BRW_OPCODE_CMP:<br>
>           if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||<br>
>                inst->conditional_mod == BRW_CONDITIONAL_NZ) &&<br>
> @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()<br>
>              }<br>
>           }<br>
>           break;<br>
> -      case BRW_OPCODE_MAD:<br>
> -         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {<br>
> -            inst->opcode = BRW_OPCODE_MOV;<br>
> -            inst->src[1] = reg_undef;<br>
> -            inst->src[2] = reg_undef;<br>
> -            progress = true;<br>
> -         } else if (inst->src[0].is_zero()) {<br>
> -            inst->opcode = BRW_OPCODE_MUL;<br>
> -            inst->src[0] = inst->src[2];<br>
> -            inst->src[2] = reg_undef;<br>
> -            progress = true;<br>
> -         } else if (inst->src[1].is_one()) {<br>
> -            inst->opcode = BRW_OPCODE_ADD;<br>
> -            inst->src[1] = inst->src[2];<br>
> -            inst->src[2] = reg_undef;<br>
> -            progress = true;<br>
> -         } else if (inst->src[2].is_one()) {<br>
> -            inst->opcode = BRW_OPCODE_ADD;<br>
> -            inst->src[2] = reg_undef;<br>
> -            progress = true;<br>
> -         } else if (inst->src[1].file == IMM && inst->src[2].file == IMM) {<br>
> -            inst->opcode = BRW_OPCODE_ADD;<br>
> -            inst->src[1].f *= inst->src[2].f;<br>
> -            inst->src[2] = reg_undef;<br>
> -            progress = true;<br>
<br>
or this one.<br></blockquote><div><br></div><div>Yes, it is.  Part of the point of FMA is that it's more precise than mul+add because the mul is done with extra precision and added to src[0] in high-precision before the final rounding.  This optimization explicitly breaks the MAD into mul+add.</div><div><br></div><div>--Jason<br></div></div></div>