<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 17, 2016 at 11:17 AM, Matt Turner <span dir="ltr"><<a href="mailto:mattst88@gmail.com" target="_blank">mattst88@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Thu, Mar 17, 2016 at 10:21 AM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
> ---<br>
> src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 +<br>
> 1 file changed, 1 insertion(+)<br>
><br>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp<br>
> index baf72a2..155a550 100644<br>
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp<br>
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp<br>
> @@ -375,6 +375,7 @@ vec4_visitor::opt_vector_float()<br>
> if (inst->opcode != BRW_OPCODE_MOV ||<br>
> inst->dst.writemask == WRITEMASK_XYZW ||<br>
> inst->src[0].file != IMM ||<br>
> + inst->src[0].type != inst->dst.type ||<br>
<br>
</span>Why?<br></blockquote><div><br></div><div>That may be the wrong condition. The reason is that the code below never looks at the source type; it just assumes that it's a float. If, for instance, we had the first case below but with some non-zero VF-capable value instead of 0D, it would interpret that value as a float, cram it into a VF, and then copy it to m2, doing a re-interpret where a conversion is needed.<br><br></div><div>Thinking about it harder, I'm convinced that this patch is bogus, but the bug is real.<br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
This breaks legitimate optimizations, like<br>
<br>
-mov m2.xy:F, 0.500000F<br>
-mov m2.zw:F, 0D<br>
+mov m2:F, [0.5F, 0.5F, 0F, 0F]<br>
<br>
and<br>
<br>
-mov vgrf6.0.x:D, -1082130432D<br>
-mov vgrf6.0.y:D, 1056964608D<br>
-mov vgrf6.0.z:D, 1065353216D<br>
+mov vgrf6.0.xyz:F, [-1F, 0.5F, 1F, 0F]<br>
<br>
and<br>
<br>
-mov vgrf7.0.x:D, 1073741824D<br>
-mov vgrf7.0.yz:D, 0D<br>
+mov vgrf7.0.xyz:F, [2F, 0F, 0F, 0F]<br>
<br>
<br>
The first one we should just handle in opt_algebraic by fixing the src<br>
type. The other two look like NIR fail. If we fixed those things, I'd<br>
be fine with this.<br>
</blockquote></div><br></div></div>
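<div>For illustration only, here is a minimal standalone sketch of the re-interpret versus conversion distinction discussed above. It is not code from brw_vec4.cpp; the helper names reinterpret_as_float and convert_to_float are hypothetical, and it assumes a 32-bit IEEE 754 float.</div>

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: re-interpret. The 32 bits are kept as-is and only
// the type they are read as changes.
float reinterpret_as_float(int32_t bits)
{
    float f;
    static_assert(sizeof(f) == sizeof(bits), "assumes a 32-bit float");
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Hypothetical helper: conversion. The numeric value is kept, so the
// resulting bit pattern is generally different from the input's.
float convert_to_float(int32_t value)
{
    return static_cast<float>(value);
}
```

<div>The D immediates in the second and third quoted examples are float bit patterns, so reinterpret_as_float(1056964608) gives 0.5f and reinterpret_as_float(1073741824) gives 2.0f. For an immediate that really is an integer, a MOV to a float destination needs convert_to_float instead, and the two only agree for 0.</div>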