<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 28/09/15 17:05, Jason Ekstrand
wrote:<br>
</div>
<blockquote
cite="mid:CAOFGe97cpECZA1kknfEij9y2MCNV36_HUi4W5zHMbPH7ACBdGw@mail.gmail.com"
type="cite">
<p dir="ltr"><br>
On Sep 28, 2015 2:09 AM, "Alejandro Piñeiro" &lt;<a
moz-do-not-send="true" href="mailto:apinheiro@igalia.com">apinheiro@igalia.com</a>&gt;
wrote:<br>
><br>
> Hi,<br>
><br>
> TL;DR:<br>
><br>
> As several people are working on improving shader quality in the
vec4 backend<br>
> using NIR, and to avoid overlapping work, I'm explicitly
announcing that this<br>
> week I will work on implementing an equivalent of
fs_cmod_propagation<br>
> for the vec4 case.<br>
><br>
> More details:<br>
><br>
> While checking shader-db HURT regressions, shaders like
these:<br>
> unity/15.shader_test<br>
> warzone2100/1.shader_test<br>
> humus-celshading/4.shader_test<br>
><br>
> are emitting extra movs when conditions are involved.
Writing the<br>
> equivalent fs shader, I found those are optimized by<br>
> opt_cmod_propagation. I vaguely remembered that Jason
mentioned it some<br>
> months ago, and I found this email [1], where he suggested
implementing<br>
> that pass. So, in case someone else is already working on
it, I'm<br>
> sending this email.</p>
<p dir="ltr">Hey Alejandro,</p>
<p dir="ltr">First off, thanks for working on this. Now that
we've fixed the type issues in copy propagation and register
coalesce, I think this is probably the last major back end issue
required for getting decent results out of NIR. Not that more
work can't be done (it always can) but this solves the last
known NIR-&gt;backend translation issue.</p>
<p dir="ltr">I do have one comment for you to think about. In the
fs backend we never move a flag result *to* a GRF. We only ever
use a CMP with a GRF destination. In vec4, for our
implementation of nir_op_banyN and nir_op_ballN, we do something
like this:</p>
<p dir="ltr">CMP null a b<br>
MOV reg 0ud<br>
MOV(+f0.0) reg 0xffffffffud</p>
<p dir="ltr">Where we use the ANY4H or ALL4H predicate on the second
MOV. We should pick up on this as a CMP that generates a
special predicate so that a MOV.nz that moves it to the flag
actually turns into a use of the ANY or ALL predicate.<br>
</p>
</blockquote>
<br>
OK, thanks for the hints. I will probably first get the basic
functionality working, based on the current brw_fs_cmod_propagation,
and then try to be smarter based on your comments.<br>
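<br>
As a rough illustration of what "basic functionality" means here, the
following is a toy Python model, not Mesa code. It assumes that the core
of the pass is folding a compare against zero into the instruction that
produced the compared value, as long as nothing in between reads or
writes the flag. The names Inst, is_zero_compare, try_fold and
cmod_propagate are hypothetical, and the real pass has to check more
conditions (types, which opcodes may carry which modifier, and so on).<br>
<pre>
```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Inst:
    """Toy instruction; every field here is a simplifying assumption."""
    op: str                   # opcode, e.g. "ADD", "CMP", "MOV"
    dst: str                  # destination register name, or "null"
    srcs: tuple               # source operands (register names or literals)
    cmod: str = ""            # conditional modifier ("nz", "ge", ...), "" if none
    predicated: bool = False  # True if the instruction reads the flag


def is_zero_compare(inst):
    # A CMP that only exists to set the flag from "reg compared with 0".
    return (inst.op == "CMP" and inst.dst == "null" and inst.cmod != ""
            and len(inst.srcs) == 2 and inst.srcs[1] == "0")


def try_fold(out, cmp):
    # Walk backwards looking for the instruction that wrote the compared
    # register. Give up as soon as anything touches the flag, because
    # moving the flag write earlier would then change behaviour.
    for j in reversed(range(len(out))):
        prev = out[j]
        if prev.cmod or prev.predicated:
            return False
        if prev.dst == cmp.srcs[0]:
            out[j] = replace(prev, cmod=cmp.cmod)
            return True
    return False


def cmod_propagate(insts):
    out = []
    for inst in insts:
        if is_zero_compare(inst) and try_fold(out, inst):
            continue  # the CMP was folded into its producer, so drop it
        out.append(inst)
    return out


prog = [
    Inst("ADD", "g70", ("g69", "4.0")),
    Inst("CMP", "null", ("g70", "0"), cmod="ge"),
]
print(cmod_propagate(prog))
```
</pre>
With the two-instruction program above, the CMP disappears and the ADD
carries the "ge" modifier. If a predicated instruction sits between the
two, the sketch leaves the sequence untouched.<br>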
<br>
BTW, I realized that there is a unit test, test_fs_cmod_propagation.
I assume that a vec4 equivalent is expected, and I will work on that
more or less at the same time as the optimization pass. Just
saying in case I'm wrong.<br>
<br>
Best regards<br>
<br>
<pre class="moz-signature" cols="72">--
Alejandro Piñeiro (<a class="moz-txt-link-abbreviated" href="mailto:apinheiro@igalia.com">apinheiro@igalia.com</a>)</pre>
</body>
</html>