[Mesa-dev] i965: FS copy propagation across control flow
Kenneth Graunke
kenneth at whitecape.org
Thu Nov 1 20:20:41 PDT 2012
On 10/30/2012 08:28 PM, Eric Anholt wrote:
> Here's a patch series to clean up the most glaring failures I think we have
> left in FS code generation other than variable-indexed array access.
> Unfortunately, I haven't found a particular testcase to show that it's a
> performance improvement, but I still think it's a good idea since it does
> remove instructions.
>
> I was hoping this series would let me remove the badly-named
> register_coalesce() pass, but it turns out that doing so increases shader-db
> instruction count by a significant fraction of a percent. A bit of a
> surprise.
Patches 1-3 are:
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
(I haven't looked at patch 4 yet, but I will soon.)
This has another benefit: it cuts compilation time of L4D2's
largest fragment shader from 10.2 to 4.3 seconds (a 57% reduction!).
We used to do 26 iterations through the brw_fs optimization loop; the
first two did a bunch of optimizing, but on iterations 3-25 only
register_coalesce() flagged any progress, which also meant
recalculating live intervals every time. Absurdly expensive.
With your patch, we do exactly 3 iterations. The passes that flagged
progress in each one:
1: copy propagation, coalesce, coalesce2, compute->mrf
2: CSE, copy propagation
3: (nothing)
This is vastly more reasonable.
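For anyone following along, here's a toy sketch of why the iteration
count matters. It is Python, not the actual brw_fs code: the pass
functions, the "movs" counter, and run_to_fixed_point() are all
hypothetical stand-ins. The only thing it models is a loop that reruns
every pass until a full iteration flags no progress, so a pass that
makes a little progress each time keeps the whole loop going.

```python
def run_to_fixed_point(passes, state):
    """Run every pass repeatedly until a whole iteration makes no progress.

    Returns one entry per iteration: the names of the passes that
    reported progress in that iteration.
    """
    log = []
    while True:
        # Every pass runs in every iteration, whether or not an
        # earlier pass in the same iteration made progress.
        flagged = [name for name, run in passes if run(state)]
        log.append(flagged)
        if not flagged:
            return log


def slow_coalesce(state):
    # Hypothetical pass: removes only one redundant MOV per call.
    if state["movs"] > 0:
        state["movs"] -= 1
        return True
    return False


def eager_copy_propagation(state):
    # Hypothetical pass: removes every redundant MOV in one call.
    if state["movs"] > 0:
        state["movs"] = 0
        return True
    return False


# Only the slow pass: one iteration per MOV, plus a final idle one.
before = run_to_fixed_point([("coalesce", slow_coalesce)], {"movs": 5})

# With an eager pass in front, the loop settles almost immediately.
after = run_to_fixed_point(
    [("copy propagation", eager_copy_propagation),
     ("coalesce", slow_coalesce)],
    {"movs": 5},
)

print(len(before), len(after))
```

In the real loop each extra iteration also pays for whatever analysis
the passes need, which is where the time went.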
--Ken