[Mesa-dev] [PATCH 00/25] i965: Scalar back-end support for SIMD32, part 4.
Jason Ekstrand
jason at jlekstrand.net
Sat May 28 21:47:25 UTC 2016
On Fri, May 27, 2016 at 7:05 PM, Francisco Jerez <currojerez at riseup.net>
wrote:
> This fixes the few code quality regressions from the previous series
> enabling SIMD32 CS codegen in the back-end -- AFAICT by the end of the
> series we can finally enable GL 4.3 on all Gen8+ hardware.
>
> Patches 1-8 delay the SIMD lowering pass after the bulk of
> optimization passes have been run, which should decrease the
> compilation time of mainly SIMD32 shaders and improve the code quality
> of SIMD32 shaders on all generations and shaders of any dispatch width
> on older generations (up to and including IVB) that use SIMD lowering
> more intensively to implement various workarounds.
>
> Patches 9-14 rework the SIMD lowering pass to avoid emitting the copy
> instructions used to zip and unzip register regions where possible,
> since the register coalesce and copy propagation passes seem to
> perform rather poorly at getting rid of them in some cases. In the
> long term we'll likely want to improve the register coalesce pass
> irrespective of these changes.
>
> Patches 15-20 improve the compute-to-mrf pass used on Gen4-6 to handle
> cases where the source of a VGRF-to-MRF copy is initialized by the
> shader using multiple single-GRF writes, which becomes far more common
> with the additional SIMD lowering going on after this series.
>
> Patches 21-24 are some other assorted changes improving code quality
> on older gens.
>
> I wanted to provide more detailed (e.g. per commit) shader-db stats
> with this series, but kind of ran out of time. Let me know if you
> would like to see more evidence that any of the changes below is
> improving code quality in case it's not clear from the commit alone.
>
> [PATCH 01/25] i965/fs: Let CSE handle logical sampler sends as expressions.
> [PATCH 02/25] i965/fs: Allow constant propagation into logical send
> sources.
> [PATCH 03/25] i965/fs: Add FS_OPCODE_FB_WRITE_LOGICAL to
> has_side_effects().
> [PATCH 04/25] i965/fs: Run SIMD and logical send lowering after the
> optimization loop.
> [PATCH 05/25] i965/fs: Take opt_redundant_discard_jumps out of the
> optimization loop.
> [PATCH 06/25] i965/fs: Fix UB list sentinel dereference in
> opt_sampler_eot().
> [PATCH 07/25] i965/fs: Implement opt_sampler_eot() in terms of logical
> sends.
[PATCH 08/25] SQUASH: i965/fs: Add basic dataflow check to
> opt_sampler_eot().
>
> [PATCH 09/25] i965/fs: Refactor offset() into a separate function taking
> the width as argument.
> [PATCH 10/25] i965/fs: Generalize regions_overlap() from copy propagation
> to handle non-VGRF files.
> [PATCH 11/25] i965/fs: Factor out region zipping and unzipping from the
> SIMD lowering pass.
> [PATCH 12/25] i965/fs: Skip SIMD lowering source unzipping for regular
> scalar regions.
> [PATCH 13/25] i965/fs: Skip SIMD lowering destination zipping if possible.
> [PATCH 14/25] i965/fs: Reindent emit_zip().
>
9-14 Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
> [PATCH 15/25] i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
> [PATCH 16/25] i965/fs: Simplify and improve accuracy of compute_to_mrf()
> by using regions_overlap().
> [PATCH 17/25] i965/fs: Fix compute-to-mrf VGRF region coverage condition.
> [PATCH 18/25] i965/fs: Refactor compute_to_mrf() to split search and
> rewrite into separate loops.
> [PATCH 19/25] i965/fs: Teach compute_to_mrf about the COMPR4 address
> transformation.
> [PATCH 20/25] i965/fs: Extend compute_to_mrf() to coalesce VGRFs
> initialized by multiple single-GRF writes.
> [PATCH 21/25] i965/fs: Extend remove_duplicate_mrf_writes() to handle
> non-VGRF to MRF copies.
>
15-21 scare me. A lot. They even make me think that forking the compiler
between SNB and IVB may be a good idea. :-/ MRFs are annoying, but COMPR4
is such a gross hack that I really want to teach as little of the compiler
about it as possible.
So here's the million dollar question: Do we need them? and, more
importantly, do we need them now? I didn't see anything wrong in my brief
skimming but don't call that a review.
> [PATCH 22/25] i965/fs: Fix constant combining for instructions that cannot
> accept source mods.
> [PATCH 23/25] i965/fs: Allow scalar source regions on SNB math
> instructions.
> [PATCH 24/25] i965/fs: Skip gen4 pre/post-send dependency workaronds for
> the first/last block.
> [PATCH 25/25] i965: Expose GL 4.3 on Gen8+.
>
22-25 are Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160528/48731fb2/attachment-0001.html>
More information about the mesa-dev
mailing list