[Mesa-dev] [PATCH 00/25] i965: Scalar back-end support for SIMD32, part 4.

Jason Ekstrand jason at jlekstrand.net
Sat May 28 14:50:37 UTC 2016

On Fri, May 27, 2016 at 7:05 PM, Francisco Jerez <currojerez at riseup.net> wrote:

> This fixes the few code quality regressions from the previous series
> enabling SIMD32 CS codegen in the back-end -- AFAICT by the end of the
> series we can finally enable GL 4.3 on all Gen8+ hardware.
>
> Patches 1-8 delay the SIMD lowering pass after the bulk of
> optimization passes have been run, which should decrease the
> compilation time of mainly SIMD32 shaders and improve the code quality
> of SIMD32 shaders on all generations and shaders of any dispatch width
> on older generations (up to and including IVB) that use SIMD lowering
> more intensively to implement various workarounds.
>
> Patches 9-14 rework the SIMD lowering pass to avoid emitting the copy
> instructions used to zip and unzip register regions where possible,
> since the register coalesce and copy propagation passes seem to
> perform rather poorly at getting rid of them in some cases.  In the
> long term we'll likely want to improve the register coalesce pass
> irrespective of these changes.
>
> Patches 15-20 improve the compute-to-mrf pass used on Gen4-6 to handle
> cases where the source of a VGRF-to-MRF copy is initialized by the
> shader using multiple single-GRF writes, which becomes far more common
> with the additional SIMD lowering going on after this series.
>
> Patches 21-24 are some other assorted changes improving code quality
> on older gens.
>
> I wanted to provide more detailed (e.g. per commit) shader-db stats
> with this series, but kind of ran out of time.  Let me know if you
> would like to see more evidence that any of the changes below is
> improving code quality in case it's not clear from the commit alone.
>
> [PATCH 01/25] i965/fs: Let CSE handle logical sampler sends as expressions.
> [PATCH 02/25] i965/fs: Allow constant propagation into logical send
> sources.
> [PATCH 03/25] i965/fs: Add FS_OPCODE_FB_WRITE_LOGICAL to
> has_side_effects().
> [PATCH 04/25] i965/fs: Run SIMD and logical send lowering after the
> optimization loop.
> [PATCH 05/25] i965/fs: Take opt_redundant_discard_jumps out of the
> optimization loop.
> [PATCH 06/25] i965/fs: Fix UB list sentinel dereference in
> opt_sampler_eot().
> [PATCH 07/25] i965/fs: Implement opt_sampler_eot() in terms of logical
> sends.
> [PATCH 08/25] SQUASH: i965/fs: Add basic dataflow check to
> opt_sampler_eot().

I had a few comments on patch 7, namely that it should be split into 3
patches and maybe use a different check.  Once that gets resolved, 1-8 are

Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

> [PATCH 09/25] i965/fs: Refactor offset() into a separate function taking
> the width as argument.
> [PATCH 10/25] i965/fs: Generalize regions_overlap() from copy propagation
> to handle non-VGRF files.
> [PATCH 11/25] i965/fs: Factor out region zipping and unzipping from the
> SIMD lowering pass.
> [PATCH 12/25] i965/fs: Skip SIMD lowering source unzipping for regular
> scalar regions.
> [PATCH 13/25] i965/fs: Skip SIMD lowering destination zipping if possible.
> [PATCH 14/25] i965/fs: Reindent emit_zip().
> [PATCH 15/25] i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
> [PATCH 16/25] i965/fs: Simplify and improve accuracy of compute_to_mrf()
> by using regions_overlap().
> [PATCH 17/25] i965/fs: Fix compute-to-mrf VGRF region coverage condition.
> [PATCH 18/25] i965/fs: Refactor compute_to_mrf() to split search and
> rewrite into separate loops.
> [PATCH 19/25] i965/fs: Teach compute_to_mrf about the COMPR4 address
> transformation.
> [PATCH 20/25] i965/fs: Extend compute_to_mrf() to coalesce VGRFs
> initialized by multiple single-GRF writes.
> [PATCH 21/25] i965/fs: Extend remove_duplicate_mrf_writes() to handle
> non-VGRF to MRF copies.
> [PATCH 22/25] i965/fs: Fix constant combining for instructions that cannot
> accept source mods.
> [PATCH 23/25] i965/fs: Allow scalar source regions on SNB math
> instructions.
> [PATCH 24/25] i965/fs: Skip gen4 pre/post-send dependency workarounds for
> the first/last block.
> [PATCH 25/25] i965: Expose GL 4.3 on Gen8+.