[Mesa-dev] [PATCH 0/2] Nir: Allow CSE of SSBO loads
jason at jlekstrand.net
Thu Oct 22 09:09:59 PDT 2015
On Thu, Oct 22, 2015 at 4:21 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
> I implemented this first as a separate optimization pass in GLSL IR, but
> Curro pointed out that this being pretty much a restricted form of a CSE pass
> it would probably make more sense to do it inside CSE (and we no longer have
> a CSE pass in GLSL IR).
> Unlike other things we CSE in NIR, in the case of SSBO loads we need to make
> sure that we invalidate previous entries in the set in the presence of
> conflicting instructions (i.e. SSBO writes to the same block and offset) or
> in the presence of memory barriers.
> If this is accepted I intend to extend this to also cover image reads, which
> follow similar behavior.
> No regressions observed in piglit or dEQP's SSBO functional tests.
>  http://lists.freedesktop.org/archives/mesa-dev/2015-October/097718.html
I think you've gotten enough NAKs that I don't need to chime in
there. Unfortunately, solving this in general is something of a
research project that both Connor and I have been thinking about for
quite some time. I've been thinking off-and-on about how to add a
proper memory model to lower_vars_to_ssa for almost a year now and
still haven't come up with a good way to do it. I don't know whether
SSBOs would be simpler or not. We need a proper memory model for
both lower_vars_to_ssa and SSBO loads/stores (and shared local
variables) but it's a substantial research project.
This isn't to say that you couldn't do it. Just know what you're taking on. ;-)
That said, here's a suggestion for something that we *could* write
today, wouldn't be very hard, and would solve a decent number of cases.
For each block:
1) Create a new instruction set (don't use anything from any previous blocks)
2) call add_or_rewrite on all ssbo load operations
3) If you ever see a barrier or ssbo store, destroy the entire
instruction set and start again.
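
The three steps above can be sketched roughly as follows. This is a minimal, standalone illustration and not actual NIR code: the instruction struct, the fixed-size set, and the names set_add_or_rewrite and cse_ssbo_loads_in_block are all hypothetical stand-ins, and two loads are treated as equivalent when their buffer and offset ids match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model (NOT the real NIR types). */
enum op_kind { OP_SSBO_LOAD, OP_SSBO_STORE, OP_BARRIER, OP_OTHER };

struct instr {
    enum op_kind kind;
    int buffer;       /* id of the SSBO block being accessed */
    int offset;       /* id of the offset being accessed */
    int replaced_by;  /* index of an earlier equivalent load, or -1 */
};

#define MAX_SET 64

/* The per-block "instruction set": indices of loads seen so far. */
struct load_set {
    size_t count;
    int entries[MAX_SET];
};

/* Return the index of an equivalent earlier load, or add this load to
 * the set and return -1 if there is none. */
static int set_add_or_rewrite(struct load_set *set,
                              const struct instr *instrs, int idx)
{
    for (size_t i = 0; i < set->count; i++) {
        const struct instr *prev = &instrs[set->entries[i]];
        if (prev->buffer == instrs[idx].buffer &&
            prev->offset == instrs[idx].offset)
            return set->entries[i];
    }
    assert(set->count < MAX_SET);
    set->entries[set->count++] = idx;
    return -1;
}

/* Process the instructions of ONE basic block; returns the number of
 * loads that could be rewritten to reuse an earlier load. */
static int cse_ssbo_loads_in_block(struct instr *instrs, size_t n)
{
    struct load_set set = { 0 };  /* step 1: fresh set for every block */
    int eliminated = 0;

    for (size_t i = 0; i < n; i++) {
        instrs[i].replaced_by = -1;
        switch (instrs[i].kind) {
        case OP_SSBO_LOAD: {      /* step 2: add or rewrite each load */
            int prev = set_add_or_rewrite(&set, instrs, (int)i);
            if (prev >= 0) {
                instrs[i].replaced_by = prev;
                eliminated++;
            }
            break;
        }
        case OP_SSBO_STORE:       /* step 3: any store or barrier */
        case OP_BARRIER:          /* throws the whole set away    */
            set.count = 0;
            break;
        default:
            break;
        }
    }
    return eliminated;
}
```

Clearing the set on every store, regardless of which block and offset it writes, is deliberately conservative: it avoids any aliasing analysis at the cost of missing some opportunities.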
This is something you could put together fairly quickly and would
handle a fair number of cases. With a little special casing, you may
also be able to handle a store followed immediately by a load of the same
value, or duplicate stores. Anything much more complex than that is
going to take a lot more thought.
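
One possible shape for that special casing, again as a hedged standalone sketch rather than NIR code: remember only the most recent store in the block, forward its value to a following load of the same location, and flag a store that repeats it. The struct, the integer value ids, and the name forward_stores_in_block are hypothetical, and all accesses are assumed to have the same size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model (NOT real NIR): values are small integer ids. */
enum fwd_kind { FWD_LOAD, FWD_STORE, FWD_BARRIER };

struct fwd_instr {
    enum fwd_kind kind;
    int buffer;      /* id of the SSBO block */
    int offset;      /* id of the offset */
    int value;       /* store: id of the value written; load: unused */
    int forwarded;   /* load: id of the value to reuse, or -1 */
    bool redundant;  /* store: true if it repeats the previous store */
};

/* Process ONE basic block, tracking only the most recent store. */
static void forward_stores_in_block(struct fwd_instr *instrs, size_t n)
{
    bool have_last = false;
    int last_buffer = 0, last_offset = 0, last_value = 0;

    assert(instrs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct fwd_instr *in = &instrs[i];
        bool same_loc = have_last && in->buffer == last_buffer &&
                        in->offset == last_offset;

        in->forwarded = -1;
        in->redundant = false;

        switch (in->kind) {
        case FWD_LOAD:
            /* Load right after a store to the same location:
             * reuse the stored value instead of reading memory. */
            if (same_loc)
                in->forwarded = last_value;
            break;
        case FWD_STORE:
            if (same_loc && in->value == last_value) {
                in->redundant = true;  /* duplicate store */
            } else {
                /* Any other store replaces the tracked entry, so a
                 * possibly aliasing write is never forwarded across. */
                have_last = true;
                last_buffer = in->buffer;
                last_offset = in->offset;
                last_value = in->value;
            }
            break;
        case FWD_BARRIER:
            have_last = false;         /* forget everything */
            break;
        }
    }
}
```

Tracking a single store keeps the aliasing question trivial: the tracked value is dropped as soon as any other store or a barrier appears.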