[Mesa-dev] [PATCH 0/2] Nir: Allow CSE of SSBO loads
jason at jlekstrand.net
Fri Oct 23 09:26:16 PDT 2015
On Thu, Oct 22, 2015 at 11:13 PM, Iago Toral <itoral at igalia.com> wrote:
> On Thu, 2015-10-22 at 09:09 -0700, Jason Ekstrand wrote:
>> On Thu, Oct 22, 2015 at 4:21 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
>> > I implemented this first as a separate optimization pass in GLSL IR , but
>> > Curro pointed out that this being pretty much a restricted form of a CSE pass
>> > it would probably make more sense to do it inside CSE (and we no longer have
>> > a CSE pass in GLSL IR).
>> > Unlike other things we CSE in NIR, in the case of SSBO loads we need to make
>> > sure that we invalidate previous entries in the set in the presence of
>> > conflicting instructions (i.e. SSBO writes to the same block and offset) or
>> > in the presence of memory barriers.
>> > If this is accepted I intend to extend this to also cover image reads, which
>> > follow similar behavior.
>> > No regressions observed in piglit or dEQP's SSBO functional tests.
>> >  http://lists.freedesktop.org/archives/mesa-dev/2015-October/097718.html
>> I think you've gotten enough NAK's that I don't need to chime in
>> there. Unfortunately, solving this in general is something of a
>> research project that both Connor and I have been thinking about for
>> quite some time. I've been thinking off-and-on about how to add a
>> proper memory model to lower_vars_to_ssa for almost a year now and
>> still haven't come up with a good way to do it. I don't know whether
>> SSBO's would be simpler or not. We need a proper memory model for
>> both lower_vars_to_ssa and SSBO load/stores (and shared local
>> variables) but it's a substantial research project.
>> This isn't to say that you couldn't do it. Just know what you're taking on. ;-)
> Yeah, it does not make sense that I try to do this, you guys have
> clearly given this much more thought than me and know much better how a
> solution for this would fit in NIR than me.
>> That said, here's a suggestion for something that we *could* write
>> today, wouldn't be very hard, and wold solve a decent number of cases.
>> For each block:
>> 1) Create a new instruction set (don't use anything from any previous blocks)
>> 2) call add_or_rewrite on all ssbo load operations
>> 3) If you ever see a barrier or ssbo store, destroy the entire
>> instruction set and start again.
> Yep, this is what I was thinking for the load-combine pass that Connor
> suggested. However, I think that in this case we do not need to destroy
> the entire set when we find a store, only for memory barriers, right? I
> mean, there should be nothing preventing us from checking the
> offset/block of the store and compare it with the offset/block of the
> loads in the set to decide which ones we need to remove (like I was
> doing in my last patch)
That's where you get into the "special casing" I mentioned below. If
you have an direct store, you would have to throw away any indirect
loads and then insert a fake direct load for the given offset. If you
have an indirect store, you would have to throw away everything and
then insert a fake indirect load for the given offset. So, yes, you
can do it, but it'll take a little more work. You'll also probably
want to use a different hashing function than we use for CSE because
you need to be able to handle both real and fake loads and have them
compare/hash the same.
Does that make sense?
>> This is something you could put together fairly quickly and would
>> handle a fair number of cases. With a little special casing, you may
>> also be able to handle store and then an immediate load of the same
>> value or duplicate stores. Anything much more complex than that is
>> going to take a lot more thought.
> Yes, I'll give this a try next. Thanks for all the comments and
More information about the mesa-dev