[Mesa-dev] [PATCH] nir: Add an IO scalarizing pass using the intrinsic's first_component.

Timothy Arceri timothy.arceri at collabora.com
Tue Aug 9 00:12:21 UTC 2016


On Mon, 2016-08-08 at 09:18 -0700, Eric Anholt wrote:
> Timothy Arceri <timothy.arceri at collabora.com> writes:
> 
> > 
> > On Sat, 2016-08-06 at 10:15 +1000, Timothy Arceri wrote:
> > > 
> > > On Fri, 2016-08-05 at 16:27 -0700, Eric Anholt wrote:
> > > > 
> > > > vc4 wants to have per-scalar IO load/stores so that dead code
> > > > elimination can happen on a more granular basis,
> > 
> > Out of interest what is it exactly that you are doing in the
> > backend? 
> 
> Given that all of my IO is done as individual moves of scalars (with the
> exception of the color output, which is weird but is lowered by vc4 code
> anyway), I'd like to see all the scalar ops for setting up
> undefined/unused scalar slots get dropped.  I don't see much change from
> this, because dead code is easy to eliminate, but I think there were
> small diffs.  It also makes the output more readable by cutting the
> pointless vector ops.
> 
> However, doing my IO as scalar has been a bit of a pain for other
> passes: The UCP and twoside lowering want a single load/store for the
> vector.  Because of this, I've also wondered if using the write_mask and
> extending nir_opt_dce() for per-channel liveness would be a better way
> to go.
> 
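
To make the write_mask / per-channel liveness idea concrete, here is a
minimal sketch in Python rather than NIR's C. Everything in it (the Instr
record, per_channel_dce(), the bitmask layout) is hypothetical and only
models the idea; it is not how nir_opt_dce() is written. It walks the
instructions backwards, narrows each write mask to the channels still
needed, and drops instructions whose mask becomes empty. Source channels
are treated conservatively: a kept instruction is assumed to read every
channel it names.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    dest: str                 # name of the value (or output) written
    write_mask: int           # bit i set => channel i is written
    srcs: dict = field(default_factory=dict)  # source name -> channels read
    is_output: bool = False   # output stores are always live roots


def per_channel_dce(instrs):
    """Backward pass: trim write masks to live channels, drop dead instrs."""
    live = {}   # value name -> mask of channels still needed later on
    kept = []
    for ins in reversed(instrs):
        if ins.is_output:
            needed = ins.write_mask
        else:
            needed = ins.write_mask & live.get(ins.dest, 0)
            # Channels written here are no longer live above this point.
            live[ins.dest] = live.get(ins.dest, 0) & ~ins.write_mask
        if needed == 0:
            continue  # nothing this instruction writes is ever used
        for name, mask in ins.srcs.items():
            live[name] = live.get(name, 0) | mask
        kept.append(Instr(ins.dest, needed, dict(ins.srcs), ins.is_output))
    kept.reverse()
    return kept


prog = [
    Instr("v", 0b1111, {"in0": 0b1111}),                   # vec4 load
    Instr("dead", 0b0001, {"in1": 0b0001}),                # never read
    Instr("out0", 0b0011, {"v": 0b0011}, is_output=True),  # store .xy only
]
trimmed = per_channel_dce(prog)
```

In this toy program the "dead" instruction disappears and the vec4 load
ends up with a .xy write mask, which is the sort of result the scalar
split gets today while still leaving a single load/store per vector for
the UCP and twoside lowering.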
> > 
> > I was looking at brw_do_vector_splitting() and it seems to me that
> > moving that out of the Intel backend and making it also work on
> > varyings could be beneficial to all drivers as we could extend it to
> > work across stages which would hopefully also improve varying
> > packing.
> > 
> > Currently it says: 
> > 
> > "This skips vectors in uniforms and varyings, which need to be
> > accessible as vectors for their access by the GL."
> > 
> > But that really only applies to vs inputs and the outward facing stages
> > of SSOs.
> 
> Yeah, I think that comment is stale at this point.  But we have a lot of
> that pass in NIR already (ALU, const, io), and instead of extending the
> GLSL IR pass I'd rather see a NIR cross-linking pass.

I've been looking at this recently, and I think to do this properly we
want to be able to start disabling the GLSL IR passes and doing things
in NIR before adding cross-linking support. The big blocker I see for
this is that we need to run the optimisation and packing passes before
doing validation and assigning varying/uniform locations.

It seems we would need to at least have a NIR version of the GLSL IR
packing pass. Even more extreme, but probably less hacky (e.g. not having
to check NIR and remove varyings/uniforms that have been eliminated),
would be to do the varying and uniform location/storage assignments from
a NIR pass as well.
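
As a rough illustration of why the ordering matters, here is a small
Python sketch (not Mesa code; assign_varying_locations() and its inputs
are made up for this example) that assigns a vec4 location and a first
component to each varying only after dead ones have been dropped. It
uses simple first-fit packing and ignores the real constraints a packing
pass has to respect, such as interpolation qualifiers, types and arrays.

```python
def assign_varying_locations(varyings, used):
    """Pack live varyings into vec4 slots.

    varyings: list of (name, num_components), num_components in 1..4
    used:     set of names still referenced after optimisation
    Returns {name: (location, first_component)}.
    """
    next_free = []  # next_free[loc] = first unused component in that slot
    result = {}
    live = [(name, comps) for name, comps in varyings if name in used]
    # Place the widest varyings first so the narrow ones can fill the gaps.
    for name, comps in sorted(live, key=lambda v: -v[1]):
        for loc, first in enumerate(next_free):
            if first + comps <= 4:
                break
        else:
            # No existing slot has room, so open a new one.
            next_free.append(0)
            loc, first = len(next_free) - 1, 0
        result[name] = (loc, first)
        next_free[loc] = first + comps
    return result


layout = assign_varying_locations(
    [("color", 4), ("fog", 1), ("uv", 2), ("unused", 3)],
    used={"color", "fog", "uv"},
)
# layout == {"color": (0, 0), "uv": (1, 0), "fog": (1, 2)}
```

If locations were assigned before the optimisations ran, "unused" would
already own a slot and would have to be removed afterwards, which is the
hacky fix-up mentioned above.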

I made some minor steps towards this with this series [1], but there is
still a while to go.


[1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124501.html

> 
> On the topic of cross-linking: Back in the day, i965 would look at the
> VS outputs and propagate constants into the FS inputs.  This was really
> painful, slow code to maintain, and we eventually just dropped it.
> However, if we know that a VS/FS pair are non-SSO, it seems like with
> NIR we could do that propagation really easily at link time.
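
For what it's worth, the link-time propagation described above could
look something like the following Python sketch. The function name and
the dict/set representation of the stage interfaces are assumptions made
purely for illustration; a real pass would work on NIR variables and
intrinsics, and would only be valid for a non-SSO VS/FS pair.

```python
def propagate_constant_varyings(vs_outputs, fs_inputs):
    """Move constant VS outputs into the FS and shrink the interface.

    vs_outputs: {name: constant tuple, or None if computed at runtime}
    fs_inputs:  set of varying names the FS reads
    Returns (fs_constants, remaining_vs_outputs, remaining_fs_inputs).
    """
    # A varying qualifies if the VS only ever writes a constant to it
    # and the FS actually reads it.
    fs_constants = {
        name: value
        for name, value in vs_outputs.items()
        if value is not None and name in fs_inputs
    }
    # Propagated varyings no longer need to cross the stage boundary.
    remaining_outputs = {
        name: value
        for name, value in vs_outputs.items()
        if name not in fs_constants
    }
    remaining_inputs = fs_inputs - set(fs_constants)
    return fs_constants, remaining_outputs, remaining_inputs


consts, vs_left, fs_left = propagate_constant_varyings(
    {"gl_Position": None, "color": (1.0, 0.0, 0.0, 1.0), "uv": None},
    {"color", "uv"},
)
```

Here "color" becomes a constant inside the FS and drops out of both
interfaces, while "uv" and gl_Position are left alone.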



