[Mesa-dev] [PATCH] nir: add nir_instr_type_tex support to nir_lower_phis_to_scalar()
Jason Ekstrand
jason at jlekstrand.net
Fri Feb 22 22:19:26 UTC 2019
On Fri, Feb 22, 2019 at 4:05 PM Timothy Arceri <tarceri at itsqueeze.com>
wrote:
>
>
> On 23/2/19 8:54 am, Jason Ekstrand wrote:
> > On Fri, Feb 22, 2019 at 2:47 PM Timothy Arceri <tarceri at itsqueeze.com
> > <mailto:tarceri at itsqueeze.com>> wrote:
> >
> >
> >
> > On 23/2/19 6:31 am, Rob Clark wrote:
> > > On Fri, Feb 22, 2019 at 12:39 PM Jason Ekstrand
> > <jason at jlekstrand.net <mailto:jason at jlekstrand.net>> wrote:
> > >>
> > >> On Fri, Feb 22, 2019 at 9:51 AM Eric Anholt <eric at anholt.net
> > <mailto:eric at anholt.net>> wrote:
> > >>>
> > >>> Timothy Arceri <tarceri at itsqueeze.com
> > <mailto:tarceri at itsqueeze.com>> writes:
> > >>>
> > >>>> shader-db results i965 (SKL):
> > >>>>
> > >>>> total instructions in shared programs: 13219105 -> 13024761
> > (-1.47%)
> > >>>> instructions in affected programs: 1169457 -> 975113 (-16.62%)
> > >>>> helped: 599
> > >>>> HURT: 154
> > >>>>
> > >>>> total cycles in shared programs: 333968972 -> 324822073
> (-2.74%)
> > >>>> cycles in affected programs: 130032440 -> 120885541 (-7.03%)
> > >>>> helped: 590
> > >>>> HURT: 216
> > >>>>
> > >>>> total spills in shared programs: 57947 -> 29130 (-49.73%)
> > >>>> spills in affected programs: 53364 -> 24547 (-54.00%)
> > >>>> helped: 351
> > >>>> HURT: 0
> > >>>>
> > >>>> total fills in shared programs: 51310 -> 25468 (-50.36%)
> > >>>> fills in affected programs: 44882 -> 19040 (-57.58%)
> > >>>> helped: 351
> > >>>> HURT: 0
> > >>>> ---
> > >>>> src/compiler/nir/nir_lower_phis_to_scalar.c | 1 +
> > >>>> 1 file changed, 1 insertion(+)
> > >>>>
> > >>>> diff --git a/src/compiler/nir/nir_lower_phis_to_scalar.c
> > b/src/compiler/nir/nir_lower_phis_to_scalar.c
> > >>>> index 16001f73685..f6f702bca15 100644
> > >>>> --- a/src/compiler/nir/nir_lower_phis_to_scalar.c
> > >>>> +++ b/src/compiler/nir/nir_lower_phis_to_scalar.c
> > >>>> @@ -74,6 +74,7 @@ is_phi_src_scalarizable(nir_phi_src *src,
> > >>>> /* A phi is scalarizable if we're going to lower it */
> > >>>> return should_lower_phi(nir_instr_as_phi(src_instr),
> > state);
> > >>>>
> > >>>> + case nir_instr_type_tex:
> > >>>> case nir_instr_type_load_const:
> > >>>> case nir_instr_type_ssa_undef:
> > >>>> /* These are trivially scalarizable */
> > >>>
> > >>> Sounds promising, but I would definitely not describe
> > instr_type_tex as
> > >>> "trivially scalarizable" -- could you explain what's going on
> > with this
> > >>> patch?
> >
> > Basically it just turns:
> >
> > if ssa0 {
> > ...
> > vec4 ssa1 = txf .....
> > } else {
> > ...
> > vec4 ssa2 = ...
> > }
> > vec4 ss3 = phi ssa1, ssa2
> >
> > Into
> >
> > if ssa0 {
> > ...
> > vec4 ssa1 = txf .....
> > vec1 ssa2 = imov ssa1.x
> > vec1 ssa3 = imov ssa1.y
> > vec1 ssa4 = imov ssa1.z
> > vec1 ssa5 = imov ssa1.w
> > } else {
> > ...
> > vec4 ssa6 = ...
> > vec1 ssa7 = imov ssa6.x
> > vec1 ssa8 = imov ssa6.y
> > vec1 ssa9 = imov ssa6.z
> > vec1 ssa10 = imov ssa6.w
> > }
> > vec1 ss11 = phi ssa2, ssa7
> > vec1 ss12 = phi ssa3, ssa8
> > vec1 ss13 = phi ssa4, ssa9
> > vec1 ss14 = phi ssa5, ssa10
> >
> >
> > This allows a whole bunch more optimisation to take place as often
> not
> > all of the phi channels are actually used. If some cases large
> > chunks of
> > logic can be remove from the if branch that doesn't contain the
> texture
> > access.
> >
> > >>
> > >>
> > >> I think I can for Intel though I'm not sure how this affects
> > other drivers.
> > >>
> > >> On Intel hardware, we almost always have to combine all the
> > texture sources into one big message. Since having more than one
> > source is very common, this means that we have to make a temporary
> > copy of the sources anyway. Because we're copying them, having them
> > contiguous (a vector in NIR terms) doesn't actually gain us
> > anything. We may as well let NIR scalarize them and give more
> > freedom to the register allocator and other NIR passes which may
> > need to clean things up. We don't want to make the same choice for
> > destinations as they are required to be contiguous.
> > >>
> > >
> > > hmm, but this is abut the phi src (ie. the tex dest), not the tex
> > src, isn't it?
> >
> >
> > Well, that's a pickle... When I wrote this pass I did so to try and
> > explicitly keep from breaking up things that are known to be vectors in
> > the back-end such as textures. The idea was to try and not break up
> > things like this:
> >
> > if (...) {
> > ssa_1 = tex()
> > } else {
> > ssa_2 = tex()
> > }
> > ssa_3 = phi (ssa_1, ssa_2)
> >
> > in the hopes that it would turn into
> >
> > if (...) {
> > r1 = tex();
> > } else {
> > r1 = tex();
> > }
> >
> > Clearly, that notion was mis-placed. At this point, I really wonder
> > what the complexity is saving us. Maybe it's not worth it at all?
> > Maybe we need to be more agressive and require all sources to not be
> > vectorizable or something?
>
> Maybe checking if all components of the phi are used before converting
> to scalar would be useful? Not sure but I've sent a more conservative v2
> of this patch where a bunch of hurt is gone.
>
> The remaining hurt is very small and comes from shaders like this:
>
> if (...) {
> r1 = tex();
> } else {
> if (...) {
> r1 = tex();
> } else {
> r1 = undefined;
> }
> }
>
We generally shouldn't let undef affect things. This is currently true but
not anymore with your new heuristic. I'm thinking about how to actually
write the heuristic I have in my head and it's really annoying. :'(
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190222/ab1dc63e/attachment-0001.html>
More information about the mesa-dev
mailing list