[Mesa-dev] [PATCH 07/11] i965: Move postprocess_nir to codegen time

Iago Toral itoral at igalia.com
Mon Nov 16 23:55:14 PST 2015


Hi Jason,

On Mon, 2015-11-16 at 07:50 -0800, Jason Ekstrand wrote:
> 
> On Nov 16, 2015 2:01 AM, "Iago Toral" <itoral at igalia.com> wrote:
> >
> > On Fri, 2015-11-13 at 07:34 -0800, Jason Ekstrand wrote:
> > >
> > > On Nov 13, 2015 5:53 AM, "Iago Toral" <itoral at igalia.com> wrote:
> > > >
> > > > On Wed, 2015-11-11 at 17:26 -0800, Jason Ekstrand wrote:
> > > > > ---
> > > > >  src/mesa/drivers/dri/i965/brw_fs.cpp              | 11
> > > +++++++++--
> > > > >  src/mesa/drivers/dri/i965/brw_nir.c               |  1 -
> > > > >  src/mesa/drivers/dri/i965/brw_vec4.cpp            |  5 ++++-
> > > > >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  6 +++++-
> > > > >  4 files changed, 18 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > > index ad94fa4..b8713ab 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > > @@ -43,6 +43,7 @@
> > > > >  #include "brw_wm.h"
> > > > >  #include "brw_fs.h"
> > > > >  #include "brw_cs.h"
> > > > > +#include "brw_nir.h"
> > > > >  #include "brw_vec4_gs_visitor.h"
> > > > >  #include "brw_cfg.h"
> > > > >  #include "brw_dead_control_flow.h"
> > > > > @@ -5459,13 +5460,16 @@ brw_compile_fs(const struct
> brw_compiler
> > > *compiler, void *log_data,
> > > > >                 void *mem_ctx,
> > > > >                 const struct brw_wm_prog_key *key,
> > > > >                 struct brw_wm_prog_data *prog_data,
> > > > > -               const nir_shader *shader,
> > > > > +               const nir_shader *src_shader,
> > > > >                 struct gl_program *prog,
> > > > >                 int shader_time_index8, int
> shader_time_index16,
> > > > >                 bool use_rep_send,
> > > > >                 unsigned *final_assembly_size,
> > > > >                 char **error_str)
> > > > >  {
> > > > > +   nir_shader *shader = nir_shader_clone(mem_ctx,
> src_shader);
> > > > > +   brw_postprocess_nir(shader, compiler->devinfo, true);
> > > > > +
> > > >
> > > > Maybe it is a silly question, but why do we need to clone the
> shader
> > > to
> > > > do this?
> > >
> > > Because brw_compile_foo may be called multiple times on the same
> > > shader source. Since brw_postprocess_nir alters the shader source,
> we
> > > need to make a copy.
> >
> > Ok, trying to see if I get the big picture of the series:
> >
> > So the situation before this change is that we were running
> > brw_postprocess_nir in brw_create_nir (so at link-time) before we
> ran
> > brw_compile_foo (at codegen/drawing time), and thus, we never had
> this
> > problem. We still had to fix codegen for texture rectangle when
> drawing
> > though, which we were doing with rescale_texcoord().
> >
> > With this change, we handle texture rectangle in
> brw_postprocess_nir()
> > so we don't need rescale_texcoord() any more, however, this needs to
> be
> > done at codegen time anyway, so as a consequence, now we have to
> move
> > all of brw_postprocess_nir there and we have to clone the NIR
> shader.
> >
> > Did I get it right?
> 
> It actually happens in brw_apply_foo_key, but yes.
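
To make the cloning rationale above concrete, here is a minimal sketch in plain C. Everything in it is hypothetical: toy_shader, toy_shader_clone, toy_postprocess and toy_compile are illustrative stand-ins for nir_shader, nir_shader_clone, brw_postprocess_nir and brw_compile_foo, not the real Mesa code. It also uses malloc/free instead of a ralloc mem_ctx. It only shows why a destructive pass that runs per compile needs a private copy of a shared source shader.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for nir_shader; not the real NIR data structure. */
struct toy_shader {
   int num_ops;
   int ops[8];
   int lowered;   /* nonzero once the destructive pass has run */
};

/* Stand-in for nir_shader_clone(): returns a heap-allocated copy.
 * (The real function allocates out of a ralloc memory context.) */
static struct toy_shader *
toy_shader_clone(const struct toy_shader *src)
{
   struct toy_shader *copy = malloc(sizeof(*copy));
   if (copy)
      memcpy(copy, src, sizeof(*copy));
   return copy;
}

/* Stand-in for brw_postprocess_nir(): rewrites the shader in place and
 * must not run twice on the same shader. */
static void
toy_postprocess(struct toy_shader *shader, int key_bias)
{
   assert(!shader->lowered);
   for (int i = 0; i < shader->num_ops; i++)
      shader->ops[i] += key_bias;
   shader->lowered = 1;
}

/* Stand-in for brw_compile_foo(): may be called many times on the same
 * source shader, so it only ever mutates a private clone.  The sum of
 * the rewritten ops plays the role of the generated assembly. */
static int
toy_compile(const struct toy_shader *src_shader, int key_bias)
{
   struct toy_shader *shader = toy_shader_clone(src_shader);
   if (!shader)
      return -1;

   toy_postprocess(shader, key_bias);

   int result = 0;
   for (int i = 0; i < shader->num_ops; i++)
      result += shader->ops[i];

   free(shader);
   return result;
}
```

Calling toy_compile() twice on the same source with different keys works because the source is never touched. Without the clone, the second call would trip the assert in toy_postprocess().
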
> 
> > If I did then I wonder about the performance impact of this change,
> > since codegen happens when we draw and there are plenty of things
> > happening in brw_postprocess_nir (plus the cloning). Is it worth it?
> 
> That's a very hard question to answer definitively.  However, here are
> a few data-points:

Thanks for the very detailed reply.

>   a) At one point we were doing all our NIR stuff on-demand.  No one
> complained about the performance impact.
> 
>   b) We pre-compile for the common case at link time so we will only
> hit this at draw time if we actually need a recompile.

Yeah, this is true. Although it looks like with this we are stepping a
bit further into the performance unpredictability issues that have
usually been a concern with OpenGL drivers.
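
For reference, point (b) can be pictured with a small sketch. The names here (toy_key, toy_cache, toy_cache_lookup_or_compile) are made up for illustration and are not the i965 program cache API. The assumption is simply that compiled variants are looked up by key, so the expensive path (clone + postprocess + codegen) only runs on a cache miss.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CACHE_SIZE 4

/* Hypothetical program key: the non-orthogonal state a variant depends on. */
struct toy_key {
   bool rect_texture;
   int swizzle;
};

/* Hypothetical cache of compiled variants, indexed by key. */
struct toy_cache {
   struct toy_key keys[TOY_CACHE_SIZE];
   int num_entries;
   int num_compiles;   /* how often the expensive path actually ran */
};

static void
toy_cache_init(struct toy_cache *cache)
{
   cache->num_entries = 0;
   cache->num_compiles = 0;
}

static bool
toy_key_equal(const struct toy_key *a, const struct toy_key *b)
{
   return a->rect_texture == b->rect_texture && a->swizzle == b->swizzle;
}

/* Returns true if a (re)compile was needed, false on a cache hit. */
static bool
toy_cache_lookup_or_compile(struct toy_cache *cache, const struct toy_key *key)
{
   for (int i = 0; i < cache->num_entries; i++) {
      if (toy_key_equal(&cache->keys[i], key))
         return false;
   }

   /* Miss: this is where clone + postprocess + codegen would happen. */
   cache->num_compiles++;
   if (cache->num_entries < TOY_CACHE_SIZE)
      cache->keys[cache->num_entries++] = *key;
   return true;
}
```

If the link-time precompile inserts the common key, a draw with that key is a hit and costs nothing extra. Only a draw with an unusual key (a rectangle texture, say) pays for the extra NIR work, and only the first time.
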

>   c) While brw_postprocess_nir looks like it does a lot of stuff, it
> calls a fixed number of mostly linear-time passes.  The most expensive
> is almost certainly the out-of-SSA pass and that one is on the
> order of register allocation in the back-end:
> http://people.freedesktop.org/~cwabbott0/perf-shader-db-nir.svg

Yeah, that could be true.

> Will it have an effect? Yes.  Will it be that bad? I don't think so.
> It also has some advantages.  In the case of texture-rectangle, it
> lets us delete some fairly nasty code in the fs back-end compiler and
> gives us support (without porting that nasty code) in the vec4
> back-end.  In the case of texture swizzle, it lets us share code
> between the backends, cleans up the backends, and gives NIR a chance
> to optimize the swizzle.  I think this last point is important.  There
> are some cases such as "if (...) { a = tex } else { a = tex }" where
> we end up with a pipeline-stalling move right after both tex
> operations that the FS backend has a lot of trouble getting rid of.
> NIR, on the other hand, might be able to see through the phi node that
> it's just a swizzling mov, reswizzle the phi, and get rid of the mov.
> I haven't really looked into this problem in detail yet, but I think
> it will be easier with texture swizzle out of the way.
> 
> So, yes, I think it's worth it.  Do I have hard numbers to prove it?
> No.

I agree that there are clear benefits. I was just curious about the
performance side in this case because it looked like we were doing
something that is usually discouraged, so I wanted to raise that
discussion. If you feel like it is worth it I certainly have no
objections :)

This patch is,

Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

> --Jason
> 
> 
> > > > >     /* key->alpha_test_func means simulating alpha testing via
> > > discards,
> > > > >      * so the shader definitely kills pixels.
> > > > >      */
> > > > > @@ -5618,11 +5622,14 @@ brw_compile_cs(const struct
> brw_compiler
> > > *compiler, void *log_data,
> > > > >                 void *mem_ctx,
> > > > >                 const struct brw_cs_prog_key *key,
> > > > >                 struct brw_cs_prog_data *prog_data,
> > > > > -               const nir_shader *shader,
> > > > > +               const nir_shader *src_shader,
> > > > >                 int shader_time_index,
> > > > >                 unsigned *final_assembly_size,
> > > > >                 char **error_str)
> > > > >  {
> > > > > +   nir_shader *shader = nir_shader_clone(mem_ctx,
> src_shader);
> > > > > +   brw_postprocess_nir(shader, compiler->devinfo, true);
> > > > > +
> > > > >     prog_data->local_size[0] = shader->info.cs.local_size[0];
> > > > >     prog_data->local_size[1] = shader->info.cs.local_size[1];
> > > > >     prog_data->local_size[2] = shader->info.cs.local_size[2];
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> > > b/src/mesa/drivers/dri/i965/brw_nir.c
> > > > > index 21c2648..693b9cd 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > > > > @@ -391,7 +391,6 @@ brw_create_nir(struct brw_context *brw,
> > > > >
> > > > >     brw_preprocess_nir(nir, is_scalar);
> > > > >     brw_lower_nir(nir, devinfo, shader_prog, is_scalar);
> > > > > -   brw_postprocess_nir(nir, devinfo, is_scalar);
> > > > >
> > > > >     return nir;
> > > > >  }
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > > b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > > > > index 8350a02..9f75bb6 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > > > > @@ -2028,13 +2028,16 @@ brw_compile_vs(const struct
> brw_compiler
> > > *compiler, void *log_data,
> > > > >                 void *mem_ctx,
> > > > >                 const struct brw_vs_prog_key *key,
> > > > >                 struct brw_vs_prog_data *prog_data,
> > > > > -               const nir_shader *shader,
> > > > > +               const nir_shader *src_shader,
> > > > >                 gl_clip_plane *clip_planes,
> > > > >                 bool use_legacy_snorm_formula,
> > > > >                 int shader_time_index,
> > > > >                 unsigned *final_assembly_size,
> > > > >                 char **error_str)
> > > > >  {
> > > > > +   nir_shader *shader = nir_shader_clone(mem_ctx,
> src_shader);
> > > > > +   brw_postprocess_nir(shader, compiler->devinfo,
> > > compiler->scalar_vs);
> > > > > +
> > > > >     const unsigned *assembly = NULL;
> > > > >
> > > > >     unsigned nr_attributes =
> > > _mesa_bitcount_64(prog_data->inputs_read);
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> > > b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> > > > > index 49c1083..92b15d9 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> > > > > @@ -30,6 +30,7 @@
> > > > >  #include "brw_vec4_gs_visitor.h"
> > > > >  #include "gen6_gs_visitor.h"
> > > > >  #include "brw_fs.h"
> > > > > +#include "brw_nir.h"
> > > > >
> > > > >  namespace brw {
> > > > >
> > > > > @@ -604,7 +605,7 @@ brw_compile_gs(const struct brw_compiler
> > > *compiler, void *log_data,
> > > > >                 void *mem_ctx,
> > > > >                 const struct brw_gs_prog_key *key,
> > > > >                 struct brw_gs_prog_data *prog_data,
> > > > > -               const nir_shader *shader,
> > > > > +               const nir_shader *src_shader,
> > > > >                 struct gl_shader_program *shader_prog,
> > > > >                 int shader_time_index,
> > > > >                 unsigned *final_assembly_size,
> > > > > @@ -614,6 +615,9 @@ brw_compile_gs(const struct brw_compiler
> > > *compiler, void *log_data,
> > > > >     memset(&c, 0, sizeof(c));
> > > > >     c.key = *key;
> > > > >
> > > > > +   nir_shader *shader = nir_shader_clone(mem_ctx,
> src_shader);
> > > > > +   brw_postprocess_nir(shader, compiler->devinfo,
> > > compiler->scalar_gs);
> > > > > +
> > > > >     prog_data->include_primitive_id =
> > > > >        (shader->info.inputs_read & VARYING_BIT_PRIMITIVE_ID) != 0;
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
> 
> 
> 