[Mesa-dev] [PATCH v2 12/17] i965/vs: Rework vs_emit to take a nir_shader and a brw_compiler

Wed Oct 14 22:48:03 PDT 2015

On Wed, Oct 14, 2015 at 11:53:37AM -0700, Jason Ekstrand wrote:
> On Wed, Oct 14, 2015 at 1:41 AM, Pohjolainen, Topi
> <topi.pohjolainen at intel.com> wrote:
> > On Wed, Oct 14, 2015 at 11:25:40AM +0300, Pohjolainen, Topi wrote:
> >> On Sat, Oct 10, 2015 at 08:09:01AM -0700, Jason Ekstrand wrote:
> >> > This commit removes all dependence on GL state by getting rid of the
> >> > brw_context parameter and the GL data structures.
> >> >
> >> > v2 (Jason Ekstrand):
> >> >    - Patch use_legacy_snorm_formula through as a function argument rather
> >> >      than trying to go through the shader key.
> >> > ---
> >> >  src/mesa/drivers/dri/i965/brw_vec4.cpp | 70 +++++++++++++---------------------
> >> >  src/mesa/drivers/dri/i965/brw_vs.c     | 16 +++++++-
> >> >  src/mesa/drivers/dri/i965/brw_vs.h     | 12 ++++--
> >> >  3 files changed, 49 insertions(+), 49 deletions(-)
> >> >
> >> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> >> > index 4b8390f..8e38729 100644
> >> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> >> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> >> > @@ -1937,51 +1937,42 @@ extern "C" {
> >> >   * Returns the final assembly and the program's size.
> >> >   */
> >> >  const unsigned *
> >> > -brw_vs_emit(struct brw_context *brw,
> >> > +brw_vs_emit(const struct brw_compiler *compiler, void *log_data,
> >> >              void *mem_ctx,
> >> >              const struct brw_vs_prog_key *key,
> >> >              struct brw_vs_prog_data *prog_data,
> >> > -            struct gl_vertex_program *vp,
> >> > -            struct gl_shader_program *prog,
> >> > +            const nir_shader *shader,
> >> > +            gl_clip_plane *clip_planes,
> >> > +            bool use_legacy_snorm_formula,
> >> >              int shader_time_index,
> >> > -            unsigned *final_assembly_size)
> >> > +            unsigned *final_assembly_size,
> >> > +            char **error_str)
> >> >  {
> >> >     const unsigned *assembly = NULL;
> >> >
> >> > -   if (brw->intelScreen->compiler->scalar_vs) {
> >> > +   if (compiler->scalar_vs) {
> >> >        prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8;
> >> >
> >> > -      fs_visitor v(brw->intelScreen->compiler, brw,
> >> > -                   mem_ctx, key, &prog_data->base.base,
> >> > +      fs_visitor v(compiler, log_data, mem_ctx, key, &prog_data->base.base,
> >> >                     NULL, /* prog; Only used for TEXTURE_RECTANGLE on gen < 8 */
> >> > -                   vp->Base.nir, 8, shader_time_index);
> >> > -      if (!v.run_vs(brw_select_clip_planes(&brw->ctx))) {
> >> > -         if (prog) {
> >> > -            prog->LinkStatus = false;
> >> > -            ralloc_strcat(&prog->InfoLog, v.fail_msg);
> >> > -         }
> >> > -
> >> > -         _mesa_problem(NULL, "Failed to compile vertex shader: %s\n",
> >> > -                       v.fail_msg);
> >> > +                   shader, 8, shader_time_index);
> >> > +      if (!v.run_vs(clip_planes)) {
> >> > +         if (error_str)
> >> > +            *error_str = ralloc_strdup(mem_ctx, v.fail_msg);
> >>
> >> I don't particularly like the complexity of the error reporting mechanism.
> >> First vec4_visitor::fail() uses ralloc_asprintf() to create one string, then
> >> we make a copy of it here and finally the caller of brw_vs_emit() makes yet
> >> another copy using ralloc_strcat().
> >> I wonder if we could pass the final destination all the way for the
> >> vec4_visitor::fail() to augment with ralloc_asprintf() and hence avoid all
> >
> > Or more appropiately using ralloc_asprintf_append()...
> >
> >> the indirection in the middle. What do you think?
> 
> I'd be moderately ok with just doing "*error_str = v.fail_msg" and
> avoiding the extra copy.  I'm not a big fan of the extra copy, but I
> decided to leave it in for a couple of reasons
> 
> 1) It only happens on the error path so it's not a big deal.

I wasn't concerned about the overhead either, as you said this is error path
only.

> 
> 2) Not copying it is kind of a layering violation.  You're grabbing a
> string from an object without copying it, destroying the object, and
> then handing it back to the thing that called you.  The only way this
> works is if you know that the class ralloc'd the string from the
> context you gave it.  We do, in this case, but it did seem like a bit
> of a layering violation.
> 
> 3) The first time I did this rework, I created a new memory context
> for *_emit and destroyed that memory context at the end.  Because
> fail_msg was allocated out of this temp context, I had to do something
> with it before returning it.  The objective there was to remove the
> mem_ctx input parameter and make it more self-contained.  Then I
> realized that we had to do something about the pointer that gets
> returned.  The only sane use of the emit function is to call it,
> immediately copy the result into some buffer, and destroy the context
> so it wasn't like having a temp context just for *_emit was really
> going help us keep stuff around for a shorter period of time.  In the
> end, I put the mem_ctx parameter back in.
> 
> There's my reasoning.  That said, if you'd still like to get rid of
> the copy, I'm ok with that.

Clearly you have given it more thought than I have. I don't really have
preference between making the copy or assigning the pointer directly. I can
only think of policy-based designs giving control to the caller (which in
turn would involve templates). So lets just leave it as is:

Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>