[Mesa-dev] [PATCH v3] i965/gen6/gs: Handle case where a GS doesn't allocate VUE

Iago Toral itoral at igalia.com
Mon Jun 25 09:11:57 UTC 2018


Thanks for testing Mark.

Andrii, I'll add my Reviewed-by and and push the patch to master later
today (I'll also queue it for the next stable release).

Thanks for fixing this!

Iago

On Fri, 2018-06-22 at 13:18 -0700, Mark Janes wrote:
> Tested-by: Mark Janes <mark.a.janes at intel.com>
> 
> Iago Toral <itoral at igalia.com> writes:
> 
> > Thanks Andrii, this version looks good to me.
> > 
> > Mark: this change fixes a GPU hang in sandy bridge with geometry
> > shaders (the change itself affects a path in the driver that is
> > only
> > executed in SNB with GS, so nothing else is affected). While I
> > think
> > the change in here is correct according to the PRMs and in fact it
> > seems to fix the GPU hang reported in Bugzilla, since this is
> > tinkering
> > with the way in which we allocate and free VUEs for SNB GS I
> > believe
> > that if this breaks anything it might produce a GPU hang and in
> > that
> > case I would rather not hang Jenkins for everyone else until you
> > have a
> > chance to restore it, so in order to minimize that risk, could you
> > run
> > this through Jenkins when you are available? If that is
> > inconvenient
> > for you just let me know and I will send it myself late in my day
> > on
> > Monday to minimize the risk.
> > 
> > Thanks,
> > Iago
> > 
> > On Fri, 2018-06-22 at 10:59 +0300, Andrii Simiklit wrote:
> > > We can not use the VUE Dereference flags combination for EOT
> > > message under ILK and SNB because the threads are not initialized
> > > there with initial VUE handle unlike Pre-IL.
> > > So to avoid GPU hangs on SNB and ILK we need
> > > to avoid usage of the VUE Dereference flags combination.
> > > (Was tested only on SNB but according to the specification
> > > SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6
> > > the ILK must behave itself in the similar way)
> > > 
> > > v2: Approach to fix this issue was changed.
> > > Instead of different EOT flags in the program end
> > > we will create VUE every time even if GS produces no output.
> > > 
> > > v3: Clean up the patch.
> > > Signed-off-by: Andrii Simiklit <andrii.simiklit at globallogic.com>
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
> > > 
> > > ---
> > >  src/intel/compiler/gen6_gs_visitor.cpp | 42 +++++++++++++++++---
> > > ----
> > > ----------
> > >  1 file changed, 21 insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/src/intel/compiler/gen6_gs_visitor.cpp
> > > b/src/intel/compiler/gen6_gs_visitor.cpp
> > > index 66c69fb..c9571cc 100644
> > > --- a/src/intel/compiler/gen6_gs_visitor.cpp
> > > +++ b/src/intel/compiler/gen6_gs_visitor.cpp
> > > @@ -350,27 +350,27 @@ gen6_gs_visitor::emit_thread_end()
> > >     int max_usable_mrf = FIRST_SPILL_MRF(devinfo->gen);
> > >  
> > >     /* Issue the FF_SYNC message and obtain the initial VUE
> > > handle.
> > > */
> > > +   this->current_annotation = "gen6 thread end: ff_sync";
> > > +
> > > +   vec4_instruction *inst = NULL;
> > > +   if (prog->info.has_transform_feedback_varyings) {
> > > +      src_reg sol_temp(this, glsl_type::uvec4_type);
> > > +      emit(GS_OPCODE_FF_SYNC_SET_PRIMITIVES,
> > > +           dst_reg(this->svbi),
> > > +           this->vertex_count,
> > > +           this->prim_count,
> > > +           sol_temp);
> > > +      inst = emit(GS_OPCODE_FF_SYNC,
> > > +                  dst_reg(this->temp), this->prim_count, this-
> > > > svbi);
> > > 
> > > +   } else {
> > > +      inst = emit(GS_OPCODE_FF_SYNC,
> > > +                  dst_reg(this->temp), this->prim_count,
> > > brw_imm_ud(0u));
> > > +   }
> > > +   inst->base_mrf = base_mrf;
> > > +
> > >     emit(CMP(dst_null_ud(), this->vertex_count, brw_imm_ud(0u),
> > > BRW_CONDITIONAL_G));
> > >     emit(IF(BRW_PREDICATE_NORMAL));
> > >     {
> > > -      this->current_annotation = "gen6 thread end: ff_sync";
> > > -
> > > -      vec4_instruction *inst;
> > > -      if (prog->info.has_transform_feedback_varyings) {
> > > -         src_reg sol_temp(this, glsl_type::uvec4_type);
> > > -         emit(GS_OPCODE_FF_SYNC_SET_PRIMITIVES,
> > > -              dst_reg(this->svbi),
> > > -              this->vertex_count,
> > > -              this->prim_count,
> > > -              sol_temp);
> > > -         inst = emit(GS_OPCODE_FF_SYNC,
> > > -                     dst_reg(this->temp), this->prim_count,
> > > this-
> > > > svbi);
> > > 
> > > -      } else {
> > > -         inst = emit(GS_OPCODE_FF_SYNC,
> > > -                     dst_reg(this->temp), this->prim_count,
> > > brw_imm_ud(0u));
> > > -      }
> > > -      inst->base_mrf = base_mrf;
> > > -
> > >        /* Loop over all buffered vertices and emit URB write
> > > messages
> > > */
> > >        this->current_annotation = "gen6 thread end: urb writes
> > > init";
> > >        src_reg vertex(this, glsl_type::uint_type);
> > > @@ -414,7 +414,7 @@ gen6_gs_visitor::emit_thread_end()
> > >                 dst_reg reg = dst_reg(MRF, mrf);
> > >                 reg.type = output_reg[varying][0].type;
> > >                 data.type = reg.type;
> > > -               vec4_instruction *inst = emit(MOV(reg, data));
> > > +               inst = emit(MOV(reg, data));
> > >                 inst->force_writemask_all = true;
> > >  
> > >                 mrf++;
> > > @@ -460,7 +460,7 @@ gen6_gs_visitor::emit_thread_end()
> > >      *
> > >      * However, this would lead us to end the program with an
> > > ENDIF
> > > opcode,
> > >      * which we want to avoid, so what we do is that we always
> > > request a new
> > > -    * VUE handle every time we do a URB WRITE, even for the last
> > > vertex we emit.
> > > +    * VUE handle every time, even if GS produces no output.
> > >      * With this we make sure that whether we have emitted at
> > > least
> > > one vertex
> > >      * or none at all, we have to finish the thread without
> > > writing
> > > to the URB,
> > >      * which works for both cases by setting the COMPLETE and
> > > UNUSED
> > > flags in
> > > @@ -476,7 +476,7 @@ gen6_gs_visitor::emit_thread_end()
> > >        emit(GS_OPCODE_SET_DWORD_2, dst_reg(MRF, base_mrf), data);
> > >     }
> > >  
> > > -   vec4_instruction *inst = emit(GS_OPCODE_THREAD_END);
> > > +   inst = emit(GS_OPCODE_THREAD_END);
> > >     inst->urb_write_flags = BRW_URB_WRITE_COMPLETE |
> > > BRW_URB_WRITE_UNUSED;
> > >     inst->base_mrf = base_mrf;
> > >     inst->mlen = 1;
> 
> 


More information about the mesa-dev mailing list