[Mesa-dev] [PATCH v3] i965/gen6/gs: Handle case where a GS doesn't allocate VUE

asimiklit.work asimiklit.work at gmail.com
Mon Jul 9 06:22:47 UTC 2018


Hello,

Sorry for late response. I was on vacation.
I am glad that this fix is ok.
Thanks a lot for your review.

Regards,
Andrii.

On 2018-06-25 12:11 PM, Iago Toral wrote:
> Thanks for testing Mark.
>
> Andrii, I'll add my Reviewed-by and and push the patch to master later
> today (I'll also queue it for the next stable release).
>
> Thanks for fixing this!
>
> Iago
>
> On Fri, 2018-06-22 at 13:18 -0700, Mark Janes wrote:
>> Tested-by: Mark Janes <mark.a.janes at intel.com>
>>
>> Iago Toral <itoral at igalia.com> writes:
>>
>>> Thanks Andrii, this version looks good to me.
>>>
>>> Mark: this change fixes a GPU hang in sandy bridge with geometry
>>> shaders (the change itself affects a path in the driver that is
>>> only
>>> executed in SNB with GS, so nothing else is affected). While I
>>> think
>>> the change in here is correct according to the PRMs and in fact it
>>> seems to fix the GPU hang reported in Bugzilla, since this is
>>> tinkering
>>> with the way in which we allocate and free VUEs for SNB GS I
>>> believe
>>> that if this breaks anything it might produce a GPU hang and in
>>> that
>>> case I would rather not hang Jenkins for everyone else until you
>>> have a
>>> chance to restore it, so in order to minimize that risk, could you
>>> run
>>> this through Jenkins when you are available? If that is
>>> inconvenient
>>> for you just let me know and I will send it myself late in my day
>>> on
>>> Monday to minimize the risk.
>>>
>>> Thanks,
>>> Iago
>>>
>>> On Fri, 2018-06-22 at 10:59 +0300, Andrii Simiklit wrote:
>>>> We can not use the VUE Dereference flags combination for EOT
>>>> message under ILK and SNB because the threads are not initialized
>>>> there with initial VUE handle unlike Pre-IL.
>>>> So to avoid GPU hangs on SNB and ILK we need
>>>> to avoid usage of the VUE Dereference flags combination.
>>>> (Was tested only on SNB but according to the specification
>>>> SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6
>>>> the ILK must behave itself in the similar way)
>>>>
>>>> v2: Approach to fix this issue was changed.
>>>> Instead of different EOT flags in the program end
>>>> we will create VUE every time even if GS produces no output.
>>>>
>>>> v3: Clean up the patch.
>>>> Signed-off-by: Andrii Simiklit <andrii.simiklit at globallogic.com>
>>>>
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
>>>>
>>>> ---
>>>>   src/intel/compiler/gen6_gs_visitor.cpp | 42 +++++++++++++++++---
>>>> ----
>>>> ----------
>>>>   1 file changed, 21 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/src/intel/compiler/gen6_gs_visitor.cpp
>>>> b/src/intel/compiler/gen6_gs_visitor.cpp
>>>> index 66c69fb..c9571cc 100644
>>>> --- a/src/intel/compiler/gen6_gs_visitor.cpp
>>>> +++ b/src/intel/compiler/gen6_gs_visitor.cpp
>>>> @@ -350,27 +350,27 @@ gen6_gs_visitor::emit_thread_end()
>>>>      int max_usable_mrf = FIRST_SPILL_MRF(devinfo->gen);
>>>>   
>>>>      /* Issue the FF_SYNC message and obtain the initial VUE
>>>> handle.
>>>> */
>>>> +   this->current_annotation = "gen6 thread end: ff_sync";
>>>> +
>>>> +   vec4_instruction *inst = NULL;
>>>> +   if (prog->info.has_transform_feedback_varyings) {
>>>> +      src_reg sol_temp(this, glsl_type::uvec4_type);
>>>> +      emit(GS_OPCODE_FF_SYNC_SET_PRIMITIVES,
>>>> +           dst_reg(this->svbi),
>>>> +           this->vertex_count,
>>>> +           this->prim_count,
>>>> +           sol_temp);
>>>> +      inst = emit(GS_OPCODE_FF_SYNC,
>>>> +                  dst_reg(this->temp), this->prim_count, this-
>>>>> svbi);
>>>> +   } else {
>>>> +      inst = emit(GS_OPCODE_FF_SYNC,
>>>> +                  dst_reg(this->temp), this->prim_count,
>>>> brw_imm_ud(0u));
>>>> +   }
>>>> +   inst->base_mrf = base_mrf;
>>>> +
>>>>      emit(CMP(dst_null_ud(), this->vertex_count, brw_imm_ud(0u),
>>>> BRW_CONDITIONAL_G));
>>>>      emit(IF(BRW_PREDICATE_NORMAL));
>>>>      {
>>>> -      this->current_annotation = "gen6 thread end: ff_sync";
>>>> -
>>>> -      vec4_instruction *inst;
>>>> -      if (prog->info.has_transform_feedback_varyings) {
>>>> -         src_reg sol_temp(this, glsl_type::uvec4_type);
>>>> -         emit(GS_OPCODE_FF_SYNC_SET_PRIMITIVES,
>>>> -              dst_reg(this->svbi),
>>>> -              this->vertex_count,
>>>> -              this->prim_count,
>>>> -              sol_temp);
>>>> -         inst = emit(GS_OPCODE_FF_SYNC,
>>>> -                     dst_reg(this->temp), this->prim_count,
>>>> this-
>>>>> svbi);
>>>> -      } else {
>>>> -         inst = emit(GS_OPCODE_FF_SYNC,
>>>> -                     dst_reg(this->temp), this->prim_count,
>>>> brw_imm_ud(0u));
>>>> -      }
>>>> -      inst->base_mrf = base_mrf;
>>>> -
>>>>         /* Loop over all buffered vertices and emit URB write
>>>> messages
>>>> */
>>>>         this->current_annotation = "gen6 thread end: urb writes
>>>> init";
>>>>         src_reg vertex(this, glsl_type::uint_type);
>>>> @@ -414,7 +414,7 @@ gen6_gs_visitor::emit_thread_end()
>>>>                  dst_reg reg = dst_reg(MRF, mrf);
>>>>                  reg.type = output_reg[varying][0].type;
>>>>                  data.type = reg.type;
>>>> -               vec4_instruction *inst = emit(MOV(reg, data));
>>>> +               inst = emit(MOV(reg, data));
>>>>                  inst->force_writemask_all = true;
>>>>   
>>>>                  mrf++;
>>>> @@ -460,7 +460,7 @@ gen6_gs_visitor::emit_thread_end()
>>>>       *
>>>>       * However, this would lead us to end the program with an
>>>> ENDIF
>>>> opcode,
>>>>       * which we want to avoid, so what we do is that we always
>>>> request a new
>>>> -    * VUE handle every time we do a URB WRITE, even for the last
>>>> vertex we emit.
>>>> +    * VUE handle every time, even if GS produces no output.
>>>>       * With this we make sure that whether we have emitted at
>>>> least
>>>> one vertex
>>>>       * or none at all, we have to finish the thread without
>>>> writing
>>>> to the URB,
>>>>       * which works for both cases by setting the COMPLETE and
>>>> UNUSED
>>>> flags in
>>>> @@ -476,7 +476,7 @@ gen6_gs_visitor::emit_thread_end()
>>>>         emit(GS_OPCODE_SET_DWORD_2, dst_reg(MRF, base_mrf), data);
>>>>      }
>>>>   
>>>> -   vec4_instruction *inst = emit(GS_OPCODE_THREAD_END);
>>>> +   inst = emit(GS_OPCODE_THREAD_END);
>>>>      inst->urb_write_flags = BRW_URB_WRITE_COMPLETE |
>>>> BRW_URB_WRITE_UNUSED;
>>>>      inst->base_mrf = base_mrf;
>>>>      inst->mlen = 1;
>>



More information about the mesa-dev mailing list