[Mesa-dev] [PATCH] Fix 105399 bug GPU hang on SNB using geometry shader. The end of thread (EOT) message with flags Complete=1 and Used=0 will leads to GPU hang on SNB and ILK when GS does not allocate VUE at all.

andrey simiklit asimiklit.work at gmail.com
Tue Jun 19 14:17:25 UTC 2018


Hello,

Thanks for a fast feedback.
I have created new patch:
"i965/gen6/gs: Handle case where a GS doesn't allocate VUE"
which contains all fixes.

Regards,
Andrii.

On Tue, Jun 19, 2018 at 3:16 PM, Iago Toral <itoral at igalia.com> wrote:

> Hi Andrii,
>
> thanks for the fix!
>
> Kenneth, this patch makes it so that we end the GS program with and
> ENDIF. I remember that back in the day when I wrote this code you had
> concerns about that (that's why I added that comment), but that was a
> long time ago so maybe things have changed, do you know if this is
> still something that we should avoid?
>
> Andrii: limit the subject line (the one that shows up in the subject of
> the e-mail starting after "[PATCH]" small enough to fit in 80
> characters. I do a quick review of the patch inline below and will do a
> more thorough review tomorrow.
>
> On Tue, 2018-06-19 at 11:31 +0300, Andrii Simiklit wrote:
> > We can not use the VUE Dereference flags combination for EOT
> > message under ILK and SNB because the threads are not initialized
> > there with initial VUE handle unlike Pre-IL.
> > So to avoid GPU hangs on SNB and ILK we need
> > to avoid usage of the VUE Dereference flags combination.
> > (Was tested only on SNB but according to specification
> > https://01.org/sites/default/files/documentation/snb_ihd_os_vol2_part
> > 1_0.pdf
> > sections: 1.6.5.3, 1.6.5.6
> > the ILK must behave itself in the similar way)
> >
> > Signed-off-by: Andrii Simiklit <andrii.simiklit at globallogic.com>
>
> Add:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
>
> > ---
> >  src/intel/compiler/gen6_gs_visitor.cpp | 53
> > ++++++++++++++++++++++++++++++----
> >  1 file changed, 47 insertions(+), 6 deletions(-)
> >
> > diff --git a/src/intel/compiler/gen6_gs_visitor.cpp
> > b/src/intel/compiler/gen6_gs_visitor.cpp
> > index 66c69fb..2143fd2 100644
> > --- a/src/intel/compiler/gen6_gs_visitor.cpp
> > +++ b/src/intel/compiler/gen6_gs_visitor.cpp
> > @@ -300,10 +300,11 @@ gen6_gs_visitor::emit_urb_write_opcode(bool
> > complete, int base_mrf,
> >        /* Otherwise we always request to allocate a new VUE handle.
> > If this is
> >         * the last write before the EOT message and the new handle
> > never gets
> >         * used it will be dereferenced when we send the EOT message.
> > This is
> > -       * necessary to avoid different setups for the EOT message
> > (one for the
> > +       * necessary to avoid different setups (under Pre-IL only) for
> > the EOT message (one for the
> >         * case when there is no output and another for the case when
> > there is)
> >         * which would require to end the program with an
> > IF/ELSE/ENDIF block,
> > -       * something we do not want.
> > +       * something we do not want.
> > +       * But for ILK and SNB we can not avoid the end the program
> > with an IF/ELSE/ENDIF block.
> >         */
>
> Limit lines to 80 characters long.
>
> >        inst = emit(GS_OPCODE_URB_WRITE_ALLOCATE);
> >        inst->urb_write_flags = BRW_URB_WRITE_COMPLETE;
> > @@ -449,8 +450,12 @@ gen6_gs_visitor::emit_thread_end()
> >        if (prog->info.has_transform_feedback_varyings)
> >           xfb_write();
> >     }
> > -   emit(BRW_OPCODE_ENDIF);
> > -
> > +   enum { GEN5_ILK = 5 };
> > +   const bool common_eot_approach_can_be_used = (devinfo->gen <
> > GEN5_ILK);
>
> devinfo->gen < 5 is fine, we do that everywhere in the driver.
>
> > +   if(common_eot_approach_can_be_used)
> > +   {
> > +      emit(BRW_OPCODE_ENDIF);
> > +   }
> >     /* Finally, emit EOT message.
> >      *
> >      * In gen6 we need to end the thread differently depending on
> > whether we have
> > @@ -463,8 +468,30 @@ gen6_gs_visitor::emit_thread_end()
> >      * VUE handle every time we do a URB WRITE, even for the last
> > vertex we emit.
> >      * With this we make sure that whether we have emitted at least
> > one vertex
> >      * or none at all, we have to finish the thread without writing
> > to the URB,
> > -    * which works for both cases by setting the COMPLETE and UNUSED
> > flags in
> > +    * which works for both cases (but only under Pre-IL) by setting
> > the COMPLETE and UNUSED flags in
> >      * the EOT message.
> > +    *
> > +    * But under ILK or SNB we must not use combination COMPLETE and
> > UNUSED
> > +    * because this combination could be used only for already
> > allocated VUE.
> > +    * But unlike Pre-IL in the ILK and SNB the initial VUE is not
> > passed to threads.
> > +    * This behaver mentioned in specification:
> > +    * SNB (gen6) Spec:
> > https://01.org/sites/default/files/documentation/snb_ihd_os_vol2_part
> > 1_0.pdf
>
> I think you can drop the URL, mentioning the specific section of the
> PRM with the relevant text is sufficient.
>
> > +    *    1.6.5.3 VUE Allocation (GS, CLIP) [DevIL]
> > +    *    1.6.5.4 VUE Allocation (GS) [DevSNB+]
>
> We usually write PRM citations with quotes.
>
> > +    *       The threads are not passed an initial handle.
> > +    *       Instead, they request a first handle (if any)
> > +    *       via the URB shared function’s FF_SYNC message (see
> > Shared Functions).
> > +    *       If additional handles are required,
> > +    *       the URB_WRITE allocate mechanism (mentioned above) is
> > used.
> > +    *
> > +    * So for ILK and for SNB we must use only UNUSED flag.
> > +    * This is accepteble combination according to:
> > +    *    SNB (gen6) Spec: https://01.org/sites/default/files/documen
> > tation/snb_ihd_os_vol4_part2_0.pdf
> > +    *       2.4.2 Message Descriptor
> > +    *          "Table lists the valid and invalid combinations of
> > the Complete, Used, Allocate and EOT bits"
> > +    *          "Thread terminate non-write of URB"
> > +    *    SNB (gen6) Spec: https://01.org/sites/default/files/documen
> > tation/snb_ihd_os_vol2_part1_0.pdf
> > +    *       1.6.5.6 Thread Termination
> >      */
>
> Limit lines to 80 columns.
>
> >     this->current_annotation = "gen6 thread end: EOT";
> >
> > @@ -480,8 +507,22 @@ gen6_gs_visitor::emit_thread_end()
> >     inst->urb_write_flags = BRW_URB_WRITE_COMPLETE |
> > BRW_URB_WRITE_UNUSED;
> >     inst->base_mrf = base_mrf;
> >     inst->mlen = 1;
> > -}
> > +
> > +   if(!common_eot_approach_can_be_used)
> > +   {
> > +      emit(BRW_OPCODE_ELSE);
> > +
> > +      this->current_annotation = "gen6 thread end: EOT";
> > +
> > +      vec4_instruction *unused_urb_inst =
> > emit(GS_OPCODE_THREAD_END);
> > +      unused_urb_inst->urb_write_flags = BRW_URB_WRITE_UNUSED;
> > +      unused_urb_inst->base_mrf = base_mrf;
> > +      unused_urb_inst->mlen = 1;
> >
> > +      emit(BRW_OPCODE_ENDIF);
> > +   }
> > +}
> > +
> >  void
> >  gen6_gs_visitor::setup_payload()
> >  {
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180619/ced4ccaa/attachment-0001.html>


More information about the mesa-dev mailing list