[Intel-xe] [PATCH] drm/xe: fix tlb_invalidation_seqno_past()

Christopher Snowhill kode54 at gmail.com
Mon May 8 03:19:52 UTC 2023


Wow, this patch made intel-compute-runtime suddenly start working properly
instead of causing a "GPU hang" that wasn't really a hang but instead a
seqno overflow.

On Sun, May 7, 2023 at 5:53 PM Matthew Brost <matthew.brost at intel.com>
wrote:

> On Fri, May 05, 2023 at 03:49:10PM +0100, Matthew Auld wrote:
> > Checking seqno_recv >= seqno looks like it will incorrectly report true
> > when the seqno has wrapped (not unlikely given
> > TLB_INVALIDATION_SEQNO_MAX). Calling xe_gt_tlb_invalidation_wait() might
> > then return before the flush has been completed by the GuC.
> >
> > Fix this by treating a large negative delta as an indication that the
> > seqno has wrapped around. Similar to how we treat a large positive delta
> > as an indication that the seqno_recv must have wrapped around, but in
> > that case the seqno has likely also signalled.
> >
> > It looks like we could also potentially make the seqno use the full
> > 32bits as supported by the GuC.
>
> Yea we def could use more of the space but in the end we have the seqno
> wrap issue. I think I set this to a low value to prove the wrapping
> protection worked (it didn't) by triggering wraps more often than a full
> 32-bit seqno would.
>
> With that, this patch LGTM.
>
> Reviewed-by: Matthew Brost <matthew.brost at intel.com>
>
> >
> > Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> > Cc: Matthew Brost <matthew.brost at intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > index 604f189dbd70..67822b3dd353 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
> > @@ -251,14 +251,15 @@ int xe_gt_tlb_invalidation_vma(struct xe_gt *gt,
> >
> >  static bool tlb_invalidation_seqno_past(struct xe_gt *gt, int seqno)
> >  {
> > -     if (gt->tlb_invalidation.seqno_recv >= seqno)
> > -             return true;
> > +     if (seqno - gt->tlb_invalidation.seqno_recv <
> > +         -(TLB_INVALIDATION_SEQNO_MAX / 2))
> > +             return false;
> >
> >       if (seqno - gt->tlb_invalidation.seqno_recv >
> >           (TLB_INVALIDATION_SEQNO_MAX / 2))
> >               return true;
> >
> > -     return false;
> > +     return gt->tlb_invalidation.seqno_recv >= seqno;
> >  }
> >
> >  /**
> > --
> > 2.40.0
> >
>
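
For anyone following along, here is a rough standalone sketch of the
wrap-aware comparison from the quoted patch, next to the old check. It is
not the driver code: the struct xe_gt fields are replaced by plain int
parameters, the function names seqno_past() and seqno_past_old() are made
up for this example, and the SEQNO_MAX value is an assumption for
illustration rather than something taken from this thread.

```c
#include <stdbool.h>

/* Assumed wrap point, for illustration only. */
#define SEQNO_MAX 0x100000

/* Patched logic: has seqno_recv reached seqno, allowing for wraparound? */
static bool seqno_past(int seqno_recv, int seqno)
{
	/* Large negative delta: seqno itself wrapped, so it is still pending. */
	if (seqno - seqno_recv < -(SEQNO_MAX / 2))
		return false;

	/* Large positive delta: seqno_recv wrapped, so seqno already signalled. */
	if (seqno - seqno_recv > (SEQNO_MAX / 2))
		return true;

	/* Both values are on the same side of the wrap: compare directly. */
	return seqno_recv >= seqno;
}

/* Pre-patch logic, kept only to show the false positive after a wrap. */
static bool seqno_past_old(int seqno_recv, int seqno)
{
	if (seqno_recv >= seqno)
		return true;

	if (seqno - seqno_recv > (SEQNO_MAX / 2))
		return true;

	return false;
}
```

With seqno_recv = SEQNO_MAX - 1 and a freshly wrapped seqno = 1, the old
check reports the seqno as past because seqno_recv >= seqno holds
numerically, while the patched check sees the large negative delta and
reports it as still pending.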