[PATCH 5/5] drm/xe: Split TLB invalidation code in frontend and backend

Summers, Stuart stuart.summers at intel.com
Wed Jul 23 22:43:41 UTC 2025


On Wed, 2025-07-23 at 22:03 +0000, Summers, Stuart wrote:
> On Wed, 2025-07-23 at 14:22 -0700, Matthew Brost wrote:
> > On Wed, Jul 23, 2025 at 02:55:24PM -0600, Summers, Stuart wrote:
> > > On Wed, 2025-07-23 at 13:47 -0700, Matthew Brost wrote:
> > > > 
> > > 
> > > <cut>
> > > (just to reduce the noise in the rest of the patch here for
> > > now...)
> > > 
> > > > > > > >  
> > > > > > > >  /**
> > > > > > > > - * xe_tlb_inval_reset - Initialize TLB invalidation reset
> > > > > > > > + * xe_tlb_inval_reset() - TLB invalidation reset
> > > > > > > >   * @tlb_inval: TLB invalidation client
> > > > > > > >   *
> > > > > > > >   * Signal any pending invalidation fences, should be called during a GT reset
> > > > > > > >   */
> > > > > > > >  void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval)
> > > > > > > >  {
> > > > > > > > -       struct xe_gt *gt = tlb_inval->private;
> > > > > > > >         struct xe_tlb_inval_fence *fence, *next;
> > > > > > > >         int pending_seqno;
> > > > > > > >  
> > > > > > > >         /*
> > > > > > > > -        * we can get here before the CTs are even initialized if we're wedging
> > > > > > > > -        * very early, in which case there are not going to be any pending
> > > > > > > > -        * fences so we can bail immediately.
> > > > > > > > +        * we can get here before the backends are even initialized if we're
> > > > > > > > +        * wedging very early, in which case there are not going to be any
> > > > > > > > +        * pending fences so we can bail immediately.
> > > > > > > >          */
> > > > > > > > -       if (!xe_guc_ct_initialized(&gt->uc.guc.ct))
> > > > > > > > +       if (!tlb_inval->ops->initialized(tlb_inval))
> > > > > > > >                 return;
> > > > > > > >  
> > > > > > > >         /*
> > > > > > > > -        * CT channel is already disabled at this point. No new TLB requests can
> > > > > > > > +        * Backend is already disabled at this point. No new TLB requests can
> > > > > > > >          * appear.
> > > > > > > >          */
> > > > > > > >  
> > > > > > > > -       mutex_lock(&gt->uc.guc.ct.lock);
> > > > > > > > -       spin_lock_irq(&gt->tlb_inval.pending_lock);
> > > > > > > > -       cancel_delayed_work(&gt->tlb_inval.fence_tdr);
> > > > > > > > +       tlb_inval->ops->lock(tlb_inval);
> > > > > > > 
> > > > > > > I think you want a dedicated lock embedded in struct xe_tlb_inval,
> > > > > > > rather than reaching into the backend to grab one.
> > > > > > > 
> > > > > > > This will deadlock as written: G2H TLB inval messages are sometimes
> > > > > > > processed while holding ct->lock (non-fast path, unlikely) and
> > > > > > > sometimes without it (fast path, likely).
> > > > > > 
> > > > > > Ugh, I'm off today. Ignore the deadlock part, I was confusing
> > > > > > myself... I was thinking this was the function
> > > > > > xe_tlb_inval_done_handler, it is not. I still think xe_tlb_inval
> > > > > > should have its own lock, but this patch as written should work
> > > > > > with s/xe_guc_ct_send/xe_guc_ct_send_locked.
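As a rough sketch of that substitution (not code from the series): with this
patch the frontend's ops->lock() already holds ct->lock, so the GuC backend's
send helper keeps its current shape and simply switches to the _locked CT
variant, roughly:

	static int send_tlb_inval(struct xe_guc *guc, const u32 *action, int len)
	{
		struct xe_gt *gt = guc_to_gt(guc);

		xe_gt_assert(gt, action[1]);	/* seqno assigned by the frontend */
		lockdep_assert_held(&guc->ct.lock);

		xe_gt_stats_incr(gt, XE_GT_STATS_ID_TLB_INVAL, 1);

		/* caller (ops->lock() in the frontend) already holds ct->lock */
		return xe_guc_ct_send_locked(&guc->ct, action, len,
					     G2H_LEN_DW_TLB_INVALIDATE, 1);
	}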
> > > > > 
> > > > > So one reason I didn't go that way is we did just the reverse
> > > > > recently - moved from a TLB dedicated lock to the more specific CT
> > > > > lock since these are all going into the CT handler anyway when we
> > > > > use GuC submission. Then this embedded version allows us to lock at
> > > > > the bottom data layer rather than having a separate lock in the
> > > > > upper layer. Another thing is we might want to have different types
> > > > > of invalidation running in parallel without locking the data in the
> > > > > upper layer since the real contention would be in the lower level
> > > > > pipelining anyway.
> > > > > 
> > > > 
> > > > I can see the reasoning behind this approach, and maybe it’s fine.
> > > > 
> > > > But consider the case where the GuC backend has to look up a VM,
> > > > iterate over a list of exec queues, and send multiple H2Gs to the
> > > > hardware, each with a corresponding G2H (per-context invalidations).
> > > > In the worst case, the CT code may have to wait for and process some
> > > > G2Hs because our G2H credits are exhausted—all while holding the CT
> > > > lock, which currently blocks any hardware submissions (i.e., hardware
> > > > submissions need the CT lock). Now imagine multiple sources issuing
> > > > invalidations: they could grab the CT lock before a submission
> > > > waiting on it, further delaying that submission.
> > > > 
> > > > The longer a mutex is held, the more likely the CPU thread holding it
> > > > could be switched out.
> > > > 
> > > > This doesn’t seem scalable compared to using a finer-grained CT lock
> > > > (e.g., only taking it in xe_guc_ct_send).
> > > > 
> > > > I’m not saying this won’t work as you have it—I think it will—but the
> > > > consequences of holding the CT lock for an extended period need to be
> > > > considered.
> > > 
> > > Couple more thoughts.. so in the case you mentioned, ideally I'd like
> > > to have just a single invalidation per request, rather than across a
> > > whole VM. That's the reason we have the range based invalidation to
> > 
> > Yes, this is ranged based.
> > 
> > > begin with. If we get to the point where we want to make that even
> > > finer, that's great, but we should still just have a single
> > > invalidation per request (again, ideally).
> > > 
> > 
> > Maybe you have a different idea, but I was thinking of queue-based
> > invalidations: the frontend assigns a single seqno, the backend issues
> > N invalidations to the hardware—one per GCID mapped in the VM/GT
> > tuple—and then signals the frontend when all invalidations associated
> > with the seqno are complete. With the GuC, a GCID corresponds to each
> > exec queue’s gucid mapped in the VM/GT tuple. Different backends can
> > handle this differently.
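A rough sketch of how that fan-out might be tracked; the xe_tlb_inval_job
structure and helper below are made-up names for illustration, not part of
this series:

	struct xe_tlb_inval_job {
		struct xe_tlb_inval *tlb_inval;
		int seqno;		/* single seqno assigned by the frontend */
		atomic_t pending;	/* number of per-GCID invalidations issued */
	};

	/* Backend calls this once per completed per-GCID invalidation; the
	 * last completion reports the seqno back to the frontend. */
	static void xe_tlb_inval_job_done(struct xe_tlb_inval_job *job)
	{
		if (atomic_dec_and_test(&job->pending))
			xe_tlb_inval_done_handler(job->tlb_inval, job->seqno);
	}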
> > 
> > > Also, you already have some patches up on the list that do some
> > > coalescing of invalidations so we reduce the number of invalidations
> > > for multiple ranges. I didn't want to include those patches here
> > > because IMO they are really a separate feature and it'd be nice to
> > > review that on its own.
> > > 
> > 
> > I agree it is a separate thing, that should help in some cases, and
> > should be reviewed on its own.
> > 
> > That doesn't help in the case of multiple VMs issuing invalidations
> > though (think eviction is occurring or MMU notifiers are firing). The
> > lock contention is moved from a dedicated TLB invalidation lock to a
> > widely shared CT lock. If multiple TLB invalidations are contending,
> > now all other users of the CT lock contend at this higher level. i.e.,
> > by only acquiring the CT lock at the last part of an invalidation,
> > other waiters (non-invalidation) get QoS.
> 
> I mean, this was the original reason I had understood for having the
> separate lock in the first place. But it feels a little like we're
> running in circles here moving between the two modes..
> 
> I do see what you're saying though: basically the problem is that the CT
> send routine right now does a busy wait for a reply from guc each time
> it sends something, all within the lock.
> 
>                 if (!wait_event_timeout(ct->wq, !ct->g2h_outstanding ||
>                                         g2h_avail(ct), HZ))

Ok, maybe ignore what I said here for now. Let me dig a bit and get
back. The code I linked here is clearly only hit on the busy path, so
it's not a blocker like I described.

Thanks,
Stuart

> 
> So if we're going to stick with this, yeah I agree we really need some
> kind of queuing if we're going to have a lot of these fine-grained
> invalidations all in a row, or we'll start blocking things like page
> fault replies.
> 
> I'm wondering if the better way to approach this though would be to
> refactor on the GuC side rather than do something really complicated on
> the TLB side. I.e. why can't we do the CT busy wait in a worker thread
> and let the send thread keep going, adding more and more? It would mean
> we'd have to do a better job of tracking each unique request out to guc
> rather than just relying on the current g2h_outstanding count, but it
> would at least let us do some of this work in parallel.
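One way that refactor could be shaped is sketched below; every name here
(ct_send_req, ct_send_queue, ct_send_worker) is invented for illustration and
none of it exists in the driver today:

	struct ct_send_req {
		struct list_head link;
		u32 action[8];
		u32 len;
	};

	struct ct_send_queue {
		spinlock_t lock;
		struct list_head reqs;		/* of struct ct_send_req */
		struct work_struct work;
	};

	/* Senders enqueue and return immediately; the worker drains the queue
	 * and absorbs any wait for G2H credits, so callers never block on it. */
	static void ct_send_worker(struct work_struct *w)
	{
		struct ct_send_queue *q = container_of(w, struct ct_send_queue, work);
		struct ct_send_req *req;

		for (;;) {
			spin_lock_irq(&q->lock);
			req = list_first_entry_or_null(&q->reqs,
						       struct ct_send_req, link);
			if (req)
				list_del(&req->link);
			spin_unlock_irq(&q->lock);
			if (!req)
				return;

			/* the existing CT send path, including the credit wait,
			 * would run here, outside the caller's context */
			kfree(req);
		}
	}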
> 
> The queueing mechanism is still going to take work on top of what we
> have in this series to build up these chains of h2g messages with the
> CT lock held only for that last one. And IMO it will still be a little
> messy calling into the lower layer (guc) and back out to the upper
> layer (tlb) and back again to build these queues. And I'm not sure how
> well that will work if we move to a different backend than guc - we
> might not get any benefit there after all this work on the guc side.
> 
> Let me know what you think about a CT refactor like what I said.
> 
> And I still do think we can do a better job reducing the scope of some
> of these invalidations, particularly in a case where we wanted to
> associate something like the guc id with the VM to build a range rather
> than just the addresses within the VM. At least in that case we can
> look a little longer term at something like the CT refactor and still
> keep the backend/frontend isolation intact.
> 
> Thanks,
> Stuart
> 
> > 
> > Matt
> >  
> > > So basically, the per-request lock here also pushes us to implement
> > > things in a more efficient and precise way rather than just hammering
> > > as many invalidations over a given range as possible.
> > > 
> > > And of course there are going to need to be bigger hammer
> > > invalidations sometimes (like the full VF invalidation we're doing in
> > > the invalidate_all() routines), but those still fall into the same
> > > category of precision, just with a larger scope (rather than multiple
> > > smaller invalidations).
> > > 
> > > Thanks,
> > > Stuart
> > > 
> > > > 
> > > > Matt
> > > > 
> > > > > Thanks,
> > > > > Stuart
> > > > > 
> > > > > > 
> > > > > > Matt 
> > > > > > 
> > > > > > > 
> > > > > > > I’d call this lock seqno_lock, since it protects exactly
> > > > > > > that—the order in which a seqno is assigned by the frontend and
> > > > > > > handed to the backend.
> > > > > > > 
> > > > > > > Prime this lock for reclaim as well—do what primelockdep()
> > > > > > > does in xe_guc_ct.c—to make it clear that memory allocations
> > > > > > > are not allowed while the lock is held, as TLB invalidations
> > > > > > > can be called from two reclaim paths:
> > > > > > > 
> > > > > > > - MMU notifier callbacks
> > > > > > > - The dma-fence signaling path of VM binds that require a TLB
> > > > > > >   invalidation
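For reference, the priming pattern being suggested here is what
primelockdep() does in xe_guc_ct.c; applied to a frontend-owned lock it
would look roughly like the sketch below (seqno_lock is the hypothetical
name proposed above, not an existing field):

	static void tlb_inval_primelockdep(struct xe_tlb_inval *tlb_inval)
	{
		if (!IS_ENABLED(CONFIG_LOCKDEP))
			return;

		/* Teach lockdep that allocations which may enter reclaim are
		 * not allowed under this lock, since reclaim (MMU notifiers,
		 * dma-fence signaling of VM binds) can itself take it. */
		fs_reclaim_acquire(GFP_KERNEL);
		might_lock(&tlb_inval->seqno_lock);
		fs_reclaim_release(GFP_KERNEL);
	}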
> > > > > > > 
> > > > > > > > +       spin_lock_irq(&tlb_inval->pending_lock);
> > > > > > > > +       cancel_delayed_work(&tlb_inval->fence_tdr);
> > > > > > > >         /*
> > > > > > > >          * We might have various kworkers waiting for
> > > > > > > > TLB
> > > > > > > > flushes
> > > > > > > > to complete
> > > > > > > >          * which are not tracked with an explicit TLB
> > > > > > > > fence,
> > > > > > > > however at this
> > > > > > > > -        * stage that will never happen since the CT is already disabled, so
> > > > > > > > -        * make sure we signal them here under the assumption that we have
> > > > > > > > +        * stage that will never happen since the backend is already disabled,
> > > > > > > > +        * so make sure we signal them here under the assumption that we have
> > > > > > > >          * completed a full GT reset.
> > > > > > > >          */
> > > > > > > > -       if (gt->tlb_inval.seqno == 1)
> > > > > > > > +       if (tlb_inval->seqno == 1)
> > > > > > > >                 pending_seqno = TLB_INVALIDATION_SEQNO_MAX - 1;
> > > > > > > >         else
> > > > > > > > -               pending_seqno = gt->tlb_inval.seqno - 1;
> > > > > > > > -       WRITE_ONCE(gt->tlb_inval.seqno_recv, pending_seqno);
> > > > > > > > +               pending_seqno = tlb_inval->seqno - 1;
> > > > > > > > +       WRITE_ONCE(tlb_inval->seqno_recv, pending_seqno);
> > > > > > > >  
> > > > > > > >         list_for_each_entry_safe(fence, next,
> > > > > > > > -                                &gt->tlb_inval.pending_fences, link)
> > > > > > > > -               inval_fence_signal(gt_to_xe(gt), fence);
> > > > > > > > -       spin_unlock_irq(&gt->tlb_inval.pending_lock);
> > > > > > > > -       mutex_unlock(&gt->uc.guc.ct.lock);
> > > > > > > > +                                &tlb_inval->pending_fences, link)
> > > > > > > > +               xe_tlb_inval_fence_signal(fence);
> > > > > > > > +       spin_unlock_irq(&tlb_inval->pending_lock);
> > > > > > > > +       tlb_inval->ops->unlock(tlb_inval);
> > > > > > > >  }
> > > > > > > >  
> > > > > > > > -static bool tlb_inval_seqno_past(struct xe_gt *gt, int seqno)
> > > > > > > > +static bool xe_tlb_inval_seqno_past(struct xe_tlb_inval *tlb_inval, int seqno)
> > > > > > > >  {
> > > > > > > > -       int seqno_recv = READ_ONCE(gt->tlb_inval.seqno_recv);
> > > > > > > > +       int seqno_recv = READ_ONCE(tlb_inval->seqno_recv);
> > > > > > > > +
> > > > > > > > +       lockdep_assert_held(&tlb_inval->pending_lock);
> > > > > > > >  
> > > > > > > >         if (seqno - seqno_recv < -(TLB_INVALIDATION_SEQNO_MAX / 2))
> > > > > > > >                 return false;
> > > > > > > > @@ -201,44 +192,20 @@ static bool tlb_inval_seqno_past(struct xe_gt *gt, int seqno)
> > > > > > > >         return seqno_recv >= seqno;
> > > > > > > >  }
> > > > > > > >  
> > > > > > > > -static int send_tlb_inval(struct xe_guc *guc, const
> > > > > > > > u32
> > > > > > > > *action,
> > > > > > > > int len)
> > > > > > > > -{
> > > > > > > > -       struct xe_gt *gt = guc_to_gt(guc);
> > > > > > > > -
> > > > > > > > -       xe_gt_assert(gt, action[1]);    /* Seqno */
> > > > > > > > -       lockdep_assert_held(&guc->ct.lock);
> > > > > > > > -
> > > > > > > > -       /*
> > > > > > > > -        * XXX: The seqno algorithm relies on TLB
> > > > > > > > invalidation
> > > > > > > > being processed
> > > > > > > > -        * in order which they currently are, if that
> > > > > > > > changes
> > > > > > > > the
> > > > > > > > algorithm will
> > > > > > > > -        * need to be updated.
> > > > > > > > -        */
> > > > > > > > -
> > > > > > > > -       xe_gt_stats_incr(gt, XE_GT_STATS_ID_TLB_INVAL,
> > > > > > > > 1);
> > > > > > > > -
> > > > > > > > -       return xe_guc_ct_send(&guc->ct, action, len,
> > > > > > > > -                            
> > > > > > > > G2H_LEN_DW_TLB_INVALIDATE,
> > > > > > > > 1);
> > > > > > > > -}
> > > > > > > > -
> > > > > > > >  static void xe_tlb_inval_fence_prep(struct
> > > > > > > > xe_tlb_inval_fence
> > > > > > > > *fence)
> > > > > > > >  {
> > > > > > > >         struct xe_tlb_inval *tlb_inval = fence-
> > > > > > > > > tlb_inval;
> > > > > > > > -       struct xe_gt *gt = tlb_inval->private;
> > > > > > > > -       struct xe_device *xe = gt_to_xe(gt);
> > > > > > > > -
> > > > > > > > -       lockdep_assert_held(&gt->uc.guc.ct.lock);
> > > > > > > >  
> > > > > > > >         fence->seqno = tlb_inval->seqno;
> > > > > > > > -       trace_xe_tlb_inval_fence_send(xe, fence);
> > > > > > > > +       trace_xe_tlb_inval_fence_send(tlb_inval->xe,
> > > > > > > > fence);
> > > > > > > >  
> > > > > > > >         spin_lock_irq(&tlb_inval->pending_lock);
> > > > > > > >         fence->inval_time = ktime_get();
> > > > > > > >         list_add_tail(&fence->link, &tlb_inval-
> > > > > > > > > pending_fences);
> > > > > > > >  
> > > > > > > >         if (list_is_singular(&tlb_inval-
> > > > > > > > >pending_fences))
> > > > > > > > -               queue_delayed_work(system_wq,
> > > > > > > > -                                  &tlb_inval-
> > > > > > > > >fence_tdr,
> > > > > > > > -                                 
> > > > > > > > tlb_timeout_jiffies(gt));
> > > > > > > > +               queue_delayed_work(system_wq,
> > > > > > > > &tlb_inval-
> > > > > > > > > fence_tdr,
> > > > > > > > +                                  tlb_inval->ops-
> > > > > > > > > timeout_delay(tlb_inval));
> > > > > > > >         spin_unlock_irq(&tlb_inval->pending_lock);
> > > > > > > >  
> > > > > > > >         tlb_inval->seqno = (tlb_inval->seqno + 1) %
> > > > > > > > @@ -247,202 +214,63 @@ static void
> > > > > > > > xe_tlb_inval_fence_prep(struct
> > > > > > > > xe_tlb_inval_fence *fence)
> > > > > > > >                 tlb_inval->seqno = 1;
> > > > > > > >  }
> > > > > > > >  
> > > > > > > > -#define MAKE_INVAL_OP(type)    ((type <<
> > > > > > > > XE_GUC_TLB_INVAL_TYPE_SHIFT) | \
> > > > > > > > -               XE_GUC_TLB_INVAL_MODE_HEAVY <<
> > > > > > > > XE_GUC_TLB_INVAL_MODE_SHIFT | \
> > > > > > > > -               XE_GUC_TLB_INVAL_FLUSH_CACHE)
> > > > > > > > -
> > > > > > > > -static int send_tlb_inval_ggtt(struct xe_gt *gt, int
> > > > > > > > seqno)
> > > > > > > > -{
> > > > > > > > -       u32 action[] = {
> > > > > > > > -               XE_GUC_ACTION_TLB_INVALIDATION,
> > > > > > > > -               seqno,
> > > > > > > > -               MAKE_INVAL_OP(XE_GUC_TLB_INVAL_GUC),
> > > > > > > > -       };
> > > > > > > > -
> > > > > > > > -       return send_tlb_inval(&gt->uc.guc, action,
> > > > > > > > ARRAY_SIZE(action));
> > > > > > > > -}
> > > > > > > > -
> > > > > > > > -static int send_tlb_inval_all(struct xe_tlb_inval
> > > > > > > > *tlb_inval,
> > > > > > > > -                             struct xe_tlb_inval_fence
> > > > > > > > *fence)
> > > > > > > > -{
> > > > > > > > -       u32 action[] = {
> > > > > > > > -               XE_GUC_ACTION_TLB_INVALIDATION_ALL,
> > > > > > > > -               0,  /* seqno, replaced in
> > > > > > > > send_tlb_inval
> > > > > > > > */
> > > > > > > > -               MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL),
> > > > > > > > -       };
> > > > > > > > -       struct xe_gt *gt = tlb_inval->private;
> > > > > > > > -
> > > > > > > > -       xe_gt_assert(gt, fence);
> > > > > > > > -
> > > > > > > > -       return send_tlb_inval(&gt->uc.guc, action,
> > > > > > > > ARRAY_SIZE(action));
> > > > > > > > -}
> > > > > > > > +#define xe_tlb_inval_issue(__tlb_inval, __fence, op, args...)  \
> > > > > > > > +({                                                              \
> > > > > > > > +       int __ret;                                              \
> > > > > > > > +                                                               \
> > > > > > > > +       xe_assert((__tlb_inval)->xe, (__tlb_inval)->ops);       \
> > > > > > > > +       xe_assert((__tlb_inval)->xe, (__fence));                \
> > > > > > > > +                                                               \
> > > > > > > > +       (__tlb_inval)->ops->lock((__tlb_inval));                \
> > > > > > > > +       xe_tlb_inval_fence_prep((__fence));                     \
> > > > > > > > +       __ret = op((__tlb_inval), (__fence)->seqno, ##args);    \
> > > > > > > > +       if (__ret < 0)                                          \
> > > > > > > > +               xe_tlb_inval_fence_signal_unlocked((__fence));  \
> > > > > > > > +       (__tlb_inval)->ops->unlock((__tlb_inval));              \
> > > > > > > > +                                                               \
> > > > > > > > +       __ret == -ECANCELED ? 0 : __ret;                        \
> > > > > > > > +})
> > > > > > > >  
> > > > > > > >  /**
> > > > > > > > - * xe_gt_tlb_invalidation_all - Invalidate all TLBs
> > > > > > > > across
> > > > > > > > PF
> > > > > > > > and all VFs.
> > > > > > > > - * @gt: the &xe_gt structure
> > > > > > > > - * @fence: the &xe_tlb_inval_fence to be signaled on
> > > > > > > > completion
> > > > > > > > + * xe_tlb_inval_all() - Issue a TLB invalidation for
> > > > > > > > all
> > > > > > > > TLBs
> > > > > > > > + * @tlb_inval: TLB invalidation client
> > > > > > > > + * @fence: invalidation fence which will be signal on
> > > > > > > > TLB
> > > > > > > > invalidation
> > > > > > > > + * completion
> > > > > > > >   *
> > > > > > > > - * Send a request to invalidate all TLBs across PF and
> > > > > > > > all
> > > > > > > > VFs.
> > > > > > > > + * Issue a TLB invalidation for all TLBs. Completion
> > > > > > > > of
> > > > > > > > TLB
> > > > > > > > is
> > > > > > > > asynchronous and
> > > > > > > > + * caller can use the invalidation fence to wait for
> > > > > > > > completion.
> > > > > > > >   *
> > > > > > > >   * Return: 0 on success, negative error code on error
> > > > > > > >   */
> > > > > > > >  int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
> > > > > > > >                      struct xe_tlb_inval_fence *fence)
> > > > > > > >  {
> > > > > > > > -       struct xe_gt *gt = tlb_inval->private;
> > > > > > > > -       int err;
> > > > > > > > -
> > > > > > > > -       err = send_tlb_inval_all(tlb_inval, fence);
> > > > > > > > -       if (err)
> > > > > > > > -               xe_gt_err(gt, "TLB invalidation request
> > > > > > > > failed
> > > > > > > > (%pe)", ERR_PTR(err));
> > > > > > > > -
> > > > > > > > -       return err;
> > > > > > > > -}
> > > > > > > > -
> > > > > > > > -/*
> > > > > > > > - * Ensure that roundup_pow_of_two(length) doesn't
> > > > > > > > overflow.
> > > > > > > > - * Note that roundup_pow_of_two() operates on unsigned
> > > > > > > > long,
> > > > > > > > - * not on u64.
> > > > > > > > - */
> > > > > > > > -#define MAX_RANGE_TLB_INVALIDATION_LENGTH
> > > > > > > > (rounddown_pow_of_two(ULONG_MAX))
> > > > > > > > -
> > > > > > > > -static int send_tlb_inval_ppgtt(struct xe_gt *gt, u64
> > > > > > > > start,
> > > > > > > > u64
> > > > > > > > end,
> > > > > > > > -                               u32 asid, int seqno)
> > > > > > > > -{
> > > > > > > > -#define MAX_TLB_INVALIDATION_LEN       7
> > > > > > > > -       u32 action[MAX_TLB_INVALIDATION_LEN];
> > > > > > > > -       u64 length = end - start;
> > > > > > > > -       int len = 0;
> > > > > > > > -
> > > > > > > > -       action[len++] = XE_GUC_ACTION_TLB_INVALIDATION;
> > > > > > > > -       action[len++] = seqno;
> > > > > > > > -       if (!gt_to_xe(gt)->info.has_range_tlb_inval ||
> > > > > > > > -           length > MAX_RANGE_TLB_INVALIDATION_LENGTH)
> > > > > > > > {
> > > > > > > > -               action[len++] =
> > > > > > > > MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL);
> > > > > > > > -       } else {
> > > > > > > > -               u64 orig_start = start;
> > > > > > > > -               u64 align;
> > > > > > > > -
> > > > > > > > -               if (length < SZ_4K)
> > > > > > > > -                       length = SZ_4K;
> > > > > > > > -
> > > > > > > > -               /*
> > > > > > > > -                * We need to invalidate a higher
> > > > > > > > granularity
> > > > > > > > if
> > > > > > > > start address
> > > > > > > > -                * is not aligned to length. When start
> > > > > > > > is
> > > > > > > > not
> > > > > > > > aligned with
> > > > > > > > -                * length we need to find the length
> > > > > > > > large
> > > > > > > > enough
> > > > > > > > to create an
> > > > > > > > -                * address mask covering the required
> > > > > > > > range.
> > > > > > > > -                */
> > > > > > > > -               align = roundup_pow_of_two(length);
> > > > > > > > -               start = ALIGN_DOWN(start, align);
> > > > > > > > -               end = ALIGN(end, align);
> > > > > > > > -               length = align;
> > > > > > > > -               while (start + length < end) {
> > > > > > > > -                       length <<= 1;
> > > > > > > > -                       start = ALIGN_DOWN(orig_start,
> > > > > > > > length);
> > > > > > > > -               }
> > > > > > > > -
> > > > > > > > -               /*
> > > > > > > > -                * Minimum invalidation size for a 2MB
> > > > > > > > page
> > > > > > > > that
> > > > > > > > the hardware
> > > > > > > > -                * expects is 16MB
> > > > > > > > -                */
> > > > > > > > -               if (length >= SZ_2M) {
> > > > > > > > -                       length = max_t(u64, SZ_16M,
> > > > > > > > length);
> > > > > > > > -                       start = ALIGN_DOWN(orig_start,
> > > > > > > > length);
> > > > > > > > -               }
> > > > > > > > -
> > > > > > > > -               xe_gt_assert(gt, length >= SZ_4K);
> > > > > > > > -               xe_gt_assert(gt,
> > > > > > > > is_power_of_2(length));
> > > > > > > > -               xe_gt_assert(gt, !(length &
> > > > > > > > GENMASK(ilog2(SZ_16M)
> > > > > > > > - 1,
> > > > > > > > -                                                  
> > > > > > > > ilog2(SZ_2M)
> > > > > > > > + 1)));
> > > > > > > > -               xe_gt_assert(gt, IS_ALIGNED(start,
> > > > > > > > length));
> > > > > > > > -
> > > > > > > > -               action[len++] =
> > > > > > > > MAKE_INVAL_OP(XE_GUC_TLB_INVAL_PAGE_SELECTIVE);
> > > > > > > > -               action[len++] = asid;
> > > > > > > > -               action[len++] = lower_32_bits(start);
> > > > > > > > -               action[len++] = upper_32_bits(start);
> > > > > > > > -               action[len++] = ilog2(length) -
> > > > > > > > ilog2(SZ_4K);
> > > > > > > > -       }
> > > > > > > > -
> > > > > > > > -       xe_gt_assert(gt, len <=
> > > > > > > > MAX_TLB_INVALIDATION_LEN);
> > > > > > > > -
> > > > > > > > -       return send_tlb_inval(&gt->uc.guc, action,
> > > > > > > > len);
> > > > > > > > -}
> > > > > > > > -
> > > > > > > > -static int __xe_tlb_inval_ggtt(struct xe_gt *gt,
> > > > > > > > -                              struct
> > > > > > > > xe_tlb_inval_fence
> > > > > > > > *fence)
> > > > > > > > -{
> > > > > > > > -       int ret;
> > > > > > > > -
> > > > > > > > -       mutex_lock(&gt->uc.guc.ct.lock);
> > > > > > > > -
> > > > > > > > -       xe_tlb_inval_fence_prep(fence);
> > > > > > > > -
> > > > > > > > -       ret = send_tlb_inval_ggtt(gt, fence->seqno);
> > > > > > > > -       if (ret < 0)
> > > > > > > > -
> > > > > > > >                inval_fence_signal_unlocked(gt_to_xe(gt),
> > > > > > > > fence);
> > > > > > > > -
> > > > > > > > -       mutex_unlock(&gt->uc.guc.ct.lock);
> > > > > > > > -
> > > > > > > > -       /*
> > > > > > > > -        * -ECANCELED indicates the CT is stopped for a
> > > > > > > > GT
> > > > > > > > reset.
> > > > > > > > TLB caches
> > > > > > > > -        *  should be nuked on a GT reset so this error
> > > > > > > > can
> > > > > > > > be
> > > > > > > > ignored.
> > > > > > > > -        */
> > > > > > > > -       if (ret == -ECANCELED)
> > > > > > > > -               return 0;
> > > > > > > > -
> > > > > > > > -       return ret;
> > > > > > > > +       return xe_tlb_inval_issue(tlb_inval, fence,
> > > > > > > > tlb_inval-
> > > > > > > > > ops->all);
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  /**
> > > > > > > > - * xe_tlb_inval_ggtt - Issue a TLB invalidation on
> > > > > > > > this
> > > > > > > > GT
> > > > > > > > for
> > > > > > > > the GGTT
> > > > > > > > + * xe_tlb_inval_ggtt() - Issue a TLB invalidation for
> > > > > > > > the
> > > > > > > > GGTT
> > > > > > > >   * @tlb_inval: TLB invalidation client
> > > > > > > >   *
> > > > > > > > - * Issue a TLB invalidation for the GGTT. Completion
> > > > > > > > of
> > > > > > > > TLB
> > > > > > > > invalidation is
> > > > > > > > - * synchronous.
> > > > > > > > + * Issue a TLB invalidation for the GGTT. Completion
> > > > > > > > of
> > > > > > > > TLB
> > > > > > > > is
> > > > > > > > asynchronous and
> > > > > > > > + * caller can use the invalidation fence to wait for
> > > > > > > > completion.
> > > > > > > >   *
> > > > > > > >   * Return: 0 on success, negative error code on error
> > > > > > > >   */
> > > > > > > >  int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval)
> > > > > > > >  {
> > > > > > > > -       struct xe_gt *gt = tlb_inval->private;
> > > > > > > > -       struct xe_device *xe = gt_to_xe(gt);
> > > > > > > > -       unsigned int fw_ref;
> > > > > > > > -
> > > > > > > > -       if (xe_guc_ct_enabled(&gt->uc.guc.ct) &&
> > > > > > > > -           gt->uc.guc.submission_state.enabled) {
> > > > > > > > -               struct xe_tlb_inval_fence fence;
> > > > > > > > -               int ret;
> > > > > > > > -
> > > > > > > > -               xe_tlb_inval_fence_init(tlb_inval,
> > > > > > > > &fence,
> > > > > > > > true);
> > > > > > > > -               ret = __xe_tlb_inval_ggtt(gt, &fence);
> > > > > > > > -               if (ret)
> > > > > > > > -                       return ret;
> > > > > > > > -
> > > > > > > > -               xe_tlb_inval_fence_wait(&fence);
> > > > > > > > -       } else if (xe_device_uc_enabled(xe) &&
> > > > > > > > !xe_device_wedged(xe)) {
> > > > > > > > -               struct xe_mmio *mmio = &gt->mmio;
> > > > > > > > -
> > > > > > > > -               if (IS_SRIOV_VF(xe))
> > > > > > > > -                       return 0;
> > > > > > > > -
> > > > > > > > -               fw_ref =
> > > > > > > > xe_force_wake_get(gt_to_fw(gt),
> > > > > > > > XE_FW_GT);
> > > > > > > > -               if (xe->info.platform == XE_PVC ||
> > > > > > > > GRAPHICS_VER(xe) >= 20) {
> > > > > > > > -                       xe_mmio_write32(mmio,
> > > > > > > > PVC_GUC_TLB_INV_DESC1,
> > > > > > > > -
> > > > > > > >                                        PVC_GUC_TLB_INV_
> > > > > > > > DE
> > > > > > > > SC1_
> > > > > > > > INVAL
> > > > > > > > IDATE);
> > > > > > > > -                       xe_mmio_write32(mmio,
> > > > > > > > PVC_GUC_TLB_INV_DESC0,
> > > > > > > > -
> > > > > > > >                                        PVC_GUC_TLB_INV_
> > > > > > > > DE
> > > > > > > > SC0_
> > > > > > > > VALID
> > > > > > > > );
> > > > > > > > -               } else {
> > > > > > > > -                       xe_mmio_write32(mmio,
> > > > > > > > GUC_TLB_INV_CR,
> > > > > > > > -
> > > > > > > >                                        GUC_TLB_INV_CR_I
> > > > > > > > NV
> > > > > > > > ALID
> > > > > > > > ATE);
> > > > > > > > -               }
> > > > > > > > -               xe_force_wake_put(gt_to_fw(gt),
> > > > > > > > fw_ref);
> > > > > > > > -       }
> > > > > > > > +       struct xe_tlb_inval_fence fence, *fence_ptr =
> > > > > > > > &fence;
> > > > > > > > +       int ret;
> > > > > > > >  
> > > > > > > > -       return 0;
> > > > > > > > +       xe_tlb_inval_fence_init(tlb_inval, fence_ptr,
> > > > > > > > true);
> > > > > > > > +       ret = xe_tlb_inval_issue(tlb_inval, fence_ptr,
> > > > > > > > tlb_inval-
> > > > > > > > > ops->ggtt);
> > > > > > > > +       xe_tlb_inval_fence_wait(fence_ptr);
> > > > > > > > +
> > > > > > > > +       return ret;
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  /**
> > > > > > > > - * xe_tlb_inval_range - Issue a TLB invalidation on
> > > > > > > > this
> > > > > > > > GT
> > > > > > > > for
> > > > > > > > an address range
> > > > > > > > + * xe_tlb_inval_range() - Issue a TLB invalidation for
> > > > > > > > an
> > > > > > > > address range
> > > > > > > >   * @tlb_inval: TLB invalidation client
> > > > > > > >   * @fence: invalidation fence which will be signal on
> > > > > > > > TLB
> > > > > > > > invalidation
> > > > > > > >   * completion
> > > > > > > > @@ -460,33 +288,12 @@ int xe_tlb_inval_range(struct
> > > > > > > > xe_tlb_inval
> > > > > > > > *tlb_inval,
> > > > > > > >                        struct xe_tlb_inval_fence
> > > > > > > > *fence,
> > > > > > > > u64
> > > > > > > > start, u64 end,
> > > > > > > >                        u32 asid)
> > > > > > > >  {
> > > > > > > > -       struct xe_gt *gt = tlb_inval->private;
> > > > > > > > -       struct xe_device *xe = gt_to_xe(gt);
> > > > > > > > -       int  ret;
> > > > > > > > -
> > > > > > > > -       xe_gt_assert(gt, fence);
> > > > > > > > -
> > > > > > > > -       /* Execlists not supported */
> > > > > > > > -       if (xe->info.force_execlist) {
> > > > > > > > -               __inval_fence_signal(xe, fence);
> > > > > > > > -               return 0;
> > > > > > > > -       }
> > > > > > > > -
> > > > > > > > -       mutex_lock(&gt->uc.guc.ct.lock);
> > > > > > > > -
> > > > > > > > -       xe_tlb_inval_fence_prep(fence);
> > > > > > > > -
> > > > > > > > -       ret = send_tlb_inval_ppgtt(gt, start, end,
> > > > > > > > asid,
> > > > > > > > fence-
> > > > > > > > > seqno);
> > > > > > > > -       if (ret < 0)
> > > > > > > > -               inval_fence_signal_unlocked(xe, fence);
> > > > > > > > -
> > > > > > > > -       mutex_unlock(&gt->uc.guc.ct.lock);
> > > > > > > > -
> > > > > > > > -       return ret;
> > > > > > > > +       return xe_tlb_inval_issue(tlb_inval, fence,
> > > > > > > > tlb_inval-
> > > > > > > > > ops->ppgtt,
> > > > > > > > +                                 start, end, asid);
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  /**
> > > > > > > > - * xe_tlb_inval_vm - Issue a TLB invalidation on this
> > > > > > > > GT
> > > > > > > > for
> > > > > > > > a
> > > > > > > > VM
> > > > > > > > + * xe_tlb_inval_vm() - Issue a TLB invalidation for a
> > > > > > > > VM
> > > > > > > >   * @tlb_inval: TLB invalidation client
> > > > > > > >   * @vm: VM to invalidate
> > > > > > > >   *
> > > > > > > > @@ -496,27 +303,22 @@ void xe_tlb_inval_vm(struct
> > > > > > > > xe_tlb_inval
> > > > > > > > *tlb_inval, struct xe_vm *vm)
> > > > > > > >  {
> > > > > > > >         struct xe_tlb_inval_fence fence;
> > > > > > > >         u64 range = 1ull << vm->xe->info.va_bits;
> > > > > > > > -       int ret;
> > > > > > > >  
> > > > > > > >         xe_tlb_inval_fence_init(tlb_inval, &fence,
> > > > > > > > true);
> > > > > > > > -
> > > > > > > > -       ret = xe_tlb_inval_range(tlb_inval, &fence, 0,
> > > > > > > > range,
> > > > > > > > vm-
> > > > > > > > > usm.asid);
> > > > > > > > -       if (ret < 0)
> > > > > > > > -               return;
> > > > > > > > -
> > > > > > > > +       xe_tlb_inval_range(tlb_inval, &fence, 0, range,
> > > > > > > > vm-
> > > > > > > > > usm.asid);
> > > > > > > >         xe_tlb_inval_fence_wait(&fence);
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  /**
> > > > > > > > - * xe_tlb_inval_done_handler - TLB invalidation done
> > > > > > > > handler
> > > > > > > > - * @gt: gt
> > > > > > > > + * xe_tlb_inval_done_handler() - TLB invalidation done
> > > > > > > > handler
> > > > > > > > + * @tlb_inval: TLB invalidation client
> > > > > > > >   * @seqno: seqno of invalidation that is done
> > > > > > > >   *
> > > > > > > >   * Update recv seqno, signal any TLB invalidation
> > > > > > > > fences,
> > > > > > > > and
> > > > > > > > restart TDR
> > > > > > > 
> > > > > > > I'd mention that this function is safe to be called from any
> > > > > > > context (i.e., process, atomic, and hardirq contexts are
> > > > > > > allowed).
> > > > > > > 
> > > > > > > We might need to convert tlb_inval.pending_lock to a
> > > > > > > raw_spinlock_t for PREEMPT_RT enablement. Same for the GuC
> > > > > > > fast_lock. AFAIK we haven’t had any complaints, so maybe I’m
> > > > > > > just overthinking it, but also perhaps not.
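A minimal sketch of that conversion, assuming pending_lock were made a raw
spinlock (illustrative only; the series as posted keeps a regular spinlock):

	/* hypothetical: raw_spinlock_t stays a true spinlock under PREEMPT_RT,
	 * so the done handler remains callable from hardirq context */
	struct xe_tlb_inval {
		raw_spinlock_t pending_lock;
		struct list_head pending_fences;
		/* ... */
	};

	void xe_tlb_inval_done_handler(struct xe_tlb_inval *tlb_inval, int seqno)
	{
		unsigned long flags;

		raw_spin_lock_irqsave(&tlb_inval->pending_lock, flags);
		/* ... signal fences whose seqno has passed, as in the patch ... */
		raw_spin_unlock_irqrestore(&tlb_inval->pending_lock, flags);
	}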
> > > > > > > 
> > > > > > > >   */
> > > > > > > > -static void xe_tlb_inval_done_handler(struct xe_gt
> > > > > > > > *gt,
> > > > > > > > int
> > > > > > > > seqno)
> > > > > > > > +void xe_tlb_inval_done_handler(struct xe_tlb_inval
> > > > > > > > *tlb_inval,
> > > > > > > > int seqno)
> > > > > > > >  {
> > > > > > > > -       struct xe_device *xe = gt_to_xe(gt);
> > > > > > > > +       struct xe_device *xe = tlb_inval->xe;
> > > > > > > >         struct xe_tlb_inval_fence *fence, *next;
> > > > > > > >         unsigned long flags;
> > > > > > > >  
> > > > > > > > @@ -535,77 +337,53 @@ static void
> > > > > > > > xe_tlb_inval_done_handler(struct xe_gt *gt, int seqno)
> > > > > > > >          * officially process the CT message like if
> > > > > > > > racing
> > > > > > > > against
> > > > > > > >          * process_g2h_msg().
> > > > > > > >          */
> > > > > > > > -       spin_lock_irqsave(&gt->tlb_inval.pending_lock,
> > > > > > > > flags);
> > > > > > > > -       if (tlb_inval_seqno_past(gt, seqno)) {
> > > > > > > > -               spin_unlock_irqrestore(&gt-
> > > > > > > > > tlb_inval.pending_lock, flags);
> > > > > > > > +       spin_lock_irqsave(&tlb_inval->pending_lock,
> > > > > > > > flags);
> > > > > > > > +       if (xe_tlb_inval_seqno_past(tlb_inval, seqno))
> > > > > > > > {
> > > > > > > > +               spin_unlock_irqrestore(&tlb_inval-
> > > > > > > > > pending_lock,
> > > > > > > > flags);
> > > > > > > >                 return;
> > > > > > > >         }
> > > > > > > >  
> > > > > > > > -       WRITE_ONCE(gt->tlb_inval.seqno_recv, seqno);
> > > > > > > > +       WRITE_ONCE(tlb_inval->seqno_recv, seqno);
> > > > > > > >  
> > > > > > > >         list_for_each_entry_safe(fence, next,
> > > > > > > > -                                &gt-
> > > > > > > > > tlb_inval.pending_fences,
> > > > > > > > link) {
> > > > > > > > +                                &tlb_inval-
> > > > > > > > > pending_fences,
> > > > > > > > link) {
> > > > > > > >                 trace_xe_tlb_inval_fence_recv(xe,
> > > > > > > > fence);
> > > > > > > >  
> > > > > > > > -               if (!tlb_inval_seqno_past(gt, fence-
> > > > > > > > > seqno))
> > > > > > > > +               if (!xe_tlb_inval_seqno_past(tlb_inval,
> > > > > > > > fence-
> > > > > > > > > seqno))
> > > > > > > >                         break;
> > > > > > > >  
> > > > > > > > -               inval_fence_signal(xe, fence);
> > > > > > > > +               xe_tlb_inval_fence_signal(fence);
> > > > > > > >         }
> > > > > > > >  
> > > > > > > > -       if (!list_empty(&gt->tlb_inval.pending_fences))
> > > > > > > > +       if (!list_empty(&tlb_inval->pending_fences))
> > > > > > > >                 mod_delayed_work(system_wq,
> > > > > > > > -                                &gt-
> > > > > > > > > tlb_inval.fence_tdr,
> > > > > > > > -                               
> > > > > > > > tlb_timeout_jiffies(gt));
> > > > > > > > +                                &tlb_inval->fence_tdr,
> > > > > > > > +                                tlb_inval->ops-
> > > > > > > > > timeout_delay(tlb_inval));
> > > > > > > >         else
> > > > > > > > -               cancel_delayed_work(&gt-
> > > > > > > > > tlb_inval.fence_tdr);
> > > > > > > > +               cancel_delayed_work(&tlb_inval-
> > > > > > > > > fence_tdr);
> > > > > > > >  
> > > > > > > > -       spin_unlock_irqrestore(&gt-
> > > > > > > > > tlb_inval.pending_lock,
> > > > > > > > flags);
> > > > > > > > -}
> > > > > > > > -
> > > > > > > > -/**
> > > > > > > > - * xe_guc_tlb_inval_done_handler - TLB invalidation
> > > > > > > > done
> > > > > > > > handler
> > > > > > > > - * @guc: guc
> > > > > > > > - * @msg: message indicating TLB invalidation done
> > > > > > > > - * @len: length of message
> > > > > > > > - *
> > > > > > > > - * Parse seqno of TLB invalidation, wake any waiters
> > > > > > > > for
> > > > > > > > seqno,
> > > > > > > > and signal any
> > > > > > > > - * invalidation fences for seqno. Algorithm for this
> > > > > > > > depends
> > > > > > > > on
> > > > > > > > seqno being
> > > > > > > > - * received in-order and asserts this assumption.
> > > > > > > > - *
> > > > > > > > - * Return: 0 on success, -EPROTO for malformed
> > > > > > > > messages.
> > > > > > > > - */
> > > > > > > > -int xe_guc_tlb_inval_done_handler(struct xe_guc *guc,
> > > > > > > > u32
> > > > > > > > *msg,
> > > > > > > > u32 len)
> > > > > > > > -{
> > > > > > > > -       struct xe_gt *gt = guc_to_gt(guc);
> > > > > > > > -
> > > > > > > > -       if (unlikely(len != 1))
> > > > > > > > -               return -EPROTO;
> > > > > > > > -
> > > > > > > > -       xe_tlb_inval_done_handler(gt, msg[0]);
> > > > > > > > -
> > > > > > > > -       return 0;
> > > > > > > > +       spin_unlock_irqrestore(&tlb_inval-
> > > > > > > > >pending_lock,
> > > > > > > > flags);
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  static const char *
> > > > > > > > -inval_fence_get_driver_name(struct dma_fence
> > > > > > > > *dma_fence)
> > > > > > > > +xe_inval_fence_get_driver_name(struct dma_fence
> > > > > > > > *dma_fence)
> > > > > > > >  {
> > > > > > > >         return "xe";
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  static const char *
> > > > > > > > -inval_fence_get_timeline_name(struct dma_fence
> > > > > > > > *dma_fence)
> > > > > > > > +xe_inval_fence_get_timeline_name(struct dma_fence
> > > > > > > > *dma_fence)
> > > > > > > >  {
> > > > > > > > -       return "inval_fence";
> > > > > > > > +       return "tlb_inval_fence";
> > > > > > > >  }
> > > > > > > >  
> > > > > > > >  static const struct dma_fence_ops inval_fence_ops = {
> > > > > > > > -       .get_driver_name = inval_fence_get_driver_name,
> > > > > > > > -       .get_timeline_name =
> > > > > > > > inval_fence_get_timeline_name,
> > > > > > > > +       .get_driver_name =
> > > > > > > > xe_inval_fence_get_driver_name,
> > > > > > > > +       .get_timeline_name =
> > > > > > > > xe_inval_fence_get_timeline_name,
> > > > > > > >  };
> > > > > > > >  
> > > > > > > >  /**
> > > > > > > > - * xe_tlb_inval_fence_init - Initialize TLB
> > > > > > > > invalidation
> > > > > > > > fence
> > > > > > > > + * xe_tlb_inval_fence_init() - Initialize TLB
> > > > > > > > invalidation
> > > > > > > > fence
> > > > > > > >   * @tlb_inval: TLB invalidation client
> > > > > > > >   * @fence: TLB invalidation fence to initialize
> > > > > > > >   * @stack: fence is stack variable
> > > > > > > > @@ -618,15 +396,12 @@ void
> > > > > > > > xe_tlb_inval_fence_init(struct
> > > > > > > > xe_tlb_inval *tlb_inval,
> > > > > > > >                              struct xe_tlb_inval_fence
> > > > > > > > *fence,
> > > > > > > >                              bool stack)
> > > > > > > >  {
> > > > > > > > -       struct xe_gt *gt = tlb_inval->private;
> > > > > > > > -
> > > > > > > > -       xe_pm_runtime_get_noresume(gt_to_xe(gt));
> > > > > > > > +       xe_pm_runtime_get_noresume(tlb_inval->xe);
> > > > > > > >  
> > > > > > > > -       spin_lock_irq(&gt->tlb_inval.lock);
> > > > > > > > -       dma_fence_init(&fence->base, &inval_fence_ops,
> > > > > > > > -                      &gt->tlb_inval.lock,
> > > > > > > > +       spin_lock_irq(&tlb_inval->lock);
> > > > > > > > +       dma_fence_init(&fence->base, &inval_fence_ops,
> > > > > > > > &tlb_inval->lock,
> > > > > > > >                        dma_fence_context_alloc(1), 1);
> > > > > > > > -       spin_unlock_irq(&gt->tlb_inval.lock);
> > > > > > > > +       spin_unlock_irq(&tlb_inval->lock);
> > > > > > > 
> > > > > > > While here, 'fence_lock' is probably a better name.
> > > > > > > 
> > > > > > > Matt
> > > > > > > 
> > > > > > > >         INIT_LIST_HEAD(&fence->link);
> > > > > > > >         if (stack)
> > > > > > > >                 set_bit(FENCE_STACK_BIT, &fence-
> > > > > > > > > base.flags);
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > > > > > > > b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > > > > > > > index 7adee3f8c551..cdeafc8d4391 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_tlb_inval.h
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h
> > > > > > > > @@ -18,24 +18,30 @@ struct xe_vma;
> > > > > > > >  int xe_gt_tlb_inval_init_early(struct xe_gt *gt);
> > > > > > > >  
> > > > > > > >  void xe_tlb_inval_reset(struct xe_tlb_inval
> > > > > > > > *tlb_inval);
> > > > > > > > -int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> > > > > > > > -void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval,
> > > > > > > > struct
> > > > > > > > xe_vm *vm);
> > > > > > > >  int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval,
> > > > > > > >                      struct xe_tlb_inval_fence *fence);
> > > > > > > > +int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval);
> > > > > > > > +void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval,
> > > > > > > > struct
> > > > > > > > xe_vm *vm);
> > > > > > > >  int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval,
> > > > > > > >                        struct xe_tlb_inval_fence
> > > > > > > > *fence,
> > > > > > > >                        u64 start, u64 end, u32 asid);
> > > > > > > > -int xe_guc_tlb_inval_done_handler(struct xe_guc *guc,
> > > > > > > > u32
> > > > > > > > *msg,
> > > > > > > > u32 len);
> > > > > > > >  
> > > > > > > >  void xe_tlb_inval_fence_init(struct xe_tlb_inval
> > > > > > > > *tlb_inval,
> > > > > > > >                              struct xe_tlb_inval_fence
> > > > > > > > *fence,
> > > > > > > >                              bool stack);
> > > > > > > > -void xe_tlb_inval_fence_signal(struct
> > > > > > > > xe_tlb_inval_fence
> > > > > > > > *fence);
> > > > > > > >  
> > > > > > > > +/**
> > > > > > > > + * xe_tlb_inval_fence_wait() - TLB invalidation fence wait
> > > > > > > > + * @fence: TLB invalidation fence to wait on
> > > > > > > > + *
> > > > > > > > + * Wait on a TLB invalidation fence until it signals, non-interruptible
> > > > > > > > + */
> > > > > > > >  static inline void
> > > > > > > >  xe_tlb_inval_fence_wait(struct xe_tlb_inval_fence
> > > > > > > > *fence)
> > > > > > > >  {
> > > > > > > >         dma_fence_wait(&fence->base, false);
> > > > > > > >  }
> > > > > > > >  
> > > > > > > > +void xe_tlb_inval_done_handler(struct xe_tlb_inval
> > > > > > > > *tlb_inval,
> > > > > > > > int seqno);
> > > > > > > > +
> > > > > > > >  #endif /* _XE_TLB_INVAL_ */
> > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > > > > > > > b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > > > > > > > index 05b6adc929bb..c1ad96d24fc8 100644
> > > > > > > > --- a/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > > > > > > > +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h
> > > > > > > > @@ -9,10 +9,85 @@
> > > > > > > >  #include <linux/workqueue.h>
> > > > > > > >  #include <linux/dma-fence.h>
> > > > > > > >  
> > > > > > > > -/** struct xe_tlb_inval - TLB invalidation client */
> > > > > > > > +struct xe_tlb_inval;
> > > > > > > > +
> > > > > > > > +/** struct xe_tlb_inval_ops - TLB invalidation ops
> > > > > > > > (backend)
> > > > > > > > */
> > > > > > > > +struct xe_tlb_inval_ops {
> > > > > > > > +       /**
> > > > > > > > +        * @all: Invalidate all TLBs
> > > > > > > > +        * @tlb_inval: TLB invalidation client
> > > > > > > > +        * @seqno: Seqno of TLB invalidation
> > > > > > > > +        *
> > > > > > > > +        * Return 0 on success, -ECANCELED if backend
> > > > > > > > is
> > > > > > > > mid-
> > > > > > > > reset, error on
> > > > > > > > +        * failure
> > > > > > > > +        */
> > > > > > > > +       int (*all)(struct xe_tlb_inval *tlb_inval, u32
> > > > > > > > seqno);
> > > > > > > > +
> > > > > > > > +       /**
> > > > > > > > +        * @ggtt: Invalidate global translation TLBs
> > > > > > > > +        * @tlb_inval: TLB invalidation client
> > > > > > > > +        * @seqno: Seqno of TLB invalidation
> > > > > > > > +        *
> > > > > > > > +        * Return 0 on success, -ECANCELED if backend
> > > > > > > > is
> > > > > > > > mid-
> > > > > > > > reset, error on
> > > > > > > > +        * failure
> > > > > > > > +        */
> > > > > > > > +       int (*ggtt)(struct xe_tlb_inval *tlb_inval, u32
> > > > > > > > seqno);
> > > > > > > > +
> > > > > > > > +       /**
> > > > > > > > +        * @ppgtt: Invalidate per-process translation TLBs
> > > > > > > > +        * @tlb_inval: TLB invalidation client
> > > > > > > > +        * @seqno: Seqno of TLB invalidation
> > > > > > > > +        * @start: Start address
> > > > > > > > +        * @end: End address
> > > > > > > > +        * @asid: Address space ID
> > > > > > > > +        *
> > > > > > > > +        * Return 0 on success, -ECANCELED if backend
> > > > > > > > is
> > > > > > > > mid-
> > > > > > > > reset, error on
> > > > > > > > +        * failure
> > > > > > > > +        */
> > > > > > > > +       int (*ppgtt)(struct xe_tlb_inval *tlb_inval,
> > > > > > > > u32
> > > > > > > > seqno,
> > > > > > > > u64 start,
> > > > > > > > +                    u64 end, u32 asid);
> > > > > > > > +
> > > > > > > > +       /**
> > > > > > > > +        * @initialized: Backend is initialized
> > > > > > > > +        * @tlb_inval: TLB invalidation client
> > > > > > > > +        *
> > > > > > > > +        * Return: True if backend is initialized, False otherwise
> > > > > > > > +        */
> > > > > > > > +       bool (*initialized)(struct xe_tlb_inval
> > > > > > > > *tlb_inval);
> > > > > > > > +
> > > > > > > > +       /**
> > > > > > > > +        * @flush: Flush pending TLB invalidations
> > > > > > > > +        * @tlb_inval: TLB invalidation client
> > > > > > > > +        */
> > > > > > > > +       void (*flush)(struct xe_tlb_inval *tlb_inval);
> > > > > > > > +
> > > > > > > > +       /**
> > > > > > > > +        * @timeout_delay: Timeout delay for TLB
> > > > > > > > invalidation
> > > > > > > > +        * @tlb_inval: TLB invalidation client
> > > > > > > > +        *
> > > > > > > > +        * Return: Timeout delay for TLB invalidation
> > > > > > > > in
> > > > > > > > jiffies
> > > > > > > > +        */
> > > > > > > > +       long (*timeout_delay)(struct xe_tlb_inval
> > > > > > > > *tlb_inval);
> > > > > > > > +
> > > > > > > > +       /**
> > > > > > > > +        * @lock: Lock resources protecting the backend
> > > > > > > > seqno
> > > > > > > > management
> > > > > > > > +        */
> > > > > > > > +       void (*lock)(struct xe_tlb_inval *tlb_inval);
> > > > > > > > +
> > > > > > > > +       /**
> > > > > > > > +        * @unlock: Lock resources protecting the
> > > > > > > > backend
> > > > > > > > seqno
> > > > > > > > management
> > > > > > > > +        */
> > > > > > > > +       void (*unlock)(struct xe_tlb_inval *tlb_inval);
> > > > > > > > +};
> > > > > > > > +
> > > > > > > > +/** struct xe_tlb_inval - TLB invalidation client
> > > > > > > > (frontend)
> > > > > > > > */
> > > > > > > >  struct xe_tlb_inval {
> > > > > > > >         /** @private: Backend private pointer */
> > > > > > > >         void *private;
> > > > > > > > +       /** @xe: Pointer to Xe device */
> > > > > > > > +       struct xe_device *xe;
> > > > > > > > +       /** @ops: TLB invalidation ops */
> > > > > > > > +       const struct xe_tlb_inval_ops *ops;
> > > > > > > >         /** @tlb_inval.seqno: TLB invalidation seqno,
> > > > > > > > protected
> > > > > > > > by CT lock */
> > > > > > > >  #define TLB_INVALIDATION_SEQNO_MAX     0x100000
> > > > > > > >         int seqno;
> > > > > > > > -- 
> > > > > > > > 2.34.1
> > > > > > > > 
> > > > > 
> > > 
> 


