[PATCH 03/11] drm/xe: Implement xe_pagefault_reset

Summers, Stuart stuart.summers at intel.com
Thu Aug 7 18:29:17 UTC 2025


On Wed, 2025-08-06 at 17:12 -0700, Matthew Brost wrote:
> On Wed, Aug 06, 2025 at 05:16:26PM -0600, Summers, Stuart wrote:
> > On Tue, 2025-08-05 at 23:22 -0700, Matthew Brost wrote:
> > > Squash any pending faults on the GT being reset by setting the GT
> > > field
> > > in struct xe_pagefault to NULL.
> > > 
> > > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_gt.c        |  2 ++
> > >  drivers/gpu/drm/xe/xe_pagefault.c | 23 ++++++++++++++++++++++-
> > >  2 files changed, 24 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > > index 390394bbaadc..5aa03f89a062 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > > @@ -50,6 +50,7 @@
> > >  #include "xe_map.h"
> > >  #include "xe_migrate.h"
> > >  #include "xe_mmio.h"
> > > +#include "xe_pagefault.h"
> > >  #include "xe_pat.h"
> > >  #include "xe_pm.h"
> > >  #include "xe_mocs.h"
> > > @@ -846,6 +847,7 @@ static int gt_reset(struct xe_gt *gt)
> > >  
> > >         xe_uc_gucrc_disable(&gt->uc);
> > >         xe_uc_stop_prepare(&gt->uc);
> > > +       xe_pagefault_reset(gt_to_xe(gt), gt);
> > 
> > Can we just pass the GT in here and then extrapolate xe from there?
> > I
> > realize you're thinking of dropping the GT piece, but maybe we can
> > change the parameters around at that time. Just feels weird passing
> > these both in at this point.
> > 
> 
> I think the general style is that a layer's exported functions accept
> the layer's main argument first. Of course we don't actually do that
> everywhere in Xe. Here the main argument for the layer is 'xe'.
> 
> Can drop if you prefer, I don't really have strong opinion.

Yeah ok makes sense. Let's leave it the way you have it here.

>  
> 
> > >         xe_gt_pagefault_reset(gt);
> > >  
> > >         xe_uc_stop(&gt->uc);
> > > diff --git a/drivers/gpu/drm/xe/xe_pagefault.c b/drivers/gpu/drm/xe/xe_pagefault.c
> > > index 14304c41eb23..aef389e51612 100644
> > > --- a/drivers/gpu/drm/xe/xe_pagefault.c
> > > +++ b/drivers/gpu/drm/xe/xe_pagefault.c
> > > @@ -122,6 +122,24 @@ int xe_pagefault_init(struct xe_device *xe)
> > >         return err;
> > >  }
> > >  
> > > +static void xe_pagefault_queue_reset(struct xe_device *xe, struct xe_gt *gt,
> > > +                                    struct xe_pagefault_queue *pf_queue)
> > > +{
> > > +       u32 i;
> > > +
> > > +       /* Squash all pending faults on the GT */
> > > +
> > > +       spin_lock_irq(&pf_queue->lock);
> > > +       for (i = pf_queue->tail; i != pf_queue->head;
> > > +            i = (i + xe_pagefault_entry_size()) % pf_queue->size) {
> > 
> > Should we add a check in here that pf_queue->head is some multiple
> > of
> > xe_pagefault_entry_size and pf_queue->size is aligned to
> > xe_pagefault_entry_size()?
> > 
> 
> We can add some asserts. We would still get an infinite loop, but at
> least in CI we would quickly catch a bug we introduced somewhere.

Yeah asserts is all I was thinking. Just to give us a way to detect
regressions in this area easily.

> 
> > > +               struct xe_pagefault *pf = pf_queue->data + i;
> > > +
> > > +               if (pf->gt == gt)
> > > +                       pf->gt = NULL;
> > 
> > Not sure I fully get the intent here... so we loop back around from
> > TAIL to HEAD and clear all of the GTs in pf_queue->data for each
> > one?
> > Is the expectation that each entry in the pf_queue has the same GT
> > or
> > is NULL? And then setting to NULL is a way we can abstract out the
> > GT?
> > 
> 
> This patch [1] moves the page fault queue from the GT to the device,
> as having a thread pool per GT makes little sense - we want our
> thread pool to be per device, with enough threads to hit CPU<->GPU
> peak bus bandwidth in what we'd expect to be the common prefetch /
> page fault cases (e.g., 2M SVM migrations).
> 
> When the fault queues were per GT, resets were easy: just reset all
> the queues on the GT. But now, since they are shared on the device
> and individual GTs can reset, we squash all faults on the GT by
> setting the GT to NULL. This patch [2] ignores any fault with a NULL
> GT in the function xe_pagefault_queue_work.

Ok, thanks for the explanation here. I am also still working through
those later patches, so this is just an initial comment. Let me read
through those others in detail and I'll get back if I still have
questions there.

Thanks,
Stuart

> 
> Matt
> 
> [1]
> https://patchwork.freedesktop.org/patch/667318/?series=152565&rev=1
> [2]
> https://patchwork.freedesktop.org/patch/667323/?series=152565&rev=1
> 
> > Still getting through the series, so apologies if this is also
> > answered later in the series...
> > 
> > Thanks,
> > Stuart
> > 
> > > +       }
> > > +       spin_unlock_irq(&pf_queue->lock);
> > > +}
> > > +
> > >  /**
> > >   * xe_pagefault_reset() - Page fault reset for a GT
> > >   * @xe: xe device instance
> > > @@ -132,7 +150,10 @@ int xe_pagefault_init(struct xe_device *xe)
> > >   */
> > >  void xe_pagefault_reset(struct xe_device *xe, struct xe_gt *gt)
> > >  {
> > > -       /* TODO - implement */
> > > +       int i;
> > > +
> > > +       for (i = 0; i < XE_PAGEFAULT_QUEUE_COUNT; ++i)
> > > +               xe_pagefault_queue_reset(xe, gt, xe->usm.pf_queue + i);
> > >  }
> > >  
> > >  /**
> > 
