[PATCH] drm/xe: Set PTE_AE for xe2 dgfx platforms

Thu Jan 18 16:08:12 UTC 2024

On Thu, Jan 18, 2024 at 03:42:56PM +0100, Nirmoy Das wrote:
> 
> Hi Matt,
> 
> On 1/18/2024 1:49 AM, Matt Roper wrote:
> > On Wed, Jan 17, 2024 at 03:48:51PM +0100, Nirmoy Das wrote:
> > > Atomics on Xe2 work for both types of memory, so
> > > extend setting PTE_AE to dgfx platforms as well.
> > There are no Xe2 dgfx platforms yet, so it's kind of hard to review this
> > fully,
> I should have sent it internally.
> >   but my understanding from bspec 71539 is that in theory a dGPU
> > atomic operation against system memory would probably only be atomic in
> > device scope, not global scope.  I.e., it's atomic with other GPU
> > operations,
> This is my understanding as well.
> >   assuming the CPU isn't also accessing the buffer.  But if
> > the buffer is shared between the CPU and GPU, then you'd want to set
> > AE=0 to ensure that we get a page fault and can migrate the object into
> > lmem first.
> 
> I think on system memory, AE=1 should be the default, with the UMD able
> to opt out through a uAPI. Basic operations like MI_ATOMIC will fail
> otherwise and, if I understand correctly, MI_ATOMIC should work on system
> memory except on PVC because of a known bug. I added a small test to
> check MI_ATOMIC, which works on dg2 system memory:
> 
> https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_10547/bat-dg2-oem2/igt@xe_exec_atomic@basic-dec-all@engine-drm_xe_engine_class_compute-instance-0-tile-0-system-memory.html
> 
> and also on discrete Xe2, but that requires this patch.

I think it depends what you mean by "work."  My understanding is that
barring hardware defects, MI_ATOMIC can always execute and perform the
requested operation on Xe2.  However in some situations the operation
will not actually behave atomically from the CPU's point of view.  The
AE bit in the page table doesn't make MI_ATOMIC work/not work, it just
determines whether we want MI_ATOMIC to trigger a page fault or not.  In
cases like this where a buffer can be accessed from both the CPU and
GPU, and where we need true global scope atomicity, then we'd want to
make sure AE=0 so that a page fault is generated and we get the
opportunity to migrate the buffer to LMEM.
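
To make the distinction concrete, here is a toy C model of the behavior
described above. It is only a sketch, not driver code: every `sketch_` name
is hypothetical, it covers a discrete GPU only, the scope outcomes follow
the reading of bspec 71539 discussed in this thread rather than verified
hardware behavior, and it assumes the fault handler rebinds with AE=1 after
migrating to LMEM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_scope { SKETCH_SCOPE_DEVICE, SKETCH_SCOPE_GLOBAL };

/* Hypothetical stand-in for one mapped page and its AE bit. */
struct sketch_page {
	bool pte_ae;   /* AE bit in the PTE */
	bool in_lmem;  /* true if the page lives in device-local memory */
};

struct sketch_result {
	bool faulted;             /* did the atomic raise a page fault? */
	enum sketch_scope scope;  /* atomicity the operation ends up with */
};

/* Model one MI_ATOMIC access to a page on a dGPU. */
static struct sketch_result sketch_mi_atomic(struct sketch_page *page)
{
	struct sketch_result r = { .faulted = false };

	assert(page != NULL);

	if (!page->pte_ae) {
		/*
		 * AE=0: the atomic faults instead of executing in place.
		 * Assumption: the handler migrates the page to LMEM and
		 * rebinds it with AE=1 before the operation is retried.
		 */
		r.faulted = true;
		page->in_lmem = true;
		page->pte_ae = true;
	}

	/*
	 * The operation always executes; what differs is the scope.
	 * System memory on a dGPU is modeled as device scope only.
	 */
	r.scope = page->in_lmem ? SKETCH_SCOPE_GLOBAL : SKETCH_SCOPE_DEVICE;
	return r;
}
```

In this model a system-memory page with AE=1 never faults and the atomic
runs with device scope only, while the same page with AE=0 faults once, ends
up in LMEM, and runs with global scope.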

I think the expectation of userspace would be that MI_ATOMIC is always
truly atomic.  Wouldn't it be better to make sure we trigger a page
fault by default and only turn that off if userspace explicitly tells us
they're okay with relaxing to device scope atomicity?
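
One way to read this proposal, sketched in standalone C. Nothing here is an
existing xe interface: `allow_device_scope` stands for a hypothetical
userspace opt-in, `struct sketch_bind` is an invented container for the
inputs, and `SKETCH_PTE_AE` is a placeholder bit, not the real value of
XE_USM_PPGTT_PTE_AE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PTE_AE (UINT64_C(1) << 10) /* placeholder bit position */

/* Hypothetical inputs to the decision made at bind time. */
struct sketch_bind {
	bool atomic_capable;      /* stands in for XE_VMA_ATOMIC_PTE_BIT */
	bool is_devmem;           /* backing store is device-local memory */
	bool is_dgfx;             /* discrete GPU */
	bool allow_device_scope;  /* hypothetical userspace opt-in */
};

/* Return the default PTE bits for this binding under the proposed policy. */
static uint64_t sketch_default_pte(const struct sketch_bind *b)
{
	assert(b != NULL);

	if (!b->atomic_capable)
		return 0;

	/*
	 * LMEM and integrated GPUs keep AE=1, as before the patch.
	 * dGPU system memory gets AE=1 only when userspace has said
	 * that device-scope atomicity is acceptable; otherwise AE
	 * stays clear so that the atomic faults and the buffer can
	 * be migrated to LMEM.
	 */
	if (b->is_devmem || !b->is_dgfx || b->allow_device_scope)
		return SKETCH_PTE_AE;

	return 0;
}
```

With no opt-in this matches the existing `is_devmem || !IS_DGFX(xe)` check.
The patch under review would instead be equivalent to treating
`allow_device_scope` as always true on Xe2.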

Either way, we should probably wait until there actually are Xe2
discrete GPUs; at the moment this is all theoretical since the only
Xe2 platform we have today is an iGPU.

Matt

> 
> 
> 
> Regards,
> 
> Nirmoy
> 
> > Matt
> > 
> > > Cc: Fei Yang<fei.yang at intel.com>
> > > Cc: Jose Souza<jose.souza at intel.com>
> > > Cc: Matt Roper<matthew.d.roper at intel.com>
> > > Cc: Brian Welty<brian.welty at intel.com>
> > > Signed-off-by: Nirmoy Das<nirmoy.das at intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_pt.c | 6 +++++-
> > >   1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > > index de1030a47588..3ace4b401369 100644
> > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > @@ -602,8 +602,12 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
> > >   	struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id];
> > >   	int ret;
> > > +	/**
> > > +	 * XE_USM_PPGTT_PTE_AE is available for igfx and dgfx from xe2 onwards
> > > +	 * and also for PVC but atomics only works for PVC on device memory.
> > > +	 */
> > >   	if (vma && (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) &&
> > > -	    (is_devmem || !IS_DGFX(xe)))
> > > +	    (is_devmem || GRAPHICS_VER(xe) >= 20))
> > >   		xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
> > >   	if (is_devmem) {
> > > -- 
> > > 2.42.0
> > > 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation