[Intel-xe] [PATCH] drm/xe: Implement HW workaround 14016763929

Zeng, Oak oak.zeng at intel.com
Thu Aug 3 19:53:37 UTC 2023




> -----Original Message-----
> From: Souza, Jose <jose.souza at intel.com>
> Sent: August 3, 2023 3:10 PM
> To: intel-xe at lists.freedesktop.org; Zeng, Oak <oak.zeng at intel.com>
> Cc: Roper, Matthew D <matthew.d.roper at intel.com>; De Marchi, Lucas
> <lucas.demarchi at intel.com>
> Subject: Re: [Intel-xe] [PATCH] drm/xe: Implement HW workaround 14016763929
> 
> On Thu, 2023-08-03 at 15:01 -0400, Oak Zeng wrote:
> > To workaround a HW bug on DG2, driver is required to map the whole
> > ppgtt virtual address space before GPU workload submission. Thus
> > set the XE_VM_FLAG_SCRATCH_PAGE flag during vm create so the whole
> > address space is mapped to point to scratch page.
> >
> > Signed-off-by: Oak Zeng <oak.zeng at intel.com>
> > ---
> >  drivers/gpu/drm/xe/Makefile        | 2 +-
> >  drivers/gpu/drm/xe/xe_vm.c         | 5 +++++
> >  drivers/gpu/drm/xe/xe_wa_oob.rules | 1 +
> >  3 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index e79624ab2cb3..e6a05d35c931 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -37,7 +37,7 @@ quiet_cmd_wa_oob = GEN     $(notdir $(generated_oob))
> >  $(generated_oob) &: $(obj)/xe_gen_wa_oob $(srctree)/$(src)/xe_wa_oob.rules
> >  	$(call cmd,wa_oob)
> >
> > -$(obj)/xe_guc.o $(obj)/xe_wa.o $(obj)/xe_ring_ops.o: $(generated_oob)
> > +$(obj)/xe_guc.o $(obj)/xe_wa.o $(obj)/xe_ring_ops.o $(obj)/xe_vm.o: $(generated_oob)
> >
> >  # Please keep these build lists sorted!
> >
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index cb28dbc2bdbb..b045bb341a5c 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -29,6 +29,8 @@
> >  #include "xe_res_cursor.h"
> >  #include "xe_sync.h"
> >  #include "xe_trace.h"
> > +#include "generated/xe_wa_oob.h"
> > +#include "xe_wa.h"
> >
> >  #define TEST_VM_ASYNC_OPS_ERROR
> >
> > @@ -1235,6 +1237,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> >
> >  	INIT_LIST_HEAD(&vm->extobj.list);
> >
> > +	if (XE_WA(xe_root_mmio_gt(xe), 14016763929))
> > +		flags |= XE_VM_FLAG_SCRATCH_PAGE;
> 
> Isn't DRM_XE_VM_CREATE_FAULT_MODE incompatible with SCRATCH_PAGE?
> What if user-space asks for fault mode in DG2?

Very good point. When I first got this request (to implement the workaround the way it is shown in this patch), I thought it would be better implemented in the UMD, since the KMD already exposes those flags to user space. However, other experts have discussed this; see the comment here: https://jira.devtools.intel.com/browse/VLK-49991?focusedCommentId=21055607&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-21055607

Let me follow up with them to see whether we can implement this workaround in the UMD, so that we avoid this incompatible configuration and do not silently change the user's settings.

Thanks,
Oak
> 
> 
> > +
> >  	if (!(flags & XE_VM_FLAG_MIGRATION))
> >  		xe_device_mem_access_get(xe);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
> > index 15c23813398a..6fc68c592ca8 100644
> > --- a/drivers/gpu/drm/xe/xe_wa_oob.rules
> > +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
> > @@ -15,3 +15,4 @@
> >  18020744125	PLATFORM(PVC)
> >  1509372804	PLATFORM(PVC), GRAPHICS_STEP(A0, C0)
> >  1409600907	GRAPHICS_VERSION_RANGE(1200, 1250)
> > +14016763929	PLATFORM(DG2)


