[PATCH] drm/xe/xe2: Enable Priority Mem Read

Mishra, Pallavi pallavi.mishra at intel.com
Wed May 29 20:37:06 UTC 2024



> -----Original Message-----
> From: Roper, Matthew D <matthew.d.roper at intel.com>
> Sent: Wednesday, May 29, 2024 9:45 AM
> To: Mishra, Pallavi <pallavi.mishra at intel.com>
> Cc: intel-xe at lists.freedesktop.org
> Subject: Re: [PATCH] drm/xe/xe2: Enable Priority Mem Read
> 
> On Wed, May 29, 2024 at 09:00:42AM -0700, Mishra, Pallavi wrote:
> >
> >
> > > -----Original Message-----
> > > From: Roper, Matthew D <matthew.d.roper at intel.com>
> > > Sent: Tuesday, May 28, 2024 9:35 AM
> > > To: Mishra, Pallavi <pallavi.mishra at intel.com>
> > > Cc: intel-xe at lists.freedesktop.org
> > > Subject: Re: [PATCH] drm/xe/xe2: Enable Priority Mem Read
> > >
> > > On Sat, May 25, 2024 at 06:19:33AM +0530, Pallavi Mishra wrote:
> > > > Enable feature to allow memory reads to take a priority memory path.
> > > > This will reduce latency on the read path, but may introduce RAW
> > > > hazards.
> > >
> > > You might want to elaborate on what "may introduce RAW hazards"
> means.
> > > I.e., it's worth noting that both kernel *and* userspace-submitted
> > > batch buffers may now be subject to read-after-write hazards if they
> > > don't use MI_MEM_FENCE in cases where there are assumptions about
> > > operations happening in-order, but where a read could potentially
> > > leapfrog a preceding write now that we've enabled this.
> > >
> > > Presumably since there aren't any other changes in this patch you've
> > > already confirmed that there's nowhere in the KMD itself that needs
> > > extra fences added to avoid read-after-write hazards...you might
> > > want to mention that in the commit message too.
> > >
> > > This could expose previously-harmless bugs in userspace drivers
> > > and/or IGT, so it will be important to make sure we land this before
> > > lifting force_probe on these platforms, otherwise the appearance of
> > > those bugs would be seen as a KMD regression, even though the bugs
> themselves live in userspace.
> > >
> > > > Bspec: 45845, 60237, 60187, 60188
> > >
> > > 45845 is a page for pre-Xe2 platforms, so I don't think that's the
> > > one you want here.  I think you want 60298 for the Xe2 equivalent
> > > page that documents the register bit.
> > >
> > > >
> > > > Signed-off-by: Pallavi Mishra <pallavi.mishra at intel.com>
> > > > ---
> > > >  drivers/gpu/drm/xe/regs/xe_engine_regs.h | 1 +
> > > >  drivers/gpu/drm/xe/xe_hw_engine.c        | 7 +++++++
> > > >  2 files changed, 8 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
> > > > b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
> > > > index 263ffc7bc2ef..4e8f9a61f0bf 100644
> > > > --- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
> > > > +++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
> > > > @@ -104,6 +104,7 @@
> > > >  #define CSFE_CHICKEN1(base)			XE_REG((base) +
> > > 0xd4, XE_REG_OPTION_MASKED)
> > > >  #define   GHWSP_CSB_REPORT_DIS			REG_BIT(15)
> > > >  #define   PPHWSP_CSB_AND_TIMESTAMP_REPORT_DIS	REG_BIT(14)
> > > > +#define   CS_PRIORITY_MEM_READ			REG_BIT(7)
> > > >
> > > >  #define FF_SLICE_CS_CHICKEN1(base)		XE_REG((base) +
> 0xe0,
> > > XE_REG_OPTION_MASKED)
> > > >  #define   FFSC_PERCTX_PREEMPT_CTRL		REG_BIT(14)
> > > > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > index de1aefaa2335..34aa8b5270b0 100644
> > > > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > @@ -424,6 +424,13 @@ hw_engine_setup_default_state(struct
> > > xe_hw_engine *hwe)
> > > >  					   0xA,
> > > >
> > > XE_RTP_ACTION_FLAG(ENGINE_BASE)))
> > > >  		},
> > > > +		/* Enable Priority Mem Read */
> > > > +		{ XE_RTP_NAME("Priority_Mem_Read"),
> > > > +		  XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2000, 2004)),
> > >
> > > The current platforms that currently support this are BMG (graphics
> > > version
> > > 20.01) and LNL (20.04), so the starting point of this range should
> > > be 2001 rather than 2000.  Also, since this is a new feature rather
> > > than a workaround, the expectation is that it will (probably)
> > > continue to exist on future platforms as well, so it's best to just
> > > use XE_RTP_END_VERSION_UNDEFINED as the end point so that we don't
> > > need to keep coming back here and updating this for every new platform in
> the future.
> > >
> > > Also, RTP rules using GRAPHICS_VERSION[_RANGE] only get applied on
> > > the primary GT.  If this feature is also available on the media
> > > engines (I think it is since I don't see anything in the bspec
> > > implying it wouldn't be supported), then we'll also want a second
> > > table entry for those (i.e., pretty much a copy of what you have
> > > here, but using a MEDIA_VERSION_RANGE(1301,
> > > XE_RTP_END_VERSION_UNDEFINED) to capture the media IP.
> >
> > Thanks for the review.
> >
> > Just to update you, I see below failure
> > https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-134038v1/bat-lnl-1/igt
> > @xe_module_load at load.html
> 
> The key error there is:
> 
>     *ERROR* GT0: illegal register in save/restore workaround list
> 
> Looking at the other dmesg output, I think the problem is that we're somehow
> adding the RCS engine's instance of CSFE_CHICKEN1 to each engine's save-
> restore list rather than the engine-appropriate instance of the register:
> 
>   <7> [305.651033] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> Applying rcs0 save-restore MMIOs
>   <7> [305.651079] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x20c4] = 0x3f7e0306
>   <7> [305.651121] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x20d4] = 0xc080c080
>   <7> [305.651164] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0xe48c] = 0x02000200
>   <7> [305.651202] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0xe49c] = 0x40004000
>   <7> [305.651239] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0xe4c4] = 0x08000800
>   <7> [305.651276] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0xe4f0] = 0x00020002
>   <7> [305.651315] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0xe7c8] = 0x00002000
>   <7> [305.651351] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0xe7cc] = 0x00009000
>   <7> [305.651692] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> Applying bcs0 save-restore MMIOs
>   <7> [305.651730] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x20d4] = 0x00800080
>   <7> [305.651765] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x220c4] = 0x3f7e0306
>   <7> [305.651796] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x220d4] = 0xc080c080
>   <7> [305.652125] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> Applying bcs8 save-restore MMIOs
>   <7> [305.652164] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x20d4] = 0x00800080
>   <7> [305.652202] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x220d4] = 0x00800080
>   <7> [305.652241] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x3ee0c4] = 0x3f7e0306
>   <7> [305.652274] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x3ee0d4] = 0xc080c080
>   <7> [305.652596] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> Applying ccs0 save-restore MMIOs
>   <7> [305.652629] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x20d4] = 0x00800080
>   <7> [305.652662] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x1a0c4] = 0x3f7e0308
>   <7> [305.652691] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x1a0d4] = 0xc080c080
>   <7> [305.652723] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x220d4] = 0x00800080
>   <7> [305.652757] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0:
> REG[0x3ee0d4] = 0x00800080
> 
> I.e., we're seeing 0x20d4 in the bcs0, bcs8, and ccs0 lists for some reason,
> which the GuC firmware (properly) flags as invalid.
> 
> I *think* that might be because you accidentally put the
> "XE_RTP_ACTION_FLAG(ENGINE_BASE)" outside the SET() rather than inside
> it.  The one downside of our magic macros to created nested RTP structures is
> that the compiler doesn't notice when we accidentally misplace the
> parentheses, and it can lead to really strange behavior.

Yes that indeed was the problem.
> 
> 
> Matt
> 
> >
> > Working on resolving this now. Will update once this is fixed and
> > patch with comments incorporated is ready for review.
> >
> > >
> > >
> > > Matt
> > >
> > > > +		  XE_RTP_ACTIONS(SET(CSFE_CHICKEN1(0),
> > > > +				     CS_PRIORITY_MEM_READ)),
> > > > +		  XE_RTP_ACTION_FLAG(ENGINE_BASE),
> > > > +		},
> > > >  		{}
> > > >  	};
> > > >
> > > > --
> > > > 2.25.1
> > > >
> > >
> > > --
> > > Matt Roper
> > > Graphics Software Engineer
> > > Linux GPU Platform Enablement
> > > Intel Corporation
> 
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation


More information about the Intel-xe mailing list