[Intel-xe] [PATCH v3 1/3] drm/xe: Store xe_he_engine in xe_hw_engine_snapshot

Matt Roper matthew.d.roper at intel.com
Thu Oct 12 17:30:36 UTC 2023


On Thu, Oct 12, 2023 at 10:20:33AM -0700, Souza, Jose wrote:
> On Thu, 2023-10-12 at 10:16 -0700, Matt Roper wrote:
> > On Tue, Oct 10, 2023 at 11:41:49AM -0700, José Roberto de Souza wrote:
> > > A future patch will require gt and xe device structs, so here
> > > replacing class by hwe.
> > > 
> > > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > Cc: Matt Roper <matthew.d.roper at intel.com>
> > > Signed-off-by: José Roberto de Souza <jose.souza at intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_hw_engine.c       | 6 +++---
> > >  drivers/gpu/drm/xe/xe_hw_engine_types.h | 4 ++--
> > >  2 files changed, 5 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > index b5b0845908887..37b3ee1268090 100644
> > > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > @@ -684,7 +684,7 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe)
> > >  	if (snapshot->name)
> > >  		strscpy(snapshot->name, hwe->name, len);
> > >  
> > > -	snapshot->class = hwe->class;
> > > +	snapshot->hwe = hwe;
> > 
> > I haven't really looked into the devcoredump stuff much, but it's not
> > clear to me from a really quick skim whether it's safe to save away the
> > hwe like this.  Is there ever a case where the snapshot can persist
> > after the engine itself has been destroyed (e.g., if we capture an error
> > encountered during a device probe that results in the device being torn
> > back down)?  If so, then you'd probably need to make sure that the hwe
> > field here only gets used during the capture, and the 'print' routine
> > would still need a cached snapshot->class to avoid use-after-free
> > errors.
> 
> Xe devcoredump dump is also removed with Xe module, so no problems with this.

With the module?  Or the device?  Even if probe fails on a single
device, the module may remain loaded (and might even still be driving
other GPUs).  If the dumps for failed device(s) are still accessible,
there might be a problem?


Matt

> 
> > 
> > Rodrigo probably has better insight as to whether that's something that
> > we need to worry about.
> > 
> > 
> > Matt
> > 
> > >  	snapshot->logical_instance = hwe->logical_instance;
> > >  	snapshot->forcewake.domain = hwe->domain;
> > >  	snapshot->forcewake.ref = xe_force_wake_ref(gt_to_fw(hwe->gt),
> > > @@ -731,7 +731,7 @@ xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe)
> > >  		hw_engine_mmio_read32(hwe, RING_DMA_FADD(0));
> > >  	snapshot->reg.ipehr = hw_engine_mmio_read32(hwe, RING_IPEHR(0));
> > >  
> > > -	if (snapshot->class == XE_ENGINE_CLASS_COMPUTE)
> > > +	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
> > >  		snapshot->reg.rcu_mode = xe_mmio_read32(hwe->gt, RCU_MODE);
> > >  
> > >  	return snapshot;
> > > @@ -786,7 +786,7 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
> > >  		   snapshot->reg.ring_dma_fadd_udw,
> > >  		   snapshot->reg.ring_dma_fadd);
> > >  	drm_printf(p, "\tIPEHR: 0x%08x\n\n", snapshot->reg.ipehr);
> > > -	if (snapshot->class == XE_ENGINE_CLASS_COMPUTE)
> > > +	if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
> > >  		drm_printf(p, "\tRCU_MODE: 0x%08x\n",
> > >  			   snapshot->reg.rcu_mode);
> > >  }
> > > diff --git a/drivers/gpu/drm/xe/xe_hw_engine_types.h b/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > > index 5d4ee29042407..71c0a0169e62a 100644
> > > --- a/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_hw_engine_types.h
> > > @@ -156,8 +156,8 @@ struct xe_hw_engine {
> > >  struct xe_hw_engine_snapshot {
> > >  	/** @name: name of the hw engine */
> > >  	char *name;
> > > -	/** @class: class of this hw engine */
> > > -	enum xe_engine_class class;
> > > +	/** @hwe: hw engine */
> > > +	struct xe_hw_engine *hwe;
> > >  	/** @logical_instance: logical instance of this hw engine */
> > >  	u16 logical_instance;
> > >  	/** @forcewake: Force Wake information snapshot */
> > > -- 
> > > 2.42.0
> > > 
> > 
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


More information about the Intel-xe mailing list