[PATCH] drm/xe: Use separate rpm lockdep map for non-d3cold-capable devices

Thomas Hellström thomas.hellstrom at linux.intel.com
Fri Aug 23 15:15:07 UTC 2024


Hi, Rodrigo.

On Fri, 2024-08-23 at 10:43 -0400, Rodrigo Vivi wrote:
> On Fri, Aug 23, 2024 at 03:59:06PM +0200, Thomas Hellström wrote:
> 
> Hi Thomas, first of all, please note that you used the wrong address
> for our mailing list ;)

You're right. Well, it's Friday afternoon here...

> 
> but a few more comments below...
> 
> > For non-d3cold-capable devices we'd like to be able to wake up the
> > device from reclaim. In particular, for Lunar Lake we'd like to be
> > able to blit CCS metadata to system memory at shrink time; at least
> > from kswapd, where it's reasonably OK to wait for rpm resume and a
> > preceding rpm suspend.
> > 
> > Therefore use a separate lockdep map for such devices and prime it
> > reclaim-tainted.
> 
> Cc: Matthew Auld <matthew.auld at intel.com>
> 
> > 
> > Cc: "Vivi, Rodrigo" <rodrigo.vivi at intel.com>
> > Signed-off-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_pm.c | 45 +++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 37 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index 9f3c14fd9f33..2e9fdb5da8bb 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -70,11 +70,29 @@
> >   */
> >  
> >  #ifdef CONFIG_LOCKDEP
> > -static struct lockdep_map xe_pm_runtime_lockdep_map = {
> > -	.name = "xe_pm_runtime_lockdep_map"
> > +static struct lockdep_map xe_pm_runtime_d3cold_map = {
> > +	.name = "xe_rpm_d3cold_map"
> > +};
> > +
> > +static struct lockdep_map xe_pm_runtime_nod3cold_map = {
> > +	.name = "xe_rpm_nod3cold_map"
> >  };
> >  #endif
> >  
> > +static void xe_pm_runtime_acquire(const struct xe_device *xe)
> 
> I believe this should have a different name.
> runtime_acquire and runtime_release sound like runtime_get and
> runtime_put...
> 
> since they are static, perhaps we should just call them
> xe_rpm_lockmap_acquire
> and xe_rpm_lockmap_release

Sure, I'll fix.
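
Roughly like this, then, with your suggested naming (untested; these are
just the two helpers from this patch, renamed):

static void xe_rpm_lockmap_acquire(const struct xe_device *xe)
{
	/* Pick the lockdep map matching the device's D3Cold capability */
	lock_map_acquire(xe->d3cold.capable ?
			 &xe_pm_runtime_d3cold_map :
			 &xe_pm_runtime_nod3cold_map);
}

static void xe_rpm_lockmap_release(const struct xe_device *xe)
{
	lock_map_release(xe->d3cold.capable ?
			 &xe_pm_runtime_d3cold_map :
			 &xe_pm_runtime_nod3cold_map);
}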

> 
> or
> d3_lockmap_acquire
> d3_lockmap_release ?
> 
> or something like that...
> 
> > +{
> > +	lock_map_acquire(xe->d3cold.capable ?
> > +			 &xe_pm_runtime_d3cold_map :
> > +			 &xe_pm_runtime_nod3cold_map);
> > +}
> > +
> > +static void xe_pm_runtime_release(const struct xe_device *xe)
> > +{
> > +	lock_map_release(xe->d3cold.capable ?
> > +			 &xe_pm_runtime_d3cold_map :
> > +			 &xe_pm_runtime_nod3cold_map);
> > +}
> > +
> >  /**
> >   * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
> >   * @xe: xe device instance
> > @@ -354,7 +372,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> >  	 * annotation here and in xe_pm_runtime_get() lockdep will see
> >  	 * the potential lock inversion and give us a nice splat.
> >  	 */
> > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > +	xe_pm_runtime_acquire(xe);
> >  
> >  	/*
> >  	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
> > @@ -386,7 +404,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> >  out:
> >  	if (err)
> >  		xe_display_pm_resume(xe, true);
> > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> > +	xe_pm_runtime_release(xe);
> >  	xe_pm_write_callback_task(xe, NULL);
> >  	return err;
> >  }
> > @@ -407,7 +425,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> >  	/* Disable access_ongoing asserts and prevent recursive pm calls */
> >  	xe_pm_write_callback_task(xe, current);
> >  
> > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > +	xe_pm_runtime_acquire(xe);
> >  
> >  	if (xe->d3cold.allowed) {
> >  		err = xe_pcode_ready(xe, true);
> > @@ -437,7 +455,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> >  			goto out;
> >  	}
> >  out:
> > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> > +	xe_pm_runtime_release(xe);
> >  	xe_pm_write_callback_task(xe, NULL);
> >  	return err;
> >  }
> > @@ -458,8 +476,19 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> >   */
> >  static void pm_runtime_lockdep_prime(void)
> >  {
> > -	lock_map_acquire(&xe_pm_runtime_lockdep_map);
> > -	lock_map_release(&xe_pm_runtime_lockdep_map);
> > +	struct dma_resv lockdep_resv;
> > +
> > +	dma_resv_init(&lockdep_resv);
> > +	lock_map_acquire(&xe_pm_runtime_d3cold_map);
> > +	/* D3Cold takes the dma_resv locks to evict bos */
> > +	dma_resv_lock(&lockdep_resv, NULL);
> > +	fs_reclaim_acquire(GFP_KERNEL);
> > +	/* Shrinkers like to wake up the device under reclaim. */
> > +	lock_map_acquire(&xe_pm_runtime_nod3cold_map);
> > +	lock_map_release(&xe_pm_runtime_nod3cold_map);
> > +	fs_reclaim_release(GFP_KERNEL);
> > +	dma_resv_unlock(&lockdep_resv);
> > +	lock_map_release(&xe_pm_runtime_d3cold_map);
> 
> do we really need this entire sequence? Or could checking for d3cold
> capability here give us two different, smaller sequences?

Hm. I forgot to check from where this function is called. I was
assuming it was called only once per driver instance, in which case we
need the full sequence, since we can have multiple devices with
different capabilities. However, it seems to be called on each
runtime_get(). Is that intentional?
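
If the per-get priming is intentional, then I agree we could check the
capability and use two smaller sequences, roughly like this (untested
sketch; the name xe_rpm_lockmap_prime is just for illustration, and the
annotations mirror what the combined sequence in this patch primes):

static void xe_rpm_lockmap_prime(const struct xe_device *xe)
{
	if (xe->d3cold.capable) {
		struct dma_resv lockdep_resv;

		dma_resv_init(&lockdep_resv);
		lock_map_acquire(&xe_pm_runtime_d3cold_map);
		/* D3Cold takes the dma_resv locks to evict bos */
		dma_resv_lock(&lockdep_resv, NULL);
		/* Keep the d3cold -> dma_resv -> reclaim ordering */
		fs_reclaim_acquire(GFP_KERNEL);
		fs_reclaim_release(GFP_KERNEL);
		dma_resv_unlock(&lockdep_resv);
		lock_map_release(&xe_pm_runtime_d3cold_map);
		dma_resv_fini(&lockdep_resv);
	} else {
		fs_reclaim_acquire(GFP_KERNEL);
		/* Shrinkers like to wake up the device under reclaim. */
		lock_map_acquire(&xe_pm_runtime_nod3cold_map);
		lock_map_release(&xe_pm_runtime_nod3cold_map);
		fs_reclaim_release(GFP_KERNEL);
	}
}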

/Thomas


> 
> >  }
> >  
> >  /**
> > -- 
> > 2.44.0
> > 


