[Intel-gfx] [PATCH RESEND] drm/i915: Flush buffer pools on driver remove

Matt Roper matthew.d.roper at intel.com
Wed Sep 22 22:24:29 UTC 2021


On Fri, Sep 03, 2021 at 04:23:20PM +0200, Janusz Krzysztofik wrote:
> In preparation for clean driver release, attempts to drain work queues
> and release freed objects are taken at driver remove time.  However, GT
> buffer pools are now not flushed before the driver release phase.
> Since unused objects may stay there for up to one second, some may
> survive until driver release is attempted.  That can potentially
> explain sporadic then hardly reproducible issues observed at driver
> release time, like non-zero shrink counter or outstanding address space

So just to make sure I'm understanding the description here:
 - We currently do an explicit flush of the buffer pools within the call
   path of drm_driver.release(); this removes all buffers, regardless of
   their age.
 - However there may be other code that runs *earlier* within the
   drm_driver.release() call chain that expects buffer pools have
   already been flushed and are already empty.
 - Since buffer pools auto-flush old buffers once per second in a worker
   thread, there's a small window where if we remove the driver while
   there are still buffers with an age of less than one second, the
   assumptions of the other release code may be violated.

So by moving the flush to driver remove (which executes earlier via the
pci_driver.remove() flow) you're ensuring that all buffers are flushed
before _any_ code in drm_driver.release() executes.

I found the wording of the commit message here somewhat confusing since
it's talking about flushes we do in driver release, but mentions
problems that arise during driver release due to lack of flushing.  You
might want to reword the commit message somewhat to help clarify.
Otherwise, the code change itself looks reasonable to me.

BTW, I do notice that drm_driver.release() in general is technically
deprecated at this point (with a suggestion in the drm_drv.h comments to
switch to using drmm_add_action(), drmm_kmalloc(), etc. to manage the
cleanup of resources).  At some point in the future me may want to
rework the i915 cleanup in general according to that guidance.


Matt

> areas.
> 
> Flush buffer pools on GT remove as a fix.  On driver release, don't
> flush the pools again, just assert that the flush was called and
> nothing added more in between.
> 
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> Resending with Cc: dri-devel at lists.freedesktop.org as requested, and a
> typo in commit description fixed.
> 
> Thanks,
> Janusz
> 
>  drivers/gpu/drm/i915/gt/intel_gt.c             | 2 ++
>  drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 62d40c986642..8f322a4ecd87 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
>  	intel_uc_driver_remove(&gt->uc);
>  
>  	intel_engines_release(gt);
> +
> +	intel_gt_flush_buffer_pool(gt);
>  }
>  
>  void intel_gt_driver_unregister(struct intel_gt *gt)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> index aa0a59c5b614..acc49c56a9f3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> @@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
>  	struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
>  	int n;
>  
> -	intel_gt_flush_buffer_pool(gt);
> -
>  	for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
>  		GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
>  }
> -- 
> 2.25.1
> 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


More information about the Intel-gfx mailing list