[Intel-gfx] [PATCH] drm/i915: Kick rcu harder to free objects

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Sep 8 12:23:50 UTC 2022


On 06/09/2022 18:46, Ville Syrjala wrote:
> From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> 
> On gen3 the selftests are pretty much always tripping this:
> <4> [383.822424] pci 0000:00:02.0: drm_WARN_ON(dev_priv->mm.shrink_count)
> <4> [383.822546] WARNING: CPU: 2 PID: 3560 at drivers/gpu/drm/i915/i915_gem.c:1223 i915_gem_cleanup_early+0x96/0xb0 [i915]
> 
> Looks to be due to the status page object lingering on the
> purge_list. Call synchronize_rcu() ahead of it to make more
> sure all objects have been freed.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0f49ec9d494a..5b61f7ad6473 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1098,6 +1098,7 @@ void i915_gem_drain_freed_objects(struct drm_i915_private *i915)
>   		flush_delayed_work(&i915->bdev.wq);
>   		rcu_barrier();
>   	}
> +	synchronize_rcu();

It looks a bit suspicious that the loop would not free everything but
one last RCU grace period would. Does this definitely fix the issue in
your testing?

Perhaps it is down to the cond_resched() in __i915_gem_free_objects(),
but then again the free count should reflect the state and keep it
looping in here.
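
To illustrate the reasoning above, here is a user-space toy model, not
i915 code. All names (toy_dev, toy_drain and so on) are hypothetical,
and the ordering of the two stages is an assumption made only for
illustration. It shows that a loop keyed on a counter leaves nothing
behind as long as the counter covers every in-flight object, and only
leaves something behind when the counter is out of sync with the state.

```c
#include <assert.h>

/* Toy user-space model, NOT driver code; every name here is hypothetical. */
struct toy_dev {
	int free_count;   /* objects released but not yet fully freed */
	int worker_queue; /* objects waiting for the modelled free worker */
	int rcu_pending;  /* objects waiting for a modelled RCU callback */
};

/* Models flushing the worker: queued objects move on to the RCU stage. */
static void toy_flush_worker(struct toy_dev *d)
{
	d->rcu_pending += d->worker_queue;
	d->worker_queue = 0;
}

/* Models a barrier: pending callbacks run and drop the counter. */
static void toy_rcu_barrier(struct toy_dev *d)
{
	d->free_count -= d->rcu_pending;
	d->rcu_pending = 0;
	assert(d->free_count >= 0);
}

/* Objects still in flight, regardless of what the counter says. */
static int toy_leftover(const struct toy_dev *d)
{
	return d->worker_queue + d->rcu_pending;
}

/* Drain loop keyed on the counter; returns the number of passes taken. */
static int toy_drain(struct toy_dev *d)
{
	int passes = 0;

	while (d->free_count) {
		toy_flush_worker(d);
		toy_rcu_barrier(d);
		passes++;
	}
	return passes;
}
```

With a counter that matches the in-flight objects, toy_drain() returns
with toy_leftover() == 0, so an extra step after the loop has nothing
left to do. If free_count is already 0 while rcu_pending is not, the
loop exits at once and the object lingers, which is the only case in
this model where one more grace period after the loop would help.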

Regards,

Tvrtko

