[Intel-gfx] [PATCH 2/3] drm/i915/ttm: Fix lockdep warning in __i915_gem_free_object()

Thomas Hellström thomas.hellstrom at linux.intel.com
Wed Sep 22 11:34:59 UTC 2021


On 9/22/21 12:55 PM, Matthew Auld wrote:
> On Wed, 22 Sept 2021 at 09:38, Thomas Hellström
> <thomas.hellstrom at linux.intel.com> wrote:
>> In the mman selftest, some tests make the ttm_bo_init_reserved() fail,
>> which may trigger a call to the i915_ttm_bo_destroy() function.
>> However, at this point the gem object refcount is set to 1, which
>> triggers a lockdep warning in __i915_gem_free_object() and a
>> corresponding failure in DG1 BAT, i915_selftest at live@mman.
>>
>> Fix this by clearing the gem object refcount if called from that
>> failure path.
>>
>> Fixes: f9b23c157a78 ("drm/i915: Move __i915_gem_free_object to ttm_bo_destroy")
>> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>> Signed-off-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> index b94497989995..b1f561543ff3 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> @@ -900,6 +900,10 @@ void i915_ttm_bo_destroy(struct ttm_buffer_object *bo)
>>
>>          i915_ttm_backup_free(obj);
>>
>> +       /* Failure during ttm_bo_init_reserved leaves the refcount set to 1. */
>> +       if (IS_ENABLED(CONFIG_LOCKDEP) && !obj->ttm.created)
>> +               refcount_set(&obj->base.refcount.refcount, 0);
>> +
>>          /* This releases all gem object bindings to the backend. */
>>          __i915_gem_free_object(obj);
> The __i915_gem_free_object is also nuking stuff like mm.placements,
> which is still owned by the caller AFAIK, or at least it is until we
> have successfully initialised the object, so smells like potential
> double free? Can we easily move that under the ttm.created check?
> Otherwise maybe we are meant to move the mm.placements handling into
> the RCU callback?

Yes, it indeed sounds like a closer look is needed for the error 
handling here. Perhaps it makes sense to initialize the TTM part and 
then the GEM part while still having the lock. Meanwhile I'll put it 
under the ttm.created check.

Thanks,

Thomas


>
>> --
>> 2.31.1
>>


More information about the Intel-gfx mailing list