[Intel-gfx] [PATCH 3/3] drm/i915/gem: Serialise debugfs i915_gem_objects with ctx->mutex

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Sep 16 08:27:26 UTC 2020


On 16/09/2020 08:42, Daniel Vetter wrote:
> On Mon, Sep 14, 2020 at 05:45:09PM +0100, Tvrtko Ursulin wrote:
>>
>> On 23/07/2020 18:21, Chris Wilson wrote:
>>> Since the debugfs may peek into the GEM contexts as the corresponding
>>> client/fd is being closed, we may try and follow a dangling pointer.
>>> However, the context closure itself is serialised with the ctx->mutex,
>>> so if we hold that mutex as we inspect the state coupled in the context,
>>> we know the pointers within the context are stable and will remain valid
>>> as we inspect their tables.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: CQ Tang <cq.tang at intel.com>
>>> Cc: Daniel Vetter <daniel.vetter at intel.com>
>>> Cc: stable at vger.kernel.org
>>> ---
>>>    drivers/gpu/drm/i915/i915_debugfs.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 784219962193..ea469168cd44 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -326,6 +326,7 @@ static void print_context_stats(struct seq_file *m,
>>>    		}
>>>    		i915_gem_context_unlock_engines(ctx);
>>> +		mutex_lock(&ctx->mutex);
>>>    		if (!IS_ERR_OR_NULL(ctx->file_priv)) {
>>>    			struct file_stats stats = {
>>>    				.vm = rcu_access_pointer(ctx->vm),
>>> @@ -346,6 +347,7 @@ static void print_context_stats(struct seq_file *m,
>>>    			print_file_stats(m, name, stats);
>>>    		}
>>> +		mutex_unlock(&ctx->mutex);
>>>    		spin_lock(&i915->gem.contexts.lock);
>>>    		list_safe_reset_next(ctx, cn, link);
>>>
>>
>> Hm this apparently never got its r-b and so got re-discovered in the field.
>> +Nikunj
>>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> I'm not super thrilled about patch 1 in this, for debugfs imo better to
> wrangle this in the driver. And without patch 1 and 2 this won't fix a
> whole lot.

I try to avoid spending too much time coming up with smart solutions for 
_debugfs_. So I was going by the fact it obviously fixes something so it 
is an improvement.

But your proposal to switch iteration to files->contexts also seems like it
would work. It would be slightly semantically different, in that it wouldn't
show the contexts which are still active on the GPU after their clients have
exited, but it's debugfs so no one should care.
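
For anyone following along, the serialisation idea in the quoted patch can be
sketched in user space roughly as below. This is only an analogy, assuming
pthreads in place of the kernel mutex API; struct demo_ctx, struct demo_file
and the demo_* helpers are hypothetical names and not the i915 definitions.
The point is just that the close path and the inspection path take the same
per-context mutex, so the inspector either sees a valid file pointer or sees
that it has already been cleared.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (not the i915 ones). */
struct demo_file {
	int id;
};

struct demo_ctx {
	pthread_mutex_t mutex;       /* plays the role of ctx->mutex */
	struct demo_file *file_priv; /* cleared when the client closes */
};

static void demo_ctx_init(struct demo_ctx *ctx, struct demo_file *file)
{
	assert(ctx != NULL);
	pthread_mutex_init(&ctx->mutex, NULL);
	ctx->file_priv = file;
}

/* Close path: detach the client while holding the context mutex. */
static void demo_ctx_close(struct demo_ctx *ctx)
{
	pthread_mutex_lock(&ctx->mutex);
	ctx->file_priv = NULL;
	pthread_mutex_unlock(&ctx->mutex);
}

/*
 * Inspection path (the debugfs analogue): hold the same mutex while
 * dereferencing file_priv, so a concurrent close cannot pull it away
 * in the middle of the read. Returns the client id, or -1 if the
 * context has already been closed.
 */
static int demo_ctx_inspect(struct demo_ctx *ctx)
{
	int id = -1;

	pthread_mutex_lock(&ctx->mutex);
	if (ctx->file_priv)
		id = ctx->file_priv->id;
	pthread_mutex_unlock(&ctx->mutex);

	return id;
}
```

Calling demo_ctx_inspect() before demo_ctx_close() reports the client id, and
calling it afterwards reports -1 rather than following a stale pointer. The
sketch does not cover the lifetime of the context itself, which in the driver
is a separate question.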

Regards,

Tvrtko

