[Intel-gfx] [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Mar 18 13:25:37 UTC 2019


On 18/03/2019 10:56, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-18 10:46:28)
>>
>> On 18/03/2019 10:37, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-03-18 10:31:57)
>>>>
>>>> On 18/03/2019 09:51, Chris Wilson wrote:
>>>>> As the final request on a ring may hold the reference to this ring (via
>>>>> retiring the last pinned context), we may find ourselves chasing a
>>>>> dangling pointer on completion of the list.
>>>>>
>>>>> A quick solution is to hold a reference to the ring itself as we retire
>>>>> along it so that we only free it after we stop dereferencing it.
>>>>
>>>> Is there a guilty commit to reference as Fixes: ?
>>>
>>> It only becomes a problem with veng as we gain an immediate free path,
>>> whereas at the moment, context frees are deferred until they can acquire
>>> the struct_mutex. We cannot hit this path at the moment, but that we had
>>> to use the safe iterator implies that we were aware that the ring itself
>>> could disappear. If you wanted to pin it on something,
>>>
>>> References: b887d6154624 ("drm/i915: Retire requests along rings")
>>>
>>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/i915_request.c          |  6 +++++-
>>>>>     drivers/gpu/drm/i915/intel_engine_types.h    |  2 ++
>>>>>     drivers/gpu/drm/i915/intel_lrc.c             |  4 ++--
>>>>>     drivers/gpu/drm/i915/intel_ringbuffer.c      |  9 +++++----
>>>>>     drivers/gpu/drm/i915/intel_ringbuffer.h      | 13 ++++++++++++-
>>>>>     drivers/gpu/drm/i915/selftests/mock_engine.c |  1 +
>>>>>     6 files changed, 27 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>>> index 9533a85cb0b3..0a3d94517d0a 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>> @@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
>>>>>         if (!i915->gt.active_requests)
>>>>>                 return;
>>>>>     
>>>>> -     list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
>>>>> +     list_for_each_entry_safe(ring, tmp,
>>>>> +                              &i915->gt.active_rings, active_link) {
>>>>> +             intel_ring_get(ring); /* last rq holds reference! */
>>>>>                 ring_retire_requests(ring);
>>>>> +             intel_ring_put(ring);
>>>>> +     }
>>>>
>>>> Where does it chase a dangling pointer? It used the safe iterator already.
>>>
>>> Inside ring_retire_requests(); the use of _safe here actually implies we
>>> met this problem already :)
>>
>> I get it, the issue is during ring->request_list iteration in
>> ring_retire_requests. How about move ring pinning in there so it is clearer?
> 
> It's only this path that is affected. That maye becomes clearer when we
> retire along timelines and not rings, which is the patchset I wanted to
> land to fix this, but figured this was a tiny fix for now.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

I failed to find a short and clear suggestion for making the comment 
more precise.

Regards,

Tvrtko


More information about the Intel-gfx mailing list