[Intel-gfx] [PATCH v2 1/3] drm/i915: Mark i915_request.timeline as a volatile, rcu pointer

Fri Sep 20 08:47:26 UTC 2019

On 19/09/2019 18:49, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-19 18:11:14)
>>
>> On 19/09/2019 14:26, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-09-19 14:02:19)
>>>>
>>>> On 19/09/2019 12:19, Chris Wilson wrote:
>>>>> +static struct intel_timeline *get_timeline(struct i915_request *rq)
>>>>> +{
>>>>> +     struct intel_timeline *tl;
>>>>> +
>>>>> +     /*
>>>>> +      * Even though we are holding the engine->active.lock here, there
>>>>> +      * is no control over the submission queue per-se and we are
>>>>> +      * inspecting the active state at a random point in time, with an
>>>>> +      * unknown queue. Play safe and make sure the timeline remains valid.
>>>>> +      * (Only being used for pretty printing, one extra kref shouldn't
>>>>> +      * cause a camel stampede!)
>>>>> +      */
>>>>> +     rcu_read_lock();
>>>>> +     tl = rcu_dereference(rq->timeline);
>>>>> +     if (!kref_get_unless_zero(&tl->kref))
>>>>> +             tl = NULL;
>>>>> +     rcu_read_unlock();
>>>>
>>>> How can it be NULL under the active lock? Isn't that the same assertion
>>>> from i915_timeline_get_active.
>>>
>>> Not NULL, but retired. The difference is that during submission we know
>>> that this request's context/timeline must be currently pinned until
>>> a subsequent request (containing the idle-barriers) is submitted. The
>>> danger I worry about here is that subsequent idle request may be already
>>> submitted and since the queued requests may *already* have been retired,
>>> the timeline may be unpinned and indeed dropped it's last reference.
>>
>> But here it is under the engine->active.lock with interrupts disabled
>> and the requests are fetched from execlists ports. Timeline is not
>> guaranteed to be kept alive under these conditions? intel_context
>> reference will be held until process_csb schedules it out so I'd expect
>> timeline and hwsp to be there. But I could be lost in the new scheme of
>> things.
> 
> I felt it was prudent to only rely on the active pin. You are right in
> that we have a context reference if it is in active, and that context
> holds a reference to the timeline. But... engine->active.lock is not
> the lock that guards rq->timeline, so I feel uneasy on extending
> i915_request_active_timeline() too far. Outside of the submission
> pathway, inside a pretty printer, it feels safer (whatever changes may
> come we don't have to worry about it) to not assume anything and just
> use the failsafe rcu_dereference() + kref.

Well okay, I can accept that in the overall situation.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko