[Intel-gfx] [PATCH v2 7/8] drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues.

Tomas Elf tomas.elf at intel.com
Fri Oct 23 04:02:47 PDT 2015


On 23/10/2015 09:59, Daniel Vetter wrote:
> On Fri, Oct 23, 2015 at 09:42:27AM +0100, Tvrtko Ursulin wrote:
>>
>> On 19/10/15 16:32, Tomas Elf wrote:
>>> Grab execlist lock when cleaning up execlist queues after GPU reset to avoid
>>> concurrency problems between the context event interrupt handler and the reset
>>> path immediately following a GPU reset.
>>>
>>> * v2 (Chris Wilson):
>>> Do execlist check and use simpler form of spinlock functions.
>>>
>>> Signed-off-by: Tomas Elf <tomas.elf at intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem.c | 23 ++++++++++++++---------
>>>   1 file changed, 14 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index e57061a..2c7a0b7 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2753,18 +2753,23 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
>>>   	 * are the ones that keep the context and ringbuffer backing objects
>>>   	 * pinned in place.
>>>   	 */
>>> -	while (!list_empty(&ring->execlist_queue)) {
>>> -		struct drm_i915_gem_request *submit_req;
>>>
>>> -		submit_req = list_first_entry(&ring->execlist_queue,
>>> -				struct drm_i915_gem_request,
>>> -				execlist_link);
>>> -		list_del(&submit_req->execlist_link);
>>> +	if (i915.enable_execlists) {
>>> +		spin_lock_irq(&ring->execlist_lock);
>>> +		while (!list_empty(&ring->execlist_queue)) {
>>> +			struct drm_i915_gem_request *submit_req;
>>>
>>> -		if (submit_req->ctx != ring->default_context)
>>> -			intel_lr_context_unpin(submit_req);
>>> +			submit_req = list_first_entry(&ring->execlist_queue,
>>> +					struct drm_i915_gem_request,
>>> +					execlist_link);
>>> +			list_del(&submit_req->execlist_link);
>>>
>>> -		i915_gem_request_unreference(submit_req);
>>> +			if (submit_req->ctx != ring->default_context)
>>> +				intel_lr_context_unpin(submit_req);
>>> +
>>> +			i915_gem_request_unreference(submit_req);
>>> +		}
>>> +		spin_unlock_irq(&ring->execlist_lock);
>>
>> Can't this, in theory at least, end up calling drm_gem_object_unreference in
>> a couple of code paths, which can sleep, while holding a spinlock?
>>
>> If this cannot happen in practice for some reason it would probably be good
>> to put a comment explaining it.
>>
>> Or I am missing something?
>
> It can indeed I think. Tvrtko, since I'm off to KS next week and you have
> push access, can you pls push the revert if this checks out?

I've discussed it with Tvrtko and one solution that seems reasonable is 
to hold the execlist lock while traversing the execlist queue and moving 
the elements over to a temporary list. After that we can drop the 
execlist lock, traverse the temporary list and unreference the elements 
while not holding lock. That way we can get rid of the element in a 
synchronized fashion and also unreference without holding the lock. I 
believe this pattern is used elsewhere (such as during request retirement)

Does that sound like a plan or does anyone have a problem with that 
solution?

Thanks,
Tomas


>
> Thanks, Daniel
>




More information about the Intel-gfx mailing list