[Intel-gfx] [PATCH] drm/i915: Verify the engine after acquiring the active.lock

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Sep 18 15:54:36 UTC 2019


On 17/09/2019 16:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-17 15:59:25)
>>
>> On 16/09/2019 12:38, Chris Wilson wrote:
>>> When using virtual engines, the rq->engine is not stable until we hold
>>> the engine->active.lock (as the virtual engine may be exchanged with the
>>> sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
>>> we may retire a request concurrently with resubmitting it to HW, so we need
>>> to be extra careful to verify we are holding the correct lock for the
>>> request's active list. This is similar to the issue we saw with
>>> rescheduling the virtual requests, see sched_lock_engine().
>>>
>>> Or else:
>>>
>>> <4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610).
>>> <4> [876.736136] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
>>> <4> [876.736137] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
>>> <4> [876.736154] CPU: 2 PID: 21 Comm: ksoftirqd/2 Tainted: G     U            5.3.0-CI-CI_DRM_6898+ #1
>>> <4> [876.736156] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
>>> <4> [876.736157] RIP: 0010:__list_add_valid+0x4d/0x70
>>> <4> [876.736159] Code: c3 48 89 d1 48 c7 c7 20 33 0e 82 48 89 c2 e8 4a 4a bc ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 70 33 0e 82 e8 33 4a bc ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 c0 33 0e 82 e8
>>> <4> [876.736160] RSP: 0018:ffffc9000018bd30 EFLAGS: 00010082
>>> <4> [876.736162] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000104
>>> <4> [876.736163] RDX: 0000000080000104 RSI: 0000000000000000 RDI: 00000000ffffffff
>>> <4> [876.736164] RBP: ffffc9000018bd68 R08: 0000000000000000 R09: 0000000000000001
>>> <4> [876.736165] R10: 00000000aed95de3 R11: 000000007fe927eb R12: ffff888361ffca10
>>> <4> [876.736166] R13: ffff888361ffa610 R14: ffff888361ffc880 R15: ffff8883f931a1f8
>>> <4> [876.736168] FS:  0000000000000000(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000
>>> <4> [876.736169] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> <4> [876.736170] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0
>>> <4> [876.736171] PKRU: 55555554
>>> <4> [876.736172] Call Trace:
>>> <4> [876.736226]  __i915_request_submit+0x152/0x370 [i915]
>>> <4> [876.736263]  __execlists_submission_tasklet+0x6da/0x1f50 [i915]
>>> <4> [876.736293]  ? execlists_submission_tasklet+0x29/0x50 [i915]
>>> <4> [876.736321]  execlists_submission_tasklet+0x34/0x50 [i915]
>>> <4> [876.736325]  tasklet_action_common.isra.5+0x47/0xb0
>>> <4> [876.736328]  __do_softirq+0xd8/0x4ae
>>> <4> [876.736332]  ? smpboot_thread_fn+0x23/0x280
>>> <4> [876.736334]  ? smpboot_thread_fn+0x6b/0x280
>>> <4> [876.736336]  run_ksoftirqd+0x2b/0x50
>>> <4> [876.736338]  smpboot_thread_fn+0x1d3/0x280
>>> <4> [876.736341]  ? sort_range+0x20/0x20
>>> <4> [876.736343]  kthread+0x119/0x130
>>> <4> [876.736345]  ? kthread_park+0xa0/0xa0
>>> <4> [876.736347]  ret_from_fork+0x24/0x50
>>> <4> [876.736353] irq event stamp: 2290145
>>> <4> [876.736356] hardirqs last  enabled at (2290144): [<ffffffff8123cde8>] __slab_free+0x3e8/0x500
>>> <4> [876.736358] hardirqs last disabled at (2290145): [<ffffffff819cfb4d>] _raw_spin_lock_irqsave+0xd/0x50
>>> <4> [876.736360] softirqs last  enabled at (2290114): [<ffffffff81c0033e>] __do_softirq+0x33e/0x4ae
>>> <4> [876.736361] softirqs last disabled at (2290119): [<ffffffff810b815b>] run_ksoftirqd+0x2b/0x50
>>> <4> [876.736363] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
>>> <4> [876.736364] ---[ end trace 3e58d6c7356c65bf ]---
>>> <4> [876.736406] ------------[ cut here ]------------
>>> <4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730
>>> <4> [876.736421] WARNING: CPU: 2 PID: 5490 at lib/list_debug.c:53 __list_del_entry_valid+0x79/0x90
>>> <4> [876.736422] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
>>> <4> [876.736433] CPU: 2 PID: 5490 Comm: i915_selftest Tainted: G     U  W         5.3.0-CI-CI_DRM_6898+ #1
>>> <4> [876.736435] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
>>> <4> [876.736436] RIP: 0010:__list_del_entry_valid+0x79/0x90
>>> <4> [876.736438] Code: 0b 31 c0 c3 48 89 fe 48 c7 c7 30 34 0e 82 e8 ae 49 bc ff 0f 0b 31 c0 c3 48 89 f2 48 89 fe 48 c7 c7 68 34 0e 82 e8 97 49 bc ff <0f> 0b 31 c0 c3 48 c7 c7 a8 34 0e 82 e8 86 49 bc ff 0f 0b 31 c0 c3
>>> <4> [876.736439] RSP: 0018:ffffc900003ef758 EFLAGS: 00010086
>>> <4> [876.736440] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000002
>>> <4> [876.736442] RDX: 0000000080000002 RSI: 0000000000000000 RDI: 00000000ffffffff
>>> <4> [876.736443] RBP: ffffc900003ef780 R08: 0000000000000000 R09: 0000000000000001
>>> <4> [876.736444] R10: 000000001418e4b7 R11: 000000007f0ea93b R12: ffff888361ffcab8
>>> <4> [876.736445] R13: ffff88843b6d0000 R14: 000000000000217c R15: 0000000000000001
>>> <4> [876.736447] FS:  00007f4e6f255240(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000
>>> <4> [876.736448] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> <4> [876.736449] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0
>>> <4> [876.736450] PKRU: 55555554
>>> <4> [876.736451] Call Trace:
>>> <4> [876.736488]  i915_request_retire+0x224/0x8e0 [i915]
>>> <4> [876.736521]  i915_request_create+0x4b/0x1b0 [i915]
>>> <4> [876.736550]  nop_virtual_engine+0x230/0x4d0 [i915]
>>>
>>> Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Matthew Auld <matthew.auld at intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c | 25 ++++++++++++++++++++++---
>>>    1 file changed, 22 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 754a78364a63..f12358150097 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -195,6 +195,27 @@ static void free_capture_list(struct i915_request *request)
>>>        }
>>>    }
>>>    
>>> +static void remove_from_engine(struct i915_request *rq)
>>> +{
>>> +     struct intel_engine_cs *engine, *locked;
>>> +
>>> +     /*
>>> +      * Virtual engines complicate acquiring the engine timeline lock,
>>> +      * as their rq->engine pointer is not stable until under that
>>> +      * engine lock. The simple ploy we use is to take the lock then
>>> +      * check that the rq still belongs to the newly locked engine.
>>> +      */
>>> +     locked = READ_ONCE(rq->engine);
>>
>> I thought we had rq->owner and so could bail early if that one was
>> non-virtual, but that field seems like a distant memory. Still, it
>> could be worth storing the original/fixed engine in there somewhere.
>> Maybe it would enable some interesting asserts. Or maybe not...
> 
> You are thinking of ce->engine (which used to be ce->owner, to
> distinguish it from the ce->inflight engine). So you could do something
> like
> 
> 	if (intel_engine_is_virtual(rq->hw_context->engine))
> 	    ... and then have the same trial-and-error spinlock ...
> 
> by which time the code is equally branchy, and our assumption here is
> the spinlock is the correct one for the vast majority of cases, and so
> we very rarely have to dance.
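> 
> Spelled out, that alternative would look something like this (rough,
> untested sketch just to show the shape, reusing the names above):
> 
> 	static void remove_from_engine(struct i915_request *rq)
> 	{
> 		struct intel_engine_cs *engine, *locked;
> 
> 		if (!intel_engine_is_virtual(rq->hw_context->engine)) {
> 			/* Fixed engine: rq->engine cannot change under us. */
> 			spin_lock(&rq->engine->active.lock);
> 			list_del(&rq->sched.link);
> 			spin_unlock(&rq->engine->active.lock);
> 			return;
> 		}
> 
> 		/* Virtual engine: still need the trial-and-error lock dance. */
> 		locked = READ_ONCE(rq->engine);
> 		spin_lock(&locked->active.lock);
> 		while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
> 			spin_unlock(&locked->active.lock);
> 			spin_lock(&engine->active.lock);
> 			locked = engine;
> 		}
> 		list_del(&rq->sched.link);
> 		spin_unlock(&locked->active.lock);
> 	}
> 
> i.e. you duplicate the del-under-lock and add a branch up front, while
> the loop in the plain version almost never has to spin anyway.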
> 
>>> +     spin_lock(&locked->active.lock);
>>> +     while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
>>> +             spin_unlock(&locked->active.lock);
>>> +             spin_lock(&engine->active.lock);
>>> +             locked = engine;
>>> +     }
>>> +     list_del(&rq->sched.link);
>>> +     spin_unlock(&locked->active.lock);
>>> +}
>>> +
>>>    static bool i915_request_retire(struct i915_request *rq)
>>>    {
>>>        struct i915_active_request *active, *next;
>>> @@ -260,9 +281,7 @@ static bool i915_request_retire(struct i915_request *rq)
>>>         * request that we have removed from the HW and put back on a run
>>>         * queue.
>>>         */
>>> -     spin_lock(&rq->engine->active.lock);
>>> -     list_del(&rq->sched.link);
>>> -     spin_unlock(&rq->engine->active.lock);
>>> +     remove_from_engine(rq);
>>>    
>>>        spin_lock(&rq->lock);
>>>        i915_request_mark_complete(rq);
>>>
>>
>> How exactly can we retire and resubmit at the same time? Isn't
>> retirement a one-off event once the seqno is complete and the rq is
>> signaled?
> 
> Yes. So preempt-to-busy introduces a window where the request is still
> on HW but we have already returned it to the submission queue. We catch
> up with the HW on the next process_csb, but it may have completed the
> request in the meantime (it is just not allowed to advance beyond the
> subsequent breadcrumb, and so is prevented from overtaking our knowledge
> of RING_TAIL, which is how we avoid telling the HW to go "backwards").
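> 
> Roughly, the collision then looks like this (simplified sketch, details
> elided, using the functions from the backtrace above):
> 
>   CPU A (tasklet):  __execlists_submission_tasklet()
>                       -> rq->engine exchanged for the sibling engine
>                       -> __i915_request_submit(rq)
>                            list_move_tail(&rq->sched.link,
>                                           &engine->active.requests)
> 
>   CPU B (retire):   i915_request_retire(rq)  /* rq completed meanwhile */
>                       spin_lock(&rq->engine->active.lock)
>                         ^ may still be the *old* engine's lock
>                       list_del(&rq->sched.link)
>                       spin_unlock(...)
> 
> Both sides manipulate rq->sched.link, but the retire side may be holding
> the lock of the engine the request used to be on rather than the one the
> tasklet is using, hence the list_add/list_del corruption above.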

Would it be sufficient to do:

   engine = READ_ONCE(rq->engine);
   spin_lock(...);
   list_del(...);
   spin_unlock(&engine->active.lock);

to ensure the same engine is used throughout? Although the oops is not
about the spinlock but about list corruption. How does the list get
corrupted, though? list_del does not care which list the request is on.
If it is really key to hold the correct lock, then why is it enough to
re-check the engine after taking the lock? Why couldn't rq->engine
change again under the lock? rq->engine does get updated under that
very lock, no?

Regards,

Tvrtko

