GVT Scheduler

Zhenyu Wang zhenyuw at linux.intel.com
Tue Nov 3 03:33:36 UTC 2020


On 2020.10.28 17:46:21 +0200, Julian Stecklina wrote:
> Hi!
> 
> On Wed, 2020-10-28 at 10:40 +0200, Julian Stecklina wrote:
> > >   According to our assumption, there might be extra execlist schedule-out
> > > status notification. Is it possible that you can open the tracepoint in
> > > execlist_context_schedule_in and execlist_context_schedule_out in
> > > intel_lrc.c?
> > 
> > 
> > We'll try turning trace_i915_request_in / trace_i915_request_out into printks
> > and see whether this helps in debugging. Alternatively, is there a way to get
> > trace events out of a crashed kernel?
> > 
> > Btw, would it make sense to count the schedule_in and schedule_out events for
> > each request and dump a stacktrace when we see an unpaired schedule_out?
> 
> So we tried this out with a tiny patch that checks for matched schedule in/out
> events:
> 
> https://github.com/blitz/linux/commit/441663fab60df4a4692d5cc031dcfdeffe243008
> 
> It would be good if you can check whether this is a useful invariant to warn on.
> :)
> 
> On one system, we see this triggering right after boot with no VMs running at
> all (see below). I haven't seen this with our production VM workload yet, but
> that usually takes hours to manifest. So we might have something there tomorrow.
>
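
For reference, the invariant being checked (every schedule-out must be matched by an earlier schedule-in on the same request) can be sketched outside the kernel as below. This is only a minimal userspace illustration, assuming a per-request counter. struct tracked_rq, track_schedule_in() and track_schedule_out() are hypothetical names and are not taken from the linked commit or from i915.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a request; NOT the i915 struct. */
struct tracked_rq {
	int sched_depth;	/* schedule-in events not yet matched by an out */
	bool warned;		/* warn only once per request */
};

/* Record a schedule-in event. */
static void track_schedule_in(struct tracked_rq *rq)
{
	assert(rq->sched_depth >= 0);
	rq->sched_depth++;
}

/*
 * Record a schedule-out event. Returns true if it was matched by an
 * earlier schedule-in, false (warning once) if it was unpaired.
 */
static bool track_schedule_out(struct tracked_rq *rq)
{
	if (rq->sched_depth == 0) {
		if (!rq->warned) {
			fprintf(stderr, "mismatched schedule in/out operations\n");
			rq->warned = true;
		}
		return false;
	}
	rq->sched_depth--;
	return true;
}
```

In the kernel the warning would presumably be a WARN_ONCE() so that a stack trace is dumped, which is what produces a log like the one quoted further down.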

Hmm, it looks like an i915 change removed the check for whether the request was
actually preempted when reporting the status. I'm not sure if that's relevant,
but maybe you could try something like:

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index d0be98b67138..f1a16d4b6e6a 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1439,7 +1439,9 @@ __execlists_schedule_out(struct i915_request *rq,
 
 	intel_context_update_runtime(ce);
 	intel_engine_context_out(engine);
-	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
+	execlists_context_status_change(rq, i915_request_completed(rq) ?
+					INTEL_CONTEXT_SCHEDULE_OUT :
+					INTEL_CONTEXT_SCHEDULE_PREEMPTED);
 	if (engine->fw_domain && !atomic_dec_return(&engine->fw_active))
 		intel_uncore_forcewake_put(engine->uncore, engine->fw_domain);
 	intel_gt_pm_put_async(engine->gt);
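
The intent of the change above, as a standalone sketch: a request that has completed is reported as scheduled out, and anything else leaving the hardware is reported as preempted. enum sketch_status and struct sketch_rq are hypothetical stand-ins for the i915 types, and the completed flag stands in for i915_request_completed(rq). This has not been tested against the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative status values; the names mirror the patch, the enum is NOT i915's. */
enum sketch_status {
	SKETCH_SCHEDULE_IN,
	SKETCH_SCHEDULE_OUT,
	SKETCH_SCHEDULE_PREEMPTED,
};

/* Hypothetical request: only tracks whether the work has finished. */
struct sketch_rq {
	bool completed;
};

/* Pick the status to notify when a request leaves the hardware. */
static enum sketch_status schedule_out_status(const struct sketch_rq *rq)
{
	return rq->completed ? SKETCH_SCHEDULE_OUT
			     : SKETCH_SCHEDULE_PREEMPTED;
}
```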


> [   10.370703] ------------[ cut here ]------------
> [   10.370734] mismatched schedule in/out operations
> [   10.370807] WARNING: CPU: 1 PID: 0 at drivers/gpu/drm/i915/gt/intel_lrc.c:612
> process_csb+0x762/0x7a0 [i915]
> [   10.370842]  fb_sys_fops e1000e igb i2c_i801 drm dca ahci i2c_algo_bit
> libahci wmi video pinctrl_cannonlake pinctrl_intel
> [   10.370849] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.4.61 #1
> [   10.370849] Hardware name: Gigabyte Technology Co., Ltd. Q370M D3H GSM
> PLUS/Q370M D3H GSM PLUS, BIOS F14 06/05/2019
> [   10.370902] RIP: 0010:process_csb+0x762/0x7a0 [i915]
> [   10.370904] Code: 88 aa 15 00 00 0f 85 0f fd ff ff 48 c7 c7 10 e3 70 c0 4c 89
> 55 b0 48 89 4d b8 48 89 55 c0 c6 05 68 aa 15 00 01 e8 99 b7 2a eb <0f> 0b 4c 8b
> 55 b0 48 8b 4d b8 48 8b 55 c0 e9 dd fc ff ff 4c 89 55
> [   10.370905] RSP: 0018:ffffb1204014ce60 EFLAGS: 00010286
> [   10.370906] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [   10.370907] RDX: 0000000000000025 RSI: ffffffffad387405 RDI: 0000000000000246
> [   10.370907] RBP: ffffb1204014cec0 R08: ffffffffad3873e0 R09: 0000000000000025
> [   10.370907] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000006
> [   10.370908] R13: ffff8ed12dcfe040 R14: 0000000000000001 R15: ffff8ed12f6fe000
> [   10.370909] FS:  0000000000000000(0000) GS:ffff8ed130440000(0000)
> knlGS:0000000000000000
> [   10.370909] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   10.370910] CR2: 000055da74158008 CR3: 000000017b40a004 CR4: 00000000003606e0
> [   10.370910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   10.370910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   10.370911] Call Trace:
> [   10.370912]  <IRQ>
> [   10.370928]  execlists_submission_tasklet+0x19/0x70 [i915]
> [   10.370948]  tasklet_action_common.isra.0+0x60/0x110
> [   10.370949]  tasklet_hi_action+0x1f/0x30
> [   10.370952]  __do_softirq+0xe1/0x2d6
> [   10.370955]  ? update_ts_time_stats+0x58/0x80
> [   10.370956]  irq_exit+0xae/0xb0
> [   10.370957]  scheduler_ipi+0xe4/0x130
> [   10.370958]  smp_reschedule_interrupt+0x39/0xe0
> [   10.370959]  reschedule_interrupt+0xf/0x20
> [   10.370960]  </IRQ>
> [   10.370964] RIP: 0010:cpuidle_enter_state+0xc5/0x450
> [   10.370965] Code: ff e8 0f 78 82 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6
> c4 02 0f 85 65 03 00 00 31 ff e8 62 dc 88 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f
> 88 8f 02 00 00 49 63 cd 4c 8b 7d d0 4c 2b 7d c8 48 8d
> [   10.370966] RSP: 0018:ffffb120400efe38 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffff02
> [   10.370966] RAX: ffff8ed13046a880 RBX: ffffffffacf58e80 RCX: 000000000000001f
> [   10.370967] RDX: 0000000000000000 RSI: 000000002aaaab99 RDI: 0000000000000000
> [   10.370967] RBP: ffffb120400efe78 R08: 000000026a23c65e R09: 000000028d99190d
> [   10.370967] R10: ffff8ed130469580 R11: ffff8ed130469560 R12: ffff8ed130475928
> [   10.370968] R13: 0000000000000008 R14: 0000000000000008 R15: ffff8ed130475928
> [   10.370970]  ? cpuidle_enter_state+0xa1/0x450
> [   10.370971]  cpuidle_enter+0x2e/0x40
> [   10.370988]  call_cpuidle+0x23/0x40
> [   10.370989]  do_idle+0x1dd/0x270
> [   10.370990]  cpu_startup_entry+0x20/0x30
> [   10.370992]  start_secondary+0x167/0x1c0
> [   10.370994]  secondary_startup_64+0xa4/0xb0
> [   10.370995] ---[ end trace 85cd1056f39ffa8d ]---
> 
> Julian

-- 

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827