[PATCH] i915/drm/gvt: initialize CSB tail value with zero
Xinyun Liu
xinyun.liu at intel.com
Fri Aug 31 08:23:47 UTC 2018
I haven't tested this with KVMGT, but I suspect KVMGT has a similar issue,
so I'm sending the patch for review per Yakui's suggestion.
Note:
The guest OS falls into an infinite loop in process_csb() if there is no
gfx workload; otherwise, it may run into a kernel panic.
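To illustrate why a stale tail value can make the loop spin, here is a
minimal sketch. It is not the actual i915 process_csb() code: the names
consume_csb(), CSB_ENTRIES and MAX_STEPS are hypothetical, and it assumes
a six-entry status buffer whose consumer advances head, wraps at the
entry count, and stops only when head equals tail.

```c
#include <stdio.h>

#define CSB_ENTRIES 6   /* assumed ring size for this sketch */
#define MAX_STEPS   64  /* safety bound so the model always terminates */

/*
 * Model of a ring consumer: advance head, wrap at CSB_ENTRIES, and stop
 * when head == tail. Returns the number of entries consumed, or -1 if
 * the walk did not finish within MAX_STEPS (a real consumer without the
 * bound would spin forever).
 */
static int consume_csb(unsigned int head, unsigned int tail)
{
	int steps = 0;

	while (head != tail) {
		if (++head == CSB_ENTRIES)
			head = 0;
		if (++steps > MAX_STEPS)
			return -1;
	}
	return steps;
}

/* Print the outcome for one head/tail pair. */
static void report(unsigned int head, unsigned int tail)
{
	printf("head=%u tail=%u -> %d\n", head, tail, consume_csb(head, tail));
}
```

Calling report(0, 0) models the patched initial state: the loop exits at
once. Calling report(0, 0x7) models the old initial write pointer: head
only ever takes the values 0..5, so it can never reach 7 and the function
hits the safety bound and returns -1.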
Thanks,
Xinyun
On Fri, Aug 31, 2018 at 04:14:52PM +0800, Xinyun Liu wrote:
> When running `./drv_hangman --run-subtest hangcheck-unterminated` with
> AcrnGT, vGPU reset falls into a dead loop because the original CSB tail
> value (0xF) was not updated correctly. In fact, the value should be zero
> after a GPU reset caused by an invalid context. This dead loop also causes
> a kernel panic if there is some graphics workload running on the vGPU.
>
> BUG: unable to handle kernel paging request at 00000000fffffffc
> IP: process_csb+0x14a/0x2a0
> PGD 0 P4D 0
> Oops: 0002 [#1] PREEMPT SMP
> Modules linked in: dwc3_pci dwc3 snd_usb_audio xhci_pci mei_me xhci_hcd snd_usbmidi_lib mei snd_hwdep hci_uart bluetooth ecdh_generic rfkill_gpio trusty_timer trusty_wall trusty_b
> CPU: 0 PID: 1371 Comm: kworker/0:1H Tainted: P U W O 4.14.61-quilt-2e5dc0ac-g0feae7d57171 #2
> Hardware name: ACRN-DM, BIOS 1.00 03/14/2014
> Workqueue: events_highpri i915_error_reset
> task: ffff88007cbc0040 task.stack: ffffc900010b0000
> RIP: 0010:process_csb+0x14a/0x2a0
> RSP: 0018:ffffc900010b3c90 EFLAGS: 00010206
> RAX: 00000000fffffffc RBX: ffffc90001e02370 RCX: 0000000000000008
> RDX: 0000000000000009 RSI: ffff88007c830308 RDI: 0000000000000000
> RBP: ffffc900010b3cd8 R08: 0000000000000001 R09: 0000000000002370
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007c758000
> R13: 0000000000000007 R14: 0000000000000004 R15: ffff88007c830000
> FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000fffffffc CR3: 00000000796e0000 CR4: 00000000003406f0
> Call Trace:
> ? wake_up_process+0x20/0x20
> execlists_reset_prepare+0x65/0x120
> i915_gem_reset_prepare_engine+0x28/0x40
> i915_reset_engine+0x1e/0xe0
> i915_handle_error+0x117/0x470
> ? cpuacct_charge+0x81/0x90
> ? _raw_spin_unlock_irq+0x1e/0x40
> ? finish_task_switch+0x8d/0x1f0
> i915_error_reset+0x32/0x40
> process_one_work+0x186/0x3e0
> worker_thread+0x3d/0x3b0
> kthread+0x132/0x150
> ? process_one_work+0x3e0/0x3e0
> ? kthread_create_on_node+0x70/0x70
> ret_from_fork+0x3a/0x50
> Code: 00 00 44 89 00 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 89 d8 31 d2 45 31 f6 e9 57 ff ff ff 0f 1f 44 00 00 48 85 c0 74 13 <f0> ff 08 0f 88 ed 6f 5d 00 75 08 48 89 c7
> RIP: process_csb+0x14a/0x2a0 RSP: ffffc900010b3c90
> CR2: 00000000fffffffc
> ---[ end trace 5751fb1d7b00b459 ]---
>
> Link: https://lists.projectacrn.org/g/acrn-dev/message/11136
> Signed-off-by: Xinyun Liu <xinyun.liu at intel.com>
> ---
> drivers/gpu/drm/i915/gvt/execlist.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gvt/execlist.c b/drivers/gpu/drm/i915/gvt/execlist.c
> index 70494e394d2c..768e0b467a11 100644
> --- a/drivers/gpu/drm/i915/gvt/execlist.c
> +++ b/drivers/gpu/drm/i915/gvt/execlist.c
> @@ -523,7 +523,7 @@ static void init_vgpu_execlist(struct intel_vgpu *vgpu, int ring_id)
>  			_EL_OFFSET_STATUS_PTR);
>  	ctx_status_ptr.dw = vgpu_vreg(vgpu, ctx_status_ptr_reg);
>  	ctx_status_ptr.read_ptr = 0;
> -	ctx_status_ptr.write_ptr = 0x7;
> +	ctx_status_ptr.write_ptr = 0;
>  	vgpu_vreg(vgpu, ctx_status_ptr_reg) = ctx_status_ptr.dw;
>  }
>
> --
> 2.18.0
>