[Bug 110848] Everything using GPU gets stuck after running+killing parallel Media loads (after running 3D benchmarks)
bugzilla-daemon at freedesktop.org
Tue Jul 2 08:30:49 UTC 2019
https://bugs.freedesktop.org/show_bug.cgi?id=110848
--- Comment #16 from Eero Tamminen <eero.t.tamminen at intel.com> ---
A large number of parallel media workloads that receive a group KILL signal
(because they took too long) still deadlock i915 and other parts of the
kernel, in a most annoying way, with the latest drm-tip 5.2.0-rc7 git kernel.
The device doesn't appear frozen to the normal freeze checks, but "random"
operations on it get stuck:
# for i in /proc/[0-9]*; do
    grep -E '^(Name|State|Pid)' "$i/status"; cat "$i/stack";
  done | grep -B1 -A6 "disk sleep"
Name: kworker/u8:1+i915
State: D (disk sleep)
Pid: 10660
[<0>] __i915_gem_free_work+0x5f/0x90 [i915]
[<0>] process_one_work+0x1e9/0x410
[<0>] worker_thread+0x2d/0x3d0
[<0>] kthread+0x113/0x130
[<0>] ret_from_fork+0x35/0x40
--
Name: ffmpeg
State: D (disk sleep)
Pid: 10704
[<0>] __fput+0xae/0x200
[<0>] task_work_run+0x84/0xa0
[<0>] do_exit+0x308/0xba0
[<0>] do_group_exit+0x33/0xa0
[<0>] get_signal+0x121/0x910
--
<17 similar FFmpeg processes>
--
Name: compiz
State: D (disk sleep)
Pid: 1195
[<0>] i915_gem_create_ioctl+0x17/0x30 [i915]
[<0>] drm_ioctl_kernel+0x88/0xf0
[<0>] drm_ioctl+0x2f8/0x3b0
[<0>] do_vfs_ioctl+0xa4/0x630
[<0>] ksys_ioctl+0x3a/0x70
--
Name: ffmpeg
State: D (disk sleep)
Pid: 12011
[<0>] chrdev_open+0xa3/0x1b0
[<0>] do_dentry_open+0x1c4/0x380
[<0>] path_openat+0x564/0x11f0
[<0>] do_filp_open+0x9b/0x110
[<0>] do_sys_open+0x1bd/0x260
--
Name: ps
State: D (disk sleep)
Pid: 16106
[<0>] proc_pid_cmdline_read+0x1e3/0x330
[<0>] vfs_read+0x91/0x140
[<0>] ksys_read+0x91/0xe0
[<0>] do_syscall_64+0x4f/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
--
Name: khugepaged
State: D (disk sleep)
Pid: 38
[<0>] kthread+0x113/0x130
[<0>] ret_from_fork+0x35/0x40
(This semi-zombie device state breaks our automation because it somehow manages
to freeze a Jenkins job that shouldn't be doing anything on the device except
trigger the start of testing on that machine with a forced-reboot job.)
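
For automation that needs to detect this state before scheduling work, the /proc scan above could be wrapped in a small helper. The following is only a sketch under stated assumptions: the function name list_blocked_tasks and its optional directory argument are hypothetical (the argument exists so the helper can be run against a copy of /proc), and it reads only the status files, so unlike the command above it does not need root and prints no stacks.

```shell
#!/bin/sh
# Sketch: print "PID NAME" for every task in uninterruptible (D) sleep.
# list_blocked_tasks and its optional argument are hypothetical names;
# the argument overrides the directory to scan (default: /proc).
list_blocked_tasks() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Tasks can exit while we iterate; skip anything unreadable.
        [ -r "$dir/status" ] || continue
        awk '
            /^Name:/       { sub(/^Name:[ \t]*/, ""); name = $0; next }
            $1 == "State:" { state = $2 }
            $1 == "Pid:"   { pid = $2 }
            END            { if (state == "D") print pid, name }
        ' "$dir/status" 2>/dev/null
    done
}
```

Calling list_blocked_tasks with no argument scans the live /proc. A non-empty result only says that some task is currently blocked; tasks also enter D state briefly during ordinary I/O, so a check like this would probably need to be repeated over some interval before treating the machine as stuck.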