<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - [SKL] Crash while intel_fbdev_restore_mode and freeze"
href="https://bugs.freedesktop.org/show_bug.cgi?id=98257">98257</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[SKL] Crash while intel_fbdev_restore_mode and freeze
</td>
</tr>
<tr>
<th>Product</th>
<td>DRI
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>x86-64 (AMD64)
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>DRM/Intel
</td>
</tr>
<tr>
<th>Assignee</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>dennis.wassenberg@secunet.com
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>CC</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr></table>
<p>
<div>
<pre>Hi,
I observed an issue that is reproducible often, but not always.
I am able to reproduce this with Ubuntu Kernel 4.7 and 4.8 using a Lenovo
Thinkpad X1 Tablet with additional Productivity Module and Onelink+ Docking
Station. Additionally, an external display has to be connected to the docking
station (VGA or DP; it happens more often with VGA).
I configured both displays (internal and external) in the X server on console
F7. After that I started a second X server on another console (e.g. F1) and
configured it to use both displays. Then I unplugged the external display from
the docking station. After that I terminated the X server on console F1 and
switched to the X server on console F7.
After doing these steps I got a black screen and the system froze.
On a system where debugging is much easier for me I got the following debug
output in that case:
[ 5593.858748] general protection fault: 0000 [#1] SMP
[ 5593.858842] Modules linked in: ...
[ 5593.858885] CPU: 2 PID: 4008 Comm: Xorg Tainted: P W O
4.7.3-grsec+ #1
[ 5593.858888] Hardware name: LENOVO 20GHS0D600/20GHS0D600, BIOS N1LET55W (1.55
) 08/10/2016
[ 5593.858892] task: ffff8802174bad00 ti: ffff8802174bb5c0 task.ti:
ffff8802174bb5c0
[ 5593.858908] RIP: 0010:[<ffffffff81084d72>] [<ffffffff81084d72>]
mutex_optimistic_spin+0x42/0x1b0
[ 5593.858911] RSP: 0018:ffff8800d15ab870 EFLAGS: 00010282
[ 5593.858914] RAX: fefefefefefefefe RBX: 0000000000000001 RCX:
0000000000000005
[ 5593.858917] RDX: 0000000000000001 RSI: ffff8802158051c0 RDI:
ffff8800d12e2258
[ 5593.858920] RBP: ffff8800d15ab8c0 R08: 0000000000000000 R09:
00000000d14c7000
[ 5593.858922] R10: 0000000000000780 R11: 0000000000000000 R12:
ffff8802174bad00
[ 5593.858925] R13: ffff8802158051c0 R14: ffff8800d1102800 R15:
ffff8800d12e2258
[ 5593.858929] FS: 000003551c86d100(0000) GS:ffff880221480000(0000)
knlGS:0000000000000000
[ 5593.858933] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5593.858935] CR2: 0000000000000000 CR3: 00000000028a2000 CR4:
00000000003606b0
[ 5593.858938] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 5593.858940] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 5593.858942] Stack:
[ 5593.858951] 00000001024000c0 ffff8802174bad00 ffff880217aa9890
0000000000000001
[ 5593.858956] 0000000000099e2a ffff8802158051c0 ffff8802174bad00
ffff8800d12e2000
[ 5593.858962] ffff8800d1102800 ffff8800d12e2258 ffff8800d15ab930
ffffffff815f151c
[ 5593.858963] Call Trace:
[ 5593.858978] [<ffffffff815f151c>] __ww_mutex_lock_slowpath+0x3c/0x1d0
[ 5593.858986] [<ffffffff815f1714>] __ww_mutex_lock+0x64/0xa0
[ 5593.859058] [<ffffffffa008f5d0>] drm_modeset_lock+0x30/0xd0 [drm]
[ 5593.859118] [<ffffffffa0090037>] drm_atomic_get_connector_state+0x37/0x3d0
[drm]
[ 5593.859154] [<ffffffffa010ee64>] __drm_atomic_helper_set_config+0x274/0x370
[drm_kms_helper]
[ 5593.859183] [<ffffffffa0112efa>]
drm_fb_helper_restore_fbdev_mode_unlocked+0x28a/0x2c0 [drm_kms_helper]
[ 5593.859205] [<ffffffffa0112f58>] drm_fb_helper_set_par+0x28/0x50
[drm_kms_helper]
[ 5593.859306] [<ffffffffa01ff9c5>] intel_fbdev_set_par+0x15/0x60 [i915]
[ 5593.859316] [<ffffffff813285b8>] fb_set_var+0x248/0x450
[ 5593.859339] [<ffffffff8106f57a>] ? check_preempt_curr+0x8a/0xa0
[ 5593.859346] [<ffffffff812c062f>] ? rb_erase+0x10f/0x610
[ 5593.859352] [<ffffffff81321c1d>] fbcon_blank+0x20d/0x2e0
[ 5593.859361] [<ffffffff8138f9a2>] do_unblank_screen+0xc2/0x1d0
[ 5593.859371] [<ffffffff81383ed4>] complete_change_console+0x54/0xe0
[ 5593.859377] [<ffffffff813852d1>] vt_ioctl+0x1371/0x17e0
[ 5593.859429] [<ffffffffa00740e0>] ? drm_ioctl+0x160/0x630 [drm]
[ 5593.859474] [<ffffffffa0078630>] ? drm_setmaster_ioctl+0x130/0x130 [drm]
[ 5593.859483] [<ffffffff813776b5>] tty_ioctl+0x4a5/0xf60
[ 5593.859491] [<ffffffff8113f4ef>] do_vfs_ioctl+0x9f/0x9c0
[ 5593.859499] [<ffffffff81057892>] ? recalc_sigpending+0x12/0x50
[ 5593.859506] [<ffffffff8105852c>] ? __set_task_blocked+0x2c/0x80
[ 5593.859514] [<ffffffff8105aaa5>] ? __set_current_blocked+0x35/0x60
[ 5593.859520] [<ffffffff8113fe8a>] sys_ioctl+0x7a/0x90
[ 5593.859527] [<ffffffff8105ad59>] ? sys_rt_sigprocmask+0x149/0x1e0
[ 5593.859537] [<ffffffff815f389f>] entry_SYSCALL_64_fastpath+0x13/0x93
[ 5593.859617] Code: 83 ec 28 48 89 45 b8 89 55 c8 65 48 8b 04 25 48 b4 00 00
48 8b 00 a8 08 75 18 48 8b 47 18 49 89 ff 49 89 f5 89 d3 48 85 c0 74 38 <8b> 50
28 85 d2 75 31 65 48 8b 04 25 48 b4 00 00 48 8b 00 c6 45
[ 5593.859624] RIP [<ffffffff81084d72>] mutex_optimistic_spin+0x42/0x1b0
[ 5593.859626] RSP <ffff8800d15ab870>
[ 5593.859660] ---[ end trace 902e07127626f91b ]---
The general protection fault occurred at 0xfefefefefefefefe. This is because
grsec overwrites all freed data with this value, so it looks like a use after
free. Without grsec this is still reproducible, but not every time. If I
instrument kfree so that it writes 0x0 to the freed buffer, it is always
reproducible again. So I assume that when it works without a crash on stock
Ubuntu, the freed memory had not been reused by the time of the use after free.
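
As a sanity check on the register dump, the following is a small Python sketch
(an illustration only, not kernel code) that tests whether an address is
canonical on x86-64. It assumes 48-bit virtual addresses; the helper name
is_canonical is made up for this example. The poison value in RAX is not
canonical, which is consistent with the oops being a general protection fault
rather than a page fault.

```python
VA_BITS = 48  # assumption: 4-level paging with 48-bit virtual addresses


def is_canonical(addr, va_bits=VA_BITS):
    """Return True if bits 63..(va_bits - 1) of addr are all equal."""
    top = addr // 2 ** (va_bits - 1)          # keep only the upper bits
    all_ones = 2 ** (64 - va_bits + 1) - 1    # 0x1ffff for 48-bit addresses
    return top == 0 or top == all_ones


POISON = 0xFEFEFEFEFEFEFEFE      # RAX in the oops above
MUTEX_ADDR = 0xFFFF8800D12E2258  # RDI in the oops above

print(is_canonical(POISON))      # False
print(is_canonical(MUTEX_ADDR))  # True
```
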
After some debugging I found that drm_fb_helper_restore_fbdev_mode_unlocked
restores the fbdev mode and accesses the fb_helper structure in
drm_fb_helper.c. By then the drm_connector had already been removed from
fb_helper->connector_info, because the unplug was detected, the connector was
unregistered (drm_connector_unregister) and drm_fb_helper_remove_one_connector
was called. Just before the fbdev mode was restored, the last reference to the
drm_connector was dropped and the connector was cleaned up
(drm_connector_cleanup).
However, fb_helper->crtc_info[i].mode_set still holds a reference to this
connector (it was not removed during unplug). This reference is accessed during
the fbdev mode restore, and the general protection fault occurs.
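
The suspected sequence can be sketched as a toy model in Python. This is only
an illustration of the stale reference, not the kernel implementation: the
classes Connector and FbHelper, the field mode_set_connectors and the flag
also_fix_mode_set are hypothetical names invented for this sketch, and the
"fix" shown is just one conceivable approach.

```python
POISON = 0xFEFEFEFEFEFEFEFE  # value grsec writes over freed memory (per report)


class Connector:
    """Toy stand-in for drm_connector; mutex_owner models the lock word."""

    def __init__(self, name):
        self.name = name
        self.mutex_owner = None
        self.freed = False

    def cleanup(self):
        # Models drm_connector_cleanup followed by the poisoning free.
        self.freed = True
        self.mutex_owner = POISON


class FbHelper:
    def __init__(self, connectors):
        self.connector_info = list(connectors)       # fb_helper connector_info
        self.mode_set_connectors = list(connectors)  # crtc_info[i].mode_set

    def remove_one_connector(self, conn, also_fix_mode_set=False):
        self.connector_info.remove(conn)
        if also_fix_mode_set:  # hypothetical fix, not the behaviour observed
            self.mode_set_connectors.remove(conn)

    def restore_fbdev_mode(self):
        # Walks the mode_set list, which may still name a freed connector.
        for conn in self.mode_set_connectors:
            if conn.freed:
                raise RuntimeError("use after free: " + conn.name)
        return [conn.name for conn in self.mode_set_connectors]


edp, vga = Connector("eDP-1"), Connector("VGA-1")
helper = FbHelper([edp, vga])
helper.remove_one_connector(vga)  # unplug: only connector_info is updated
vga.cleanup()                     # last reference dropped, memory poisoned
try:
    helper.restore_fbdev_mode()
except RuntimeError as err:
    print(err)                    # use after free: VGA-1
```

In this model, passing also_fix_mode_set=True to remove_one_connector makes
restore_fbdev_mode return only the remaining connector instead of failing.
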
The backtrace of this call is:
mutex_optimistic_spin
__mutex_lock_common
__ww_mutex_lock_slowpath
__ww_mutex_lock
ww_mutex_lock
drm_modeset_lock
drm_atomic_get_connector_state
drm_atomic_add_affected_connectors
update_output_state
__drm_atomic_helper_set_config
restore_fbdev_mode
drm_fb_helper_restore_fbdev_mode_unlocked
drm_fb_helper_set_par
intel_fbdev_set_par
fb_set_var
...</pre>
</div>
</p>
</body>
</html>