<html> <head> <base href="https://bugs.freedesktop.org/"> </head> <body><table border="1" cellspacing="0" cellpadding="8"> <tr> <th>Bug ID</th> <td><a class="bz_bug_link bz_status_NEW " title="NEW - [SKL] Crash while intel_fbdev_restore_mode and freeze" href="https://bugs.freedesktop.org/show_bug.cgi?id=98257">98257</a> </td> </tr> <tr> <th>Summary</th> <td>[SKL] Crash while intel_fbdev_restore_mode and freeze </td> </tr> <tr> <th>Product</th> <td>DRI </td> </tr> <tr> <th>Version</th> <td>unspecified </td> </tr> <tr> <th>Hardware</th> <td>x86-64 (AMD64) </td> </tr> <tr> <th>OS</th> <td>Linux (All) </td> </tr> <tr> <th>Status</th> <td>NEW </td> </tr> <tr> <th>Severity</th> <td>normal </td> </tr> <tr> <th>Priority</th> <td>medium </td> </tr> <tr> <th>Component</th> <td>DRM/Intel </td> </tr> <tr> <th>Assignee</th> <td>intel-gfx-bugs@lists.freedesktop.org </td> </tr> <tr> <th>Reporter</th> <td>dennis.wassenberg@secunet.com </td> </tr> <tr> <th>QA Contact</th> <td>intel-gfx-bugs@lists.freedesktop.org </td> </tr> <tr> <th>CC</th> <td>intel-gfx-bugs@lists.freedesktop.org </td> </tr></table> <p> <div> <pre>Hi,

I observed an issue that is reproducible often, but not always. I can reproduce it with Ubuntu kernels 4.7 and 4.8 on a Lenovo ThinkPad X1 Tablet with the additional Productivity Module and a OneLink+ docking station. In addition, an external display has to be connected to the docking station (VGA or DP; it happens more often with VGA).

I configured both displays (internal and external) in the X server running on console F7. After that I started a second X server on another console (e.g. F1) and configured it to use both displays. Then I unplugged the external display from the docking station. Finally I terminated the X server on console F1 and switched to the X server on console F7. After these steps I got a black screen and the system froze.
On a system where debugging is much easier for me, I got the following debug output in that case:

[ 5593.858748] general protection fault: 0000 [#1] SMP
[ 5593.858842] Modules linked in: ...
[ 5593.858885] CPU: 2 PID: 4008 Comm: Xorg Tainted: P W O 4.7.3-grsec+ #1
[ 5593.858888] Hardware name: LENOVO 20GHS0D600/20GHS0D600, BIOS N1LET55W (1.55 ) 08/10/2016
[ 5593.858892] task: ffff8802174bad00 ti: ffff8802174bb5c0 task.ti: ffff8802174bb5c0
[ 5593.858908] RIP: 0010:[<ffffffff81084d72>] [<ffffffff81084d72>] mutex_optimistic_spin+0x42/0x1b0
[ 5593.858911] RSP: 0018:ffff8800d15ab870 EFLAGS: 00010282
[ 5593.858914] RAX: fefefefefefefefe RBX: 0000000000000001 RCX: 0000000000000005
[ 5593.858917] RDX: 0000000000000001 RSI: ffff8802158051c0 RDI: ffff8800d12e2258
[ 5593.858920] RBP: ffff8800d15ab8c0 R08: 0000000000000000 R09: 00000000d14c7000
[ 5593.858922] R10: 0000000000000780 R11: 0000000000000000 R12: ffff8802174bad00
[ 5593.858925] R13: ffff8802158051c0 R14: ffff8800d1102800 R15: ffff8800d12e2258
[ 5593.858929] FS: 000003551c86d100(0000) GS:ffff880221480000(0000) knlGS:0000000000000000
[ 5593.858933] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5593.858935] CR2: 0000000000000000 CR3: 00000000028a2000 CR4: 00000000003606b0
[ 5593.858938] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5593.858940] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5593.858942] Stack:
[ 5593.858951] 00000001024000c0 ffff8802174bad00 ffff880217aa9890 0000000000000001
[ 5593.858956] 0000000000099e2a ffff8802158051c0 ffff8802174bad00 ffff8800d12e2000
[ 5593.858962] ffff8800d1102800 ffff8800d12e2258 ffff8800d15ab930 ffffffff815f151c
[ 5593.858963] Call Trace:
[ 5593.858978] [<ffffffff815f151c>] __ww_mutex_lock_slowpath+0x3c/0x1d0
[ 5593.858986] [<ffffffff815f1714>] __ww_mutex_lock+0x64/0xa0
[ 5593.859058] [<ffffffffa008f5d0>] drm_modeset_lock+0x30/0xd0 [drm]
[ 5593.859118] [<ffffffffa0090037>] drm_atomic_get_connector_state+0x37/0x3d0 [drm]
[ 5593.859154] [<ffffffffa010ee64>] __drm_atomic_helper_set_config+0x274/0x370 [drm_kms_helper]
[ 5593.859183] [<ffffffffa0112efa>] drm_fb_helper_restore_fbdev_mode_unlocked+0x28a/0x2c0 [drm_kms_helper]
[ 5593.859205] [<ffffffffa0112f58>] drm_fb_helper_set_par+0x28/0x50 [drm_kms_helper]
[ 5593.859306] [<ffffffffa01ff9c5>] intel_fbdev_set_par+0x15/0x60 [i915]
[ 5593.859316] [<ffffffff813285b8>] fb_set_var+0x248/0x450
[ 5593.859339] [<ffffffff8106f57a>] ? check_preempt_curr+0x8a/0xa0
[ 5593.859346] [<ffffffff812c062f>] ? rb_erase+0x10f/0x610
[ 5593.859352] [<ffffffff81321c1d>] fbcon_blank+0x20d/0x2e0
[ 5593.859361] [<ffffffff8138f9a2>] do_unblank_screen+0xc2/0x1d0
[ 5593.859371] [<ffffffff81383ed4>] complete_change_console+0x54/0xe0
[ 5593.859377] [<ffffffff813852d1>] vt_ioctl+0x1371/0x17e0
[ 5593.859429] [<ffffffffa00740e0>] ? drm_ioctl+0x160/0x630 [drm]
[ 5593.859474] [<ffffffffa0078630>] ? drm_setmaster_ioctl+0x130/0x130 [drm]
[ 5593.859483] [<ffffffff813776b5>] tty_ioctl+0x4a5/0xf60
[ 5593.859491] [<ffffffff8113f4ef>] do_vfs_ioctl+0x9f/0x9c0
[ 5593.859499] [<ffffffff81057892>] ? recalc_sigpending+0x12/0x50
[ 5593.859506] [<ffffffff8105852c>] ? __set_task_blocked+0x2c/0x80
[ 5593.859514] [<ffffffff8105aaa5>] ? __set_current_blocked+0x35/0x60
[ 5593.859520] [<ffffffff8113fe8a>] sys_ioctl+0x7a/0x90
[ 5593.859527] [<ffffffff8105ad59>] ? sys_rt_sigprocmask+0x149/0x1e0
[ 5593.859537] [<ffffffff815f389f>] entry_SYSCALL_64_fastpath+0x13/0x93
[ 5593.859617] Code: 83 ec 28 48 89 45 b8 89 55 c8 65 48 8b 04 25 48 b4 00 00 48 8b 00 a8 08 75 18 48 8b 47 18 49 89 ff 49 89 f5 89 d3 48 85 c0 74 38 <8b> 50 28 85 d2 75 31 65 48 8b 04 25 48 b4 00 00 48 8b 00 c6 45
[ 5593.859624] RIP [<ffffffff81084d72>] mutex_optimistic_spin+0x42/0x1b0
[ 5593.859626] RSP <ffff8800d15ab870>
[ 5593.859660] ---[ end trace 902e07127626f91b ]---

The faulting access used the pointer 0xfefefefefefefefe (see RAX). This is the value with which grsec overwrites all freed data.
So it looks like a use after free. Without grsec the problem is still reproducible, but not every time. If I instrument kfree so that it writes 0x0 to the freed buffer, it is always reproducible again. So I assume that, in the cases where stock Ubuntu works without a crash, the freed memory had not yet been reused when the use after free happened.

After some debugging I found that drm_fb_helper_restore_fbdev_mode_unlocked restores the fbdev mode and accesses the fb_helper structure in drm_fb_helper.c. At that point the drm_connector has already been removed from fb_helper->connector_info, because the unplug was detected, the connector was unregistered (drm_connector_unregister) and drm_fb_helper_remove_one_connector was called. Just before the fbdev mode is restored, the last reference to the drm_connector is dropped and the drm_connector is cleaned up (drm_connector_cleanup). However, fb_helper->crtc_info[i].mode_set still holds a reference to this connector (it was not removed during the unplug). This stale reference is accessed during the fbdev mode restore, and the protection fault occurs.

The backtrace of this call is:

mutex_optimistic_spin
__mutex_lock_common
__ww_mutex_lock_slowpath
__ww_mutex_lock
ww_mutex_lock
drm_modeset_lock
drm_atomic_get_connector_state
drm_atomic_add_affected_connectors
update_output_state
__drm_atomic_helper_set_config
restore_fbdev_mode
drm_fb_helper_restore_fbdev_mode_unlocked
drm_fb_helper_set_par
intel_fbdev_set_par
fb_set_var
...</pre> </div> </p> </body> </html>