[Bug 102820] [bisected] commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4 prevents X11 from starting

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sun Sep 17 11:46:39 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=102820

            Bug ID: 102820
           Summary: [bisected] commit
                    ebbf7337e2daacacef3e01114e6be68a2a4f11b4 prevents X11
                    from starting
           Product: DRI
           Version: DRI git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: blocker
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: jb5sgc1n.nya at 20mm.eu

After returning from a two-week vacation, I saw that some new and interesting
features made it into
https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next which I
had used before (at commit 94097b0f7f1bfa54b3b1f8b0d74bbd271a0564e4), so I
updated my kernel to the current-as-of-today version (at commit
43dd6fde5df450938568885249b836eb376e2ad6) - but found that X11 would not start
anymore with the new version.

More specifically, the symptom is this: booting to the console works fine.
When I invoke "X" manually, the console remains visible, and the Xorg.0.log
output stalls indefinitely after these messages have been emitted:
[    36.622] (EE) AMDGPU(0): Failed to allocate scanout buffer memory
[    36.623] (EE) AMDGPU(0): Failed to allocate scanout buffer memory
[    36.623] (EE) AMDGPU(0): failed to set mode: Invalid argument
(These messages do not appear with the older, working kernel version.)
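
For anyone comparing a working and a broken boot, the (EE) lines can be pulled
out of Xorg.0.log with a small filter. This is only a sketch: the function name
xorg_errors and the regular expression are assumptions based on the three log
lines quoted above, not on any documented log format.

```python
import re

# Matches Xorg log lines such as "[    36.622] (EE) AMDGPU(0): message".
# The pattern is an assumption derived from the lines quoted in this report.
EE_LINE = re.compile(r"^\[\s*(?P<time>\d+\.\d+)\]\s+\(EE\)\s+(?P<msg>.*)$")


def xorg_errors(log_text):
    """Return (timestamp, message) pairs for every (EE) line in an Xorg log."""
    errors = []
    for line in log_text.splitlines():
        match = EE_LINE.match(line.strip())
        if match:
            errors.append((float(match.group("time")), match.group("msg")))
    return errors


# The three error lines from this report, used as sample input.
SAMPLE = """\
[    36.622] (EE) AMDGPU(0): Failed to allocate scanout buffer memory
[    36.623] (EE) AMDGPU(0): Failed to allocate scanout buffer memory
[    36.623] (EE) AMDGPU(0): failed to set mode: Invalid argument
"""

for when, msg in xorg_errors(SAMPLE):
    print(f"{when:.3f}  {msg}")
```

Running the same filter over the log from the working kernel should then print
nothing for these three messages, if the observation above holds.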

At about the same time, the following dmesg output is emitted:

[   36.405078] ------------[ cut here ]------------
[   36.405090] WARNING: CPU: 5 PID: 758 at
drivers/gpu/drm/drm_mode_object.c:294 drm_object_property_get_value+0x22/0x30
[drm]
[   36.405090] Modules linked in: ipt_REJECT nf_reject_ipv4 nf_log_ipv4
nf_log_common xt_LOG xt_tcpudp xt_owner xt_mark iptable_nat cmac
nf_conntrack_ipv4
cpufreq_ondemand nf_defrag_ipv4 nf_nat_ipv4 nf_nat msr bnep nf_conntrack
iptable_mangle iptable_filter nls_iso8859_1 nls_cp437 vfat fat
snd_hda_codec_realtek
 snd_hda_codec_generic btusb btrtl btbcm btintel snd_hda_codec_hdmi bluetooth
igb snd_hda_intel snd_hda_codec ptp ecdh_generic pps_core rfkill snd_hda_core 
dca crc16 snd_hwdep snd_pcm edac_mce_amd snd_timer kvm_amd snd soundcore
sp5100_tco kvm tpm_tis tpm_tis_core input_leds evdev i2c_piix4 shpchp irqbypass
pcspkr led_class tpm button 8250_dw acpi_cpufreq sch_fq_codel usbip_host
usbip_core sg exfat(O) it87(O) hwmon_vid ip_tables x_tables algif_skcipher
af_alg sd_mod uas usb_storage serio_raw atkbd
[   36.405113]  libps2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc
aesni_intel aes_x86_64 crypto_simd glue_helper cryptd ccp rng_core ahci
libahci xhci_pci xhci_hcd libata usbcore scsi_mod usb_common i8042 serio amdgpu
i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
drm 
xfs libcrc32c crc32c_generic crc32c_intel dm_crypt dm_mod dax nvme nvme_core
i2c_dev
[   36.405125] CPU: 5 PID: 758 Comm: Xorg Tainted: G        W  O   
4.13.0-rc5-amd+ #6
[   36.405125] Hardware name: System manufacturer System Product Name/PRIME
X370-PRO, BIOS 0810 08/01/2017
[   36.405126] task: ffff8807fa35a940 task.stack: ffffc90008be0000
[   36.405134] RIP: 0010:drm_object_property_get_value+0x22/0x30 [drm]
[   36.405135] RSP: 0018:ffffc90008be3bc8 EFLAGS: 00010282
[   36.405136] RAX: ffffffffa04b9340 RBX: ffff8807f5ac0000 RCX:
0000000000000000
[   36.405136] RDX: ffffc90008be3be8 RSI: ffff8807fa3f4880 RDI:
ffff8807f6b84028
[   36.405137] RBP: ffffc90008be3bc8 R08: ffff8807fa3f6520 R09:
ffff8807ed303c00
[   36.405137] R10: 0000000000000040 R11: 0000000000000000 R12:
ffff8807f6b84000
[   36.405138] R13: 00000000ffffffea R14: ffff8807faf17980 R15:
ffff8807f6b84028
[   36.405138] FS:  00007f42ac7fc940(0000) GS:ffff88081ed40000(0000)
knlGS:0000000000000000
[   36.405139] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.405140] CR2: 000000706e122018 CR3: 00000007f790f000 CR4:
00000000003406e0
[   36.405140] Call Trace:
[   36.405175]  amdgpu_dm_connector_atomic_set_property+0x10a/0x180 [amdgpu]
[   36.405184]  drm_atomic_set_property+0x186/0x4a0 [drm]
[   36.405191]  drm_mode_obj_set_property_ioctl+0x12d/0x280 [drm]
[   36.405199]  ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
[   36.405206]  drm_mode_connector_property_set_ioctl+0x3f/0x60 [drm]
[   36.405212]  drm_ioctl_kernel+0x5d/0xb0 [drm]
[   36.405219]  drm_ioctl+0x32a/0x400 [drm]
[   36.405226]  ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
[   36.405229]  ? lru_cache_add_active_or_unevictable+0x36/0xb0
[   36.405249]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[   36.405251]  do_vfs_ioctl+0xa5/0x600
[   36.405253]  ? handle_mm_fault+0xd8/0x230
[   36.405254]  SyS_ioctl+0x79/0x90
[   36.405256]  entry_SYSCALL_64_fastpath+0x13/0x94
[   36.405257] RIP: 0033:0x7f42aa0c30c7
[   36.405258] RSP: 002b:00007ffe62874da8 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[   36.405259] RAX: ffffffffffffffda RBX: 00007f42aa387aa0 RCX:
00007f42aa0c30c7
[   36.405259] RDX: 00007ffe62874de0 RSI: 00000000c01064ab RDI:
000000000000000b
[   36.405260] RBP: 00007f42aa387af8 R08: 000000706e121b80 R09:
0000000000000001
[   36.405260] R10: 0000000000000004 R11: 0000000000000246 R12:
0000000000000020
[   36.405260] R13: 0000000000000004 R14: 00007f42aa387af8 R15:
0000000000000000
[   36.405261] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 46 60 55
48 89 e5 48 8b 80 70 03 00 00 48 83 78 20 00 75 07 e8 60 ff ff ff 5d c3 <0f> ff
e8 57 ff ff ff 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 
[   36.405277] ---[ end trace c128c94b0c5a469b ]---
[   36.532674] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* Atomic check
failed with err: -22 

(Scary dmesg output like this, with a call trace through
amdgpu_dm_connector_atomic_set_property, also occurs with "working" kernel
versions, unlike the X11 messages cited above. The "*ERROR* Atomic check
failed with err: -22" line, however, only occurs with the "new, broken" kernel.)
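
Two numbers in the dmesg output above can be decoded mechanically. The sketch
below splits the value in RSI from the user-space register dump
(00000000c01064ab, the second ioctl(2) argument under the x86-64 syscall
convention) into the generic Linux _IOC fields, and maps "err: -22" to its
errno name. The helper name decode_ioctl is hypothetical, and the bit layout
assumes the generic x86 encoding.

```python
import errno


def decode_ioctl(request):
    """Split an ioctl request number into its _IOC fields (generic x86 layout)."""
    return {
        "nr": request & 0xFF,                # bits 0-7: command number
        "type": chr((request >> 8) & 0xFF),  # bits 8-15: "magic" character
        "size": (request >> 16) & 0x3FFF,    # bits 16-29: argument size in bytes
        "dir": (request >> 30) & 0x3,        # bits 30-31: 1 = write, 2 = read, 3 = both
    }


# RSI from the user-space register dump in the trace above.
fields = decode_ioctl(0xC01064AB)
print(fields)  # nr is 0xab (171), type is 'd', size is 16, dir is 3

# "Atomic check failed with err: -22": look up the symbolic name of errno 22.
print(errno.errorcode[22])  # EINVAL
```

Type 'd' is the DRM ioctl magic, and command 0xAB with a 16-byte read/write
argument appears to correspond to DRM_IOCTL_MODE_SETPROPERTY in the kernel's
drm.h, which would fit the drm_mode_connector_property_set_ioctl frame in the
call trace. EINVAL likewise fits the "Invalid argument" text in Xorg.0.log.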

At this point, the console is still visible, and if I use "Alt+F2" or such to
switch to another virtual console I can work with that console. If I switch
back to the virtual console that I started X from, then the X server ends
(without ever having displayed anything).

Since this symptom was 100% reproducible with the new kernel, I started a "git
bisect" on the amd-staging-drm-next kernel, which led to the following result:

ebbf7337e2daacacef3e01114e6be68a2a4f11b4 is the first bad commit
commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4
Author: Charlene Liu <charlene.liu at amd.com>
Date:   Tue Aug 22 20:15:28 2017 -0400

    drm/amd/display: Block 6Ghz timing if SBIOS set HDMI_6G_en to 0

    Signed-off-by: Charlene Liu <charlene.liu at amd.com>
    Reviewed-by: Charlene Liu <Charlene.Liu at amd.com>
    Acked-by: Harry Wentland <Harry.Wentland at amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher at amd.com>

:040000 040000 0f221431fffb401f50d49e9dab16ca2d93bb6388
51f57c497d9d1d7263847e246e9aa794032e9112 M      drivers
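
For readers unfamiliar with the procedure, the idea behind "git bisect" can be
sketched as a binary search between a known-good and a known-bad commit. The
code below is an illustration only, not git's actual implementation: it assumes
a strictly linear history, and the names first_bad and is_bad as well as the
commit labels are hypothetical.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by binary search.

    commits is ordered oldest to newest; commits[0] must be known good and
    commits[-1] known bad. is_bad(commit) stands in for "build this kernel,
    reboot, and check whether X11 fails to start".
    Returns (first_bad_commit, number_of_tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid    # regression is at mid or earlier
        else:
            good = mid   # regression is after mid
    return commits[bad], tests


# Hypothetical linear history of 100 commits with a regression at "c63".
history = [f"c{i}" for i in range(100)]
culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 63)
print(culprit, tests)
```

The number of build-and-reboot cycles grows only logarithmically with the
number of commits between the good and bad endpoints, which is what makes this
practical on a fast-moving branch.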

From looking at
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=ebbf7337e2daacacef3e01114e6be68a2a4f11b4
I cannot see how this patch could prevent my X from starting (using HDMI,
3840x2160), but it is 100% reproducible: X11 starts with the tree just before
this commit, but not with this commit included. I verified this over multiple
reboot cycles.
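
One possible connection, offered only as a guess: the commit title mentions
6 GHz timings, and 3840x2160 at 60 Hz is such a timing. The sketch below
computes the pixel clock from the standard CTA-861 totals for 3840x2160
(4400 x 2250) and compares it with the 340 MHz TMDS limit of HDMI 1.4. The
refresh rate is an assumption, since the report does not state it, and whether
the patch actually takes this path on this hardware is not established.

```python
# Maximum TMDS clock of HDMI 1.4 in kHz; HDMI 2.0 raises this to 600 MHz.
HDMI_1_4_MAX_TMDS_KHZ = 340_000


def pixel_clock_khz(htotal, vtotal, refresh_hz):
    """Pixel clock in kHz from total (active plus blanking) timings."""
    return htotal * vtotal * refresh_hz // 1000


def needs_hdmi_6g(htotal, vtotal, refresh_hz):
    """True if the mode exceeds the HDMI 1.4 clock limit (8-bit RGB assumed)."""
    return pixel_clock_khz(htotal, vtotal, refresh_hz) > HDMI_1_4_MAX_TMDS_KHZ


# CTA-861 totals for 3840x2160: 4400 pixels per line, 2250 lines per frame.
for hz in (30, 60):  # refresh rates are assumptions, not taken from the report
    clock = pixel_clock_khz(4400, 2250, hz)
    print(f"3840x2160@{hz}: {clock} kHz, needs 6G: {needs_hdmi_6g(4400, 2250, hz)}")
```

If the display here runs at 60 Hz, the mode needs a 594 MHz clock, so a check
that blocks 6G timings whenever the SBIOS reports HDMI_6G_en as 0 could reject
it, which would be consistent with "failed to set mode: Invalid argument".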
