[BUG] radeon DisplayPort hotplug - dmesg warning, monitor not waking

John Brooks john at fastquake.com
Tue Dec 27 22:27:43 UTC 2016


What I'm using:
- R9 290
- Kernel 4.9.0
- ASUS PB278Q monitor, connected with DisplayPort
- OpenGL core profile version string: 4.5 (Core Profile) Mesa 13.1.0-devel
  (git-d9fef84) 
- radeon DDX 7.8.99 (Git)

For as long as I can remember, I've had a couple of issues with DisplayPort on
the above setup.

The first is a kernel WARN that fires when I turn the monitor on from an off
state. It looks like this:

[102004.855043] ------------[ cut here ]------------
[102004.855046] WARNING: CPU: 7 PID: 14012 at ./include/drm/drm_crtc.h:1403 drm_helper_choose_crtc_dpms+0x93/0xa0 [drm_kms_helper]
[102004.855046] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables binfmt_misc nls_iso8859_1 mxm_wmi intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_usb_audio snd_usbmidi_lib input_leds joydev serio_raw snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_soc_rt5640 snd_hda_codec snd_soc_ssm4567 snd_soc_rl6231 snd_hda_core snd_soc_core snd_hwdep mei_me lpc_ich
[102004.855058]  mei snd_compress shpchp snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_timer snd_seq_device snd wmi elan_i2c soundcore dw_dmac video snd_soc_sst_acpi i2c_designware_platform snd_soc_sst_match i2c_designware_core 8250_dw mac_hid spi_pxa2xx_platform acpi_pad tpm_infineon kvm_intel kvm irqbypass sunrpc parport_pc ppdev lp parport autofs4 btrfs xor raid6_pq dm_mirror dm_region_hash dm_log hid_generic usbhid amdkfd amd_iommu_v2 radeon(OE) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm e1000e crc32c_intel drm psmouse ahci ptp libahci pps_core sdhci_acpi sdhci i2c_hid hid fjes
[102004.855070] CPU: 7 PID: 14012 Comm: kworker/7:1 Tainted: G        W  OE   4.9.0 #13
[102004.855070] Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD3H-BK/Z97X-UD3H-BK-CF, BIOS F6 06/17/2014
[102004.855078] Workqueue: events radeon_dp_work_func [radeon]
[102004.855079]  ffffa82d4c277d20 ffffffffaa432693 0000000000000000 0000000000000000
[102004.855080]  ffffa82d4c277d60 ffffffffaa082bbb 0000057b401c8000 ffff8c7641aa1800
[102004.855081]  ffff8c764008a000 ffff8c764008a000 0000000000000003 0000000000000000
[102004.855082] Call Trace:
[102004.855083]  [<ffffffffaa432693>] dump_stack+0x63/0x90
[102004.855083]  [<ffffffffaa082bbb>] __warn+0xcb/0xf0
[102004.855084]  [<ffffffffaa082ced>] warn_slowpath_null+0x1d/0x20
[102004.855087]  [<ffffffffc059a453>] drm_helper_choose_crtc_dpms+0x93/0xa0 [drm_kms_helper]
[102004.855089]  [<ffffffffc059a4d7>] drm_helper_connector_dpms+0x77/0x100 [drm_kms_helper]
[102004.855096]  [<ffffffffc05d5bb0>] ? atombios_blank_crtc+0x150/0x150 [radeon]
[102004.855103]  [<ffffffffc05ef0c6>] radeon_connector_hotplug+0xf6/0x120 [radeon]
[102004.855111]  [<ffffffffc05fcd0f>] radeon_dp_work_func+0x3f/0x60 [radeon]
[102004.855112]  [<ffffffffaa09d90b>] process_one_work+0x16b/0x480
[102004.855113]  [<ffffffffaa09dc6b>] worker_thread+0x4b/0x500
[102004.855114]  [<ffffffffaa09dc20>] ? process_one_work+0x480/0x480
[102004.855115]  [<ffffffffaa09dc20>] ? process_one_work+0x480/0x480
[102004.855116]  [<ffffffffaa0a3ec9>] kthread+0xd9/0xf0
[102004.855117]  [<ffffffffaa0a3df0>] ? kthread_park+0x60/0x60
[102004.855118]  [<ffffffffaa8773b5>] ret_from_fork+0x25/0x30
[102004.855118] ---[ end trace cbb9abffe6127dc8 ]---

After some investigation I found that the following patch makes the warning go
away:

diff --git a/drivers/gpu/drm/radeon/radeon_irq_kms.c b/drivers/gpu/drm/radeon/radeon_irq_kms.c
index c084cad..c8504b1 100644
--- a/drivers/gpu/drm/radeon/radeon_irq_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_irq_kms.c
@@ -102,11 +102,12 @@ static void radeon_dp_work_func(struct work_struct *work)
        struct drm_mode_config *mode_config = &dev->mode_config;
        struct drm_connector *connector;
 
-       /* this should take a mutex */
+       mutex_lock(&mode_config->mutex);
        if (mode_config->num_connector) {
                list_for_each_entry(connector, &mode_config->connector_list, head)
                        radeon_connector_hotplug(connector);
        }
+       mutex_unlock(&mode_config->mutex);
 }
 /**
  * radeon_driver_irq_preinstall_kms - drm irq preinstall callback
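
For illustration, here is a rough user-space model of what the locking change
does. It assumes the WARN comes from a helper that checks whether
mode_config->mutex is held while the connector list is walked, which is an
inference from the trace above and not something confirmed. All toy_* names
are hypothetical and none of this is kernel code:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct drm_mode_config. It only tracks whether
 * the lock is held, how many connectors exist, and how many warnings fired. */
struct toy_mode_config {
    bool mutex_held;
    int num_connector;
    int warnings;
};

/* Models a helper that complains when called without the lock held. */
static void toy_choose_crtc_dpms(struct toy_mode_config *cfg)
{
    if (!cfg->mutex_held) {
        cfg->warnings++;
        fprintf(stderr, "WARNING: connector list walked without the lock\n");
    }
}

/* Models the hotplug work function, with or without the added locking. */
static void toy_dp_work_func(struct toy_mode_config *cfg, bool take_lock)
{
    int i;

    if (take_lock)
        cfg->mutex_held = true;     /* stands in for mutex_lock() */
    for (i = 0; i < cfg->num_connector; i++)
        toy_choose_crtc_dpms(cfg);  /* stands in for radeon_connector_hotplug() */
    if (take_lock)
        cfg->mutex_held = false;    /* stands in for mutex_unlock() */
}

/* Runs one simulated hotplug event and returns the number of warnings. */
static int toy_count_warnings(int connectors, bool take_lock)
{
    struct toy_mode_config cfg = { false, connectors, 0 };

    toy_dp_work_func(&cfg, take_lock);
    return cfg.warnings;
}
```

In this model, taking the lock around the loop yields no warnings, while
skipping it yields one warning per connector.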


However, that patch does not solve my other problem: sometimes, when I turn the
monitor on, the display does not wake and just shows "No Signal". To wake it up
I have to turn the monitor off and on again a few times, unplug the cable, or
(more recently; I'm not sure what changed) turn off the monitor, switch to a
TTY, and turn the monitor back on.

When the monitor turns off or on, I get a seemingly random smattering of errors
in dmesg:

[72940.457950] [drm:radeon_dp_link_train [radeon]] *ERROR* displayport link status failed
[72940.457961] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
[72940.513750] [drm:radeon_dp_link_train [radeon]] *ERROR* displayport link status failed
[72940.513762] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
[72963.581780] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery reached max voltage
[72963.581792] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
[72964.662341] [drm:radeon_dp_link_train [radeon]] *ERROR* displayport link status failed
[72964.662353] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
[72964.718847] [drm:radeon_dp_link_train [radeon]] *ERROR* displayport link status failed
[72964.718864] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed

I enabled debugging via /sys/module/drm/parameters/debug, and some messages
(dp_aux_ch flags not zero: 00000201) led me to believe that this would help:

diff --git a/drivers/gpu/drm/radeon/radeon_dp_auxch.c b/drivers/gpu/drm/radeon/radeon_dp_auxch.c
index 474a8a18..2f30806 100644
--- a/drivers/gpu/drm/radeon/radeon_dp_auxch.c
+++ b/drivers/gpu/drm/radeon/radeon_dp_auxch.c
@@ -27,7 +27,6 @@
 #include "nid.h"
 
 #define AUX_RX_ERROR_FLAGS (AUX_SW_RX_OVERFLOW |            \
-                           AUX_SW_RX_HPD_DISCON |           \
                            AUX_SW_RX_PARTIAL_BYTE |         \
                            AUX_SW_NON_AUX_MODE |            \
                            AUX_SW_RX_SYNC_INVALID_L |       \


Indeed, with this change the monitor consistently wakes up as it should, though
the radeon_dp_link_train errors still show up in dmesg.
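
As a sketch of why removing that one flag from the mask changes the outcome for
a status value of 00000201: the bit positions below are assumptions made for
illustration (they should be checked against nid.h), the TOY_ names are
hypothetical, and only a few of the flags from the mask are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only; verify against nid.h. */
#define TOY_AUX_SW_DONE            (1u << 0)
#define TOY_AUX_SW_RX_OVERFLOW     (1u << 8)
#define TOY_AUX_SW_RX_HPD_DISCON   (1u << 9)
#define TOY_AUX_SW_RX_PARTIAL_BYTE (1u << 10)
#define TOY_AUX_SW_NON_AUX_MODE    (1u << 11)

/* Error mask before the patch (subset of the real mask). */
#define TOY_ERROR_FLAGS_BEFORE (TOY_AUX_SW_RX_OVERFLOW |     \
                                TOY_AUX_SW_RX_HPD_DISCON |   \
                                TOY_AUX_SW_RX_PARTIAL_BYTE | \
                                TOY_AUX_SW_NON_AUX_MODE)

/* Error mask after the patch: HPD_DISCON is no longer treated as an error. */
#define TOY_ERROR_FLAGS_AFTER  (TOY_AUX_SW_RX_OVERFLOW |     \
                                TOY_AUX_SW_RX_PARTIAL_BYTE | \
                                TOY_AUX_SW_NON_AUX_MODE)

/* Returns true if any bit of the error mask is set in the status word. */
static bool toy_aux_status_is_error(uint32_t status, uint32_t error_mask)
{
    return (status & error_mask) != 0;
}
```

Under these assumed bit positions, 0x00000201 is the "done" bit plus the
HPD-disconnect bit, so the transfer is rejected with the old mask and accepted
with the new one.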

Both of these issues have been touched on in one way or another in the past.
The mutex thing was discussed here: https://patchwork.kernel.org/patch/6430431/

And doing a web search for AUX_SW_RX_HPD_DISCON yielded very little useful
information except for an attachment to a similar bug from Alex:
https://bugs.freedesktop.org/attachment.cgi?id=115885&action=edit
I couldn't find any public documentation that describes what those error flags
mean.

My understanding of the problem(s) is limited; I've done what I can to piece it
all together. Let me know if you need more information.

