[Intel-gfx] The mysterious case of IRQs, failed DP aux ch transactions, and Skylake
Runyan, Arthur J
arthur.j.runyan at intel.com
Wed Mar 2 06:40:07 UTC 2016
I like mysteries, but DP and docks usually end up as a sob story.
The master interrupt control doesn't have any hardware connection to DP aux. I don't know how much confidence you have in the IRQ check, but it seems likely that an interrupt, probably hotplug, triggered some programming that messes up aux.
PSR is a concern, but even that would only block 1 transaction on eDP aux, and here you must be using a non-eDP port.
During the 15 msec window, check the Misc IO Power Request in PWR_WELL_CTL. That needs to be enabled to supply power to DP aux.
Also, your DDI_AUX_CTL value, 0x7d40001f, looks wrong since the Fast Wake Sync Pulse Count is zeroed out. Check what value it uses after the 15 msec, when aux is working.
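
For anyone following along, here is a minimal sketch in C of how that status value breaks down into fields. The bit positions are an assumption taken from the DP_AUX_CH_CTL_* definitions in the i915 driver's i915_reg.h for Skylake (sync pulse count in bits 4:0, fast wake sync pulse count in bits 9:5, both stored as count minus one); check them against the header or the hardware spec before relying on them. The struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields (not an i915 type). */
struct aux_ctl_fields {
	bool send_busy;               /* bit 31: transaction in progress */
	bool done;                    /* bit 30: transaction finished */
	bool irq_enable;              /* bit 29: interrupt on done */
	bool timeout_error;           /* bit 28: sink did not reply in time */
	unsigned timeout_sel;         /* bits 27:26: timeout timer select */
	bool receive_error;           /* bit 25: malformed reply */
	unsigned message_size;        /* bits 24:20: bytes in the message */
	unsigned fw_sync_pulse_field; /* bits 9:5: fast wake sync pulses - 1 */
	unsigned sync_pulse_field;    /* bits 4:0: sync pulses - 1 */
};

/* Split a raw DDI_AUX_CTL value into fields (assumed SKL layout). */
static struct aux_ctl_fields aux_ctl_decode(uint32_t v)
{
	struct aux_ctl_fields f;

	f.send_busy           = (v >> 31) & 0x1;
	f.done                = (v >> 30) & 0x1;
	f.irq_enable          = (v >> 29) & 0x1;
	f.timeout_error       = (v >> 28) & 0x1;
	f.timeout_sel         = (v >> 26) & 0x3;
	f.receive_error       = (v >> 25) & 0x1;
	f.message_size        = (v >> 20) & 0x1f;
	f.fw_sync_pulse_field = (v >> 5) & 0x1f;
	f.sync_pulse_field    = v & 0x1f;
	return f;
}
```

Under those assumptions, aux_ctl_decode(0x7d40001f) reports done and timeout error set, no receive error, a message size of 20, a sync pulse field of 31, and a fast wake sync pulse field of 0, which is the zeroed field mentioned above.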
>From: Lyude [mailto:cpaul at redhat.com]
>Sent: Tuesday, March 01, 2016 9:15 AM
>To: Runyan, Arthur J
>Cc: intel-gfx at lists.freedesktop.org; Vetter, Daniel; David Airlie; Rob Clark
>Subject: The mysterious case of IRQs, failed DP aux ch transactions, and Skylake
>Hi! Daniel Vetter referred me to you since you're a hardware guy, and
>also suggested I include the whole intel-gfx list on this.
>So as of late I've been testing the mainline kernel on some new
>production Skylake machines. While it worked perfectly on most of them,
>I've been stumped with a weird issue that arose with a Lenovo ThinkPad
>T560. If I place the laptop in a dock, suspend it, and then resume it,
>monitors connected to the dock don't come back up on resume. It should
>be noted that these docks use DP MST for all of the monitor connections
>they have, so it's basically just an MST hub. As far as I can tell, it
>looks like this only occurs while using the dock. Using normal MST
>monitors doesn't show this issue.
>After doing some investigation, I managed to find where the problem
>starts. So, the main functions of concern in the driver when it comes
>to resume are i915_drm_resume_early() and i915_drm_resume(). The
>problem starts in the latter function, where we reenable interrupts for
>the GPU by calling intel_runtime_pm_enable_interrupts(). If we go down
>a little further, the exact line where the problem starts is
> I915_WRITE(GEN8_MASTER_IRQ, DE_MASTER_IRQ_CONTROL);
>Simple explanation: this writes to the master IRQ control register and
>sets bit 31, the bit that enables/disables all interrupts on
>the GPU. So this is where things get weird: if we start resuming DP MST
>before doing this single register write (by calling
>intel_dp_mst_resume()), everything works perfectly and the screen turns
>back on. If we try resuming DP MST after this register write, all of
>the DP aux transactions time out according to the hardware:
>[ 23.928507] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x7d40001f
>[ 23.938506] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x7d40001f
>[ 23.948587] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x7d40001f
>[ 24.006942] [drm:drm_dp_check_act_status] failed to get ACT bit 1 after 30
>It looks like what happens is that after doing this register write, we
>become unable to successfully do any DP aux transactions for about 15-
>20 msec. After that time passes, everything goes back to normal and DP
>aux works fine again. In fact, if we just wait for 15 msec before
>trying to resume DP MST, the monitors come on perfectly as a result.
>While I'd love to just have a fix as simple as that, unfortunately we'd
>like to know what's actually causing this to happen. What's strange is
>that it doesn't seem like we actually get any interrupts from the GPU
>during that 15-20 msec duration where DP aux stops working, since I
>don't see our IRQ handler for i915 getting called at all during that
>time. Daniel Vetter has suggested it might be the DMC firmware doing
>aux transactions using the PSR block, resulting in the bus being busy,
>but preventing the firmware for the DMC from being loaded at all
>doesn't seem to make a difference.
>Hopefully as a hardware guy you might be able to give us some insight
>as to what's going on. If anyone notices I've missed any important
>details about this, feel free to reply and mention them. Thanks ahead
>of time for the help.