[Bug 108131] [CI][SHARDS] igt@* - dmesg-warn - *ERROR* LSPCON mode hasn't settled

bugzilla-daemon at freedesktop.org
Tue Aug 27 17:29:29 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=108131

Matt Roper <matthew.d.roper at intel.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|medium                      |low

--- Comment #5 from Matt Roper <matthew.d.roper at intel.com> ---
LSPCON ("Level Shifter and Protocol CONverter") refers to a DP -> HDMI adapter
used on these systems. It's a separate downstream device, and when we perform a
suspend/resume cycle, we need it to settle into its PCON mode before using it.
The messages here indicate that, although the LSPCON is responding to DPCD reads
on the aux channel following resume, when we try to check the mode (LS or PCON)
by doing DPCD reads of offset 0x41, all of those reads return "defer" until we
eventually give up and declare a timeout.
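
As a rough illustration of that pattern (this is not the i915 code; every name
below is hypothetical, and a fake clock stands in for real time so the sketch
runs instantly), polling a mode register that keeps answering "defer" until a
deadline passes could look like this:

```python
from dataclasses import dataclass

DEFER = object()  # hypothetical stand-in for an AUX "defer" reply


@dataclass
class FakeClock:
    """Deterministic clock so the sketch needs no real sleeping."""
    now_ms: int = 0

    def sleep(self, ms: int) -> None:
        self.now_ms += ms


def make_adapter(clock: FakeClock, ready_at_ms: int):
    """Fake adapter: defers until ready_at_ms, then reports 'PCON'."""
    def read_mode():
        return "PCON" if clock.now_ms >= ready_at_ms else DEFER
    return read_mode


def poll_mode(read_mode, clock: FakeClock, timeout_ms: int, interval_ms: int = 10):
    """Poll read_mode() until it stops deferring or timeout_ms elapses.

    Returns the reported mode, or None if the deadline passed first.
    """
    deadline = clock.now_ms + timeout_ms
    while True:
        mode = read_mode()
        if mode is not DEFER:
            return mode
        if clock.now_ms >= deadline:
            return None  # give up and declare a timeout
        clock.sleep(interval_ms)
```

With an adapter that stays silent for roughly 1.19 seconds and an assumed
400 ms timeout, the first call to poll_mode() gives up; a caller that simply
calls it again a few times eventually gets an answer.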

Higher-level logic does itself retry probing the LSPCON mode, and the LSPCON
finally starts responding again after more than a second has passed (658.672242
-> 659.860423).
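
For reference, the gap between those two dmesg timestamps can be checked with
a couple of lines (the variable names are only for illustration):

```python
from decimal import Decimal

# The two dmesg timestamps (seconds since boot) quoted above.
start_ts = Decimal("658.672242")
end_ts = Decimal("659.860423")

# Decimal avoids binary floating-point noise in the microsecond digits.
gap_s = end_ts - start_ts
print(f"LSPCON was unresponsive for about {gap_s} seconds")
```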

It's hard to say why the LSPCON flakes out for over a second and fails to
respond to us, but there have been a few upstream changes to extend the
timeouts in places (e.g., "drm/i915: Increase LSPCON timeout").  From the CI
database, it looks like the issue became significantly less common once those
timeouts were extended (last seen two months ago, and the previous occurrence
was five months before that).  We could probably eliminate this completely if
we kept extending timeouts far enough, but that would likely lead to a poor
user experience in situations where we legitimately do need to time out for an
operation (the commit message for the commit above does indicate they chose
400ms rather than the original 1000ms for this reason).

Due to the rarity of this problem and the lack of user-visible impact (the
higher-level code does retry further and gets a response, as we can see in the
logs), I think it's safe to downgrade this bug to 'low' exposure.
