[Intel-gfx] [PATCH v4 0/6] drm/i915/dp_link_training: Define a final failure state when link training fails

Gil Dekel gildekel at chromium.org
Fri Sep 1 19:25:36 UTC 2023


On Fri, Sep 1, 2023 at 3:05 PM Rodrigo Vivi <rodrigo.vivi at intel.com> wrote:
>
> On Fri, Sep 01, 2023 at 02:38:11PM -0400, Rodrigo Vivi wrote:
> > On Wed, Aug 30, 2023 at 05:41:37PM -0400, Lyude Paul wrote:
> > > Other then the name typo (s/Pual/Paul):
> > >
> > > Signed-off-by: Lyude Paul <lyude at redhat.com> (just since I co-authored
> > > things~)
> >
> > I believe having the Co-developed-by: in the patches that you helped
> > out would be nice.
> >
Agreed. I think I'll set Lyude as the author of the two patches she originally
wrote, and set myself as the co-developer. Thank you both for pointing this out.

> > > Reviewed-by: Lyude Paul <lyude at redhat.com>
> > >
> > > I think we definitely want to make sure we get intel's opinions on this
> > > though, especially regarding the usage of link-status. I think we're close
> > > enough to link-status's intended purpose, but I definitely would like to know
> > > what others think about that since userspace will definitely have to handle
> > > situations like this a bit differently than with SST.
>
> I'd like to get Jani, or Ville, or Imre's eyes here. I believe this
> series has a good potential to solve some bad lingering MST issues,
> but I'm indeed concerned with the impact on depending on userspace
> behavior here.
>
I would love to have other userspaces reviewing (or at least ACKing)
this series. Any thoughts on who should be added on the next revision?

> (Besides that potential dead-lock...)
>
Response is on patch 4/5.
> > >
> > > Also - definitely make sure you take a look at Imre's patch series that's
> > > currently on the list (I just finished reviewing it), since it adds some
> > > things to the helpers that might end up being useful here :)
> > >
> > > https://patchwork.freedesktop.org/series/122589/
> > >
Do you have anything particular in mind?

> > > On Thu, 2023-08-24 at 16:50 -0400, Gil Dekel wrote:
> > > > Next version of https://patchwork.freedesktop.org/series/122850/
> > > >
> > > > v4:
> > > >   Another blunder. I uploaded the patches from my ChromeiumOS kernel dev repo
> > > >   instead of drm-tip/drm-tip. Apologies for the noise :(
> > > >
> > > > v3:
> > > >   Still learning the ropes of upstream workflow. Apologies for mucking up v2.
> > > >   This is just a re-upload.
> > > >
> > > > v2:
> > > >   Reorganize into:
> > > >   1) Add for final failure state for SST and MST link training fallback.
> > > >   2) Add a DRM helper for setting downstream MST ports' link-status state.
> > > >   3) Make handling SST and MST connectors simpler via intel_dp.
> > > >   4) Update link-status for downstream MST ports.
> > > >   5) Emit a uevent with the "link-status" trigger property.
> > > >
> > > > v1:
> > > > Currently, when link training fails after all fallback values have been
> > > > exhausted, the i915 driver seizes to send uevents to userspace. This leave
> > > > userspace thinking that the last passing atomic commit was successful, and that
> > > > all connectors (displays) are connected and operational, when in fact, the last
> > > > link failed to train and the displays remain dark. This manifests as "zombie"
> > > > displays in userspace, in which users observe the displays appear in their
> > > > display settings page, but they are dark and unresponsive.
> > > >
> > > > Since, at the time of writing, MST link training fallback is not implemented,
> > > > failing MST link training is a significantly more common case then a complete
> > > > SST link training failure. And with users using MST hubs more than ever to
> > > > connect multiple displays via their USB-C ports we observe this case often.
> > > >
> > > > This patchset series suggest a solution, in which a final failure state is
> > > > defined. In this final state, the connector's bit rate capabilities, namely
> > > > max_link_rate and max_link_lane_count, are set to 0. This effectively set the
> > > > connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
> > > > following connector probing.
> > > >
> > > > Next, with this state defined, we emit a link-status=Bad uevent. The next time
> > > > userspace probes the connector, it should recognize that the connector has no
> > > > modes and ignore it since it is in a bad state.
> > > >
> > > > I am aware that always sending a uevent and never stopping may result in some
> > > > userspaces having their expectations broken and enter an infinite loop of
> > > > modesets and link-training attempts. However, per DRM link-status spec:
> > > > ```
> > > >  * link-status:
> > > >  *      Connector link-status property to indicate the status of link. The
> > > >  *      default value of link-status is "GOOD". If something fails during or
> > > >  *      after modeset, the kernel driver may set this to "BAD" and issue a
> > > >  *      hotplug uevent. Drivers should update this value using
> > > >  *      drm_connector_set_link_status_property().
> > > >  *
> > > >  *      When user-space receives the hotplug uevent and detects a "BAD"
> > > >  *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
> > > >  *      becomes completely black). The list of available modes may have
> > > >  *      changed. User-space is expected to pick a new mode if the current one
> > > >  *      has disappeared and perform a new modeset with link-status set to
> > > >  *      "GOOD" to re-enable the connector.
> > > > ```
> > > > (form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)
> > > >
> > > > it seems reasonable to assume that the suggested state is an extension of the
> > > > spec's guidelines, in which the next new mode userspace picks for a connector
> > > > with no modes is - none, thus breaking the cycle of failed link-training
> > > > attempts.
> > > >
> > > > I suspect that, maybe, zeroing out the bit rate capabilities is not the right
> > > > way to go, and perhaps marking the connector as disconnected instead may be a
> > > > better solution. However, if marking a connector disconnected is the way to go,
> > > > We will have to iterate over all MST ports in the MST case and mark the spawned
> > > > connectors as disconnected as well.
> > >
> > > I -think- this is probably fine, that's likely how I'd
> > >
> > > >
> > > > As a final note I should add that this approach was tested with ChromeOS as
> > > > userspace, and we observed that the zombie displays stop showing up once the
> > > > connectors are pruned of all their modes and are ignored by userspace.
> > > >
> > > > For your consideration and guidance.
> > > > Thanks,
> > > >
> > > > Gil Dekel (6):
> > > >   drm/i915/dp_link_training: Add a final failing state to link training
> > > >     fallback
> > > >   drm/i915/dp_link_training: Add a final failing state to link training
> > > >     fallback for MST
> > > >   drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
> > > >   drm/i915: Move DP modeset_retry_work into intel_dp
> > > >   drm/i915/dp_link_training: Set all downstream MST ports to BAD before
> > > >     retrying
> > > >   drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
> > > >     property
> > > >
> > > >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
> > > >  drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
> > > >  .../drm/i915/display/intel_display_types.h    |  6 +-
> > > >  drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
> > > >  drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
> > > >  .../drm/i915/display/intel_dp_link_training.c | 11 ++-
> > > >  include/drm/display/drm_dp_mst_helper.h       |  3 +
> > > >  7 files changed, 110 insertions(+), 40 deletions(-)
> > > >
> > > > --
> > > > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
> > > >
> > >
> > > --
> > > Cheers,
> > >  Lyude Paul (she/her)
> > >  Software Engineer at Red Hat
> > >

Thanks for your time and comments!
--
Best,
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics


More information about the Intel-gfx mailing list