[Intel-gfx] [PATCH v4 5/6] drm/i915/dp_link_training: Set all downstream MST ports to BAD before retrying

Gil Dekel gildekel at chromium.org
Fri Sep 1 21:13:57 UTC 2023


On Fri, Sep 1, 2023 at 2:55 PM Rodrigo Vivi <rodrigo.vivi at intel.com> wrote:
>
> On Thu, Aug 24, 2023 at 04:50:20PM -0400, Gil Dekel wrote:
> > Before sending a uevent to userspace in order to trigger a corrective
> > modeset, we change the failing connector's link-status to BAD. However,
> > the downstream MST branch ports are left in their original GOOD state.
> >
> > This patch utilizes the drm helper function
> > drm_dp_set_mst_topology_link_status() to rectify this and set all
> > downstream MST connectors' link-status to BAD before emitting the uevent
> > to userspace.
> >
> > Signed-off-by: Gil Dekel <gildekel at chromium.org>
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 16 ++++++++++------
> >  1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 42353b1ac487..e8b10f59e141 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -5995,16 +5995,20 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >       struct intel_dp *intel_dp =
> >               container_of(work, typeof(*intel_dp), modeset_retry_work);
> >       struct drm_connector *connector = &intel_dp->attached_connector->base;
> > -     drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s]\n", connector->base.id,
> > -                 connector->name);
> >
> > -     /* Grab the locks before changing connector property*/
> > -     mutex_lock(&connector->dev->mode_config.mutex);
> > -     /* Set connector link status to BAD and send a Uevent to notify
> > -      * userspace to do a modeset.
> > +     /* Set the connector's (and possibly all its downstream MST ports') link
> > +      * status to BAD.
> >        */
> > +     mutex_lock(&connector->dev->mode_config.mutex);
> > +     drm_dbg_kms(connector->dev, "[CONNECTOR:%d:%s] link status %d -> %d\n",
> > +                 connector->base.id, connector->name,
> > +                 connector->state->link_status, DRM_MODE_LINK_STATUS_BAD);
> >       drm_connector_set_link_status_property(connector,
> >                                              DRM_MODE_LINK_STATUS_BAD);
> > +     if (intel_dp->is_mst) {
> > +             drm_dp_set_mst_topology_link_status(&intel_dp->mst_mgr,
> > +                                                 DRM_MODE_LINK_STATUS_BAD);
>
> Something is weird with the locking here.
> I noticed that on patch 3 this new function also gets the same
> mutex_lock(&connector->dev->mode_config.mutex);
>
> Since you didn't reach the deadlock, I'm clearly missing something
> on the flow. But regardless of what I could be missing, I believe
> this is totally not future proof and we will for sure hit dead-lock
> cases.
>
You are not wrong.

Something must have gone wrong in my workflow: I was positive I had
tested the code with this lock, but I must be misremembering. I retested
my current code and it deadlocked immediately, as you expected.
So thank you for catching this.

Lyude's original patch didn't expose drm_dp_set_mst_topology_link_status()
as a drm helper function, so when I adjusted it for this series, I
decided to add locking similar to what her other function that uses
drm_dp_set_mst_topology_link_status() did. However, I failed to use the
right lock, which is:
    drm_modeset_lock(&connector->dev->mode_config.connection_mutex, NULL);
    drm_modeset_unlock(&connector->dev->mode_config.connection_mutex);
This is similar to how drm_connector_set_link_status_property() locks
before writing to link_status.
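
To illustrate the difference between the two nestings, here is a minimal
userspace sketch (not kernel code). It assumes POSIX error-checking
mutexes as stand-ins for the kernel locks, and config_mutex,
connection_mutex, set_topology_link_status() and retry_work() are
hypothetical names chosen for this example only. An error-checking
pthread mutex reports EDEADLK on a relock by the owning thread, whereas
a kernel mutex would simply block forever.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for mode_config.mutex and
 * mode_config.connection_mutex; these are not the kernel types. */
pthread_mutex_t config_mutex;
pthread_mutex_t connection_mutex;

/* Error-checking mutexes return EDEADLK on a relock by the owner
 * instead of hanging, which makes the problem observable. */
void init_errorcheck_mutex(pthread_mutex_t *m)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(m, &attr);
	pthread_mutexattr_destroy(&attr);
}

/* Models a helper that takes `lock` internally before touching state. */
int set_topology_link_status(pthread_mutex_t *lock)
{
	int ret = pthread_mutex_lock(lock);

	if (ret)
		return ret;	/* EDEADLK if the caller already holds `lock` */
	/* ... mark every downstream port BAD here ... */
	pthread_mutex_unlock(lock);
	return 0;
}

/* Models the retry work: hold config_mutex, then call the helper. */
int retry_work(pthread_mutex_t *helper_lock)
{
	int ret;

	pthread_mutex_lock(&config_mutex);
	ret = set_topology_link_status(helper_lock);
	pthread_mutex_unlock(&config_mutex);
	return ret;
}
```

After both mutexes are initialized with init_errorcheck_mutex(),
retry_work(&config_mutex) models the nesting Rodrigo pointed out and
returns EDEADLK, while retry_work(&connection_mutex) models the
corrected nesting and returns 0.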

I made sure to test my code with the above locks, and it runs well. Here's
an instrumented log excerpt for failing link training with an MST hub
(I hacked the driver to always fail link training on non-eDP connectors
and to print the pointer addresses of the drm_device and mutex right
before locking):
[   43.466329] i915 0000:00:02.0: [drm] *ERROR* Link Training Unsuccessful via gildekel HACK - (not eDP)
[   43.594950] i915 0000:00:02.0: [drm] *ERROR* Link Training Unsuccessful via gildekel HACK - (not eDP)
[   43.594979] i915 0000:00:02.0: [drm] *ERROR* Link Training Unsuccessful
[   43.595023] i915 0000:00:02.0: [drm] *ERROR* [CONNECTOR:273:DP-3]:
[   43.595028] i915 0000:00:02.0: [drm] *ERROR* connector->dev=00000000d4850450
[   43.595033] i915 0000:00:02.0: [drm] *ERROR* connector->dev->mode_config.mutex=00000000aac3fe45
[   44.771091] i915 0000:00:02.0: [drm] *ERROR* [MST-CONNECTOR:300:DP-5]:
[   44.771108] i915 0000:00:02.0: [drm] *ERROR* connector->dev=000000003fb97435
[   44.771115] i915 0000:00:02.0: [drm] *ERROR* &connector->dev->mode_config.connection_mutex=000000009aece20e
[   44.771127] i915 0000:00:02.0: [drm] *ERROR* [MST-CONNECTOR:303:DP-6]:
[   44.771132] i915 0000:00:02.0: [drm] *ERROR* connector->dev=0000000075236b75
[   44.771137] i915 0000:00:02.0: [drm] *ERROR* &connector->dev->mode_config.connection_mutex=000000009aece20e

Also, I was under the assumption that all connectors in an MST topology
should reference the same drm_device object, but it seems like that's
not the case. Is my assumption wrong?

> > +     }
> >       mutex_unlock(&connector->dev->mode_config.mutex);
> >       /* Send Hotplug uevent so userspace can reprobe */
> >       drm_kms_helper_connector_hotplug_event(connector);
> > --
> > Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics


Thanks for your time and comments!
--
Best,
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

