[Intel-gfx] [PATCH 3/3] drm/i915: Implement Link Rate fallback on Link training failure

Wed Nov 30 16:56:19 UTC 2016

On Wed, Nov 30, 2016 at 09:36:33AM +0100, Daniel Vetter wrote:
> On Tue, Nov 29, 2016 at 11:30:33PM -0800, Manasi Navare wrote:
> > If link training at a link rate optimal for a particular
> > mode fails during modeset's atomic commit phase, then we
> > let the modeset complete and then retry. We save the link rate
> > value at which link training failed, update the link status property
> > to "BAD" and use a lower link rate to prune the modes. It will redo
> > the modeset on the current mode at lower link rate or if the current
> > mode gets pruned due to lower link constraints, then it will send a
> > hotplug uevent for userspace to handle it.
> > 
> > This is also required to pass DP CTS tests 4.3.1.3, 4.3.1.4,
> > 4.3.1.6.
> > 
> > v9:
> > * Use the trimmed max values of link rate/lane count based on
> > link train fallback (Daniel Vetter)
> > v8:
> > * Set link_status to BAD first and then call mode_valid (Jani Nikula)
> > v7:
> > * Remove the redundant variable in the previous patch itself
> > v6:
> > * Obtain link rate index from fallback_link_rate using
> > the helper intel_dp_link_rate_index (Jani Nikula)
> > * Include fallback within intel_dp_start_link_train (Jani Nikula)
> > v5:
> > * Move set link status to drm core (Daniel Vetter, Jani Nikula)
> > v4:
> > * Add fallback support for non DDI platforms too
> > * Set connector->link status inside set_link_status function
> > (Jani Nikula)
> > v3:
> > * Set link status property to BAD unconditionally (Jani Nikula)
> > * Don't use two separate variables link_train_failed and link_status
> > to indicate same thing (Jani Nikula)
> > v2:
> > * Squashed a few patches (Jani Nikula)
> > 
> > Acked-by: Tony Cheng <tony.cheng at amd.com>
> > Acked-by: Harry Wentland <Harry.wentland at amd.com>
> > Cc: Jani Nikula <jani.nikula at linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter at intel.com>
> > Cc: Ville Syrjala <ville.syrjala at linux.intel.com>
> > Signed-off-by: Manasi Navare <manasi.d.navare at intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_dp.c               | 44 +++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_dp_link_training.c | 25 +++++++++++++--
> >  drivers/gpu/drm/i915/intel_drv.h              |  3 ++
> >  3 files changed, 70 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > index bc1268c..a50b6cd 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -4435,6 +4435,8 @@ static bool intel_digital_port_connected(struct drm_i915_private *dev_priv,
> >  		intel_dp->compliance_test_active = 0;
> >  		intel_dp->compliance_test_type = 0;
> >  		intel_dp->compliance_test_data = 0;
> > +		intel_dp->fallback_link_rate = 0;
> > +		intel_dp->fallback_lane_count = 0;
> 
> Hm, I thought we agreed on irc to just track the max link rate/lane count
> at all times, instead of fallback values that sometimes are valid and
> sometimes reset to 0.
> 
> Also, resetting to 0 here is wrong, since this is not the long-pulse hpd
> handler. That probably also explains why you need the hack below, but not
> sure.
> 
> >  
> >  		if (intel_dp->is_mst) {
> >  			DRM_DEBUG_KMS("MST device may have disappeared %d vs %d\n",
> > @@ -4526,6 +4528,13 @@ static bool intel_digital_port_connected(struct drm_i915_private *dev_priv,
> >  	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
> >  		      connector->base.id, connector->name);
> >  
> > +	/* If this is a retry due to link training failure
> > +	 * then do not do a full detect
> > +	 */
> > +	if (status == connector_status_connected &&
> > +	    intel_dp->fallback_lane_count)
> > +		return status;
> 
> That sounds very wrong. Why do we need it?
> 
> > +
> >  	/* If full detect is not performed yet, do a full detect */
> >  	if (!intel_dp->detect_done)
> >  		status = intel_dp_long_pulse(intel_dp->attached_connector);
> > @@ -5690,6 +5699,37 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
> >  	return false;
> >  }
> >  
> > +static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> > +{
> > +	struct intel_connector *intel_connector;
> > +	struct drm_connector *connector;
> > +	struct drm_display_mode *mode;
> > +	bool verbose_prune = true;
> > +
> > +	intel_connector = container_of(work, typeof(*intel_connector),
> > +				       modeset_retry_work);
> > +	connector = &intel_connector->base;
> > +	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
> > +		      connector->name);
> > +
> > +	/* Grab the locks before changing connector property*/
> > +	mutex_lock(&connector->dev->mode_config.mutex);
> > +	/* Set connector link status to BAD and send a Uevent to notify
> > +	 * userspace to do a modeset.
> > +	 */
> > +	drm_mode_connector_set_link_status_property(connector,
> > +						    DRM_MODE_LINK_STATUS_BAD);
> > +	list_for_each_entry(mode, &connector->modes, head) {
> > +		mode->status = intel_dp_mode_valid(connector,
> > +						   mode);
> > +	}
> > +	drm_mode_prune_invalid(connector->dev, &connector->modes,
> > +			       verbose_prune);
> 
> This call to drm_mode_prune_invalid is probably just to paper over a bug
> in SNA - SNA violates the hotplug handling uabi by not unconditionally
> reprobing. Inconsistently papering over that bug in the kernel is not good,

I would categorize pruning here as "avoid pointless work later". Of
course with the current uabi that might not happen anyway. Would be
nice if it eventually did however.
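
To make the "avoid pointless work" point concrete, here is a minimal
standalone sketch of a fallback step followed by a bandwidth check of the
kind a mode_valid hook could apply. It is not the i915 code: struct
link_cfg, link_fallback() and mode_fits() are hypothetical names, and the
fallback order (lower the rate first, then halve the lane count and go back
to the highest rate) is an assumption, since the quoted hunks do not show
the fallback helper. The check assumes 8b/10b coding, so each lane carries
one data byte per link symbol.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical link configuration; not the i915 structures. */
struct link_cfg {
	int rate_khz;   /* per-lane symbol clock, e.g. 162000, 270000, 540000 */
	int lane_count; /* 1, 2 or 4 */
};

/*
 * Step to the next lower configuration after a link training failure.
 * rates[] must be sorted in ascending order. Returns false when the
 * current rate is unknown or nothing lower is left to try.
 */
static bool link_fallback(const int *rates, size_t n_rates,
			  struct link_cfg *cfg)
{
	size_t i;

	for (i = 0; i < n_rates; i++)
		if (rates[i] == cfg->rate_khz)
			break;
	if (i == n_rates)
		return false;

	if (i > 0) {
		/* Assumed order: try the next lower rate first. */
		cfg->rate_khz = rates[i - 1];
		return true;
	}
	if (cfg->lane_count > 1) {
		/* Lowest rate already failed: halve lanes, restart at max rate. */
		cfg->rate_khz = rates[n_rates - 1];
		cfg->lane_count /= 2;
		return true;
	}
	return false;
}

/*
 * True if the mode's data rate fits on the link. With 8b/10b coding each
 * lane moves one byte per symbol, so rate_khz * lane_count is in kB/s.
 */
static bool mode_fits(int pixel_clock_khz, int bpp,
		      const struct link_cfg *cfg)
{
	long long required = ((long long)pixel_clock_khz * bpp + 7) / 8;
	long long available = (long long)cfg->rate_khz * cfg->lane_count;

	return required <= available;
}
```

With a 148500 kHz, 24 bpp mode on two lanes, the mode fits at 270000 kHz
but not after one fallback step to 162000 kHz. Pruning it at that point
means userspace never gets to pick a mode that is already known not to fit.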

-- 
Ville Syrjälä
Intel OTC