[Intel-gfx] [PATCH 3/3] drm/i915: Implement Link Rate fallback on Link training failure

Wed Nov 30 16:27:50 UTC 2016

On Wed, Nov 30, 2016 at 09:36:33AM +0100, Daniel Vetter wrote:
> On Tue, Nov 29, 2016 at 11:30:33PM -0800, Manasi Navare wrote:
> > If link training at a link rate optimal for a particular
> > mode fails during modeset's atomic commit phase, then we
> > let the modeset complete and then retry. We save the link rate
> > value at which link training failed, update the link status property
> > to "BAD" and use a lower link rate to prune the modes. It will redo
> > the modeset on the current mode at lower link rate or if the current
> > mode gets pruned due to lower link constraints then, it will send a
> > hotplug uevent for userspace to handle it.
> > 
> > This is also required to pass DP CTS tests 4.3.1.3, 4.3.1.4,
> > 4.3.1.6.
> > 
> > v9:
> > * Use the trimmed max values of link rate/lane count based on
> > link train fallback (Daniel Vetter)
> > v8:
> > * Set link_status to BAD first and then call mode_valid (Jani Nikula)
> > v7:
> > * Remove the redundant variable in the previous patch itself
> > v6:
> > * Obtain link rate index from fallback_link_rate using
> > the helper intel_dp_link_rate_index (Jani Nikula)
> > * Include fallback within intel_dp_start_link_train (Jani Nikula)
> > v5:
> > * Move set link status to drm core (Daniel Vetter, Jani Nikula)
> > v4:
> > * Add fallback support for non DDI platforms too
> > * Set connector->link status inside set_link_status function
> > (Jani Nikula)
> > v3:
> > * Set link status property to BAD unconditionally (Jani Nikula)
> > * Don't use two separate variables link_train_failed and link_status
> > to indicate the same thing (Jani Nikula)
> > v2:
> > * Squashed a few patches (Jani Nikula)
> > 
> > Acked-by: Tony Cheng <tony.cheng at amd.com>
> > Acked-by: Harry Wentland <Harry.wentland at amd.com>
> > Cc: Jani Nikula <jani.nikula at linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter at intel.com>
> > Cc: Ville Syrjala <ville.syrjala at linux.intel.com>
> > Signed-off-by: Manasi Navare <manasi.d.navare at intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_dp.c               | 44 +++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_dp_link_training.c | 25 +++++++++++++--
> >  drivers/gpu/drm/i915/intel_drv.h              |  3 ++
> >  3 files changed, 70 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > index bc1268c..a50b6cd 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -4435,6 +4435,8 @@ static bool intel_digital_port_connected(struct drm_i915_private *dev_priv,
> >  		intel_dp->compliance_test_active = 0;
> >  		intel_dp->compliance_test_type = 0;
> >  		intel_dp->compliance_test_data = 0;
> > +		intel_dp->fallback_link_rate = 0;
> > +		intel_dp->fallback_lane_count = 0;
> 
> Hm, I thought we agreed on irc to just track the max link rate/lane count
> at all times, instead of fallback values that sometimes are valid and
> sometimes reset to 0.
> 
> Also, resetting to 0 here is wrong, since this is not the long-pulse hpd
> handler. That probably also explains why you need the hack below, but not
> sure.
> 
> >  
> >  		if (intel_dp->is_mst) {
> >  			DRM_DEBUG_KMS("MST device may have disappeared %d vs %d\n",
> > @@ -4526,6 +4528,13 @@ static bool intel_digital_port_connected(struct drm_i915_private *dev_priv,
> >  	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
> >  		      connector->base.id, connector->name);
> >  
> > +	/* If this is a retry due to link training failure
> > +	 * then do not do a full detect
> > +	 */
> > +	if (status == connector_status_connected &&
> > +	    intel_dp->fallback_lane_count)
> > +		return status;
> 
> That sounds very wrong. Why do we need it?
> 
> > +
> >  	/* If full detect is not performed yet, do a full detect */
> >  	if (!intel_dp->detect_done)
> >  		status = intel_dp_long_pulse(intel_dp->attached_connector);
> > @@ -5690,6 +5699,37 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
> >  	return false;
> >  }
> >  
> > +static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> > +{
> > +	struct intel_connector *intel_connector;
> > +	struct drm_connector *connector;
> > +	struct drm_display_mode *mode;
> > +	bool verbose_prune = true;
> > +
> > +	intel_connector = container_of(work, typeof(*intel_connector),
> > +				       modeset_retry_work);
> > +	connector = &intel_connector->base;
> > +	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
> > +		      connector->name);
> > +
> > +	/* Grab the locks before changing connector property*/
> > +	mutex_lock(&connector->dev->mode_config.mutex);
> > +	/* Set connector link status to BAD and send a Uevent to notify
> > +	 * userspace to do a modeset.
> > +	 */
> > +	drm_mode_connector_set_link_status_property(connector,
> > +						    DRM_MODE_LINK_STATUS_BAD);
> > +	list_for_each_entry(mode, &connector->modes, head) {
> > +		mode->status = intel_dp_mode_valid(connector,
> > +						   mode);
> > +	}
> > +	drm_mode_prune_invalid(connector->dev, &connector->modes,
> > +			       verbose_prune);
> 
> This call to drm_mode_prune_invalid is probably just to paper over a bug
> in SNA - SNA violates the hotplug handling uabi by not unconditionally
> reprobing. Inconsistently paper over that bug in the kernel is not good,
> userspace interfaces need to be well defined. Please remove this call and
> test with either UXA or -modesetting until SNA is fixed.

The prune is not required for link retraining in userspace: userspace's
response to seeing link-status == BAD is to retrain the current mode,
and then to check link-status again to see whether the modeset worked
(returning an error from the modeset itself seems troublesome). Only
after that does it reprobe. However, the kernel must still fail the
modeset that follows link-status = BAD if it decides that the mode is
no longer valid.
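
To make that ordering concrete, here is a minimal sketch of the flow in
plain C. It is only a model under stated assumptions, not the SNA or
libdrm code: every name in it (fake_connector, fake_kernel_modeset,
handle_hotplug, the retry_result values) is hypothetical, and link
training is reduced to comparing link rates in kHz.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this models the flow, not a real API. */
enum link_status { LINK_STATUS_GOOD, LINK_STATUS_BAD };

enum retry_result {
	RETRY_NOT_NEEDED,	/* link-status was already GOOD */
	RETRY_OK,		/* modeset redone, link-status is GOOD again */
	RETRY_REJECTED,		/* kernel failed the modeset: mode no longer valid */
	RETRY_STILL_BAD		/* modeset accepted, link-status is still BAD */
};

struct fake_connector {
	enum link_status link_status;
	int max_rate_khz;	/* highest rate the kernel will still try */
	int mode_min_rate_khz;	/* rate the current mode needs */
	int sink_good_rate_khz;	/* highest rate at which training succeeds */
	int reprobe_count;
};

/* Stand-in for the kernel side of a modeset on the current mode. */
static bool fake_kernel_modeset(struct fake_connector *c)
{
	/* The mode no longer fits the reduced link: fail the modeset. */
	if (c->mode_min_rate_khz > c->max_rate_khz)
		return false;

	/* Otherwise let it complete and report the outcome via link-status. */
	c->link_status = c->max_rate_khz <= c->sink_good_rate_khz ?
			 LINK_STATUS_GOOD : LINK_STATUS_BAD;
	return true;
}

/* Userspace: retry the current mode, re-check link-status, then reprobe. */
static enum retry_result handle_hotplug(struct fake_connector *c)
{
	enum retry_result res = RETRY_NOT_NEEDED;

	assert(c != NULL);

	if (c->link_status == LINK_STATUS_BAD) {
		if (!fake_kernel_modeset(c))
			res = RETRY_REJECTED;
		else if (c->link_status == LINK_STATUS_BAD)
			res = RETRY_STILL_BAD;
		else
			res = RETRY_OK;
	}

	c->reprobe_count++;	/* the reprobe happens unconditionally */
	return res;
}
```

With the link already dropped to 270000 kHz and a mode that needs only
162000 kHz, handle_hotplug() would report RETRY_OK in this model; if the
mode needs more than the reduced maximum, the modeset is rejected and
link-status stays BAD until userspace picks another mode after the
reprobe.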
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre