[Intel-gfx] [PATCH v7 1/6] drm/i915: Fallback to lower link rate and lane count during link training

Thu Sep 29 23:17:06 UTC 2016

On Thu, Sep 29, 2016 at 09:05:01AM -0700, Manasi Navare wrote:
> On Thu, Sep 29, 2016 at 06:48:43PM +0300, Jani Nikula wrote:
> > On Thu, 29 Sep 2016, Ville Syrjälä <ville.syrjala at linux.intel.com> wrote:
> > > On Thu, Sep 29, 2016 at 12:44:19PM +0100, Chris Wilson wrote:
> > >> On Thu, Sep 29, 2016 at 02:26:16PM +0300, Jani Nikula wrote:
> > >> > On Thu, 29 Sep 2016, Manasi Navare <manasi.d.navare at intel.com> wrote:
> > >> > > On Tue, Sep 27, 2016 at 08:07:01PM +0300, Jani Nikula wrote:
> > >> > >> On Tue, 27 Sep 2016, Manasi Navare <manasi.d.navare at intel.com> wrote:
> > >> > >> > On Mon, Sep 26, 2016 at 04:39:34PM +0300, Jani Nikula wrote:
> > >> > >> >> On Fri, 16 Sep 2016, Manasi Navare <manasi.d.navare at intel.com> wrote:
> > >> > >> >> > According to the DisplayPort Spec, in case of Clock Recovery failure
> > >> > >> >> > the link training sequence should fall back to the lower link rate
> > >> > >> >> > followed by lower lane count until CR succeeds.
> > >> > >> >> > On CR success, the sequence proceeds with Channel EQ.
> > >> > >> >> > In case of Channel EQ failures, it should fallback to
> > >> > >> >> > lower link rate and lane count and start the CR phase again.
> > >> > >> >> 
> > >> > >> >> This change makes the link training start at the max lane count and max
> > >> > >> >> link rate. This is not ideal, as it wastes the link. And it is not a
> > >> > >> >> spec requirement. "The Link Policy Maker of the upstream device may
> > >> > >> >> choose any link count and link rate as long as they do not exceed the
> > >> > >> >> capabilities of the DP receiver."
> > >> > >> >> 
> > >> > >> >> Our current code starts at the minimum required bandwidth for the mode,
> > >> > >> >> therefore we can't fall back to lower link rate and lane count without
> > >> > >> >> reducing the mode.
> > >> > >> >> 
> > >> > >> >> AFAICT this patch here makes it possible for the link bandwidth to drop
> > >> > >> >> below what is required for the mode. This is unacceptable.
> > >> > >> >> 
> > >> > >> >> BR,
> > >> > >> >> Jani.
> > >> > >> >> 
> > >> > >> >>
> > >> > >> >
> > >> > >> > Thanks Jani for your review comments.
> > >> > >> > Yes in this change we start at the max link rate and lane count. This
> > >> > >> > change was made according to the design document discussions we had
> > >> > >> > before strating this DP Redesign project. The main reason for starting
> > >> > >> > at the maxlink rate and max lane count was for ensuring proper
> > >> > >> > behavior of DP MST. In case of DP MST, we want to train the link at
> > >> > >> > the maximum supported link rate/lane count based on an early/ upfront
> > >> > >> > link training result so that we dont fail when we try to connect a
> > >> > >> > higher resolution monitor as a second monitor. This a trade off
> > >> > >> > between wsting the link or higher power vs. needing to retrain for
> > >> > >> > every monitor that requests a higher BW in case of DP MST.
> > >> > >> 
> > >> > >> We already train at max bandwidth for DP MST, which seems to be the
> > >> > >> sensible thing to do.
> > >> > >> 
> > >> > >> > Actually this is also the reason for enabling upfront link training in
> > >> > >> > the following patch where we train the link much ahead in the modeset
> > >> > >> > sequence to understand the link rate and lane count values at which
> > >> > >> > the link can be successfully trained and then the link training
> > >> > >> > through modeset will always start at the upfront values (maximum
> > >> > >> > supported values of lane count and link rate based on upfront link
> > >> > >> > training).
> > >> > >> 
> > >> > >> I don't see a need to do this for DP SST.
> > >> > >> 
> > >> > >> > As per the CTS, all the test 4.3.1.4 requires that you fall back to
> > >> > >> > the lower link rate after trying to train at the maximum link rate
> > >> > >> > advertised through the DPCD registers.
> > >> > >> 
> > >> > >> That test does not require the source DUT to default to maximum lane
> > >> > >> count or link rate of the sink. The source may freely choose the lane
> > >> > >> count and link rate as long as they don't exceed sink capabilities.
> > >> > >> 
> > >> > >> For the purposes of the test, the test setup can request specific
> > >> > >> parameters to be used, but that does not mean using maximum by
> > >> > >> *default*.
> > >> > >> 
> > >> > >> We currently lack the feature to reduce lane count and link rate. The
> > >> > >> key to understand here is that starting at max and reducing down to the
> > >> > >> sufficient parameters for the mode (which is where we start now) offers
> > >> > >> no real benefit for any use case. What we're lacking is a feature to
> > >> > >> reduce the link parameters *below* what's required by the mode the
> > >> > >> userspace wants. This can only be achieved through cooperation with
> > >> > >> userspace.
> > >> > >> 
> > >> > >
> > >> > > We can train at the optimal link rate required for the requested mode as
> > >> > > done in the existing implementation and retrain whenever the link training
> > >> > > test request is sent. 
> > >> > > For the test 4.3.1.4 in CTS, it does force a failure in CR and expects the
> > >> > > driver to fall back to even lower link rate. We do not implement this in the
> > >> > > current driver and so this test fails. Could you elaborate on how this can
> > >> > > be achieved with the the cooperation with userspace?
> > >> > > Should we send a uevent to the userspace asking to retry at a lower resolution
> > >> > > after retraining at the lower link rate?
> > >> > > This is pertty much the place where majority of the compliance tests are failing.
> > >> > > How can we pass compliance with regards to this feature?
> > >> > 
> > >> > So here's an idea Ville and I came up with. It's not completely thought
> > >> > out yet, probably has some wrinkles still, but then there are wrinkles
> > >> > with the upfront link training too (I'll get back to those separately).
> > >> > 
> > >> > If link training fails during modeset (either for real or because it's a
> > >> > test sink that wants to test failures), we 1) store the link parameters
> > >> > as failing, 2) send a uevent to userspace, hopefully getting the
> > >> > userspace to do another get modes and try again, 3) propage errors from
> > >> > modeset.
> > >> 
> > >> userspace already tries to do a reprobe after a setcrtc fails, to try
> > >> and gracefully handle the race between hotplug being in its event queue
> > >> and performing setcrtc, i.e. I think the error is enough.
> > >
> > > I presume we want the modeset to be async, so by the time we notice the
> > > problem we're no longer in the ioctl.
> > 
> > IOW, we'll just need to send the hotplug uevent anyway.
> > 
> > BR,
> > Jani.
> >
> 
> I am going to try to implement a the code where if wefail link training at a 
> particular link rate then i send the uevent to the userspace saving off the
> values at which thelink training failed so that these values can be used in the next
> attempt of the modeset to prune the modes accordingly and link training should be
> tried in that attempt with the lower link rate. The hope is that this will make the
> compliance test 4.3.1.4 happy.
> 
> Regards
> Manasi

This is what I am doing when we get a test request to train at a particular rate:
if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING)) {
                        intel_dp_set_link_params(intel_dp,
                                                 drm_dp_bw_code_to_link_rate(intel_dp->
                                                                             compliance_test_link_rate),
                                                 intel_dp->compliance_test_lane_count,
                                                 false);
                	drm_kms_helper_hotplug_event(intel_encoder->base.dev); 
	}

I see in the dmesg that it sends a hotplug uevent to the userspace that triggers a drm_setup_crtcs()
But it finds that the connector is already enabled and has a CRTC so it does not go ahead with 
compute_config. Do we need to disable the crtc and update the atomic state before generating
this uevent? How can this be done?

Manasi

> > -- 
> > Jani Nikula, Intel Open Source Technology Center
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx