[Intel-gfx] [PATCH 2/2] drm/i915/lspcon: Add workaround for resuming in PCON mode

Jani Nikula jani.nikula at intel.com
Fri Oct 21 08:51:18 UTC 2016


On Fri, 21 Oct 2016, Imre Deak <imre.deak at intel.com> wrote:
> On Fri, 2016-10-21 at 00:40 +0300, Imre Deak wrote:
>> On Thu, 2016-10-20 at 23:50 +0300, Jani Nikula wrote:
>> > On Thu, 20 Oct 2016, Imre Deak <imre.deak at intel.com> wrote:
>> > > On Thu, 2016-10-20 at 22:24 +0300, Jani Nikula wrote:
>> > > > On Thu, 20 Oct 2016, Jani Nikula <jani.nikula at intel.com> wrote:
>> > > > > On Thu, 20 Oct 2016, Imre Deak <imre.deak at intel.com> wrote:
>> > > > > > On my APL the LSPCON firmware resumes in PCON mode as
>> > > > > > opposed to the
>> > > > > > expected LS mode. It also appears to be in a state where
>> > > > > > AUX DPCD reads
>> > > > > > will succeed but return garbage recovering only after a few
>> > > > > > hundreds of
>> > > > > > milliseconds. After the recovery time DPCD reads will
>> > > > > > result in the
>> > > > > > correct values and things will continue to work. If I2C
>> > > > > > over AUX is
>> > > > > > attempted during this recovery time (implying an AUX write
>> > > > > > transaction)
>> > > > > > the firmware won't recover and will stay in this broken
>> > > > > > state.
>> > > > > > 
>> > > > > > As a workaround check if the firmware is in PCON state
>> > > > > > after resume and
>> > > > > > if so wait until the correct DPCD values are returned. For
>> > > > > > this we
>> > > > > > compare the branch descriptor with the one we cached during
>> > > > > > init time.
>> > > > > > If the firmware was in the LS state, we skip the w/a and
>> > > > > > continue as
>> > > > > > before.
>> > > > > > 
>> > > > > > Cc: Shashank Sharma <shashank.sharma at intel.com>
>> > > > > > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>> > > > > > Cc: Jani Nikula <jani.nikula at intel.com>
>> > > > > > Signed-off-by: Imre Deak <imre.deak at intel.com>
>> > > > > > ---
>> > > > > >  drivers/gpu/drm/i915/intel_dp.c     |  2 +-
>> > > > > >  drivers/gpu/drm/i915/intel_drv.h    |  6 ++++-
>> > > > > >  drivers/gpu/drm/i915/intel_lspcon.c | 52
>> > > > > > ++++++++++++++++++++++++++++++-------
>> > > > > >  3 files changed, 48 insertions(+), 12 deletions(-)
>> > > > > > 
>> > > > > > diff --git a/drivers/gpu/drm/i915/intel_dp.c
>> > > > > > b/drivers/gpu/drm/i915/intel_dp.c
>> > > > > > index e90211e..ec031db 100644
>> > > > > > --- a/drivers/gpu/drm/i915/intel_dp.c
>> > > > > > +++ b/drivers/gpu/drm/i915/intel_dp.c
>> > > > > > @@ -3487,7 +3487,7 @@ intel_dp_link_down(struct intel_dp
>> > > > > > *intel_dp)
>> > > > > >  	intel_dp->DP = DP;
>> > > > > >  }
>> > > > > >  
>> > > > > > -static bool
>> > > > > > +bool
>> > > > > >  intel_dp_read_dpcd(struct intel_dp *intel_dp)
>> > > > > >  {
>> > > > > >  	if (drm_dp_dpcd_read(&intel_dp->aux, 0x000,
>> > > > > > intel_dp->dpcd,
>> > > > > > diff --git a/drivers/gpu/drm/i915/intel_drv.h
>> > > > > > b/drivers/gpu/drm/i915/intel_drv.h
>> > > > > > index a35e241..9a2366e 100644
>> > > > > > --- a/drivers/gpu/drm/i915/intel_drv.h
>> > > > > > +++ b/drivers/gpu/drm/i915/intel_drv.h
>> > > > > > @@ -972,7 +972,9 @@ struct intel_dp {
>> > > > > >  struct intel_lspcon {
>> > > > > >  	bool active;
>> > > > > >  	enum drm_lspcon_mode mode;
>> > > > > > -	struct drm_dp_aux *aux;
>> > > > > > +	struct intel_dp *intel_dp;
>> > > > > > +	bool desc_valid;
>> > > > > > +	struct intel_dp_desc desc;
>> > > > > 
>> > > > > I guess we could cache the desc in intel_dp directly.
>> > > > > Independent of
>> > > > > this patch.
>> > > > > 
>> > > > > Also, I'm wondering if we could stick with just aux here, and
>> > > > > read
>> > > > > something else from dpcd instead.
>> > > > > 
>> > > > > >  };
>> > > > > >  
>> > > > > >  struct intel_digital_port {
>> > > > > > @@ -1469,6 +1471,8 @@ static inline unsigned int
>> > > > > > intel_dp_unused_lane_mask(int lane_count)
>> > > > > >  }
>> > > > > >  
>> > > > > >  bool
>> > > > > > +intel_dp_read_dpcd(struct intel_dp *intel_dp);
>> > > > > > +bool
>> > > > > >  intel_dp_read_desc(struct intel_dp *intel_dp, struct
>> > > > > > intel_dp_desc *desc);
>> > > > > >  void
>> > > > > >  intel_dp_print_desc(struct intel_dp *intel_dp, struct
>> > > > > > intel_dp_desc *desc);
>> > > > > > diff --git a/drivers/gpu/drm/i915/intel_lspcon.c
>> > > > > > b/drivers/gpu/drm/i915/intel_lspcon.c
>> > > > > > index d2c8cb2..54c6173 100644
>> > > > > > --- a/drivers/gpu/drm/i915/intel_lspcon.c
>> > > > > > +++ b/drivers/gpu/drm/i915/intel_lspcon.c
>> > > > > > @@ -30,7 +30,7 @@
>> > > > > >  static enum drm_lspcon_mode lspcon_get_current_mode(struct
>> > > > > > intel_lspcon *lspcon)
>> > > > > >  {
>> > > > > >  	enum drm_lspcon_mode current_mode =
>> > > > > > DRM_LSPCON_MODE_INVALID;
>> > > > > > -	struct i2c_adapter *adapter = &lspcon->aux->ddc;
>> > > > > > +	struct i2c_adapter *adapter = &lspcon->intel_dp-
>> > > > > > >aux.ddc;
>> > > > > >  
>> > > > > >  	if (drm_lspcon_get_mode(adapter, ¤t_mode))
>> > > > > >  		DRM_ERROR("Error reading LSPCON mode\n");
>> > > > > > @@ -45,7 +45,7 @@ static int lspcon_change_mode(struct
>> > > > > > intel_lspcon *lspcon,
>> > > > > >  {
>> > > > > >  	int err;
>> > > > > >  	enum drm_lspcon_mode current_mode;
>> > > > > > -	struct i2c_adapter *adapter = &lspcon->aux->ddc;
>> > > > > > +	struct i2c_adapter *adapter = &lspcon->intel_dp-
>> > > > > > >aux.ddc;
>> > > > > >  
>> > > > > >  	err = drm_lspcon_get_mode(adapter, ¤t_mode);
>> > > > > >  	if (err) {
>> > > > > > @@ -72,7 +72,7 @@ static int lspcon_change_mode(struct
>> > > > > > intel_lspcon *lspcon,
>> > > > > >  static bool lspcon_probe(struct intel_lspcon *lspcon)
>> > > > > >  {
>> > > > > >  	enum drm_dp_dual_mode_type adaptor_type;
>> > > > > > -	struct i2c_adapter *adapter = &lspcon->aux->ddc;
>> > > > > > +	struct i2c_adapter *adapter = &lspcon->intel_dp-
>> > > > > > >aux.ddc;
>> > > > > >  
>> > > > > >  	/* Lets probe the adaptor and check its type */
>> > > > > >  	adaptor_type = drm_dp_dual_mode_detect(adapter);
>> > > > > > @@ -89,8 +89,42 @@ static bool lspcon_probe(struct
>> > > > > > intel_lspcon *lspcon)
>> > > > > >  	return true;
>> > > > > >  }
>> > > > > >  
>> > > > > > +static void lspcon_resume_in_pcon_wa(struct intel_lspcon
>> > > > > > *lspcon)
>> > > > > > +{
>> > > > > > +	unsigned long start = jiffies;
>> > > > > > +
>> > > > > > +	if (!lspcon->desc_valid)
>> > > > > > +		return;
>> > > > > > +
>> > > > > > +	while (1) {
>> > > > > > +		struct intel_dp_desc desc;
>> > > > > > +
>> > > > > > +		/*
>> > > > > > +		 * The w/a only applies in PCON mode and
>> > > > > > we don't expect any
>> > > > > > +		 * AUX errors.
>> > > > > > +		 */
>> > > > > > +		if (!intel_dp_read_desc(lspcon->intel_dp,
>> > > > > > &desc))
>> > > > > > +			return;
>> > > > > > +
>> > > > > > +		if (!memcmp(&lspcon->desc, &desc,
>> > > > > > sizeof(desc))) {
>> > > > > > +			DRM_DEBUG_KMS("LSPCON recovering
>> > > > > > in PCON mode after %u ms\n",
>> > > > > > +				      jiffies_to_msecs(jif
>> > > > > > fies - start));
>> > > > > > +			return;
>> > > > > > +		}
>> > > > > > +
>> > > > > > +		if (time_after(jiffies, start +
>> > > > > > msecs_to_jiffies(1000)))
>> > > > > > +			break;
>> > > > > > +
>> > > > > > +		msleep(10);
>> > > > > > +	}
>> > > > > > +
>> > > > > > +	DRM_DEBUG_KMS("LSPCON DP descriptor mismatch after
>> > > > > > resume\n");
>> > > > > > +}
>> > > > > 
>> > > > > I think we need to go through the LSPCON spec once more
>> > > > > before we merge
>> > > > > this. Maybe we don't reset the LSPCON properly. Maybe we
>> > > > > don't handle
>> > > > > the resume properly. Maybe this isn't a "workaround" after
>> > > > > all.
>> > > > 
>> > > > Looks like this stuff is normal behaviour if the LSPCON has
>> > > > been in low
>> > > > power state, and we need to wake it up. In PCON mode, this is
>> > > > done using
>> > > > DP_SET_POWER. And while waking up, it'll issue AUX DEFER, as we
>> > > > can see
>> > > > from the logs.
>> > > 
>> > > I did think about that, but waking the device via
>> > > DP_SET_POWER(D0)
>> > > didn't help. Doing any AUX write transactions actually gets the
>> > > device
>> > > into a stuck state for good (until next reboot on APL).
>> > > 
>> > > More over, we don't get AUX DEFERs the AUX transactions succeed,
>> > > it's
>> > > only that reads will return corrupted values for the given time
>> > > period.
>> > 
>> > The bug at hand has native defers
>> > https://bugs.freedesktop.org/show_bug.cgi?id=98353
>> 
>> For I2C-over-AUX I also get defers, but not for DPCD transactions for
>> native AUX registers for instance DP_SET_POWER or the OUI, device-ID
>> registers. I also tried the LS mode wake-up sequence via
>> I2C-over-AUX (offset 0x20), but didn't help either.
>> 
>> The bug above could be a different issue, although I did see this
>> particular scenario getting fixed on the same machine see
>> /archive/results/CI_IGT_test/Patchwork_2778/
>
> I tried this now on the fi-skl-6700hq machine in the bug report and it
> seems to be the same failure mode as the APL one. There is one
> difference in that on the SKL machine once the firmware is in the stuck
> state after suspend/resume, a warm reboot may not recover it, so during
> the next boot the LSPCON probe will fail. I guess that during warm
> reboot LSPCON may continue to be powered, so to fully reset it a power
> cycle is needed. In any case this patch fixes resume on that machine
> too, so I couldn't reproduce the problem with it even across multiple
> reboots, whereas without it it's 100% reproducible.

A quick thought, when we do module unload or suspend, perhaps we should
put the LSPCON to the state in which we expect to find it on
probe/resume? In case it doesn't get reset.

BR,
Jani.


>
> --Imre

-- 
Jani Nikula, Intel Open Source Technology Center


More information about the Intel-gfx mailing list