[Intel-gfx] [PATCH] drm/i915: don't return -ENXIO from gmbus xfer

Daniel Vetter daniel at ffwll.ch
Sun May 20 20:16:46 CEST 2012


On Sun, May 20, 2012 at 11:07:46AM -0700, Daniel Kurtz wrote:
> On Sun, May 20, 2012 at 8:19 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> >
> > On Sat, May 19, 2012 at 10:10:12PM +0200, Daniel Vetter wrote:
> > > ... too much risk for flaky edid transfers.
> > >
> > > This regression has been introduced in
> > >
> > > commit e646d5773572bf52017983d758bdf05777dc5600
> > > Author: Daniel Kurtz <djkurtz at chromium.org>
> > > Date:   Fri Mar 30 19:46:38 2012 +0800
> > >
> > >     drm/i915/intel_i2c: always wait for IDLE before clearing NAK
> > >
> > > This patch keeps the improved NAK handling on the hw side, but reverts
> > > the change to return -ENXIO in case the gmbus controller reports a
> > > NAK.
> > >
> > > Cc: Daniel Kurtz <djkurtz at chromium.org>
> >
> > Hi Daniel,
> >
> > Can you please take a look at this one and smash your r-b onto it if you
> > agree?
> >
> > Thanks, Daniel
> >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49518
> > > Reported-and-Tested-by: Julian Simioni <julian.simioni at gmail.com>
> > > Signed-Off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > ---
> > >  drivers/gpu/drm/i915/intel_i2c.c |    7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_i2c.c b/drivers/gpu/drm/i915/intel_i2c.c
> > > index e04255e..0588d8e 100644
> > > --- a/drivers/gpu/drm/i915/intel_i2c.c
> > > +++ b/drivers/gpu/drm/i915/intel_i2c.c
> > > @@ -418,10 +418,11 @@ clear_err:
> > >        * If no ACK is received during the address phase of a transaction,
> > >        * the adapter must report -ENXIO.
> > >        * It is not clear what to return if no ACK is received at other times.
> > > -      * So, we always return -ENXIO in all NAK cases, to ensure we send
> > > -      * it at least during the one case that is specified.
> > > +      *
> > > +      * Unfortunately we can't afford false positives in returning -ENXIO,
> > > +      * hence never return -ENXIO.
> > >        */
> > > -     ret = -ENXIO;
> > > +     ret = i;
> 
> 
> The bugzilla report shows that in the old case, the gmbus failure was:
> [    2.528812] vgaarb: device changed decodes:
> PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [    2.540753] [drm] GMBUS [i915 gmbus panel] timed out waiting for idle
> [    2.580813] [drm:intel_panel_get_backlight], get backlight PWM = 0
> 
> And now the failure is:
> [    2.523015] vgaarb: device changed decodes:
> PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [    2.534770] [drm] GMBUS [i915 gmbus panel] timed out after NAK
> [    2.534797] [drm:gmbus_xfer], GMBUS [i915 gmbus panel] NAK for
> addr: 0050 w(1)
> [    2.534883] [drm:intel_panel_get_backlight], get backlight PWM = 0
> 
> From these logs, it looks like in both cases, there is an i2c
> communication error between host and panel.  The difference is that:
>   * in the old case, the return value was 0, which triggered a silent
> retry (the 40ms gap between GMBUS and PWM messages)
>   * but, now that the return value is -ENXIO, the caller does not retry.
> 
> Actually, in the 'new' case, there are two errors happening: a NAK and
> then a timeout after the NAK, while waiting for the controller to
> clear the ACTIVE bit.  I'm not actually sure what causes "timeout
> after NAK", but I think it means the controller is waiting for entire
> transaction to complete (perhaps the STOP bit after NAK?).
> 
> Since this reported issue is happening in this double error path, I'd
> rather a patch that fixes it without disabling the more generic NAK
> path.  Maybe something like this:

Hm, good suggestion. This way, if we get a NAK but no issues later on,
we'll still fail faster if gmbus detected that no device is responding.
Which should speed up boot a bit. I'll run this past the bug reporter.

Thanks, Daniel

> 
> clear_err:
> 	/*
> 	 * Wait for bus to IDLE before clearing NAK.
> 	 * If we clear the NAK while bus is still active, then it will stay
> 	 * active and the next transaction may fail.
> 	 */
> +      ret = -ENXIO;
> 	if (wait_for((I915_READ(GMBUS2 + reg_offset) & GMBUS_ACTIVE) == 0,
> -		     10))
> +		     10)) {
> 		DRM_DEBUG_KMS("GMBUS [%s] timed out after NAK\n",
> 			      adapter->name);
> +		ret = -ETIMEDOUT;  // Or 0 ?
> +	}
> 
> 	/* Toggle the Software Clear Interrupt bit. This has the effect
> 	 * of resetting the GMBUS controller and so clearing the
> 	 * BUS_ERROR raised by the slave's NAK.
> 	 */
> 	I915_WRITE(GMBUS1 + reg_offset, GMBUS_SW_CLR_INT);
> 	I915_WRITE(GMBUS1 + reg_offset, 0);
> 	I915_WRITE(GMBUS0 + reg_offset, 0);
> 
> 	DRM_DEBUG_KMS("GMBUS [%s] NAK for addr: %04x %c(%d)\n",
> 			 adapter->name, msgs[i].addr,
> 			 (msgs[i].flags & I2C_M_RD) ? 'r' : 'w', msgs[i].len);
> 
> 	/*
> 	 * If no ACK is received during the address phase of a transaction,
> 	 * the adapter must report -ENXIO.
> 	 * It is not clear what to return if no ACK is received at other times.
> -	 * So, we always return -ENXIO in all NAK cases, to ensure we send
> -	 * it at least during the one case that is specified.
> +	 * So, return -ENXIO for NAK after any byte, unless there was a timeout
> +	 * while waiting for IDLE after NAK.
> 	 */
> -	ret = -ENXIO;
> 	goto out;
> 
> -Daniel
> 
> >
> > >       goto out;
> > >
> > >  timeout:
> > > --
> > > 1.7.10
> > >
> >
> > --
> > Daniel Vetter
> > Mail: daniel at ffwll.ch
> > Mobile: +41 (0)79 365 57 48

-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48



More information about the Intel-gfx mailing list