[Intel-gfx] [PATCH] drm/i915: don't return -ENXIO from gmbus xfer
Daniel Vetter
daniel at ffwll.ch
Sun May 20 20:16:46 CEST 2012
On Sun, May 20, 2012 at 11:07:46AM -0700, Daniel Kurtz wrote:
> On Sun, May 20, 2012 at 8:19 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> >
> > On Sat, May 19, 2012 at 10:10:12PM +0200, Daniel Vetter wrote:
> > > ... too much risk for flaky edid transfers.
> > >
> > > This regression has been introduced in
> > >
> > > commit e646d5773572bf52017983d758bdf05777dc5600
> > > Author: Daniel Kurtz <djkurtz at chromium.org>
> > > Date: Fri Mar 30 19:46:38 2012 +0800
> > >
> > > drm/i915/intel_i2c: always wait for IDLE before clearing NAK
> > >
> > > This patch keeps the improved NAK handling on the hw side, but reverts
> > > the change to return -ENXIO in case the gmbus controller reports a
> > > NAK.
> > >
> > > Cc: Daniel Kurtz <djkurtz at chromium.org>
> >
> > Hi Daniel,
> >
> > Can you please take a look at this one and smash your r-b onto it if you
> > agree?
> >
> > Thanks, Daniel
> >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49518
> > > Reported-and-Tested-by: Julian Simioni <julian.simioni at gmail.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > ---
> > > drivers/gpu/drm/i915/intel_i2c.c | 7 ++++---
> > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_i2c.c b/drivers/gpu/drm/i915/intel_i2c.c
> > > index e04255e..0588d8e 100644
> > > --- a/drivers/gpu/drm/i915/intel_i2c.c
> > > +++ b/drivers/gpu/drm/i915/intel_i2c.c
> > > @@ -418,10 +418,11 @@ clear_err:
> > > * If no ACK is received during the address phase of a transaction,
> > > * the adapter must report -ENXIO.
> > > * It is not clear what to return if no ACK is received at other times.
> > > - * So, we always return -ENXIO in all NAK cases, to ensure we send
> > > - * it at least during the one case that is specified.
> > > + *
> > > + * Unfortunately we can't afford false positives in returning -ENXIO,
> > > + * hence never return -ENXIO.
> > > */
> > > - ret = -ENXIO;
> > > + ret = i;
>
>
> The bugzilla report shows that in the old case, the gmbus failure was:
> [ 2.528812] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [ 2.540753] [drm] GMBUS [i915 gmbus panel] timed out waiting for idle
> [ 2.580813] [drm:intel_panel_get_backlight], get backlight PWM = 0
>
> And now the failure is:
> [ 2.523015] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [ 2.534770] [drm] GMBUS [i915 gmbus panel] timed out after NAK
> [ 2.534797] [drm:gmbus_xfer], GMBUS [i915 gmbus panel] NAK for addr: 0050 w(1)
> [ 2.534883] [drm:intel_panel_get_backlight], get backlight PWM = 0
>
> From these logs, it looks like in both cases, there is an i2c
> communication error between host and panel. The difference is that:
> * in the old case, the return value was 0, which triggered a silent
> retry (the 40ms gap between GMBUS and PWM messages)
> * but, now that the return value is -ENXIO, the caller does not retry.
>
> Actually, in the 'new' case, there are two errors happening: a NAK and
> then a timeout after the NAK, while waiting for the controller to
> clear the ACTIVE bit. I'm not actually sure what causes "timed out
> after NAK", but I think it means the controller is waiting for the
> entire transaction to complete (perhaps the STOP bit after NAK?).
>
> Since this reported issue is happening in this double error path, I'd
> rather have a patch that fixes it without disabling the more generic
> NAK path. Maybe something like this:
Hm, good suggestion. This way, if we get a NAK but no issues later on,
we'll still fail faster if gmbus detected that no device is responding.
Which should speed up boot a bit. I'll run this past the bug reporter.
Thanks, Daniel
>
> clear_err:
> /*
> * Wait for bus to IDLE before clearing NAK.
> * If we clear the NAK while bus is still active, then it will stay
> * active and the next transaction may fail.
> */
> + ret = -ENXIO;
> if (wait_for((I915_READ(GMBUS2 + reg_offset) & GMBUS_ACTIVE) == 0,
> - 10))
> + 10)) {
> DRM_DEBUG_KMS("GMBUS [%s] timed out after NAK\n",
> adapter->name);
> + ret = -ETIMEDOUT; // Or 0 ?
> + }
>
> /* Toggle the Software Clear Interrupt bit. This has the effect
> * of resetting the GMBUS controller and so clearing the
> * BUS_ERROR raised by the slave's NAK.
> */
> I915_WRITE(GMBUS1 + reg_offset, GMBUS_SW_CLR_INT);
> I915_WRITE(GMBUS1 + reg_offset, 0);
> I915_WRITE(GMBUS0 + reg_offset, 0);
>
> DRM_DEBUG_KMS("GMBUS [%s] NAK for addr: %04x %c(%d)\n",
> adapter->name, msgs[i].addr,
> (msgs[i].flags & I2C_M_RD) ? 'r' : 'w', msgs[i].len);
>
> /*
> * If no ACK is received during the address phase of a transaction,
> * the adapter must report -ENXIO.
> * It is not clear what to return if no ACK is received at other times.
> - * So, we always return -ENXIO in all NAK cases, to ensure we send
> - * it at least during the one case that is specified.
> + * So, return -ENXIO for NAK after any byte, unless there was a timeout
> + * while waiting for IDLE after NAK.
> */
> - ret = -ENXIO;
> goto out;
>
> -Daniel
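
To spell out the return-code choice in the hunk above, here is a minimal
standalone sketch. It is not code from intel_i2c.c: nak_result() and
worth_retrying() are hypothetical helpers, idle_wait_timed_out stands in
for the result of the wait_for() on GMBUS_ACTIVE, and the retry rule
assumes a caller that gives up on -ENXIO and retries on anything else,
which is only how the logs above read.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: pick the return code for the NAK path the way the
 * proposed hunk does. A plain NAK reports -ENXIO; a NAK followed by a
 * timeout while waiting for the bus to go idle reports -ETIMEDOUT.
 */
static int nak_result(bool idle_wait_timed_out)
{
	int ret = -ENXIO;		/* plain NAK: no device answered */

	if (idle_wait_timed_out)
		ret = -ETIMEDOUT;	/* NAK, then the controller stayed active */

	return ret;
}

/*
 * Hypothetical caller-side rule (an assumption, see above): only -ENXIO
 * is treated as a definitive "nobody at this address".
 */
static bool worth_retrying(int ret)
{
	return ret != -ENXIO;
}

/* Quick self-check of the two paths. */
static void nak_result_selfcheck(void)
{
	assert(nak_result(false) == -ENXIO);
	assert(nak_result(true) == -ETIMEDOUT);
}
```

With this split, the double error path from the bug report no longer
reports -ENXIO, so it stays eligible for a retry, while a clean NAK still
fails fast.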
>
> >
> > > goto out;
> > >
> > > timeout:
> > > --
> > > 1.7.10
> > >
> >
> > --
> > Daniel Vetter
> > Mail: daniel at ffwll.ch
> > Mobile: +41 (0)79 365 57 48
--
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48