[Intel-gfx] [PATCH] drm/i915: fix DP AUX errors due to false timeouts when using wait_event_timeout

Imre Deak imre.deak at intel.com
Thu May 2 11:37:48 CEST 2013


On Thu, 2013-05-02 at 11:33 +0200, Daniel Vetter wrote:
> On Thu, May 2, 2013 at 11:31 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> > On Thu, May 2, 2013 at 11:17 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >> On Thu, May 02, 2013 at 12:00:03PM +0300, Imre Deak wrote:
> >>> Due to possible scheduling latencies wait_event_timeout doesn't
> >>> guarantee a non-zero return value, even if the condition becomes true
> >>> before the specified timeout expires. Thus we can incorrectly signal a
> >>> timeout and abort a DP AUX transaction.
> >>>
> >>> If wait_event_timeout returns 0, it's guaranteed that at least the
> >>> specified timeout (minus one jiffies, see below) had passed, so we can
> >>> fix this by checking the condition explicitly in this case.
> >>>
> >>> Also the timeout that wait_event_timeout() is guaranteed to wait if the
> >>> condition doesn't become true is one less jiffies than what is passed to
> >>> it as a parameter. This is because the absolute expiration time in
> >>> schedule_timeout() may be calculated at a moment close to the next
> >>> scheduling tick, when jiffies is incremented. So make sure we pass always
> >>> a jiffies value of 2 or greater. Here this makes a difference only for
> >>> HZ=100.
> >>>
> >>> This fixes DP AUX errors I saw during booting on an ILK.
> >>>
> >>> This should ideally be fixed in wait_event_timeout(), but that can take
> >>> a while. Until that's done use this fix as a band-aid.
> >>
> >> As we have 3 such vulnerable callsite in our driver alone, perhaps we
> >> should push for your general fix.
> >
> > Yeah, I think this should be fixed in general, doesn't really make
> > much sense given that the jiffies value is relative this way. Also,
> > the actual timeout we need to wait is just 400 us, so I think we're
> > pretty safe here. Can you please resumbit just the recheck part of
> > your patch?

Well, without the +1 to msecs_to_jiffies(10) part, I still can trigger
the problem on HZ=100. Since w/o that we pass 1 jiffies and wait less
than 400us..

> >
> > Also, can you do the same fix for gmbus_wait_idle in intel_i2c.c please?

Yes, can do this.

--Imre

> 
> Oops, just now I've seen your generic patch. Yeah, I like that much better ;-)
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch





More information about the Intel-gfx mailing list