[Intel-gfx] [PATCH 1/5] drm/i915: timeout parameter for seqno wait

Thu May 3 06:32:17 CEST 2012

On Wed, 2 May 2012 23:47:35 +0200
Daniel Vetter <daniel at ffwll.ch> wrote:

> On Mon, Apr 30, 2012 at 06:39:58PM -0700, Ben Widawsky wrote:
> > Insert a wait parameter in the code so we can possibly timeout on a
> > seqno wait if need be. The code should be functionally the same as
> > before because all the callers will continue to retry if an arbitrary
> > timeout elapses.
> > 
> > We'd like to have nanosecond granularity, but the only way to do this is
> > with hrtimer, and that doesn't fit well with the needs of this code.
> > 
> > v2: Fix rebase error (Chris)
> > Return proper time even in wedged + signal case (Chris + Ben)
> > Use timespec constructs (Ben)
> > Didn't take Daniel's advice regarding the Frankenstein-ness of the
> >   function. I did try his advice, but in the end I liked the way the
> >   original code looked, better.
> > 
> > v3: Make wakeups far less frequent for infinite waits (Chris)
> > 
> > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> 
> Ok, I've looked at this again and noticed stuff like dummy_time. Add to
> that that I don't like the loop which neatly papers over any missed irqs
> without yelling in the logs (which is a qa disaster given our history on
> snb/ivb) and all the other differences between timout-present and infinite
> sleep, I vote for __wait_seqno_timeout (which would always use the
> interruptible wait).
> -Daniel
> 

I'm confused about how the loop neatly papers over IRQs. In the wait
forever case, it sleeps for a second (and we can always make it longer).
Won't the hangcheck fire long before that to report the fact that we've
missed an IRQ? Furthermore, from an end user standpoint, masking such an
event may actually be a nice thing; just saying. It seems Chris and you
both seem to agree on this masking the problem, could one of you provide
some edification and provide the missing piece for me?

Regarding the dummy_time variable - again, forgive my ignorance, but I
don't really understand why this is so awful. I concur it's not the
prettiest code, but I honestly don't understand what's so terrible about
it.

To address the separate function request. Truly I don't feel very
strongly one way or another. I've coded both before submitting the
patches and I felt the complexity of the combined function wasn't
prohibitively confusing, it's fewer LOC, and less duplicated code. I'm
also of the opinion that waiting forever is something we never actually
want to do, and so eventually the first if (timeout == NULL) can turn
into a BUG_ON eventually.

To sum up, I guess if you can give me good explanations on the first two
points, and still want the separate functions, I'll willingly acquiesce.

-- 
Ben Widawsky, Intel Open Source Technology Center