[Intel-gfx] [PATCH 1/3] drm/i915: Use consistent forcewake auto-release timeout across kernel configs

Chris Wilson chris at chris-wilson.co.uk
Mon Apr 4 20:22:11 UTC 2016


On Mon, Apr 04, 2016 at 08:41:20PM +0100, Dave Gordon wrote:
> On 04/04/16 19:58, Chris Wilson wrote:
> >On Mon, Apr 04, 2016 at 05:51:09PM +0100, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >>Current implementation releases the forcewake at any time between
> >>straight away, and one jiffie from the last put, or first automatic
> >>grab.
> >
> >That isn't the problem though. The problem is that we set the timer on
> >first use rather than last use. All you are stating here is that by
> >lengthening the timeout on your system you reduce the number of times it
> >times out.
> 
> I thought that was already done, per my comments on one of your
> recent patches (rename __force_wake_get to __force_wake_auto) and
> your reply (gains are probably worth extra complexity). But of
> course Tvrtko wasn't here the last few days so may not have seen
> that. I suppose it would make sense to fold both changes together.

We were talking about the cost of having 2 passes vs 1 to do the same
task. Before Tvrtko left, I realised again that we didn't defer the
timer itself for every access, rearming a timer on every register access
seems costly (more so switching to hrtimer). Rearming the timer inside
the timer callback seems a reasonable compromise.
 
> But changing the timeout to 1ms makes it generally *shorter* than
> before, not longer. The gain can't just be from having a longer
> timeout, 'cos it isn't. (Average timeout now 1.5ms vs. previous
> 8ms?)

It is. jiffies+1 does not equal an 8ms delay (it would be 1ms on his
machine anways, 10ms on mine). The issue is that +1 jiffie is up to 1
jiffie and often close to 0, and that very well explains the wakeups
Tvrtko measured.

> I think it should come from the fact that accesses are likely to be
> bursty, having a bimodal distribution in the time domain. When we've
> made one access, we're likely to make another much sooner than 1ms
> later, but once we stop making accesses there will probably be far
> more than 1ms before the next set of accesses.

Yes. We expect temporal coherence in the access patten. That's assuredly
so since we very rarely access one register by itself.

> One more idea - do we want two different expiry periods depending on
> whether the caller used the _auto version or not?

Hard to tell. Broadly speaking we have two users of the non-auto
interface: userspace disabling forcewake indefinitely, and the other is
code that wants to hold the spinlock + forcewake over a bunch of
register writes.
 
> If they did, they're probably anticipating "only a few" accesses in
> the current operation, but perhaps with no idea about when the next
> set of accesses will occur.
> 
> If not, then the final decrement means they think they've finished
> for a while, and perhaps don't expect to come back for quite a long
> time. In that situation the timeout only helps avoid rapid off/on
> cycles by unrelated functions, not between logically-related
> operations.

Execlists. (Wants to hold forcewake over a bunch of reads and writes,
and may be very frequent context switches or very slow)
 
> So would two different timeouts make sense? Or even ...
> 
> _auto: longish timeout from the *first* _get() - don't rearm thereafter
> other: shortish timeout from the _put() - non-auto _get() will cancel
> 
> or is that just too complicated?

The challenge is that I think we should be pushing for shorter timeouts,
because of the bimodality in many accesses followed by a long period of
sleep. One of the original reasons for the choice in 1 jiffie with a
pinned timer was that it should mean that on the next call to schedule()
we should be deasserting forcewake (i.e. we only hold forcewake for this
sequence of register accesses). However, execlists scuppers that by
having an interrupt context that requires forcewake, and frequently so.

So I think we do need a combination of something like task_work_run() to
tie a forcewake reference to the active task with something like an
extra timer to allow frequent wakers to avoid having to retake
forcewake.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list