[RFC] Reduce idle vblank wakeups

Matthew Garrett mjg59 at srcf.ucam.org
Thu Nov 17 12:46:24 PST 2011


On Thu, Nov 17, 2011 at 09:36:23PM +0100, Mario Kleiner wrote:
> On Nov 17, 2011, at 3:19 AM, Matthew Garrett wrote:
> >Assuming we're sleeping rather than busy-looping, that's certainly ok.
> >My previous experiments with radeon indicated that the scanout irq was
> >certainly not entirely reliable - on the other hand, I was trying to
> >use it for completing memory reclocking within the vblank interval. It
> >was typically still within a few scanlines, so a sanity check there
> >wouldn't pose too much of a problem.
> >
> 
> Sleeping in the timer triggered off path would be ok, but in the on-
> path we need to be relatively fast, so i think some kind of cpu
> friendly busy waiting (cpu_relax() or something like that, if i
> understand its purpose?) probably would be necessary.

The aim is to reduce the wakeup count and spend more time in deep C 
states. Busy waiting defeats that.

> >Right. My testing of sandybridge suggests that there wasn't a problem
> >here - even with the ping-ponging I was reliably getting 60 interrupts
> >in 60 seconds, with the counter incrementing by 1 each time. I
> >certainly wouldn't push to enable it elsewhere without making sure
> >that the results are reliable.
> >
> 
> How hard did you test it? Given that the off-by-one is a race-
> condition, there is a good chance you miss the bad cases due to
> lucky timing. I think one way to test the ping-pong case might be a
> tight loop with calls to glXGetSyncValuesOML(), or at a lower level
> a tight loop to the drmWaitVblank() ioctl, which is what
> glXGetSyncValuesOML() does.

Enable vblank, wait for vblank, immediately disable vblank, read 
counter, wait 5msec, repeat. Verify that I get 60 vblanks per second and 
that the counter incremented by 60. Tested with various delayoff values, 
which should have had the effect of making it likely that the disable 
would fall at various points throughout the scanout. It's not 
definitive, so I'm happy to do other tests.
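
For illustration, the loop described above can be modelled in a few lines
of Python. This is only a toy simulation, not the test that was actually
run: the refresh period, the function names and the "skew" between the
vblank irq and the hardware frame counter are all assumptions made up for
the sketch. It shows why varying the point at which the disable lands
matters when looking for an off-by-one.

```python
PERIOD_US = 16_667  # one refresh at roughly 60 Hz (assumed value)


def irqs_fired(t_us):
    """Vblank interrupts that have fired by time t_us (one per refresh)."""
    return t_us // PERIOD_US


def hw_counter(t_us, skew_us):
    """Hypothetical hardware frame counter that ticks skew_us after the irq."""
    return max(0, (t_us - skew_us) // PERIOD_US)


def run_test(iterations, wait_us=5_000, disable_latency_us=0, skew_us=0):
    """Enable, wait for vblank, disable, wait, repeat.

    Returns (software_count, true_vblank_count) as of the last disable.
    """
    t = 0
    sw = 0                                  # software vblank count
    truth = 0                               # vblanks that really happened
    hw_at_disable = hw_counter(t, skew_us)  # hardware count at last irq off
    for _ in range(iterations):
        # Enable: add whatever the hardware counted while the irq was off.
        sw += hw_counter(t, skew_us) - hw_at_disable
        t_enable = t
        # Wait for the next vblank; the disable lands a little later.
        t = (irqs_fired(t) + 1) * PERIOD_US + disable_latency_us
        # While enabled, every vblank irq bumps the software count by one.
        sw += irqs_fired(t) - irqs_fired(t_enable)
        # Disable: remember the hardware count for the next enable.
        hw_at_disable = hw_counter(t, skew_us)
        truth = irqs_fired(t)
        t += wait_us
    return sw, truth
```

With these assumed numbers, run_test(60) returns (60, 60). If the
hypothetical counter ticks 100 usec after the irq and the disable lands
inside that window, run_test(60, skew_us=100) returns (119, 60): every
re-enable counts the same vblank twice. Moving the disable past the window
(disable_latency_us=200) hides the problem again, which is the "lucky
timing" concern.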

> >Ok, so as long as we believe that we're reliably reading the hardware
> >counter and not getting off by one, we should be ok? I just want to be
> >clear that this is dependent on hardware behaviour rather than being
> >absolutely inherent :)
> >
> 
> Yes. The way we find the final/new software/hardware vblank count
> at irq off/on time must be consistent with what would have happened
> if the counters had been incremented by "just another vblank irq".

Ok.

> But apart from this, i would assume that even if we can remove all
> races, it probably would still make sense to always have some small
> vblank off delay, even if it is only half a video refresh cycle?
> During execution of a bufferswap or a three-step procedure like my
> toolkit does, or a desktop composition pass (at least the way compiz
> does it, didn't check other compositors), we will usually have
> multiple drm_vblank_get() -> drm_vblank_put() invocations in quick
> succession, triggered by the ddx inside the x-server and the drm
> code during swap preparation and swap completion. Without a delay of
> at least a few milliseconds the code would turn on and off the irq's
> multiple times within a few milliseconds, each time involving
> locking, transactions with the gpu, some memory barriers, etc. As
> long as the desktop is busy with some OpenGL animations, vblank
> irq's will necessarily fire for each refresh cycle regardless of what
> the timeout is. And for a small timeout of a few milliseconds, when
> the desktop finally goes idle, you'd probably end up with at most
> one extra vblank irq, but save a little bit of cpu time for multiple
> on/off transitions for each frame during the busy period.

As long as the delay is shorter than a frame then yes, it ought to be 
fine.
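
A rough way to see the tradeoff is to count irq on/off transitions for a
burst of get/put pairs under different off delays. The sketch below is a
simplified model, not the drm code: the interval timings, the refresh
period and both function names are invented for the example.

```python
PERIOD_US = 16_667  # one refresh at roughly 60 Hz (assumed value)


def count_enables(intervals, off_delay_us):
    """Count how often the irq has to be switched on.

    intervals is a time-ordered list of (get_us, put_us) pairs. After each
    put the irq stays on for off_delay_us, and a get that arrives before
    the delay expires simply keeps it on.
    """
    enables = 0
    off_at = None  # time the irq turns off; None means it was never on
    for get_us, put_us in intervals:
        if off_at is None or get_us > off_at:
            enables += 1
        off_at = put_us + off_delay_us
    return enables


def extra_irqs_when_idle(last_put_us, off_delay_us):
    """Vblank irqs that still fire between the last put and the irq off."""
    off_at = last_put_us + off_delay_us
    return off_at // PERIOD_US - last_put_us // PERIOD_US


# Ten busy frames, each with three get/put pairs in quick succession.
per_frame = [(1_000, 1_200), (2_000, 2_300), (4_000, 4_500)]
busy = [(f * PERIOD_US + g, f * PERIOD_US + p)
        for f in range(10) for g, p in per_frame]
```

In this model count_enables(busy, 0) is 30 (three transitions per frame),
while a 5 msec delay brings it down to 10, one per frame. Because the delay
is shorter than a frame, extra_irqs_when_idle() can never exceed one.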

> Out of pure interest, how much power does vblank off actually save,
> compared to waking up the cpu every 16 msecs?

It depends on the state of the rest of the system. If you're under load, 
basically nothing. If you're otherwise idle, around a Watt or so.

-- 
Matthew Garrett | mjg59 at srcf.ucam.org

