[Intel-gfx] RFC asynchronous vblank tasks
Daniel Vetter
daniel at ffwll.ch
Fri Jul 5 10:42:49 CEST 2013
On Thu, Jul 04, 2013 at 11:07:06PM +0100, Chris Wilson wrote:
> This is an old series to make sprite and primary plane flip that little
> bit more asynchronous. Shaving 16-20ms off X start up time is no small
> feat when we can start X in around 100-200ms.
Summarizing our discussion on irc from yesterday:
- The first patch seems to be missing ;-)
- The names should imo get more bikeshedding to be more consistent with
the core delayed_work interfaces, e.g. intel_crtc_schedule_vblank_work.
The important part is to be able to easily distinguish work items fired
off from the vblank handler which run in process context (in a workqueue
somewhere) and vblank callbacks that run in interrupt context since
they're more timing critical and don't need sleeping looks. Ville wants
that for his watermark stuff. And I think we could use it to avoid a
bunch of vblank waits in e.g. the fbc or DRRS code that we need to
implement workarounds.
- Imo if we have more delayed unpins we really need to integrate this into
the eviction code. Atm we get mostly away with it since most users don't
have per-crtc framebuffers and we have hence at most 2 buffers in-flight
at any time due to pageflipping. The real fun of making that happen is
to avoid deadlocks.
Especially the last point imo is a blocker for the delayed unpin patches
(the other stuff could go in), and I think it should also convert the
delayed unpin work we use in the pageflip handler while at it.
I think we need a few pieces to make this work:
- global delayed_unpin list_head of objects which can be unpinned after
something has happened (vblank, pageflip, ...)
- second list where objects are stored which can be unpinned
- irqsafe spinlock to protect it
- waitqueue that gets woken up every time someone pushes a new obj from
the delayed_unpin list to the unpin list
Since we're dealing with framebuffer objects everywhere I'd say we could
use those (and hence also shift around the refcounting to there). Plain
gem objects get their delayed unpin through the normal active list, so I
don't think this reduces the usefulness.
We could also embed the vblank callback into the framebuffer object.
Now the actual unpinning happens in two places:
1) From a work item which gets scheduled every time the interrupt handler
pushes something onto the unpin list. In pseudo-code:
mutex_lock(dev->struct_mutex);
while (1) {
spin_lock_irqsave(delayed_unpin_lock);
fb = first_list_item(unpin_list);
if (fb)
list_del(fb->unpin_list);
spin_unlock_irqrestore(delayed_unpin_lock);
if (!fb)
break;
/* the unpin list holds a reference which we inherit */
unpin_fb(fb);
unref_fb(fb);
}
mutex_unlock(dev->struct_mutex);
The same function could also be used to trim pin references in the
pageflip code instead of the currently-used flush_work.
2) From the evict code, at the very end shortly before we give up on
everything. We check whether the delayed_unpin list is empty, and if it's
not we wait on the waitqueue for new stuff to get moved from the delayed
unpin list to the unpin list. Every time we get woken up we grab the
spinlock and unpin the framebuffer. We have to do this ourselves since we
already hold the dev->struct_mutex lock. We can probably reuse the inner
loop from the work item above for that.
Note that if we do that only after we've search the entire active list we
could idle the gpu completely to make sure that the pageflips can indeed
happen. Still for paranoia the wait should also check for gpu hangs, so we
need to wake up the unpin_queue too when a gpu hang happens like we
already do with all the rinq waitqueues.
I don't think that we need anything special before we disable the crtc
since I expect the generic vblank callback infrastructure to already take
care of this by waiting for any pending vblank callbacks to complete
before it kills the crtc.
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list