[Intel-gfx] [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset

Fri Aug 9 22:06:36 CEST 2013

On Fri, Aug 09, 2013 at 09:17:11PM +0200, Daniel Vetter wrote:
> On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote:
> > A simple modeset, where we only wish to switch over to a new framebuffer
> > such as the transition from fbcon to X, takes around 30-60ms. This is
> > due to three factors:
> > 
> > 1. We need to make sure the fb->obj is in the display domain, which
> > incurs a cache flush to ensure no dirt is left on the scanout.
> > 
> > 2. We need to flush any pending rendering before performing the mmio
> > so that the frame is complete before it is shown.
> > 
> > 3. We currently wait for the vblank after the mmio to be sure that the
> > old fb is no longer being shown before releasing it.
> > 
> > (1) can only be eliminated by userspace preparing the fb->obj in advance
> > to already be in the display domain. This can be done through use of the
> > create2 ioctl, or by reusing an existing fb->obj.
> > 
> > However, (2) and (3) are already solved by the existing page flip
> > mechanism, and it is surprisingly trivial to wire them up for use in the
> > set-base fast path. Though it can be argued that this represents a
> > subtle ABI break in that the set_config ioctl now returns before the old
> > framebuffer is unpinned. The danger is that userspace will start to
> > modify it while it is still being shown; however, we should be able
> > to prevent that through proper domain tracking.
> 
> Hm, right now we don't prevent anyone from rendering into a to-be-flipped
> out buffer. There was once code for that, using MI_WAIT_EVENT, but we've
> ripped it out. I guess we could just throw in a synchronous stall on the
> flip queue though, that should work always.

I'm glad we did. I'd rather put that into userspace than have the
kernel impose that policy on everybody, as for X that is exactly the
behaviour we want (i.e. not blocking rendering on the next scanout).

> Testing would be easy if we have the crtc CRC stuff, but that's atm stuck
> due to lack of volunteers ...
> 
> Overall I really like the idea and I think doing most of the plane
> enabling (including psr, fbc, ips, and all that stuff which potentially
> blows through a vblank wait) should be done in async work queues. That
> should then also help resume time a lot.

I'd also like to hear Ville's opinion since with his atomic modesetting
I hope we will be able to achieve something very similar.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre