[Intel-gfx] [PATCH] [RFC] drm/i915: read-read semaphore optimization

Tue Dec 13 17:59:14 CET 2011

On Tue, 13 Dec 2011 17:01:33 +0100, Daniel Vetter <daniel at ffwll.ch> wrote:
> Afaik the only use-case for parallel reads is video decode with
> post-processing on the render ring. The decode ring needs read-only access
> to reference frames to decode the next frame and the render ring read-only
> access to past frames for post-processing (e.g. deinterlacing). But given
> the general state of perf optimizations in libva I think we have lower
> hanging fruit to chase if we actually miss a performance target for this
> use-case.

One in the near future will be: render to backbuffer (RCS),
pageflip to scanout (BCS), read from front (RCS).

And in its current form UXA will do the back-to-front blit on the BCS.
But that is async and so not a large race window, whereas the pageflip
may take ~16ms to process. I don't think it is entirely infeasible that
we see some form of this whilst running compositors or games. Or at
least would if we enabled semaphores for pageflips. Except in the
pageflip scenario we know we are protected by the fb ref, so consider
the hypothetical scenario where we have a working vsync'ed blit...
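
To make the render/pageflip/read sequence concrete, here is a toy model of
the sync decision. It is only a sketch, not the i915 code: struct bo_state,
sync_mask() and record_access() are hypothetical names, and treating the
pageflip as a plain read on the BCS is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring indices, for illustration only. */
enum { RCS = 0, BCS = 1, VCS = 2 };

/* Toy per-buffer tracking state (not the driver's real bookkeeping). */
struct bo_state {
	unsigned readers; /* bitmask of rings that read since the last write */
	int writer;       /* ring of the last write, or -1 if none */
};

static unsigned ring_bit(int ring)
{
	assert(ring >= 0 && ring < 32);
	return 1u << ring;
}

/*
 * Return the mask of rings that ring `to` must wait on before touching
 * the buffer. With read_read_opt set, a read only waits for the last
 * writer; without it, a read also waits for every outstanding reader.
 * A write always waits for the last writer and all readers.
 */
static unsigned sync_mask(const struct bo_state *bo, int to,
			  bool write, bool read_read_opt)
{
	unsigned mask = 0;

	if (bo->writer >= 0)
		mask |= ring_bit(bo->writer);
	if (write || !read_read_opt)
		mask |= bo->readers;

	/* A ring never needs a semaphore to wait on itself. */
	return mask & ~ring_bit(to);
}

/* Update the tracking state after the access has been queued. */
static void record_access(struct bo_state *bo, int to, bool write)
{
	if (write) {
		bo->writer = to;
		bo->readers = 0;
	} else {
		bo->readers |= ring_bit(to);
	}
}
```

In this model, after an RCS write and a BCS read, a following RCS read gets
an empty wait mask with the optimisation, and a wait on the BCS (i.e. on the
pageflip) without it.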

The real question, in any event, is whether we have enough instrumentation
to diagnose GPU stalls upon buffer migration. If so, we can replace the read
optimisation with a tracepoint and wait for a test case.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre