[Intel-gfx] [PATCH] drm/i915: Update write_domains on active list after flush.

Mon Feb 1 14:06:49 CET 2010

On Mon, 1 Feb 2010 13:26:14 +0100, Daniel Vetter <daniel at ffwll.ch> wrote:
> btw, I've thought a bit about why you want to do such crazy tricks with
> flushes in the middle of batchbuffers.

If we were to be strict, then the 2D drivers would effectively have to
emit one op per batch, as we often draw to a mask and then immediately use
that in a composite operation. As such, we want to use pipelined flushes
(rather than MI_FLUSH) so that we can actually utilize the GPU
effectively. In the extreme, a sequence of 2D operations is still
typically very short.

> 1) Per execbuf-ioctl overhead on the cpu. I haven't seen anything obvious
> there. But even when profiling points this way, IMHO it's better to fix it
> than to paper over it by issuing fewer ioctls.

Hmm, in the worst case it is pure syscall overhead and state maintenance
while the GPU is effectively idle. The best way to reduce this overhead is
actually to improve our drivers to generate fewer batches and even fewer
relocations...

> 2) Per execbuf gpu<->cpu synchronization (e.g. too many interrupts). Code
> as-is is quite suboptimal, for when we need some flushes, we already emit
> more than one interrupt per execbuf. There's also an XXX already there
> no one yet bothered to fix. I have some ideas there to mitigate this (as
> prep work for something else).

It is easy enough to roughly halve the number of interrupt commands we
make, but I haven't actually measured a performance difference to justify
the patch. The critical thing is not to wait on the GPU in the first
place. :)

> 3) Per execbuf overhead for the gpu due to the need to re-emit all the
> necessary state (shaders, tex-units, ...). My idea to fix this:

[snip cookies]

Similar ideas here, but I was thinking of using hardware contexts and
specifying in execbuffer2 the active context. Then the kernel can do
save/restore from the ringbuffer as necessary.
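
Roughly the bookkeeping I have in mind, as a sketch only. The names
(ring_state, submit_batch, ctx_id) are made up for illustration and are
not existing i915 structures or execbuffer2 fields, and the actual
save/restore commands are left as a comment:

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_CONTEXT UINT32_MAX

/* Hypothetical per-ring bookkeeping; not the real i915 structures. */
struct ring_state {
	uint32_t active_ctx;	/* context whose state is loaded on the GPU */
	unsigned int switches;	/* save/restore pairs emitted so far */
};

void ring_init(struct ring_state *ring)
{
	ring->active_ctx = NO_CONTEXT;
	ring->switches = 0;
}

/*
 * Called once per execbuffer with the context id userspace passed in.
 * Returns true if a context switch had to be emitted before the batch.
 */
bool submit_batch(struct ring_state *ring, uint32_t ctx_id)
{
	if (ring->active_ctx == ctx_id)
		return false;	/* state still loaded, nothing to re-emit */

	/* A real driver would write the save/restore into the ring here. */
	ring->active_ctx = ctx_id;
	ring->switches++;
	return true;
}
```

With that, back-to-back batches from the same context cost nothing extra,
and userspace never has to re-emit its shaders and texture state itself.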

As for the problem of a GPU hog, we need an in-kernel GPU scheduler to
prevent such a DoS. To see why, be evil and construct a circular chain of
batch buffer(s).
-ickle

-- 
Chris Wilson, Intel Open Source Technology Centre