[Intel-gfx] [RFC] execbuf2 support to avoid fence register allocation

Chris Wilson chris at chris-wilson.co.uk
Wed Jun 17 23:33:37 CEST 2009


On Mon, 2009-06-15 at 11:04 -0700, Jesse Barnes wrote:
> I'm still in the process of testing this patchset, but at this point it
> no longer crashes in the various config possibilities (new execbuf2 path
> on both 965+ and pre-965, old path on both), and seems to
> allocate/avoid allocating fence regs in various cases as well.

After working through various issues with my i915, execbuffer2 does seem
to avoid the allocation of fences for normal drawing operations. Yay!

However, benchmarking cairo-drm using execbuffer vs execbuffer2 with the
collection of cairo-traces shows very little variation between the two
approaches, except for one trace:

(poppler-alt-20090608)
     drm-rgba-fenced      764675.49:  1.31x ▎
     drm-rgba-no-fences   994886.76:  1.00x
     drm-rgba-no-fences2  998345.74:  1.00x

[poppler-alt is a trace composed of rendering every page of
http://intellinuxgraphics.org/*.pdf using a new context per page. It is
dominated by unaligned clips, which require the creation of a temporary
clip mask for every operation.]
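
For anyone wanting to sanity-check the ratios in the table above, here is
a minimal sketch. It assumes the figures are elapsed times (lower is
better) and that each ratio is taken against the slowest run; that
convention is my assumption and is not stated by the tool output quoted
here. The name ratios_vs_slowest is hypothetical.

```python
# Timings copied from the poppler-alt-20090608 table (units as printed).
timings = {
    "drm-rgba-fenced": 764675.49,
    "drm-rgba-no-fences": 994886.76,
    "drm-rgba-no-fences2": 998345.74,
}


def ratios_vs_slowest(results):
    """Return slowest_time / time per backend (assumed convention)."""
    slowest = max(results.values())
    return {name: slowest / t for name, t in results.items()}


ratios = ratios_vs_slowest(timings)
for name, ratio in ratios.items():
    print(f"{name:<20} {ratio:.2f}x")
```

Under that assumption the fenced run comes out at 1.31x and both
no-fences runs at 1.00x, which matches the table.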

As you can tell, I was so astonished by this result I had to rerun it!

I haven't yet ascertained the reason behind the discrepancy.
My theory is that without fences the limit upon execbuffer size is the
GTT aperture size, and so we execute fewer, but much bigger, batches
which stall whilst waiting for eviction. But I'm open to ideas on what it
may be and the best way to tackle (measure and resolve) it.

So suggestions?
-ickle



