[Intel-gfx] [RFC] execbuf2 support to avoid fence register allocation
chris at chris-wilson.co.uk
Wed Jun 17 23:33:37 CEST 2009
On Mon, 2009-06-15 at 11:04 -0700, Jesse Barnes wrote:
> I'm still in the process of testing this patchset, but at this point it
> no longer crashes in the various config possibilities (new execbuf2 path
> on both 965+ and pre-965, old path on both), and seems to
> allocate/avoid allocating fence regs in various cases as well.
After working through various issues with my i915, execbuffer2 does seem
to avoid the allocation of fences for normal drawing operations. Yay!
However, benchmarking cairo-drm using execbuffer vs execbuffer2 with the
collection of cairo-traces shows very little variation between the two
approaches, except for one trace, poppler-alt:
  drm-rgba-fenced      764675.49: 1.31x ▎
  drm-rgba-no-fences   994886.76: 1.00x
  drm-rgba-no-fences2  998345.74: 1.00x
[poppler-alt is a trace composed of rendering every page of
http://intellinuxgraphics.org/*.pdf using a new context per page. It is
dominated by unaligned clips, which require the creation of a temporary
clip mask for every operation.]
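
For anyone wanting to check the ratios by hand, here is a minimal sketch
of the arithmetic. It assumes the middle column is an elapsed time (lower
is better) and that each ratio is taken against the slowest run; neither
is stated in the output above, so treat both as guesses. The dictionary
and variable names are purely illustrative.

```python
# Figures copied from the results above. Reading them as elapsed times
# (lower is better) is an assumption, not something the output states.
results = {
    "drm-rgba-fenced": 764675.49,
    "drm-rgba-no-fences": 994886.76,
    "drm-rgba-no-fences2": 998345.74,
}

# Assumed baseline: the slowest run, so every ratio is >= 1.00.
slowest = max(results.values())
ratios = {name: round(slowest / value, 2) for name, value in results.items()}

for name, ratio in ratios.items():
    print(f"{name:<20} {ratio:.2f}x")
```

Under those assumptions the computed ratios match the 1.31x / 1.00x /
1.00x column, i.e. the fenced path is roughly 30% faster on this trace.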
As you can tell, I was so astonished by this result I had to rerun it!
I haven't yet ascertained what the reason for the discrepancy is.
My theory is that without fences the limit upon execbuffer size is GTT
aperture size and so we execute fewer, but much bigger, batches which
stall whilst waiting for eviction. But I'm open to ideas on what it may
be and the best way to tackle (measure and resolve) it.