[Intel-gfx] [PATCH v2] drm/i915: Optimise VMA lookup slightly

Thu Dec 15 17:08:43 UTC 2016

On Thu, Dec 15, 2016 at 04:49:49PM +0000, Tvrtko Ursulin wrote:
> 
> On 13/12/2016 14:47, Chris Wilson wrote:
> >On Tue, Dec 13, 2016 at 02:37:27PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >>Cast VM pointers before subtraction to save the compiler
> >>doing a smart one which includes multiplication.
> >>
> >>v2: Only keep the first optimisation and prettify it. (Chris Wilson)
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >
> >Step 1, ok.
> >Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
> >
> >(I wasn't against the others, just curious as to what gcc was doing for
> >#2 and #3; I'd like just to pursue a different path altogether :)
> 
> Thanks.
> 
> Yes I know. Longer VMA lists are not something I've tested yet. I've
> just noticed that even where lookups are predominantly on short
> lists it can still be up to 1% of CPU time spent in the lookup. It
> averages around 0.7% AFAIR.
> 
> More precisely in that test (which is simply running a vsync-limited
> neverball intro screen :)), 65% of all lookups are on single-VMA
> objects! 29% on objects with two VMAs and 29% on objects with
> three VMAs. That's it, no longer lists at all.

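For reference, a minimal sketch of the idea in the quoted commit message,
assuming a stand-in struct vm whose size is not a power of two (the struct
and both helper names are hypothetical, not the i915 code). Subtracting
typed pointers yields an element count, so the compiler has to scale the
byte difference by the object size, typically with a multiply by a
reciprocal. Casting to a byte pointer first leaves a plain subtract. For an
ordering comparison only the sign of the result matters, and in this sketch
both forms agree on it.

```c
#include <stddef.h>

/* Hypothetical stand-in for an address space object. The size is
 * deliberately not a power of two, so dividing by it is not a shift. */
struct vm {
	char pad[24];
};

/* Typed subtraction: result is in units of sizeof(struct vm), so the
 * byte difference has to be scaled by the object size. */
static ptrdiff_t cmp_typed(const struct vm *a, const struct vm *b)
{
	return a - b;
}

/* Byte subtraction: cast first, so the result is the raw byte
 * difference and no scaling is needed. */
static ptrdiff_t cmp_bytes(const struct vm *a, const struct vm *b)
{
	return (const char *)a - (const char *)b;
}
```

With struct vm arr[4], cmp_typed(&arr[3], &arr[1]) is 2 while
cmp_bytes(&arr[3], &arr[1]) is 48; both are positive, so a lookup that only
orders by the sign behaves the same either way.
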
Also note that I've been trying to teach mesa not to be so dumb as well.
We still do get benefits from improving the execbuf vma/reloc paths, but
that pales in comparison to the improvements we can make in mesa.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre