[Intel-gfx] [PATCH v2] drm/i915: Optimise VMA lookup slightly
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Dec 15 16:49:49 UTC 2016
On 13/12/2016 14:47, Chris Wilson wrote:
> On Tue, Dec 13, 2016 at 02:37:27PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Cast VM pointers before subtraction to save the compiler
>> doing a smart one which includes multiplication.
>>
>> v2: Only keep the first optimisation and prettify it. (Chris Wilson)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>
> Step 1, ok.
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> (I wasn't against the others, just curious as to what gcc was doing for
> #2 and #3 I'd like just to pursue a different path altogether :)
Thanks.
Yes, I know. Longer VMA lists are not something I've tested yet. I've
just noticed that even where lookups are predominantly on short lists,
the lookup can still take up to 1% of CPU time. It averages around
0.7% AFAIR.
More precisely, in that test (which is simply running a vsync-limited
neverball intro screen :)), 65% of all lookups are on single-VMA objects!
29% on objects with two VMAs and 29% on objects with three VMAs.
That's it, no longer lists at all.
I was not sure how much benefit a smarter lookup would bring for this
case, so I simply wanted to tighten up the existing search as much as
possible. Even for that I am not sure it makes a difference, but at
least if we can drop pointless instructions, why not.
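As a rough illustration of the cast described in the commit message
(not the actual patch): subtracting two struct pointers yields an
element count, so the compiler has to divide the byte distance by the
struct size, which it typically implements with a multiplication.
Casting to an integer type first leaves a plain subtraction. The struct
and helper names below are hypothetical stand-ins, not i915 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an address space; the size is arbitrary. */
struct vm {
	char payload[200];
};

/*
 * Typed pointer subtraction: the result is counted in elements, so the
 * byte distance is divided by sizeof(struct vm).
 */
static ptrdiff_t cmp_typed(const struct vm *a, const struct vm *b)
{
	return a - b;
}

/*
 * Cast first: a single integer subtraction. The result is in bytes, but
 * it is zero exactly when the pointers are equal, which is all a
 * comparison needs. Converting a wrapped unsigned difference back to a
 * signed type is implementation-defined in C; common compilers give the
 * expected two's complement result.
 */
static ptrdiff_t cmp_cast(const struct vm *a, const struct vm *b)
{
	return (ptrdiff_t)((uintptr_t)a - (uintptr_t)b);
}

/* Small self-check: both forms agree on equality and on ordering. */
static void demo(void)
{
	struct vm vms[4];

	assert(cmp_typed(&vms[3], &vms[1]) == 2);
	assert(cmp_cast(&vms[3], &vms[1]) == 2 * (ptrdiff_t)sizeof(struct vm));
	assert(cmp_typed(&vms[1], &vms[1]) == 0);
	assert(cmp_cast(&vms[1], &vms[1]) == 0);
}
```

Since only the zero/non-zero result and the sign matter for a
comparison, the byte-scaled value is as good as the element count, and
comparing the generated assembly of the two helpers is an easy way to
see what the compiler emits for each.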
Regards,
Tvrtko