[Intel-gfx] [PATCH v2] drm/i915: Optimise VMA lookup slightly

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Dec 15 16:49:49 UTC 2016


On 13/12/2016 14:47, Chris Wilson wrote:
> On Tue, Dec 13, 2016 at 02:37:27PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Cast VM pointers before subtraction to save the compiler
>> doing a smart one which includes multiplication.
>>
>> v2: Only keep the first optimisation and prettify it. (Chris Wilson)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>
> Step 1, ok.
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> (I wasn't against the others, just curious as to what gcc was doing for
> #2 and #3 I'd like just to pursue a different path altogether :)

Thanks.

Yes, I know. Longer VMA lists are not something I've tested yet. I've just
noticed that even where lookups are predominantly on short lists, up to 1%
of CPU time can still be spent in the lookup. It averages around
0.7% AFAIR.

More precisely, in that test (which is simply running a vsync-limited
neverball intro screen :)), 65% of all lookups are on single-VMA objects,
29% on objects with two VMAs and 29% on objects with three VMAs.
That's it, no longer lists at all.

I was not sure how much benefit a smarter lookup would bring for this case,
so I simply wanted to tighten up the existing search as much as possible.
Even for that I am not sure it makes a difference, but at least if
we can remove pointless instructions, why not.
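
To illustrate the cast-before-subtract idea from the commit message, here is
a minimal sketch. It is not the actual patch: struct vm, its size, and the
helper names cmp_typed and cmp_bytes are hypothetical, made up for this
example. The point is only that subtracting typed pointers yields an element
count, so the compiler has to divide the byte distance by the struct size
(typically done with a shift plus a multiply), while subtracting the casted
addresses is a single subtract and still gives a usable ordering.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an address-space struct; the size is
 * arbitrary and deliberately not a power of two. */
struct vm {
	char payload[136];
};

/* Typed subtraction: the result is counted in elements, so the byte
 * distance must be divided by sizeof(struct vm). */
static inline ptrdiff_t cmp_typed(const struct vm *a, const struct vm *b)
{
	return a - b;
}

/* Cast first: the result is the plain byte distance, no scaling needed.
 * Only the sign (or zero) matters when it is used as a comparison key. */
static inline ptrdiff_t cmp_bytes(const struct vm *a, const struct vm *b)
{
	return (ptrdiff_t)((intptr_t)a - (intptr_t)b);
}
```

Both helpers agree on sign and on equality, which is all a lookup that
compares VM pointers needs; whether the saved instructions are measurable
is a separate question.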

Regards,

Tvrtko


