[Intel-gfx] [PATCH] drm/i915: Optimise VMA lookup slightly

Chris Wilson chris at chris-wilson.co.uk
Tue Dec 13 12:41:09 UTC 2016


On Tue, Dec 13, 2016 at 12:22:18PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> A few details to hopefully make a very hot function a tiny bit
> more efficient:
> 
>  1. Cast VM pointers before subtraction to save the compiler
>     doing a smart one which includes multiplication.

Indeed. Not pretty though.

static __always_inline __kernel_ptrdiff_t ptrdiff(const void *a, const void *b)
{
	return a - b;
}

cmp = ptrdiff(vma->vm, vm);
if (cmp)
	return cmp;
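
To illustrate why the cast matters, here is a minimal userspace sketch, not the driver code. The struct vm_example type and both helper names are hypothetical, and the casts go through const char * rather than void * so that it also builds without the GNU void-pointer arithmetic extension. Typed subtraction yields a count of elements, so the compiler has to divide the byte distance by the struct size (typically a shift plus a multiply by a constant when the size is not a power of two); the byte-wise form is a plain subtraction and carries the same sign.

```c
#include <stddef.h>

/* Hypothetical stand-in for the address space struct. The size (168 bytes)
 * is deliberately not a power of two, so dividing by it is not a plain shift.
 */
struct vm_example {
	char pad[24 * 7];
};

/* Typed subtraction: the result is counted in elements, so the byte
 * distance has to be divided by sizeof(struct vm_example).
 */
static inline ptrdiff_t diff_typed(const struct vm_example *a,
				   const struct vm_example *b)
{
	return a - b;
}

/* Byte-wise subtraction: casting first leaves only the subtraction.
 * The magnitude differs from diff_typed(), but the sign is the same.
 */
static inline ptrdiff_t diff_bytes(const void *a, const void *b)
{
	return (const char *)a - (const char *)b;
}
```

For an ordering comparison only the sign is used, so either helper sorts the same way; the byte-wise one simply skips the scaling step.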


>  2. Use smaller type for comparison since we only care about
>     the sign.

Should be a no-op: the compiler should also only care about the sign, so
it need not move registers about, just test the cc, and we should be
inlining... Is gcc not smart enough? :(
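
If a smaller result type is wanted anyway, one way to keep it safe is to reduce the difference to its sign explicitly instead of narrowing it, since a plain cast of a 64-bit difference to int can discard the high bits. A minimal sketch, assuming callers only test whether the result is negative, zero or positive; the helper name ptrdiff_sign is hypothetical and not part of the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: collapse a pointer difference to -1, 0 or 1.
 * Each comparison evaluates to 0 or 1, so the result always fits in an int
 * regardless of how wide ptrdiff_t is.
 */
static inline int ptrdiff_sign(ptrdiff_t d)
{
	return (d > 0) - (d < 0);
}
```

Once inlined, a compiler can usually fold this into the flags test that the caller performs on the result.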

> 
>  3. Prefer the ppgtt lookup branch and inline it, allowing the
>     compiler to optimise out the second part of i915_vma_compare
>     and save one call indirection.

This runs counter to a better optimisation that completely avoids
calling obj_to_vma for ppgtt lookups (i.e. in execbuffer we go straight
from handle to vma, skipping the handle to obj intermediate lookup).

Primary caller for this function should be ggtt users, with single
negative lookups before creating the ppgtt vma.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

