[Intel-gfx] [RFC 3/3] drm/i915: Micro-optimize i915_gem_obj_to_vma

Chris Wilson chris at chris-wilson.co.uk
Tue Apr 26 10:45:14 UTC 2016


On Tue, Apr 26, 2016 at 11:35:53AM +0100, Dave Gordon wrote:
> On 21/04/16 13:05, Tvrtko Ursulin wrote:
> >From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >
> >i915_gem_obj_to_vma is one of the most expensive functions in
> >our profiles. Could avoiding some branching by replacing it
> >with arithmetic be beneficial? Some benchmarks suggest it
> >might, slightly.
> >
> >Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >---
> >  drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 0549dea683e1..243bfb922eb3 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> >  				     struct i915_address_space *vm)
> >  {
> >  	struct i915_vma *vma;
> >+
> >+	BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
> >+
> >  	list_for_each_entry(vma, &obj->vma_list, obj_link) {
> >-		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
> >-		    vma->vm == vm)
> >+		/*
> >+		 * Below is just a branch-avoiding way of saying:
> >+		 * vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
> >+		 * which relies on the fact I915_GGTT_VIEW_NORMAL has to be
> >+		 * zero.
> >+		 */
> >+		if (!((unsigned long)vma->ggtt_view.type |
> >+		    ((unsigned long)vma->vm ^ (unsigned long)vm)))
> >  			return vma;
> >  	}
> >+
> >  	return NULL;
> >  }
> 
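For reference, the equivalence the patch relies on can be checked in
isolation. The following is only a userspace sketch, not driver code:
the enum, the function names and check_equivalence() are hypothetical
stand-ins, and uintptr_t replaces the unsigned long casts. It shows
nothing about performance; a compiler may well emit the same code for
both forms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the view type; only NORMAL == 0 matters. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED = 1, VIEW_PARTIAL = 2 };

/* Same role as the BUILD_BUG_ON in the patch: the trick needs NORMAL == 0. */
_Static_assert(VIEW_NORMAL == 0, "bitwise match requires VIEW_NORMAL == 0");

/* Branching form: two comparisons joined by a short-circuit &&. */
static bool match_branchy(enum view_type type, const void *vm, const void *want)
{
	return type == VIEW_NORMAL && vm == want;
}

/*
 * Bitwise form: the XOR is zero only when the pointers are equal, and
 * the OR is zero only when both operands are zero.
 */
static bool match_bitwise(enum view_type type, const void *vm, const void *want)
{
	return !((uintptr_t)type | ((uintptr_t)vm ^ (uintptr_t)want));
}

/* Compare both forms over every combination of a few types and pointers. */
static void check_equivalence(void)
{
	int a, b;
	const void *ptrs[] = { NULL, &a, &b };
	const enum view_type types[] = { VIEW_NORMAL, VIEW_ROTATED, VIEW_PARTIAL };
	size_t t, i, j;

	for (t = 0; t < 3; t++)
		for (i = 0; i < 3; i++)
			for (j = 0; j < 3; j++)
				assert(match_branchy(types[t], ptrs[i], ptrs[j]) ==
				       match_bitwise(types[t], ptrs[i], ptrs[j]));
}
```

Calling check_equivalence() from a test harness exercises all 27
combinations; it passes only because VIEW_NORMAL is zero, which is
the property the BUILD_BUG_ON guards.
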
> Other alternatives might include splitting the vma_list, so that we
> have one list for the most-frequently searched-for entries (GGTT
> view NORMAL) and for everything else, so the above would just need a
> single test for equality.
> 
> Or, slightly less effectively, add GGTT/NORMAL entries at the head
> of the list and others at the tail (and search backwards if you
> *don't* want a GGTT/NORMAL entry). That would still need the
> comparisons, but would likely hit an early match.

We want one list for convenience elsewhere, but can keep an rhashtable
(rht) in parallel. This is not as effective or important as keeping a
hashtable to translate from handle to vma, but is still useful for some
stress cases.
-Chris
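
To illustrate the "one list plus a lookup table in parallel" idea, here
is a rough userspace sketch. It is an assumption-laden toy: it uses a
small fixed-size chained hash table rather than the kernel's rhashtable,
and struct vma, struct object, bucket_of(), object_add_vma() and
object_lookup() are all hypothetical names, not i915 code. Only
normal-view entries are indexed, so a lookup needs a single pointer
comparison per candidate.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 8

struct vma {
	const void *vm;        /* address space this mapping belongs to */
	int view_type;         /* 0 stands for the normal view */
	struct vma *list_next; /* link in the single per-object list */
	struct vma *hash_next; /* link in a hash bucket chain */
};

struct object {
	struct vma *vma_list;          /* every vma, for walkers elsewhere */
	struct vma *buckets[NBUCKETS]; /* normal-view vmas, keyed by vm */
};

/* Toy hash: drop low alignment bits of the pointer, then fold. */
static size_t bucket_of(const void *vm)
{
	return ((uintptr_t)vm >> 4) % NBUCKETS;
}

/* Add to the list always; index in the table only for the normal view. */
static void object_add_vma(struct object *obj, struct vma *vma)
{
	vma->list_next = obj->vma_list;
	obj->vma_list = vma;

	vma->hash_next = NULL;
	if (vma->view_type == 0) {
		size_t b = bucket_of(vma->vm);

		vma->hash_next = obj->buckets[b];
		obj->buckets[b] = vma;
	}
}

/* Find the normal-view vma for vm without walking the whole list. */
static struct vma *object_lookup(const struct object *obj, const void *vm)
{
	struct vma *vma;

	for (vma = obj->buckets[bucket_of(vm)]; vma; vma = vma->hash_next)
		if (vma->vm == vm)
			return vma;

	return NULL;
}
```

The list stays the single source of truth for code that iterates over
all vmas; the table is purely an accelerator and would need matching
removal logic, which is omitted here.
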

-- 
Chris Wilson, Intel Open Source Technology Centre

