[Intel-gfx] [RFC 3/3] drm/i915: Micro-optimize i915_gem_obj_to_vma
Chris Wilson
chris at chris-wilson.co.uk
Tue Apr 26 10:45:14 UTC 2016
On Tue, Apr 26, 2016 at 11:35:53AM +0100, Dave Gordon wrote:
> On 21/04/16 13:05, Tvrtko Ursulin wrote:
> >From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >
> >i915_gem_obj_to_vma is one of the most expensive functions in
> >our profiles. Could avoiding some branching by replacing it
> >with arithmetic be beneficial? Some benchmarks suggest it
> >might, slightly.
> >
> >Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >---
> > drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
> > 1 file changed, 12 insertions(+), 2 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 0549dea683e1..243bfb922eb3 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> > struct i915_address_space *vm)
> > {
> > struct i915_vma *vma;
> >+
> >+ BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
> >+
> > list_for_each_entry(vma, &obj->vma_list, obj_link) {
> >- if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
> >- vma->vm == vm)
> >+ /*
> >+ * Below is just a branch-avoiding way of saying:
> >+ * vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
> >+ * which relies on the fact I915_GGTT_VIEW_NORMAL has to be
> >+ * zero.
> >+ */
> >+ if (!((unsigned long)vma->ggtt_view.type |
> >+ ((unsigned long)vma->vm ^ (unsigned long)vm)))
> > return vma;
> > }
> >+
> > return NULL;
> > }
>
> Other alternatives might include splitting the vma_list, so that we
> have one list for the most-frequently searched-for entries (GGTT
> view NORMAL) and for everything else, so the above would just need a
> single test for equality.
>
> Or, slightly less effectively, add GGTT/NORMAL entries at the head
> of the list and others at the tail (and search backwards if you
> *don't* want a GGTT/NORMAL entry). That would still need the
> comparisons, but would likely hit an early match.
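
For reference, the arithmetic test in the quoted patch can be checked against the original two-comparison form in userspace. This is only a minimal sketch: the struct layouts, the VIEW_* enum and the helper names below are simplified, hypothetical stand-ins rather than the driver's real definitions, and uintptr_t is used where the kernel code casts to unsigned long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the i915 types. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED = 1, VIEW_PARTIAL = 2 };

struct vm { int id; };

struct vma {
	enum view_type type;
	const struct vm *vm;
	struct vma *next;	/* plain singly linked list instead of list_head */
};

/* Original form: two comparisons joined by a short-circuit &&. */
static int match_branchy(const struct vma *vma, const struct vm *vm)
{
	return vma->type == VIEW_NORMAL && vma->vm == vm;
}

/*
 * Arithmetic form: the OR is zero only if type is zero (VIEW_NORMAL)
 * and the XOR of the two pointers is zero (identical pointers).
 */
static int match_arith(const struct vma *vma, const struct vm *vm)
{
	return !((uintptr_t)vma->type |
		 ((uintptr_t)vma->vm ^ (uintptr_t)vm));
}

/* List walk using the arithmetic test, mirroring the patched loop. */
static struct vma *obj_to_vma(struct vma *head, const struct vm *vm)
{
	for (struct vma *vma = head; vma; vma = vma->next)
		if (match_arith(vma, vm))
			return vma;
	return NULL;
}

/* Compare both predicates over every view type and vm pairing. */
static void check_equivalence(void)
{
	struct vm a = { 1 }, b = { 2 };
	const struct vm *vms[] = { &a, &b };

	for (int t = VIEW_NORMAL; t <= VIEW_PARTIAL; t++)
		for (int i = 0; i < 2; i++)
			for (int j = 0; j < 2; j++) {
				struct vma vma = { (enum view_type)t, vms[i], NULL };

				assert(match_branchy(&vma, vms[j]) ==
				       match_arith(&vma, vms[j]));
			}
}
```

Whether the arithmetic form is actually faster depends on what the compiler generates for each variant, so this only demonstrates that the two predicates agree, not that one outperforms the other.
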
We want one list for convenience elsewhere, but we can keep an
rhashtable (rht) in parallel. That is not as effective or important as
keeping a hashtable to translate from handle to vma, but it is still
useful for some stress cases.
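
To illustrate the idea of a parallel index, here is a rough userspace sketch. It is not the kernel rhashtable API: the fixed-size open-addressing table, the struct layouts and the function names are all hypothetical, and it assumes one vma per vm with no removal, which the real driver would not get away with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8	/* power of two; toy table, never resized */

struct vm { int id; };

struct vma {
	const struct vm *vm;
	struct vma *next;	/* the single list kept for other users */
};

struct object {
	struct vma *vma_list;		/* every vma, for list walkers */
	struct vma *by_vm[TABLE_SIZE];	/* parallel index keyed by vm */
};

static size_t hash_vm(const struct vm *vm)
{
	/* Drop likely-zero alignment bits, then mask to the table size. */
	return ((uintptr_t)vm >> 4) & (TABLE_SIZE - 1);
}

/* Add to both the list and the index; returns -1 if the table is full. */
static int object_add_vma(struct object *obj, struct vma *vma)
{
	size_t slot;

	assert(vma && vma->vm);
	slot = hash_vm(vma->vm);

	for (size_t i = 0; i < TABLE_SIZE; i++) {
		size_t s = (slot + i) & (TABLE_SIZE - 1);

		if (!obj->by_vm[s]) {
			obj->by_vm[s] = vma;
			vma->next = obj->vma_list;
			obj->vma_list = vma;
			return 0;
		}
	}
	return -1;
}

/* Lookup by linear probing instead of walking vma_list. */
static struct vma *object_lookup(const struct object *obj,
				 const struct vm *vm)
{
	size_t slot = hash_vm(vm);

	for (size_t i = 0; i < TABLE_SIZE; i++) {
		struct vma *vma = obj->by_vm[(slot + i) & (TABLE_SIZE - 1)];

		if (!vma)
			return NULL;	/* empty slot ends the probe sequence */
		if (vma->vm == vm)
			return vma;
	}
	return NULL;
}
```

The list stays the single source of truth for code that iterates over every vma, while lookups by vm go through the index; the cost is keeping both in sync on every insertion.
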
-Chris
--
Chris Wilson, Intel Open Source Technology Centre