[Intel-gfx] [RFC 3/3] drm/i915: Micro-optimize i915_gem_obj_to_vma
Dave Gordon
david.s.gordon at intel.com
Tue Apr 26 10:35:53 UTC 2016
On 21/04/16 13:05, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> i915_gem_obj_to_vma is one of the most expensive functions in
> our profiles. Could avoiding some branching by replacing it
> with arithmetic be beneficial? Some benchmarks suggest it
> slightly might.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0549dea683e1..243bfb922eb3 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> struct i915_address_space *vm)
> {
> struct i915_vma *vma;
> +
> + BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
> +
> list_for_each_entry(vma, &obj->vma_list, obj_link) {
> - if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
> - vma->vm == vm)
> + /*
> + * Below is just a branching avoiding way of saying:
> + * vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
> + * which relies on the fact I915_GGTT_VIEW_NORMAL has to be
> + * zero.
> + */
> + if (!((unsigned long)vma->ggtt_view.type |
> + ((unsigned long)vma->vm ^ (unsigned long)vm)))
> return vma;
> }
> +
> return NULL;
> }
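The equivalence the patch relies on can be checked in isolation. Below is a minimal userspace sketch, assuming hypothetical stand-in types (struct vma, struct address_space, VIEW_NORMAL) rather than the real i915 structures. It only illustrates why the OR/XOR expression is zero exactly when both original conditions hold; it says nothing about whether the result is faster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver types; only the fields used here. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED = 1, VIEW_PARTIAL = 2 };

struct address_space { int id; };

struct vma {
	enum view_type view_type;
	struct address_space *vm;
};

/*
 * Compile-time check playing the role of BUILD_BUG_ON in the patch:
 * the array size becomes negative (a compile error) if VIEW_NORMAL != 0.
 */
typedef char view_normal_must_be_zero[(VIEW_NORMAL == 0) ? 1 : -1];

/* Original form: two comparisons joined by a short-circuiting &&. */
static int match_branching(const struct vma *vma,
			   const struct address_space *vm)
{
	return vma->view_type == VIEW_NORMAL && vma->vm == vm;
}

/*
 * Patched form: the XOR is zero only when the pointers are equal, the
 * type is zero only for VIEW_NORMAL, and the OR of the two is zero only
 * when both are zero.
 */
static int match_arithmetic(const struct vma *vma,
			    const struct address_space *vm)
{
	return !((uintptr_t)vma->view_type |
		 ((uintptr_t)vma->vm ^ (uintptr_t)vm));
}

/* Aborts if the two forms ever disagree for the given inputs. */
static void check_equivalence(const struct vma *vma,
			      const struct address_space *vm)
{
	assert(match_branching(vma, vm) == match_arithmetic(vma, vm));
}
```

Calling check_equivalence() for every combination of view type and address space exercises both the matching and the non-matching paths.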
Other alternatives might include splitting the vma_list, so that we have
one list for the most frequently searched-for entries (GGTT view NORMAL)
and another for everything else. The lookup above would then need only a
single test for equality (vma->vm == vm).
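As a rough illustration of the split-list idea, here is a minimal userspace sketch. All names (struct object, normal_vmas, other_vmas, object_add_vma, object_to_normal_vma) are hypothetical, and plain singly linked lists stand in for the kernel's list_head; it is not the driver's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED, VIEW_PARTIAL };

struct address_space { int id; };

struct vma {
	enum view_type view_type;
	struct address_space *vm;
	struct vma *next;
};

struct object {
	struct vma *normal_vmas; /* only VIEW_NORMAL entries */
	struct vma *other_vmas;  /* every other view type */
};

/* The view type is examined once, at insertion, to pick the list. */
static void object_add_vma(struct object *obj, struct vma *vma)
{
	struct vma **head = (vma->view_type == VIEW_NORMAL) ?
			    &obj->normal_vmas : &obj->other_vmas;

	vma->next = *head;
	*head = vma;
}

/* The hot lookup walks only the normal list and compares only the vm. */
static struct vma *object_to_normal_vma(const struct object *obj,
					const struct address_space *vm)
{
	struct vma *vma;

	for (vma = obj->normal_vmas; vma; vma = vma->next) {
		/* List invariant; compiled out when NDEBUG is defined. */
		assert(vma->view_type == VIEW_NORMAL);
		if (vma->vm == vm)
			return vma;
	}

	return NULL;
}
```

The cost moves to insertion (and to removal, not shown), where the view type has to be checked once to choose the list.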
Or, slightly less effectively, add GGTT/NORMAL entries at the head of
the list and others at the tail (and search backwards if you *don't*
want a GGTT/NORMAL entry). That would still need the comparisons, but
would likely hit an early match.
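The single-list variant could look roughly like the following userspace sketch. The names (struct object, object_add_vma, object_find_vma) and the hand-rolled doubly linked list are assumptions made for illustration; in the kernel the same ordering would presumably be expressed with list_add() and list_add_tail().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED, VIEW_PARTIAL };

struct address_space { int id; };

struct vma {
	enum view_type view_type;
	struct address_space *vm;
	struct vma *prev, *next;
};

struct object {
	struct vma *head, *tail;
};

/* VIEW_NORMAL entries go to the head, everything else to the tail. */
static void object_add_vma(struct object *obj, struct vma *vma)
{
	assert(obj && vma);

	vma->prev = vma->next = NULL;
	if (!obj->head) {
		obj->head = obj->tail = vma;
	} else if (vma->view_type == VIEW_NORMAL) {
		vma->next = obj->head;
		obj->head->prev = vma;
		obj->head = vma;
	} else {
		vma->prev = obj->tail;
		obj->tail->next = vma;
		obj->tail = vma;
	}
}

/*
 * Both comparisons are still needed, but the walk starts from the end
 * of the list where a match is most likely: forwards for VIEW_NORMAL,
 * backwards for any other view type.
 */
static struct vma *object_find_vma(const struct object *obj,
				   const struct address_space *vm,
				   enum view_type type)
{
	struct vma *vma;

	if (type == VIEW_NORMAL) {
		for (vma = obj->head; vma; vma = vma->next)
			if (vma->view_type == type && vma->vm == vm)
				return vma;
	} else {
		for (vma = obj->tail; vma; vma = vma->prev)
			if (vma->view_type == type && vma->vm == vm)
				return vma;
	}

	return NULL;
}
```

Compared with two separate lists, this keeps a single list per object at the price of retaining both comparisons in the loop.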
.Dave.