[Intel-gfx] [RFC 3/3] drm/i915: Micro-optimize i915_gem_obj_to_vma
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Apr 21 12:05:53 UTC 2016
From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
i915_gem_obj_to_vma is one of the most expensive functions in
our profiles. Could it be beneficial to avoid some branching by
replacing the comparisons with arithmetic? Some benchmarks
suggest a slight improvement.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0549dea683e1..243bfb922eb3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm)
 {
 	struct i915_vma *vma;
+
+	BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
+
 	list_for_each_entry(vma, &obj->vma_list, obj_link) {
-		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
-		    vma->vm == vm)
+		/*
+		 * Below is just a branch-avoiding way of saying:
+		 * vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
+		 * which relies on the fact that I915_GGTT_VIEW_NORMAL has to be
+		 * zero.
+		 */
+		if (!((unsigned long)vma->ggtt_view.type |
+		      ((unsigned long)vma->vm ^ (unsigned long)vm)))
 			return vma;
 	}
+
 	return NULL;
 }
--
1.9.1
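
The equivalence the patch relies on can be checked in isolation. The
following is a minimal user-space sketch, not driver code: fake_vma,
view_type, match_branching, match_arith and check_equivalence are
hypothetical names invented for illustration, and uintptr_t stands in
for the unsigned long casts used in the patch. It assumes, as the
BUILD_BUG_ON in the patch enforces, that the "normal" view type is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver types, not the real i915 ones. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED = 1 };

struct fake_vma {
	enum view_type type;
	const void *vm;
};

/* Branching form: two comparisons joined by a short-circuit &&. */
bool match_branching(const struct fake_vma *vma, const void *vm)
{
	return vma->type == VIEW_NORMAL && vma->vm == vm;
}

/*
 * Arithmetic form: the XOR is zero only when both pointers are equal,
 * and OR-ing in the type keeps the result zero only when the type is
 * also zero (VIEW_NORMAL). A single test against zero then decides.
 */
bool match_arith(const struct fake_vma *vma, const void *vm)
{
	return !((uintptr_t)vma->type | ((uintptr_t)vma->vm ^ (uintptr_t)vm));
}

/* Compare both forms over every combination of type and address space. */
void check_equivalence(void)
{
	int vm_a, vm_b; /* only their addresses are used */
	const void *vms[] = { &vm_a, &vm_b };
	const enum view_type types[] = { VIEW_NORMAL, VIEW_ROTATED };

	for (int t = 0; t < 2; t++) {
		for (int i = 0; i < 2; i++) {
			for (int j = 0; j < 2; j++) {
				struct fake_vma vma = { types[t], vms[i] };

				assert(match_arith(&vma, vms[j]) ==
				       match_branching(&vma, vms[j]));
			}
		}
	}
}
```

Calling check_equivalence() from a test harness exercises all eight
combinations of view type, stored address space and queried address
space. Whether the compiler actually emits fewer branches for the
arithmetic form depends on the compiler and target, so any benefit
still has to be measured.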