[PATCH 16/24] drm/i915: postpone the fill-dma operation for large objects

Matthew Auld matthew.auld at intel.com
Fri Sep 8 17:58:04 UTC 2017


If we are going to fill the entire page anyway when inserting our
entries then there's not much point in clearing it beforehand. For
particularly large objects the fill_page_dma() tends to rank rather
highly in perf when running benchmarks like gem_exec_fault, for both the
non-huge and especially the huge page cases.

Signed-off-by: Matthew Auld <matthew.auld at intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
Cc: Chris Wilson <chris at chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
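
For illustration only, the "will every entry be overwritten?" test can be
sketched as a standalone helper. This is not the driver code: the helper
name and the SKETCH_* constants are hypothetical, the shifts are assumed
to be 21 (2M per page table) and 30 (1G per page directory), and `length`
is assumed to be the number of bytes still to be mapped from `start`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shifts: one page table spans 2M, one page directory spans 1G. */
#define SKETCH_PDE_SHIFT  21
#define SKETCH_PDPE_SHIFT 30

/*
 * Hypothetical helper: returns true when the table must still be cleared
 * up front, i.e. when the range [start, start + length) does not cover
 * the whole naturally aligned span of one table at this level.
 */
static bool sketch_needs_clear(uint64_t start, uint64_t length,
			       unsigned int shift)
{
	uint64_t span = UINT64_C(1) << shift;

	/* Too short to fill the table, or does not begin on its boundary. */
	return length < span || (start & (span - 1)) != 0;
}
```

Under these assumptions a 2M-aligned, 2M-long range skips the clear at
the page-table level, while a 4K insertion into a fresh table does not.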

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 35ba39883094..bfb6430c6992 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1344,7 +1344,14 @@ static int gen8_ppgtt_alloc_pd(struct i915_address_space *vm,
 			if (IS_ERR(pt))
 				goto unwind;
 
-			gen8_initialize_pt(vm, pt);
+			/* As a small optimisation we can postpone clearing the
+			 * pt if we are going to fill every PTE when we later
+			 * insert our entries. This is especially true for 1G
+			 * and 2M pages where the pt remains unused.
+			 */
+			if (length - start < (1 << GEN8_PDE_SHIFT) ||
+			    !IS_ALIGNED(start, 1 << GEN8_PDE_SHIFT))
+				gen8_initialize_pt(vm, pt);
 
 			gen8_ppgtt_set_pde(vm, pd, pt, pde);
 			pd->used_pdes++;
@@ -1375,7 +1382,15 @@ static int gen8_ppgtt_alloc_pdp(struct i915_address_space *vm,
 			if (IS_ERR(pd))
 				goto unwind;
 
-			gen8_initialize_pd(vm, pd);
+			/* As a small optimisation we can postpone clearing the
+			 * pd if we are going to fill every PDE when we later
+			 * insert the entries. This is especially true for 1G
+			 * pages where the pd remains unused.
+			 */
+			if (length - start < (1 << GEN8_PDPE_SHIFT) ||
+			    !IS_ALIGNED(start, 1 << GEN8_PDPE_SHIFT))
+				gen8_initialize_pd(vm, pd);
+
 			gen8_ppgtt_set_pdpe(vm, pdp, pd, pdpe);
 			pdp->used_pdpes++;
 			GEM_BUG_ON(pdp->used_pdpes > i915_pdpes_per_pdp(vm));
-- 
2.13.5