[Mesa-dev] [PATCH 5/5] i965/miptree: Rewrite the miptree map logic

Ben Widawsky benjamin.widawsky at intel.com
Tue Jul 14 09:56:13 PDT 2015


This patch rewrites the logic for determining which method we using for mapping
a miptree. It is my intention that that this patch, the required patches before
this do not change functionality, or if they do, it's in very obscure an
unobservable cases.

I have two reasons why I decided to write this patch. The existing logic was way
too tricky. In particular, the way in which it evaluated which operation to use
was out of order - specifically when it checked to use the blitter in
use_intel_mipree_map_blit(), part of the check is to determine if it will later
be unable to use the GTT. The other reason is to make playing with the various
operations much easier. For example, there are some theories being thrown around
that we might actually want to use the blitter where we use the GTT today, and
vice versa. After this patch, benchmarking those changes is much more
straightforward.

It's pretty difficult for me to prove there is no real change going on. I ran a
subset of my benchmarks on this though. The following benchmarks show no perf
difference on BDW with ministat with n=5 and CI=.95:
OglBatch7
OglDeferred
OglFillPixel
OglGeomPoint
OglGeomTriList
OglHdrBloom
OglPSBump2
OglPSPhong
OglPSPom
OglShMapPcf
OglTerrainFlyInst
OglTexMem512
OglVSDiffuse8
OglVSInstancing
OglZBuffer
plot3d
trex

It's important to point out that much of the changes effect non-LLC platform,
and I do not yet have data for that. I'll be collecting it over the next few
days, but I figure this patch can get some comments meanwhile.

Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 76 +++++++++++++--------------
 1 file changed, 37 insertions(+), 39 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 2788270..545fbf3 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2283,6 +2283,8 @@ intel_miptree_unmap_movntdqa(struct brw_context *brw,
    map->buffer = NULL;
    map->ptr = NULL;
 }
+#else
+#define intel_miptree_map_movntdqa(x,y,z,w,a) abort()
 #endif
 
 static void
@@ -2621,36 +2623,6 @@ can_blit_slice(struct brw_context *brw,
    return true;
 }
 
-static bool
-use_intel_mipree_map_blit(struct brw_context *brw,
-                          struct intel_mipmap_tree *mt,
-                          GLbitfield mode,
-                          unsigned int level,
-                          unsigned int slice)
-{
-   if (brw->has_llc &&
-       !(mode & GL_MAP_WRITE_BIT) &&
-       can_blit_slice(brw, mt, level, slice))
-      return true;
-
-   if (mt->tiling != I915_TILING_NONE &&
-       mt->bo->size >= brw->max_gtt_map_object_size) {
-      /* XXX: This assertion is actually the final condition for platforms
-       * without SSE4.1.  Returning false is not the right thing to do with
-       * the current code. On those platforms, the goal of this function is to give
-       * preference to the GTT, and at this point we've determined we cannot use
-       * the GTT, and we cannot blit, so we are out of options.
-       *
-       * NOTE: It should be possible to actually handle the case, but AFAIK, we
-       * never get this assertion.
-       */
-      assert(can_blit_slice(brw, mt, level, slice));
-      return true;
-   }
-
-   return false;
-}
-
 /**
  * Parameter \a out_stride has type ptrdiff_t not because the buffer stride may
  * exceed 32 bits but to diminish the likelihood subtle bugs in pointer
@@ -2706,18 +2678,44 @@ intel_miptree_map(struct brw_context *brw,
       goto done;
    }
 
-   if (use_intel_mipree_map_blit(brw, mt, mode, level, slice)) {
-      intel_miptree_map_blit(brw, mt, map, level, slice);
+   /* First determine what the available option are, then pick from the best
+    * option based on the platform.
+    */
+   bool can_hw_blit = can_blit_slice(brw, mt, level, slice);
+   bool can_use_gtt = mt->bo->size < brw->max_gtt_map_object_size;
 #if defined(USE_SSE41)
-   } else if (!(mode & GL_MAP_WRITE_BIT) &&
-              !mt->compressed && cpu_has_sse4_1 &&
-              (mt->pitch % 16 == 0)) {
-      intel_miptree_map_movntdqa(brw, mt, map, level, slice);
+   bool can_stream_map = cpu_has_sse4_1 && mt->pitch % 16 == 0;
+#else
+   bool can_stream_map = false;
 #endif
-   } else {
-      assert(mode & GL_MAP_WRITE_BIT == 0);
-      assert(!mt->compressed);
+
+   if (can_stream_map) {
+      /* BENCHMARK_ME: GTT maps for non-llc */
+      intel_miptree_map_movntdqa(brw, mt, map, level, slice);
+      goto done;
+   }
+
+   /*
+    * Hopefully we've been able to use the streaming copy, but if we really
+    * can't, make a decision based on the two things we know to matter: tiling,
+    * and LLC.
+    *
+    * The general thinking is that with shared cache, doing software detiling
+    * is always cheaper than what we'd get from the GTT, and we always want a
+    * cached mapping, (so we use the blitter).
+    *
+    * On non-LLC, since we don't get any benefit from using the blitter wrt
+    * caching, and both mechanism can do our detile (support was determined
+    * already), opt for GTT first to match the legacy behavior.
+    *
+    * BENCHMARK_ME: non-llc + tiled + blitter
+    */
+   if (brw->has_llc && can_hw_blit) {
+      intel_miptree_map_blit(brw, mt, map, level, slice);
+   } else if (can_use_gtt) {
       intel_miptree_map_gtt(brw, mt, map, level, slice);
+   } else {
+      unreachable("We don't yet support slice blits");
    }
 
 done:
-- 
2.4.5



More information about the mesa-dev mailing list