[Mesa-dev] [PATCH 5/5] i965/miptree: Rewrite the miptree map logic

Ben Widawsky ben at bwidawsk.net
Fri Jul 17 13:51:09 PDT 2015


On Thu, Jul 16, 2015 at 01:45:56PM -0700, Chad Versace wrote:
> On Tue 14 Jul 2015, Ben Widawsky wrote:
> > This patch rewrites the logic for determining which method we using for mapping
> > a miptree. It is my intention that that this patch, the required patches before
> > this do not change functionality, or if they do, it's in very obscure an
> > unobservable cases.
> > 
> > I have two reasons why I decided to write this patch. The existing logic was way
> > too tricky. In particular, the way in which it evaluated which operation to use
> > was out of order - specifically when it checked to use the blitter in
> > use_intel_mipree_map_blit(), part of the check is to determine if it will later
> > be unable to use the GTT. The other reason is to make playing with the various
> > operations much easier. For example, there are some theories being thrown around
> > that we might actually want to use the blitter where we use the GTT today, and
> > vice versa. After this patch, benchmarking those changes is much more
> > straightforward.
> > 
> > It's pretty difficult for me to prove there is no real change going on. I ran a
> > subset of my benchmarks on this though. The following benchmarks show no perf
> > difference on BDW with ministat with n=5 and CI=.95:
> > OglBatch7
> > OglDeferred
> > OglFillPixel
> > OglGeomPoint
> > OglGeomTriList
> > OglHdrBloom
> > OglPSBump2
> > OglPSPhong
> > OglPSPom
> > OglShMapPcf
> > OglTerrainFlyInst
> > OglTexMem512
> > OglVSDiffuse8
> > OglVSInstancing
> > OglZBuffer
> > plot3d
> > trex
> > 
> > It's important to point out that much of the changes effect non-LLC platform,
> > and I do not yet have data for that. I'll be collecting it over the next few
> > days, but I figure this patch can get some comments meanwhile.
> > 
> > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 76 +++++++++++++--------------
> >  1 file changed, 37 insertions(+), 39 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 2788270..545fbf3 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -2283,6 +2283,8 @@ intel_miptree_unmap_movntdqa(struct brw_context *brw,
> >     map->buffer = NULL;
> >     map->ptr = NULL;
> >  }
> > +#else
> > +#define intel_miptree_map_movntdqa(x,y,z,w,a) abort()
> >  #endif
> >  
> >  static void
> > @@ -2621,36 +2623,6 @@ can_blit_slice(struct brw_context *brw,
> >     return true;
> >  }
> >  
> > -static bool
> > -use_intel_mipree_map_blit(struct brw_context *brw,
> > -                          struct intel_mipmap_tree *mt,
> > -                          GLbitfield mode,
> > -                          unsigned int level,
> > -                          unsigned int slice)
> > -{
> > -   if (brw->has_llc &&
> > -       !(mode & GL_MAP_WRITE_BIT) &&
> > -       can_blit_slice(brw, mt, level, slice))
> > -      return true;
> > -
> > -   if (mt->tiling != I915_TILING_NONE &&
> > -       mt->bo->size >= brw->max_gtt_map_object_size) {
> > -      /* XXX: This assertion is actually the final condition for platforms
> > -       * without SSE4.1.  Returning false is not the right thing to do with
> > -       * the current code. On those platforms, the goal of this function is to give
> > -       * preference to the GTT, and at this point we've determined we cannot use
> > -       * the GTT, and we cannot blit, so we are out of options.
> > -       *
> > -       * NOTE: It should be possible to actually handle the case, but AFAIK, we
> > -       * never get this assertion.
> > -       */
> > -      assert(can_blit_slice(brw, mt, level, slice));
> > -      return true;
> > -   }
> > -
> > -   return false;
> > -}
> > -
> >  /**
> >   * Parameter \a out_stride has type ptrdiff_t not because the buffer stride may
> >   * exceed 32 bits but to diminish the likelihood subtle bugs in pointer
> > @@ -2706,18 +2678,44 @@ intel_miptree_map(struct brw_context *brw,
> >        goto done;
> >     }
> >  
> > -   if (use_intel_mipree_map_blit(brw, mt, mode, level, slice)) {
> > -      intel_miptree_map_blit(brw, mt, map, level, slice);
> > +   /* First determine what the available option are, then pick from the best
> > +    * option based on the platform.
> > +    */
> > +   bool can_hw_blit = can_blit_slice(brw, mt, level, slice);
> 
> can_hw_blit is too weak of a condition.
> 
> "can" is very different from "should". Before, this function chose to
> call intel_miptree_map_blit() if use_intel_mipree_map_blit() recommended
> it (because "use" really means "should use" in that function name). The
> set of conditions that satisfies "can" are much larger.
> 
> For example, can_blit_slice() should return true for linear buffers
> (they are blittable, after all). However, intel_miptree_map() should
> mmap those buffers instead of blitting them.
> 
> > +   bool can_use_gtt = mt->bo->size < brw->max_gtt_map_object_size;
> >  #if defined(USE_SSE41)
> > -   } else if (!(mode & GL_MAP_WRITE_BIT) &&
> > -              !mt->compressed && cpu_has_sse4_1 &&
> > -              (mt->pitch % 16 == 0)) {
> > -      intel_miptree_map_movntdqa(brw, mt, map, level, slice);
> > +   bool can_stream_map = cpu_has_sse4_1 && mt->pitch % 16 == 0;
> > +#else
> > +   bool can_stream_map = false;
> >  #endif
> > -   } else {
> > -      assert(mode & GL_MAP_WRITE_BIT == 0);
> > -      assert(!mt->compressed);
> > +
> > +   if (can_stream_map) {
> > +      /* BENCHMARK_ME: GTT maps for non-llc */
> > +      intel_miptree_map_movntdqa(brw, mt, map, level, slice);
> > +      goto done;
> > +   }
> > +
> > +   /*
> > +    * Hopefully we've been able to use the streaming copy, but if we really
> > +    * can't, make a decision based on the two things we know to matter: tiling,
> > +    * and LLC.
> > +    *
> > +    * The general thinking is that with shared cache, doing software detiling
> > +    * is always cheaper than what we'd get from the GTT, and we always want a
> > +    * cached mapping, (so we use the blitter).
> > +    *
> > +    * On non-LLC, since we don't get any benefit from using the blitter wrt
> > +    * caching, and both mechanism can do our detile (support was determined
> > +    * already), opt for GTT first to match the legacy behavior.
> > +    *
> > +    * BENCHMARK_ME: non-llc + tiled + blitter
> > +    */
> > +   if (brw->has_llc && can_hw_blit) {
> > +      intel_miptree_map_blit(brw, mt, map, level, slice);
> > +   } else if (can_use_gtt) {
> >        intel_miptree_map_gtt(brw, mt, map, level, slice);
> > +   } else {
> > +      unreachable("We don't yet support slice blits");
> >     }
> 
> I find the new logic, with multiple if-trees and gotos, more difficult
> to follow and verify for correctness that old style of:
> 
>     if (a) {
>         do_stuff;
>     } else if (b) {
>         do_stuff;
>     } else if (c) {
>         do_stuff;
>     } else {
>         do_stuff;
>     }
> 
> With the old style, given a miptree and the map's readwrite mode,
> I could easily walk through the if tree until I reached "true". With the
> new style in this patch, in theory I could do the same, but I no longer
> have confidence that my logic-walking would be correct.
> 
> Could you accomplish a similar set of cleanups, and still preserve the
> monolithic 'if' tree, by moving all the helper variables (such as
> can_use_gtt) to the top of the function, before the 'if' tree begins?

Forgive my bluntness but I read what you just said as the old way is easier to
read and verify (something I completely disagree with, but it's certainly a
subjective thing). Moving more of the conditions into a single variable is a
benefit assuming you can do it well, but it doesn't solve the complexity that I
see here. I see no reason, and have no motivation to continue down this path if
this is your impression of the patch.


More information about the mesa-dev mailing list