[Mesa-dev] [PATCH 4/4] i965: Allow Y-tiled allocations for large surfaces

Jordan Justen jordan.l.justen at intel.com
Sat Jan 31 11:54:47 PST 2015


I agree with Jason's comments on 3 & 4. I think I asked something
similar previously regarding patch 3. (Ie, why not just do the
adjustments all the time?)

With Jason's comments fixed, series
Reviewed-by: Jordan Justen <jordan.l.justen at intel.com>

On 2015-01-14 10:34:52, Jason Ekstrand wrote:
> 
> 
> On Tue, Jan 13, 2015 at 11:37 PM, Ben Widawsky <benjamin.widawsky at intel.com>
> wrote:
> 
>     This patch will use a new calculation to determine if a surface can be
>     blitted
>     from or to. Previously, the "total_height" member was used. Total_height in
>     the
>     case of 2d, 3d, and cube map arrays is the height of each slice/layer/face.
>     Since the GL map APIS only ever deal with a slice at a time however, the
>     determining factor is really the height of one slice.
> 
>     This patch also has a side effect of not needing to set potentially large
>     texture objects to the CPU domain, which implies we do not need to clflush
>     the
>     entire objects. (See references below for a kernel patch to achieve the
>     same
>     thing)
> 
>     With both the Y-tiled surfaces, and the removal of excessive clflushes,
>     this
>     improves the terrain benchmark on Cherryview (data collected by Jordan)
> 
>     Difference at 95.0% confidence
>     17.9236 +/- 0.252116 153.005% +/- 2.1522% (Student's t, pooled s =
>     0.205889)
> 
>     With the performance governor, and HI-Z raw stall optimization, the
>     improvement
>     is even more stark on Braswell.
> 
>     Jordan was extremely helpful in creating this patch. Consider him
>     co-author.
> 
>     References: http://patchwork.freedesktop.org/patch/38909/
>     Cc: Jordan Justen <jordan.l.justen at intel.com>
>     Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>     ---
>      src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 71
>     +++++++++++++++++++++------
>      1 file changed, 56 insertions(+), 15 deletions(-)
> 
>     diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/
>     drivers/dri/i965/intel_mipmap_tree.c
>     index 639309b..1319f1e 100644
>     --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>     +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>     @@ -86,6 +86,27 @@ compute_msaa_layout(struct brw_context *brw, mesa_format
>     format, GLenum target)
>         }
>      }
> 
>     +static uint32_t
>     +compute_real_blit_height(struct intel_mipmap_tree *mt)
>     +{
>     +   switch (mt->target) {
>     +   case GL_TEXTURE_CUBE_MAP:
>     +   case GL_TEXTURE_1D_ARRAY:
>     +   case GL_TEXTURE_2D_ARRAY:
>     +   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
>     +   case GL_TEXTURE_CUBE_MAP_ARRAY:
>     +      assert(mt->logical_depth0);
>     +      return mt->qpitch;
>     +   case GL_TEXTURE_3D:
>     +      /* FIXME 3d textures don't have a qpitch. I think it's simply the
>     tiled
>     +       * aligned mt->physical_height0. Since 3D textures aren't used
>     often, just
>     +       * print the perf debug from the caller and bail
>     +       */
>     +       /* fallthrough */
>     +   default:
>     +      return mt->total_height;
>     +   }
>     +}
> 
>      /**
>       * For single-sampled render targets ("non-MSRT"), the MCS buffer is a
>     @@ -416,6 +437,17 @@ intel_miptree_create_layout(struct brw_context *brw,
>         return mt;
>      }
> 
>     +static bool
>     +miptree_exceeds_blit_height(struct intel_mipmap_tree *mt)
>     +{
>     +   /* FIXME: Add 3d texture support */
>     +   if (mt->target == GL_TEXTURE_3D && mt->total_height >= 32768) {
>     +      return true;
>     +   }
>     +
>     +   return compute_real_blit_height(mt) >= 32768;
> 
> 
> I don't think this is quite correct.  Let's say we have 2D array texture with a
> qpitch of 32767.  Then slice 1 (the second slice) will start at 32767.  Since
> we have to align to a tile, the vertical offset we use for the blit will be
> 32768 - tile_height.  Then the Y2 coordinat will be (2 * 32767) - (32768 -
> tile_height) which is greater than 32767.  If we just give ourselves an extra
> tile_height of padding here, it should solve this problem.  Probably want to
> throw the above in as a comment as well.
>  
> 
>     +}
>     +
>      /**
>       * \brief Helper function for intel_miptree_create().
>       */
>     @@ -473,10 +505,22 @@ intel_miptree_choose_tiling(struct brw_context *brw,
>         if (minimum_pitch < 64)
>            return I915_TILING_NONE;
> 
>     -   if (ALIGN(minimum_pitch, 512) >= 32768 ||
>     -       mt->total_width >= 32768 || mt->total_height >= 32768) {
>     -      perf_debug("%dx%d miptree too large to blit, falling back to
>     untiled",
>     -                 mt->total_width, mt->total_height);
>     +   if (ALIGN(minimum_pitch, 512) >= 32768 || miptree_exceeds_blit_height
>     (mt)) {
>     +      if (mt->format == GL_TEXTURE_3D) {
>     +         perf_debug("Unsupported large 3D texture blit. "
>     +                    "Falling back to untiled.\n");
>     +      } else {
>     +         /* qpitch should always be greater than or equal to the tile
>     aligned
>     +          * maximum of lod0 height.  That is sufficient to determine if we
>     can
>     +          * blit, but the most optimal value is potentially less.
>     +          */
>     +         if (mt->physical_height0 < 32768) {
>     +            perf_debug("Potentially skipped a blittable buffers %d\n",
>     +                  mt->physical_height0);
>     +         }
>     +         perf_debug("%dx%d miptree too large to blit, falling back to
>     untiled",
>     +                    mt->total_width, mt->total_height);
>     +      }
>            return I915_TILING_NONE;
>         }
> 
>     @@ -620,11 +664,14 @@ intel_miptree_create(struct brw_context *brw,
>                                            BO_ALLOC_FOR_RENDER : 0));
>         mt->pitch = pitch;
> 
>     +   uint32_t size = ALIGN(compute_real_blit_height(mt) * mt->pitch, 512);
>     +   assert(size <= mt->bo->size);
>     +
>         /* If the BO is too large to fit in the aperture, we need to use the
>          * BLT engine to support it.  The BLT paths can't currently handle
>     Y-tiling,
>          * so we need to fall back to X.
>          */
>     -   if (y_or_x && mt->bo->size >= brw->max_gtt_map_object_size) {
>     +   if (y_or_x && size >= brw->max_gtt_map_object_size) {
> 
> 
> We run into the same issue I pointed out above here too.  Maybe we want to roll
> the padding into the compute_real_blit_height function and call it
> compute_real_max_blit_height or something?
>  
> 
>            perf_debug("%dx%d miptree larger than aperture; falling back to
>     X-tiled\n",
>                       mt->total_width, mt->total_height);
> 
>     @@ -1748,6 +1795,8 @@ intel_miptree_map_gtt(struct brw_context *brw,
>         intptr_t x = map->x;
>         intptr_t y = map->y;
> 
>     +   assert(mt->bo->size < brw->max_gtt_map_object_size);
>     +
>         /* For compressed formats, the stride is the number of bytes per
>          * row of blocks.  intel_miptree_get_image_offset() already does
>          * the divide.
>     @@ -2247,16 +2296,8 @@ static bool
>      can_blit_slice(struct intel_mipmap_tree *mt,
>                     unsigned int level, unsigned int slice)
>      {
>     -   uint32_t image_x;
>     -   uint32_t image_y;
>     -   intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
>     -   if (image_x >= 32768 || image_y >= 32768)
>     -      return false;
>     -
>     -   if (mt->pitch >= 32768)
>     -      return false;
>     -
>     -   return true;
>     +   return compute_real_blit_height(mt) < 32768 &&
>     +          mt->pitch < 32768;
>      }
> 
>      /**
>     --
>     2.2.1
> 
>     _______________________________________________
>     mesa-dev mailing list
>     mesa-dev at lists.freedesktop.org
>     http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 


More information about the mesa-dev mailing list