[Mesa-dev] [PATCH 04/14] i965/gen7: Don't allocate hiz miptree structure

Pohjolainen, Topi topi.pohjolainen at intel.com
Mon Jul 21 00:51:01 PDT 2014


On Sat, Jul 19, 2014 at 07:08:59PM -0700, Jordan Justen wrote:
> On Fri, Jul 18, 2014 at 1:58 AM, Pohjolainen, Topi
> <topi.pohjolainen at intel.com> wrote:
> > On Tue, Jul 15, 2014 at 06:32:12PM -0700, Jordan Justen wrote:
> >> We now skip allocating a hiz miptree for gen7. Instead, we calculate
> >> the required hiz buffer parameters and allocate a bo directly.
> >>
> >> Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> >> ---
> >>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 95 ++++++++++++++++++++++++++-
> >>  1 file changed, 93 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> >> index 8719c29..7e8bec8 100644
> >> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> >> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> >> @@ -823,7 +823,10 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
> >>        drm_intel_bo_unreference((*mt)->bo);
> >>        intel_miptree_release(&(*mt)->stencil_mt);
> >>        if ((*mt)->hiz_buf) {
> >> -         intel_miptree_release(&(*mt)->hiz_buf->mt);
> >> +         if ((*mt)->hiz_buf->mt)
> >> +            intel_miptree_release(&(*mt)->hiz_buf->mt);
> >> +         else
> >> +            drm_intel_bo_unreference((*mt)->hiz_buf->bo);
> >>           free((*mt)->hiz_buf);
> >>        }
> >>        intel_miptree_release(&(*mt)->mcs_mt);
> >> @@ -1374,6 +1377,89 @@ intel_miptree_level_enable_hiz(struct brw_context *brw,
> >>  }
> >>
> >>
> >> +/**
> >> + * Helper for intel_miptree_alloc_hiz() that determines the required hiz
> >> + * buffer dimensions and allocates a bo for the hiz buffer.
> >> + */
> >> +static struct intel_miptree_aux_buffer *
> >> +intel_gen7_hiz_buf_create(struct brw_context *brw,
> >> +                          struct intel_mipmap_tree *mt)
> >> +{
> >> +   unsigned z_width = mt->logical_width0;
> >> +   unsigned z_height = mt->logical_height0;
> >> +   const unsigned z_depth = mt->logical_depth0;
> >> +   unsigned hz_width, hz_height, qpitch;
> >
> > Minor nit, qpitch could be called hz_qpitch for clarity as it is a result of
> > hiz-specific rules just as hz_width and hz_height. Simple matter of taste and
> > you choose the way that you feel the best.
> >
> >> +   struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
> >> +
> >> +   if (!buf)
> >> +      return NULL;
> >> +
> >> +   /* Gen7 PRM Volume 2, Part 1, 11.5.3 "Hierarchical Depth Buffer" documents
> >> +    * adjustments required for Z_Height and Z_Width based on multisampling.
> >> +    */
> >> +   switch(mt->num_samples) {
> >> +   case 0:
> >> +   case 1:
> >> +      break;
> >> +   case 2:
> >> +   case 4:
> >> +      z_width *= 2;
> >> +      z_height *= 2;
> >> +      break;
> >> +   case 8:
> >> +      z_width *= 4;
> >> +      z_height *= 2;
> >> +      break;
> >> +   default:
> >> +      assert(!"Unsupported sample count!");
> >> +   }
> >> +
> >> +   const unsigned vertical_align = 8; /* 'j' in the docs */
> >> +   const unsigned H0 = z_height;
> >> +   const unsigned h0 = ALIGN(H0, vertical_align);
> >> +   const unsigned h1 = ALIGN(minify(H0, 1), vertical_align);
> >> +   const unsigned Z0 = z_depth;
> >> +
> >> +   /* HZ_Width (bytes) = ceiling(Z_Width / 16) * 16 */
> >> +   hz_width = ALIGN(z_width, 16);
> >> +
> >> +   if (mt->target == GL_TEXTURE_3D) {
> >> +      unsigned H_i = H0;
> >> +      unsigned Z_i = Z0;
> >> +      hz_height = 0;
> >> +      for (int level = mt->first_level; level <= mt->last_level; ++level) {
> >> +         unsigned h_i = ALIGN(H_i, vertical_align);
> >> +         /* sum(i=0 to m; h_i * max(1, floor(Z_Depth/2**i))) */
> >
> > I had to think for a second if you had typo here (2**i) but then realized
> > you used it to mean power-of-two. I've also seen people using 2^i, would that
> > make sense to you?
> >
> >> +         hz_height += h_i * Z_i;
> >> +         H_i = minify(H_i, 1);
> >> +         Z_i = minify(Z_i, 1);
> >> +      }
> >> +      /* HZ_Height =
> >> +       *    (1/2) * sum(i=0 to m; h_i * max(1, floor(Z_Depth/2**i)))
> >> +       */
> >> +      hz_height = CEILING(hz_height, 2);
> >> +   } else {
> >> +      qpitch = h0 + h1 + (12 * vertical_align);
> >> +      /* HZ_Height (rows) = Ceiling ( ( Q_pitch * Z_depth/2) /8 ) * 8 */
> >> +      hz_height = (ALIGN(qpitch, 8) / 2) * Z0;
> >
> > Here the ALIGN is no-op - qpitch is a sum of three already aligned numbers,
> > and hence it is aligned itself. The final result in turn is not always aligned
> > (althought is should be). For example, say
> >
> >   qpitch = ALIGN(16, 8) + ALIGN(minify(16, 1), 8) + 12 * 8 = 15 * 8
> >   ZO = z_depth = 1
> >
> >   => hz_height = (15 * 8 / 2) * 1 = 60
> >
> > This particular case would probably fine as there is only one layer and still
> > a lot of extra. But that may not be the case with higher odd layer numbers
> > anymore.
> >
> > I would change this into:
> >
> >          hz_height = ALIGN(qpitch * Z0 / 2, vertical_align);
> 
> The comment above the assignment is from the docs, and it uses the
> constant 8, rather than 'j', so I thought it would be better just to
> use 8. Although, you are right that it is probably 8 because they set
> 'j' as 8 for the purposes of hiz calculations.
> 
> I also wanted to do the ALIGN before the integer / 2.

I'm missing the rational for this, the spec says that the result of the
division should be aligned by eight, right?

> 
> How do you feel about:
> hz_height = ALIGN(qpitch * Z0, 8) / 2;

With the example numbers the end result is not aligned by 8:

  hz_height = ALIGN(15 * 8 * 1, 8) / 2 = 120 / 2 = 60

> 
> I also think this alternative is closest to the docs:
> hz_height = CEILING(qpitch * Z0, 2 * 8) * 8;

This I agree, this would be in line with the spec:

  hz_height = CEILING(15 * 8 * 1, 2 * 8) * 8 = 64

> 
> -Jordan


More information about the mesa-dev mailing list