[Mesa-dev] [PATCH] i965: Fix an off-by-1 error in the draw upload code's size calculation.

Ilia Mirkin imirkin at alum.mit.edu
Tue Oct 14 17:36:17 PDT 2014


On Tue, Oct 14, 2014 at 7:42 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> According to INTEL_DEBUG=perf, "Borderlands: The Pre-Sequel" was
> stalling on nearly every glBufferSubData call, with very slightly
> overlapping busy ranges.
>
> It turns out the draw upload code was accidentally including an extra
> stride's worth of data in the vertex buffer size due to a simple
> off-by-one error.  We considered this extra bit of buffer space to be
> busy (in use by the GPU), when it was actually idle.
>
> The new diagram should make it easier to understand the formula.  It's
> basically what I drew on paper when working through an actual
> glDrawRangeElements call.
>
> Eliminates all glBufferSubData stalls in "Borderlands: The Pre-Sequel."
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> Cc: mesa-stable at lists.freedesktop.org
> ---
>  src/mesa/drivers/dri/i965/brw_draw_upload.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> No Piglit regressions on Haswell.  This might help Dota 2 and Serious Sam 3
> as well, but I haven't checked.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index 5a12439..6cb653c 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -486,8 +486,28 @@ brw_prepare_vertices(struct brw_context *brw)
>                    offset = 0;
>                    size = intel_buffer->Base.Size;
>                 } else {
> +                  /* Compute the size/amount of data referenced by the GPU.
> +                   * If the data is interleaved, StrideB may be larger than
> +                   * _ElementSize.  As an example, assume we have 2 interleaved
> +                   * attributes A and B.  The data is organized like this:
> +                   *
> +                   *       Stride    EltSize
> +                   *        _,,_        ,
> +                   *       /    \      / \
> +                   *    A: ---   ---   ---   ---   ---   ---
> +                   *    B:    ---   ---   ---   ---   ---   ---
> +                   *
> +                   *       |===== 4 elts ======|  (4-1) * Stride + EltSize
> +                   *
> +                   * max_index - min_index gives the number of elements that
> +                   * will be referenced.  Say we're drawing 4 elements.  On

Hm... is that right? Intuitively, max_index - min_index + 1 gives the
number of elements that will be referenced. Does the max already have
the +1 built into it?

  -ilia

> +                   * the first three, we need the full stride in order to get
> +                   * to the next element.  But on the last, we only want the
> +                   * element size, since we don't actually read the other
> +                   * interleaved vertex attributes stored beyond that.
> +                   */
>                    offset = buffer->offset + min_index * buffer->stride;
> -                  size = (buffer->stride * (max_index - min_index) +
> +                  size = (buffer->stride * MAX2(max_index - min_index - 1, 0) +
>                            glarray->_ElementSize);
>                 }
>              }
> --
> 2.1.2
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list