[Mesa-stable] [Mesa-dev] [PATCH] i965: Fix an off-by-1 error in the draw upload code's size calculation.

Kenneth Graunke kenneth at whitecape.org
Wed Oct 15 10:04:17 PDT 2014


On Wednesday, October 15, 2014 09:50:42 AM Jason Ekstrand wrote:
> On Tue, Oct 14, 2014 at 4:42 PM, Kenneth Graunke <kenneth at whitecape.org>
> wrote:
> 
> > According to INTEL_DEBUG=perf, "Borderlands: The Pre-Sequel" was
> > stalling on nearly every glBufferSubData call, with very slightly
> > overlapping busy ranges.
> >
> > It turns out the draw upload code was accidentally including an extra
> > stride's worth of data in the vertex buffer size due to a simple
> > off-by-one error.  We considered this extra bit of buffer space to be
> > busy (in use by the GPU), when it was actually idle.
> >
> 
> Nice Catch!
> 
> 
> >
> > The new diagram should make it easier to understand the formula.  It's
> > basically what I drew on paper when working through an actual
> > glDrawRangeElements call.
> >
> > Eliminates all glBufferSubData stalls in "Borderlands: The Pre-Sequel."
> >
> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > Cc: mesa-stable at lists.freedesktop.org
> > ---
> >  src/mesa/drivers/dri/i965/brw_draw_upload.c | 22 +++++++++++++++++++++-
> >  1 file changed, 21 insertions(+), 1 deletion(-)
> >
> > No Piglit regressions on Haswell.  This might help Dota 2 and Serious Sam 3
> > as well, but I haven't checked.
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > index 5a12439..6cb653c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > @@ -486,8 +486,28 @@ brw_prepare_vertices(struct brw_context *brw)
> >                    offset = 0;
> >                    size = intel_buffer->Base.Size;
> >                 } else {
> > +                  /* Compute the size/amount of data referenced by the
> > GPU.
> > +                   * If the data is interleaved, StrideB may be larger
> > than
> > +                   * _ElementSize.  As an example, assume we have 2
> > interleaved
> > +                   * attributes A and B.  The data is organized like this:
> > +                   *
> > +                   *       Stride    EltSize
> > +                   *        _,,_        ,
> > +                   *       /    \      / \
> > +                   *    A: ---   ---   ---   ---   ---   ---
> > +                   *    B:    ---   ---   ---   ---   ---   ---
> > +                   *
> > +                   *       |===== 4 elts ======|  (4-1) * Stride + EltSize
> > +                   *
> > +                   * max_index - min_index gives the number of elements
> > that
> > +                   * will be referenced.  Say we're drawing 4 elements.
> > On
> > +                   * the first three, we need the full stride in order to
> > get
> > +                   * to the next element.  But on the last, we only want
> > the
> > +                   * element size, since we don't actually read the other
> > +                   * interleaved vertex attributes stored beyond that.
> > +                   */
> >                    offset = buffer->offset + min_index * buffer->stride;
> > -                  size = (buffer->stride * (max_index - min_index) +
> > +                  size = (buffer->stride * MAX2(max_index - min_index -
> > 1, 0) +
> >                            glarray->_ElementSize);
> >
> 
> I'm not so sure about this new formula.  Looking at the docs on
> glDrawRangeElements, the index range is inclusive on both ends meaning that
> the number of elements drawn is max_index - min_index + 1.  If that's the
> case, then the old formula was correct.  Are we using a half-open interval
> in mesa?

Yes, you're both right...I got tripped up by seeing (max_index - min_index) instead of (elements = max_index - min_index + 1) and then (elements - 1).

Looking over the code, everything sure looks correct.

Which leaves the question of why both Borderlands: TPS and Serious Sam 3 have all kinds of stalling with the current code, and work fine with no stalls with the patch.

I wonder if the games are incorrectly translating DirectX's DrawIndexedPrimitive call, which takes MinIndex and NumVertices, where NumVertices is apparently supposed to be (Max-Min+1).  I suppose I could ask...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-stable/attachments/20141015/23f880e0/attachment-0001.sig>


More information about the mesa-stable mailing list