[Mesa-dev] [PATCH] i965: Fix an off-by-1 error in the draw upload code's size calculation.

Ian Romanick idr at freedesktop.org
Wed Oct 15 19:01:32 PDT 2014


On 10/15/2014 10:04 AM, Kenneth Graunke wrote:
> On Wednesday, October 15, 2014 09:50:42 AM Jason Ekstrand wrote:
>> On Tue, Oct 14, 2014 at 4:42 PM, Kenneth Graunke <kenneth at whitecape.org>
>> wrote:
>>
>>> According to INTEL_DEBUG=perf, "Borderlands: The Pre-Sequel" was
>>> stalling on nearly every glBufferSubData call, with very slightly
>>> overlapping busy ranges.
>>>
>>> It turns out the draw upload code was accidentally including an extra
>>> stride's worth of data in the vertex buffer size due to a simple
>>> off-by-one error.  We considered this extra bit of buffer space to be
>>> busy (in use by the GPU), when it was actually idle.
>>>
>>
>> Nice Catch!
>>
>>
>>>
>>> The new diagram should make it easier to understand the formula.  It's
>>> basically what I drew on paper when working through an actual
>>> glDrawRangeElements call.
>>>
>>> Eliminates all glBufferSubData stalls in "Borderlands: The Pre-Sequel."
>>>
>>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>>> Cc: mesa-stable at lists.freedesktop.org
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_draw_upload.c | 22 +++++++++++++++++++++-
>>>  1 file changed, 21 insertions(+), 1 deletion(-)
>>>
>>> No Piglit regressions on Haswell.  This might help Dota 2 and Serious Sam 3
>>> as well, but I haven't checked.
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
>>> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
>>> index 5a12439..6cb653c 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
>>> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
>>> @@ -486,8 +486,28 @@ brw_prepare_vertices(struct brw_context *brw)
>>>                    offset = 0;
>>>                    size = intel_buffer->Base.Size;
>>>                 } else {
>>> +                  /* Compute the size/amount of data referenced by the
>>> GPU.
>>> +                   * If the data is interleaved, StrideB may be larger
>>> than
>>> +                   * _ElementSize.  As an example, assume we have 2
>>> interleaved
>>> +                   * attributes A and B.  The data is organized like this:
>>> +                   *
>>> +                   *       Stride    EltSize
>>> +                   *        _,,_        ,
>>> +                   *       /    \      / \
>>> +                   *    A: ---   ---   ---   ---   ---   ---
>>> +                   *    B:    ---   ---   ---   ---   ---   ---
>>> +                   *
>>> +                   *       |===== 4 elts ======|  (4-1) * Stride + EltSize
>>> +                   *
>>> +                   * max_index - min_index gives the number of elements
>>> that
>>> +                   * will be referenced.  Say we're drawing 4 elements.
>>> On
>>> +                   * the first three, we need the full stride in order to
>>> get
>>> +                   * to the next element.  But on the last, we only want
>>> the
>>> +                   * element size, since we don't actually read the other
>>> +                   * interleaved vertex attributes stored beyond that.
>>> +                   */
>>>                    offset = buffer->offset + min_index * buffer->stride;
>>> -                  size = (buffer->stride * (max_index - min_index) +
>>> +                  size = (buffer->stride * MAX2(max_index - min_index -
>>> 1, 0) +
>>>                            glarray->_ElementSize);
>>>
>>
>> I'm not so sure about this new formula.  Looking at the docs on
>> glDrawRangeElements, the index range is inclusive on both ends meaning that
>> the number of elements drawn is max_index - min_index + 1.  If that's the
>> case, then the old formula was correct.  Are we using a half-open interval
>> in mesa?
> 
> Yes, you're both right...I got tripped up by seeing (max_index - min_index) instead of (elements = max_index - min_index + 1) and then (elements - 1).
> 
> Looking over the code, everything sure looks correct.
> 
> Which leaves the question of why both Borderlands: TPS and Serious Sam 3 have all kinds of stalling with the current code, and work fine with no stalls with the patch.
> 
> I wonder if the games are incorrectly translating DirectX's DrawIndexedPrimitive call, which takes MinIndex and NumVertices, where NumVertices is apparently supposed to be (Max-Min+1).  I suppose I could ask...

That seems very likely.  This may be an optimization / work-around to
enable via driconf.  If two games make that mistake, you can be sure
there are others. :(

> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20141015/b88b38ab/attachment.sig>


More information about the mesa-dev mailing list