[Mesa-dev] [PATCH v3 04/16] i965/draw: Delay when we get the bo for vertex buffers
Kenneth Graunke
kenneth at whitecape.org
Mon May 23 08:57:57 UTC 2016
On Monday, May 23, 2016 1:35:31 AM PDT Jason Ekstrand wrote:
> On Mon, May 23, 2016 at 1:19 AM, Kenneth Graunke <kenneth at whitecape.org>
> wrote:
>
> > On Friday, May 20, 2016 4:53:15 PM PDT Jason Ekstrand wrote:
> > > The previous code got the BO the first time we encountered it. However,
> > > this can potentially lead to problems if the BO is used for multiple
> > arrays
> > > with the same buffer object because the range we declare as busy may not
> > be
> > > quite right. By delaying the call to intel_bufferobj_buffer, we can
> > ensure
> > > that we have the full range for the given buffer.
> > >
> > > Cc: "11.1 11.2" <mesa-stable at lists.freedesktop.org>
> > > Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
> > > ---
> > > src/mesa/drivers/dri/i965/brw_draw_upload.c | 71 +++++++++++++++++++
> > +---------
> > > 1 file changed, 49 insertions(+), 22 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > b/src/mesa/drivers/
> > dri/i965/brw_draw_upload.c
> > > index 3ec37f8..0a7725d 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > > @@ -453,6 +453,11 @@ brw_prepare_vertices(struct brw_context *brw)
> > > if (brw->vb.nr_buffers)
> > > return;
> > >
> > > + /* The range of data in a given buffer represented as [min, max) */
> > > + struct intel_buffer_object *enabled_buffer[VERT_ATTRIB_MAX];
> > > + uint32_t buffer_range_start[VERT_ATTRIB_MAX];
> > > + uint32_t buffer_range_end[VERT_ATTRIB_MAX];
> > > +
> > > for (i = j = 0; i < brw->vb.nr_enabled; i++) {
> > > struct brw_vertex_element *input = brw->vb.enabled[i];
> > > const struct gl_client_array *glarray = input->glarray;
> > > @@ -460,12 +465,31 @@ brw_prepare_vertices(struct brw_context *brw)
> > > if (_mesa_is_bufferobj(glarray->BufferObj)) {
> > > struct intel_buffer_object *intel_buffer =
> > > intel_buffer_object(glarray->BufferObj);
> > > - unsigned k;
> > > +
> > > + const uint32_t offset = (uintptr_t)glarray->Ptr;
> > > +
> > > + uint32_t start, range;
> > > + if (glarray->InstanceDivisor) {
> > > + start = offset;
> > > + range = (glarray->StrideB * ((brw->num_instances /
> > > + glarray->InstanceDivisor) - 1)
> > +
> > > + glarray->_ElementSize);
> > > + } else {
> > > + if (!brw->vb.index_bounds_valid) {
> > > + start = 0;
> > > + range = intel_buffer->Base.Size;
> > > + } else {
> > > + start = offset + min_index * glarray->StrideB;
> > > + range = (glarray->StrideB * (max_index - min_index) +
> > > + glarray->_ElementSize);
> > > + }
> > > + }
> > >
> > > /* If we have a VB set to be uploaded for this buffer object
> > > * already, reuse that VB state so that we emit fewer
> > > * relocations.
> > > */
> > > + unsigned k;
> > > for (k = 0; k < i; k++) {
> > > const struct gl_client_array *other =
> > brw->vb.enabled[k]->glarray;
> > > if (glarray->BufferObj == other->BufferObj &&
> > > @@ -475,6 +499,9 @@ brw_prepare_vertices(struct brw_context *brw)
> > > {
> > > input->buffer = brw->vb.enabled[k]->buffer;
> > > input->offset = glarray->Ptr - other->Ptr;
> > > +
> > > + buffer_range_start[k] = MIN2(buffer_range_start[k],
> > start);
> > > + buffer_range_end[k] = MAX2(buffer_range_end[k], start +
> > range);
> >
> > Hey Jason,
> >
> > This patch should work, but I wonder if this is the solution we want.
> >
> > You've found a real bug - we go look for an existing VERTEX_BUFFER_STATE
> > entry that could be reused. If we found one, we wouldn't call
> > intel_bufferobj_buffer (and thus mark_buffer_gpu_usage), so we failed to
> > update the busy-tracking information.
> >
> > However, rather than widening the existing range to MIN(starts),
> > MAX(ends), I wonder if we shouldn't just mark the new subrange busy.
> >
> > I don't know how common this is, but consider the case where we had:
> >
> > Existing range: BO #1, [0, 100]
> > New range: BO #1, [1000, 1100].
> >
> > Your code would widen the busy range to [0, 1100]. Yes, technically,
> > that's how we set up the VERTEX_BUFFER_STATE. But the GPU should only
> > ever read from the first or second range - never [101, 999].
> >
> > The point of reusing an existing VERTEX_BUFFER_STATE is supposedly to
> > reduce relocations...but it seems unfortunate to expand busy ranges in
> > order to do that, as stalls are way more expensive than relocations.
> >
> > It seems like we could solve this bug by simply adding an
> > intel_bufferobj_buffer call:
> >
> > for (k = 0; k < i; k++) {
> > const struct gl_client_array *other = brw->vb.enabled[k]->glarray;
> > if (glarray->BufferObj == other->BufferObj &&
> > glarray->StrideB == other->StrideB &&
> > glarray->InstanceDivisor == other->InstanceDivisor &&
> > (uintptr_t)(glarray->Ptr - other->Ptr) < glarray->StrideB)
> > {
> > input->buffer = brw->vb.enabled[k]->buffer;
> > input->offset = glarray->Ptr - other->Ptr;
> > --NEW!-> intel_bufferobj_buffer(brw, intel_buffer, offset, size);
> > break;
> > }
> > }
> >
> > I suppose we'd still need to move the offset/size calculations earlier,
> > like you've done in this patch.
> >
> > Thoughts?
> >
>
> Perhaps? I did think about that but it seemed like a lot of work. Also, I
> looked at intel_bufferobj_buffer and found out that it just uses an
> interval so as far as ranges go it works out exactly the same.
Whoops. Yeah, I'd forgotten that mark_buffer_gpu_usage just does
MIN/MAX anyway. So yeah...my comment is rather pointless. If that case
is actually important (and I have no evidence that it is), then we'd
have even more things to fix, first.
> The real reason for the move wasn't the call to intel_bufferobj_buffer but
> so that we could get the correct size to set in VERTEX_BUFFER_STATE. We
> could emit more VB states to solve that problem and, TBH, that might not be
> a bad idea. I'm not really sure.
> --Jason
Right. At any rate, this seems reasonable.
I'm trying to convince myself that the new arrays are initialized in
all cases. I may have to apply the patch and re-read it tomorrow.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160523/1a6b4ed5/attachment.sig>
More information about the mesa-dev
mailing list