<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 19, 2016 at 7:48 AM, Iago Toral <span dir="ltr"><<a href="mailto:itoral@igalia.com" target="_blank">itoral@igalia.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Thu, 2016-05-19 at 00:21 -0700, Jason Ekstrand wrote:<br>
> The previous code got the BO the first time we encountered it. However,<br>
> this can lead to problems if multiple arrays share the same buffer object,<br>
> because the range we declare as busy may then not cover every array.<br>
> By delaying the call to intel_bufferobj_buffer, we can ensure<br>
> that we have the full range for the given buffer.<br>
><br>
> Cc: "10.2" <<a href="mailto:mesa-stable@lists.freedesktop.org">mesa-stable@lists.freedesktop.org</a>><br>
> ---<br>
> src/mesa/drivers/dri/i965/brw_draw_upload.c | 71 ++++++++++++++++++++---------<br>
> 1 file changed, 49 insertions(+), 22 deletions(-)<br>
><br>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c<br>
> index 3ec37f8..0a7725d 100644<br>
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c<br>
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c<br>
> @@ -453,6 +453,11 @@ brw_prepare_vertices(struct brw_context *brw)<br>
> if (brw->vb.nr_buffers)<br>
> return;<br>
><br>
> + /* The range of data in a given buffer represented as [min, max) */<br>
> + struct intel_buffer_object *enabled_buffer[VERT_ATTRIB_MAX];<br>
> + uint32_t buffer_range_start[VERT_ATTRIB_MAX];<br>
> + uint32_t buffer_range_end[VERT_ATTRIB_MAX];<br>
> +<br>
> for (i = j = 0; i < brw->vb.nr_enabled; i++) {<br>
> struct brw_vertex_element *input = brw->vb.enabled[i];<br>
> const struct gl_client_array *glarray = input->glarray;<br>
> @@ -460,12 +465,31 @@ brw_prepare_vertices(struct brw_context *brw)<br>
> if (_mesa_is_bufferobj(glarray->BufferObj)) {<br>
> struct intel_buffer_object *intel_buffer =<br>
> intel_buffer_object(glarray->BufferObj);<br>
> - unsigned k;<br>
> +<br>
> + const uint32_t offset = (uintptr_t)glarray->Ptr;<br>
<br>
</div></div>Should we use uint64_t instead or do we know that these offsets need to<br>
be within a 32-bit address?<br></blockquote><div><br></div><div>I think they do need to fit in 32 bits at the moment, because we use 32-bit offsets everywhere. On BDW+ we should probably move to 64 bits, but I think that belongs in a separate patch.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="HOEnZb"><div class="h5"><br>
> + uint32_t start, range;<br>
> + if (glarray->InstanceDivisor) {<br>
> + start = offset;<br>
> + range = (glarray->StrideB * ((brw->num_instances /<br>
> + glarray->InstanceDivisor) - 1) +<br>
> + glarray->_ElementSize);<br>
> + } else {<br>
> + if (!brw->vb.index_bounds_valid) {<br>
> + start = 0;<br>
> + range = intel_buffer->Base.Size;<br>
> + } else {<br>
> + start = offset + min_index * glarray->StrideB;<br>
> + range = (glarray->StrideB * (max_index - min_index) +<br>
> + glarray->_ElementSize);<br>
> + }<br>
> + }<br>
><br>
> /* If we have a VB set to be uploaded for this buffer object<br>
> * already, reuse that VB state so that we emit fewer<br>
> * relocations.<br>
> */<br>
> + unsigned k;<br>
> for (k = 0; k < i; k++) {<br>
> const struct gl_client_array *other = brw->vb.enabled[k]->glarray;<br>
> if (glarray->BufferObj == other->BufferObj &&<br>
> @@ -475,6 +499,9 @@ brw_prepare_vertices(struct brw_context *brw)<br>
> {<br>
> input->buffer = brw->vb.enabled[k]->buffer;<br>
> input->offset = glarray->Ptr - other->Ptr;<br>
> +<br>
> + buffer_range_start[input->buffer] = MIN2(buffer_range_start[input->buffer], start);<br>
> + buffer_range_end[input->buffer] = MAX2(buffer_range_end[input->buffer], start + range);<br>
> break;<br>
> }<br>
> }<br>
> @@ -482,29 +509,13 @@ brw_prepare_vertices(struct brw_context *brw)<br>
> struct brw_vertex_buffer *buffer = &brw->vb.buffers[j];<br>
><br>
> /* Named buffer object: Just reference its contents directly. */<br>
> - buffer->offset = (uintptr_t)glarray->Ptr;<br>
> + buffer->offset = offset;<br>
> buffer->stride = glarray->StrideB;<br>
> buffer->step_rate = glarray->InstanceDivisor;<br>
><br>
> - uint32_t offset, size;<br>
> - if (glarray->InstanceDivisor) {<br>
> - offset = buffer->offset;<br>
> - size = (buffer->stride * ((brw->num_instances /<br>
> - glarray->InstanceDivisor) - 1) +<br>
> - glarray->_ElementSize);<br>
> - } else {<br>
> - if (!brw->vb.index_bounds_valid) {<br>
> - offset = 0;<br>
> - size = intel_buffer->Base.Size;<br>
> - } else {<br>
> - offset = buffer->offset + min_index * buffer->stride;<br>
> - size = (buffer->stride * (max_index - min_index) +<br>
> - glarray->_ElementSize);<br>
> - }<br>
> - }<br>
> - buffer->bo = intel_bufferobj_buffer(brw, intel_buffer,<br>
> - offset, size);<br>
> - drm_intel_bo_reference(buffer->bo);<br>
> + enabled_buffer[j] = intel_buffer;<br>
> + buffer_range_start[j] = start;<br>
> + buffer_range_end[j] = start + range;<br>
><br>
> input->buffer = j++;<br>
> input->offset = 0;<br>
> @@ -519,7 +530,7 @@ brw_prepare_vertices(struct brw_context *brw)<br>
> * probably a service to the poor programmer to do so rather than<br>
> * trying to just not render.<br>
> */<br>
> - assert(input->offset < brw->vb.buffers[input->buffer].bo->size);<br>
> + assert(input->offset < intel_buffer->Base.Size);<br>
> } else {<br>
> /* Queue the buffer object up to be uploaded in the next pass,<br>
> * when we've decided if we're doing interleaved or not.<br>
> @@ -554,6 +565,22 @@ brw_prepare_vertices(struct brw_context *brw)<br>
> }<br>
> }<br>
><br>
> + /* Now that we've set up all of the buffers, we walk through and reference<br>
> + * each of them. We do this late so that we get the right size in each<br>
> + * buffer and don't reference too little data.<br>
> + */<br>
> + for (i = 0; i < j; i++) {<br>
> + struct brw_vertex_buffer *buffer = &brw->vb.buffers[i];<br>
> + if (buffer->bo)<br>
> + continue;<br>
> +<br>
> + const uint32_t start = buffer_range_start[i];<br>
> + const uint32_t range = buffer_range_end[i] - buffer_range_start[i];<br>
> +<br>
> + buffer->bo = intel_bufferobj_buffer(brw, enabled_buffer[i], start, range);<br>
> + drm_intel_bo_reference(buffer->bo);<br>
> + }<br>
> +<br>
> /* If we need to upload all the arrays, then we can trim those arrays to<br>
> * only the used elements [min_index, max_index] so long as we adjust all<br>
> * the values used in the 3DPRIMITIVE i.e. by setting the vertex bias.<br>
<br>
<br>
</div></div></blockquote></div><br></div></div>
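For reference, here is a minimal standalone sketch of the range-accumulation idea in the patch: each array's [start, end) range is merged into a per-buffer slot, and only the final union would be handed to something like intel_bufferobj_buffer. The names busy_range, range_tracker, track_range and MAX_BUFFERS are hypothetical and not part of Mesa. The sketch also simplifies by matching on the buffer object alone, whereas the patch reuses a VB only when stride and instance divisor match too.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BUFFERS 8   /* hypothetical cap, stands in for VERT_ATTRIB_MAX */

/* Half-open byte range [start, end) read from one buffer object. */
struct busy_range {
   const void *bo;      /* hypothetical handle identifying the buffer object */
   uint32_t start;
   uint32_t end;
};

struct range_tracker {
   struct busy_range ranges[MAX_BUFFERS];
   unsigned count;
};

/* Record that [start, start + size) of `bo` will be read.
 * If `bo` already has a slot, widen that slot to the union of both ranges;
 * otherwise open a new slot. Returns the slot index.
 */
static unsigned
track_range(struct range_tracker *t, const void *bo,
            uint32_t start, uint32_t size)
{
   const uint32_t end = start + size;

   for (unsigned i = 0; i < t->count; i++) {
      if (t->ranges[i].bo == bo) {
         if (start < t->ranges[i].start)
            t->ranges[i].start = start;
         if (end > t->ranges[i].end)
            t->ranges[i].end = end;
         return i;
      }
   }

   assert(t->count < MAX_BUFFERS);
   t->ranges[t->count] = (struct busy_range){ bo, start, end };
   return t->count++;
}
```

The point of the design is that the merge is indexed by buffer slot rather than by array, so a second pass over the slots sees one complete range per buffer, as the final loop in the patch does.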