<div dir="ltr">Thank you for reviewing Ken!<br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 26, 2017 at 5:14 AM, Kenneth Graunke <span dir="ltr"><<a href="mailto:kenneth@whitecape.org" target="_blank">kenneth@whitecape.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-m_7774770774541908125gmail-">On Wednesday, October 25, 2017 9:54:46 AM PDT Plamena Manolova wrote:<br>
> This patch modifies the ARB_indirect_parameters logic in<br>
> brw_draw_prims, so that our implementation isn't affected if<br>
> another application attempts to use predicates. Previously we<br>
> were using a predicate with a DELTAS_EQUAL comparison operation<br>
> and relying on the MI_PREDICATE_DATA register being 0.<br>
<br>
</span>It's unfortunately a bit nastier of an explanation. How about:<br>
<br>
Our code to initialize MI_PREDICATE_DATA to 0 was incorrect, so we<br>
were accidentally using whatever value was written there. Because<br>
the kernel does not initialize the MI_PREDICATE_DATA register on<br>
hardware context creation, we might inherit the value from whatever<br>
context was last running on the GPU (likely another process).<br>
<br>
The Haswell command parser also does not currently allow us to write<br>
the MI_PREDICATE_DATA register. Rather than fixing this and requiring<br>
an updated kernel, we switch to a different approach which uses two<br>
<div><div class="gmail-m_7774770774541908125gmail-h5"> predicates (one using SRC_EQUAL and the other DELTAS_EQUAL) that makes<br>
no assumption about the states of any of the predicate registers.<br>
<br></div></div></blockquote><div><br></div><div>This sounds more accurate, thanks!</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="gmail-m_7774770774541908125gmail-h5">
> This<br>
> assumtion is incorrect when another program is using predicates<br>
> which store data in MI_PREDICATE_DATA. This patch introduces a<br>
> different approach which uses 2 predicates (one using SRC_EQUAL<br>
> and the other DELTAS_EQUAL) that makes no assumption about the<br>
> states of any of the predicate registers.<br>
><br>
> Fixes: piglit.spec.arb_indirect_param<wbr>eters.tf-count-arrays<br>
> Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/s<wbr>how_bug.cgi?id=103085</a><br>
><br>
> Signed-off-by: Plamena Manolova <<a href="mailto:plamena.manolova@intel.com" target="_blank">plamena.manolova@intel.com</a>><br>
> CC: Kenneth Graunke <<a href="mailto:kenneth@whitecape.org" target="_blank">kenneth@whitecape.org</a>><br>
> ---<br>
> src/mesa/drivers/dri/i965/brw_<wbr>draw.c | 75 +++++++++++++++++++++---------<wbr>------<br>
> 1 file changed, 44 insertions(+), 31 deletions(-)<br>
><br>
> diff --git a/src/mesa/drivers/dri/i965/br<wbr>w_draw.c b/src/mesa/drivers/dri/i965/br<wbr>w_draw.c<br>
> index 80d4891f6f..81465c79fb 100644<br>
> --- a/src/mesa/drivers/dri/i965/br<wbr>w_draw.c<br>
> +++ b/src/mesa/drivers/dri/i965/br<wbr>w_draw.c<br>
> @@ -866,7 +866,6 @@ brw_draw_prims(struct gl_context *ctx,<br>
> struct brw_context *brw = brw_context(ctx);<br>
> const struct gl_vertex_array **arrays = ctx->Array._DrawArrays;<br>
> int predicate_state = brw->predicate.state;<br>
> - int combine_op = MI_PREDICATE_COMBINEOP_SET;<br>
> struct brw_transform_feedback_object *xfb_obj =<br>
> (struct brw_transform_feedback_object *) gl_xfb_obj;<br>
><br>
> @@ -910,49 +909,63 @@ brw_draw_prims(struct gl_context *ctx,<br>
> * to it.<br>
> */<br>
><br>
> - if (brw->draw.draw_params_count_b<wbr>o &&<br>
> - predicate_state == BRW_PREDICATE_STATE_USE_BIT) {<br>
> - /* We need to empty the MI_PREDICATE_DATA register since it might<br>
> - * already be set.<br>
> - */<br>
> -<br>
> - BEGIN_BATCH(4);<br>
> - OUT_BATCH(MI_PREDICATE_DATA);<br>
> - OUT_BATCH(0u);<br>
> - OUT_BATCH(MI_PREDICATE_DATA + 4);<br>
> - OUT_BATCH(0u);<br>
> - ADVANCE_BATCH();<br>
> -<br>
> - /* We need to combine the results of both predicates.*/<br>
> - combine_op = MI_PREDICATE_COMBINEOP_AND;<br>
> - }<br>
> -<br>
> for (i = 0; i < nr_prims; i++) {<br>
> /* Implementation of ARB_indirect_parameters via predicates */<br>
> if (brw->draw.draw_params_count_b<wbr>o) {<br>
> - struct brw_bo *draw_id_bo = NULL;<br>
> - uint32_t draw_id_offset;<br>
> -<br>
> - intel_upload_data(brw, &prims[i].draw_id, 4, 4, &draw_id_bo,<br>
> - &draw_id_offset);<br>
> -<br>
> brw_emit_pipe_control_flush(b<wbr>rw, PIPE_CONTROL_FLUSH_ENABLE);<br>
><br>
> + /*<br>
<br>
</div></div>Extra line here, should be /* Upload the ... (also below).<br>
<span class="gmail-m_7774770774541908125gmail-"><br>
> + * Upload the current draw count from the draw parameters buffer to<br>
> + * MI_PREDICATE_SRC0<br>
> + */<br>
> brw_load_register_mem(brw, MI_PREDICATE_SRC0,<br>
> brw->draw.draw_params_count_b<wbr>o,<br>
> brw->draw.draw_params_count_o<wbr>ffset);<br>
> - brw_load_register_mem(brw, MI_PREDICATE_SRC1, draw_id_bo,<br>
> - draw_id_offset);<br>
> -<br>
> + /*<br>
> + * Upload the id of the current primitive to MI_PREDICATE_SRC1.<br>
> + */<br>
> + brw_load_register_imm64(brw, MI_PREDICATE_SRC1, prims[i].draw_id);<br>
> +<br>
> + /*<br>
> + * This calculates MI_PREDICATE_SRC0 - MI_PREDICATE_SRC1 and stores it<br>
> + * in MI_PREDICATE_DATA without modifying the Predicate State Bit since<br>
> + * we don't need the comparision result (it's<br>
> + * MI_PREDICATE_SRC0 == MI_PREDICATE_SRC0).<br>
<br>
</span>MI_PREDICATE_SRC0 == MI_PREDICATE_SRC1, right?<br>
<span class="gmail-m_7774770774541908125gmail-"><br></span></blockquote><div> </div><div>Yeah, this is just a typo. </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-m_7774770774541908125gmail-">
> + */<br>
> BEGIN_BATCH(1);<br>
> OUT_BATCH(GEN7_MI_PREDICATE |<br>
> - MI_PREDICATE_LOADOP_LOADINV | combine_op |<br>
> - MI_PREDICATE_COMPAREOP_<wbr>DELTAS_EQUAL);<br>
> + MI_PREDICATE_LOADOP_KEEP | MI_PREDICATE_COMBINEOP_SET |<br>
> + MI_PREDICATE_COMPAREOP_SRCS_E<wbr>QUAL);<br>
> ADVANCE_BATCH();<br>
><br>
> - brw->predicate.state = BRW_PREDICATE_STATE_USE_BIT;<br>
><br>
> - brw_bo_unreference(draw_id_<wbr>bo);<br>
> + /* We set both MI_PREDICATE_SRC0 and MI_PREDICATE_SRC1 to 0 */<br>
> + brw_load_register_imm64(brw, MI_PREDICATE_SRC0, 0);<br>
> + brw_load_register_imm64(brw, MI_PREDICATE_SRC1, 0);<br>
> +<br>
> + BEGIN_BATCH(1);<br>
> + if (brw->predicate.state != BRW_PREDICATE_STATE_USE_BIT) {<br>
> + /*<br>
> + * This calculates MI_PREDICATE_SRC0 - MI_PREDICATE_SRC1 which<br>
> + * is 0, compares it to the value stored in MI_PREDICATE_DATA<br>
> + * (MI_PREDICATE_DATA == 0) and sets the Predicate State Bit.<br>
> + */<br>
> + OUT_BATCH(GEN7_MI_PREDICATE |<br>
> + MI_PREDICATE_LOADOP_LOADINV | MI_PREDICATE_COMBINEOP_SET |<br>
> + MI_PREDICATE_COMPAREOP_DELTAS_<wbr>EQUAL);<br>
<br>
</span>I'm not quite convinced that this works...<br>
<br>
MI_PREDICATE_DATA is initialized with (DrawCount - DrawID). Here,<br>
you're checking whether MI_PREDICATE_DATA == 0, or in other words<br>
DrawID == DrawCount. You're setting the predicate bit if the value<br>
<br>
Let's say you had a draw count of 4:<br>
<br>
Draw Count (SRC0) | Draw ID (SRC1) | MI_PREDICATE_DATA | Predicate Bit<br>
------------------------------<wbr>------------------------------<wbr>----------<br>
4 | 0 | 4 | True<br>
4 | 1 | 3 | True<br>
4 | 2 | 2 | True<br>
4 | 3 | 1 | True<br>
4 | 4 | 0 | False<br>
4 | 5 | -1 | True<br>
4 | 6 | -2 | True<br>
<br>
So, the predicate properly flips to false when Draw ID == Draw Count,<br>
but it doesn't stay false if Draw ID > Draw Count.<br>
<br>
I think what you need to do is use MI_PREDICATE_COMBINEOP_AND, so it'll<br>
end up being (True & True) until the end, then (True & False) on draw 4,<br>
and (False & True) on draws >= 5.<br>
<br>
In order to make that work, you'll need to do:<br>
<br>
unsigned combine_op = MI_PREDICATE_COMBINEOP_AND;<br>
if (i == 0 && brw->predicate.state != BRW_PREDICATE_STATE_USE_BIT)<br>
combine_op = MI_PREDICATE_COMBINEOP_SET;<br>
<br>
so we do SET on the first iteration, unless we're combining with an<br>
existing predicate.<br>
<div class="gmail-m_7774770774541908125gmail-HOEnZb"><div class="gmail-m_7774770774541908125gmail-h5"><br></div></div></blockquote><div><br></div><div>After taking another look you're absolutely right. For some reason propagating the predicates like this didn't occur to me.</div><div>If we use this approach couldn't we get away with just having one predicate per draw call? Maybe something like this:</div><div><br></div><div> unsigned combine_op = MI_PREDICATE_COMBINEOP_AND;<br> if (i == 0 && brw->predicate.state != BRW_PREDICATE_STATE_USE_BIT)<br> combine_op = MI_PREDICATE_COMBINEOP_SET;</div><div><br></div><div> BEGIN_BATCH(1);</div><div> OUT_BATCH(GEN7_MI_PREDICATE |</div><div> MI_PREDICATE_LOADOP_LOADINV | combine_op |</div><div> MI_PREDICATE_COMPAREOP_SRCS_EQUAL);</div><div> ADVANCE_BATCH();</div><div> </div><div>Then we can ditch the one using MI_PREDICATE_COMPAREOP_DELTAS_<wbr>EQUAL?</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail-m_7774770774541908125gmail-HOEnZb"><div class="gmail-m_7774770774541908125gmail-h5">
> + } else {<br>
> + /*<br>
> + * We should be able to use conditional rendering with indirect<br>
> + * parameters, so we use MI_PREDICATE_COMBINEOP_AND to combine the<br>
> + * result with the one stored in the Predicate State Bit.<br>
> + */<br>
> + OUT_BATCH(GEN7_MI_PREDICATE |<br>
> + MI_PREDICATE_LOADOP_LOADINV | MI_PREDICATE_COMBINEOP_AND |<br>
> + MI_PREDICATE_COMPAREOP_DELTAS_<wbr>EQUAL);<br>
> + }<br>
> + ADVANCE_BATCH();<br>
> +<br>
> + brw->predicate.state = BRW_PREDICATE_STATE_USE_BIT;<br>
> }<br>
><br>
> brw_draw_single_prim(ctx, arrays, &prims[i], i, xfb_obj, stream,<br>
><br>
<br>
</div></div></blockquote></div><br></div></div>