[Mesa-dev] [PATCH] i965: Fix ARB_indirect_parameters logic.

Manolova, Plamena plamena.manolova at intel.com
Thu Oct 26 11:24:10 UTC 2017


Thank you for reviewing Ken!

On Thu, Oct 26, 2017 at 5:14 AM, Kenneth Graunke <kenneth at whitecape.org>
wrote:

> On Wednesday, October 25, 2017 9:54:46 AM PDT Plamena Manolova wrote:
> > This patch modifies the ARB_indirect_parameters logic in
> > brw_draw_prims, so that our implementation isn't affected if
> > another application attempts to use predicates. Previously we
> > were using a predicate with a DELTAS_EQUAL comparison operation
> > and relying on the MI_PREDICATE_DATA register being 0.
>
> It's unfortunately a bit nastier of an explanation.  How about:
>
>   Our code to initialize MI_PREDICATE_DATA to 0 was incorrect, so we
>   were accidentally using whatever value was written there.  Because
>   the kernel does not initialize the MI_PREDICATE_DATA register on
>   hardware context creation, we might inherit the value from whatever
>   context was last running on the GPU (likely another process).
>
>   The Haswell command parser also does not currently allow us to write
>   the MI_PREDICATE_DATA register.  Rather than fixing this and requiring
>   an updated kernel, we switch to a different approach which uses two
>   predicates (one using SRC_EQUAL and the other DELTAS_EQUAL) that makes
>   no assumption about the states of any of the predicate registers.
>
>
This sounds more accurate, thanks!

> This
> > assumtion is incorrect when another program is using predicates
> > which store data in MI_PREDICATE_DATA. This patch introduces a
> > different approach which uses 2 predicates (one using SRC_EQUAL
> > and the other DELTAS_EQUAL) that makes no assumption about the
> > states of any of the predicate registers.
> >
> > Fixes: piglit.spec.arb_indirect_parameters.tf-count-arrays
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103085
> >
> > Signed-off-by: Plamena Manolova <plamena.manolova at intel.com>
> > CC: Kenneth Graunke <kenneth at whitecape.org>
> > ---
> >  src/mesa/drivers/dri/i965/brw_draw.c | 75
> +++++++++++++++++++++---------------
> >  1 file changed, 44 insertions(+), 31 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
> b/src/mesa/drivers/dri/i965/brw_draw.c
> > index 80d4891f6f..81465c79fb 100644
> > --- a/src/mesa/drivers/dri/i965/brw_draw.c
> > +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> > @@ -866,7 +866,6 @@ brw_draw_prims(struct gl_context *ctx,
> >     struct brw_context *brw = brw_context(ctx);
> >     const struct gl_vertex_array **arrays = ctx->Array._DrawArrays;
> >     int predicate_state = brw->predicate.state;
> > -   int combine_op = MI_PREDICATE_COMBINEOP_SET;
> >     struct brw_transform_feedback_object *xfb_obj =
> >        (struct brw_transform_feedback_object *) gl_xfb_obj;
> >
> > @@ -910,49 +909,63 @@ brw_draw_prims(struct gl_context *ctx,
> >      * to it.
> >      */
> >
> > -    if (brw->draw.draw_params_count_bo &&
> > -        predicate_state == BRW_PREDICATE_STATE_USE_BIT) {
> > -      /* We need to empty the MI_PREDICATE_DATA register since it might
> > -       * already be set.
> > -       */
> > -
> > -      BEGIN_BATCH(4);
> > -      OUT_BATCH(MI_PREDICATE_DATA);
> > -      OUT_BATCH(0u);
> > -      OUT_BATCH(MI_PREDICATE_DATA + 4);
> > -      OUT_BATCH(0u);
> > -      ADVANCE_BATCH();
> > -
> > -      /* We need to combine the results of both predicates.*/
> > -      combine_op = MI_PREDICATE_COMBINEOP_AND;
> > -   }
> > -
> >     for (i = 0; i < nr_prims; i++) {
> >        /* Implementation of ARB_indirect_parameters via predicates */
> >        if (brw->draw.draw_params_count_bo) {
> > -         struct brw_bo *draw_id_bo = NULL;
> > -         uint32_t draw_id_offset;
> > -
> > -         intel_upload_data(brw, &prims[i].draw_id, 4, 4, &draw_id_bo,
> > -                           &draw_id_offset);
> > -
> >           brw_emit_pipe_control_flush(brw, PIPE_CONTROL_FLUSH_ENABLE);
> >
> > +         /*
>
> Extra line here, should be /* Upload the ... (also below).
>
> > +          * Upload the current draw count from the draw parameters
> buffer to
> > +          * MI_PREDICATE_SRC0
> > +          */
> >           brw_load_register_mem(brw, MI_PREDICATE_SRC0,
> >                                 brw->draw.draw_params_count_bo,
> >                                 brw->draw.draw_params_count_offset);
> > -         brw_load_register_mem(brw, MI_PREDICATE_SRC1, draw_id_bo,
> > -                               draw_id_offset);
> > -
> > +         /*
> > +          * Upload the id of the current primitive to MI_PREDICATE_SRC1.
> > +          */
> > +         brw_load_register_imm64(brw, MI_PREDICATE_SRC1,
> prims[i].draw_id);
> > +
> > +         /*
> > +          * This calculates MI_PREDICATE_SRC0 - MI_PREDICATE_SRC1 and
> stores it
> > +          * in MI_PREDICATE_DATA without modifying the Predicate State
> Bit since
> > +          * we don't need the comparision result (it's
> > +          * MI_PREDICATE_SRC0 == MI_PREDICATE_SRC0).
>
> MI_PREDICATE_SRC0 == MI_PREDICATE_SRC1, right?
>
>
Yeah, this is just a typo.

> +          */
> >           BEGIN_BATCH(1);
> >           OUT_BATCH(GEN7_MI_PREDICATE |
> > -                   MI_PREDICATE_LOADOP_LOADINV | combine_op |
> > -                   MI_PREDICATE_COMPAREOP_DELTAS_EQUAL);
> > +                   MI_PREDICATE_LOADOP_KEEP |
> MI_PREDICATE_COMBINEOP_SET |
> > +                   MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> >           ADVANCE_BATCH();
> >
> > -         brw->predicate.state = BRW_PREDICATE_STATE_USE_BIT;
> >
> > -         brw_bo_unreference(draw_id_bo);
> > +         /* We set both MI_PREDICATE_SRC0 and MI_PREDICATE_SRC1 to 0 */
> > +         brw_load_register_imm64(brw, MI_PREDICATE_SRC0, 0);
> > +         brw_load_register_imm64(brw, MI_PREDICATE_SRC1, 0);
> > +
> > +         BEGIN_BATCH(1);
> > +         if (brw->predicate.state != BRW_PREDICATE_STATE_USE_BIT) {
> > +            /*
> > +             * This calculates MI_PREDICATE_SRC0 - MI_PREDICATE_SRC1
> which
> > +             * is 0, compares it to the value stored in
> MI_PREDICATE_DATA
> > +             * (MI_PREDICATE_DATA == 0) and sets the Predicate State
> Bit.
> > +             */
> > +            OUT_BATCH(GEN7_MI_PREDICATE |
> > +                      MI_PREDICATE_LOADOP_LOADINV |
> MI_PREDICATE_COMBINEOP_SET |
> > +                      MI_PREDICATE_COMPAREOP_DELTAS_EQUAL);
>
> I'm not quite convinced that this works...
>
> MI_PREDICATE_DATA is initialized with (DrawCount - DrawID).  Here,
> you're checking whether MI_PREDICATE_DATA == 0, or in other words
> DrawID == DrawCount.  You're setting the predicate bit if the value
>
> Let's say you had a draw count of 4:
>
>   Draw Count (SRC0) | Draw ID (SRC1) | MI_PREDICATE_DATA | Predicate Bit
>   ----------------------------------------------------------------------
>                   4 |              0 |                 4 | True
>                   4 |              1 |                 3 | True
>                   4 |              2 |                 2 | True
>                   4 |              3 |                 1 | True
>                   4 |              4 |                 0 | False
>                   4 |              5 |                -1 | True
>                   4 |              6 |                -2 | True
>
> So, the predicate properly flips to false when Draw ID == Draw Count,
> but it doesn't stay false if Draw ID > Draw Count.
>
> I think what you need to do is use MI_PREDICATE_COMBINEOP_AND, so it'll
> end up being (True & True) until the end, then (True & False) on draw 4,
> and (False & True) on draws >= 5.
>
> In order to make that work, you'll need to do:
>
>          unsigned combine_op = MI_PREDICATE_COMBINEOP_AND;
>          if (i == 0 && brw->predicate.state != BRW_PREDICATE_STATE_USE_BIT)
>             combine_op = MI_PREDICATE_COMBINEOP_SET;
>
> so we do SET on the first iteration, unless we're combining with an
> existing predicate.
>
>
After taking another look you're absolutely right. For some reason
propagating the predicates like this didn't occur to me.
If we use this approach couldn't we get away with just having one predicate
per draw call? Maybe something like this:

         unsigned combine_op = MI_PREDICATE_COMBINEOP_AND;
         if (i == 0 && brw->predicate.state != BRW_PREDICATE_STATE_USE_BIT)
            combine_op = MI_PREDICATE_COMBINEOP_SET;

         BEGIN_BATCH(1);
         OUT_BATCH(GEN7_MI_PREDICATE |
                   MI_PREDICATE_LOADOP_LOADINV | combine_op |
                   MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
         ADVANCE_BATCH();

Then we can ditch the one using MI_PREDICATE_COMPAREOP_DELTAS_EQUAL?

> +         } else {
> > +            /*
> > +             * We should be able to use conditional rendering with
> indirect
> > +             * parameters, so we use MI_PREDICATE_COMBINEOP_AND to
> combine the
> > +             * result with the one stored in the Predicate State Bit.
> > +             */
> > +            OUT_BATCH(GEN7_MI_PREDICATE |
> > +                      MI_PREDICATE_LOADOP_LOADINV |
> MI_PREDICATE_COMBINEOP_AND |
> > +                      MI_PREDICATE_COMPAREOP_DELTAS_EQUAL);
> > +         }
> > +         ADVANCE_BATCH();
> > +
> > +         brw->predicate.state = BRW_PREDICATE_STATE_USE_BIT;
> >        }
> >
> >        brw_draw_single_prim(ctx, arrays, &prims[i], i, xfb_obj, stream,
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20171026/baf09c84/attachment-0001.html>


More information about the mesa-dev mailing list