[Mesa-stable] [Mesa-dev] [PATCH 2/3 v2] i965: Swap the order of the vertex ID and edge flag attributes

Ben Widawsky ben at bwidawsk.net
Fri Aug 21 22:14:01 PDT 2015


top-post to start from fresh...

Okay, so after a serious attempt at attempting to write something which I think
is "more" correct, I failed. I'm planning to just push your patches (with my
r-b and t-b) after I finish a jenkins run. Unless anyone has arguments.

My biggest confusion is the use of the dummy element - I can't make sense of why
it's needed, but it clearly is.

On Tue, Aug 18, 2015 at 06:27:42PM -0700, Ben Widawsky wrote:
> On Mon, Jul 13, 2015 at 06:01:14PM +0100, Neil Roberts wrote:
> > The edge flag data on Gen6+ is passed through the fixed function hardware as
> > an extra attribute. According to the PRM it must be the last valid
> > VERTEX_ELEMENT structure. However if the vertex ID is also used then another
> > extra element is added to source the VID. This made it so the vertex ID is in
> > the wrong register in the vertex shader and the edge attribute is no longer in
> > the last element.
> 
> It took me forever to find the docs which stated the edge flag must be last...
> could you maybe add a PRM location to the code comment, and/or commit message?
> 
> > 
> > v2: Also implement for BDW+
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84677
> > Cc: "10.6 10.5" <mesa-stable at lists.freedesktop.org>
> > ---
> >  src/mesa/drivers/dri/i965/brw_draw_upload.c  | 30 +++++++--------
> >  src/mesa/drivers/dri/i965/gen8_draw_upload.c | 56 +++++++++++++++++++++-------
> >  2 files changed, 57 insertions(+), 29 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > index 320e40e..67a8dfd 100644
> > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > @@ -787,21 +787,6 @@ static void brw_emit_vertices(struct brw_context *brw)
> >                      ((i * 4) << BRW_VE1_DST_OFFSET_SHIFT));
> >     }
> >  
> > -   if (brw->gen >= 6 && gen6_edgeflag_input) {
> > -      uint32_t format =
> > -         brw_get_vertex_surface_type(brw, gen6_edgeflag_input->glarray);
> > -
> > -      OUT_BATCH((gen6_edgeflag_input->buffer << GEN6_VE0_INDEX_SHIFT) |
> > -                GEN6_VE0_VALID |
> > -                GEN6_VE0_EDGE_FLAG_ENABLE |
> > -                (format << BRW_VE0_FORMAT_SHIFT) |
> > -                (gen6_edgeflag_input->offset << BRW_VE0_SRC_OFFSET_SHIFT));
> > -      OUT_BATCH((BRW_VE1_COMPONENT_STORE_SRC << BRW_VE1_COMPONENT_0_SHIFT) |
> > -                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_1_SHIFT) |
> > -                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
> > -                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
> > -   }
> > -
> >     if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid) {
> >        uint32_t dw0 = 0, dw1 = 0;
> >        uint32_t comp0 = BRW_VE1_COMPONENT_STORE_0;
> > @@ -842,6 +827,21 @@ static void brw_emit_vertices(struct brw_context *brw)
> >        OUT_BATCH(dw1);
> >     }
> >  
> > +   if (brw->gen >= 6 && gen6_edgeflag_input) {
> > +      uint32_t format =
> > +         brw_get_vertex_surface_type(brw, gen6_edgeflag_input->glarray);
> > +
> > +      OUT_BATCH((gen6_edgeflag_input->buffer << GEN6_VE0_INDEX_SHIFT) |
> > +                GEN6_VE0_VALID |
> > +                GEN6_VE0_EDGE_FLAG_ENABLE |
> > +                (format << BRW_VE0_FORMAT_SHIFT) |
> > +                (gen6_edgeflag_input->offset << BRW_VE0_SRC_OFFSET_SHIFT));
> > +      OUT_BATCH((BRW_VE1_COMPONENT_STORE_SRC << BRW_VE1_COMPONENT_0_SHIFT) |
> > +                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_1_SHIFT) |
> > +                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
> > +                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
> > +   }
> > +
> >     ADVANCE_BATCH();
> >  }
> >  
> > diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > index f7d9952..2bac5ff 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > @@ -40,16 +40,25 @@ gen8_emit_vertices(struct brw_context *brw)
> >  {
> >     struct gl_context *ctx = &brw->ctx;
> >     uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
> > +   bool uses_edge_flag;
> >  
> >     brw_prepare_vertices(brw);
> >     brw_prepare_shader_draw_parameters(brw);
> >  
> > +   uses_edge_flag = (ctx->Polygon.FrontMode != GL_FILL ||
> > +                     ctx->Polygon.BackMode != GL_FILL);
> > +
> >     if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid) {
> >        unsigned vue = brw->vb.nr_enabled;
> >  
> > -      WARN_ONCE(brw->vs.prog_data->inputs_read & VERT_BIT_EDGEFLAG,
> > -                "Using VID/IID with edgeflags, need to reorder the "
> > -                "vertex attributes");
> > +      /* The element for the edge flags must always be last, so we have to
> > +       * insert the SGVS before it in that case.
> > +       */
> > +      if (uses_edge_flag) {
> > +         assert(vue > 0);
> > +         vue--;
> > +      }
> > +
> 
> I don't get this. Shouldn't we be pushing the element with the edge flag to the
> end? As an example, suppose nr_enabled = 3. Won't this make us write the SGVs
> out to vue 2, which presumable had something of value before?
> 
> Also, I just noticed:
> "It is INVALID to store both the VertexID and InstanceID in the same
> element/component location within the VUE."
> 
> Aren't we potentially doing that today (and still with your patch)?
> 
> >        WARN_ONCE(vue >= 33,
> >                  "Trying to insert VID/IID past 33rd vertex element, "
> >                  "need to reorder the vertex attrbutes.");
> > @@ -138,7 +147,18 @@ gen8_emit_vertices(struct brw_context *brw)
> >        ADVANCE_BATCH();
> >     }
> >  
> > -   unsigned nr_elements = brw->vb.nr_enabled + brw->vs.prog_data->uses_vertexid;
> > +   /* Normally we don't need an element for the SGVS attribute because the
> > +    * 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an
> > +    * element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if the
> > +    * vertex ID is used then it needs an element for the base vertex buffer.
> > +    * Additionally if there is an edge flag element then the SGVS can't be
> > +    * inserted past that so we need a dummy element to ensure that the edge
> > +    * flag is the last one.
> > +    */
> > +   bool needs_sgvs_element = (brw->vs.prog_data->uses_vertexid ||
> > +                              (brw->vs.prog_data->uses_instanceid &&
> > +                               uses_edge_flag));
> > +   unsigned nr_elements = brw->vb.nr_enabled + needs_sgvs_element;
> >  
> >     /* The hardware allows one more VERTEX_ELEMENTS than VERTEX_BUFFERS,
> >      * presumably for VertexID/InstanceID.
> > @@ -192,6 +212,24 @@ gen8_emit_vertices(struct brw_context *brw)
> >                  (comp3 << BRW_VE1_COMPONENT_3_SHIFT));
> >     }
> >  
> > +   if (needs_sgvs_element) {
> > +      if (brw->vs.prog_data->uses_vertexid) {
> > +         OUT_BATCH(GEN6_VE0_VALID |
> > +                   brw->vb.nr_buffers << GEN6_VE0_INDEX_SHIFT |
> > +                   BRW_SURFACEFORMAT_R32_UINT << BRW_VE0_FORMAT_SHIFT);
> > +         OUT_BATCH((BRW_VE1_COMPONENT_STORE_SRC << BRW_VE1_COMPONENT_0_SHIFT) |
> > +                   (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_1_SHIFT) |
> > +                   (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
> > +                   (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
> > +      } else {
> > +         OUT_BATCH(GEN6_VE0_VALID);
> > +         OUT_BATCH((BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_0_SHIFT) |
> > +                   (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_1_SHIFT) |
> > +                   (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
> > +                   (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
> > +      }
> > +   }
> > +
> >     if (gen6_edgeflag_input) {
> >        uint32_t format =
> >           brw_get_vertex_surface_type(brw, gen6_edgeflag_input->glarray);
> > @@ -206,16 +244,6 @@ gen8_emit_vertices(struct brw_context *brw)
> >                  (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
> >                  (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
> >     }
> > -
> > -   if (brw->vs.prog_data->uses_vertexid) {
> > -      OUT_BATCH(GEN6_VE0_VALID |
> > -                brw->vb.nr_buffers << GEN6_VE0_INDEX_SHIFT |
> > -                BRW_SURFACEFORMAT_R32_UINT << BRW_VE0_FORMAT_SHIFT);
> > -      OUT_BATCH((BRW_VE1_COMPONENT_STORE_SRC << BRW_VE1_COMPONENT_0_SHIFT) |
> > -                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_1_SHIFT) |
> > -                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
> > -                (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
> > -   }
> >     ADVANCE_BATCH();
> >  
> >     for (unsigned i = 0; i < brw->vb.nr_enabled; i++) {
> 
> Empirically, this patch is needed, since without it, I cannot get the test to
> pass. However, since I don't understand it well enough, I am going to push 1/3,
> and wait until you get back from vacation to continue with this.
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-stable mailing list