[Mesa-dev] [PATCH 1/2] i965/gs: Move vertex_count != 0 check up a level; skip one caller.

Matt Turner mattst88 at gmail.com
Fri Jul 10 14:25:09 PDT 2015


On Wed, Jul 1, 2015 at 7:28 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> Paul's original code had emit_control_data_bits() skip the URB write if
> vertex_count was 0.  This meant wrapping every control data write in a
> conditional write.
>
> We accumulate control data bits in a single UD (32-bit) register.  For
> simple shaders that don't emit many vertices, the control data header
> will be <= 32-bits long, so we only need to write it once at the end of
> the shader.
>
> For shaders with larger headers, we write out batches of control data
> bits at EmitVertex(), when (vertex_count * bits_per_vertex) % 32 == 0.
> On the first EmitVertex() call, the above expression will evaluate to
> true simply because vertex_count == 0.  But we want to avoid emitting
> the control data bits, because we haven't accumulated 32-bits worth yet.
>
> In other words, the vertex_count != 0 check is really only necessary in
> the EmitVertex() batching case, not the end-of-thread case.
>
> This saves a CMP/IF/ENDIF in every shader that uses EndPrimitive() or
> multiple streams.  The only downside is that a shader which emits no
> vertices at all will execute an additional URB write---but such shaders
> are pointless and not worth optimizing.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> index 2f948ee..55408eb 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> @@ -348,11 +348,6 @@ vec4_gs_visitor::emit_control_data_bits()
>     if (c->control_data_header_size_bits > 128)
>        urb_write_flags = urb_write_flags | BRW_URB_WRITE_PER_SLOT_OFFSET;
>
> -   /* If vertex_count is 0, then no control data bits have been accumulated
> -    * yet, so we should do nothing.
> -    */
> -   emit(CMP(dst_null_d(), this->vertex_count, 0u, BRW_CONDITIONAL_NEQ));
> -   emit(IF(BRW_PREDICATE_NORMAL));
>     {
>        /* If we are using either channel masks or a per-slot offset, then we
>         * need to figure out which DWORD we are trying to write to, using the
> @@ -431,7 +426,6 @@ vec4_gs_visitor::emit_control_data_bits()
>        inst->base_mrf = base_mrf;
>        inst->mlen = 2;
>     }
> -   emit(BRW_OPCODE_ENDIF);
>  }
>
>  void
> @@ -531,9 +525,17 @@ vec4_gs_visitor::visit(ir_emit_vertex *ir)
>              emit(AND(dst_null_d(), this->vertex_count,
>                       (uint32_t) (32 / c->control_data_bits_per_vertex - 1)));
>           inst->conditional_mod = BRW_CONDITIONAL_Z;
> +
>           emit(IF(BRW_PREDICATE_NORMAL));
>           {
> +            /* If vertex_count is 0, then no control data bits have been
> +             * accumulated yet, so we skip emitting them.
> +             */
> +            emit(CMP(dst_null_d(), this->vertex_count, 0u,
> +                     BRW_CONDITIONAL_NEQ));

I think you wanted to indent BRW_CONDITIONAL_NEQ to match dst_null_d().

Also, can we s/NEQ/NZ/ while we're here?

Reviewed-by: Matt Turner <mattst88 at gmail.com>


More information about the mesa-dev mailing list