[Mesa-dev] [PATCH V2 05/12] i965/Gen4-5: Set clip flags from clip distances

Paul Berry stereotype441 at gmail.com
Wed Aug 14 09:20:27 PDT 2013


On 9 August 2013 17:14, Chris Forbes <chrisf at ijw.co.nz> wrote:

> V2: - Use the new VS_OPCODE_UNPACK_FLAGS_SIMD4X2 to correctly split the
>       flags for the two vertices being processed together.
>     - Don't apply bogus masking of clip flags. The set of plane enables
>       aren't included in the shader key, and we wouldn't want the
>       recompiles anyway.
>
> Signed-off-by: Chris Forbes <chrisf at ijw.co.nz>
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 25
> ++++++++++++++-----------
>  1 file changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index f80777b..5e007de 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -2625,7 +2625,6 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg reg)
>        dst_reg header1 = dst_reg(this, glsl_type::uvec4_type);
>        dst_reg header1_w = header1;
>        header1_w.writemask = WRITEMASK_W;
> -      GLuint i;
>
>        emit(MOV(header1, 0u));
>
> @@ -2637,18 +2636,22 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg
> reg)
>          emit(AND(header1_w, src_reg(header1_w), 0x7ff << 8));
>        }
>
> -      current_annotation = "Clipping flags";
> -      for (i = 0; i < key->nr_userclip_plane_consts; i++) {
> -        vec4_instruction *inst;
> -         gl_varying_slot slot = (prog_data->vue_map.slots_valid &
> VARYING_BIT_CLIP_VERTEX)
> -            ? VARYING_SLOT_CLIP_VERTEX : VARYING_SLOT_POS;
> +      if (key->userclip_active) {
> +         current_annotation = "Clipping flags";
> +         dst_reg temp = dst_reg(this, glsl_type::uint_type);
> +         dst_reg temp2 = dst_reg(this, glsl_type::uint_type);
>
> -        inst = emit(DP4(dst_null_f(), src_reg(output_reg[slot]),
> -                         src_reg(this->userplane[i])));
> -        inst->conditional_mod = BRW_CONDITIONAL_L;
> +         emit(CMP(dst_null_f(),
> src_reg(output_reg[VARYING_SLOT_CLIP_DIST0]), src_reg(0.0f),
> BRW_CONDITIONAL_L));
> +         emit(VS_OPCODE_UNPACK_FLAGS_SIMD4X2, temp, src_reg(0));
> +         emit(AND(temp2, src_reg(temp), src_reg(0x0fu)));
>

I think this AND is unnecessary.  VS_OPCODE_UNPACK_FLAGS_SIMD4X2 is
guaranteed to produce outputs that only have bits 0-3 set.


>
> -        inst = emit(OR(header1_w, src_reg(header1_w), 1u << i));
> -        inst->predicate = BRW_PREDICATE_NORMAL;
> +         emit(CMP(dst_null_f(),
> src_reg(output_reg[VARYING_SLOT_CLIP_DIST1]), src_reg(0.0f),
> BRW_CONDITIONAL_L));
> +         emit(VS_OPCODE_UNPACK_FLAGS_SIMD4X2, temp, src_reg(0));
> +         emit(AND(temp, src_reg(temp), src_reg(0x0fu)));
>

Same with this AND.


> +         emit(SHL(temp, src_reg(temp), src_reg(4)));
> +         emit(OR(temp2, src_reg(temp2), src_reg(temp)));
> +
> +         emit(OR(header1_w, src_reg(header1_w), src_reg(temp2)));
>

Slight nit-pick: re-using temp registers introduces extra dependencies,
making it more difficult for the instruction scheduler to produce efficient
code.  It probably won't produce a significant performance difference, but
you might want to consider using a separate temp register for each
intermediate result.  It shouldn't create a sizeable amount of extra
register pressure since this code executes at the end ofthe shader anyhow.


>        }
>
>        /* i965 clipping workaround:
> --
> 1.8.3.4
>

All of my suggestions on this patch are minor, so whether or not you decide
to follow them, this patch is:

Reviewed-by: Paul Berry <stereotype441 at gmail.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130814/1383ab05/attachment.html>


More information about the mesa-dev mailing list