[Mesa-dev] [PATCH V2 05/12] i965/Gen4-5: Set clip flags from clip distances
Paul Berry
stereotype441 at gmail.com
Wed Aug 14 09:20:27 PDT 2013
On 9 August 2013 17:14, Chris Forbes <chrisf at ijw.co.nz> wrote:
> V2: - Use the new VS_OPCODE_UNPACK_FLAGS_SIMD4X2 to correctly split the
> flags for the two vertices being processed together.
> - Don't apply bogus masking of clip flags. The set of plane enables
> aren't included in the shader key, and we wouldn't want the
> recompiles anyway.
>
> Signed-off-by: Chris Forbes <chrisf at ijw.co.nz>
> ---
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 25
> ++++++++++++++-----------
> 1 file changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index f80777b..5e007de 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -2625,7 +2625,6 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg reg)
> dst_reg header1 = dst_reg(this, glsl_type::uvec4_type);
> dst_reg header1_w = header1;
> header1_w.writemask = WRITEMASK_W;
> - GLuint i;
>
> emit(MOV(header1, 0u));
>
> @@ -2637,18 +2636,22 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg
> reg)
> emit(AND(header1_w, src_reg(header1_w), 0x7ff << 8));
> }
>
> - current_annotation = "Clipping flags";
> - for (i = 0; i < key->nr_userclip_plane_consts; i++) {
> - vec4_instruction *inst;
> - gl_varying_slot slot = (prog_data->vue_map.slots_valid &
> VARYING_BIT_CLIP_VERTEX)
> - ? VARYING_SLOT_CLIP_VERTEX : VARYING_SLOT_POS;
> + if (key->userclip_active) {
> + current_annotation = "Clipping flags";
> + dst_reg temp = dst_reg(this, glsl_type::uint_type);
> + dst_reg temp2 = dst_reg(this, glsl_type::uint_type);
>
> - inst = emit(DP4(dst_null_f(), src_reg(output_reg[slot]),
> - src_reg(this->userplane[i])));
> - inst->conditional_mod = BRW_CONDITIONAL_L;
> + emit(CMP(dst_null_f(),
> src_reg(output_reg[VARYING_SLOT_CLIP_DIST0]), src_reg(0.0f),
> BRW_CONDITIONAL_L));
> + emit(VS_OPCODE_UNPACK_FLAGS_SIMD4X2, temp, src_reg(0));
> + emit(AND(temp2, src_reg(temp), src_reg(0x0fu)));
>
I think this AND is unnecessary. VS_OPCODE_UNPACK_FLAGS_SIMD4X2 is
guaranteed to produce outputs that only have bits 0-3 set.
>
> - inst = emit(OR(header1_w, src_reg(header1_w), 1u << i));
> - inst->predicate = BRW_PREDICATE_NORMAL;
> + emit(CMP(dst_null_f(),
> src_reg(output_reg[VARYING_SLOT_CLIP_DIST1]), src_reg(0.0f),
> BRW_CONDITIONAL_L));
> + emit(VS_OPCODE_UNPACK_FLAGS_SIMD4X2, temp, src_reg(0));
> + emit(AND(temp, src_reg(temp), src_reg(0x0fu)));
>
Same with this AND.
> + emit(SHL(temp, src_reg(temp), src_reg(4)));
> + emit(OR(temp2, src_reg(temp2), src_reg(temp)));
> +
> + emit(OR(header1_w, src_reg(header1_w), src_reg(temp2)));
>
Slight nit-pick: re-using temp registers introduces extra dependencies,
making it more difficult for the instruction scheduler to produce efficient
code. It probably won't produce a significant performance difference, but
you might want to consider using a separate temp register for each
intermediate result. It shouldn't create a sizeable amount of extra
register pressure since this code executes at the end ofthe shader anyhow.
> }
>
> /* i965 clipping workaround:
> --
> 1.8.3.4
>
All of my suggestions on this patch are minor, so whether or not you decide
to follow them, this patch is:
Reviewed-by: Paul Berry <stereotype441 at gmail.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130814/1383ab05/attachment.html>
More information about the mesa-dev
mailing list