[Mesa-dev] Mesa (master): i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.
Ian Romanick
idr at freedesktop.org
Wed Feb 26 09:29:51 PST 2014
On 02/26/2014 02:25 AM, Kenneth Graunke wrote:
> Module: Mesa
> Branch: master
> Commit: 56879a7ac41b8c7513a97cc02921f76a2ec8407c
> URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=56879a7ac41b8c7513a97cc02921f76a2ec8407c
>
> Author: Kenneth Graunke <kenneth at whitecape.org>
> Date: Sun Feb 23 16:29:46 2014 -0800
>
> i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.
>
> When the vec4 backend encountered an ir_triop_lrp, it always emitted an
> actual LRP instruction, which only exists on Gen6+. Gen4-5 used
> lower_instructions() to decompose ir_triop_lrp at the IR level.
>
> Since commit 8d37e9915a3b21 ("glsl: Optimize open-coded lrp into lrp."),
> we've had an bug where lower_instructions translates ir_triop_lrp into
> arithmetic, but opt_algebraic reassembles it back into a lrp.
>
> To avoid this ordering concern, just handle ir_triop_lrp in the backend.
> The FS backend already does this, so we may as well do likewise.
>
> v2: Add a comment reminding us that we could emit better assembly if we
> implemented the infrastructure necessary to support using MAC.
> (Assembly code provided by Eric Anholt).
>
> Cc: "10.1" <mesa-stable at lists.freedesktop.org>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> Reviewed-by: Matt Turner <mattst88 at gmail.com>
> Acked-by: Eric Anholt <eric at anholt.net>
>
> ---
>
> src/mesa/drivers/dri/i965/brw_vec4.h | 3 ++
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 42 ++++++++++++++++++++----
> 2 files changed, 38 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 6bd8b80..fb5c0a6 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -506,6 +506,9 @@ public:
>
> void emit_minmax(uint32_t condmod, dst_reg dst, src_reg src0, src_reg src1);
>
> + void emit_lrp(const dst_reg &dst,
> + const src_reg &x, const src_reg &y, const src_reg &a);
> +
> void emit_block_move(dst_reg *dst, src_reg *src,
> const struct glsl_type *type, uint32_t predicate);
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 95e0064..aedab93 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -1132,6 +1132,40 @@ vec4_visitor::emit_minmax(uint32_t conditionalmod, dst_reg dst,
> }
> }
>
> +void
> +vec4_visitor::emit_lrp(const dst_reg &dst,
> + const src_reg &x, const src_reg &y, const src_reg &a)
> +{
> + if (brw->gen >= 6) {
> + /* Note that the instruction's argument order is reversed from GLSL
> + * and the IR.
> + */
> + emit(LRP(dst,
> + fix_3src_operand(a), fix_3src_operand(y), fix_3src_operand(x)));
> + } else {
> + /* Earlier generations don't support three source operations, so we
> + * need to emit x*(1-a) + y*a.
> + *
> + * A better way to do this would be:
> + * ADD one_minus_a, negate(a), 1.0f
> + * MUL null, y, a
> + * MAC dst, x, one_minus_a
> + * but we would need to support MAC and implicit accumulator.
> + */
> + dst_reg y_times_a = dst_reg(this, glsl_type::vec4_type);
> + dst_reg one_minus_a = dst_reg(this, glsl_type::vec4_type);
> + dst_reg x_times_one_minus_a = dst_reg(this, glsl_type::vec4_type);
> + y_times_a.writemask = dst.writemask;
> + one_minus_a.writemask = dst.writemask;
> + x_times_one_minus_a.writemask = dst.writemask;
> +
> + emit(MUL(y_times_a, y, a));
> + emit(ADD(one_minus_a, negate(a), src_reg(1.0f)));
After picking to 10.1, this line gives me:
brw_vec4_visitor.cpp: In member function 'void brw::vec4_visitor::emit_lrp(const brw::dst_reg&, const brw::src_reg&, const brw::src_reg&, const brw::src_reg&)':
brw_vec4_visitor.cpp:1171:37: error: could not convert 'a' from 'const brw::src_reg' to 'brw_reg'
I don't get this error on master. Is there another patch that also
needs picking?
> + emit(MUL(x_times_one_minus_a, x, src_reg(one_minus_a)));
> + emit(ADD(dst, src_reg(x_times_one_minus_a), src_reg(y_times_a)));
> + }
> +}
> +
> static bool
> is_16bit_constant(ir_rvalue *rvalue)
> {
> @@ -1628,13 +1662,7 @@ vec4_visitor::visit(ir_expression *ir)
> break;
>
> case ir_triop_lrp:
> - op[0] = fix_3src_operand(op[0]);
> - op[1] = fix_3src_operand(op[1]);
> - op[2] = fix_3src_operand(op[2]);
> - /* Note that the instruction's argument order is reversed from GLSL
> - * and the IR.
> - */
> - emit(LRP(result_dst, op[2], op[1], op[0]));
> + emit_lrp(result_dst, op[0], op[1], op[2]);
> break;
>
> case ir_triop_csel:
>
> _______________________________________________
> mesa-commit mailing list
> mesa-commit at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-commit
>
More information about the mesa-dev
mailing list