[Mesa-dev] Mesa (master): i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.

Kenneth Graunke kenneth at whitecape.org
Wed Feb 26 10:46:20 PST 2014


On 02/26/2014 09:29 AM, Ian Romanick wrote:
> On 02/26/2014 02:25 AM, Kenneth Graunke wrote:
>> Module: Mesa
>> Branch: master
>> Commit: 56879a7ac41b8c7513a97cc02921f76a2ec8407c
>> URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=56879a7ac41b8c7513a97cc02921f76a2ec8407c
>>
>> Author: Kenneth Graunke <kenneth at whitecape.org>
>> Date:   Sun Feb 23 16:29:46 2014 -0800
>>
>> i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.
>>
>> When the vec4 backend encountered an ir_triop_lrp, it always emitted an
>> actual LRP instruction, which only exists on Gen6+.  Gen4-5 used
>> lower_instructions() to decompose ir_triop_lrp at the IR level.
>>
>> Since commit 8d37e9915a3b21 ("glsl: Optimize open-coded lrp into lrp."),
>> we've had an bug where lower_instructions translates ir_triop_lrp into
>> arithmetic, but opt_algebraic reassembles it back into a lrp.
>>
>> To avoid this ordering concern, just handle ir_triop_lrp in the backend.
>> The FS backend already does this, so we may as well do likewise.
>>
>> v2: Add a comment reminding us that we could emit better assembly if we
>>      implemented the infrastructure necessary to support using MAC.
>>      (Assembly code provided by Eric Anholt).
>>
>> Cc: "10.1" <mesa-stable at lists.freedesktop.org>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253
>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>> Reviewed-by: Matt Turner <mattst88 at gmail.com>
>> Acked-by: Eric Anholt <eric at anholt.net>
>>
>> ---
>>
>>   src/mesa/drivers/dri/i965/brw_vec4.h           |    3 ++
>>   src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |   42 ++++++++++++++++++++----
>>   2 files changed, 38 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
>> index 6bd8b80..fb5c0a6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
>> @@ -506,6 +506,9 @@ public:
>>   
>>      void emit_minmax(uint32_t condmod, dst_reg dst, src_reg src0, src_reg src1);
>>   
>> +   void emit_lrp(const dst_reg &dst,
>> +                 const src_reg &x, const src_reg &y, const src_reg &a);
>> +
>>      void emit_block_move(dst_reg *dst, src_reg *src,
>>   			const struct glsl_type *type, uint32_t predicate);
>>   
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> index 95e0064..aedab93 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> @@ -1132,6 +1132,40 @@ vec4_visitor::emit_minmax(uint32_t conditionalmod, dst_reg dst,
>>      }
>>   }
>>   
>> +void
>> +vec4_visitor::emit_lrp(const dst_reg &dst,
>> +                       const src_reg &x, const src_reg &y, const src_reg &a)
>> +{
>> +   if (brw->gen >= 6) {
>> +      /* Note that the instruction's argument order is reversed from GLSL
>> +       * and the IR.
>> +       */
>> +      emit(LRP(dst,
>> +               fix_3src_operand(a), fix_3src_operand(y), fix_3src_operand(x)));
>> +   } else {
>> +      /* Earlier generations don't support three source operations, so we
>> +       * need to emit x*(1-a) + y*a.
>> +       *
>> +       * A better way to do this would be:
>> +       *    ADD one_minus_a, negate(a), 1.0f
>> +       *    MUL null, y, a
>> +       *    MAC dst, x, one_minus_a
>> +       * but we would need to support MAC and implicit accumulator.
>> +       */
>> +      dst_reg y_times_a           = dst_reg(this, glsl_type::vec4_type);
>> +      dst_reg one_minus_a         = dst_reg(this, glsl_type::vec4_type);
>> +      dst_reg x_times_one_minus_a = dst_reg(this, glsl_type::vec4_type);
>> +      y_times_a.writemask           = dst.writemask;
>> +      one_minus_a.writemask         = dst.writemask;
>> +      x_times_one_minus_a.writemask = dst.writemask;
>> +
>> +      emit(MUL(y_times_a, y, a));
>> +      emit(ADD(one_minus_a, negate(a), src_reg(1.0f)));
> 
> After picking to 10.1, this line gives me:
> 
> brw_vec4_visitor.cpp: In member function 'void brw::vec4_visitor::emit_lrp(const brw::dst_reg&, const brw::src_reg&, const brw::src_reg&, const brw::src_reg&)':
> brw_vec4_visitor.cpp:1171:37: error: could not convert 'a' from 'const brw::src_reg' to 'brw_reg'
> 
> I don't get this error on master.  Is there another patch that also
> needs picking?

Oops, sorry.  You need:

commit 98306e727b8291507ff4fd5dd5c4806f3fed9202
Author: Francisco Jerez <currojerez at riseup.net>
Date:   Wed Feb 19 15:20:27 2014 +0100

    i965/vec4: Add non-mutating helper functions to modify
src_reg::swizzle and ::negate.

    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

Thanks, Ian!
--Ken

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140226/6d3b8d45/attachment.pgp>


More information about the mesa-dev mailing list