[Mesa-dev] Mesa (master): i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.
Kenneth Graunke
kenneth at whitecape.org
Wed Feb 26 10:46:20 PST 2014
On 02/26/2014 09:29 AM, Ian Romanick wrote:
> On 02/26/2014 02:25 AM, Kenneth Graunke wrote:
>> Module: Mesa
>> Branch: master
>> Commit: 56879a7ac41b8c7513a97cc02921f76a2ec8407c
>> URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=56879a7ac41b8c7513a97cc02921f76a2ec8407c
>>
>> Author: Kenneth Graunke <kenneth at whitecape.org>
>> Date: Sun Feb 23 16:29:46 2014 -0800
>>
>> i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.
>>
>> When the vec4 backend encountered an ir_triop_lrp, it always emitted an
>> actual LRP instruction, which only exists on Gen6+. Gen4-5 used
>> lower_instructions() to decompose ir_triop_lrp at the IR level.
>>
>> Since commit 8d37e9915a3b21 ("glsl: Optimize open-coded lrp into lrp."),
>> we've had an bug where lower_instructions translates ir_triop_lrp into
>> arithmetic, but opt_algebraic reassembles it back into a lrp.
>>
>> To avoid this ordering concern, just handle ir_triop_lrp in the backend.
>> The FS backend already does this, so we may as well do likewise.
>>
>> v2: Add a comment reminding us that we could emit better assembly if we
>> implemented the infrastructure necessary to support using MAC.
>> (Assembly code provided by Eric Anholt).
>>
>> Cc: "10.1" <mesa-stable at lists.freedesktop.org>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253
>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>> Reviewed-by: Matt Turner <mattst88 at gmail.com>
>> Acked-by: Eric Anholt <eric at anholt.net>
>>
>> ---
>>
>> src/mesa/drivers/dri/i965/brw_vec4.h | 3 ++
>> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 42 ++++++++++++++++++++----
>> 2 files changed, 38 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
>> index 6bd8b80..fb5c0a6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
>> @@ -506,6 +506,9 @@ public:
>>
>> void emit_minmax(uint32_t condmod, dst_reg dst, src_reg src0, src_reg src1);
>>
>> + void emit_lrp(const dst_reg &dst,
>> + const src_reg &x, const src_reg &y, const src_reg &a);
>> +
>> void emit_block_move(dst_reg *dst, src_reg *src,
>> const struct glsl_type *type, uint32_t predicate);
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> index 95e0064..aedab93 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
>> @@ -1132,6 +1132,40 @@ vec4_visitor::emit_minmax(uint32_t conditionalmod, dst_reg dst,
>> }
>> }
>>
>> +void
>> +vec4_visitor::emit_lrp(const dst_reg &dst,
>> + const src_reg &x, const src_reg &y, const src_reg &a)
>> +{
>> + if (brw->gen >= 6) {
>> + /* Note that the instruction's argument order is reversed from GLSL
>> + * and the IR.
>> + */
>> + emit(LRP(dst,
>> + fix_3src_operand(a), fix_3src_operand(y), fix_3src_operand(x)));
>> + } else {
>> + /* Earlier generations don't support three source operations, so we
>> + * need to emit x*(1-a) + y*a.
>> + *
>> + * A better way to do this would be:
>> + * ADD one_minus_a, negate(a), 1.0f
>> + * MUL null, y, a
>> + * MAC dst, x, one_minus_a
>> + * but we would need to support MAC and implicit accumulator.
>> + */
>> + dst_reg y_times_a = dst_reg(this, glsl_type::vec4_type);
>> + dst_reg one_minus_a = dst_reg(this, glsl_type::vec4_type);
>> + dst_reg x_times_one_minus_a = dst_reg(this, glsl_type::vec4_type);
>> + y_times_a.writemask = dst.writemask;
>> + one_minus_a.writemask = dst.writemask;
>> + x_times_one_minus_a.writemask = dst.writemask;
>> +
>> + emit(MUL(y_times_a, y, a));
>> + emit(ADD(one_minus_a, negate(a), src_reg(1.0f)));
>
> After picking to 10.1, this line gives me:
>
> brw_vec4_visitor.cpp: In member function 'void brw::vec4_visitor::emit_lrp(const brw::dst_reg&, const brw::src_reg&, const brw::src_reg&, const brw::src_reg&)':
> brw_vec4_visitor.cpp:1171:37: error: could not convert 'a' from 'const brw::src_reg' to 'brw_reg'
>
> I don't get this error on master. Is there another patch that also
> needs picking?
Oops, sorry. You need:
commit 98306e727b8291507ff4fd5dd5c4806f3fed9202
Author: Francisco Jerez <currojerez at riseup.net>
Date: Wed Feb 19 15:20:27 2014 +0100
i965/vec4: Add non-mutating helper functions to modify
src_reg::swizzle and ::negate.
Reviewed-by: Paul Berry <stereotype441 at gmail.com>
Thanks, Ian!
--Ken
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140226/6d3b8d45/attachment.pgp>
More information about the mesa-dev
mailing list