[Mesa-dev] [PATCH 2/4] i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.

Kenneth Graunke kenneth at whitecape.org
Tue Feb 25 17:06:32 PST 2014


On 02/25/2014 09:38 AM, Eric Anholt wrote:
> Matt Turner <mattst88 at gmail.com> writes:
> 
>> On Mon, Feb 24, 2014 at 10:15 AM, Eric Anholt <eric at anholt.net> wrote:
>>> I think we would do better by emitting
>>> ADD(y_minus_x, y, negate(x))
>>> MAC(dst, x, y_minus_x, a)
>>
>> MAC only takes two arguments, so
>>  - if you meant MAD, there's no MAD on platforms that don't have LRP
>>  - if you meant MAC(dst, ...) I don't see a way of doing it in only two
>> instructions, but we could do
>>
>> MOV(acc, x)
>> ADD(y_minus_x, y, negate(x))
>> MAC(dst, y_minus_x, a)
> 
> Oops, yeah, I was still thinking in terms of MAD.  This should still be
> better I think, while being an obvious translation of the LRP
> instruction:
> 
> ADD one_minus_a, negate(a), 1.0f
> MUL null, y, a
> MAC dst, x, one_minus_a
> 
> (multiplying y * a first to slightly reduce the stall pressure from
> one_minus_a)
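
For reference, LRP computes x * (1 - a) + y * a. Here is a rough Python
model of the two sequences quoted above. It assumes MAC computes
dst = acc + src0 * src1 and that a MUL with accumulator writes enabled
leaves its product in the accumulator. The function names are made up
for illustration, and the accumulator is modelled as a plain float.

```python
# Toy model only: real hardware works on vec4 registers and has its own
# rounding behaviour; everything here is scalar Python floats.

def lrp_reference(x, y, a):
    """What LRP is supposed to compute: x * (1 - a) + y * a."""
    return x * (1.0 - a) + y * a


def lrp_mul_mac(x, y, a):
    """ADD / MUL / MAC sequence (the direct translation of LRP)."""
    one_minus_a = -a + 1.0        # ADD one_minus_a, negate(a), 1.0f
    acc = y * a                   # MUL null, y, a   (product lands in acc)
    dst = acc + x * one_minus_a   # MAC dst, x, one_minus_a
    return dst


def lrp_mov_add_mac(x, y, a):
    """MOV / ADD / MAC sequence, using x + (y - x) * a."""
    acc = x                       # MOV acc, x
    y_minus_x = y + (-x)          # ADD y_minus_x, y, negate(x)
    dst = acc + y_minus_x * a     # MAC dst, y_minus_x, a
    return dst
```

Both are three instructions; the difference is only in which operand
the MAC has to wait for.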

Nice.  I agree this is better, but it's harder than you think.  We would
have to:

1. Create a MAC() emitter.
2. Add BRW_OPCODE_MAC to vec4_generator.
3. Add a new "enable accumulator writes" flag to vec4_instruction
   and make vec4_generator respect that.  (The MUL needs this.)
4. Fix up dead code elimination and other things to know about implicit
   accumulator writes.
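
To illustrate why items 3 and 4 matter, here is a toy backward
dead-code-elimination pass in Python. It is not the Mesa pass: Inst,
writes_accumulator, reads_accumulator and dead_code_eliminate are
hypothetical names for this sketch. The point is that a MUL with a null
destination looks dead unless the pass knows it writes the accumulator.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Inst:
    opcode: str
    dst: Optional[str]                 # None stands in for the null register
    srcs: Tuple[str, ...]
    writes_accumulator: bool = False   # hypothetical flag (item 3)
    reads_accumulator: bool = False


def dead_code_eliminate(insts, live_out):
    """Walk backwards, keeping instructions whose result is still needed."""
    live = set(live_out)
    acc_live = False
    kept = []
    for inst in reversed(insts):
        needed = inst.dst in live or (inst.writes_accumulator and acc_live)
        if not needed:
            continue
        kept.append(inst)
        # The instruction defines its outputs, so they stop being live above it.
        live.discard(inst.dst)
        if inst.writes_accumulator:
            acc_live = False
        # Its inputs become live above it.
        live.update(inst.srcs)
        if inst.reads_accumulator:
            acc_live = True
    kept.reverse()
    return kept


program = [
    Inst("ADD", "one_minus_a", ("a",)),
    Inst("MUL", None, ("y", "a"), writes_accumulator=True),
    Inst("MAC", "dst", ("x", "one_minus_a"), reads_accumulator=True),
]
```

With the flag set on the MUL, dead_code_eliminate(program, {"dst"})
keeps all three instructions. Without it, the MUL is dropped and the
MAC reads a stale accumulator.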

Given the severity of this problem (GPU hangs and crashes) and the fact
that it's a regression in 10.1---which we plan to ship in three days---I
would like to commit my existing patches and improve this after the release.

--Ken
