[Mesa-dev] [PATCH 0/4] gallium: add new opcodes needed for ARB_gs5

Ilia Mirkin imirkin at alum.mit.edu
Fri Apr 25 12:52:21 PDT 2014


On Fri, Apr 25, 2014 at 3:36 PM, Matt Turner <mattst88 at gmail.com> wrote:
> On Fri, Apr 25, 2014 at 10:41 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>> This is enough to catch up to core mesa, with the exception of
>> uaddCarry/usubBorrow -- those will require some thought. I don't like the way
>> they were done in core mesa, so I may redo it differently. (Will start a
>> discussion on that topic after I've given it more thought.)
>
> I'm not sure you have all of the context. GLSL IR is pure, meaning
> expressions have no side effects. uaddCarry and usubBorrow have side
> effects.
>
> Short of implementing an intrinsic system or something, which seems
> like massive overkill for these built-ins, you have to split them.
>
> Everyone I talked to about it at the office thought it was a really
> elegant solution, especially given that some other hardware implements
> instructions for the split pieces and that a peephole could recombine
> them for hardware that has a combined instruction. I didn't just
> implement the first thing that came to mind.

Sorry, I didn't mean for that to come off as implying that the current
design was in any way bad or stupid. What you've implemented is
perfectly logical and reasonable. However, it makes my life a little
more difficult, and I believe there's a similarly clean and elegant
solution that also has the advantage of not making my life more
difficult. Basically, instead of the GLSL IR for uaddCarry being
emitted as

a = carry(x, y)
b = uadd(x, y)

Perhaps it can be emitted as

b = uadd(x, y)
a = (x > b)

And then you have a peephole pass that looks for this and converts it
into a single instruction. Additionally this has the advantage of
working on code where people manually implemented uaddCarry (although
there are other ways to implement it, and this would only detect one
of them).
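
To make the identity behind the second form concrete, here is a minimal
C sketch. The helper names uadd_carry and check_uadd_carry are made up
for illustration and are not Mesa functions. It relies only on unsigned
addition wrapping modulo 2^32: the sum ends up smaller than x exactly
when the add overflowed, so (x > b) is the carry bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): the proposed expansion. */
static uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t b = x + y;   /* b = uadd(x, y); unsigned add wraps mod 2^32 */
    *carry = (x > b);     /* a = (x > b); 1 only when the add wrapped */
    return b;
}

/* Cross-check against a 64-bit add, where the carry is bit 32. */
static void check_uadd_carry(uint32_t x, uint32_t y)
{
    uint32_t carry;
    uint32_t sum = uadd_carry(x, y, &carry);
    uint64_t wide = (uint64_t)x + (uint64_t)y;

    assert(sum == (uint32_t)wide);
    assert(carry == (uint32_t)(wide >> 32));
}
```

Comparing against y instead of x would work equally well, since the
wrapped sum is smaller than both operands whenever a carry occurs.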

The problem with the current way is that (a) I'd have to add yet more
TGSI instructions to pipe this through (not my favourite activity),
and (b) I'd need to do the lowering and hope CSE takes care of the
duplicate ADD instruction (nvc0 doesn't really have an ADDC... there's
a flag you can set when doing the ADD, but I guess I'll need to test
out when it gets set). With the alternative above, there's no need for
new instructions, and everything Just Works (tm). And those ISAs that
actually have an ADDC would need a peephole (or similar) pass to make
use of it anyway, so they'll just detect the alternate instruction
sequence.
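
A rough sketch of what such a peephole could look like, in C over a toy
IR. Everything here (struct insn, the OP_* names, fuse_uadd_carry) is
invented for illustration and is not GLSL IR, TGSI, or any driver's IR.
It assumes SSA-style code and only matches the two instructions when
they are adjacent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy IR, invented for this sketch. */
enum op { OP_UADD, OP_UGT, OP_UADD_CARRY };

struct insn {
    enum op op;
    int dst;        /* result register */
    int dst_carry;  /* second result, only used by OP_UADD_CARRY */
    int src0, src1; /* operand registers */
    int dead;       /* set when a later cleanup should drop the insn */
};

/* Fuse "b = uadd(x, y); a = (x > b)" when the two are adjacent.
 * Assumes SSA-style code, so no register is ever redefined.
 * Returns the number of pairs fused. */
static int fuse_uadd_carry(struct insn *code, size_t n)
{
    int fused = 0;

    assert(code != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++) {
        struct insn *add = &code[i];
        struct insn *cmp = &code[i + 1];

        if (add->op != OP_UADD || cmp->op != OP_UGT)
            continue;
        /* The compare must read the sum as its right-hand operand... */
        if (cmp->src1 != add->dst)
            continue;
        /* ...and one of the add's operands as its left-hand operand
         * (either one works, because the add is commutative). */
        if (cmp->src0 != add->src0 && cmp->src0 != add->src1)
            continue;

        add->op = OP_UADD_CARRY;
        add->dst_carry = cmp->dst;
        cmp->dead = 1;
        fused++;
    }
    return fused;
}
```

A real pass would follow def-use chains rather than require adjacency,
and would also want to recognize the mirrored compare (b < x).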

  -ilia

