[Lima] [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support

Roland Scheidegger sroland at vmware.com
Thu Jan 3 23:35:27 UTC 2019


On 03.01.19 at 20:50, Jason Ekstrand wrote:
>     > The problem you're describing is in converting from NIR to another IR,
>     > not to hardware.  In LLVM they made a choice to put types on
>     > SSA values but then to have the actual semantics be based on
>     > the instruction
>     > itself.  This leads to lots of redundancy in the IR but also lots of
>     > things you can validate which is kind-of nice.  Is that redundancy
>     > really useful?  In our experience with NIR, we haven't found it
>     > to be, other than for booleans (now sorted), this one constant
>     > issue, and translating to IRs that have that redundancy.  Then
>     > why did they add
>     > it?  I'm really not sure but it may, at least somewhat, be related to
>     > the fact that they allow arrays and structs in their SSA values and so
>     > need full types.  This is another design decision in LLVM which I find
>     > highly questionable.  What you're suggesting is more-or-less
>     > that NIR should carry, maintain, and validate extra useless
>     > information just so we can pass it on to LLVM, which is going
>     > to use it for what, exactly?  Sorry if I'm
>     > extremely reluctant to make fairly fundamental changes to NIR with no
>     > better reason than "LLVM did it that way".
>     >
>     > There's a second reason why I don't like the idea of putting types on
>     > SSA values: It's extremely deceptive.  In SPIR-V you're allowed
>     > to do an OpSConvert with a source that is an unsigned 16-bit
>     > integer and a
>     > destination that is an unsigned 32-bit integer.  What happens?  Your
>     > uint -> uint cast sign extends!  Yup.... That's what happens.
>     > No joke.
>     > The same is true of signed vs. unsigned division or modulus.  The end
>     > result is that the types in SPIR-V are useless and you can't actually
>     > trust them for anything except bit-size and sometimes labelling
>     > something as a float vs. int.
>     This is really a decision of spir-v only though, llvm doesn't have that
>     nonsense (maybe for making it easier to translate to other languages?) -
>     there's just int and float types there, with no distinction between
>     signed and unsigned.
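
(A minimal C sketch of what those OpSConvert semantics amount to - the
sign extension follows the instruction, not the declared signedness of
the operand types; the constant is mine, picked to make the effect
visible:)

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
   /* "unsigned 16-bit integer" source, as in the example above */
   uint16_t src = 0xffff;

   /* OpSConvert semantics: sign-extend, even though both the source
    * and the destination types are declared unsigned */
   uint32_t dst = (uint32_t)(int32_t)(int16_t)src;

   printf("0x%08" PRIx32 "\n", dst); /* 0xffffffff, not 0x0000ffff */
   return 0;
}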
> 
> 
> I think with SPIR-V you could probably just pick one and make everything
> either signed or unsigned.  I'm not immediately aware of any opcodes
> that require one signedness or the other; most just require an integer
> or require a float.  That said, this is SPIR-V so I'm not going to bet
> money on that...
>  
> 
>     You are quite right though that float vs. int is somewhat redundant too
>     due to the instructions indicating the type. I suppose there might be
>     reasons why there's different types - hw may use different registers for
>     instance (whereas of course no one in their right mind would use
>     different registers for signed vs. unsigned ints), or there might be
>     some real cost of bitcasts (at least bypass delays are common for cpus
>     when transitioning values between int and float execution units). For
>     instance, it makes a difference with x86 sse if you use float or int
>     loads, which otherwise you couldn't indicate directly (llvm can optimize
>     this into the right kind of load nowadays even if you use the wrong kind
>     of variable for the load, e.g. int load then bitcast to float and do
>     some float op will change it into a float load, but this is IIRC
>     actually a pretty new ability, and possibly doesn't happen if you
>     disable enough optimization passes).
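
(A small C sketch of that load case - whether llvm folds this into a
float load depends, as said, on the llvm version and which passes run;
the function name is mine:)

#include <stdint.h>
#include <string.h>

float int_load_then_float_use(const int32_t *p)
{
   int32_t i = *p;           /* integer load                            */
   float f;
   memcpy(&f, &i, sizeof f); /* the "bitcast" from int to float         */
   return f + 1.0f;          /* float use; newer llvm can turn the      */
                             /* whole sequence into a single float load */
}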
> 
> 
> Having it for the purpose of register allocation makes sense in the CPU
> world.  In the GPU world, everything is typically designed float-first
> and I'm not aware of any hardware that has separate int and float registers
> like x86 does.  That said, hardware changes over time and it's entirely
> possible that someone will decide that a heterogeneous register file is
> a good idea on a GPU.  (Technically, most GPUs do have flag regs or
> accumulators or something but it's not as bad as x86.)  Out of
> curiosity, do you (or anyone else) know if LLVM actually uses them for
> that purpose?  I could see that information getting lost in the back-end
> and them using something else to choose registers.
> 
> --Jason

For llvm with x86 sse, I don't think a variable being characterized as
float or int via bitcast actually makes a difference whatsoever (at
least not with optimization). llvm determines whether a value is float
or int on its own (based on preceding / succeeding instructions). For
example, as you might know, sse has both float and int logic ops,
whereas llvm of course does not - but llvm will still use float logic
ops appropriately (when the value is coming out of / going into a
"true" float op), despite the fact that you have to cast the value to
int in llvm to do the logic op; see the sketch below. More interesting
examples are actually shuffles: again due to sse being quite horrid
there, some are only directly possible with ints and some with floats,
and some older llvm versions (before 3.9 or so) would never use a
shuffle from the wrong domain, even if it would have been advantageous
- you could not trick llvm into using it even with bitcasts. There's
still some gallivm code which tricks llvm into using the "correct"
shuffle by using a differently typed load instead, although I think
that code might be pointless now (if you want to look at it,
lp_bld_format_soa.c line 552...).
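
To make the logic op case concrete, a small C sketch (the function name
is mine, and the exact instruction selection of course depends on llvm
version and flags):

#include <stdint.h>
#include <string.h>

float fabs_via_int_logic(float x)
{
   uint32_t bits;
   memcpy(&bits, &x, sizeof bits); /* "bitcast" float -> int            */
   bits &= 0x7fffffffu;            /* integer AND clearing the sign bit */
   memcpy(&x, &bits, sizeof x);    /* "bitcast" int -> float            */
   return x;
}

With optimization, llvm on x86 will typically select the float-domain
logic op (andps) here, since the value comes out of / goes into float
context - even though the IR only ever sees an integer 'and'.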
So there can be differences (for better or worse...) depending on
values (vectors in this case) being characterized as float or int,
although that characterization might not really be based on how the
value was originally defined.

This is of course just x86 sse; gpus shouldn't be all that crazy
(though for instance I could imagine there being bypass delays for
int<->float transitions in some theoretical future gpu, even if the
register file is the same - but even in this case it's probably of
little consequence, as long as you don't have different float / int
instructions doing the same thing, which seems rather unlikely).

Roland

