[Mesa-dev] Proposal for a long-term shader compiler (and IR) architecture

Mon Oct 18 12:27:20 PDT 2010

On Mon, 2010-10-18 at 10:52 -0700, Keith Whitwell wrote:
> On Mon, Oct 18, 2010 at 9:18 AM, Jerome Glisse <j.glisse at gmail.com> wrote:
> > On Fri, Oct 15, 2010 at 7:44 PM, John Kessenich <johnk at lunarg.com> wrote:
> >> Hi,
> >> LunarG has decided to work on an open source, long-term, highly-functional,
> >> and modular shader and kernel compiler stack. Attached is our high-level
> >> proposal for this compiler architecture (LunarGLASS).  We would like to
> >> solicit feedback from the open source community on doing this.
> >> I have read several posts here where it seems the time has come for
> >> something like this, and in that spirit, I hope this is consistent with the
> >> desire and direction many contributors to this list have already alluded to.
> >> Perhaps the biggest point of the proposal is to standardize on LLVM as an
> >> intermediate representation.  This is actually done at two levels within the
> >> proposal; one at a high-level IR close to the source language and one at a
> >> low-level IR close to the target architecture.  The full picture is in the
> >> attached document.
> >> Based on feedback to this proposal, our next step is to more precisely
> >> define the two forms of LLVM IR.
> >> Please let me know if you have any trouble reading the attached, or any
> >> questions, or any feedback regarding the proposal.
> >> Thanks,
> >> JohnK

> > Just a quick reply (I won't have read through this proposal carefully for
> > a couple of weeks). Last time I checked, LLVM didn't seem to fit the bill for GPUs.
> > Newer GPUs can be seen as close to scalar but not completely: there are
> > restrictions on instruction packing and on the amount of data the computation
> > units of a GPU can access per cycle. Register allocation is also different
> > from a normal CPU; you don't want to do register peeling on a GPU. So from
> > my POV instruction scheduling & packing and register allocation are
> > interlaced processes (where you store a variable impacts instruction packing).
> > Also, on newer GPUs it makes sense to use a mixed scalar/vector representation
> > to preserve things like dot product.

LLVM has always been able to represent both scalars and vectors. Although
the dot product is not natively represented in the IR, one can perfectly
well define a dot product intrinsic which takes two vectors and returns a
scalar. I haven't looked at the backends, but I believe the same applies.
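
To make the intrinsic idea concrete, here is a minimal sketch in Python (not
LLVM code). The toy instruction format, the "dot" opcode name, and the
functions lower_dot/select are all hypothetical, made up for illustration.
The point is only that a dot product can stay a single vector-in, scalar-out
operation in the IR, and be expanded to scalar multiplies and adds only for
targets that lack a native instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str      # hypothetical opcodes: "dot", "extract", "mul", "add"
    dest: str    # name of the value this instruction defines
    args: tuple  # operand names (or a lane index for "extract")


def lower_dot(instr, width):
    """Expand a 'dot' instruction into per-lane scalar operations."""
    assert instr.op == "dot" and width >= 2
    a, b = instr.args
    out, products = [], []
    for i in range(width):
        ai, bi, pi = (f"{instr.dest}.{k}{i}" for k in "abp")
        out.append(Instr("extract", ai, (a, i)))
        out.append(Instr("extract", bi, (b, i)))
        out.append(Instr("mul", pi, (ai, bi)))
        products.append(pi)
    acc = products[0]
    for i, p in enumerate(products[1:], start=1):
        # The final add writes to the original destination name.
        nxt = instr.dest if i == width - 1 else f"{instr.dest}.s{i}"
        out.append(Instr("add", nxt, (acc, p)))
        acc = nxt
    return out


def select(instr, width, has_native_dot):
    """Keep 'dot' intact when the target supports it, otherwise lower it."""
    if instr.op == "dot" and not has_native_dot:
        return lower_dot(instr, width)
    return [instr]
```

For example, select(Instr("dot", "r", ("v0", "v1")), 4, True) returns the
instruction unchanged, while passing False yields four multiplies and three
adds that end in a definition of "r".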

> > Lastly, loops, jumps, and functions have kind
> > of unusual restrictions unlike any CPU (though I don't have broad CPU knowledge).
> >
> > Bottom line is I don't think LLVM is anywhere near what would help us.
>
> I think this is the big question mark with this proposal -- basically
> can it be done?

I also think there are indeed challenges in translating LLVM IR to
something like TGSI or Mesa IR, and I was skeptical about standardizing on
LLVM IR for quite some time. But lately I've been reaching the
conclusion that there's so much momentum behind LLVM that the
benefits/synergy one gets by leveraging it will most likely outweigh the
pitfalls.

But I never felt much skepticism about GPU code generation. There is, e.g.,
an LLVM PTX backend already out there. And if it's not easy to make an
LLVM backend for a particular GPU, then it should at the very least be
possible to implement an LLVM backend that generates code in a
representation very close to the GPU code, and do the final steps (e.g.,
register allocation, scheduling, etc.) in a custom pass, thereby
benefiting from all the high-level optimizations that happened before.
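
The split I have in mind can be sketched roughly as follows, again in Python
rather than real LLVM code. Everything here is an assumption for
illustration: the (op, dest, a, b) tuple format, the pass names, and the
very naive register assignment. A generic pass runs first and knows nothing
about the target; the target-specific last step then works on what is left.

```python
def fold_constants(prog):
    """Generic pass: fold add/mul of two literals and propagate the result."""
    known, out = {}, []
    for op, dest, a, b in prog:
        a, b = known.get(a, a), known.get(b, b)
        both_literal = all(isinstance(x, (int, float)) for x in (a, b))
        if op in ("add", "mul") and both_literal:
            known[dest] = a + b if op == "add" else a * b
        else:
            out.append((op, dest, a, b))
    return out


def assign_registers(prog, num_regs):
    """Custom final step: map virtual names to a fixed set of registers,
    reusing a register once the value it holds has had its last use."""
    last_use = {}
    for idx, (_, _, a, b) in enumerate(prog):
        for x in (a, b):
            if isinstance(x, str):
                last_use[x] = idx
    free = [f"R{i}" for i in reversed(range(num_regs))]  # pop() gives R0 first
    loc, out = {}, []
    for idx, (op, dest, a, b) in enumerate(prog):
        # Names without a register (e.g. shader inputs) are left as-is.
        srcs = [loc.get(x, x) if isinstance(x, str) else x for x in (a, b)]
        for x in (a, b):
            if isinstance(x, str) and x in loc and last_use[x] == idx:
                free.append(loc.pop(x))
        if not free:
            raise RuntimeError("out of registers in this toy allocator")
        loc[dest] = free.pop()
        out.append((op, loc[dest], *srcs))
    return out


def compile_shader(prog, num_regs):
    """Run the generic optimization first, then the target-specific step."""
    return assign_registers(fold_constants(prog), num_regs)
```

A real GPU backend would interleave scheduling, packing and allocation as
Jerome describes; the sketch only shows that the custom step can consume
already-optimized code.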

> If it can't be done, we'll find out quickly, if it can then we can
> stop debating whether or not it's possible.

Indeed.

Jose