[Mesa-dev] Flatland

Tom Stellard tom at stellard.net
Fri Feb 7 10:12:23 PST 2014


On Fri, Feb 07, 2014 at 10:49:01AM -0500, Alex Deucher wrote:
> On Fri, Feb 7, 2014 at 12:34 AM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> > Hi,
> >
> > So I believe that we can all agree that the tree-based representation
> > that GLSL IR currently uses for shaders needs to go. For the benefit
> > of those that didn't watch Ian Romanick's talk at FOSDEM, I'll
> > reiterate some of the problems with it as of now:
> >
> > - All the ir_dereference chains blow up the memory usage, and the
> > constant pointer chasing in the recursive algorithms needed to handle
> > them is not just cache-unfriendly but "cache-mean."
> >
> > - The ir_hierarchical_visitor pattern that we currently use for
> > optimization/analysis passes has to examine every piece of IR, even
> > the irrelevant stuff, making the above problems even worse.
> >
> > - Nobody else does it this way, meaning that the existing well-known
> > optimizations don't apply as much here, and oftentimes we have to
> > write some pretty nasty code in order to make necessary optimizations
> > (like tree grafting).
> >
> > - It turns out that the original advantage of a tree-based IR, to be
> > able to automatically generate pattern-matching code for optimizing
> > certain code patterns, only really matters for CPUs with weird
> > instruction sets with lots of exotic instructions; GPUs tend to be
> > pretty regular and consistent in their ISAs, so being able to
> > pattern-match with trees doesn't help us much here.
> >
> > Finally, it seems like a lot of important SSA-based passes assume that
> > we have a flat IR, and so moving to SSA won't be nearly as beneficial
> > as we would like it to be; we could flatten the IR before doing these
> > passes, but that would make the first problem even worse. So we can't
> > really take advantage of SSA too much either until we have a flat IR.
> >
> > The real issue is, how do we let this transition occur gradually, in
> > pieces, without breaking existing code? Ian proposed one solution at
> > FOSDEM, but here's my idea of another.
> >
> > So, my idea is that rather than slowly introducing changes across the
> > board, we create the IR in its final form in the beginning, write
> > passes to flatten and unflatten the IR, and then piece-by-piece
> > rewrite the rest of the compiler. We're going to have to rewrite a lot
> > of the passes to support SSA in the first place, so why not convert
> > them to a flat IR while we're at it? The benefit of this is that it's
> > much easier to do asynchronously and in parallel; rather than
> > introducing changes to the entire thing at once, several people can
> > convert this and that pass, the frontend, the linker, etc.
> > independently. It would entail some extra overhead during the
> > transition in the form of the flattening and unflattening passes, but
> > I think it would be worth it for the immediate benefits (optimizations
> > like GVN-GCM and CSE made possible, etc.).
> >
> > The first part to be converted would be my passes to convert to and
> > from SSA, so that the compiler optimization part would look like this:
> >
> > flatten -> convert to SSA -> (the new hotness) -> out of SSA ->
> > unflatten -> (the old stuff)
> >
> > Then we gradually convert ast_to_hir, various passes, the linker,
> > backends, etc. to this form while now actually having the
> > infrastructure to implement any advanced compiler optimization
> > designed in the last ~15 years or so by more-or-less copying down the
> > pseudocode. Hopefully, then, we can reach a point where we can rip out
> > the old IR and the converters.
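
To make the "flatten" step in the pipeline above a bit more concrete,
here is a minimal sketch of turning an expression tree into a flat,
register-based instruction list. Everything in it (TreeNode, FlatInstr,
flatten, leaf, binop) is a hypothetical stand-in for illustration, not
the classes proposed in this thread, and it assumes each instruction
defines exactly one register, so the register number is simply the
instruction's index.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical tree node: a variable leaf ("var") or a binary operation.
struct TreeNode {
    std::string op;                     // "var", "add", "mul", ...
    std::string name;                   // variable name when op == "var"
    std::unique_ptr<TreeNode> lhs, rhs; // operands of a binary operation
};

// Hypothetical flat instruction: one destination register, register sources.
struct FlatInstr {
    unsigned dest;
    std::string op;                     // "load_var" or the tree's opcode
    std::string var;                    // variable name for "load_var"
    std::vector<unsigned> srcs;         // source register numbers
};

// Post-order walk: emit the operands first, then the operation itself.
// Returns the register that holds the node's value.
unsigned flatten(const TreeNode *node, std::vector<FlatInstr> &out)
{
    assert(node != nullptr);
    if (node->op == "var") {
        unsigned reg = static_cast<unsigned>(out.size());
        out.push_back({reg, "load_var", node->name, {}});
        return reg;
    }
    unsigned a = flatten(node->lhs.get(), out);
    unsigned b = flatten(node->rhs.get(), out);
    unsigned reg = static_cast<unsigned>(out.size());
    out.push_back({reg, node->op, "", {a, b}});
    return reg;
}

// Small helpers for building example trees.
std::unique_ptr<TreeNode> leaf(const std::string &name)
{
    std::unique_ptr<TreeNode> n(new TreeNode());
    n->op = "var";
    n->name = name;
    return n;
}

std::unique_ptr<TreeNode> binop(const std::string &op,
                                std::unique_ptr<TreeNode> l,
                                std::unique_ptr<TreeNode> r)
{
    std::unique_ptr<TreeNode> n(new TreeNode());
    n->op = op;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}
```

For (a + b) * c, built as binop("mul", binop("add", leaf("a"),
leaf("b")), leaf("c")), this emits five instructions: two loads, the
add, a third load, and the mul reading registers 2 and 3. An unflatten
pass would walk the list in the other direction and rebuild the tree
from each register's defining instruction.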
> >
> > So what would this new IR look like? Well, here's my 2 cents (in the
> > form of some abridged class definitions, you should get the point...)
> >
> > struct ir_calc_source
> > {
> >     mode; /** < SSA or non-SSA */
> >     union {
> >         ir_calculation *def; /** < for SSA sources */
> >         unsigned int reg; /** < for non-SSA sources */
> >     } src;
> >     unsigned swizzle : 8;
> > };
> >
> > struct ir_calc_dest
> > {
> >     mode; /** < SSA or non-SSA */
> >     union {
> >         unsigned int reg; /** < for non-SSA destinations */
> >
> >         /**
> >          * For SSA destinations. Types are needed here because
> >          * normally they're part of the register, but SSA doesn't
> >          * have registers.
> >          */
> >         glsl_type *type;
> >     } reg_or_type;
> >     /* The name reg_or_type is kinda ugly, but I couldn't think of
> >        anything better. */
> > };
> >
> > /*
> >  * This is Ian's name for it; personally I would vote for
> >  * s/ir_instruction/ir_node/ and call this ir_instruction.
> >  */
> >
> > class ir_calculation
> > {
> >     ir_calc_dest dest;
> >     ir_expression_operation op;
> >     unsigned write_mask : 4;
> >     ir_calc_source srcs[4];
> > };
> >
> > class ir_load_var
> > {
> >     ir_calc_dest dest;
> >     ir_variable *src;
> >
> >     /**
> >      * For array and record loads, whether we're loading a specific
> >      * member or the whole thing.
> >      */
> >     bool deref_member;
> >     /** For array loads, if deref_member is true. */
> >     ir_calc_source array_index;
> >     char *record_index; /** < for structure loads */
> > };
> >
> > class ir_store_var
> > {
> >     ir_variable *dest;
> >     ir_calc_source src;
> >     bool deref_member;
> >     ir_calc_source array_index; /** < for array stores */
> >     char *record_index; /** < for structure stores */
> >     unsigned write_mask : 4;
> > };
> >
> > So ir_variable still exists, but it will only be used for function
> > parameters, shader in/outs and uniforms, and arrays and structures.
> > Registers will be much more lightweight, only requiring a table with
> > each register's type and perhaps uses and definitions. The flattening
> > pass, and later ast_to_hir, will emit loads and stores wherever there
> > is an ir_dereference now, but there will be an ir_variable -> register
> > pass that converts these to moves that will later be eliminated by
> > copy propagation (in SSA form, after converting the registers to SSA
> > writes). This is similar to how LLVM works, with everything starting
> > out allocated on the stack using alloca (equivalent to ir_variables
> > here) and accessed explicitly using loads and stores, but then some of
> > these loads/stores are optimized out.
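
As a rough illustration of the last step described above (moves being
eliminated by copy propagation once everything is in SSA form), here is
a minimal sketch. It is not the pass from this proposal: SsaInstr and
copy_propagate are hypothetical names, and it assumes straight-line code
with no phi nodes, where each SSA value is identified by the index of
the instruction that defines it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat SSA instruction. The value it defines is named by
// its own index in the instruction list.
struct SsaInstr {
    std::string op;                 // "load_var", "mov", "add", ...
    std::vector<std::size_t> srcs;  // indices of the defining instructions
};

// Rewrites every use of a "mov" result to use the mov's source instead.
// Returns a flag per instruction saying whether it is now dead (all movs
// are, because every later use has been redirected past them).
std::vector<bool> copy_propagate(std::vector<SsaInstr> &code)
{
    std::vector<bool> dead(code.size(), false);
    for (std::size_t i = 0; i < code.size(); ++i) {
        for (std::size_t &src : code[i].srcs) {
            // Definitions precede uses in straight-line SSA, so any mov
            // we reach here has already had its own source resolved.
            assert(src < i);
            if (code[src].op == "mov") {
                assert(!code[src].srcs.empty());
                src = code[src].srcs[0];
            }
        }
        dead[i] = (code[i].op == "mov");
    }
    return dead;
}
```

Because sources are resolved in order, a chain of moves collapses in a
single pass: an add that reads a mov of a mov of a load ends up reading
the load directly, and both moves are flagged dead. A real pass would
also have to handle control flow and phi nodes, which this sketch
ignores.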
> >
> 
> What about just moving to LLVM directly?  We already use it for
> compute/OpenCL on gallium and as the shader compiler for radeon
> hardware and llvmpipe.
> 

Vincent Lejeune wrote a GLSL IR to LLVM IR state tracker a while back:
http://cgit.freedesktop.org/~vlj/mesa/log/?h=glsl-to-llvm-05nov

It might be worth looking at if anyone is considering using LLVM.

-Tom

