[Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

Tom Stellard tom at stellard.net
Tue Aug 19 15:57:31 PDT 2014


On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
> On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> > Tom Stellard <tom at stellard.net> writes:
> >
> >> On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
> >>> On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer <michel at daenzer.net> wrote:
> >>> > On 19.08.2014 01:28, Connor Abbott wrote:
> >>> >> On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer <michel at daenzer.net> wrote:
> >>> >>> On 16.08.2014 09:12, Connor Abbott wrote:
> >>> >>>> I know what you might be thinking right now. "Wait, *another* IR? Don't
> >>> >>>> we already have like 5 of those, not counting all the driver-specific
> >>> >>>> ones? Isn't this stuff complicated enough already?" Well, there are some
> >>> >>>> pretty good reasons to start afresh (again...). In the years we've been
> >>> >>>> using GLSL IR, we've come to realize that, in fact, it's not what we
> >>> >>>> want *at all* to do optimizations on.
> >>> >>>
> >>> >>> Did you evaluate using LLVM IR instead of inventing yet another one?
> >>> >>>
> >>> >>>
> >>> >>> --
> >>> >>> Earthling Michel Dänzer            |                  http://www.amd.com
> >>> >>> Libre software enthusiast          |                Mesa and X developer
> >>> >>
> >>> >> Yes. See
> >>> >>
> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
> >>> >>
> >>> >> and
> >>> >>
> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
> >>> >
> >>> > I know Ian can't deal with LLVM for some reason. I was wondering if
> >>> > *you* evaluated it, and if so, why you rejected it.
> >>> >
> >>> >
> >>>
> >>>
> >>> Well, first of all, the fact that Ian and Ken don't want to use it
> >>> means that any plan to use LLVM for the Intel driver is dead in the
> >>> water anyways. You can translate NIR into LLVM if you want, but for
> >>> i965 we want to share optimizations between our 2 backends (FS and
> >>> vec4) that we can't do today in GLSL IR, so this is what we want to
> >>> use for that. And since nobody else does anything with the core GLSL
> >>> compiler except when they have to, once we start moving things out of
> >>> GLSL IR this will probably replace GLSL IR as the infrastructure that
> >>> all Mesa drivers use. But with that in mind, here are a few reasons
> >>> why we wouldn't want to use LLVM:
> >>>
> >>> * LLVM wasn't built to understand structured CFGs, meaning that you
> >>> need to re-structurize the code using a pass that's fragile and prone
> >>> to break if some other pass "optimizes" the shader in a way that makes
> >>> it non-structured (i.e. not expressible in terms of loops and if
> >>> statements). This loss of information also means that passes that need
> >>> to know, for example, the loop nesting depth have to run a separate
> >>> analysis pass, whereas with NIR you can just walk up the control flow
> >>> tree and count the number of loops you hit.
> >>>
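To make that last point concrete, here is roughly what the walk looks like,
as a sketch against a hypothetical control-flow-tree API. The node and type
names below are illustrative only, not the exact ones from the NIR patches:

    /* Hypothetical control-flow-tree node: every block hangs off an if,
     * a loop or the function itself, and every node knows its parent.
     * The names only mirror the shape described above. */
    enum cf_node_type {
       CF_NODE_BLOCK,
       CF_NODE_IF,
       CF_NODE_LOOP,
       CF_NODE_FUNCTION
    };

    struct cf_node {
       enum cf_node_type type;
       struct cf_node *parent;
    };

    /* The loop nesting depth is just the number of loop nodes between a
     * node and the function -- no separate analysis pass, and nothing to
     * invalidate when the tree changes. */
    static unsigned
    loop_depth(const struct cf_node *node)
    {
       unsigned depth = 0;
       for (; node != NULL; node = node->parent) {
          if (node->type == CF_NODE_LOOP)
             depth++;
       }
       return depth;
    }

Because ifs and loops are nodes in the tree rather than something
reconstructed from branches, the depth falls out of a plain parent walk.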
> >>
> >> LLVM has a pass to structurize the CFG.  We use it in the radeon
> >> drivers, and it is run after all of the other LLVM optimizations, which
> >> have no concept of a structured CFG.  It's not bug-free, but it works
> >> really well even with all of the complex OpenCL kernels we throw at it.
> >>
> >> Your point about losing information when the CFG is de-structurized is
> >> valid, but for things like loop depth, I'm not sure why we couldn't
> >> write an LLVM analysis pass for this (if one doesn't already exist).
> >>
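For loop nesting depth in particular, LLVM already ships such an analysis:
LoopInfo. A minimal sketch of querying it from a function pass, against the
LLVM ~3.5-era API (newer releases wrap the analysis in LoopInfoWrapperPass):

    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/IR/Function.h"
    #include "llvm/Pass.h"

    using namespace llvm;

    namespace {
    /* Toy pass that just queries the nesting depth of every block. */
    struct LoopDepthQuery : public FunctionPass {
       static char ID;
       LoopDepthQuery() : FunctionPass(ID) {}

       void getAnalysisUsage(AnalysisUsage &AU) const override {
          AU.addRequired<LoopInfo>();  /* let the pass manager compute it */
          AU.setPreservesAll();
       }

       bool runOnFunction(Function &F) override {
          LoopInfo &LI = getAnalysis<LoopInfo>();
          for (BasicBlock &BB : F) {
             unsigned depth = LI.getLoopDepth(&BB); /* 0 if not in a loop */
             (void)depth;
          }
          return false;  /* analysis only, the IR is untouched */
       }
    };
    }

    char LoopDepthQuery::ID = 0;

The cost is that the analysis has to be recomputed or preserved as passes
mutate the CFG, rather than being implicit in the IR's structure.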
> >
> > I don't think this is such a big deal either.  At least the
> > structurization pass used on newer AMD hardware isn't "fragile" in the
> > way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
> > algorithm) it's guaranteed to give you a valid structurized output no
> > matter what the previous optimization passes have done to the CFG,
> > modulo bugs.  I admit that the situation is nevertheless suboptimal.
> > Ideally this information wouldn't get lost along the way.  For the long
> > term we may want to represent structured control flow directly in the IR
> > as you say, I just don't see how reinventing the IR saves us any work if
> > we could just fix the existing one.
> 
> It seems to me that something like how we represent control flow is a
> pretty fundamental part of the IR - it affects any optimization pass
> that needs to do anything beyond adding and removing instructions. How
> would you fix that, especially given that LLVM is primarily designed
> for CPUs, where you don't want to be restricted to structured control
> flow at all? It seems like our goals (preserve the structure) conflict
> with the way LLVM has been designed.
> 

I think it's important to distinguish between LLVM IR and the tools
available to manipulate it.  LLVM IR is meant to be a platform-independent
program representation.  There is nothing about the IR itself that would
prevent someone from using it for hardware that requires structured
control flow.

The tools (mainly the optimization passes) are where decisions about
things like preserving structured control flow are made.  There are
currently two strategies available for using the tools to produce programs
with structured control flow:

1. Use the CFG structurizer pass (a rough sketch follows below).

2. Only use transforms that maintain the structure of the control flow.
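As a sketch of strategy 1 (the pass name is the one the radeon backends
use today; headers and pass-manager details vary by LLVM version):

    #include "llvm/IR/Module.h"
    #include "llvm/PassManager.h"
    #include "llvm/Transforms/Scalar.h"

    /* Run whatever generic optimizations we like -- none of them know or
     * care about structure -- and then, as the very last step, rewrite
     * the CFG back into loop and if/else regions for the backend. */
    static void optimize_structured(llvm::Module &M)
    {
       llvm::PassManager PM;
       /* ... generic optimization passes go here ... */
       PM.add(llvm::createStructurizeCFGPass());
       PM.run(M);
    }

Strategy 2 is what you get by leaving that last pass out and restricting
the pipeline to transforms that never break the loop/if structure.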

-Tom

> >
> >>> * LLVM doesn't have source/destination modifiers, meaning that we
> >>> can't do optimizations like "clamp(x, 0.0, 1.0) => mov.sat x" and
> >>> "clamp(x, 0.25, 1.0) => max.sat(x, 0.25)" in a generic fashion.
> >>>
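To illustrate the kind of fold this is about, here is a sketch over a
hypothetical ALU instruction with a destination saturate flag; the opcodes
and fields are made up for illustration and are not the actual NIR types:

    enum alu_op { OP_FMOV, OP_FMAX, OP_FMIN };

    /* Hypothetical SSA ALU instruction with a destination saturate
     * modifier and a possibly-constant second source. */
    struct alu_instr {
       enum alu_op op;
       struct alu_instr *src0_def;   /* instruction defining source 0 */
       bool src1_is_const;
       float src1_const;
       bool dest_saturate;
    };

    /* Fold fmin(fmax(x, lo), 1.0) -- i.e. clamp(x, lo, 1.0) -- into a
     * saturated fmax, or a saturated move when lo == 0.0.  The old fmax
     * simply goes dead and gets cleaned up later. */
    static bool
    fold_saturate(struct alu_instr *instr)
    {
       if (instr->op != OP_FMIN || !instr->src1_is_const ||
           instr->src1_const != 1.0f)
          return false;

       struct alu_instr *max_instr = instr->src0_def;
       if (max_instr == NULL || max_instr->op != OP_FMAX ||
           !max_instr->src1_is_const)
          return false;

       instr->src0_def = max_instr->src0_def;  /* x */
       if (max_instr->src1_const == 0.0f) {
          instr->op = OP_FMOV;        /* clamp(x, 0, 1) => mov.sat x */
          instr->src1_is_const = false;
       } else {
          instr->op = OP_FMAX;        /* clamp(x, lo, 1) => max.sat(x, lo) */
          instr->src1_const = max_instr->src1_const;
       }
       instr->dest_saturate = true;
       return true;
    }

The point being made is that the rewrite touches only the generic IR; no
individual backend has to pattern-match the fmin/fmax pair itself.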
> >>
> >> The way to handle this with LLVM would be to add intrinsics to represent
> >> the various modifiers and then fold them into instructions during
> >> instruction selection.
> >>
> >
> > IMHO this is a feature.  One of the things I don't like about NIR is
> > that it's still vec4-centric.  Most drivers are going to want something
> > else, and something different from each other; we cannot please all of
> > them with one single vector addressing model built into the core
> > instruction set.  So I'd rather have modifiers, writemasks and swizzles
> > represented as the composition of separate instructions/intrinsics with
> > simple and well-defined semantics, which can be coalesced back into the
> > real instruction as Tom says (easy even if you don't use LLVM's
> > instruction selector, as long as you're in SSA form).
> 
> While NIR is vec4-centric, nothing's stopping you from splitting up
> instructions and doing optimizations at the scalar level for scalar
> ISAs - in fact, that's what I expect to happen. And for backends that
> really do need to have swizzles and writemasks, coalescing these
> things back into the original instruction is not at all trivial - in
> fact, going into and out of SSA without introducing extra copies even
> in situations like:
> 
> foo.xyz = ...
> ... = foo
> foo.x = ...
> 
> is a problem that hasn't been solved publicly yet (it seems doable,
> but difficult). So while we might not need swizzles and writemasks for
> most backends, for the few that do need them (like, for example, the
> i965 vec4 backend) it will be very nice to have one common lowering
> pass that solves this hard problem, which would be impossible to do
> without having swizzles and writemasks in the IR. And it's very likely
> that these backends, which probably aren't using SSA due to the
> aforementioned difficulties, will also benefit from having modifiers
> already folded for them - this is already a problem for the i965 vec4
> backend, and it's something NIR will help with a lot.
> 
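To spell out the foo.xyz example above: a straightforward SSA conversion of
the partial write ends up looking something like this (illustrative syntax):

    foo_1.xyz = ...
    ...       = foo_1
    foo_2.x   = ...
    foo_3     = vec4(foo_2.x, foo_1.y, foo_1.z, foo_1.w)

Getting back down to the original three instructions means proving, per
component, that foo_1, foo_2 and foo_3 can all share one register, which is
exactly the coalescing problem described above.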
> >
> >>> * LLVM is hard to embed into other projects, especially if it's used
> >>> as anything but a command-line tool that only runs once. See, for
> >>> example, http://blog.llvm.org/2014/07/ftl-webkits-llvm-based-jit.html
> >>> under "Linking WebKit with LLVM" - most of those problems would also
> >>> apply to us.
> >>>
> >>
> >> You have to keep in mind that the way WebKit uses LLVM is totally
> >> different from how Mesa would use LLVM if LLVM IR were adopted as a
> >> common IR.
> >>
> >> WebKit is using LLVM as a full JIT compiler, which means it depends
> >> on almost all of the pieces of the LLVM stack: the IR manipulation,
> >> the optimization passes, one or more of the codegen backends, as well
> >> as the entire JIT layer.  The JIT layer in particular is missing a lot
> >> of functionality in the C API, which makes it more difficult to work
> >> with.
> >>
> >> If Mesa were to adopt LLVM IR as a common IR, the only LLVM library
> >> functionality it would need would be the IR manipulation and the
> >> optimization passes.
> >>
> >>> * LLVM is on a different release schedule (6 months vs. 3 months),
> >>> has a different review process, and so on, which means that to add
> >>> support for new functionality that involves shaders, we now have to
> >>> submit patches to two separate projects. Then, 2 months later when we
> >>> ship Mesa, it turns out that nobody can actually use the new feature
> >>> because it depends on an unreleased version of LLVM that won't be
> >>> released for another 3 months and packaged by distros even later.
> >>> We've already had problems where distros refused to ship newer Mesa
> >>> releases because radeon depended on a version of LLVM newer than the
> >>> one they were shipping, and if we started using LLVM in core Mesa it
> >>> would get even worse. Proprietary drivers solve this problem by
> >>> forking LLVM, building it with the rest of their driver, and linking
> >>> it in as a static library, but distro packagers would hate us if we
> >>> did that.
> >>>
> >>
> >> If Mesa were using LLVM IR as a common IR, I'm not sure what features
> >> in Mesa would be tied to new additions in LLVM.  As I said before,
> >> all Mesa would be using would be the IR manipulation and the
> >> optimization passes.  The IR manipulation layer only requires new
> >> features when something new is added to the LLVM IR specification,
> >> which is rare.  It's possible there could be some lag in new features
> >> that go into the optimization passes, but if there were some
> >> optimization that was deemed really critical, it could be implemented
> >> in Mesa using the IR manipulation APIs.
> >>
> >> -Tom
> >>
> >>> I wouldn't completely rule out LLVM, and I do think they do a lot of
> >>> things right, but for now it seems like it's not the path that the
> >>> Intel team wants to take.
> >>>
> >>> Connor

