[Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

Wed Aug 20 12:26:15 PDT 2014

On Wed, Aug 20, 2014 at 12:17 PM, Tom Stellard <tom at stellard.net> wrote:
> On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
>> On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard <tom at stellard.net> wrote:
>> > On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
>> >> On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>> >> > Tom Stellard <tom at stellard.net> writes:
>> >> >
>> >> >> On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
>> >> >>> On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer <michel at daenzer.net> wrote:
>> >> >>> > On 19.08.2014 01:28, Connor Abbott wrote:
>> >> >>> >> On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer <michel at daenzer.net> wrote:
>> >> >>> >>> On 16.08.2014 09:12, Connor Abbott wrote:
>> >> >>> >>>> I know what you might be thinking right now. "Wait, *another* IR? Don't
>> >> >>> >>>> we already have like 5 of those, not counting all the driver-specific
>> >> >>> >>>> ones? Isn't this stuff complicated enough already?" Well, there are some
>> >> >>> >>>> pretty good reasons to start afresh (again...). In the years we've been
>> >> >>> >>>> using GLSL IR, we've come to realize that, in fact, it's not what we
>> >> >>> >>>> want *at all* to do optimizations on.
>> >> >>> >>>
>> >> >>> >>> Did you evaluate using LLVM IR instead of inventing yet another one?
>> >> >>> >>>
>> >> >>> >>>
>> >> >>> >>> --
>> >> >>> >>> Earthling Michel Dänzer            |                  http://www.amd.com
>> >> >>> >>> Libre software enthusiast          |                Mesa and X developer
>> >> >>> >>
>> >> >>> >> Yes. See
>> >> >>> >>
>> >> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
>> >> >>> >>
>> >> >>> >> and
>> >> >>> >>
>> >> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
>> >> >>> >
>> >> >>> > I know Ian can't deal with LLVM for some reason. I was wondering if
>> >> >>> > *you* evaluated it, and if so, why you rejected it.
>> >> >>> >
>> >> >>>
>> >> >>>
>> >> >>> Well, first of all, the fact that Ian and Ken don't want to use it
>> >> >>> means that any plan to use LLVM for the Intel driver is dead in the
>> >> >>> water anyway. You can translate NIR into LLVM if you want, but for
>> >> >>> i965 we want to share optimizations between our 2 backends (FS and
>> >> >>> vec4) that we can't do today in GLSL IR, so this is what we want to
>> >> >>> use for that. And since nobody else does anything with the core GLSL
>> >> >>> compiler except when they have to, when we start moving things out of
>> >> >>> GLSL IR this will probably replace GLSL IR as the infrastructure that
>> >> >>> all Mesa drivers use. But with that in mind, here are a few reasons
>> >> >>> why we wouldn't want to use LLVM:
>> >> >>>
>> >> >>> * LLVM wasn't built to understand structured CFGs, meaning that you
>> >> >>> need to re-structurize it using a pass that's fragile and prone to
>> >> >>> break if some other pass "optimizes" the shader in a way that makes it
>> >> >>> non-structured (i.e. not expressible in terms of loops and if
>> >> >>> statements). This loss of information also means that passes that need
>> >> >>> to know things like, for example, the loop nesting depth need to do an
>> >> >>> analysis pass whereas with NIR you can just walk up the control flow
>> >> >>> tree and count the number of loops we hit.
>> >> >>>
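To make the loop nesting depth point concrete, here is a minimal sketch of a structured control flow tree where depth is found by walking parent links. The Node class and the loop_depth helper are hypothetical names for illustration only; they are not NIR's actual data structures.

```python
class Node:
    """One node of a toy control flow tree (hypothetical, not NIR's layout)."""

    def __init__(self, kind, parent=None):
        self.kind = kind          # "function", "loop", "if", or "block"
        self.parent = parent
        self.children = []

    def add(self, kind):
        """Append a child node of the given kind and return it."""
        child = Node(kind, parent=self)
        self.children.append(child)
        return child


def loop_depth(node):
    """Count the enclosing loops by walking up the parent links."""
    depth = 0
    cur = node.parent
    while cur is not None:
        if cur.kind == "loop":
            depth += 1
        cur = cur.parent
    return depth


# function { loop { if { loop { block } } } block }
func = Node("function")
outer = func.add("loop")
cond = outer.add("if")
inner = cond.add("loop")
body = inner.add("block")
tail = func.add("block")
```

No separate analysis is needed here because the nesting is stored in the tree itself. With a flat list of basic blocks, the same answer would have to be recovered by a loop analysis.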
>> >> >>
>> >> >> LLVM has a pass to structurize the CFG.  We use it in the radeon
>> >> >> drivers, and it is run after all of the other LLVM optimizations which have
>> >> >> no concept of structured CFG.  It's not bug free, but it works really
>> >> >> well even with all of the complex OpenCL kernels we throw at it.
>> >> >>
>> >> >> Your point about losing information when the CFG is de-structurized is
>> >> >> valid, but for things like loop depth, I'm not sure why we couldn't write an
>> >> >> LLVM analysis pass for this (if one doesn't already exist).
>> >> >>
>> >> >
>> >> > I don't think this is such a big deal either.  At least the
>> >> > structurization pass used on newer AMD hardware isn't "fragile" in the
>> >> > way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
>> >> > algorithm) it's guaranteed to give you a valid structurized output no
>> >> > matter what the previous optimization passes have done to the CFG,
>> >> > modulo bugs.  I admit that the situation is nevertheless suboptimal.
>> >> > Ideally this information wouldn't get lost along the way.  For the long
>> >> > term we may want to represent structured control flow directly in the IR
>> >> > as you say, I just don't see how reinventing the IR saves us any work if
>> >> > we could just fix the existing one.
>> >>
>> >> It seems to me that something like how we represent control flow is a
>> >> pretty fundamental part of the IR - it affects any optimization pass
>> >> that needs to do anything beyond adding and removing instructions. How
>> >> would you fix that, especially given that LLVM is primarily designed
>> >> for CPUs where you don't want to be restricted to structured control
>> >> flow at all? It seems like our goals (preserve the structure) conflict
>> >> with the way LLVM has been designed.
>> >>
>> >
>> > I think it's important to distinguish between LLVM IR and the tools
>> > available to manipulate it.  LLVM IR is meant to be a platform
>> > independent program representation.  There is nothing about the IR that
>> > would prevent someone from using it for hardware that required structured
>> > control flow.
>>
>> Right - when I said that structured control flow was a fundamental
>> part of the IR, I meant that in the sense that it's a constraint that
>> all optimization passes have to follow. I was also thinking of NIR,
>> where it actually is a fundamental part of the IR datastructures - all
>> control flow consists of a tree of loops, if statements, and basic
>> blocks and there are no jump statements in the IR except for break,
>> continue, and return. There are helpers to mutate the control flow
>> tree (adding an if after an instruction, deleting a loop, etc.) so
>> that you can more or less pretend you're operating on something like
>> GLSL IR, while the CFG is being updated for you, basic blocks are
>> being created and deleted, etc.
>>
>> >
>> > The tools (mainly the optimization passes) are where decisions about
>> > things like preserving structured control flow are made.  There are
>> > currently two strategies available for using the tools to produce programs
>> > with structured control flow:
>> >
>> > 1. Use the CFG structurizer pass
>> >
>> > 2. Only use transforms that maintain the structure of the control flow.
>>
>> I'm a little confused about how this strategy would work. I'm assuming
>> that the control flow structure (i.e. the tree of loops and ifs) is
>> stored in some kind of metadata or fake instruction on top of the IR -
>> I haven't looked into this much, so correct me if I'm wrong. If so,
>> wouldn't you still have to make every optimization pass that touches
>> the CFG properly update that metadata to avoid it going stale, since
>> the optimizations themselves are operating on a list of basic blocks
>> which is a little lower-level?
>>
>
> There is no CFG metadata.  If you want to collect some information about the
> CFG, you would use an analysis pass to do this.  For example, LLVM has an
> analysis pass for computing the dominator tree.  If an optimization
> wants to use this analysis it would add this analysis as a pass dependency
> and then LLVM would run the dominator tree analysis before the optimization pass.
>
> Once the analysis has been run, the result is cached for other passes to use.
> However, the base assumption is that optimization passes invalidate
> all analysis information, so passes are required to report which analysis passes
> or which features of the program are preserved.  So, if a pass reports
> that it preserves the CFG, then the dominator tree analysis is still considered
> valid.
>
> This is a high-level overview of how it works, but to get back to your question,
> if you wanted to use strategy number 2, you could just choose to only run
> optimizations that preserved the CFG.
>
> -Tom

Ah, I see, that makes sense. That does seem like a pretty poor
solution though, since never being able to change the CFG is a
harsh restriction.
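For readers following along, the caching and invalidation model Tom describes can be sketched roughly as below. This is a toy model, not LLVM's API: PassManager, Pass, cfg_edges and the two example passes are all made-up names, and the "analysis" is just a sorted edge list standing in for something like a dominator tree.

```python
from collections import namedtuple

# Hypothetical pass description: the analyses it needs and the ones it keeps valid.
Pass = namedtuple("Pass", ["name", "requires", "preserves", "run"])


class PassManager:
    def __init__(self, analyses):
        self.analyses = analyses  # analysis name -> function(program) -> result
        self.cache = {}           # analysis name -> cached result
        self.computed = []        # log of every time an analysis actually ran

    def get_analysis(self, name, program):
        """Return a cached result, computing it only if it is missing."""
        if name not in self.cache:
            self.cache[name] = self.analyses[name](program)
            self.computed.append(name)
        return self.cache[name]

    def run(self, p, program):
        """Run one pass, then drop every analysis it did not report as preserved."""
        results = {n: self.get_analysis(n, program) for n in p.requires}
        p.run(program, results)
        self.cache = {n: r for n, r in self.cache.items() if n in p.preserves}


def cfg_edges(program):
    """Stand-in analysis: the sorted list of CFG edges."""
    return sorted((s, d) for s, dsts in program["succ"].items() for d in dsts)


def rename_only(program, analyses):
    program["renamed"] = True  # leaves the CFG untouched


def merge_blocks(program, analyses):
    program["succ"] = {"a": ["c"], "c": []}  # folds block "b" into "a"


program = {"succ": {"a": ["b"], "b": ["c"], "c": []}}
pm = PassManager({"cfg_edges": cfg_edges})
keeps_cfg = Pass("rename_only", ["cfg_edges"], {"cfg_edges"}, rename_only)
changes_cfg = Pass("merge_blocks", ["cfg_edges"], set(), merge_blocks)

pm.run(keeps_cfg, program)    # computes cfg_edges
pm.run(keeps_cfg, program)    # reuses the cached result
pm.run(changes_cfg, program)  # reuses it, then invalidates it
pm.run(keeps_cfg, program)    # has to recompute
```

In this model, strategy number 2 amounts to only scheduling passes like rename_only, which is why it rules out anything that restructures control flow.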

>> >
>> > -Tom
>> >
>> >> >
>> >> >>> * LLVM doesn't do modifiers, meaning that we can't do optimizations
>> >> >>> like "clamp(x, 0.0, 1.0) => mov.sat x" and "clamp(x, 0.25, 1.0) =>
>> >> >>> max.sat(x, .25)" in a generic fashion.
>> >> >>>
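As an illustration of the two clamp rewrites mentioned above, here is a small sketch on a toy tuple-based expression form. The expression encoding and the names fold_clamp and evaluate are assumptions made up for this example, and NaN handling is ignored.

```python
def fold_clamp(expr):
    """Rewrite clamp(x, lo, 1.0) into a saturating op when 0.0 <= lo <= 1.0."""
    if not (isinstance(expr, tuple) and expr[0] == "clamp"):
        return expr
    _, x, lo, hi = expr
    if hi == 1.0 and lo == 0.0:
        return ("mov.sat", x)
    if hi == 1.0 and isinstance(lo, float) and 0.0 < lo <= 1.0:
        return ("max.sat", x, lo)
    return expr  # anything else is left alone


def sat(v):
    """The saturate modifier: clamp the result to [0, 1]."""
    return min(max(v, 0.0), 1.0)


def evaluate(expr, env):
    """Evaluate a toy expression so the rewrite can be checked numerically."""
    if isinstance(expr, str):
        return env[expr]
    if isinstance(expr, float):
        return expr
    op, *args = expr
    vals = [evaluate(a, env) for a in args]
    if op == "clamp":
        return min(max(vals[0], vals[1]), vals[2])
    if op == "mov.sat":
        return sat(vals[0])
    if op == "max.sat":
        return sat(max(vals[0], vals[1]))
    raise ValueError(op)


full = ("clamp", "x", 0.0, 1.0)
partial = ("clamp", "x", 0.25, 1.0)
other = ("clamp", "x", 0.0, 2.0)
```

The second rewrite is valid because saturating max(x, 0.25) gives 0.25 below the lower bound, x inside the range, and 1.0 above it, which is the same as the original clamp.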
>> >> >>
>> >> >> The way to handle this with LLVM would be to add intrinsics to represent
>> >> >> the various modifiers and then fold them into instructions during
>> >> >> instruction selection.
>> >> >>
>> >> >
>> >> > IMHO this is a feature.  One of the things I don't like about NIR is
>> >> > that it's still vec4-centric.  Most drivers are going to want something
>> >> > else, and different from each other.  We cannot please all of them with
>> >> > one single vector addressing model built into the core instruction set, so
>> >> > I'd rather have modifiers, writemasks and swizzles represented as the
>> >> > composition of separate instructions/intrinsics with simple and
>> >> > well-defined semantics, which can be coalesced back into the real
>> >> > instruction as Tom says (easy even if you don't use LLVM's instruction
>> >> > selector as long as it's SSA form).
>> >>
>> >> While NIR is vec4-centric, nothing's stopping you from splitting up
>> >> instructions and doing optimizations at the scalar level for scalar
>> >> ISAs - in fact, that's what I expect to happen. And for backends that
>> >> really do need to have swizzles and writemasks, coalescing these
>> >> things back into the original instruction is not at all trivial - in
>> >> fact, going into and out of SSA without introducing extra copies even
>> >> in situations like:
>> >>
>> >> foo.xyz = ...
>> >> ... = foo
>> >> foo.x = ...
>> >>
>> >> is a problem that hasn't been solved yet publicly (it seems doable,
>> >> but difficult). So while we might not need swizzles and writemasks for
>> >> most backends, for the few that do need it (like, for example, the
>> >> i965 vec4 backend) it will be very nice to have one common lowering
>> >> pass that solves this hard problem, which would be impossible to do
>> >> without having swizzles and writemasks in the IR. And it's very likely
>> >> that these backends, which probably aren't using SSA due to the
>> >> aforementioned difficulties, will also benefit from having modifiers
>> >> already folded for them - this is something that's already a problem
>> >> for the i965 vec4 backend, and something that NIR will help with a lot.
>> >>
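To show where the extra copies in the foo.xyz example come from, here is a rough sketch of naive SSA renaming for writemasked assignments. The statement encoding, the to_ssa function and the merge(...) notation are invented for illustration and do not correspond to any existing Mesa pass.

```python
COMPONENTS = "xyzw"


def to_ssa(stmts):
    """Rename writemasked writes into SSA-style values.

    stmts is a list of ("write", var, mask) or ("read", var) tuples.
    A partial write to an already defined variable has to carry over the
    components it does not touch, shown here as a merge of old and new.
    Reading a variable that was never written raises KeyError.
    """
    version = {}  # variable name -> current SSA version number
    out = []
    for stmt in stmts:
        if stmt[0] == "read":
            var = stmt[1]
            out.append(f"use {var}_{version[var]}")
            continue
        _, var, mask = stmt
        old = version.get(var)
        new = (old or 0) + 1
        version[var] = new
        kept = "".join(c for c in COMPONENTS if c not in mask)
        if old is None or not kept:
            out.append(f"{var}_{new}.{mask} = ...")
        else:
            out.append(f"{var}_{new} = merge(new.{mask}, {var}_{old}.{kept})")
    return out


example = [("write", "foo", "xyz"), ("read", "foo"), ("write", "foo", "x")]
ssa = to_ssa(example)
```

Under these assumptions, the last statement becomes a new value foo_2 built from foo_1. Getting back to a single writemasked write with no copy requires foo_1 and foo_2 to end up in the same register when leaving SSA, which is the hard part described above.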
>> >> >
>> >> >>> * LLVM is hard to embed into other projects, especially if it's used
>> >> >>> as anything but a command-line tool that only runs once. See, for
>> >> >>> example, http://blog.llvm.org/2014/07/ftl-webkits-llvm-based-jit.html
>> >> >>> under "Linking WebKit with LLVM" - most of those problems would also
>> >> >>> apply to us.
>> >> >>>
>> >> >>
>> >> >> You have to keep in mind that the way webkit uses LLVM is totally
>> >> >> different than how Mesa would use LLVM if LLVM IR was adopted as a
>> >> >> common IR.
>> >> >>
>> >> >> webkit is using LLVM as a full JIT compiler, which means it depends
>> >> >> on almost all of the pieces of the LLVM stack, the IR manipulation,
>> >> >> optimization passes, one or more of the code gen backends, as well
>> >> >> as the entire JIT layer.  The JIT layer in particular is missing a lot of
>> >> >> functionality in the C API, which makes it more difficult to work with.
>> >> >>
>> >> >> If Mesa were to adopt LLVM IR as a common IR, the only LLVM library
>> >> >> functionality it would need would be the IR manipulation and the
>> >> >> optimization passes.
>> >> >>
>> >> >>> * LLVM is on a different release schedule (6 months vs. 3 months), has
>> >> >>> a different review process, etc., which means that to add support for
>> >> >>> new functionality that involves shaders, we now have to submit patches
>> >> >>> to two separate projects, and then 2 months later when we ship Mesa it
>> >> >>> turns out that nobody can actually use the new feature because it
>> >> >>> depends upon an unreleased version of LLVM that won't be released for
>> >> >>> another 3 months and then packaged by distros even later... we've
>> >> >>> already had problems where distros refused to ship newer Mesa releases
>> >> >>> because radeon depended on a version of LLVM newer than the one they
>> >> >>> were shipping, and if we started using LLVM in core Mesa it would get
>> >> >>> even worse. Proprietary drivers solve this problem by just forking
>> >> >>> LLVM, building it with the rest of their driver, and linking it in as
>> >> >>> a static library, but distro packagers would hate us if we did that.
>> >> >>>
>> >> >>
>> >> >> If Mesa were using LLVM IR as a common IR I'm not sure what features
>> >> >> in Mesa would be tied to new additions in LLVM.  As I said before,
>> >> >> all Mesa would be using would be the IR manipulations and the
>> >> >> optimization passes.  The IR manipulations only require new features
>> >> >> when something new is added to the LLVM IR specification, which is rare.
>> >> >> It's possible there could be some lag in new features that go into
>> >> >> the optimization passes, but if there was some optimization that was
>> >> >> deemed really critical, it could be implemented in Mesa using the IR
>> >> >> manipulators.
>> >> >>
>> >> >> -Tom
>> >> >>
>> >> >>> I wouldn't completely rule out LLVM, and I do think they do a lot of
>> >> >>> things right, but for now it seems like it's not the path that the
>> >> >>> Intel team wants to take.
>> >> >>>
>> >> >>> Connor
>> >> >>> _______________________________________________
>> >> >>> mesa-dev mailing list
>> >> >>> mesa-dev at lists.freedesktop.org
>> >> >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev