[Mesa-dev] What I'm working on

Fri Oct 15 17:13:48 PDT 2010

On Sat, Oct 16, 2010 at 8:09 AM, Eric Anholt <eric at anholt.net> wrote:
> On Wed, 13 Oct 2010 09:20:22 +1000, Dave Airlie <airlied at gmail.com> wrote:
>> On Wed, Oct 13, 2010 at 3:33 AM, Ian Romanick <idr at freedesktop.org> wrote:
>> > -----BEGIN PGP SIGNED MESSAGE-----
>> > Hash: SHA1
>> >
>> > Brian Paul wrote:
>> >> On 10/11/2010 03:49 PM, Ian Romanick wrote:
>> >>> It should be possible to move ir_to_mesa out of core Mesa and into a
>> >>> (lower) driver level.  As has been discussed numerous times, the
>> >>> assembly-like IRs in Mesa (classic Mesa IR and TGSI) are completely
>> >>> useless for generating code for GPUs.
>> >>
>> >> I'm sorry, but that's an exaggeration.  Mesa IR and TGSI are very similar
>> >> to the original ARB vertex/fragment program languages which were clearly
>> >> intended for GPU implementation.  They may not be ideal in some ways,
>> >> but certainly not completely useless.
>> >
>> > I suppose I should have said "useless for generating high quality code
>> > for GPUs."
>> >
>> > The original ARB vertex/fragment program languages were clearly intended
>> > for *direct* GPU translation.  That is, the intention was that there
>> > would be a nearly 1-to-1 translation from instructions in the source
>> > language to instructions on the hardware.  For that generation of
>> > hardware that was a good assumption, but this hasn't been the case for
>> > many years.  It certainly isn't true on i965 or r600.
>> >
>> > Look at what the r600 driver does.  It translates Mesa IR (r600g presumably
>> > does the same with TGSI) back up to some higher-level IR before doing
>> > register allocation, code generation, and scheduling.  Every single
>> > credible DX driver also works this way, and driver writers *hate* it.
>> > The driver basically has to decompile one assembly back into a
>> > higher-level IR, attempting to infer the intention of the original program
>> > along the way.
>> >
>> > When we have access to the original program already in a higher-level
>> > IR, this is just plain madness.  The driver has to do more work.  Since
>> > information is lost as the IR becomes lower and lower, the driver has
>> > less information to use to do that work.  It's a lose/lose situation.
>> >
>> > Outside of DX and Mesa, no multi-target compiler works like this.  LLVM
>> > doesn't[1].  GCC doesn't[2].  LCC doesn't[3].  Open64 (formerly MipsPro)
>> > doesn't[4].  Closed-source GLSL compilers don't.  There is a reason. :)
>>
>> The AMD IR is findable with google, I think their GLSL compiler spits
>> that out, it seems to my untrained eye to be more TGSI like than not,
>> but you guys know more about what you are talking about than I.
>>
>> http://developer.amd.com/gpu/ATIStreamSDK/assets/ATI_Intermediate_Language_(IL)_Specification_v2d.pdf
>
> This looks to me like something that would be spit out by the driver
> after optimization at some other IR level.  It has exactly the
> information needed by that class of chips for code generation.  I think
> all decent drivers are going to have chip-specific LIRs like this for
> doing their peepholing and register allocation.
>
> Some people have suggested that LLVM would save the poor driver writer
> from having to write that LIR stage.  I've heard of only one group, IMG,
> doing LLVM to GPU directly, and they have a custom backend not using the
> retargeting codegen stuff from LLVM from what I'm told.
>
> Back to the AMD IR, note how many tweaks they have to instruction
> operands that map 1:1 to AMD hardware features as far as I know (invert,
> bias, x2, etc).  GLSL IR doesn't have those, just like it doesn't have
> the constant immediate stuff that 915 and 965 do.  I've found supporting
> negate/abs to be quite easy out of GLSL IR thanks to the structured
> expression trees, and x2 would be as well, so I don't think we need to
> extend the GLSL IR for them.
>
> Also, that AMD IR has declarations for variables, which is the biggest
> failing of Mesa IR in my opinion, and the thing that made it worse than
> just an AST of the incoming code (Mesa IR register indexes are intended
> to mean actual hardware register indexes, so drivers have to jump
> through insane hoops to try to get back to things they can actually
> register allocate).
>
> The AMD IR doesn't have structures for control flow like we do in GLSL
> IR, probably because there's a near 1:1 mapping of those LIR
> instructions to the ISA, so they do basic block operations like I'm
> doing in the 965 FS backend (which has its own LIR too) on a flat
> instruction stream.
>
> So, yes, this AMD IR looks roughly like a Mesa IR with extra information
> for r600 chips, which is to say a low-level IR for a single target.  It
> doesn't look like a multi-target IR to me, though.

Yeah, my point was against Ian saying nobody else does it this way. AMD
clearly does, as their OpenCL backend spits out that IR, and I would
assume their GLSL compiler does the same internally.

I'm sure they have a more complex IR inside the compiler, but the
output the driver sees is at this level. I've no idea what the nvidia
binary drivers do.

Dave.
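
To make the quoted point about negate/abs being easy to pick up from a
structured expression tree a bit more concrete, here is a minimal sketch.
It is not Mesa code: the Var, Expr, Src and lower names are hypothetical,
and it assumes a target whose source operands take an "abs" modifier
applied first and a "negate" modifier applied second.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A declared variable (hypothetical stand-in for a tree-IR variable)."""
    name: str


@dataclass(frozen=True)
class Expr:
    """An expression node: an opcode applied to child nodes."""
    op: str
    args: Tuple["Node", ...]


Node = Union[Var, Expr]


@dataclass(frozen=True)
class Src:
    """A flat-IR source operand; assumed semantics: abs first, then negate."""
    name: str
    negate: bool = False
    absolute: bool = False


def lower(node, code):
    """Lower a tree to flat (op, dst, srcs) tuples appended to `code`.

    Returns the Src naming the value of `node`. neg/abs wrappers are
    folded into operand modifiers instead of becoming instructions.
    """
    negate = absolute = False
    # Peel neg/abs wrappers from the outside in. Only sign changes
    # outside the outermost abs survive; anything inside it is discarded.
    while isinstance(node, Expr) and node.op in ("neg", "abs"):
        if not absolute:
            if node.op == "neg":
                negate = not negate
            else:
                absolute = True
        node = node.args[0]
    if isinstance(node, Var):
        return Src(node.name, negate, absolute)
    # A real operation: lower its operands first, then emit it into a
    # fresh temporary (a virtual register, not a hardware index).
    srcs = tuple(lower(arg, code) for arg in node.args)
    dst = f"t{len(code)}"
    code.append((node.op, dst, srcs))
    return Src(dst, negate, absolute)


# add(neg(a), abs(b)) becomes a single ADD with modifiers on its operands.
code = []
result = lower(Expr("add", (Expr("neg", (Var("a"),)),
                            Expr("abs", (Var("b"),)))), code)
```

Starting from a flat, register-indexed stream, the same fold would first
require finding the instruction that defines each operand and proving the
intermediate register has no other uses, which is the kind of
reconstruction work the quoted text complains about.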