[Mesa-dev] Mesa (shader-work): glsl: introduce ir_binop_all_equal and ir_binop_any_equal, allow vector cmps

Tue Sep 7 21:13:16 PDT 2010

> Too bad LLVM doesn't have a clue about hardware that requires structured
> branching.  Any decent optimizer for general purpose CPUs generates
> spaghetti code.
Yes, that's the biggest challenge, but I think it can be solved.

Also, modern hardware tends to have more flexible branching than GLSL,
even though it's obviously impossible to have fully general branching.

> It is, in the best case, really hard to convert
> spaghetti code back into structured code.
I wrote an LLVM pass to do exactly this, but haven't yet produced a
clean version of it or published it anywhere.

The good news is that it is always possible to convert an arbitrary
CFG into a structured one by adding variables, with no code cloning
and, at least usually, without introducing divergence.
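
To illustrate the "adding variables, no cloning" part, here is a minimal
sketch of the textbook dispatch-loop construction. It is not the LLVM
pass mentioned above. It is only the simplest way I know to show that
one extra variable is enough. The toy CFG, its string encoding and all
function names are hypothetical and made up for this example. The
generated code funnels every block through a single loop, so it makes no
attempt to preserve performance or to avoid divergence; a real pass
would aim to do much better.

```python
# Hypothetical toy CFG (not from Mesa or LLVM): label -> (statements, terminator).
# The loop a <-> b can be entered at either block, so it is irreducible.
CFG = {
    "entry": (["i = 0", "acc = 0"], ("branch", "n % 2 == 0", "a", "b")),
    "a":     (["acc += 1", "i += 1"], ("branch", "i < n", "b", "done")),
    "b":     (["acc += 10", "i += 1"], ("branch", "i < n", "a", "done")),
    "done":  ([], ("exit",)),
}


def structurize(cfg, entry="entry"):
    """Emit Python source that uses only while/if plus one added variable."""
    ids = {label: k for k, label in enumerate(cfg)}
    lines = [f"state = {ids[entry]}", "while state >= 0:"]
    for label, (body, term) in cfg.items():
        kw = "if" if ids[label] == 0 else "elif"
        lines.append(f"    {kw} state == {ids[label]}:  # block {label}")
        # Each block body is emitted exactly once: no cloning.
        lines += [f"        {stmt}" for stmt in body]
        if term[0] == "branch":
            _, cond, if_true, if_false = term
            lines.append(
                f"        state = {ids[if_true]} if {cond} else {ids[if_false]}"
            )
        elif term[0] == "goto":
            lines.append(f"        state = {ids[term[1]]}")
        else:  # "exit": leave the dispatch loop
            lines.append("        state = -1")
    return "\n".join(lines)


def run_structured(cfg, n):
    """Execute the generated structured code and return the final acc."""
    env = {"n": n}
    exec(structurize(cfg), {}, env)
    return env["acc"]


def run_cfg(cfg, n, entry="entry"):
    """Reference: follow the CFG edges directly, like goto-style code."""
    env, label = {"n": n}, entry
    while True:
        body, term = cfg[label]
        for stmt in body:
            exec(stmt, {}, env)
        if term[0] == "exit":
            return env["acc"]
        if term[0] == "goto":
            label = term[1]
        else:
            _, cond, if_true, if_false = term
            label = if_true if eval(cond, {}, env) else if_false
```

For n = 3 both run_structured(CFG, 3) and run_cfg(CFG, 3) return 21,
and print(structurize(CFG)) shows the generated loop with each block
body appearing once.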

I can say though that writing GLSL IR passes to manipulate control
flow is way easier than manipulating control flow in LLVM programs.
Non-CFG optimization issues probably have the inverse property due to
LLVM's SSA form and sophisticated analysis passes.

> Even worse, once you do that
> you ruin a lot of the optimizations that you just worked so hard to get.

I think it's only really bad if you start from a pathological case.
If you start from structured code, I think you should usually avoid
pessimization.
I haven't significantly experimented with that, though.

> At least for fragment shaders, hardware is going to continue to look
> like this for the foreseeable future.  Vertex shaders, geometry shaders,
> and OpenCL kernels don't have the same issues.  Fragment shaders are
> pretty important, though. :)

As far as I know, on current hardware fragment shaders and OpenCL
kernels are quite similar optimization-wise, in particular due to the
existence of EXT_shader_image_load_store, better known as DirectX 11
unordered access views.

Of course, applications might not actually use long fragment shaders
with lots of memory reads/writes/atomics in practice.

> One of our first projects after 7.9 is to add support for using LLVM to
> generate software vertex shaders.

This has already been done by Zack Rusin in the Gallium draw module.

It would be great if Intel switched to the i915g and i965g Gallium
drivers, since everyone else is concentrating their attention on
Gallium, which is much easier and better to write drivers for.

All the software rendering work is being done on Gallium llvmpipe, all
the nVidia drivers for cards with shaders are Gallium drivers, and the
Gallium Radeon drivers are the only ones actively developed by the
community, with AMD probably going to switch as soon as r600g is in a
decent state.

> GLSL requires a certain level of optimization just to perform semantic
> checking on a shader.  We really haven't done very much beyond that.
> That's in addition to my comments above about structured branching.  We
> tried to take the path with the fewest unknowns.  That's also why we're
> still generating the low-level Mesa IR.

Yes, this is probably a sensible choice, and the GLSL IR passes
should be at least good enough to allow a correct implementation, but
I'm not sure how much sense it makes to go further than that.