[Mesa-dev] [PATCH 00/19] i965/fs: load_payload on Gen7+.

Tue Jun 10 12:34:57 PDT 2014

On Wed, May 28, 2014 at 4:44 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> On Tue, May 27, 2014 at 9:47 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> Here's a respin of my load_payload series from mid-April with some
>> feedback from Ken addressed and some bugs fixed.
>>
>> This series is available in my tree (with a few unrelated patches
>> before it)
>>
>>    git://people.freedesktop.org/~mattst88/mesa tex-sources
>>
>> This is a prep series for implementing SSA in the i965 fragment
>> shader backend.
>
> I wonder how SSA in the i965 backend will interact with my SSA in GLSL
> IR series?

I'm not really sure. Of course it'd be nice to not have a bunch of
similar code in src/glsl and the i965 driver.

I've got a bunch of patches on the cfg branch of my tree

     git://people.freedesktop.org/~mattst88/mesa cfg

that start implementing SSA in the fs backend. It's got code to create
the dominance tree and dominance frontier sets, to insert Phi nodes,
and to rename variables. I've also adapted my local value numbering
pass to operate globally, but of course it requires global code motion
to perform fix ups.
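
For anyone unfamiliar with those steps, here is a rough Python sketch of
two of them: computing dominance frontiers and using them to choose the
blocks that need Phi nodes. It is not the code on the cfg branch. The
CFG representation (a dict mapping a block name to its successor names)
and all function names are made up for illustration, and the dominator
computation uses the usual iterative intersect-over-reverse-postorder
scheme, which may not be what the branch does.

```python
def reverse_postorder(succs, entry):
    """Return the blocks reachable from entry in reverse postorder."""
    seen, order = set(), []

    def visit(block):
        seen.add(block)
        for s in succs.get(block, []):
            if s not in seen:
                visit(s)
        order.append(block)

    visit(entry)
    return order[::-1]


def immediate_dominators(succs, entry):
    """Iteratively compute idom[] for every reachable block."""
    rpo = reverse_postorder(succs, entry)
    index = {b: i for i, b in enumerate(rpo)}
    preds = {b: [] for b in rpo}
    for b in rpo:
        for s in succs.get(b, []):
            preds[s].append(b)

    idom = {entry: entry}

    def intersect(a, b):
        # Walk both blocks up the partial dominator tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for b in rpo[1:]:
            done = [p for p in preds[b] if p in idom]
            new = done[0]
            for p in done[1:]:
                new = intersect(new, p)
            if idom.get(b) != new:
                idom[b] = new
                changed = True
    return idom, preds


def dominance_frontiers(succs, entry):
    """Map each block to the set of blocks in its dominance frontier."""
    idom, preds = immediate_dominators(succs, entry)
    df = {b: set() for b in idom}
    for b in idom:
        if len(preds[b]) >= 2:  # only join points appear in a frontier
            for p in preds[b]:
                runner = p
                while runner != idom[b]:
                    df[runner].add(b)
                    runner = idom[runner]
    return df


def phi_blocks(df, def_blocks):
    """Iterated dominance frontier: blocks needing a Phi for one variable."""
    placed, work = set(), list(def_blocks)
    while work:
        b = work.pop()
        for f in df[b]:
            if f not in placed:
                placed.add(f)
                work.append(f)  # a Phi is itself a new definition
    return placed
```

For an if/else diamond where a variable is written in the "then" block,
phi_blocks() returns just the merge block. For a loop whose body writes
the variable, it returns the loop header.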

I haven't really started on translating out of SSA, because it's been
hard to get interested when performing register allocation from SSA
seems so appealing. That's obviously a sizeable amount of work.
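
To make the missing piece concrete, here is a minimal Python sketch of
the naive way out of SSA: replace each Phi with a copy at the end of
every predecessor block. The instruction encoding (tuples whose first
element is the opcode, a "phi" carrying a dict from predecessor name to
source) is invented for this example. The sketch deliberately ignores
the well-known lost-copy and swap problems, critical edges, and
writemasks, so it only illustrates the idea and is not something a
backend could use as is.

```python
TERMINATORS = {"jmp", "br", "ret"}


def lower_phis(blocks):
    """Replace Phis with copies placed at the end of each predecessor.

    blocks maps a block name to a list of instruction tuples. A Phi is
    ("phi", dest, {pred_name: src}); every other tuple is left untouched.
    """
    # Start from a copy of every block with its Phis stripped out.
    lowered = {
        name: [inst for inst in insts if inst[0] != "phi"]
        for name, insts in blocks.items()
    }
    for insts in blocks.values():
        for inst in insts:
            if inst[0] != "phi":
                continue
            _, dest, sources = inst
            for pred, src in sources.items():
                body = lowered[pred]
                # Keep the copy ahead of the block's terminator, if any.
                has_term = bool(body) and body[-1][0] in TERMINATORS
                at = len(body) - 1 if has_term else len(body)
                body.insert(at, ("mov", dest, src))
    return lowered


example = {
    "entry": [("br", "c", "then", "else")],
    "then": [("add", "x1", "a", "b"), ("jmp", "merge")],
    "else": [("mul", "x2", "a", "b"), ("jmp", "merge")],
    "merge": [("phi", "x3", {"then": "x1", "else": "x2"}), ("ret", "x3")],
}
result = lower_phis(example)
```

After this, "then" ends with a mov of x1 into x3 before its jump, "else"
does the same with x2, and "merge" no longer contains a Phi. The extra
copies are what a later coalescing step, or allocating registers
directly from SSA, would try to avoid.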

Predication is another problem I need to handle. Gated SSA was
suggested to me as the best solution, with some other nice properties
that allow things like divergence analysis.

> I've recently re-started work on this after my school work
> started winding down, and as of now here are the remaining work items I
> can think of:
>
> 1) Add unit tests for the conversion to SSA
> 2) Make the conversion out of SSA not suck as much, esp. with respect
> to writemasked operations (this is pretty difficult, as apparently
> it's still an unsolved problem that to my knowledge no one from
> academia has tried to tackle...)
> 3) Add some trivial SSA-based optimizations (dead code elimination,
> copy propagation)
> 4) Add more complicated optimizations like CSE, constant propagation,
> GVN-GCM... I believe some of these, especially value numbering, depend
> on a flat IR to work, so this might be a lot harder to do.
>
> I propose that I just do #1, post a new patch series with Paul's
> comments on the original one addressed (that would probably take less
> than a week), and then you can use the opt_to_ssa pass to avoid having
> to duplicate that logic in the backend, since it's pretty
> complicated. So it would look like:
>
> GLSL IR -> opt_to_ssa -> SSA-ified GLSL IR -> i965 fragment shader backend

That sounds good. I'd even settle for getting non-SSA GLSL IR in the
backend. I expect SSA GLSL IR will allow lots of our optimizations to
be more effective, and I've got a shader from a benchmark that I could
cut a pile of instructions out of if only it didn't unnecessarily
reuse a variable. SSA would totally fix that.
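
As a toy illustration of the variable-reuse problem (this is not the
benchmark shader, and the instruction format is invented), here is a
Python sketch of SSA renaming within a single basic block. Every write
gets a fresh name, so two unrelated uses of the same temporary stop
looking like one long-lived variable.

```python
def rename_block(insts):
    """Give every destination in a straight-line block a unique name.

    insts is a list of (dest, opcode, src, ...) tuples. Sources that were
    never written inside the block (inputs) keep their original names.
    """
    version = {}  # variable -> number of definitions seen so far
    current = {}  # variable -> its latest SSA name
    out = []
    for dest, op, *srcs in insts:
        # Sources must be looked up before the destination is redefined.
        new_srcs = [current.get(s, s) for s in srcs]
        version[dest] = version.get(dest, 0) + 1
        new_dest = f"{dest}_{version[dest]}"
        current[dest] = new_dest
        out.append((new_dest, op, *new_srcs))
    return out


block = [
    ("t", "mul", "a", "b"),
    ("out0", "add", "t", "c"),
    ("t", "mul", "d", "e"),  # reuses t for an unrelated value
    ("out1", "add", "t", "f"),
]
renamed = rename_block(block)
```

Here the two writes to t become t_1 and t_2, each with a single use, so
passes such as copy propagation or value numbering can treat them
independently. Handling control flow additionally needs the Phi
placement discussed earlier in the thread.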

> We would have to modify the pass that scalarizes the GLSL IR to work
> with phi nodes and ir_quadop_vector expressions, but that wouldn't be
> too difficult. That way we would have an immediate user of the SSA
> stuff while being able to make it useful for everyone else in the long
> term.

Sure. Except I've written the to-SSA code already, and am lacking the
from-SSA equivalent. :)