[Mesa-dev] [RFC] nir: Divergence Analysis

Mon Oct 8 11:04:16 UTC 2018

This is an RFC for a Divergence Analysis for NIR.

The algorithm implements "The Simple Divergence Analysis" from the paper
"Divergence Analysis" by Diogo Sampaio, Rafael De Souza, Sylvain Collange
and Fernando Magno Quintão Pereira.

The proposed pass computes for each SSA definition whether it is uniform,
i.e. whether it has the same value for all invocations of the group.
If the value might differ between invocations, we call it divergent.
The analysis is a worklist algorithm: it starts with the assumption that all values
are uniform (non-divergent) and iterates until a fixed point is reached.

Motivation:
Divergence Analysis can be used for various optimizations:
control flow optimizations such as branch distribution, branch fusion, branch splitting,
loop collapsing, iteration delaying and thread reallocation,
but also memory optimizations like memory coalescing, as well as work unification.
Not every optimization is applicable to every backend, and some (like thread reallocation) cannot be used at all.

Implementation State:
This implementation is incomplete, but the difficult part (the control flow handling) is done.
Some intrinsics and possibly some special cases are still missing.
The worklist handling is also still somewhat inefficient.
Currently, the pass returns an array of bools where 'true' corresponds to 'divergent'.
We might want to add a flag directly to the SSA defs instead.

Future Work:
The paper also describes a more complex divergence analysis that
tracks dependencies on the thread index, so that a divergent value can
be recomputed from a uniform value and the thread index.
There are a few use cases for this analysis, such as simplified address calculation
and rematerialization, but the main benefit should be achievable with the simple version.

Final Note:
My hope is that this DA helps with some control flow optimizations and improves global code motion.
Some backends might also use it for work unification on a scalar unit and/or to save some register space.
I'd be glad to get any comments.

Kind regards,
Daniel