[Mesa-dev] [PATCH 7/7] nir: add helper macros for running NIR passes

Wed Oct 28 14:24:05 PDT 2015

On Wed, Oct 28, 2015 at 1:37 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Wed, Oct 28, 2015 at 12:09 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Sat, Oct 24, 2015 at 10:08 AM, Rob Clark <robdclark at gmail.com> wrote:
>>> From: Rob Clark <robclark at freedesktop.org>
>>>
>>> Convenient place to put in some extra sanity checking, without making
>>> things messy for the drivers running the passes.
>>
>> In the short-term this seems to work (at least for testing nir_clone).
>> In the long-term, I'm not sure that a macro is really what we want.
>> I've mentioned a time or two before that what I *think* I'd like to do
>> (don't know exactly how it will work out yet) is to have a little
>> datastructure
>>
>> typedef struct nir_pass {
>>    bool (*shader_pass_func)(nir_shader *shader, void *data);
>>    bool (*impl_pass_func)(nir_function_impl *impl, void *data);
>>    nir_metadata metadata_preserved;
>>    void *data;
>> } nir_pass;
>>
>> and have each of the passes expose one of these as a const global
>> variable instead of exposing the actual functions.  Then we would have
>> a runner function (or macro) that could run a pass.  The runner would
>> take care of validation, trashing metadata, and maybe even cloning.
>> If no shader_pass_func is provided but you call it on a shader, the
>> runner would iterate over all of the overloads for you and run the
>> impl_pass_func on each.  We could also have helpers that take an array
>> and run all of them or even take an array and run it in a loop until
>> no more progress is made.
>
> meh, once we collapse the run+validate into a single line macro call,
> having list of calls sounds like it doesn't really take up more lines
> of code compared to a table of nir passes.. plus old fashioned code
> has a lot more flexibility without having to reinvent loops and ifs
> and that sort of thing.  Keep in mind some passes are conditional on
> draw state (ie. what we are lowering) or shader stage, etc.
>
> BR,
> -R

FWIW, another reason that we might want to add something like this is
to optimize the ordering of passes so that they have to less work.
There are a lot of passes that act as "cleanups" for other passes; for
example, copy prop introduces a bunch of code that DCE has to clean
up. In addition, there are a lot of passes that are sort-of
"prerequisites" for another pass, doing some transform that lets
another pass do its work -- for example, lots of passes can't see
through copies and therefore require copy prop in order to do
anything, and deleting a trivial phi node may be necessary before we
can delete a loop. Right now, we try to add passes in more-or-less the
"right" order in the loop, but that's pretty icky and it's not obvious
to someone else using the infrastructure that a certain order might
not be optimal in terms of time required to get a fixed point.
Instead, I'd like for passes to be able to mark other passes as
prerequisites or cleanups, and have a scheduler/pass manager a la
LLVM's PassManager that tries to satisfy those dependencies (try and
run a cleanup pass if the previous pass reported progress, run passes
with unmet prerequisites last and passes with met prerequisites first,
etc.). Obviously, this is going to require some kind of pass struct
and some level of abstraction, although backends can still choose
which passes to add and they can still run passes themselves if they
so choose.

>
>
>> The thing I haven't quite settled on is how to pass extra parameters.
>> For some passes, we could just put the extra stuff in compiler_options
>> but we don't want to litter it too bad.  The other option is to do
>> what I did above and use the classic void pointer.  Then drivers would
>> have to just make a copy and set the data pointer to whatever they
>> want.
>>
>> Maybe I should just go implement this...
>>