[Mesa-dev] [RFC] nir: compiler options for addressing modes

Rob Clark robdclark at gmail.com
Tue Apr 14 16:32:35 PDT 2015


On Tue, Apr 14, 2015 at 7:08 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Tue, Apr 14, 2015 at 6:24 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>> On Tue, Apr 14, 2015 at 5:16 PM, Rob Clark <robdclark at gmail.com> wrote:
>>> On Tue, Apr 14, 2015 at 4:59 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>>>>>>>> +   /**
>>>>>>>>> +    * Addressing mode for corresponding _indirect intrinsics:
>>>>>>>>> +    */
>>>>>>>>> +   nir_addressing_mode var_addressing_mode;
>>>>>>>>> +   nir_addressing_mode input_addressing_mode;
>>>>>>>>> +   nir_addressing_mode output_addressing_mode;
>>>>>>>>> +   nir_addressing_mode uniform_addressing_mode;
>>>>>>>>> +   nir_addressing_mode ubo_addressing_mode;
>>>>
>>>> What is var_addressing_mode?  Sorry for not bringing that up before.
>>>
>>>
>>> well, originally in my thinking it was for load_var/store_var..  but
>>> perhaps that doesn't make sense (given lower_io).  Maybe it makes more
>>> sense to define it as applying to var_local/var_global (where the
>>> others apply to shader_in/shader_out/uniform and their equivalent
>>> intrinsic forms)?
>>>
>>> Maybe it's a bit weird since I don't lower vars to regs before feeding
>>> to my ir3 frontend, but the whole load_var/store_var for array access,
>>> and ssa for everything else form works kind of nicely for me.
>>>
>>> BR,
>>> -R
>>
>> I don't think we should be letting the driver define the stride of
>> variable array accesses. Variables are supposed to be structured,
>> backend-independent things that core NIR can manipulate and optimize
>> as it pleases; it shouldn't need to know anything about how the driver
>> will index the data. For doing the kinds of optimizations you're
>> talking about, you have registers that are backend-dependent, and core
>> NIR (other than the lower_locals_to_regs) doesn't need to know what
>> the indices mean. What you're doing right now is a hack, and if you
>> want to get the benefits of optimizing the index expression in core
>> NIR you should be using lower_locals_to_regs(). Having scalars be SSA
>> values and arrays be registers can't be that much more complicated
>> than having arrays be variables, and that's how it was set up to work
>> from the beginning.
>
> well, it is pretty convenient for me to have direct and indirect array
> access come via intrinsics, since that gives me a nice single point to
> do all the magic I need to do to set up instruction dependencies for
> scheduling and register assignment where the arrays get allocated in
> registers.  Possibly that means we need an option to lower array
> access to some new sort of intrinsic?  Not sure, I'll play with the
> lower_locals_to_regs without first coming out of SSA.. maybe if then
> the only things in regs are array accesses, I can achieve the same
> result.
>
> But more immediately, I hit a sort of snag:  I cannot seem to narrow
> from 32b to 16b at the same time I move to address register.  Which
> ends up meaning I need a mov from 32b to 16b followed by a 2nd mov to
> get it into address register...  which sort of defeats the purpose of
> this whole exercise.  I need to do some more r/e around this, but it
> may end up being better the way it was before.  And if we end up
> needing to do the shl in half-precision registers, then solving this
> in NIR would (I think) require NIR to be aware of half-precision.
> Which sounds useful, but -EBIGGER_FIRES
>
> The other problem is that currently ttn gives addr src in float, which
> is how things are in tgsi land.  I'm not sure if changing this will be
> a problem for Eric.

ahh, disregard that last para.. I had ARL vs UARL confusion.. but
still the narrowing to half precision is an issue for me.
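
For illustration, here is a minimal sketch of the kind of index scaling
(the shl mentioned above) that a per-class addressing mode would select.
The enum values and the scale_index() helper are hypothetical: the
actual nir_addressing_mode definition from the patch is not shown in
this message, and the vec4-slot input unit is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values, not taken from the patch under discussion. */
enum addressing_mode {
   ADDR_MODE_VEC4,   /* backend indexes in vec4 slots */
   ADDR_MODE_SCALAR, /* backend indexes in 32-bit components */
   ADDR_MODE_BYTE,   /* backend indexes in bytes */
};

/* Scale an index given in vec4 slots (an assumption) into the units the
 * backend addresses in.  The scaling is a left shift, which is the
 * instruction a generic optimization pass could fold into the index
 * expression.
 */
static uint32_t
scale_index(enum addressing_mode mode, uint32_t vec4_index)
{
   switch (mode) {
   case ADDR_MODE_VEC4:   return vec4_index;
   case ADDR_MODE_SCALAR: return vec4_index << 2; /* 4 components per vec4 */
   case ADDR_MODE_BYTE:   return vec4_index << 4; /* 16 bytes per vec4 */
   }
   return vec4_index; /* unknown mode: leave the index unscaled */
}
```

With these assumed modes, vec4 slot 3 maps to component 12 in scalar mode
and to byte offset 48 in byte mode.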

BR,
-R

> An interesting alternative solution to consider is to allow the
> backend to lower to driver-specific ALU opcodes.. and then
> somehow run those through the other generic NIR opt passes.  I'm not
> quite sure yet *how* that will work (esp. considering my need for
> doing some things in half precision), but if we come up with something
> it would help in other cases too (such as lowering integer multiply).
>
> For now, I think I'll go back to having ttn UBO support not depending
> on this patch, since in the short term I need to sort out if/else
> flattening and flow control so that we can drop the tgsi f/e.
>
> BR,
> -R
>
>> Connor

