[Mesa-dev] [PATCH] nir: add helper to get # of src/dest components
Connor Abbott
cwabbott0 at gmail.com
Thu Jun 18 08:01:09 PDT 2015
On Thu, Jun 18, 2015 at 4:48 AM, Rob Clark <robdclark at gmail.com> wrote:
> On Thu, Jun 18, 2015 at 1:23 AM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>> On Wed, Jun 17, 2015 at 12:02 PM, Rob Clark <robdclark at gmail.com> wrote:
>>> On Wed, Jun 17, 2015 at 2:27 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>>> So, as is, this patch isn't quite correct. When I originally wrote
>>>> NIR, the idea was that the size of each instruction would be explicit
>>>> -- that is, each instruction has it's own size, and the size of
>>>> registers/SSA values was merely a hint to say "by the way, you
>>>> actually need this many components based on all the things that use
>>>> this value/register." In other words, you could smash num_components
>>>> to 4 for everything, and it would still work. Then, we could just
>>>> shrink num_components on demand as we got rid of uses of things.
>>>> That's why we have functions like nir_tex_instr_src_size(),
>>>> nir_tex_instr_dest_size(), and nir_ssa_alu_instr_src_components() that
>>>> seem like they're doing the same thing that these functions -- in most
>>>> cases, you actually want those functions instead of these. Maybe we
>>>> were figuring out the register/value # of components a few times, and
>>>> perhaps some of times it was broken, but it seems like adding these
>>>> functions would only add to the confusion.
>>>>
>>>> Now, in hindsight, it seems like that might not be the best idea. In
>>>> other words, I can see how it would make sense to turn
>>>> nir_tex_instr_src_size() etc. into assertions in the validator that
>>>> check that nir_(src|dest)_num_components() equals what you would
>>>> expect, and it's probably a good idea. But I don't want people to use
>>>> these functions without knowing what they're doing until we do that
>>>> cleanup.
>>>
>>> hmm, fortunately I hadn't pushed my loops branch yet, since still need
>>> to work out how to make variables/arrays work w/ >1 block (since in
>>> nir-land variables are not ssa)..
>>>
>>> (really I want phi's for variables too.. the way I turn arrays into
>>> fanin/collect fanout/split works on the backend for dealing with
>>> arrays in ssa form (other than making instruction graph large) but the
>>> way I go from nir to ir3 currently assumes I get nir phi's everywhere
>>> I need an ir3 phi)
>>
>> Right... we explicitly decided not to support SSA form for arrays in
>> NIR, since it seemed like a pretty bad idea. SSA form assumes that
>> inserting copies for the things you're SSA-ifying is relatively
>> inexpensive, and both SSA-based register allocation and algorithms for
>> converting out of SSA make no guarantees about not inserting
>> potentially unnecessary copies. This is a good compromise for smaller
>> things, but not for your array of (say) 64 things where inserting an
>> extra copy is rather disastrous. You can do it if you want and shoot
>> yourself in the foot, but NIR isn't going to help you there -- we
>> won't write a to-SSA pass for something which doesn't make much sense
>> in the first place.
>>
>
> in ir3 my solution is to add sufficient dependencies between
> instructions so the array accesses don't get re-ordered and they all
> collapse down to a single name per array element/slot
It's not about getting reordered, it's about interference. The problem
is that as soon as you do basically any optimization at all (even copy
propagation), you can wind up with a situation where the sources and
destination of a phi node interfere with each other and you have to
insert extra mov's. And even if you keep everything exactly the same,
with an SSA-based register allocator, there's always the chance that
it'll screw up anyways and allocate something over your array and
force you to insert a mov. You could force the array to be allocated
to a single hardware register, but then it's not really an SSA value
anymore -- it's a hardware register, and you can't treat it like an
SSA value anymore in your allocator, and so adding phi nodes and
whatnot for it in your IR doesn't make much sense.
>
>>>
>>> Anyways, maybe I'll just move the helpers into ir3 for now until the
>>> is_packed stuff is purged..
>>
>> is_packed? what does that have to do with it? I suspect that there
>> might be some places where you're using this function instead of the
>> others I mentioned, and you actually want to use the latter, although
>> I haven't seen the code so I can't be sure. But of course, there's
>> always the option of actually going and fixing it :)
>
> because I was in a hurry and didn't actually read your original reply
> (and assumed it was continuation of discussion about nuking is_packed
> from irc)
>
> So the place where I was using it is actually for phi instructions
> (and actually just for an assert at the moment), so those fxns you
> mention don't do me much good.
Ok... for phi nodes, we already assert that the number of source
components equals the number of dest components, so you can just use
dest.ssa.num_components. You won't ever get a non-SSA thing in a phi
node anyways (that only happens internally in regs to SSA).
>
> Somehow, I think it is still a good idea to un-open-code the figuring
> out of # of src/dst components. Sure it makes sense to have validate
> code ensure that those are same as nir_tex_instr_src_size(),
> nir_tex_instr_dest_size(), and nir_ssa_alu_instr_src_components()
> give. But that doesn't preclude adding these two functions.
The thing is, most of the time those functions aren't what you want,
until you add the asserts to make sure that they return the correct
thing and fix any code that's doing it wrong.
>
> BR,
> -R
>
>>
>>>
>>> BR,
>>> -R
>>>
>>>
>>>>
>>>> On Mon, Jun 8, 2015 at 12:45 PM, Rob Clark <robdclark at gmail.com> wrote:
>>>>> From: Rob Clark <robclark at freedesktop.org>
>>>>>
>>>>> I need something like this in a couple places. And didn't see anything
>>>>> like it anywhere.
>>>>>
>>>>> Signed-off-by: Rob Clark <robclark at freedesktop.org>
>>>>> ---
>>>>> v2: Added similar helper for nir_src, and cleaned up a few places that
>>>>> open coded this. There are a couple left (such as validate_alu_src())
>>>>> but that handle is_packed differently so I thought it best to leave
>>>>> them as-is.
>>>>>
>>>>> src/glsl/nir/nir.h | 18 ++++++++++++++++++
>>>>> src/glsl/nir/nir_from_ssa.c | 10 ++--------
>>>>> src/glsl/nir/nir_validate.c | 4 +---
>>>>> 3 files changed, 21 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>>>>> index 697d37e..06bbb0c 100644
>>>>> --- a/src/glsl/nir/nir.h
>>>>> +++ b/src/glsl/nir/nir.h
>>>>> @@ -541,6 +541,24 @@ typedef struct {
>>>>> #define nir_foreach_def_safe(reg, dest) \
>>>>> list_for_each_entry_safe(nir_dest, dest, &(reg)->defs, reg.def_link)
>>>>>
>>>>> +static inline unsigned
>>>>> +nir_dest_num_components(nir_dest *dest)
>>>>> +{
>>>>> + if (dest->is_ssa)
>>>>> + return dest->ssa.num_components;
>>>>> + else
>>>>> + return dest->reg.reg->num_components;
>>>>> +}
>>>>> +
>>>>> +static inline unsigned
>>>>> +nir_src_num_components(nir_src *src)
>>>>> +{
>>>>> + if (src->is_ssa)
>>>>> + return src->ssa->num_components;
>>>>> + else
>>>>> + return src->reg.reg->num_components;
>>>>> +}
>>>>> +
>>>>> static inline nir_src
>>>>> nir_src_for_ssa(nir_ssa_def *def)
>>>>> {
>>>>> diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
>>>>> index 67733e6..23c798d 100644
>>>>> --- a/src/glsl/nir/nir_from_ssa.c
>>>>> +++ b/src/glsl/nir/nir_from_ssa.c
>>>>> @@ -553,10 +553,7 @@ emit_copy(nir_parallel_copy_instr *pcopy, nir_src src, nir_src dest_src,
>>>>> dest_src.reg.indirect == NULL &&
>>>>> dest_src.reg.base_offset == 0);
>>>>>
>>>>> - if (src.is_ssa)
>>>>> - assert(src.ssa->num_components >= dest_src.reg.reg->num_components);
>>>>> - else
>>>>> - assert(src.reg.reg->num_components >= dest_src.reg.reg->num_components);
>>>>> + assert(nir_src_num_components(&src) == nir_src_num_components(&dest_src));
>>>>>
>>>>> nir_alu_instr *mov = nir_alu_instr_create(mem_ctx, nir_op_imov);
>>>>> nir_src_copy(&mov->src[0].src, &src, mem_ctx);
>>>>> @@ -712,10 +709,7 @@ resolve_parallel_copy(nir_parallel_copy_instr *pcopy,
>>>>> nir_register *reg = nir_local_reg_create(state->impl);
>>>>> reg->name = "copy_temp";
>>>>> reg->num_array_elems = 0;
>>>>> - if (values[b].is_ssa)
>>>>> - reg->num_components = values[b].ssa->num_components;
>>>>> - else
>>>>> - reg->num_components = values[b].reg.reg->num_components;
>>>>> + reg->num_components = nir_src_num_components(&values[b]);
>>>>> values[num_vals].is_ssa = false;
>>>>> values[num_vals].reg.reg = reg;
>>>>>
>>>>> diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
>>>>> index da92ed9..c781912 100644
>>>>> --- a/src/glsl/nir/nir_validate.c
>>>>> +++ b/src/glsl/nir/nir_validate.c
>>>>> @@ -262,9 +262,7 @@ validate_dest(nir_dest *dest, validate_state *state)
>>>>> static void
>>>>> validate_alu_dest(nir_alu_dest *dest, validate_state *state)
>>>>> {
>>>>> - unsigned dest_size =
>>>>> - dest->dest.is_ssa ? dest->dest.ssa.num_components
>>>>> - : dest->dest.reg.reg->num_components;
>>>>> + unsigned dest_size = nir_dest_num_components(&dest->dest);
>>>>> bool is_packed = !dest->dest.is_ssa && dest->dest.reg.reg->is_packed;
>>>>> /*
>>>>> * validate that the instruction doesn't write to components not in the
>>>>> --
>>>>> 2.4.2
>>>>>
More information about the mesa-dev
mailing list