[Mesa-dev] [PATCH v3 057/104] nir,spirv: Rework function calls

Tue Apr 10 17:50:01 UTC 2018

On Tue, Apr 10, 2018 at 1:17 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> On Tue, Apr 10, 2018 at 9:59 AM, Rob Clark <robdclark at gmail.com> wrote:
>>
>> On Tue, Apr 10, 2018 at 11:55 AM, Jason Ekstrand <jason at jlekstrand.net>
>> wrote:
>> > On Tue, Apr 10, 2018 at 8:17 AM, Rob Clark <robdclark at gmail.com> wrote:
>> >>
>> >> On Tue, Apr 10, 2018 at 11:04 AM, Jason Ekstrand <jason at jlekstrand.net>
>> >> wrote:
>> >> > On Tue, Apr 10, 2018 at 6:20 AM, Rob Clark <robdclark at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Mon, Apr 9, 2018 at 10:52 PM, Jason Ekstrand
>> >> >> <jason at jlekstrand.net>
>> >> >> wrote:
>> >> >> > + A bunch of potentially interested parties.
>> >> >> >
>> >> >> > On Mon, Apr 9, 2018 at 4:25 PM, Caio Marcelo de Oliveira Filho
>> >> >> > <caio.oliveira at intel.com> wrote:
>> >> >> >>
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >> >  typedef struct {
>> >> >> >> > -   nir_parameter_type param_type;
>> >> >> >> > -   const struct glsl_type *type;
>> >> >> >> > +   uint8_t num_components;
>> >> >> >> > +   uint8_t bit_size;
>> >> >> >> >  } nir_parameter;
>> >> >> >>
>> >> >> >> (...)
>> >> >> >>
>> >> >> >> > @@ -683,18 +692,12 @@ validate_tex_instr(nir_tex_instr *instr,
>> >> >> >> > validate_state *state)
>> >> >> >> >  static void
>> >> >> >> >  validate_call_instr(nir_call_instr *instr, validate_state
>> >> >> >> > *state)
>> >> >> >> >  {
>> >> >> >> > -   if (instr->return_deref == NULL) {
>> >> >> >> > -      validate_assert(state,
>> >> >> >> > glsl_type_is_void(instr->callee->return_type));
>> >> >> >> > -   } else {
>> >> >> >> > -      validate_assert(state, instr->return_deref->deref.type
>> >> >> >> > ==
>> >> >> >> > instr->callee->return_type);
>> >> >> >> > -      validate_deref_var(instr, instr->return_deref, state);
>> >> >> >> > -   }
>> >> >> >> > -
>> >> >> >> >     validate_assert(state, instr->num_params ==
>> >> >> >> > instr->callee->num_params);
>> >> >> >> >
>> >> >> >> >     for (unsigned i = 0; i < instr->num_params; i++) {
>> >> >> >> > -      validate_assert(state, instr->callee->params[i].type ==
>> >> >> >> > instr->params[i]->deref.type);
>> >> >> >> > -      validate_deref_var(instr, instr->params[i], state);
>> >> >> >> > +      validate_src(&instr->params[i], state,
>> >> >> >> > +                   instr->callee->params[i].bit_size,
>> >> >> >> > +                   instr->callee->params[i].num_components);
>> >> >> >> >     }
>> >> >> >> >  }
>> >> >> >>
>> >> >> >> Question: I might be misreading, but it seems like we are losing
>> >> >> >> the
>> >> >> >> type information for functions. Isn't that something worth
>> >> >> >> keeping,
>> >> >> >> maybe in some other way, e.g. load_param specifying the expected
>> >> >> >> type?
>> >> >> >
>> >> >> >
>> >> >> > That's a very good question!  To be honest, I'm not sure what the
>> >> >> > answer
>> >> >> > is.
>> >> >> > At the moment, the type information is fairly useless for most of
>> >> >> > what
>> >> >> > we
>> >> >> > use functions for.  Really, all we need is something that NIR can
>> >> >> > inline.
>> >> >> > As it is, we're not really preserving the types from SPIR-V
>> >> >> > because
>> >> >> > of
>> >> >> > the
>> >> >> > gymnastics we're doing to handle pointers.
>> >> >> >
>> >> >> > If we did want to preserve types, we'd need to have more detailed
>> >> >> > type
>> >> >> > information.  In particular, we'd need to be able to provide
>> >> >> > pointer
>> >> >> > types
>> >> >> > and maybe combined image-sampler types.  And along with those
>> >> >> > pointer
>> >> >> > types,
>> >> >> > we'd need to somehow express those pointer's storage requirements.
>> >> >> >
>> >> >> > The philosophy behind this commit is that, if we don't have a good
>> >> >> > match
>> >> >> > to
>> >> >> > SPIR-V anyway, we might as well just chuck that information and do
>> >> >> > whatever
>> >> >> > makes our lives the easiest.  My philosophy here may be flawed and
>> >> >> > I'm
>> >> >> > happy
>> >> >> > to hear arguments in favor of keeping the information.  The best
>> >> >> > argument I
>> >> >> > can come up with for keeping the information is if we find
>> >> >> > ourselves
>> >> >> > wanting
>> >> >> > to do some sort of linking in the future where we have to match
>> >> >> > functions by
>> >> >> > both name and type.  If we want to do that, however, we'll need
>> >> >> > all
>> >> >> > the
>> >> >> > SPIR-V type information.
>> >> >> >
>> >> >>
>> >> >> We do end up wanting the type information for cl kernels.  This is
>> >> >> maybe a slightly different case from calls within shader code (ie.
>> >> >> when both caller and callee are in shader).
>> >> >
>> >> >
>> >> > Yes, I think it is.  Question: Is there a distinction in CL between
>> >> > functions which are entrypoints callable from the API and functions
>> >> > which
>> >> > are helpers?  i.e. Can you call an entrypoint as a helper?
>> >> >
>> >>
>> >> There is the __kernel annotation.  And you know the entry point name
>> >> when compiling.  However I'm not sure anything prevents one entry
>> >> point from calling another.
>> >
>> >
>> > That would be worth investigating.
>> >
>>
>> fwiw, at least the cl to spv compiler seems to allow it.  (Although in
>> my simple examples it also inlines the called function before things
>> end up in spv.)
>
>
> Interesting.
>
>>
>> >>
>> >> I'm not sure we want the calling convention to be the same internally
>> >> as for kernel entry points so in that case, if we aren't inlining
>> >> everything, we might end up generating two versions of a function (or
>> >> possibly a shim.. or possibly between the two based on size.. or??)
>> >
>> >
>> > Having a shim seems like a reasonable plan.
>> >
>> >>
>> >> >>
>> >> >> Although I'd kinda like
>> >> >> to think that we don't need to make vtn aware of this distinction.
>> >> >
>> >> >
>> >> > Someone has to be aware of it. :-)  There are lots of places in
>> >> > spirv_to_nir
>> >> > were we take the SPIR-V and do something slightly different with it
>> >> > than
>> >> > the
>> >> > obvious translation.  Also, using function parameters for this is a
>> >> > significant anachronism because no other shader I/O in NIR has ever
>> >> > worked
>> >> > that way.
>> >> >
>> >> >>
>> >> >> So just to throw out an idea.  What if vtn just used load_deref for
>> >> >> everything, and in the case of fxn params it just points to a local
>> >> >> var with type nir_var_param?  (Or something roughly like that.)
>> >> >> Then
>> >> >> lower_io lowers this to load_param.
>> >> >
>> >> >
>> >> > That's kind-of what the original thing did.  However, for SPIR-V
>> >> > helper
>> >> > functions we have to be able to pass through pointers, SSA values
>> >> > with
>> >> > arbitrary type, and image/sampler pointers.  SSA values can be
>> >> > handled
>> >> > by
>> >> > just making a variable and storing them to it.  Pointers are tricky
>> >> > because
>> >> > they're not really copy-in/out.  For images, samplers, and pointers,
>> >> > we
>> >> > have
>> >> > a pile of "try to patch up the deref chain" code in
>> >> > nir_inline_functions
>> >> > that's rather tricky.  The moral of the story is that "just use
>> >> > variables"
>> >> > is not nearly as obvious of a choice as it looks.
>> >>
>> >> So, I'm considering just adding nir_variable ptr back to
>> >> nir_parameter.  That would be rather easy, and gives something that
>> >> some lowering pass could fixup data.driver_loc with parameter
>> >> position.  That would be rather easy for driver backends:
>> >>
>> >>   param_loc = fxn->params[param_idx]->var->data.driver_loc
>> >
>> >
>> > Why can't you just add a mode nir_var_kernel_param and keep a running
>> > uint32_t offset in the builder.  When you process an OpParameter in an
>> > entrypoint, create a nir_var_kernel_param variable and set
>> > var->data.location = offset and add the size of the param to offset.
>> > Then
>>
>> So far, with the new deref chain, I've managed to keep knowledge of
>> physical layout of things in lower_io and out of vtn.. it would be
>> nice to keep it that way.  Otherwise vtn needs to know about
>> size/alignment rules.
>
>
> Another option (which maybe you've already brought up without me
> understanding) would be to make nir_var_kernel_param a local variable type
> and have the list of params stored in the function next to the locals.  Then
> you would know exactly what params are associated with a given function impl
> and can assign locations at-will.

That sounds fairly close to what I was suggesting with adding the var
to nir_parameter, just in nir_function_impl instead of nir_function

>
>>
>> > the OpParameter can return an access chain to the newly created
>> > interface
>> > variable.  I fail to see why load_param is needed at all here.
>>
>> I'd prefer not to have this end up like load_input (or load_uniform)
>> in the driver (if that is what you had in mind).. *maybe* we could end
>> up treating this like a UBO, although blob does try to use uniforms
>> when possible (like when it doesn't need a pointer or byte
>> addressing), since that is faster.  So I think we should have a
>> distinct intrinsic.. whether that be load_param or something
>> different.
>
>
> What is it about load_input and load_uniform that you don't like?  I'm
> really trying to understand what's going on here.  Clearly, there's
> something but I don't know what it is.

Well, a couple things..

1) for (perhaps hysterical) reasons, for gallium drivers
load_input/load_uniform have offset in units of 32b dwords.  Basically
this matched how TGSI works (and therefore, I guess, how tgsi_to_nir
worked, and the st_glsl_to_nir path after that..)

(This actually works fine for ir3, since hw sticks VS inputs into
registers for us, and non-ubo uniforms are basically just a different
bank of read-only registers.)

2) my current thinking to be able to use uniforms when possible (since
they are faster, can be directly used as src to other instructions in
many cases, etc), but still be able to do pointers and byte
addressing, I was thinking about using the same ubo as both a ubo and
uniform upload buffer.  So the benefit of a different intrinsic is
that I don't have to shoe-horn that special compute kernel logic into
load_input/load_uniform handling

>
>>
>> Also, fwiw, we need to be able to do pointer math on function parameters
>> ;-)
>>
>> __kernel void test_fn(__global int *res, __constant struct SomeStruct
>> b[5], int c)
>> {
>>     __constant struct SomeStruct *s = &b[c];
>>     if (c < 3)
>>         s++;
>>     *res = s->s.b;
>> }
>
>
> I figured you would.  That's part of my confusion about why you want to
> treat them the same as parameters you pass from one function to another.
> They don't work at all like local variables.

it does make it possibly attractive to fit this into the deref
scheme.. I think..

BR,
-R