[Mesa-dev] [PATCH v3 06/19] RFC: nir/vtn: "raw" pointer support

Jason Ekstrand jason at jlekstrand.net
Fri Mar 23 21:18:25 UTC 2018


On Fri, Mar 23, 2018 at 2:15 PM, Karol Herbst <kherbst at redhat.com> wrote:

> On Fri, Mar 23, 2018 at 10:07 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> > +list
> >
> > On Fri, Mar 23, 2018 at 1:45 PM, Karol Herbst <kherbst at redhat.com> wrote:
> >>
> >> On Fri, Mar 23, 2018 at 9:30 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> >> > As I've been rewriting core NIR deref handling, I've been thinking
> >> > about this problem quite a bit.  One objective I have is to actually
> >> > make UBO and SSBO access go through derefs instead of just being an
> >> > offset and index so that the compiler can better reason about them.
> >> > In particular, I want to be able to start doing load/store
> >> > elimination on SSBOs, SLM, and whatever CL has, which would be great
> >> > for everyone's compute performance (GL, Vulkan, CL, etc.).
> >> >
> >> > I would be lying if I said I had a full plan but I do have part of a
> >> > plan.  In my patch which adds the deref instructions, I add a new
> >> > "cast" deref type which takes an arbitrary value as its source and
> >> > kicks out a deref with a type.  Whenever we discover that the source
> >> > of the cast is actually another deref which is compatible (same type
> >> > etc.), copy propagation gets rid of the cast for you.  The idea is
> >> > that, instead of doing a load_raw(raw_ptr), you would do a
> >> > load((type *)raw_ptr).
> >> >
> >> > Right now, most of the core NIR optimizations will throw a fit if
> >> > they ever see a cast.  This is intentional because it requires us to
> >> > manually go through and handle casts.  This would mean that, at the
> >> > moment, you would have to lower to load_raw intrinsics almost
> >> > immediately after coming out of SPIR-V.
> >> >
> >>
> >> Well it gets more fun with OpenCL 2.0 where you can have generic
> >> pointers where you only know the type at creation time. You can also
> >> declare generic pointers as function inputs in a way that you never
> >> actually know from where you have to load if you only have that one
> >> function. So the actual load operation depends on when you create the
> >> initial pointer variable (you can cast from X to generic, but not the
> >> other way around).
> >>
> >> Which in the end means you can end up with load(generic_ptr), and only
> >> by following the chain up to its creation (with function inlining in
> >> mind) do you know the actual memory target.
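A small OpenCL C 2.0 sketch of that situation (illustrative only, not code from this patch): a function receiving a generic pointer cannot tell, in isolation, which address space the load targets, because casts to generic are one-way and the decision was made wherever the pointer was created.

```c
/* OpenCL C 2.0: pointers may be cast *to* generic but not back. */
int read_one(const generic int *p)
{
   return *p;   /* which load to emit depends entirely on the caller */
}

kernel void k(global int *out)
{
   local int tmp;
   tmp = 1;
   int a = read_one((generic int *)out);   /* global -> generic */
   int b = read_one((generic int *)&tmp);  /* local  -> generic */
   out[0] = a + b;
}
```

Only by chasing the chain back through both call sites (i.e. after inlining) does the compiler learn that one load targets global memory and the other local.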
> >
> >
> > Yup.  And there will always be crazy cases where you can't actually
> > follow it and you have to emit a pile of code to load different ways
> > depending on some bits somewhere that tell you how to load it.  I'm
> > well aware of the insanity. :-)  This is part of the reason why I'm
> > glad I'm not trying to write an OpenCL 2.0 driver.
> >
> > This insanity is exactly why I'm suggesting the pointer casting.  Sure,
> > you may not know the data type until the actual load.  In that case,
> > you end up with the cast being right before the load.  If you don't
> > know the storage class, maybe you have to switch and do multiple casts
> > based on some bits.  Alternatively, if you don't know the storage
> > class, we can just let the deref mode be 0 for "I don't know", or maybe
> > multiple bits for "these are the things it might be".  In any case, I
> > think we can handle it.
> >
>
> there shouldn't be a situation where we don't know, except when you
> don't inline all functions. I think Rob had the idea of fat pointers
> where a pointer is a vec2 and the 2nd component contains the actual
> pointer type and you end up with a switch over the type to get the
> correct storage class. And if the compiler inlines all functions, it
> should be able to optimize that switch away.
>

Right.  Today, we live in a world where all functions are inlined.  Sadly,
I fear that world may come to an end one of these days. :(


> > It's insane but we need some sort of structure to be able to reason
> > about the insanity.  Immediately lowering everything to load_raw is a
> > good way to get a driver off the ground.  What it's not so good for is
> > making an optimizing compiler that can reason about these crazy
> > pointers and actually optimize them.  Lest I sound too negative, I'm
> > 100% fine with taking a short path to getting something working now so
> > long as it doesn't cloud up our ability to do better in the future.
> >
> >>
> >> And I think the issue here is not that it is some kind of raw pointer
> >> in the patch, but more like an unbound/physical pointer, which doesn't
> >> relate to any variable. It is just a value like any other int/long as
> >> well.
> >>
> >> > On Fri, Mar 23, 2018 at 12:33 PM, Karol Herbst <kherbst at redhat.com> wrote:
> >> >>
> >> >> From: Rob Clark <robdclark at gmail.com>
> >> >>
> >> >> An attempt to add physical pointer support to vtn.  I'm not totally
> >> >> happy about the handling of logical pointers vs physical pointers.
> >> >> So this is really more of an RFS (request for suggestions)
> >> >>
> >> >> v2: treat vec3 types as vec4 when dereferencing
> >> >>
> >> >> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> >> >> ---
> >> >>  src/compiler/spirv/spirv_to_nir.c  |  87 ++++++++---
> >> >>  src/compiler/spirv/vtn_private.h   |  20 ++-
> >> >>  src/compiler/spirv/vtn_variables.c | 300 ++++++++++++++++++++++++++++++++-----
> >> >>  3 files changed, 347 insertions(+), 60 deletions(-)
> >> >>
> >> >> diff --git a/src/compiler/spirv/spirv_to_nir.c
> >> >> b/src/compiler/spirv/spirv_to_nir.c
> >> >> index 334bcab9a82..d58a68f80ef 100644
> >> >> --- a/src/compiler/spirv/spirv_to_nir.c
> >> >> +++ b/src/compiler/spirv/spirv_to_nir.c
> >> >> @@ -572,6 +572,7 @@ vtn_types_compatible(struct vtn_builder *b,
> >> >>               vtn_types_compatible(b, t1->array_element,
> >> >> t2->array_element);
> >> >>
> >> >>     case vtn_base_type_pointer:
> >> >> +   case vtn_base_type_raw_pointer:
> >> >>        return vtn_types_compatible(b, t1->deref, t2->deref);
> >> >>
> >> >>     case vtn_base_type_struct:
> >> >> @@ -609,6 +610,7 @@ vtn_type_copy(struct vtn_builder *b, struct
> >> >> vtn_type
> >> >> *src)
> >> >>     case vtn_base_type_matrix:
> >> >>     case vtn_base_type_array:
> >> >>     case vtn_base_type_pointer:
> >> >> +   case vtn_base_type_raw_pointer:
> >> >>     case vtn_base_type_image:
> >> >>     case vtn_base_type_sampler:
> >> >>     case vtn_base_type_sampled_image:
> >> >> @@ -939,6 +941,14 @@ vtn_type_layout_std430(struct vtn_builder *b,
> >> >> struct
> >> >> vtn_type *type,
> >> >>        return type;
> >> >>     }
> >> >>
> >> >> +   case vtn_base_type_raw_pointer: {
> >> >> +      uint32_t comp_size = b->ptr_size / 8;
> >> >> +      vtn_assert(comp_size);
> >> >> +      *size_out = comp_size;
> >> >> +      *align_out = comp_size;
> >> >> +      return type;
> >> >> +   }
> >> >> +
> >> >>     case vtn_base_type_vector: {
> >> >>        uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
> >> >>        assert(type->length > 0 && type->length <= 4);
> >> >> @@ -1003,6 +1013,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> >> opcode,
> >> >>        val->type->base_type = vtn_base_type_scalar;
> >> >>        val->type->type = glsl_bool_type();
> >> >>        val->type->length = 1;
> >> >> +      val->type->stride = 4;
> >> >>        break;
> >> >>     case SpvOpTypeInt: {
> >> >>        int bit_size = w[2];
> >> >> @@ -1025,6 +1036,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> >> opcode,
> >> >>           vtn_fail("Invalid int bit size");
> >> >>        }
> >> >>        val->type->length = 1;
> >> >> +      val->type->stride = bit_size / 8;
> >> >>        break;
> >> >>     }
> >> >>
> >> >> @@ -1045,6 +1057,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> >> opcode,
> >> >>           vtn_fail("Invalid float bit size");
> >> >>        }
> >> >>        val->type->length = 1;
> >> >> +      val->type->stride = bit_size / 8;
> >> >>        break;
> >> >>     }
> >> >>
> >> >> @@ -1061,6 +1074,10 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> >> opcode,
> >> >>        val->type->type = glsl_vector_type(glsl_get_base_type(base->type), elems);
> >> >>        val->type->length = elems;
> >> >>        val->type->stride = glsl_get_bit_size(base->type) / 8;
> >> >> +      /* special case: vec3 is aligned to vec4 */
> >> >> +      if (elems == 3)
> >> >> +         elems = 4;
> >> >> +      val->type->stride *= elems;
> >> >>        val->type->array_element = base;
> >> >>        break;
> >> >>     }
> >> >> @@ -1138,7 +1155,11 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> >> opcode,
> >> >>
> >> >>        const char *name = val->name ? val->name : "struct";
> >> >>
> >> >> -      val->type->type = glsl_struct_type(fields, num_fields, name,
> >> >> false);
> >> >> +      val->type->type = glsl_struct_type(fields, num_fields, name,
> >> >> +                                         val->type->packed);
> >> >> +      // TODO stride for a struct only matters for kernel shaders, where
> >> >> +      // cl_size is the right thing.. but still a bit ugly to hard-code.
> >> >> +      val->type->stride = glsl_get_cl_size(val->type->type);
> >> >>        break;
> >> >>     }
> >> >>
> >> >> @@ -1167,25 +1188,47 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> >> opcode,
> >> >>        val->type->storage_class = storage_class;
> >> >>        val->type->deref = deref_type;
> >> >>
> >> >> -      if (storage_class == SpvStorageClassUniform ||
> >> >> -          storage_class == SpvStorageClassStorageBuffer) {
> >> >> -         /* These can actually be stored to nir_variables and used
> as
> >> >> SSA
> >> >> -          * values so they need a real glsl_type.
> >> >> -          */
> >> >> -         val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
> >> >> -      }
> >> >> -
> >> >> -      if (storage_class == SpvStorageClassWorkgroup &&
> >> >> -          b->options->lower_workgroup_access_to_offsets) {
> >> >> +      // XXX handling the "fake" glsl pointers vs "raw" pointers in
> >> >> kernel
> >> >> +      // is a bit ugly..  need to understand how "pointers" are used
> >> >> in
> >> >> vk
> >> >> +      // and figure out something better
> >> >> +      if (storage_class == SpvStorageClassFunction ||
> >> >> +          storage_class == SpvStorageClassUniformConstant ||
> >> >> +          storage_class == SpvStorageClassWorkgroup ||
> >> >> +          !b->kernel_mode) {
> >> >> +         if (storage_class == SpvStorageClassUniform ||
> >> >> +             storage_class == SpvStorageClassStorageBuffer) {
> >> >> +            /* These can actually be stored to nir_variables and
> used
> >> >> as
> >> >> SSA
> >> >> +             * values so they need a real glsl_type.
> >> >> +             */
> >> >> +            val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
> >> >> +         } else if (storage_class == SpvStorageClassWorkgroup &&
> >> >> +                    b->options->lower_workgroup_access_to_offsets)
> {
> >> >> +            uint32_t size, align;
> >> >> +            val->type->deref = vtn_type_layout_std430(b,
> >> >> val->type->deref,
> >> >> +                                                      &size,
> &align);
> >> >> +            val->type->length = size;
> >> >> +            val->type->align = align;
> >> >> +            /* These can actually be stored to nir_variables and
> used
> >> >> as
> >> >> SSA
> >> >> +             * values so they need a real glsl_type.
> >> >> +             */
> >> >> +            val->type->type = glsl_uint_type();
> >> >> +         }
> >> >> +      } else {
> >> >> +         vtn_assert(storage_class == SpvStorageClassCrossWorkgroup
> ||
> >> >> +                    storage_class == SpvStorageClassInput);
> >> >>           uint32_t size, align;
> >> >> +         if (b->ptr_size == 64) {
> >> >> +            val->type->type = glsl_uint64_t_type();
> >> >> +         } else {
> >> >> +            val->type->type = glsl_uint_type();
> >> >> +         }
> >> >> +         val->type->base_type = vtn_base_type_raw_pointer;
> >> >> +         /* pointers can be accessed as array, so set the stride as
> >> >> size:
> >> >> */
> >> >>           val->type->deref = vtn_type_layout_std430(b,
> >> >> val->type->deref,
> >> >>                                                     &size, &align);
> >> >> -         val->type->length = size;
> >> >> +#define ALIGN(_v, _d) (((_v) + ((_d) - 1)) & ~((_d) - 1))
> >> >> +         val->type->stride = ALIGN(size, align);
> >> >>           val->type->align = align;
> >> >> -         /* These can actually be stored to nir_variables and used
> as
> >> >> SSA
> >> >> -          * values so they need a real glsl_type.
> >> >> -          */
> >> >> -         val->type->type = glsl_uint_type();
> >> >>        }
> >> >>        break;
> >> >>     }
> >> >> @@ -3391,9 +3434,16 @@ vtn_handle_preamble_instruction(struct
> >> >> vtn_builder
> >> >> *b, SpvOp opcode,
> >> >>        break;
> >> >>
> >> >>     case SpvOpMemoryModel:
> >> >> -      vtn_assert(w[1] == SpvAddressingModelLogical);
> >> >> +      vtn_assert(w[1] == SpvAddressingModelLogical ||
> >> >> +                 w[1] == SpvAddressingModelPhysical32 ||
> >> >> +                 w[1] == SpvAddressingModelPhysical64);
> >> >>        vtn_assert(w[2] == SpvMemoryModelSimple ||
> >> >> -                 w[2] == SpvMemoryModelGLSL450);
> >> >> +                 w[2] == SpvMemoryModelGLSL450 ||
> >> >> +                 w[2] == SpvMemoryModelOpenCL);
> >> >> +      if (w[1] == SpvAddressingModelPhysical32)
> >> >> +         b->ptr_size = 32;
> >> >> +      else if (w[1] == SpvAddressingModelPhysical64)
> >> >> +         b->ptr_size = 64;
> >> >>        break;
> >> >>
> >> >>     case SpvOpEntryPoint: {
> >> >> @@ -3780,6 +3830,7 @@ vtn_handle_body_instruction(struct vtn_builder
> >> >> *b,
> >> >> SpvOp opcode,
> >> >>           sel_type = glsl_vector_type(GLSL_TYPE_BOOL,
> >> >> res_val->type->length);
> >> >>           break;
> >> >>        case vtn_base_type_pointer:
> >> >> +      case vtn_base_type_raw_pointer:
> >> >>           /* We need to have actual storage for pointer types */
> >> >>           vtn_fail_if(res_val->type->type == NULL,
> >> >>                       "Invalid pointer result type for OpSelect");
> >> >> diff --git a/src/compiler/spirv/vtn_private.h
> >> >> b/src/compiler/spirv/vtn_private.h
> >> >> index bd273122703..dbfe9eab58a 100644
> >> >> --- a/src/compiler/spirv/vtn_private.h
> >> >> +++ b/src/compiler/spirv/vtn_private.h
> >> >> @@ -262,7 +262,8 @@ enum vtn_base_type {
> >> >>     vtn_base_type_matrix,
> >> >>     vtn_base_type_array,
> >> >>     vtn_base_type_struct,
> >> >> -   vtn_base_type_pointer,
> >> >> +   vtn_base_type_pointer,         /* a logical pointer */
> >> >> +   vtn_base_type_raw_pointer,     /* a physical pointer */
> >> >>     vtn_base_type_image,
> >> >>     vtn_base_type_sampler,
> >> >>     vtn_base_type_sampled_image,
> >> >> @@ -407,12 +408,14 @@ enum vtn_variable_mode {
> >> >>     vtn_variable_mode_local,
> >> >>     vtn_variable_mode_global,
> >> >>     vtn_variable_mode_param,
> >> >> +   vtn_variable_mode_const,
> >> >>     vtn_variable_mode_ubo,
> >> >>     vtn_variable_mode_ssbo,
> >> >>     vtn_variable_mode_push_constant,
> >> >>     vtn_variable_mode_image,
> >> >>     vtn_variable_mode_sampler,
> >> >>     vtn_variable_mode_workgroup,
> >> >> +   vtn_variable_mode_cross_workgroup,
> >> >>     vtn_variable_mode_input,
> >> >>     vtn_variable_mode_output,
> >> >>  };
> >> >> @@ -588,6 +591,13 @@ struct vtn_builder {
> >> >>
> >> >>     bool has_loop_continue;
> >> >>
> >> >> +   /* pointer size is:
> >> >> +    *   AddressingModelLogical:    0    (default)
> >> >> +    *   AddressingModelPhysical32: 32
> >> >> +    *   AddressingModelPhysical64: 64
> >> >> +    */
> >> >> +   unsigned ptr_size;
> >> >> +
> >> >>     bool kernel_mode;
> >> >>  };
> >> >>
> >> >> @@ -624,7 +634,8 @@ vtn_push_ssa(struct vtn_builder *b, uint32_t
> >> >> value_id,
> >> >>               struct vtn_type *type, struct vtn_ssa_value *ssa)
> >> >>  {
> >> >>     struct vtn_value *val;
> >> >> -   if (type->base_type == vtn_base_type_pointer) {
> >> >> +   if (type->base_type == vtn_base_type_pointer ||
> >> >> +       type->base_type == vtn_base_type_raw_pointer) {
> >> >>        val = vtn_push_value(b, value_id, vtn_value_type_pointer);
> >> >>        val->pointer = vtn_pointer_from_ssa(b, ssa->def, type);
> >> >>     } else {
> >> >> @@ -688,6 +699,11 @@ struct vtn_ssa_value *vtn_local_load(struct
> >> >> vtn_builder *b, nir_deref_var *src);
> >> >>  void vtn_local_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >>                       nir_deref_var *dest);
> >> >>
> >> >> +struct vtn_ssa_value *vtn_pointer_load(struct vtn_builder *b,
> >> >> +                                       struct vtn_pointer *ptr);
> >> >> +void vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >> +                       struct vtn_pointer *ptr);
> >> >> +
> >> >>  struct vtn_ssa_value *
> >> >>  vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src);
> >> >>
> >> >> diff --git a/src/compiler/spirv/vtn_variables.c
> >> >> b/src/compiler/spirv/vtn_variables.c
> >> >> index b2897407fb1..af9222d6f4e 100644
> >> >> --- a/src/compiler/spirv/vtn_variables.c
> >> >> +++ b/src/compiler/spirv/vtn_variables.c
> >> >> @@ -51,6 +51,9 @@ vtn_access_chain_extend(struct vtn_builder *b,
> struct
> >> >> vtn_access_chain *old,
> >> >>     unsigned old_len = old ? old->length : 0;
> >> >>     chain = vtn_access_chain_create(b, old_len + new_ids);
> >> >>
> >> >> +   if (old)
> >> >> +      chain->ptr_as_array = old->ptr_as_array;
> >> >> +
> >> >>     for (unsigned i = 0; i < old_len; i++)
> >> >>        chain->link[i] = old->link[i];
> >> >>
> >> >> @@ -88,11 +91,12 @@ vtn_access_chain_pointer_dereference(struct
> >> >> vtn_builder *b,
> >> >>        vtn_access_chain_extend(b, base->chain, deref_chain->length);
> >> >>     struct vtn_type *type = base->type;
> >> >>
> >> >> -   /* OpPtrAccessChain is only allowed on things which support
> >> >> variable
> >> >> -    * pointers.  For everything else, the client is expected to just
> >> >> pass
> >> >> us
> >> >> -    * the right access chain.
> >> >> +   /* We can get an *PtrAccessChain with cl kernels, with first
> >> >> element
> >> >> +    * of deref chain being literal zero.  But we can't really cope
> w/
> >> >> +    * tacking on another level of ptr_as_array.
> >> >>      */
> >> >> -   vtn_assert(!deref_chain->ptr_as_array);
> >> >> +   vtn_assert(!(chain->ptr_as_array && deref_chain->ptr_as_array));
> >> >> +   chain->ptr_as_array = deref_chain->ptr_as_array;
> >> >>
> >> >>     unsigned start = base->chain ? base->chain->length : 0;
> >> >>     for (unsigned i = 0; i < deref_chain->length; i++) {
> >> >> @@ -135,6 +139,21 @@ vtn_access_link_as_ssa(struct vtn_builder *b,
> >> >> struct
> >> >> vtn_access_link link,
> >> >>     }
> >> >>  }
> >> >>
> >> >> +static struct vtn_access_link
> >> >> +vtn_to_access_link(struct vtn_builder *b, uint32_t link_id)
> >> >> +{
> >> >> +   struct vtn_value *link_val = vtn_untyped_value(b, link_id);
> >> >> +   static struct vtn_access_link link;
> >> >> +   if (link_val->value_type == vtn_value_type_constant) {
> >> >> +      link.mode = vtn_access_mode_literal;
> >> >> +      link.id = link_val->constant->values[0].u32[0];
> >> >> +   } else {
> >> >> +      link.mode = vtn_access_mode_id;
> >> >> +      link.id = link_id;
> >> >> +   }
> >> >> +   return link;
> >> >> +}
> >> >> +
> >> >>  static nir_ssa_def *
> >> >>  vtn_variable_resource_index(struct vtn_builder *b, struct
> vtn_variable
> >> >> *var,
> >> >>                              nir_ssa_def *desc_array_index)
> >> >> @@ -207,6 +226,12 @@ vtn_ssa_offset_pointer_dereference(struct
> >> >> vtn_builder
> >> >> *b,
> >> >>              /* You can't have a zero-length OpPtrAccessChain */
> >> >>              vtn_assert(deref_chain->length >= 1);
> >> >>              desc_arr_idx = vtn_access_link_as_ssa(b,
> >> >> deref_chain->link[0], 1);
> >> >> +         } else if (!glsl_type_is_struct(type->type)) {
> >> >> +            /* if we have something that is ptr to simple type, like
> >> >> an
> >> >> +             * int ptr, then just treat that like an array of length
> >> >> 1,
> >> >> +             * we just want to deref the zero'th element:
> >> >> +             */
> >> >> +            desc_arr_idx = nir_imm_int(&b->nb, 0);
> >> >>           } else {
> >> >>              /* We have a regular non-array SSBO. */
> >> >>              desc_arr_idx = NULL;
> >> >> @@ -304,7 +329,9 @@ vtn_ssa_offset_pointer_dereference(struct
> >> >> vtn_builder
> >> >> *b,
> >> >>        case GLSL_TYPE_FLOAT:
> >> >>        case GLSL_TYPE_FLOAT16:
> >> >>        case GLSL_TYPE_DOUBLE:
> >> >> -      case GLSL_TYPE_BOOL:
> >> >> +      case GLSL_TYPE_BOOL: {
> >> >> +         break;
> >> >> +      }
> >> >>        case GLSL_TYPE_ARRAY: {
> >> >>           nir_ssa_def *elem_offset =
> >> >>              vtn_access_link_as_ssa(b, deref_chain->link[idx],
> >> >> type->stride);
> >> >> @@ -336,6 +363,98 @@ vtn_ssa_offset_pointer_dereference(struct
> >> >> vtn_builder
> >> >> *b,
> >> >>     return ptr;
> >> >>  }
> >> >>
> >> >> +static struct vtn_pointer *
> >> >> +vtn_ssa_raw_pointer_dereference(struct vtn_builder *b,
> >> >> +                                struct vtn_pointer *base,
> >> >> +                                struct vtn_access_chain
> *deref_chain)
> >> >> +{
> >> >> +   nir_ssa_def *offset = nir_imm_int(&b->nb, 0);
> >> >> +   struct vtn_type *type = base->type;
> >> >> +
> >> >> +
> >> >> +   vtn_assert(type);
> >> >> +
> >> >> +   unsigned idx = 0;
> >> >> +
> >> >> +   /* For the *PtrAccessChain deref instructions, the first entry in
> >> >> +    * the deref chain is deref'ing the ptr as an array, ie. if you
> >> >> +    * have "struct foo *f" then it can be deref'd as either
> "f->blah",
> >> >> +    * which is equiv to "f[0].blah", or "f[n].blah".
> >> >> +    */
> >> >> +   if (deref_chain->ptr_as_array) {
> >> >> +      nir_ssa_def *elem_offset =
> >> >> +         vtn_access_link_as_ssa(b, deref_chain->link[idx],
> >> >> type->stride);
> >> >> +      offset = nir_iadd(&b->nb, offset, elem_offset);
> >> >> +      idx++;
> >> >> +   }
> >> >> +
> >> >> +   /* TODO this can probably be refactored out into helper?  Nearly
> >> >> +    * the same as in vtn_ssa_offset_pointer_dereference() and
> perhaps
> >> >> +    * elsewhere..
> >> >> +    *
> >> >> +    * Note that the type here in the switch is of what the pointer
> >> >> +    * points to, ie. if the base type is uint, we are dereferencing
> >> >> +    * (possibly as an array) a uint* pointer.
> >> >> +    */
> >> >> +   for (; idx < deref_chain->length; idx++) {
> >> >> +      switch (glsl_get_base_type(type->type)) {
> >> >> +      case GLSL_TYPE_UINT:
> >> >> +      case GLSL_TYPE_INT:
> >> >> +      case GLSL_TYPE_UINT16:
> >> >> +      case GLSL_TYPE_INT16:
> >> >> +      case GLSL_TYPE_UINT8:
> >> >> +      case GLSL_TYPE_INT8:
> >> >> +      case GLSL_TYPE_UINT64:
> >> >> +      case GLSL_TYPE_INT64:
> >> >> +      case GLSL_TYPE_FLOAT:
> >> >> +      case GLSL_TYPE_FLOAT16:
> >> >> +      case GLSL_TYPE_DOUBLE:
> >> >> +      case GLSL_TYPE_BOOL:
> >> >> +      case GLSL_TYPE_ARRAY: {
> >> >> +         nir_ssa_def *elem_offset =
> >> >> +            vtn_access_link_as_ssa(b, deref_chain->link[idx],
> >> >> type->stride);
> >> >> +         offset = nir_iadd(&b->nb, offset, elem_offset);
> >> >> +         if (type->array_element) {
> >> >> +            type = type->array_element;
> >> >> +         } else {
> >> >> +            vtn_assert(idx == (deref_chain->length - 1));
> >> >> +         }
> >> >> +         break;
> >> >> +      }
> >> >> +
> >> >> +      case GLSL_TYPE_STRUCT: {
> >> >> +         vtn_assert(deref_chain->link[idx].mode ==
> >> >> vtn_access_mode_literal);
> >> >> +         unsigned member = deref_chain->link[idx].id;
> >> >> +         nir_ssa_def *mem_offset = nir_imm_int(&b->nb,
> >> >> type->offsets[member]);
> >> >> +         offset = nir_iadd(&b->nb, offset, mem_offset);
> >> >> +         type = type->members[member];
> >> >> +         break;
> >> >> +      }
> >> >> +
> >> >> +      default:
> >> >> +         vtn_fail("Invalid type for deref");
> >> >> +      }
> >> >> +   }
> >> >> +
> >> >> +   /* at this point, offset is 32b, but pointer can be either 32b or
> >> >> 64b
> >> >> +    * depending on memory model:
> >> >> +    */
> >> >> +   if (b->ptr_size == 64) {
> >> >> +      offset = nir_u2u64(&b->nb, offset);
> >> >> +   }
> >> >> +
> >> >> +   /* add pointer address to calculated offset: */
> >> >> +   if (base->offset)
> >> >> +      offset = nir_iadd(&b->nb, offset, base->offset);
> >> >> +
> >> >> +   struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
> >> >> +   ptr->mode = base->mode;
> >> >> +   ptr->type = base->type;
> >> >> +   ptr->offset = offset;
> >> >> +
> >> >> +   return ptr;
> >> >> +}
> >> >> +
> >> >>  /* Dereference the given base pointer by the access chain */
> >> >>  static struct vtn_pointer *
> >> >>  vtn_pointer_dereference(struct vtn_builder *b,
> >> >> @@ -344,6 +463,8 @@ vtn_pointer_dereference(struct vtn_builder *b,
> >> >>  {
> >> >>     if (vtn_pointer_uses_ssa_offset(b, base)) {
> >> >>        return vtn_ssa_offset_pointer_dereference(b, base,
> deref_chain);
> >> >> +   } else if (base->ptr_type->base_type ==
> vtn_base_type_raw_pointer)
> >> >> {
> >> >> +      return vtn_ssa_raw_pointer_dereference(b, base, deref_chain);
> >> >>     } else {
> >> >>        return vtn_access_chain_pointer_dereference(b, base,
> >> >> deref_chain);
> >> >>     }
> >> >> @@ -373,7 +494,8 @@ vtn_pointer_for_variable(struct vtn_builder *b,
> >> >>
> >> >>     pointer->mode = var->mode;
> >> >>     pointer->type = var->type;
> >> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
> >> >>     vtn_assert(ptr_type->deref->type == var->type->type);
> >> >>     pointer->ptr_type = ptr_type;
> >> >>     pointer->var = var;
> >> >> @@ -384,6 +506,12 @@ vtn_pointer_for_variable(struct vtn_builder *b,
> >> >>  nir_deref_var *
> >> >>  vtn_pointer_to_deref(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> >>  {
> >> >> +   /* once you've chased a pointer, you no longer really have a var,
> >> >> +    * so this case is handled differently
> >> >> +    */
> >> >> +   if (!ptr->var)
> >> >> +      return NULL;
> >> >> +
> >> >>     /* Do on-the-fly copy propagation for samplers. */
> >> >>     if (ptr->var->copy_prop_sampler)
> >> >>        return vtn_pointer_to_deref(b, ptr->var->copy_prop_sampler);
> >> >> @@ -408,7 +536,19 @@ vtn_pointer_to_deref(struct vtn_builder *b,
> struct
> >> >> vtn_pointer *ptr)
> >> >>     nir_deref *tail = &deref_var->deref;
> >> >>     nir_variable **members = ptr->var->members;
> >> >>
> >> >> -   for (unsigned i = 0; i < chain->length; i++) {
> >> >> +   unsigned i = 0;
> >> >> +   if (chain->ptr_as_array) {
> >> >> +      /* We can't currently handle this in a nir deref chain.
> Perhaps
> >> >> +       * a new type of deref node is needed?  But we can at least
> >> >> handle
> >> >> +       * the case where the first deref link is literal zero by
> >> >> skipping
> >> >> +       * it.
> >> >> +       */
> >> >> +      vtn_assert(chain->link[0].mode == vtn_access_mode_literal);
> >> >> +      vtn_assert(chain->link[0].id == 0);
> >> >> +      i++;
> >> >> +   }
> >> >> +
> >> >> +   for (; i < chain->length; i++) {
> >> >>        enum glsl_base_type base_type =
> >> >> glsl_get_base_type(deref_type->type);
> >> >>        switch (base_type) {
> >> >>        case GLSL_TYPE_UINT:
> >> >> @@ -598,6 +738,49 @@ vtn_local_store(struct vtn_builder *b, struct
> >> >> vtn_ssa_value *src,
> >> >>     }
> >> >>  }
> >> >>
> >> >> +struct vtn_ssa_value *
> >> >> +vtn_pointer_load(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> >> +{
> >> >> +   const struct glsl_type *type = ptr->type->type;
> >> >> +   struct vtn_ssa_value *val = vtn_create_ssa_value(b, type);
> >> >> +   nir_intrinsic_op op = nir_intrinsic_load_global;
> >> >> +
> >> >> +   vtn_assert(ptr->offset);
> >> >> +
> >> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader, op);
> >> >> +   intrin->num_components = glsl_get_vector_elements(type);
> >> >> +   intrin->src[0] = nir_src_for_ssa(ptr->offset);
> >> >> +
> >> >> +   nir_ssa_dest_init(&intrin->instr, &intrin->dest,
> >> >> +                     intrin->num_components,
> >> >> +                     glsl_get_bit_size(type),
> >> >> +                     NULL);
> >> >> +   val->def = &intrin->dest.ssa;
> >> >> +
> >> >> +   nir_builder_instr_insert(&b->nb, &intrin->instr);
> >> >> +
> >> >> +   return val;
> >> >> +}
> >> >> +
> >> >> +void
> >> >> +vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >> +                  struct vtn_pointer *ptr)
> >> >> +{
> >> >> +   const struct glsl_type *type = ptr->type->type;
> >> >> +   nir_intrinsic_op op = nir_intrinsic_store_global;
> >> >> +
> >> >> +   vtn_assert(ptr->offset);
> >> >> +
> >> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader, op);
> >> >> +   intrin->num_components = MAX2(1, glsl_get_vector_elements(type));
> >> >> +   intrin->src[0] = nir_src_for_ssa(src->def);
> >> >> +   intrin->src[1] = nir_src_for_ssa(ptr->offset);
> >> >> +
> >> >> +   nir_intrinsic_set_write_mask(intrin, (1 << intrin->num_components) - 1);
> >> >> +
> >> >> +   nir_builder_instr_insert(&b->nb, &intrin->instr);
> >> >> +}
> >> >> +
> >> >>  nir_ssa_def *
> >> >>  vtn_pointer_to_offset(struct vtn_builder *b, struct vtn_pointer *ptr,
> >> >>                        nir_ssa_def **index_out, unsigned *end_idx_out)
> >> >> @@ -1004,20 +1187,33 @@ _vtn_variable_load_store(struct vtn_builder
> *b,
> >> >> bool load,
> >> >>     case GLSL_TYPE_FLOAT:
> >> >>     case GLSL_TYPE_FLOAT16:
> >> >>     case GLSL_TYPE_BOOL:
> >> >> -   case GLSL_TYPE_DOUBLE:
> >> >> -      /* At this point, we have a scalar, vector, or matrix so we
> know
> >> >> that
> >> >> -       * there cannot be any structure splitting still in the way.
> By
> >> >> -       * stopping at the matrix level rather than the vector level,
> we
> >> >> -       * ensure that matrices get loaded in the optimal way even if
> >> >> they
> >> >> -       * are storred row-major in a UBO.
> >> >> -       */
> >> >> -      if (load) {
> >> >> -         *inout = vtn_local_load(b, vtn_pointer_to_deref(b, ptr));
> >> >> +   case GLSL_TYPE_DOUBLE: {
> >> >> +      nir_deref_var *deref_var = vtn_pointer_to_deref(b, ptr);
> >> >> +
> >> >> +      if (deref_var) {
> >> >> +         /* At this point, we have a scalar, vector, or matrix so we
> >> >> know
> >> >> that
> >> >> +          * there cannot be any structure splitting still in the
> way.
> >> >> By
> >> >> +          * stopping at the matrix level rather than the vector
> level,
> >> >> we
> >> >> +          * ensure that matrices get loaded in the optimal way even
> if
> >> >> they
> >> >> +          * are storred row-major in a UBO.
> >> >> +          */
> >> >> +         if (load) {
> >> >> +            *inout = vtn_local_load(b, deref_var);
> >> >> +         } else {
> >> >> +            vtn_local_store(b, *inout, deref_var);
> >> >> +         }
> >> >>        } else {
> >> >> -         vtn_local_store(b, *inout, vtn_pointer_to_deref(b, ptr));
> >> >> +         /* If deref'ing a raw pointer, we don't have a variable to go along
> >> >> +          * with it.  Just directly generate load/store_global intrinsics:
> >> >> +          */
> >> >> +         if (load) {
> >> >> +            *inout = vtn_pointer_load(b, ptr);
> >> >> +         } else {
> >> >> +            vtn_pointer_store(b, *inout, ptr);
> >> >> +         }
> >> >>        }
> >> >>        return;
> >> >> -
> >> >> +   }
> >> >>     case GLSL_TYPE_ARRAY:
> >> >>     case GLSL_TYPE_STRUCT: {
> >> >>        unsigned elems = glsl_get_length(ptr->type->type);
> >> >> @@ -1052,6 +1248,8 @@ vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src)
> >> >>  {
> >> >>     if (vtn_pointer_is_external_block(b, src)) {
> >> >>        return vtn_block_load(b, src);
> >> >> +   } else if (!src->var) {
> >> >> +      return vtn_pointer_load(b, src);
> >> >>     } else {
> >> >>        struct vtn_ssa_value *val = NULL;
> >> >>        _vtn_variable_load_store(b, true, src, &val);
> >> >> @@ -1065,8 +1263,11 @@ vtn_variable_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >>  {
> >> >>     if (vtn_pointer_is_external_block(b, dest)) {
> >> >>        vtn_assert(dest->mode == vtn_variable_mode_ssbo ||
> >> >> -                 dest->mode == vtn_variable_mode_workgroup);
> >> >> +                 dest->mode == vtn_variable_mode_workgroup ||
> >> >> +                 dest->mode == vtn_variable_mode_param);
> >> >>        vtn_block_store(b, src, dest);
> >> >> +   } else if (!dest->var) {
> >> >> +      vtn_pointer_store(b, src, dest);
> >> >>     } else {
> >> >>        _vtn_variable_load_store(b, false, dest, &src);
> >> >>     }
> >> >> @@ -1608,6 +1809,7 @@ static enum vtn_variable_mode
> >> >>  vtn_storage_class_to_mode(struct vtn_builder *b,
> >> >>                            SpvStorageClass class,
> >> >>                            struct vtn_type *interface_type,
> >> >> +                          bool has_initializer,
> >> >>                            nir_variable_mode *nir_mode_out)
> >> >>  {
> >> >>     enum vtn_variable_mode mode;
> >> >> @@ -1635,8 +1837,12 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
> >> >>        } else if (glsl_type_is_sampler(interface_type->type)) {
> >> >>           mode = vtn_variable_mode_sampler;
> >> >>           nir_mode = nir_var_uniform;
> >> >> +      } else if (has_initializer) {
> >> >> +         mode = vtn_variable_mode_const;
> >> >> +         nir_mode = nir_var_global;
> >> >>        } else {
> >> >> -         vtn_fail("Invalid uniform constant variable type");
> >> >> +         mode = vtn_variable_mode_push_constant;
> >> >> +         nir_mode = nir_var_uniform;
> >> >>        }
> >> >>        break;
> >> >>     case SpvStorageClassPushConstant:
> >> >> @@ -1664,6 +1870,10 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
> >> >>        nir_mode = nir_var_shared;
> >> >>        break;
> >> >>     case SpvStorageClassCrossWorkgroup:
> >> >> +      mode = vtn_variable_mode_cross_workgroup;
> >> >> +      // TODO do we want a new nir_mode for raw pointers?
> >> >> +      nir_mode = nir_var_global;
> >> >> +      break;
> >> >>     case SpvStorageClassGeneric:
> >> >>     case SpvStorageClassAtomicCounter:
> >> >>     default:
> >> >> @@ -1692,7 +1902,14 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> >>        struct vtn_access_chain chain = {
> >> >>           .length = 0,
> >> >>        };
> >> >> -      ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
> >> >> +
> >> >> +      if (vtn_pointer_uses_ssa_offset(b, ptr)) {
> >> >> +         ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
> >> >> +      } else if (ptr->ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> >> +         ptr = vtn_ssa_raw_pointer_dereference(b, ptr, &chain);
> >> >> +      } else {
> >> >> +         vtn_fail("unhandled");
> >> >> +      }
> >> >>     }
> >> >>
> >> >>     vtn_assert(ptr->offset);
> >> >> @@ -1710,27 +1927,34 @@ struct vtn_pointer *
> >> >>  vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
> >> >>                       struct vtn_type *ptr_type)
> >> >>  {
> >> >> -   vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
> >> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
> >> >>     vtn_assert(ptr_type->deref->base_type != vtn_base_type_pointer);
> >> >>     /* This pointer type needs to have actual storage */
> >> >>     vtn_assert(ptr_type->type);
> >> >>
> >> >>     struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
> >> >>     ptr->mode = vtn_storage_class_to_mode(b, ptr_type->storage_class,
> >> >> -                                         ptr_type, NULL);
> >> >> +                                         ptr_type, false, NULL);
> >> >> +
> >> >> +   if (ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> >> +      vtn_assert(ssa->num_components == 1 && ssa->bit_size == b->ptr_size);
> >> >> +   } else {
> >> >> +      vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
> >> >> +   }
> >> >> +
> >> >>     ptr->type = ptr_type->deref;
> >> >>     ptr->ptr_type = ptr_type;
> >> >>
> >> >>     if (ssa->num_components > 1) {
> >> >>        vtn_assert(ssa->num_components == 2);
> >> >>        vtn_assert(ptr->mode == vtn_variable_mode_ubo ||
> >> >> -                 ptr->mode == vtn_variable_mode_ssbo);
> >> >> +                 ptr->mode == vtn_variable_mode_ssbo ||
> >> >> +                 ptr->mode == vtn_variable_mode_global);
> >> >>        ptr->block_index = nir_channel(&b->nb, ssa, 0);
> >> >>        ptr->offset = nir_channel(&b->nb, ssa, 1);
> >> >>     } else {
> >> >>        vtn_assert(ssa->num_components == 1);
> >> >> -      vtn_assert(ptr->mode == vtn_variable_mode_workgroup);
> >> >>        ptr->block_index = NULL;
> >> >>        ptr->offset = ssa;
> >> >>     }
> >> >> @@ -1761,7 +1985,8 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
> >> >>                      struct vtn_type *ptr_type, SpvStorageClass storage_class,
> >> >>                      nir_constant *initializer)
> >> >>  {
> >> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
> >> >>     struct vtn_type *type = ptr_type->deref;
> >> >>
> >> >>     struct vtn_type *without_array = type;
> >> >> @@ -1770,7 +1995,8 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
> >> >>
> >> >>     enum vtn_variable_mode mode;
> >> >>     nir_variable_mode nir_mode;
> >> >> -   mode = vtn_storage_class_to_mode(b, storage_class, without_array, &nir_mode);
> >> >> +   mode = vtn_storage_class_to_mode(b, storage_class, without_array,
> >> >> +                                    !!initializer, &nir_mode);
> >> >>
> >> >>     switch (mode) {
> >> >>     case vtn_variable_mode_ubo:
> >> >> @@ -1933,6 +2159,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
> >> >>     case vtn_variable_mode_ubo:
> >> >>     case vtn_variable_mode_ssbo:
> >> >>     case vtn_variable_mode_push_constant:
> >> >> +   case vtn_variable_mode_cross_workgroup:
> >> >>        /* These don't need actual variables. */
> >> >>        break;
> >> >>     }
> >> >> @@ -2031,25 +2258,18 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
> >> >>     case SpvOpAccessChain:
> >> >>     case SpvOpPtrAccessChain:
> >> >>     case SpvOpInBoundsAccessChain: {
> >> >> +      struct vtn_type *ptr_type = vtn_value(b, w[1], vtn_value_type_type)->type;
> >> >> +      struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
> >> >>        struct vtn_access_chain *chain = vtn_access_chain_create(b, count - 4);
> >> >> -      chain->ptr_as_array = (opcode == SpvOpPtrAccessChain);
> >> >> +      chain->ptr_as_array = (opcode == SpvOpPtrAccessChain) ||
> >> >> +                            (opcode == SpvOpInBoundsPtrAccessChain);
> >> >>
> >> >>        unsigned idx = 0;
> >> >>        for (int i = 4; i < count; i++) {
> >> >> -         struct vtn_value *link_val = vtn_untyped_value(b, w[i]);
> >> >> -         if (link_val->value_type == vtn_value_type_constant) {
> >> >> -            chain->link[idx].mode = vtn_access_mode_literal;
> >> >> -            chain->link[idx].id = link_val->constant->values[0].u32[0];
> >> >> -         } else {
> >> >> -            chain->link[idx].mode = vtn_access_mode_id;
> >> >> -            chain->link[idx].id = w[i];
> >> >> -
> >> >> -         }
> >> >> +         chain->link[idx] = vtn_to_access_link(b, w[i]);
> >> >>           idx++;
> >> >>        }
> >> >>
> >> >> -      struct vtn_type *ptr_type = vtn_value(b, w[1], vtn_value_type_type)->type;
> >> >> -      struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
> >> >>        if (base_val->value_type == vtn_value_type_sampled_image) {
> >> >>           /* This is rather insane.  SPIR-V allows you to use OpSampledImage
> >> >>            * to combine an array of images with a single sampler to get an
> >> >> --
> >> >> 2.14.3
> >> >>
> >> >> _______________________________________________
> >> >> mesa-dev mailing list
> >> >> mesa-dev at lists.freedesktop.org
> >> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >> >
> >> >
> >
> >
>