[Mesa-dev] [PATCH v3 06/19] RFC: nir/vtn: "raw" pointer support

Jason Ekstrand jason at jlekstrand.net
Fri Mar 23 21:07:45 UTC 2018


+list

On Fri, Mar 23, 2018 at 1:45 PM, Karol Herbst <kherbst at redhat.com> wrote:

> On Fri, Mar 23, 2018 at 9:30 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> > As I've been rewriting core NIR deref handling, I've been thinking about
> > this problem quite a bit.  One objective I have is to actually make UBO
> > and SSBO access go through derefs instead of just being an offset and
> > index so that the compiler can better reason about them.  In particular,
> > I want to be able to start doing load/store elimination on SSBOs, SLM,
> > and whatever CL has, which would be great for everyone's compute
> > performance (GL, Vulkan, CL, etc.).
> >
> > I would be lying if I said I had a full plan but I do have part of a
> > plan.  In my patch which adds the deref instructions, I add a new
> > "cast" deref type which takes an arbitrary value as its source and
> > kicks out a deref with a type.  Whenever we discover that the source of
> > the cast is actually another deref which is compatible (same type
> > etc.), copy propagation gets rid of the cast for you.  The idea is
> > that, instead of doing a load_raw(raw_ptr), you would do a
> > load((type *)raw_ptr).
> >
> > Right now, most of the core NIR optimizations will throw a fit if they
> > ever see a cast.  This is intentional because it requires us to
> > manually go through and handle casts.  This would mean that, at the
> > moment, you would have to lower to load_raw intrinsics almost
> > immediately after coming out of SPIR-V.
> >
>
> Well it gets more fun with OpenCL 2.0, where you can have generic
> pointers for which you only know the type at creation time.  You can
> also declare generic pointers as function inputs in such a way that,
> from that one function alone, you never actually know where you have to
> load from.  So the actual load operation depends on where you create
> the initial pointer variable (you can cast from X to generic, but not
> the other way around).
>
> Which in the end means you can end up with load(generic_ptr), and only
> by following the chain up to its creation (with function inlining in
> mind) do you know the actual memory target.
>

Yup.  And there will always be crazy cases where you can't actually follow
it and you have to emit a pile of code that loads in different ways
depending on some bits somewhere that tell you how to load it.  I'm well
aware of the insanity. :-)  This is part of the reason why I'm glad I'm not
trying to write an OpenCL 2.0 driver.
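
(For illustration, a minimal OpenCL C 2.0 sketch of the situation Karol
describes; the kernel and helper names are made up.  Inside sum() the
address space behind p is unknown, because it depends entirely on which
pointer the caller converted to the generic address space:)

    /* Hypothetical example, not from the patch. */
    int sum(int *p, int n)                /* unqualified pointer = generic */
    {
       int s = 0;
       for (int i = 0; i < n; i++)
          s += p[i];                      /* load through a generic pointer */
       return s;
    }

    __kernel void k(__global int *g, __local int *l, __global int *out,
                    int use_local)
    {
       /* __global and __local convert to generic, never the other way
        * around, so only the point of creation knows the real target.
        */
       out[0] = sum(use_local ? (int *)l : (int *)g, 4);
    }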

This insanity is exactly why I'm suggesting the pointer casting.  Sure, you
may not know the data type until the actual load.  In that case, you end up
with the cast sitting right before the load.  If you don't know the storage
class, maybe you have to switch and do multiple casts based on some bits.
Alternatively, we can just let the deref mode be 0 for "I don't know", or
maybe use multiple bits for "these are the things it might be".  In any
case, I think we can handle it.

It's insane but we need some sort of structure to be able to reason about
the insanity.  Immediately lowering everything to load_raw is a good way to
get a driver off the ground.  What it's not so good for is making an
optimizing compiler that can reason about these crazy pointers and actually
optimize them.  Lest I sound too negative, I'm 100% fine with taking a
short path to getting something working now so long as it doesn't cloud up
our ability to do better in the future.
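
(As a concrete, made-up example of what is lost by lowering too early: with
typed derefs the compiler can prove that the accesses below hit the same
__global int and forward the stored value instead of reloading it; once
everything is a raw address fed to bare load/store intrinsics, proving that
the two accesses alias exactly, and that nothing in between clobbers them,
is much harder.)

    __kernel void twice(__global int *buf)
    {
       buf[0] = buf[0] + 1;
       buf[0] = buf[0] + 1;   /* second load can be forwarded from the store */
    }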


> And I think the issue here is not that it is some kind of raw pointer
> in the patch, but more like an unbound/physical pointer, which doesn't
> relate to any variable. It is just a value like any other int/long as
> well.
>
> > On Fri, Mar 23, 2018 at 12:33 PM, Karol Herbst <kherbst at redhat.com> wrote:
> >>
> >> From: Rob Clark <robdclark at gmail.com>
> >>
> >> An attempt to add physical pointer support to vtn.  I'm not totally
> >> happy about the handling of logical pointers vs physical pointers.
> >> So this is really more of an RFS (request for suggestions)
> >>
> >> v2: treat vec3 types as vec4 when dereferencing
> >>
> >> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> >> ---
> >>  src/compiler/spirv/spirv_to_nir.c  |  87 ++++++++---
> >>  src/compiler/spirv/vtn_private.h   |  20 ++-
> >>  src/compiler/spirv/vtn_variables.c | 300 ++++++++++++++++++++++++++++++++-----
> >>  3 files changed, 347 insertions(+), 60 deletions(-)
> >>
> >> diff --git a/src/compiler/spirv/spirv_to_nir.c
> >> b/src/compiler/spirv/spirv_to_nir.c
> >> index 334bcab9a82..d58a68f80ef 100644
> >> --- a/src/compiler/spirv/spirv_to_nir.c
> >> +++ b/src/compiler/spirv/spirv_to_nir.c
> >> @@ -572,6 +572,7 @@ vtn_types_compatible(struct vtn_builder *b,
> >>               vtn_types_compatible(b, t1->array_element,
> >> t2->array_element);
> >>
> >>     case vtn_base_type_pointer:
> >> +   case vtn_base_type_raw_pointer:
> >>        return vtn_types_compatible(b, t1->deref, t2->deref);
> >>
> >>     case vtn_base_type_struct:
> >> @@ -609,6 +610,7 @@ vtn_type_copy(struct vtn_builder *b, struct vtn_type
> >> *src)
> >>     case vtn_base_type_matrix:
> >>     case vtn_base_type_array:
> >>     case vtn_base_type_pointer:
> >> +   case vtn_base_type_raw_pointer:
> >>     case vtn_base_type_image:
> >>     case vtn_base_type_sampler:
> >>     case vtn_base_type_sampled_image:
> >> @@ -939,6 +941,14 @@ vtn_type_layout_std430(struct vtn_builder *b,
> struct
> >> vtn_type *type,
> >>        return type;
> >>     }
> >>
> >> +   case vtn_base_type_raw_pointer: {
> >> +      uint32_t comp_size = b->ptr_size / 8;
> >> +      vtn_assert(comp_size);
> >> +      *size_out = comp_size;
> >> +      *align_out = comp_size;
> >> +      return type;
> >> +   }
> >> +
> >>     case vtn_base_type_vector: {
> >>        uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
> >>        assert(type->length > 0 && type->length <= 4);
> >> @@ -1003,6 +1013,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> opcode,
> >>        val->type->base_type = vtn_base_type_scalar;
> >>        val->type->type = glsl_bool_type();
> >>        val->type->length = 1;
> >> +      val->type->stride = 4;
> >>        break;
> >>     case SpvOpTypeInt: {
> >>        int bit_size = w[2];
> >> @@ -1025,6 +1036,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> opcode,
> >>           vtn_fail("Invalid int bit size");
> >>        }
> >>        val->type->length = 1;
> >> +      val->type->stride = bit_size / 8;
> >>        break;
> >>     }
> >>
> >> @@ -1045,6 +1057,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> opcode,
> >>           vtn_fail("Invalid float bit size");
> >>        }
> >>        val->type->length = 1;
> >> +      val->type->stride = bit_size / 8;
> >>        break;
> >>     }
> >>
> >> @@ -1061,6 +1074,10 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> opcode,
> >>        val->type->type = glsl_vector_type(glsl_get_base_type(base->type),
> >>                                           elems);
> >>        val->type->length = elems;
> >>        val->type->stride = glsl_get_bit_size(base->type) / 8;
> >> +      /* special case: vec3 is aligned to vec4 */
> >> +      if (elems == 3)
> >> +         elems = 4;
> >> +      val->type->stride *= elems;
> >>        val->type->array_element = base;
> >>        break;
> >>     }
> >> @@ -1138,7 +1155,11 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> opcode,
> >>
> >>        const char *name = val->name ? val->name : "struct";
> >>
> >> -      val->type->type = glsl_struct_type(fields, num_fields, name,
> >> false);
> >> +      val->type->type = glsl_struct_type(fields, num_fields, name,
> >> +                                         val->type->packed);
> >> +      // TODO stride for a struct only matters for kernel shaders, where
> >> +      // cl_size is the right thing.. but still a bit ugly to hard-code.
> >> +      val->type->stride = glsl_get_cl_size(val->type->type);
> >>        break;
> >>     }
> >>
> >> @@ -1167,25 +1188,47 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
> >> opcode,
> >>        val->type->storage_class = storage_class;
> >>        val->type->deref = deref_type;
> >>
> >> -      if (storage_class == SpvStorageClassUniform ||
> >> -          storage_class == SpvStorageClassStorageBuffer) {
> >> -         /* These can actually be stored to nir_variables and used as SSA
> >> -          * values so they need a real glsl_type.
> >> -          */
> >> -         val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
> >> -      }
> >> -
> >> -      if (storage_class == SpvStorageClassWorkgroup &&
> >> -          b->options->lower_workgroup_access_to_offsets) {
> >> +      // XXX handling the "fake" glsl pointers vs "raw" pointers in
> >> kernel
> >> +      // is a bit ugly..  need to understand how "pointers" are used in
> >> vk
> >> +      // and figure out something better
> >> +      if (storage_class == SpvStorageClassFunction ||
> >> +          storage_class == SpvStorageClassUniformConstant ||
> >> +          storage_class == SpvStorageClassWorkgroup ||
> >> +          !b->kernel_mode) {
> >> +         if (storage_class == SpvStorageClassUniform ||
> >> +             storage_class == SpvStorageClassStorageBuffer) {
> >> +            /* These can actually be stored to nir_variables and used as SSA
> >> +             * values so they need a real glsl_type.
> >> +             */
> >> +            val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
> >> +         } else if (storage_class == SpvStorageClassWorkgroup &&
> >> +                    b->options->lower_workgroup_access_to_offsets) {
> >> +            uint32_t size, align;
> >> +            val->type->deref = vtn_type_layout_std430(b,
> >> val->type->deref,
> >> +                                                      &size, &align);
> >> +            val->type->length = size;
> >> +            val->type->align = align;
> >> +            /* These can actually be stored to nir_variables and used as SSA
> >> +             * values so they need a real glsl_type.
> >> +             */
> >> +            val->type->type = glsl_uint_type();
> >> +         }
> >> +      } else {
> >> +         vtn_assert(storage_class == SpvStorageClassCrossWorkgroup ||
> >> +                    storage_class == SpvStorageClassInput);
> >>           uint32_t size, align;
> >> +         if (b->ptr_size == 64) {
> >> +            val->type->type = glsl_uint64_t_type();
> >> +         } else {
> >> +            val->type->type = glsl_uint_type();
> >> +         }
> >> +         val->type->base_type = vtn_base_type_raw_pointer;
> >> +         /* pointers can be accessed as array, so set the stride as size: */
> >>           val->type->deref = vtn_type_layout_std430(b, val->type->deref,
> >>                                                     &size, &align);
> >> -         val->type->length = size;
> >> +#define ALIGN(_v, _d) (((_v) + ((_d) - 1)) & ~((_d) - 1))
> >> +         val->type->stride = ALIGN(size, align);
> >>           val->type->align = align;
> >> -         /* These can actually be stored to nir_variables and used as SSA
> >> -          * values so they need a real glsl_type.
> >> -          */
> >> -         val->type->type = glsl_uint_type();
> >>        }
> >>        break;
> >>     }
> >> @@ -3391,9 +3434,16 @@ vtn_handle_preamble_instruction(struct
> vtn_builder
> >> *b, SpvOp opcode,
> >>        break;
> >>
> >>     case SpvOpMemoryModel:
> >> -      vtn_assert(w[1] == SpvAddressingModelLogical);
> >> +      vtn_assert(w[1] == SpvAddressingModelLogical ||
> >> +                 w[1] == SpvAddressingModelPhysical32 ||
> >> +                 w[1] == SpvAddressingModelPhysical64);
> >>        vtn_assert(w[2] == SpvMemoryModelSimple ||
> >> -                 w[2] == SpvMemoryModelGLSL450);
> >> +                 w[2] == SpvMemoryModelGLSL450 ||
> >> +                 w[2] == SpvMemoryModelOpenCL);
> >> +      if (w[1] == SpvAddressingModelPhysical32)
> >> +         b->ptr_size = 32;
> >> +      else if (w[1] == SpvAddressingModelPhysical64)
> >> +         b->ptr_size = 64;
> >>        break;
> >>
> >>     case SpvOpEntryPoint: {
> >> @@ -3780,6 +3830,7 @@ vtn_handle_body_instruction(struct vtn_builder
> *b,
> >> SpvOp opcode,
> >>           sel_type = glsl_vector_type(GLSL_TYPE_BOOL,
> >> res_val->type->length);
> >>           break;
> >>        case vtn_base_type_pointer:
> >> +      case vtn_base_type_raw_pointer:
> >>           /* We need to have actual storage for pointer types */
> >>           vtn_fail_if(res_val->type->type == NULL,
> >>                       "Invalid pointer result type for OpSelect");
> >> diff --git a/src/compiler/spirv/vtn_private.h
> >> b/src/compiler/spirv/vtn_private.h
> >> index bd273122703..dbfe9eab58a 100644
> >> --- a/src/compiler/spirv/vtn_private.h
> >> +++ b/src/compiler/spirv/vtn_private.h
> >> @@ -262,7 +262,8 @@ enum vtn_base_type {
> >>     vtn_base_type_matrix,
> >>     vtn_base_type_array,
> >>     vtn_base_type_struct,
> >> -   vtn_base_type_pointer,
> >> +   vtn_base_type_pointer,         /* a logical pointer */
> >> +   vtn_base_type_raw_pointer,     /* a physical pointer */
> >>     vtn_base_type_image,
> >>     vtn_base_type_sampler,
> >>     vtn_base_type_sampled_image,
> >> @@ -407,12 +408,14 @@ enum vtn_variable_mode {
> >>     vtn_variable_mode_local,
> >>     vtn_variable_mode_global,
> >>     vtn_variable_mode_param,
> >> +   vtn_variable_mode_const,
> >>     vtn_variable_mode_ubo,
> >>     vtn_variable_mode_ssbo,
> >>     vtn_variable_mode_push_constant,
> >>     vtn_variable_mode_image,
> >>     vtn_variable_mode_sampler,
> >>     vtn_variable_mode_workgroup,
> >> +   vtn_variable_mode_cross_workgroup,
> >>     vtn_variable_mode_input,
> >>     vtn_variable_mode_output,
> >>  };
> >> @@ -588,6 +591,13 @@ struct vtn_builder {
> >>
> >>     bool has_loop_continue;
> >>
> >> +   /* pointer size is:
> >> +    *   AddressingModelLogical:    0    (default)
> >> +    *   AddressingModelPhysical32: 32
> >> +    *   AddressingModelPhysical64: 64
> >> +    */
> >> +   unsigned ptr_size;
> >> +
> >>     bool kernel_mode;
> >>  };
> >>
> >> @@ -624,7 +634,8 @@ vtn_push_ssa(struct vtn_builder *b, uint32_t
> value_id,
> >>               struct vtn_type *type, struct vtn_ssa_value *ssa)
> >>  {
> >>     struct vtn_value *val;
> >> -   if (type->base_type == vtn_base_type_pointer) {
> >> +   if (type->base_type == vtn_base_type_pointer ||
> >> +       type->base_type == vtn_base_type_raw_pointer) {
> >>        val = vtn_push_value(b, value_id, vtn_value_type_pointer);
> >>        val->pointer = vtn_pointer_from_ssa(b, ssa->def, type);
> >>     } else {
> >> @@ -688,6 +699,11 @@ struct vtn_ssa_value *vtn_local_load(struct
> >> vtn_builder *b, nir_deref_var *src);
> >>  void vtn_local_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >>                       nir_deref_var *dest);
> >>
> >> +struct vtn_ssa_value *vtn_pointer_load(struct vtn_builder *b,
> >> +                                       struct vtn_pointer *ptr);
> >> +void vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> +                       struct vtn_pointer *ptr);
> >> +
> >>  struct vtn_ssa_value *
> >>  vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src);
> >>
> >> diff --git a/src/compiler/spirv/vtn_variables.c
> >> b/src/compiler/spirv/vtn_variables.c
> >> index b2897407fb1..af9222d6f4e 100644
> >> --- a/src/compiler/spirv/vtn_variables.c
> >> +++ b/src/compiler/spirv/vtn_variables.c
> >> @@ -51,6 +51,9 @@ vtn_access_chain_extend(struct vtn_builder *b, struct
> >> vtn_access_chain *old,
> >>     unsigned old_len = old ? old->length : 0;
> >>     chain = vtn_access_chain_create(b, old_len + new_ids);
> >>
> >> +   if (old)
> >> +      chain->ptr_as_array = old->ptr_as_array;
> >> +
> >>     for (unsigned i = 0; i < old_len; i++)
> >>        chain->link[i] = old->link[i];
> >>
> >> @@ -88,11 +91,12 @@ vtn_access_chain_pointer_dereference(struct
> >> vtn_builder *b,
> >>        vtn_access_chain_extend(b, base->chain, deref_chain->length);
> >>     struct vtn_type *type = base->type;
> >>
> >> -   /* OpPtrAccessChain is only allowed on things which support variable
> >> -    * pointers.  For everything else, the client is expected to just pass us
> >> -    * the right access chain.
> >> +   /* We can get an *PtrAccessChain with cl kernels, with first element
> >> +    * of deref chain being literal zero.  But we can't really cope w/
> >> +    * tacking on another level of ptr_as_array.
> >>      */
> >> -   vtn_assert(!deref_chain->ptr_as_array);
> >> +   vtn_assert(!(chain->ptr_as_array && deref_chain->ptr_as_array));
> >> +   chain->ptr_as_array = deref_chain->ptr_as_array;
> >>
> >>     unsigned start = base->chain ? base->chain->length : 0;
> >>     for (unsigned i = 0; i < deref_chain->length; i++) {
> >> @@ -135,6 +139,21 @@ vtn_access_link_as_ssa(struct vtn_builder *b,
> struct
> >> vtn_access_link link,
> >>     }
> >>  }
> >>
> >> +static struct vtn_access_link
> >> +vtn_to_access_link(struct vtn_builder *b, uint32_t link_id)
> >> +{
> >> +   struct vtn_value *link_val = vtn_untyped_value(b, link_id);
> >> +   static struct vtn_access_link link;
> >> +   if (link_val->value_type == vtn_value_type_constant) {
> >> +      link.mode = vtn_access_mode_literal;
> >> +      link.id = link_val->constant->values[0].u32[0];
> >> +   } else {
> >> +      link.mode = vtn_access_mode_id;
> >> +      link.id = link_id;
> >> +   }
> >> +   return link;
> >> +}
> >> +
> >>  static nir_ssa_def *
> >>  vtn_variable_resource_index(struct vtn_builder *b, struct vtn_variable
> >> *var,
> >>                              nir_ssa_def *desc_array_index)
> >> @@ -207,6 +226,12 @@ vtn_ssa_offset_pointer_dereference(struct
> vtn_builder
> >> *b,
> >>              /* You can't have a zero-length OpPtrAccessChain */
> >>              vtn_assert(deref_chain->length >= 1);
> >>              desc_arr_idx = vtn_access_link_as_ssa(b,
> >> deref_chain->link[0], 1);
> >> +         } else if (!glsl_type_is_struct(type->type)) {
> >> +            /* if we have something that is ptr to simple type, like an
> >> +             * int ptr, then just treat that like an array of length 1,
> >> +             * we just want to deref the zero'th element:
> >> +             */
> >> +            desc_arr_idx = nir_imm_int(&b->nb, 0);
> >>           } else {
> >>              /* We have a regular non-array SSBO. */
> >>              desc_arr_idx = NULL;
> >> @@ -304,7 +329,9 @@ vtn_ssa_offset_pointer_dereference(struct
> vtn_builder
> >> *b,
> >>        case GLSL_TYPE_FLOAT:
> >>        case GLSL_TYPE_FLOAT16:
> >>        case GLSL_TYPE_DOUBLE:
> >> -      case GLSL_TYPE_BOOL:
> >> +      case GLSL_TYPE_BOOL: {
> >> +         break;
> >> +      }
> >>        case GLSL_TYPE_ARRAY: {
> >>           nir_ssa_def *elem_offset =
> >>              vtn_access_link_as_ssa(b, deref_chain->link[idx],
> >> type->stride);
> >> @@ -336,6 +363,98 @@ vtn_ssa_offset_pointer_dereference(struct
> vtn_builder
> >> *b,
> >>     return ptr;
> >>  }
> >>
> >> +static struct vtn_pointer *
> >> +vtn_ssa_raw_pointer_dereference(struct vtn_builder *b,
> >> +                                struct vtn_pointer *base,
> >> +                                struct vtn_access_chain *deref_chain)
> >> +{
> >> +   nir_ssa_def *offset = nir_imm_int(&b->nb, 0);
> >> +   struct vtn_type *type = base->type;
> >> +
> >> +
> >> +   vtn_assert(type);
> >> +
> >> +   unsigned idx = 0;
> >> +
> >> +   /* For the *PtrAccessChain deref instructions, the first entry in
> >> +    * the deref chain is deref'ing the ptr as an array, ie. if you
> >> +    * have "struct foo *f" then it can be deref'd as either "f->blah",
> >> +    * which is equiv to "f[0].blah", or "f[n].blah".
> >> +    */
> >> +   if (deref_chain->ptr_as_array) {
> >> +      nir_ssa_def *elem_offset =
> >> +         vtn_access_link_as_ssa(b, deref_chain->link[idx], type->stride);
> >> +      offset = nir_iadd(&b->nb, offset, elem_offset);
> >> +      idx++;
> >> +   }
> >> +
> >> +   /* TODO this can probably be refactored out into helper?  Nearly
> >> +    * the same as in vtn_ssa_offset_pointer_dereference() and perhaps
> >> +    * elsewhere..
> >> +    *
> >> +    * Note that the type here in the switch is of what the pointer
> >> +    * points to, ie. if the base type is uint, we are dereferencing
> >> +    * (possibly as an array) a uint* pointer.
> >> +    */
> >> +   for (; idx < deref_chain->length; idx++) {
> >> +      switch (glsl_get_base_type(type->type)) {
> >> +      case GLSL_TYPE_UINT:
> >> +      case GLSL_TYPE_INT:
> >> +      case GLSL_TYPE_UINT16:
> >> +      case GLSL_TYPE_INT16:
> >> +      case GLSL_TYPE_UINT8:
> >> +      case GLSL_TYPE_INT8:
> >> +      case GLSL_TYPE_UINT64:
> >> +      case GLSL_TYPE_INT64:
> >> +      case GLSL_TYPE_FLOAT:
> >> +      case GLSL_TYPE_FLOAT16:
> >> +      case GLSL_TYPE_DOUBLE:
> >> +      case GLSL_TYPE_BOOL:
> >> +      case GLSL_TYPE_ARRAY: {
> >> +         nir_ssa_def *elem_offset =
> >> +            vtn_access_link_as_ssa(b, deref_chain->link[idx],
> >> type->stride);
> >> +         offset = nir_iadd(&b->nb, offset, elem_offset);
> >> +         if (type->array_element) {
> >> +            type = type->array_element;
> >> +         } else {
> >> +            vtn_assert(idx == (deref_chain->length - 1));
> >> +         }
> >> +         break;
> >> +      }
> >> +
> >> +      case GLSL_TYPE_STRUCT: {
> >> +         vtn_assert(deref_chain->link[idx].mode ==
> >> vtn_access_mode_literal);
> >> +         unsigned member = deref_chain->link[idx].id;
> >> +         nir_ssa_def *mem_offset = nir_imm_int(&b->nb,
> >> type->offsets[member]);
> >> +         offset = nir_iadd(&b->nb, offset, mem_offset);
> >> +         type = type->members[member];
> >> +         break;
> >> +      }
> >> +
> >> +      default:
> >> +         vtn_fail("Invalid type for deref");
> >> +      }
> >> +   }
> >> +
> >> +   /* at this point, offset is 32b, but pointer can be either 32b or 64b
> >> +    * depending on memory model:
> >> +    */
> >> +   if (b->ptr_size == 64) {
> >> +      offset = nir_u2u64(&b->nb, offset);
> >> +   }
> >> +
> >> +   /* add pointer address to calculated offset: */
> >> +   if (base->offset)
> >> +      offset = nir_iadd(&b->nb, offset, base->offset);
> >> +
> >> +   struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
> >> +   ptr->mode = base->mode;
> >> +   ptr->type = base->type;
> >> +   ptr->offset = offset;
> >> +
> >> +   return ptr;
> >> +}
> >> +
> >>  /* Dereference the given base pointer by the access chain */
> >>  static struct vtn_pointer *
> >>  vtn_pointer_dereference(struct vtn_builder *b,
> >> @@ -344,6 +463,8 @@ vtn_pointer_dereference(struct vtn_builder *b,
> >>  {
> >>     if (vtn_pointer_uses_ssa_offset(b, base)) {
> >>        return vtn_ssa_offset_pointer_dereference(b, base, deref_chain);
> >> +   } else if (base->ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> +      return vtn_ssa_raw_pointer_dereference(b, base, deref_chain);
> >>     } else {
> >>        return vtn_access_chain_pointer_dereference(b, base, deref_chain);
> >>     }
> >> @@ -373,7 +494,8 @@ vtn_pointer_for_variable(struct vtn_builder *b,
> >>
> >>     pointer->mode = var->mode;
> >>     pointer->type = var->type;
> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
> >>     vtn_assert(ptr_type->deref->type == var->type->type);
> >>     pointer->ptr_type = ptr_type;
> >>     pointer->var = var;
> >> @@ -384,6 +506,12 @@ vtn_pointer_for_variable(struct vtn_builder *b,
> >>  nir_deref_var *
> >>  vtn_pointer_to_deref(struct vtn_builder *b, struct vtn_pointer *ptr)
> >>  {
> >> +   /* once you've chased a pointer, you no longer really have a var,
> >> +    * so this case is handled differently
> >> +    */
> >> +   if (!ptr->var)
> >> +      return NULL;
> >> +
> >>     /* Do on-the-fly copy propagation for samplers. */
> >>     if (ptr->var->copy_prop_sampler)
> >>        return vtn_pointer_to_deref(b, ptr->var->copy_prop_sampler);
> >> @@ -408,7 +536,19 @@ vtn_pointer_to_deref(struct vtn_builder *b, struct
> >> vtn_pointer *ptr)
> >>     nir_deref *tail = &deref_var->deref;
> >>     nir_variable **members = ptr->var->members;
> >>
> >> -   for (unsigned i = 0; i < chain->length; i++) {
> >> +   unsigned i = 0;
> >> +   if (chain->ptr_as_array) {
> >> +      /* We can't currently handle this in a nir deref chain.  Perhaps
> >> +       * a new type of deref node is needed?  But we can at least handle
> >> +       * the case where the first deref link is literal zero by skipping
> >> +       * it.
> >> +       */
> >> +      vtn_assert(chain->link[0].mode == vtn_access_mode_literal);
> >> +      vtn_assert(chain->link[0].id == 0);
> >> +      i++;
> >> +   }
> >> +
> >> +   for (; i < chain->length; i++) {
> >>        enum glsl_base_type base_type =
> >> glsl_get_base_type(deref_type->type);
> >>        switch (base_type) {
> >>        case GLSL_TYPE_UINT:
> >> @@ -598,6 +738,49 @@ vtn_local_store(struct vtn_builder *b, struct
> >> vtn_ssa_value *src,
> >>     }
> >>  }
> >>
> >> +struct vtn_ssa_value *
> >> +vtn_pointer_load(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> +{
> >> +   const struct glsl_type *type = ptr->type->type;
> >> +   struct vtn_ssa_value *val = vtn_create_ssa_value(b, type);
> >> +   nir_intrinsic_op op = nir_intrinsic_load_global;
> >> +
> >> +   vtn_assert(ptr->offset);
> >> +
> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader,
> >> op);
> >> +   intrin->num_components = glsl_get_vector_elements(type);
> >> +   intrin->src[0] = nir_src_for_ssa(ptr->offset);
> >> +
> >> +   nir_ssa_dest_init(&intrin->instr, &intrin->dest,
> >> +                     intrin->num_components,
> >> +                     glsl_get_bit_size(type),
> >> +                     NULL);
> >> +   val->def = &intrin->dest.ssa;
> >> +
> >> +   nir_builder_instr_insert(&b->nb, &intrin->instr);
> >> +
> >> +   return val;
> >> +}
> >> +
> >> +void
> >> +vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> +                  struct vtn_pointer *ptr)
> >> +{
> >> +   const struct glsl_type *type = ptr->type->type;
> >> +   nir_intrinsic_op op = nir_intrinsic_store_global;
> >> +
> >> +   vtn_assert(ptr->offset);
> >> +
> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader,
> >> op);
> >> +   intrin->num_components = MAX2(1, glsl_get_vector_elements(type));
> >> +   intrin->src[0] = nir_src_for_ssa(src->def);
> >> +   intrin->src[1] = nir_src_for_ssa(ptr->offset);
> >> +
> >> +   nir_intrinsic_set_write_mask(intrin, (1 << intrin->num_components) - 1);
> >> +
> >> +   nir_builder_instr_insert(&b->nb, &intrin->instr);
> >> +}
> >> +
> >>  nir_ssa_def *
> >>  vtn_pointer_to_offset(struct vtn_builder *b, struct vtn_pointer *ptr,
> >>                        nir_ssa_def **index_out, unsigned *end_idx_out)
> >> @@ -1004,20 +1187,33 @@ _vtn_variable_load_store(struct vtn_builder *b,
> >> bool load,
> >>     case GLSL_TYPE_FLOAT:
> >>     case GLSL_TYPE_FLOAT16:
> >>     case GLSL_TYPE_BOOL:
> >> -   case GLSL_TYPE_DOUBLE:
> >> -      /* At this point, we have a scalar, vector, or matrix so we know that
> >> -       * there cannot be any structure splitting still in the way.  By
> >> -       * stopping at the matrix level rather than the vector level, we
> >> -       * ensure that matrices get loaded in the optimal way even if they
> >> -       * are storred row-major in a UBO.
> >> -       */
> >> -      if (load) {
> >> -         *inout = vtn_local_load(b, vtn_pointer_to_deref(b, ptr));
> >> +   case GLSL_TYPE_DOUBLE: {
> >> +      nir_deref_var *deref_var = vtn_pointer_to_deref(b, ptr);
> >> +
> >> +      if (deref_var) {
> >> +         /* At this point, we have a scalar, vector, or matrix so we know that
> >> +          * there cannot be any structure splitting still in the way.  By
> >> +          * stopping at the matrix level rather than the vector level, we
> >> +          * ensure that matrices get loaded in the optimal way even if they
> >> +          * are storred row-major in a UBO.
> >> +          */
> >> +         if (load) {
> >> +            *inout = vtn_local_load(b, deref_var);
> >> +         } else {
> >> +            vtn_local_store(b, *inout, deref_var);
> >> +         }
> >>        } else {
> >> -         vtn_local_store(b, *inout, vtn_pointer_to_deref(b, ptr));
> >> +         /* If deref'ing a raw pointer, we don't have a variable to go along
> >> +          * with it.  Just directly generate load/store_global intrinsics:
> >> +          */
> >> +         if (load) {
> >> +            *inout = vtn_pointer_load(b, ptr);
> >> +         } else {
> >> +            vtn_pointer_store(b, *inout, ptr);
> >> +         }
> >>        }
> >>        return;
> >> -
> >> +   }
> >>     case GLSL_TYPE_ARRAY:
> >>     case GLSL_TYPE_STRUCT: {
> >>        unsigned elems = glsl_get_length(ptr->type->type);
> >> @@ -1052,6 +1248,8 @@ vtn_variable_load(struct vtn_builder *b, struct
> >> vtn_pointer *src)
> >>  {
> >>     if (vtn_pointer_is_external_block(b, src)) {
> >>        return vtn_block_load(b, src);
> >> +   } else if (!src->var) {
> >> +      return vtn_pointer_load(b, src);
> >>     } else {
> >>        struct vtn_ssa_value *val = NULL;
> >>        _vtn_variable_load_store(b, true, src, &val);
> >> @@ -1065,8 +1263,11 @@ vtn_variable_store(struct vtn_builder *b, struct
> >> vtn_ssa_value *src,
> >>  {
> >>     if (vtn_pointer_is_external_block(b, dest)) {
> >>        vtn_assert(dest->mode == vtn_variable_mode_ssbo ||
> >> -                 dest->mode == vtn_variable_mode_workgroup);
> >> +                 dest->mode == vtn_variable_mode_workgroup ||
> >> +                 dest->mode == vtn_variable_mode_param);
> >>        vtn_block_store(b, src, dest);
> >> +   } else if (!dest->var) {
> >> +      vtn_pointer_store(b, src, dest);
> >>     } else {
> >>        _vtn_variable_load_store(b, false, dest, &src);
> >>     }
> >> @@ -1608,6 +1809,7 @@ static enum vtn_variable_mode
> >>  vtn_storage_class_to_mode(struct vtn_builder *b,
> >>                            SpvStorageClass class,
> >>                            struct vtn_type *interface_type,
> >> +                          bool has_initializer,
> >>                            nir_variable_mode *nir_mode_out)
> >>  {
> >>     enum vtn_variable_mode mode;
> >> @@ -1635,8 +1837,12 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
> >>        } else if (glsl_type_is_sampler(interface_type->type)) {
> >>           mode = vtn_variable_mode_sampler;
> >>           nir_mode = nir_var_uniform;
> >> +      } else if (has_initializer) {
> >> +         mode = vtn_variable_mode_const;
> >> +         nir_mode = nir_var_global;
> >>        } else {
> >> -         vtn_fail("Invalid uniform constant variable type");
> >> +         mode = vtn_variable_mode_push_constant;
> >> +         nir_mode = nir_var_uniform;
> >>        }
> >>        break;
> >>     case SpvStorageClassPushConstant:
> >> @@ -1664,6 +1870,10 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
> >>        nir_mode = nir_var_shared;
> >>        break;
> >>     case SpvStorageClassCrossWorkgroup:
> >> +      mode = vtn_variable_mode_cross_workgroup;
> >> +      // TODO do we want a new nir_mode for raw pointers?
> >> +      nir_mode = nir_var_global;
> >> +      break;
> >>     case SpvStorageClassGeneric:
> >>     case SpvStorageClassAtomicCounter:
> >>     default:
> >> @@ -1692,7 +1902,14 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct
> >> vtn_pointer *ptr)
> >>        struct vtn_access_chain chain = {
> >>           .length = 0,
> >>        };
> >> -      ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
> >> +
> >> +      if (vtn_pointer_uses_ssa_offset(b, ptr)) {
> >> +         ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
> >> +      } else if (ptr->ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> +         ptr = vtn_ssa_raw_pointer_dereference(b, ptr, &chain);
> >> +      } else {
> >> +         vtn_fail("unhandled");
> >> +      }
> >>     }
> >>
> >>     vtn_assert(ptr->offset);
> >> @@ -1710,27 +1927,34 @@ struct vtn_pointer *
> >>  vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
> >>                       struct vtn_type *ptr_type)
> >>  {
> >> -   vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
> >>     vtn_assert(ptr_type->deref->base_type != vtn_base_type_pointer);
> >>     /* This pointer type needs to have actual storage */
> >>     vtn_assert(ptr_type->type);
> >>
> >>     struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
> >>     ptr->mode = vtn_storage_class_to_mode(b, ptr_type->storage_class,
> >> -                                         ptr_type, NULL);
> >> +                                         ptr_type, false, NULL);
> >> +
> >> +   if (ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> +      vtn_assert(ssa->num_components == 1 && ssa->bit_size ==
> >> b->ptr_size);
> >> +   } else {
> >> +      vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
> >> +   }
> >> +
> >>     ptr->type = ptr_type->deref;
> >>     ptr->ptr_type = ptr_type;
> >>
> >>     if (ssa->num_components > 1) {
> >>        vtn_assert(ssa->num_components == 2);
> >>        vtn_assert(ptr->mode == vtn_variable_mode_ubo ||
> >> -                 ptr->mode == vtn_variable_mode_ssbo);
> >> +                 ptr->mode == vtn_variable_mode_ssbo ||
> >> +                 ptr->mode == vtn_variable_mode_global);
> >>        ptr->block_index = nir_channel(&b->nb, ssa, 0);
> >>        ptr->offset = nir_channel(&b->nb, ssa, 1);
> >>     } else {
> >>        vtn_assert(ssa->num_components == 1);
> >> -      vtn_assert(ptr->mode == vtn_variable_mode_workgroup);
> >>        ptr->block_index = NULL;
> >>        ptr->offset = ssa;
> >>     }
> >> @@ -1761,7 +1985,8 @@ vtn_create_variable(struct vtn_builder *b, struct
> >> vtn_value *val,
> >>                      struct vtn_type *ptr_type, SpvStorageClass
> >> storage_class,
> >>                      nir_constant *initializer)
> >>  {
> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
> >>     struct vtn_type *type = ptr_type->deref;
> >>
> >>     struct vtn_type *without_array = type;
> >> @@ -1770,7 +1995,8 @@ vtn_create_variable(struct vtn_builder *b, struct
> >> vtn_value *val,
> >>
> >>     enum vtn_variable_mode mode;
> >>     nir_variable_mode nir_mode;
> >> -   mode = vtn_storage_class_to_mode(b, storage_class, without_array,
> >> &nir_mode);
> >> +   mode = vtn_storage_class_to_mode(b, storage_class, without_array,
> >> +                                    !!initializer, &nir_mode);
> >>
> >>     switch (mode) {
> >>     case vtn_variable_mode_ubo:
> >> @@ -1933,6 +2159,7 @@ vtn_create_variable(struct vtn_builder *b, struct
> >> vtn_value *val,
> >>     case vtn_variable_mode_ubo:
> >>     case vtn_variable_mode_ssbo:
> >>     case vtn_variable_mode_push_constant:
> >> +   case vtn_variable_mode_cross_workgroup:
> >>        /* These don't need actual variables. */
> >>        break;
> >>     }
> >> @@ -2031,25 +2258,18 @@ vtn_handle_variables(struct vtn_builder *b,
> SpvOp
> >> opcode,
> >>     case SpvOpAccessChain:
> >>     case SpvOpPtrAccessChain:
> >>     case SpvOpInBoundsAccessChain: {
> >> +      struct vtn_type *ptr_type = vtn_value(b, w[1],
> >> vtn_value_type_type)->type;
> >> +      struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
> >>        struct vtn_access_chain *chain = vtn_access_chain_create(b, count - 4);
> >> -      chain->ptr_as_array = (opcode == SpvOpPtrAccessChain);
> >> +      chain->ptr_as_array = (opcode == SpvOpPtrAccessChain) ||
> >> +                            (opcode == SpvOpInBoundsPtrAccessChain);
> >>
> >>        unsigned idx = 0;
> >>        for (int i = 4; i < count; i++) {
> >> -         struct vtn_value *link_val = vtn_untyped_value(b, w[i]);
> >> -         if (link_val->value_type == vtn_value_type_constant) {
> >> -            chain->link[idx].mode = vtn_access_mode_literal;
> >> -            chain->link[idx].id = link_val->constant->values[0].u32[0];
> >> -         } else {
> >> -            chain->link[idx].mode = vtn_access_mode_id;
> >> -            chain->link[idx].id = w[i];
> >> -
> >> -         }
> >> +         chain->link[idx] = vtn_to_access_link(b, w[i]);
> >>           idx++;
> >>        }
> >>
> >> -      struct vtn_type *ptr_type = vtn_value(b, w[1],
> >> vtn_value_type_type)->type;
> >> -      struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
> >>        if (base_val->value_type == vtn_value_type_sampled_image) {
> >>           /* This is rather insane.  SPIR-V allows you to use OpSampledImage
> >>            * to combine an array of images with a single sampler to get an
> >> --
> >> 2.14.3
> >>