[Mesa-dev] [PATCH v3 06/19] RFC: nir/vtn: "raw" pointer support
Jason Ekstrand
jason at jlekstrand.net
Fri Mar 23 21:18:25 UTC 2018
On Fri, Mar 23, 2018 at 2:15 PM, Karol Herbst <kherbst at redhat.com> wrote:
> On Fri, Mar 23, 2018 at 10:07 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> > +list
> >
> > On Fri, Mar 23, 2018 at 1:45 PM, Karol Herbst <kherbst at redhat.com>
> wrote:
> >>
> >> On Fri, Mar 23, 2018 at 9:30 PM, Jason Ekstrand <jason at jlekstrand.net>
> >> wrote:
> >> > As I've been rewriting core NIR deref handling, I've been thinking
> >> > about this problem quite a bit.  One objective I have is to actually
> >> > make UBO and SSBO access go through derefs instead of just being an
> >> > offset and index so that the compiler can better reason about them.
> >> > In particular, I want to be able to start doing load/store
> >> > elimination on SSBOs, SLM, and whatever CL has, which would be great
> >> > for everyone's compute performance (GL, Vulkan, CL, etc.).
> >> >
> >> > I would be lying if I said I had a full plan but I do have part of a
> >> > plan.  In my patch which adds the deref instructions, I add a new
> >> > "cast" deref type which takes an arbitrary value as its source and
> >> > kicks out a deref with a type.  Whenever we discover that the source
> >> > of the cast is actually another deref which is compatible (same type
> >> > etc.), copy propagation gets rid of the cast for you.  The idea is
> >> > that, instead of doing a load_raw(raw_ptr), you would do a
> >> > load((type *)raw_ptr).
> >> >
> >> > Right now, most of the core NIR optimizations will throw a fit if
> >> > they ever see a cast.  This is intentional because it requires us to
> >> > manually go through and handle casts.  This would mean that, at the
> >> > moment, you would have to lower to load_raw intrinsics almost
> >> > immediately after coming out of SPIR-V.
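> >> >
> >> > As a rough C analogy (illustrative only, not the actual NIR API):
> >> >
> >> >    #include <stdint.h>
> >> >
> >> >    /* Instead of an opaque load_raw(raw_ptr), every access first casts
> >> >     * the raw integer to a typed pointer, so the load itself is typed
> >> >     * and later passes have a type to reason about:
> >> >     */
> >> >    static float
> >> >    load_f32(uint64_t raw_ptr)
> >> >    {
> >> >       return *(const float *)(uintptr_t)raw_ptr;  /* load((float *)raw_ptr) */
> >> >    }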
> >> >
> >>
> >> Well it gets more fun with OpenCL 2.0 where you can have generic
> >> pointers where you only know the type at creation time. You can also
> >> declare generic pointers as function inputs in a way that you never
> >> actually know where you have to load from if you only have that one
> >> function. So the actual load operation depends on when you create the
> >> initial pointer variable (you can cast from X to generic, but not the
> >> other way around).
> >>
> >> Which in the end means you can end up with load(generic_ptr), and only
> >> by following the chain up to its creation (with function inlining in
> >> mind) do you know the actual memory target.
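> >>
> >> A small OpenCL C 2.0 sketch of that situation (illustrative only):
> >>
> >>    /* Inside f() the address space behind "p" is unknown; unqualified
> >>     * pointers are generic, and each call site below builds the generic
> >>     * pointer from a different storage class.
> >>     */
> >>    static float f(float *p) { return *p; }
> >>
> >>    kernel void k(global float *g, local float *l)
> >>    {
> >>       float a = f(g);   /* generic <- global */
> >>       float b = f(l);   /* generic <- local  */
> >>    }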
> >
> >
> > Yup.  And there will always be crazy cases where you can't actually
> > follow it and you have to emit a pile of code to load different ways
> > depending on some bits somewhere that tell you how to load it.  I'm
> > well aware of the insanity. :-)  This is part of the reason why I'm
> > glad I'm not trying to write an OpenCL 2.0 driver.
> >
> > This insanity is exactly why I'm suggesting the pointer casting.  Sure,
> > you may not know the data type until the actual load.  In that case,
> > you end up with the cast being right before the load.  If you don't
> > know the storage class, maybe you have to switch and do multiple casts
> > based on some bits.  Alternatively, if you don't know the storage
> > class, we can just let the deref mode be 0 for "I don't know", or maybe
> > multiple bits for "these are the things it might be".  In any case, I
> > think we can handle it.
> >
>
> there shouldn't be a situation where we don't know, except when you
> don't inline all functions. I think Rob had the idea of fat pointers
> where a pointer is a vec2 and the 2nd component contains the actual
> pointer type and you end up with a switch over the type to get the
> correct storage class. And if the compiler inlines all functions, it
> should be able to optimize that switch away.
>
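> A rough sketch of that fat-pointer idea (hypothetical helpers, not a
> real API):
>
>    #include <stdint.h>
>
>    enum storage_class { SC_GLOBAL, SC_SHARED, SC_PRIVATE };
>
>    struct fat_ptr {         /* the "vec2": address plus storage class */
>       uint64_t addr;
>       uint32_t sc;
>    };
>
>    uint32_t load_global_u32(uint64_t addr);   /* hypothetical backends */
>    uint32_t load_shared_u32(uint64_t addr);
>    uint32_t load_private_u32(uint64_t addr);
>
>    static uint32_t
>    fat_load_u32(struct fat_ptr p)
>    {
>       /* after full inlining p.sc is a constant and the switch folds away */
>       switch (p.sc) {
>       case SC_GLOBAL: return load_global_u32(p.addr);
>       case SC_SHARED: return load_shared_u32(p.addr);
>       default:        return load_private_u32(p.addr);
>       }
>    }
>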
Right.  Today, we live in a world where all functions are inlined.  Sadly,
I fear that world may come to an end one of these days. :(
> > It's insane but we need some sort of structure to be able to reason
> > about the insanity.  Immediately lowering everything to load_raw is a
> > good way to get a driver off the ground.  What it's not so good for is
> > making an optimizing compiler that can reason about these crazy
> > pointers and actually optimize them.  Lest I sound too negative, I'm
> > 100% fine with taking a short path to getting something working now so
> > long as it doesn't cloud up our ability to do better in the future.
> >
> >>
> >> And I think the issue here is not that it is some kind of raw pointer
> >> in the patch, but more like an unbound/physical pointer, which doesn't
> >> relate to any variable. It is just a value like any other int/long as
> >> well.
> >>
> >> > On Fri, Mar 23, 2018 at 12:33 PM, Karol Herbst <kherbst at redhat.com>
> >> > wrote:
> >> >>
> >> >> From: Rob Clark <robdclark at gmail.com>
> >> >>
> >> >> An attempt to add physical pointer support to vtn. I'm not totally
> >> >> happy about the handling of logical pointers vs physical pointers.
> >> >> So this is really more of an RFS (request for suggestions)
> >> >>
> >> >> v2: treat vec3 types as vec4 when dereferencing
> >> >>
> >> >> Signed-off-by: Karol Herbst <kherbst at redhat.com>
> >> >> ---
> >> >>  src/compiler/spirv/spirv_to_nir.c  |  87 ++++++++---
> >> >>  src/compiler/spirv/vtn_private.h   |  20 ++-
> >> >>  src/compiler/spirv/vtn_variables.c | 300 ++++++++++++++++++++++++++++++-----
> >> >>  3 files changed, 347 insertions(+), 60 deletions(-)
> >> >>
> >> >> diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
> >> >> index 334bcab9a82..d58a68f80ef 100644
> >> >> --- a/src/compiler/spirv/spirv_to_nir.c
> >> >> +++ b/src/compiler/spirv/spirv_to_nir.c
> >> >> @@ -572,6 +572,7 @@ vtn_types_compatible(struct vtn_builder *b,
> >> >>              vtn_types_compatible(b, t1->array_element, t2->array_element);
> >> >>
> >> >> case vtn_base_type_pointer:
> >> >> + case vtn_base_type_raw_pointer:
> >> >> return vtn_types_compatible(b, t1->deref, t2->deref);
> >> >>
> >> >> case vtn_base_type_struct:
> >> >> @@ -609,6 +610,7 @@ vtn_type_copy(struct vtn_builder *b, struct vtn_type *src)
> >> >> case vtn_base_type_matrix:
> >> >> case vtn_base_type_array:
> >> >> case vtn_base_type_pointer:
> >> >> + case vtn_base_type_raw_pointer:
> >> >> case vtn_base_type_image:
> >> >> case vtn_base_type_sampler:
> >> >> case vtn_base_type_sampled_image:
> >> >> @@ -939,6 +941,14 @@ vtn_type_layout_std430(struct vtn_builder *b, struct vtn_type *type,
> >> >> return type;
> >> >> }
> >> >>
> >> >> + case vtn_base_type_raw_pointer: {
> >> >> + uint32_t comp_size = b->ptr_size / 8;
> >> >> + vtn_assert(comp_size);
> >> >> + *size_out = comp_size;
> >> >> + *align_out = comp_size;
> >> >> + return type;
> >> >> + }
> >> >> +
> >> >> case vtn_base_type_vector: {
> >> >> uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
> >> >> assert(type->length > 0 && type->length <= 4);
> >> >> @@ -1003,6 +1013,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
> >> >> val->type->base_type = vtn_base_type_scalar;
> >> >> val->type->type = glsl_bool_type();
> >> >> val->type->length = 1;
> >> >> + val->type->stride = 4;
> >> >> break;
> >> >> case SpvOpTypeInt: {
> >> >> int bit_size = w[2];
> >> >> @@ -1025,6 +1036,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
> >> >> vtn_fail("Invalid int bit size");
> >> >> }
> >> >> val->type->length = 1;
> >> >> + val->type->stride = bit_size / 8;
> >> >> break;
> >> >> }
> >> >>
> >> >> @@ -1045,6 +1057,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
> >> >> vtn_fail("Invalid float bit size");
> >> >> }
> >> >> val->type->length = 1;
> >> >> + val->type->stride = bit_size / 8;
> >> >> break;
> >> >> }
> >> >>
> >> >> @@ -1061,6 +1074,10 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
> >> >>        val->type->type = glsl_vector_type(glsl_get_base_type(base->type),
> >> >>                                           elems);
> >> >> val->type->length = elems;
> >> >> val->type->stride = glsl_get_bit_size(base->type) / 8;
> >> >> + /* special case: vec3 is aligned to vec4 */
> >> >> + if (elems == 3)
> >> >> + elems = 4;
> >> >> + val->type->stride *= elems;
> >> >> val->type->array_element = base;
> >> >> break;
> >> >> }
> >> >> @@ -1138,7 +1155,11 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
> >> >>
> >> >> const char *name = val->name ? val->name : "struct";
> >> >>
> >> >> -      val->type->type = glsl_struct_type(fields, num_fields, name, false);
> >> >> +      val->type->type = glsl_struct_type(fields, num_fields, name,
> >> >> +                                         val->type->packed);
> >> >> +      // TODO stride for a struct only matters for kernel shaders, where
> >> >> +      // cl_size is the right thing.. but still a bit ugly to hard-code.
> >> >> +      val->type->stride = glsl_get_cl_size(val->type->type);
> >> >> break;
> >> >> }
> >> >>
> >> >> @@ -1167,25 +1188,47 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
> >> >> val->type->storage_class = storage_class;
> >> >> val->type->deref = deref_type;
> >> >>
> >> >> -      if (storage_class == SpvStorageClassUniform ||
> >> >> -          storage_class == SpvStorageClassStorageBuffer) {
> >> >> -         /* These can actually be stored to nir_variables and used as SSA
> >> >> -          * values so they need a real glsl_type.
> >> >> -          */
> >> >> -         val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
> >> >> -      }
> >> >> -
> >> >> -      if (storage_class == SpvStorageClassWorkgroup &&
> >> >> -          b->options->lower_workgroup_access_to_offsets) {
> >> >> + // XXX handling the "fake" glsl pointers vs "raw" pointers in
> >> >> kernel
> >> >> + // is a bit ugly.. need to understand how "pointers" are used
> >> >> in
> >> >> vk
> >> >> + // and figure out something better
> >> >> + if (storage_class == SpvStorageClassFunction ||
> >> >> + storage_class == SpvStorageClassUniformConstant ||
> >> >> + storage_class == SpvStorageClassWorkgroup ||
> >> >> + !b->kernel_mode) {
> >> >> +         if (storage_class == SpvStorageClassUniform ||
> >> >> +             storage_class == SpvStorageClassStorageBuffer) {
> >> >> +            /* These can actually be stored to nir_variables and used as SSA
> >> >> +             * values so they need a real glsl_type.
> >> >> +             */
> >> >> +            val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
> >> >> +         } else if (storage_class == SpvStorageClassWorkgroup &&
> >> >> +                    b->options->lower_workgroup_access_to_offsets) {
> >> >> +            uint32_t size, align;
> >> >> +            val->type->deref = vtn_type_layout_std430(b, val->type->deref,
> >> >> +                                                      &size, &align);
> >> >> +            val->type->length = size;
> >> >> +            val->type->align = align;
> >> >> +            /* These can actually be stored to nir_variables and used as SSA
> >> >> +             * values so they need a real glsl_type.
> >> >> +             */
> >> >> +            val->type->type = glsl_uint_type();
> >> >> +         }
> >> >> +      } else {
> >> >> +         vtn_assert(storage_class == SpvStorageClassCrossWorkgroup ||
> >> >> +                    storage_class == SpvStorageClassInput);
> >> >>           uint32_t size, align;
> >> >> +         if (b->ptr_size == 64) {
> >> >> +            val->type->type = glsl_uint64_t_type();
> >> >> +         } else {
> >> >> +            val->type->type = glsl_uint_type();
> >> >> +         }
> >> >> +         val->type->base_type = vtn_base_type_raw_pointer;
> >> >> +         /* pointers can be accessed as array, so set the stride as size: */
> >> >>           val->type->deref = vtn_type_layout_std430(b, val->type->deref,
> >> >>                                                     &size, &align);
> >> >> -         val->type->length = size;
> >> >> +#define ALIGN(_v, _d) (((_v) + ((_d) - 1)) & ~((_d) - 1))
> >> >> +         val->type->stride = ALIGN(size, align);
> >> >>           val->type->align = align;
> >> >> -         /* These can actually be stored to nir_variables and used as SSA
> >> >> -          * values so they need a real glsl_type.
> >> >> -          */
> >> >> -         val->type->type = glsl_uint_type();
> >> >> }
> >> >> break;
> >> >> }
> >> >> @@ -3391,9 +3434,16 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, SpvOp opcode,
> >> >> break;
> >> >>
> >> >> case SpvOpMemoryModel:
> >> >> - vtn_assert(w[1] == SpvAddressingModelLogical);
> >> >> + vtn_assert(w[1] == SpvAddressingModelLogical ||
> >> >> + w[1] == SpvAddressingModelPhysical32 ||
> >> >> + w[1] == SpvAddressingModelPhysical64);
> >> >> vtn_assert(w[2] == SpvMemoryModelSimple ||
> >> >> - w[2] == SpvMemoryModelGLSL450);
> >> >> + w[2] == SpvMemoryModelGLSL450 ||
> >> >> + w[2] == SpvMemoryModelOpenCL);
> >> >> + if (w[1] == SpvAddressingModelPhysical32)
> >> >> + b->ptr_size = 32;
> >> >> + else if (w[1] == SpvAddressingModelPhysical64)
> >> >> + b->ptr_size = 64;
> >> >> break;
> >> >>
> >> >> case SpvOpEntryPoint: {
> >> >> @@ -3780,6 +3830,7 @@ vtn_handle_body_instruction(struct vtn_builder *b, SpvOp opcode,
> >> >>           sel_type = glsl_vector_type(GLSL_TYPE_BOOL, res_val->type->length);
> >> >> break;
> >> >> case vtn_base_type_pointer:
> >> >> + case vtn_base_type_raw_pointer:
> >> >> /* We need to have actual storage for pointer types */
> >> >> vtn_fail_if(res_val->type->type == NULL,
> >> >> "Invalid pointer result type for OpSelect");
> >> >> diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
> >> >> index bd273122703..dbfe9eab58a 100644
> >> >> --- a/src/compiler/spirv/vtn_private.h
> >> >> +++ b/src/compiler/spirv/vtn_private.h
> >> >> @@ -262,7 +262,8 @@ enum vtn_base_type {
> >> >> vtn_base_type_matrix,
> >> >> vtn_base_type_array,
> >> >> vtn_base_type_struct,
> >> >> - vtn_base_type_pointer,
> >> >> + vtn_base_type_pointer, /* a logical pointer */
> >> >> + vtn_base_type_raw_pointer, /* a physical pointer */
> >> >> vtn_base_type_image,
> >> >> vtn_base_type_sampler,
> >> >> vtn_base_type_sampled_image,
> >> >> @@ -407,12 +408,14 @@ enum vtn_variable_mode {
> >> >> vtn_variable_mode_local,
> >> >> vtn_variable_mode_global,
> >> >> vtn_variable_mode_param,
> >> >> + vtn_variable_mode_const,
> >> >> vtn_variable_mode_ubo,
> >> >> vtn_variable_mode_ssbo,
> >> >> vtn_variable_mode_push_constant,
> >> >> vtn_variable_mode_image,
> >> >> vtn_variable_mode_sampler,
> >> >> vtn_variable_mode_workgroup,
> >> >> + vtn_variable_mode_cross_workgroup,
> >> >> vtn_variable_mode_input,
> >> >> vtn_variable_mode_output,
> >> >> };
> >> >> @@ -588,6 +591,13 @@ struct vtn_builder {
> >> >>
> >> >> bool has_loop_continue;
> >> >>
> >> >> + /* pointer size is:
> >> >> + * AddressingModelLogical: 0 (default)
> >> >> + * AddressingModelPhysical32: 32
> >> >> + * AddressingModelPhysical64: 64
> >> >> + */
> >> >> + unsigned ptr_size;
> >> >> +
> >> >> bool kernel_mode;
> >> >> };
> >> >>
> >> >> @@ -624,7 +634,8 @@ vtn_push_ssa(struct vtn_builder *b, uint32_t value_id,
> >> >> struct vtn_type *type, struct vtn_ssa_value *ssa)
> >> >> {
> >> >> struct vtn_value *val;
> >> >> - if (type->base_type == vtn_base_type_pointer) {
> >> >> + if (type->base_type == vtn_base_type_pointer ||
> >> >> + type->base_type == vtn_base_type_raw_pointer) {
> >> >> val = vtn_push_value(b, value_id, vtn_value_type_pointer);
> >> >> val->pointer = vtn_pointer_from_ssa(b, ssa->def, type);
> >> >> } else {
> >> >> @@ -688,6 +699,11 @@ struct vtn_ssa_value *vtn_local_load(struct vtn_builder *b, nir_deref_var *src);
> >> >>  void vtn_local_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >>                       nir_deref_var *dest);
> >> >>
> >> >> +struct vtn_ssa_value *vtn_pointer_load(struct vtn_builder *b,
> >> >> +                                       struct vtn_pointer *ptr);
> >> >> +void vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >> +                       struct vtn_pointer *ptr);
> >> >> +
> >> >> struct vtn_ssa_value *
> >> >> vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src);
> >> >>
> >> >> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
> >> >> index b2897407fb1..af9222d6f4e 100644
> >> >> --- a/src/compiler/spirv/vtn_variables.c
> >> >> +++ b/src/compiler/spirv/vtn_variables.c
> >> >> @@ -51,6 +51,9 @@ vtn_access_chain_extend(struct vtn_builder *b, struct vtn_access_chain *old,
> >> >> unsigned old_len = old ? old->length : 0;
> >> >> chain = vtn_access_chain_create(b, old_len + new_ids);
> >> >>
> >> >> + if (old)
> >> >> + chain->ptr_as_array = old->ptr_as_array;
> >> >> +
> >> >> for (unsigned i = 0; i < old_len; i++)
> >> >> chain->link[i] = old->link[i];
> >> >>
> >> >> @@ -88,11 +91,12 @@ vtn_access_chain_pointer_dereference(struct vtn_builder *b,
> >> >> vtn_access_chain_extend(b, base->chain, deref_chain->length);
> >> >> struct vtn_type *type = base->type;
> >> >>
> >> >> -   /* OpPtrAccessChain is only allowed on things which support variable
> >> >> -    * pointers.  For everything else, the client is expected to just pass us
> >> >> -    * the right access chain.
> >> >> +   /* We can get an *PtrAccessChain with cl kernels, with first element
> >> >> +    * of deref chain being literal zero.  But we can't really cope w/
> >> >> +    * tacking on another level of ptr_as_array.
> >> >>      */
> >> >> -   vtn_assert(!deref_chain->ptr_as_array);
> >> >> +   vtn_assert(!(chain->ptr_as_array && deref_chain->ptr_as_array));
> >> >> +   chain->ptr_as_array = deref_chain->ptr_as_array;
> >> >>
> >> >> unsigned start = base->chain ? base->chain->length : 0;
> >> >> for (unsigned i = 0; i < deref_chain->length; i++) {
> >> >> @@ -135,6 +139,21 @@ vtn_access_link_as_ssa(struct vtn_builder *b, struct vtn_access_link link,
> >> >> }
> >> >> }
> >> >>
> >> >> +static struct vtn_access_link
> >> >> +vtn_to_access_link(struct vtn_builder *b, uint32_t link_id)
> >> >> +{
> >> >> + struct vtn_value *link_val = vtn_untyped_value(b, link_id);
> >> >> + static struct vtn_access_link link;
> >> >> + if (link_val->value_type == vtn_value_type_constant) {
> >> >> + link.mode = vtn_access_mode_literal;
> >> >> + link.id = link_val->constant->values[0].u32[0];
> >> >> + } else {
> >> >> + link.mode = vtn_access_mode_id;
> >> >> + link.id = link_id;
> >> >> + }
> >> >> + return link;
> >> >> +}
> >> >> +
> >> >> static nir_ssa_def *
> >> >>  vtn_variable_resource_index(struct vtn_builder *b, struct vtn_variable *var,
> >> >>                              nir_ssa_def *desc_array_index)
> >> >> @@ -207,6 +226,12 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
> >> >> /* You can't have a zero-length OpPtrAccessChain */
> >> >> vtn_assert(deref_chain->length >= 1);
> >> >>           desc_arr_idx = vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
> >> >> +      } else if (!glsl_type_is_struct(type->type)) {
> >> >> +         /* if we have something that is ptr to simple type, like an
> >> >> +          * int ptr, then just treat that like an array of length 1,
> >> >> +          * we just want to deref the zero'th element:
> >> >> +          */
> >> >> +         desc_arr_idx = nir_imm_int(&b->nb, 0);
> >> >> } else {
> >> >> /* We have a regular non-array SSBO. */
> >> >> desc_arr_idx = NULL;
> >> >> @@ -304,7 +329,9 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
> >> >> case GLSL_TYPE_FLOAT:
> >> >> case GLSL_TYPE_FLOAT16:
> >> >> case GLSL_TYPE_DOUBLE:
> >> >> - case GLSL_TYPE_BOOL:
> >> >> + case GLSL_TYPE_BOOL: {
> >> >> + break;
> >> >> + }
> >> >> case GLSL_TYPE_ARRAY: {
> >> >> nir_ssa_def *elem_offset =
> >> >>             vtn_access_link_as_ssa(b, deref_chain->link[idx], type->stride);
> >> >> @@ -336,6 +363,98 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
> >> >> return ptr;
> >> >> }
> >> >>
> >> >> +static struct vtn_pointer *
> >> >> +vtn_ssa_raw_pointer_dereference(struct vtn_builder *b,
> >> >> + struct vtn_pointer *base,
> >> >> +                                struct vtn_access_chain *deref_chain)
> >> >> +{
> >> >> + nir_ssa_def *offset = nir_imm_int(&b->nb, 0);
> >> >> + struct vtn_type *type = base->type;
> >> >> +
> >> >> +
> >> >> + vtn_assert(type);
> >> >> +
> >> >> + unsigned idx = 0;
> >> >> +
> >> >> + /* For the *PtrAccessChain deref instructions, the first entry in
> >> >> + * the deref chain is deref'ing the ptr as an array, ie. if you
> >> >> + * have "struct foo *f" then it can be deref'd as either
> "f->blah",
> >> >> + * which is equiv to "f[0].blah", or "f[n].blah".
> >> >> + */
> >> >> + if (deref_chain->ptr_as_array) {
> >> >> + nir_ssa_def *elem_offset =
> >> >> +         vtn_access_link_as_ssa(b, deref_chain->link[idx], type->stride);
> >> >> + offset = nir_iadd(&b->nb, offset, elem_offset);
> >> >> + idx++;
> >> >> + }
> >> >> +
> >> >> + /* TODO this can probably be refactored out into helper? Nearly
> >> >> +    * the same as in vtn_ssa_offset_pointer_dereference() and perhaps
> >> >> + * elsewhere..
> >> >> + *
> >> >> + * Note that the type here in the switch is of what the pointer
> >> >> + * points to, ie. if the base type is uint, we are dereferencing
> >> >> + * (possibly as an array) a uint* pointer.
> >> >> + */
> >> >> + for (; idx < deref_chain->length; idx++) {
> >> >> + switch (glsl_get_base_type(type->type)) {
> >> >> + case GLSL_TYPE_UINT:
> >> >> + case GLSL_TYPE_INT:
> >> >> + case GLSL_TYPE_UINT16:
> >> >> + case GLSL_TYPE_INT16:
> >> >> + case GLSL_TYPE_UINT8:
> >> >> + case GLSL_TYPE_INT8:
> >> >> + case GLSL_TYPE_UINT64:
> >> >> + case GLSL_TYPE_INT64:
> >> >> + case GLSL_TYPE_FLOAT:
> >> >> + case GLSL_TYPE_FLOAT16:
> >> >> + case GLSL_TYPE_DOUBLE:
> >> >> + case GLSL_TYPE_BOOL:
> >> >> + case GLSL_TYPE_ARRAY: {
> >> >> + nir_ssa_def *elem_offset =
> >> >> +            vtn_access_link_as_ssa(b, deref_chain->link[idx], type->stride);
> >> >> + offset = nir_iadd(&b->nb, offset, elem_offset);
> >> >> + if (type->array_element) {
> >> >> + type = type->array_element;
> >> >> + } else {
> >> >> + vtn_assert(idx == (deref_chain->length - 1));
> >> >> + }
> >> >> + break;
> >> >> + }
> >> >> +
> >> >> + case GLSL_TYPE_STRUCT: {
> >> >> +         vtn_assert(deref_chain->link[idx].mode == vtn_access_mode_literal);
> >> >> + unsigned member = deref_chain->link[idx].id;
> >> >> +         nir_ssa_def *mem_offset = nir_imm_int(&b->nb, type->offsets[member]);
> >> >> + offset = nir_iadd(&b->nb, offset, mem_offset);
> >> >> + type = type->members[member];
> >> >> + break;
> >> >> + }
> >> >> +
> >> >> + default:
> >> >> + vtn_fail("Invalid type for deref");
> >> >> + }
> >> >> + }
> >> >> +
> >> >> +   /* at this point, offset is 32b, but pointer can be either 32b or 64b
> >> >> + * depending on memory model:
> >> >> + */
> >> >> + if (b->ptr_size == 64) {
> >> >> + offset = nir_u2u64(&b->nb, offset);
> >> >> + }
> >> >> +
> >> >> + /* add pointer address to calculated offset: */
> >> >> + if (base->offset)
> >> >> + offset = nir_iadd(&b->nb, offset, base->offset);
> >> >> +
> >> >> + struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
> >> >> + ptr->mode = base->mode;
> >> >> + ptr->type = base->type;
> >> >> + ptr->offset = offset;
> >> >> +
> >> >> + return ptr;
> >> >> +}
> >> >> +
> >> >> /* Dereference the given base pointer by the access chain */
> >> >> static struct vtn_pointer *
> >> >> vtn_pointer_dereference(struct vtn_builder *b,
> >> >> @@ -344,6 +463,8 @@ vtn_pointer_dereference(struct vtn_builder *b,
> >> >> {
> >> >> if (vtn_pointer_uses_ssa_offset(b, base)) {
> >> >>        return vtn_ssa_offset_pointer_dereference(b, base, deref_chain);
> >> >> +   } else if (base->ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> >> +      return vtn_ssa_raw_pointer_dereference(b, base, deref_chain);
> >> >>     } else {
> >> >>        return vtn_access_chain_pointer_dereference(b, base, deref_chain);
> >> >> }
> >> >> @@ -373,7 +494,8 @@ vtn_pointer_for_variable(struct vtn_builder *b,
> >> >>
> >> >> pointer->mode = var->mode;
> >> >> pointer->type = var->type;
> >> >> - vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> >> + vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> >> + ptr_type->base_type == vtn_base_type_raw_pointer);
> >> >> vtn_assert(ptr_type->deref->type == var->type->type);
> >> >> pointer->ptr_type = ptr_type;
> >> >> pointer->var = var;
> >> >> @@ -384,6 +506,12 @@ vtn_pointer_for_variable(struct vtn_builder *b,
> >> >> nir_deref_var *
> >> >> vtn_pointer_to_deref(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> >> {
> >> >> + /* once you've chased a pointer, you no longer really have a var,
> >> >> + * so this case is handled differently
> >> >> + */
> >> >> + if (!ptr->var)
> >> >> + return NULL;
> >> >> +
> >> >> /* Do on-the-fly copy propagation for samplers. */
> >> >> if (ptr->var->copy_prop_sampler)
> >> >> return vtn_pointer_to_deref(b, ptr->var->copy_prop_sampler);
> >> >> @@ -408,7 +536,19 @@ vtn_pointer_to_deref(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> >> nir_deref *tail = &deref_var->deref;
> >> >> nir_variable **members = ptr->var->members;
> >> >>
> >> >> - for (unsigned i = 0; i < chain->length; i++) {
> >> >> + unsigned i = 0;
> >> >> + if (chain->ptr_as_array) {
> >> >> +      /* We can't currently handle this in a nir deref chain.  Perhaps
> >> >> +       * a new type of deref node is needed?  But we can at least handle
> >> >> +       * the case where the first deref link is literal zero by skipping
> >> >> + * it.
> >> >> + */
> >> >> + vtn_assert(chain->link[0].mode == vtn_access_mode_literal);
> >> >> + vtn_assert(chain->link[0].id == 0);
> >> >> + i++;
> >> >> + }
> >> >> +
> >> >> + for (; i < chain->length; i++) {
> >> >>        enum glsl_base_type base_type = glsl_get_base_type(deref_type->type);
> >> >> switch (base_type) {
> >> >> case GLSL_TYPE_UINT:
> >> >> @@ -598,6 +738,49 @@ vtn_local_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >> }
> >> >> }
> >> >>
> >> >> +struct vtn_ssa_value *
> >> >> +vtn_pointer_load(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> >> +{
> >> >> + const struct glsl_type *type = ptr->type->type;
> >> >> + struct vtn_ssa_value *val = vtn_create_ssa_value(b, type);
> >> >> + nir_intrinsic_op op = nir_intrinsic_load_global;
> >> >> +
> >> >> + vtn_assert(ptr->offset);
> >> >> +
> >> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader, op);
> >> >> + intrin->num_components = glsl_get_vector_elements(type);
> >> >> + intrin->src[0] = nir_src_for_ssa(ptr->offset);
> >> >> +
> >> >> + nir_ssa_dest_init(&intrin->instr, &intrin->dest,
> >> >> + intrin->num_components,
> >> >> + glsl_get_bit_size(type),
> >> >> + NULL);
> >> >> + val->def = &intrin->dest.ssa;
> >> >> +
> >> >> + nir_builder_instr_insert(&b->nb, &intrin->instr);
> >> >> +
> >> >> + return val;
> >> >> +}
> >> >> +
> >> >> +void
> >> >> +vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >> + struct vtn_pointer *ptr)
> >> >> +{
> >> >> + const struct glsl_type *type = ptr->type->type;
> >> >> + nir_intrinsic_op op = nir_intrinsic_store_global;
> >> >> +
> >> >> + vtn_assert(ptr->offset);
> >> >> +
> >> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader, op);
> >> >> +   intrin->num_components = MAX2(1, glsl_get_vector_elements(type));
> >> >> + intrin->src[0] = nir_src_for_ssa(src->def);
> >> >> + intrin->src[1] = nir_src_for_ssa(ptr->offset);
> >> >> +
> >> >> +   nir_intrinsic_set_write_mask(intrin, (1 << intrin->num_components) - 1);
> >> >> +
> >> >> + nir_builder_instr_insert(&b->nb, &intrin->instr);
> >> >> +}
> >> >> +
> >> >> nir_ssa_def *
> >> >>  vtn_pointer_to_offset(struct vtn_builder *b, struct vtn_pointer *ptr,
> >> >>                        nir_ssa_def **index_out, unsigned *end_idx_out)
> >> >> @@ -1004,20 +1187,33 @@ _vtn_variable_load_store(struct vtn_builder *b, bool load,
> >> >> case GLSL_TYPE_FLOAT:
> >> >> case GLSL_TYPE_FLOAT16:
> >> >> case GLSL_TYPE_BOOL:
> >> >> -   case GLSL_TYPE_DOUBLE:
> >> >> -      /* At this point, we have a scalar, vector, or matrix so we know that
> >> >> -       * there cannot be any structure splitting still in the way.  By
> >> >> -       * stopping at the matrix level rather than the vector level, we
> >> >> -       * ensure that matrices get loaded in the optimal way even if they
> >> >> -       * are storred row-major in a UBO.
> >> >> -       */
> >> >> -      if (load) {
> >> >> -         *inout = vtn_local_load(b, vtn_pointer_to_deref(b, ptr));
> >> >> +   case GLSL_TYPE_DOUBLE: {
> >> >> +      nir_deref_var *deref_var = vtn_pointer_to_deref(b, ptr);
> >> >> +
> >> >> +      if (deref_var) {
> >> >> +         /* At this point, we have a scalar, vector, or matrix so we know
> >> >> +          * that there cannot be any structure splitting still in the way.
> >> >> +          * By stopping at the matrix level rather than the vector level,
> >> >> +          * we ensure that matrices get loaded in the optimal way even if
> >> >> +          * they are storred row-major in a UBO.
> >> >> +          */
> >> >> +         if (load) {
> >> >> +            *inout = vtn_local_load(b, deref_var);
> >> >> +         } else {
> >> >> +            vtn_local_store(b, *inout, deref_var);
> >> >> +         }
> >> >>        } else {
> >> >> -         vtn_local_store(b, *inout, vtn_pointer_to_deref(b, ptr));
> >> >> +         /* If deref'ing a raw pointer, we don't have a variable to go along
> >> >> +          * with it.  Just directly generate load/store_global intrinsics:
> >> >> +          */
> >> >> +         if (load) {
> >> >> +            *inout = vtn_pointer_load(b, ptr);
> >> >> +         } else {
> >> >> +            vtn_pointer_store(b, *inout, ptr);
> >> >> +         }
> >> >>        }
> >> >>        return;
> >> >> -
> >> >> +   }
> >> >> case GLSL_TYPE_ARRAY:
> >> >> case GLSL_TYPE_STRUCT: {
> >> >> unsigned elems = glsl_get_length(ptr->type->type);
> >> >> @@ -1052,6 +1248,8 @@ vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src)
> >> >> {
> >> >> if (vtn_pointer_is_external_block(b, src)) {
> >> >> return vtn_block_load(b, src);
> >> >> + } else if (!src->var) {
> >> >> + return vtn_pointer_load(b, src);
> >> >> } else {
> >> >> struct vtn_ssa_value *val = NULL;
> >> >> _vtn_variable_load_store(b, true, src, &val);
> >> >> @@ -1065,8 +1263,11 @@ vtn_variable_store(struct vtn_builder *b, struct vtn_ssa_value *src,
> >> >> {
> >> >> if (vtn_pointer_is_external_block(b, dest)) {
> >> >> vtn_assert(dest->mode == vtn_variable_mode_ssbo ||
> >> >> - dest->mode == vtn_variable_mode_workgroup);
> >> >> + dest->mode == vtn_variable_mode_workgroup ||
> >> >> + dest->mode == vtn_variable_mode_param);
> >> >> vtn_block_store(b, src, dest);
> >> >> + } else if (!dest->var) {
> >> >> + vtn_pointer_store(b, src, dest);
> >> >> } else {
> >> >> _vtn_variable_load_store(b, false, dest, &src);
> >> >> }
> >> >> @@ -1608,6 +1809,7 @@ static enum vtn_variable_mode
> >> >> vtn_storage_class_to_mode(struct vtn_builder *b,
> >> >> SpvStorageClass class,
> >> >> struct vtn_type *interface_type,
> >> >> + bool has_initializer,
> >> >> nir_variable_mode *nir_mode_out)
> >> >> {
> >> >> enum vtn_variable_mode mode;
> >> >> @@ -1635,8 +1837,12 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
> >> >> } else if (glsl_type_is_sampler(interface_type->type)) {
> >> >> mode = vtn_variable_mode_sampler;
> >> >> nir_mode = nir_var_uniform;
> >> >> + } else if (has_initializer) {
> >> >> + mode = vtn_variable_mode_const;
> >> >> + nir_mode = nir_var_global;
> >> >> } else {
> >> >> - vtn_fail("Invalid uniform constant variable type");
> >> >> + mode = vtn_variable_mode_push_constant;
> >> >> + nir_mode = nir_var_uniform;
> >> >> }
> >> >> break;
> >> >> case SpvStorageClassPushConstant:
> >> >> @@ -1664,6 +1870,10 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
> >> >> nir_mode = nir_var_shared;
> >> >> break;
> >> >> case SpvStorageClassCrossWorkgroup:
> >> >> + mode = vtn_variable_mode_cross_workgroup;
> >> >> + // TODO do we want a new nir_mode for raw pointers?
> >> >> + nir_mode = nir_var_global;
> >> >> + break;
> >> >> case SpvStorageClassGeneric:
> >> >> case SpvStorageClassAtomicCounter:
> >> >> default:
> >> >> @@ -1692,7 +1902,14 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr)
> >> >> struct vtn_access_chain chain = {
> >> >> .length = 0,
> >> >> };
> >> >> - ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
> >> >> +
> >> >> + if (vtn_pointer_uses_ssa_offset(b, ptr)) {
> >> >> + ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
> >> >> +      } else if (ptr->ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> >> +         ptr = vtn_ssa_raw_pointer_dereference(b, ptr, &chain);
> >> >> + } else {
> >> >> + vtn_fail("unhandled");
> >> >> + }
> >> >> }
> >> >>
> >> >> vtn_assert(ptr->offset);
> >> >> @@ -1710,27 +1927,34 @@ struct vtn_pointer *
> >> >> vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
> >> >> struct vtn_type *ptr_type)
> >> >> {
> >> >> - vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
> >> >> - vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> >> + vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> >> + ptr_type->base_type == vtn_base_type_raw_pointer);
> >> >> vtn_assert(ptr_type->deref->base_type != vtn_base_type_pointer);
> >> >> /* This pointer type needs to have actual storage */
> >> >> vtn_assert(ptr_type->type);
> >> >>
> >> >> struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
> >> >> ptr->mode = vtn_storage_class_to_mode(b, ptr_type->storage_class,
> >> >> - ptr_type, NULL);
> >> >> + ptr_type, false, NULL);
> >> >> +
> >> >> + if (ptr_type->base_type == vtn_base_type_raw_pointer) {
> >> >> +      vtn_assert(ssa->num_components == 1 && ssa->bit_size == b->ptr_size);
> >> >> + } else {
> >> >> + vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
> >> >> + }
> >> >> +
> >> >> ptr->type = ptr_type->deref;
> >> >> ptr->ptr_type = ptr_type;
> >> >>
> >> >> if (ssa->num_components > 1) {
> >> >> vtn_assert(ssa->num_components == 2);
> >> >> vtn_assert(ptr->mode == vtn_variable_mode_ubo ||
> >> >> - ptr->mode == vtn_variable_mode_ssbo);
> >> >> + ptr->mode == vtn_variable_mode_ssbo ||
> >> >> + ptr->mode == vtn_variable_mode_global);
> >> >> ptr->block_index = nir_channel(&b->nb, ssa, 0);
> >> >> ptr->offset = nir_channel(&b->nb, ssa, 1);
> >> >> } else {
> >> >> vtn_assert(ssa->num_components == 1);
> >> >> - vtn_assert(ptr->mode == vtn_variable_mode_workgroup);
> >> >> ptr->block_index = NULL;
> >> >> ptr->offset = ssa;
> >> >> }
> >> >> @@ -1761,7 +1985,8 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
> >> >>                      struct vtn_type *ptr_type, SpvStorageClass storage_class,
> >> >> nir_constant *initializer)
> >> >> {
> >> >> - vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
> >> >> + vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
> >> >> + ptr_type->base_type == vtn_base_type_raw_pointer);
> >> >> struct vtn_type *type = ptr_type->deref;
> >> >>
> >> >> struct vtn_type *without_array = type;
> >> >> @@ -1770,7 +1995,8 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
> >> >>
> >> >> enum vtn_variable_mode mode;
> >> >> nir_variable_mode nir_mode;
> >> >> -   mode = vtn_storage_class_to_mode(b, storage_class, without_array, &nir_mode);
> >> >> + mode = vtn_storage_class_to_mode(b, storage_class, without_array,
> >> >> + !!initializer, &nir_mode);
> >> >>
> >> >> switch (mode) {
> >> >> case vtn_variable_mode_ubo:
> >> >> @@ -1933,6 +2159,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
> >> >> case vtn_variable_mode_ubo:
> >> >> case vtn_variable_mode_ssbo:
> >> >> case vtn_variable_mode_push_constant:
> >> >> + case vtn_variable_mode_cross_workgroup:
> >> >> /* These don't need actual variables. */
> >> >> break;
> >> >> }
> >> >> @@ -2031,25 +2258,18 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
> >> >> case SpvOpAccessChain:
> >> >> case SpvOpPtrAccessChain:
> >> >> case SpvOpInBoundsAccessChain: {
> >> >> +      struct vtn_type *ptr_type = vtn_value(b, w[1], vtn_value_type_type)->type;
> >> >> +      struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
> >> >>        struct vtn_access_chain *chain = vtn_access_chain_create(b, count - 4);
> >> >> - chain->ptr_as_array = (opcode == SpvOpPtrAccessChain);
> >> >> + chain->ptr_as_array = (opcode == SpvOpPtrAccessChain) ||
> >> >> + (opcode == SpvOpInBoundsPtrAccessChain);
> >> >>
> >> >> unsigned idx = 0;
> >> >> for (int i = 4; i < count; i++) {
> >> >> - struct vtn_value *link_val = vtn_untyped_value(b, w[i]);
> >> >> - if (link_val->value_type == vtn_value_type_constant) {
> >> >> - chain->link[idx].mode = vtn_access_mode_literal;
> >> >> -            chain->link[idx].id = link_val->constant->values[0].u32[0];
> >> >> - } else {
> >> >> - chain->link[idx].mode = vtn_access_mode_id;
> >> >> - chain->link[idx].id = w[i];
> >> >> -
> >> >> - }
> >> >> + chain->link[idx] = vtn_to_access_link(b, w[i]);
> >> >> idx++;
> >> >> }
> >> >>
> >> >> -      struct vtn_type *ptr_type = vtn_value(b, w[1], vtn_value_type_type)->type;
> >> >> - struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
> >> >> if (base_val->value_type == vtn_value_type_sampled_image) {
> >> >>           /* This is rather insane.  SPIR-V allows you to use OpSampledImage
> >> >>            * to combine an array of images with a single sampler to get an
> >> >> --
> >> >> 2.14.3
> >> >>
> >> >> _______________________________________________
> >> >> mesa-dev mailing list
> >> >> mesa-dev at lists.freedesktop.org
> >> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >> >
> >> >
> >
> >
>