[Mesa-dev] Representing explicit memory layouts in NIR

Mon Dec 3 15:31:36 UTC 2018

On 1/12/18 14:34, Jason Ekstrand wrote:
> On Sat, Dec 1, 2018 at 4:06 AM apinheiro <apinheiro at igalia.com
> <mailto:apinheiro at igalia.com>> wrote:
>
>     On 30/11/18 23:52, Jason Ekstrand wrote:
>>     On Fri, Nov 30, 2018 at 4:39 PM Timothy Arceri
>>     <tarceri at itsqueeze.com <mailto:tarceri at itsqueeze.com>> wrote:
>>
>>         On 1/12/18 9:11 am, Jason Ekstrand wrote:
>>         > All,
>>         >
>>         > This week, I've been working on trying to move UBO and SSBO
>>         access in
>>         > NIR over to deref instructions.  I'm hoping that this will
>>         allow us to
>>         > start doing alias analysis and copy-propagation on it.  The
>>         passes we
>>         > have in NIR *should* be able to work with SSBOs as long as
>>         > nir_compare_derefs does the right thing.
>>         >
>>         > # A story about derefs
>>         >
>>         > In that effort, I've run into a bit of a snag with how to
>>         represent the
>>         > layout information.  What we get in from SPIR-V for Vulkan
>>         is a byte
>>         > offset for every struct member and a byte stride for every
>>         array (and
>>         > pointer in the OpPtrAccessChain case).  For matrices, there
>>         is an
>>         > additional RowMajor boolean we need to track somewhere. 
>>         With OpenCL
>>         > memory access, you don't get these decorations but it's
>>         trivial to
>>         > translate the OpenCL layout (It's the same as C) to
>>         offset/stride when
>>         > creating the type.  I've come up with three different ways
>>         to represent
>>         > the information and they all have their own downsides:
>>         >
>>         > ## 1. Put the information on the glsl_type similar to how
>>         it's done in
>>         > SPIR-V
>>         >
>>         > This has the advantage of being fairly non-invasive to
>>         glsl_type.  A lot
>>         > of the fields we need are already there and the only real
>>         change is to
>>         > allow array types to have strides.  The downside is that
>>         the information
>>         > is often not where you want.  Arrays and structs are ok
>>         but, for
>>         > matrices, you have to go fishing all the way back to the
>>         struct type to
>>         > get the RowMajor and MatrixStride decorations.  (Thanks,
>>         SPIR-V...) 
>>         > While this seems like a local annoyance, it actually
>>         destroys basically
>>         > all the advantages of having the information on the type
>>         and makes
>>         > lower_io a real pain.
>>         >
>>         > ## 2. Put the information on the type but do it properly
>>         >
>>         > In this version, we would put the matrix stride and
>>         RowMajor decoration
>>         > directly on the matrix type.  One obvious advantage here is
>>         that it
>>         > means no fishing for matrix type information.  Another is
>>         that, by
>>         > having the types specialized like this, the only way to
>>         change layouts
>>         > mid-deref-chain would be to have a cast.  Option 1 doesn't
>>         provide this
>>         > because matrix types are the same regardless of whether or
>>         not they're
>>         > declared RowMajor in the struct.  The downside to this
>>         option is that it
>>         > requires glsl_type surgery to make it work.  More on that
>>         in a bit.
>>
>
>     FWIW, while working on ARB_gl_spirv I also found annoying the need
>     to fish for the struct member information in order to get RowMajor
>     and MatrixStride information, as that lead to the need to track
>     the parent_type while you were processing the current type. As a
>     example, this is a extract of the code from _get_type_size (new
>     method from ARB_gl_spirv ubo/ssbo support, used to compute the
>     ubo/ssbo size):
>
>        /* Matrices must have a matrix stride and either RowMajor or
>     ColMajor */
>        if (glsl_type_is_matrix(type)) {
>           unsigned matrix_stride =
>              glsl_get_struct_field_explicit_matrix_stride(parent_type,
>     index_in_parent);
>
>           bool row_major =
>              glsl_get_struct_field_matrix_layout(parent_type,
>     index_in_parent) ==
>              GLSL_MATRIX_LAYOUT_ROW_MAJOR;
>
>           unsigned length = row_major ? glsl_get_vector_elements(type)
>              : glsl_get_length(type);
>
>           /* We don't really need to compute the type_size of the
>     matrix element
>            * type. That should be already included as part of
>     matrix_stride
>            */
>           return matrix_stride * length;
>        }
>
>
>     So for a given type, sometimes we are using the info from the
>     current type, and sometimes we need to go up to the parent (so in
>     addition to the parent_type we also need to track the index of the
>     current type on its parent).
>
>     At some point I was tempted to move matrix stride and row major
>     from the struct field to glsl type. But as Jason mentioned, that
>     would need a lot of changes, and I felt that doing that only for
>     ARB_gl_spirv convenience was an overkill. But if we have now more
>     reasons to do the move, I'm on that ship.
>
>
>>         >
>>         > ## 3. Put the information directly on the deref
>>         >
>>         > Instead of putting the stride/offset information on the
>>         type, we just
>>         > put it on the deref as we build the deref chain.  This is
>>         easy enough to
>>         > do in spirv_to_nir and someone could also do it easily
>>         enough as a
>>         > lowering pass based on a type_size function.  This has the
>>         advantage of
>>         > simplicity because you don't have to modify glsl_type at
>>         all and
>>         > lowering is stupid-easy because all the information you
>>         need is right
>>         > there on the deref.  The downside, however, is that you
>>         alias analysis
>>         > is potentially harder because you don't have the nice
>>         guarantee that you
>>         > don't see a layout change without a type cast.  The other
>>         downside is
>>         > that we can't ever use copy_deref with anything bigger than
>>         a vector
>>         > because you don't know the sizes of any types and, unless
>>         spirv_to_nir
>>         > puts the offset/stride information on the deref, there's
>>         now way to
>>         > reconstruct it.
>>         >
>>         > I've prototyped both 1 and 3 so far and I definitely like 3
>>         better than
>>         > 1 but it's not great.  I haven't prototyped 2 yet due to
>>         the issue
>>         > mentioned with glsl_type.
>>         >
>>         > Between 2 and 3, I really don't know how much we actually
>>         loose in terms
>>         > of our ability to do alias analysis.  I've written the
>>         alias analysis
>>         > for 3 and it isn't too bad.  I'm also not sure how much we
>>         would
>>         > actually loose from not being able to express whole-array or
>>         > whole-struct copies.  However, without a good reason
>>         otherwise, option 2
>>         > really seems like it's the best of all worlds....
>>         >
>>         > # glsl_type surgery
>>         >
>>         > You want a good reason, eh?  You should have known this was
>>         coming...
>>         >
>>         > The problem with option 2 above is that it requires
>>         significant
>>         > glsl_type surgery to do it.  Putting decorations on
>>         matrices violates
>>         > one of the core principals of glsl_type, namely that all
>>         fundamental
>>         > types: scalars, vectors, matrices, images, and samplers are
>>         singletons. 
>>         > Other types such as structs and arrays we build on-the-fly
>>         and cache
>>         > as-needed.  In order to do what we need for option 2 above,
>>         you have to
>>         > at least drop this for matrices and possibly vectors (the
>>         columns of a
>>         > row-major mat4 are vectors with a stride of 16).  Again, I
>>         see two options:
>>
>
> Something just occurred to me that I should note somewhere, so I'm
> putting it here.  Having static singletons for things like matrices is
> actually useful for compiler performance.  Because the cache is
> global, every time we look a type up in the cache, we have to lock
> around doing so.  However, with the static singletons like
> glsl_type::vec4_type, we can look them up without taking a lock. 
> Inside of NIR this probably isn't so bad because we don't tend to
> construct a lot of types out of thin air; we tend to just use what's
> there.  However, for glsl_to_nir or spirv_to_nir, the cost would be
> more noticeable.  If we reworked glsl_type without a plan to solve
> this problem, it would likely murder performance because that IR is
> constantly creating temporary variables and needs types for them.  I
> see some python codegen in my future....
>  
>
>>         > ## A. Major rework of the guts of glsl_type
>>         >
>>         > Basically, get rid of the static singletons and just use
>>         the build
>>         > on-the-fly and cache model for everything.  This would mean
>>         that mat4 ==
>>         > mat4 is no longer guaranteed unless you know a priori that
>>         none of your
>>         > types are decorated with layout information.  It would also
>>         be, not only
>>         > a pile of work, but a single mega-patch.  I don't know of
>>         any way to
>>         > make that change without just ripping it all up and putting
>>         it back
>>         > together.
>>
>>         Do we really need to throw away the singleton model? Could we
>>         not add
>>         another type on top of matrices to hold the layout
>>         information much like
>>         how we handle arrays and structs and just strip it off (like
>>         we often do
>>         with arrays) when needed for comparisons?
>>
>>         It's possible this could be messy, just trying to throw so
>>         ideas out there.
>>
>>
>>     As I've been thinking about that.  My thought for nir_type (if
>>     that's a thing) has been to have a "bare type" pointer that
>>     always points back to the undecorated version to use in comparisons.
>
>
>     Wouldn't that be a really limited form of comparison compared with
>     the current glsl_type comparison? As you mention, right now
>     several validations checks are based on directly compare the types
>     (type1 == type2), and right now that includes some decorations
>     (just skimming glsl_types::record_compare, we found
>     explicit_xfb_buffer, explicit_matrix_stride, etc). How would that
>     work with the nir bare type? It would be needed to do a two-step
>     check? Or adding a comparison method that compares recursively two
>     full types?
>
>
> Let me expand a bit on what I meant by that.  Each type would have a
> "bare_type" pointer.  For bare types, it would either be NULL or point
> to the type itself (I'm not sure which is more useful).  If you made
> an array of foo_type, for instance, foo_arr_type->bare_type would be
> an array of foo_type->bare_type and not just an undecorated array of
> foo_type.  In a sense, the recursion is already done for you.  This
> would be sufficient for answering the question of "are these, in
> general, the same type?"  I'm not sure what I would do with struct
> types.  I'm a bit inclined to just say "same members in the same
> order" and ignore things like field names but I'm not really sure.

Ok.

>
> If, on the other hand, you wanted to do something that looks more like
> GLSL interface matching, you're on your own.  You would have to write
> a (probably recursive) function that applies all the fiddly little
> rules around types and qualifiers.

Ok. Right now I guess that this would not be needed for ARB_gl_spirv.
Having said so, perhaps I'm wrong as now and then I found a new
old-OpenGL semantic that I need to bring back.

>
> Does that make more sense?

Yes, thanks for the explanation.

>
> --Jason
>  
>
>>
>>     --Jason
>>      
>>
>>         >
>>         > ## B. Make a new nir_type and make NIR use it
>>         >
>>         > This seems a bit crazy at this point.  src/compiler/nir
>>         itself has over
>>         > 200 references to glsl_type and that doesn't include
>>         back-ends.  It'd be
>>         > a major overhaul and it's not clear that it's worth it. 
>>         However, it
>>         > would mean that we'd have a chance to rewrite types and
>>         maybe do it
>>         > better.  Basing it on nir_alu_type instead of
>>         glsl_base_type would be
>>         > really nice because nir_alu_type already has an orthogonal
>>         split between
>>         > bit size and format (float, uint, etc.).  I would also
>>         likely structure
>>         > it like vtn_type which has a different base_type concept
>>         which I find
>>         > works better than glsl_base_type.
>>         >
>>         > Of course, A would be less invasive than B but B would give
>>         us the
>>         > chance to improve some things without rewriting quite as
>>         many levels of
>>         > the compiler.  There are a number of things I think we
>>         could do better
>>         > but changing those in the GLSL compiler would be a *lot* of
>>         work
>>         > especially since it doesn't use the C helpers that NIR
>>         does.  On the
>>         > other hand, the churn in NIR from introducing a new type
>>         data structure
>>         > would be pretty big.  I did a quick git grep and it looks
>>         like most of
>>         > the back-ends make pretty light use of glsl_type when it
>>         consuming NIR
>>         > so maybe it wouldn't be that bad?
>>         >
>>         > Thoughts?  Questions?  Objections?
>>         >
>>         > --Jason
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20181203/e3c9be13/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pEpkey.asc
Type: application/pgp-keys
Size: 1546 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20181203/e3c9be13/attachment-0001.key>