<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jun 13, 2018 at 5:07 PM, Chema Casanova <<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 13/06/18 22:46, Jason Ekstrand wrote: > On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo <div><div class="h5">> <<a href="mailto:jmcasanova@igalia.com">jmcasanova@igalia.com</a> <mailto:<a href="mailto:jmcasanova@igalia.com">jmcasanova@igalia.com</a>>> wrote: > > This new function takes care of shuffle/unshuffle components of a > particular bit-size in components with a different bit-size. > > If source type size is smaller than destination type size the operation > needed is a component shuffle. The opposite case would be an unshuffle. > > The operation allows to skip first_component number of components from > the source. > > Shuffle MOVs are retyped using integer types avoiding problems with > denorms > and float types. This allows to simplify uses of shuffle functions > that are > dealing with these retypes individually. > > Now there is a new restriction so source and destination can not overlap > anymore when calling this suffle function. Following patches that > migrate > to use this new function will take care individually of avoiding source > and destination overlaps. > --- > src/intel/compiler/brw_fs_nir.cpp | 92 +++++++++++++++++++++++++++++++ > 1 file changed, 92 insertions(+) > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > b/src/intel/compiler/brw_fs_nir.cpp > index 166da0aa6d7..1a9d3c41d1d 100644 > --- a/src/intel/compiler/brw_fs_nir.cpp > +++ b/src/intel/compiler/brw_fs_nir.cpp > @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const > fs_builder &bld, > } > } > > +/* > + * This helper takes a source register and un/shuffles it into the > destination > + * register. > + * > + * If source type size is smaller than destination type size the > operation > + * needed is a component shuffle. The opposite case would be an > unshuffle. If > + * source/destination type size is equal a shuffle is done that > would be > + * equivalent to a simple MOV. > > > There's a sticky bit here if we want this to work with 64-bit types on > gen7 and earlier because we only have DF there and not Q so the > brw_reg_type_from_bit_size below doesn't work. If we care about that > case (and I'm not convinced we do), it should be easy enough to add a > type_sz(src.type) == type_sz(dst.type) case which just does MOVs from > source to dest. </div></div>At this moment, current uses of this function are to read from 32-bits or to write to 32-bit. But I think that for completeness if would be nice to have all cases covered. The option of doing the MOVs in the case of equality (that would be quite normal) saves us to do the shuffle calculus for the simple case. So I'm going for it. <div><div class="h5"> > + * > + * For example, if source is a 16-bit type and destination is > 32-bit. A 3 > + * components .xyz 16-bit vector on SIMD8 would be. > + * > + * |x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8| > + * |z1|z2|z3|z4|z5|z6|z7|z8| | | | | | | | | > + * > + * This helper will return the following 2 32-bit components with > the 16-bit > + * values shuffled: > + * > + * |x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8| > + * |z1 |z2 |z3 |z4 |z5 |z6 |z7 |z8 | > + * > + * For unshuffle, the example would be the opposite, a 64-bit type > source > + * and a 32-bit destination. A 2 component .xy 64-bit vector on SIMD8 > + * would be: > + * > + * | x1l x1h | x2l x2h | x3l x3h | x4l x4h | > + * | x5l x5h | x6l x6h | x7l x7h | x8l x8h | > + * | y1l y1h | y2l y2h | y3l y3h | y4l y4h | > + * | y5l y5h | y6l y6h | y7l y7h | y8l y8h | > + * > + * The returned result would be the following 4 32-bit components > unshuffled: > + * > + * | x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l | > + * | x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h | > + * | y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l | > + * | y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h | > + * > + * - Source and destination register must not be overlapped. > + * - first_component parameter allows skipping source components. > + */ > +void > +shuffle_src_to_dst(const fs_builder &bld, > + const fs_reg &dst, > + const fs_reg &src, > + uint32_t first_component, > + uint32_t components) > +{ > + if (type_sz(src.type) <= type_sz(dst.type)) { > + /* Source is shuffled into destination */ > + unsigned size_ratio = type_sz(dst.type) / type_sz(src.type); > +#ifndef NDEBUG > + boolean src_dst_overlap = regions_overlap(dst, > + type_sz(dst.type) * bld.dispatch_width() * components, > + offset(src, bld, first_component * size_ratio), > > > Why do you need to multiply first_component by size_ratio? It's already > in units of source components. </div></div>Yes, that's wrong. I forgot to update this assertion with my lasts refactorings of this series, and as nowadays there are no overlaps I didn't realize of the error. Both calls to regions_overlap have all parameters wrong. I should also include in the comment than the "component" parameter is measured in terms of the smaller type between source and destination. As we are un/shuffling the smaller components from/into a bigger one. </blockquote><div> </div><div>Yes, please. Is that really what we want though? Do we have cases where we are going from a big type to a small type and want a small component offset? I suppose I could see it happening. It's ok if it is what we want; I just want to make sure. :-) </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> > + type_sz(src.type) * bld.dispatch_width() * components * > size_ratio); > +#endif > + assert(!src_dst_overlap); > > > If the only thing you're doing with src_dst_overlap is to assert on it, > you may as well put the regions_overlap call inside the assert and drop > the #ifndef. Ok, I just thought it was too long function. I'll change it. > + > + brw_reg_type shuffle_type = > + brw_reg_type_from_bit_size(8 * type_sz(src.type), > + BRW_REGISTER_TYPE_D); > + for (unsigned i = 0; i < components; i++) { > + fs_reg shuffle_component_i = > + subscript(offset(dst, bld, i / size_ratio), > + shuffle_type, i % size_ratio); > + bld.MOV(shuffle_component_i, > + retype(offset(src, bld, i + first_component), > shuffle_type)); > + } > + } else { > + /* Source is unshuffled into destination */ > + unsigned size_ratio = type_sz(src.type) / type_sz(dst.type); > +#ifndef NDEBUG > + boolean src_dst_overlap = regions_overlap(dst, > + type_sz(dst.type) * bld.dispatch_width() * components, > + offset(src, bld, first_component / size_ratio), > > > Again, I don't think we want to divide here. Sure. > > > + type_sz(src.type) * bld.dispatch_width() * > + DIV_ROUND_UP(components + first_component % size_ratio, > size_ratio)); > +#endif > + assert(!src_dst_overlap); > > > As I said above, everything can go in the assert. Ok. > Other than that, this looks great. Thanks for the review. I'm sending tomorrow a v2 with the fixes. > --Jason > > > > + brw_reg_type shuffle_type = > + brw_reg_type_from_bit_size(8 * type_sz(dst.type), > + BRW_REGISTER_TYPE_D); > + for (unsigned i = 0; i < components; i++) { > + fs_reg shuffle_component_i = > + subscript(offset(src, bld, (first_component + i) / > size_ratio), > + shuffle_type, (first_component + i) % > size_ratio); > + bld.MOV(retype(offset(dst, bld, i), shuffle_type), > + shuffle_component_i); > + } > + } > +} > + > fs_reg > setup_imm_df(const fs_builder &bld, double v) > { > -- > 2.17.1 > > _______________________________________________ > mesa-dev mailing list > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a> <mailto:<a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a>> > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a> > <<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a>> <div class="HOEnZb"><div class="h5">> > > > > _______________________________________________ > mesa-dev mailing list > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a> > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a> > </div></div></blockquote></div> </div></div>