<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Nov 30, 2017 at 3:41 PM, Chema Casanova <<a href="mailto:jmcasanova@igalia.com" target="_blank">jmcasanova@igalia.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On 30/11/17 23:21, Jason Ekstrand wrote: > On Wed, Nov 29, 2017 at 6:50 PM, Jose Maria Casanova Crespo <div><div class="h5">> <<a href="mailto:jmcasanova@igalia.com">jmcasanova@igalia.com</a> <mailto:<a href="mailto:jmcasanova@igalia.com">jmcasanova@igalia.com</a>>> wrote: > > This helpers are used to load/store 16-bit types from/to 32-bit > components. > > The functions shuffle_32bit_load_result_to_16bit_data and > shuffle_16bit_data_for_32bit_write are implemented in a similar > way than the analogous functions for handling 64-bit types. > --- > src/intel/compiler/brw_fs.h | 11 +++++++++ > src/intel/compiler/brw_fs_nir.cpp | 51 > +++++++++++++++++++++++++++++++++++++++ > 2 files changed, 62 insertions(+) > > diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h > index 19b897e7a9..30557324d5 100644 > --- a/src/intel/compiler/brw_fs.h > +++ b/src/intel/compiler/brw_fs.h > @@ -497,6 +497,17 @@ void > shuffle_32bit_load_result_to_64bit_data(const brw::fs_builder &bld, > fs_reg shuffle_64bit_data_for_32bit_write(const brw::fs_builder &bld, > const fs_reg &src, > uint32_t components); > + > +void shuffle_32bit_load_result_to_16bit_data(const brw::fs_builder > &bld, > + const fs_reg &dst, > + const fs_reg &src, > + uint32_t components); > + > +void shuffle_16bit_data_for_32bit_write(const brw::fs_builder &bld, > + const fs_reg &dst, > + const fs_reg &src, > + uint32_t components); > + > fs_reg setup_imm_df(const brw::fs_builder &bld, > double v); > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > b/src/intel/compiler/brw_fs_nir.cpp > index 726b2fcee7..c091241132 100644 > --- a/src/intel/compiler/brw_fs_nir.cpp > +++ b/src/intel/compiler/brw_fs_nir.cpp > @@ -4828,6 +4828,33 @@ shuffle_32bit_load_result_to_64bit_data(const > fs_builder &bld, > } > } > > +void > +shuffle_32bit_load_result_to_16bit_data(const fs_builder &bld, > + const fs_reg &dst, > + const fs_reg &src, > + uint32_t components) > +{ > + assert(type_sz(src.type) == 4); > + assert(type_sz(dst.type) == 2); > + > + fs_reg tmp = retype(bld.vgrf(src.type), dst.type); > + > + for (unsigned i = 0; i < components; i++) { > + const fs_reg component_i = subscript(offset(src, bld, i / 2), > dst.type, i % 2); > + > + bld.MOV(offset(tmp, bld, i % 2), component_i); > + > + if (i % 2) { > + bld.MOV(offset(dst, bld, i -1), offset(tmp, bld, 0)); > + bld.MOV(offset(dst, bld, i), offset(tmp, bld, 1)); > + } > > > I'm very confused by this extra moving. Why can't we just do > > bld.MOV(offset(dst, bld, i), component_i); > > above? Maybe I'm missing something but I don't see why the extra moves > are needed. </div></div>There is a comment in the previous function shuffle_32bit_load_result_to_64bit_data that explains the similar situation that still applies in shuffle_32bit_load_result_to_16bit_data, what about including the following comment. /* A temporary that we will use to shuffle the 16-bit data of each * component in the vector into valid 32-bit data. We can't write * directly to dst because dst can be the same as src and in that * case the first MOV in the loop below would overwrite the data * read in the second MOV. */ But in any case I've just checked, and at first sight at the 6 final uses of this function this situation never happens. <div><div class="h5"> > > > + } > + if (components % 2) { > + bld.MOV(offset(dst, bld, components - 1), tmp); > + } > +} > + > + > /** > * This helper does the inverse operation of > * SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA. > @@ -4860,6 +4887,30 @@ shuffle_64bit_data_for_32bit_write(const > fs_builder &bld, > return dst; > } > > +void > +shuffle_16bit_data_for_32bit_write(const fs_builder &bld, > + const fs_reg &dst, > + const fs_reg &src, > + uint32_t components) > +{ > + assert(type_sz(src.type) == 2); > + assert(type_sz(dst.type) == 4); > + > + fs_reg tmp = bld.vgrf(dst.type); > + > + for (unsigned i = 0; i < components; i++) { > + const fs_reg component_i = offset(src, bld, i); > + bld.MOV(subscript(tmp, src.type, i % 2), component_i); > + if (i % 2) { > + bld.MOV(offset(dst, bld, i / 2), tmp); > + } > > > Again, why the extra MOVs? Why not > > bld.MOV(subscript(offset(tmp, bld, i / 2), src.type, i % 2), component_i); > > instead of the extra MOVs? </div></div>The same previous issue applies here, but I've seen that at commit 030d2b5016360caf44ebfa3f6951a6d676316a89 i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write you changed the original behavior of this function. "All callers of this function allocate a fs_reg expressly to pass into it. It's much easier if we just let the helper allocate the register. While we're here, we switch it to doing the MOVs with an integer type so that we don't accidentally canonicalize floats on half of a double." In this case we only have one call where src and dst are the same at nir_intrinsic_store_output at nir_emit_tcs_intrinsic. Maybe I can re-factor the code following your approach. </blockquote><div> </div><div>I think it's fine to just let them handle in-place shuffles. Please drop a comment to that effect on both of them so that it's more clear why the temporary is needed. With that,</div><div> </div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Chema > > --Jason > > > + } > + if (components % 2) { > + bld.MOV(offset(dst, bld, components / 2), tmp); > + } > +} > + > + > fs_reg > setup_imm_df(const fs_builder &bld, double v) > { > -- > 2.14.3 > > _______________________________________________ > mesa-dev mailing list > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a> <mailto:<a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a>> > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a> > <<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a>> > > </blockquote></div> </div></div>