[Mesa-dev] [PATCH v2 1/3] i965/fs: fix indirect load DF uniforms on BSW/BXT
Francisco Jerez
currojerez at riseup.net
Sun Feb 19 02:58:13 UTC 2017
Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> The lowered BSW/BXT indirect move instructions had incorrect
> source types, which luckily wasn't causing incorrect assembly to be
> generated due to the bug fixed in the next patch, but would have
> confused the remaining back-end IR infrastructure due to the mismatch
> between the IR source types and the emitted machine code.
>
> v2:
> - Improve commit log (Curro)
> - Fix read_size (Curro)
> - Fix DF uniform array detection in assign_constant_locations() when
> it is acceded with 32-bit MOV_INDIRECTs in BSW/BXT.
>
> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> Cc: "17.0" <mesa-stable at lists.freedesktop.org>
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 11 ++++++++-
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 41 ++++++++++++++++----------------
> 2 files changed, 30 insertions(+), 22 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index c348bc7138d..93ab84b5845 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1944,10 +1944,19 @@ fs_visitor::assign_constant_locations()
> unsigned last = constant_nr + (inst->src[2].ud / 4) - 1;
> assert(last < uniforms);
>
> + bool supports_64bit_indirects =
> + !devinfo->is_cherryview && !devinfo->is_broxton;
> + /* Detect if this is as a result 64-bit MOV INDIRECT. In case
> + * of BSW/BXT, we substitute each DF MOV INDIRECT by two 32-bit MOV
> + * INDIRECT.
> + */
> + bool mov_indirect_64bit = (type_sz(inst->src[i].type) == 8) ||
> + (!supports_64bit_indirects && inst->dst.type == BRW_REGISTER_TYPE_UD &&
> + inst->src[0].type == BRW_REGISTER_TYPE_UD && inst->dst.stride == 2);
This seems kind of fragile, I don't think the optimizer gives you any
guarantees that the stride of a lowered 64-bit indirect move will remain
equal to two, or that the destination stride of an actual 32-bit
indirect uniform load will never end up being two as well. That said,
because you access these with 32-bit indirect moves, I don't see why
you'd need to treat them as 64-bit uniforms here, the usual alignment
requirements for 64-bit uniforms no longer apply, so you can treat them
as regular 32-bit uniforms AFAICT. Why did you add this hunk?
Other than that patch looks good to me.
> for (unsigned j = constant_nr; j < last; j++) {
> is_live[j] = true;
> contiguous[j] = true;
> - if (type_sz(inst->src[i].type) == 8) {
> + if (mov_indirect_64bit) {
> is_live_64bit[j] = true;
> }
> }
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 991c20fad62..4aaf84964e9 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -3878,31 +3878,30 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
> unsigned read_size = instr->const_index[1] -
> (instr->num_components - 1) * type_sz(dest.type);
>
> - fs_reg indirect_chv_high_32bit;
> - bool is_chv_bxt_64bit =
> - (devinfo->is_cherryview || devinfo->is_broxton) &&
> - type_sz(dest.type) == 8;
> - if (is_chv_bxt_64bit) {
> - indirect_chv_high_32bit = vgrf(glsl_type::uint_type);
> - /* Calculate indirect address to read high 32 bits */
> - bld.ADD(indirect_chv_high_32bit, indirect, brw_imm_ud(4));
> - }
> + bool supports_64bit_indirects =
> + !devinfo->is_cherryview && !devinfo->is_broxton;
>
> - for (unsigned j = 0; j < instr->num_components; j++) {
> - if (!is_chv_bxt_64bit) {
> + if (type_sz(dest.type) != 8 || supports_64bit_indirects) {
> + for (unsigned j = 0; j < instr->num_components; j++) {
> bld.emit(SHADER_OPCODE_MOV_INDIRECT,
> offset(dest, bld, j), offset(src, bld, j),
> indirect, brw_imm_ud(read_size));
> - } else {
> - bld.emit(SHADER_OPCODE_MOV_INDIRECT,
> - subscript(offset(dest, bld, j), BRW_REGISTER_TYPE_UD, 0),
> - offset(src, bld, j),
> - indirect, brw_imm_ud(read_size));
> -
> - bld.emit(SHADER_OPCODE_MOV_INDIRECT,
> - subscript(offset(dest, bld, j), BRW_REGISTER_TYPE_UD, 1),
> - offset(src, bld, j),
> - indirect_chv_high_32bit, brw_imm_ud(read_size));
> + }
> + } else {
> + const unsigned num_mov_indirects =
> + type_sz(dest.type) / type_sz(BRW_REGISTER_TYPE_UD);
> + /* We read a little bit less per MOV INDIRECT, as they are now
> + * 32-bits ones instead of 64-bit. Fix read_size then.
> + */
> + const unsigned read_size_32bit = read_size -
> + (num_mov_indirects - 1) * type_sz(BRW_REGISTER_TYPE_UD);
> + for (unsigned j = 0; j < instr->num_components; j++) {
> + for (unsigned i = 0; i < num_mov_indirects; i++) {
> + bld.emit(SHADER_OPCODE_MOV_INDIRECT,
> + subscript(offset(dest, bld, j), BRW_REGISTER_TYPE_UD, i),
> + subscript(offset(src, bld, j), BRW_REGISTER_TYPE_UD, i),
> + indirect, brw_imm_ud(read_size_32bit));
> + }
> }
> }
> }
> --
> 2.11.0
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170218/cabf6387/attachment.sig>
More information about the mesa-dev
mailing list