[Mesa-dev] [Mesa-stable] [PATCH 1/2] i965/fs: fix indirect load DF uniforms on BSW/BXT

Samuel Iglesias Gonsálvez siglesias at igalia.com
Wed Feb 15 14:45:39 UTC 2017


On Tue, 2017-02-14 at 11:11 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> 
> > Previously we were emitting two MOV_INDIRECT instructions by
> > calculating
> > source's indirect offsets for each 32-bit half of a DF source.
> > However,
> > this is not needed as we can just emit two 32-bit MOV INDIRECT
> > without
> > doing that calculation.
> > 
> 
> Maybe mention here the main motivation for fixing this and nominating
> it
> for stable: The lowered BSW/BXT indirect move instructions had
> incorrect
> source types, which luckily wasn't causing incorrect assembly to be
> generated due to the bug fixed in the next patch, but would have
> confused the remaining back-end IR infrastructure due to the mismatch
> between the IR source types and the emitted machine code.
> 

OK, thanks.

> > Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> > Cc: "17.0" <mesa-stable at lists.freedesktop.org>
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 23 ++++++++++--------
> > -----
> >  1 file changed, 10 insertions(+), 13 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > index 991c20fad62..8975940e10b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -3878,31 +3878,28 @@ fs_visitor::nir_emit_intrinsic(const
> > fs_builder &bld, nir_intrinsic_instr *instr
> >           unsigned read_size = instr->const_index[1] -
> >              (instr->num_components - 1) * type_sz(dest.type);
> >  
> > -         fs_reg indirect_chv_high_32bit;
> > -         bool is_chv_bxt_64bit =
> > -            (devinfo->is_cherryview || devinfo->is_broxton) &&
> > -            type_sz(dest.type) == 8;
> > -         if (is_chv_bxt_64bit) {
> > -            indirect_chv_high_32bit = vgrf(glsl_type::uint_type);
> > -            /* Calculate indirect address to read high 32 bits */
> > -            bld.ADD(indirect_chv_high_32bit, indirect,
> > brw_imm_ud(4));
> > -         }
> > +         bool supports_64bit_indirects =
> > +            !devinfo->is_cherryview && !devinfo->is_broxton;
> >  
> >           for (unsigned j = 0; j < instr->num_components; j++) {
> > -            if (!is_chv_bxt_64bit) {
> > +            if (type_sz(dest.type) != 8 ||
> > supports_64bit_indirects) {
> >                 bld.emit(SHADER_OPCODE_MOV_INDIRECT,
> >                          offset(dest, bld, j), offset(src, bld, j),
> >                          indirect, brw_imm_ud(read_size));
> >              } else {
> > +               /* We are going to read 64-bit data in two 32-bit
> > MOV INDIRECTS,
> > +                * each one reading half of the data.
> > +                */
> > +               read_size /= 2;
> 
> I don't think this is right, data for both halves is interleaved so
> the
> read size will be roughly the same as for the 64 bit indirect move.
> 

OK

> >                 bld.emit(SHADER_OPCODE_MOV_INDIRECT,
> >                          subscript(offset(dest, bld, j),
> > BRW_REGISTER_TYPE_UD, 0),
> > -                        offset(src, bld, j),
> > +                        subscript(offset(src, bld, j),
> > BRW_REGISTER_TYPE_UD, 0),
> >                          indirect, brw_imm_ud(read_size));
> >  
> >                 bld.emit(SHADER_OPCODE_MOV_INDIRECT,
> >                          subscript(offset(dest, bld, j),
> > BRW_REGISTER_TYPE_UD, 1),
> > -                        offset(src, bld, j),
> > -                        indirect_chv_high_32bit,
> > brw_imm_ud(read_size));
> > +                        subscript(offset(src, bld, j),
> > BRW_REGISTER_TYPE_UD, 1),
> > +                        indirect, brw_imm_ud(read_size));
> 
> Given that this makes both emit statements nearly identical except
> for
> the 0/1 indices passed to subscript(), wouldn't it make sense to
> remove
> one of them and wrap it in a loop?
> 

Right, I am going to refactor it.

Sam

> >              }
> >           }
> >        }
> > -- 
> > 2.11.0
> > 
> > _______________________________________________
> > mesa-stable mailing list
> > mesa-stable at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-stable
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170215/46245852/attachment-0001.sig>


More information about the mesa-dev mailing list