[Mesa-dev] [PATCH 25/95] i965/vec4: fix base offset for nir_registers with doubles

Thu Sep 29 10:27:14 UTC 2016

On Wed, 2016-08-17 at 14:16 -0700, Francisco Jerez wrote:
> Iago Toral <itoral at igalia.com> writes:
> 
> > 
> > On Tue, 2016-08-02 at 18:40 -0700, Francisco Jerez wrote:
> > > 
> > > Iago Toral Quiroga <itoral at igalia.com> writes:
> > > 
> > > > 
> > > > 
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +++++---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > index cf35f2e..fde7b60 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > @@ -280,7 +280,8 @@ vec4_visitor::get_nir_dest(const nir_dest
> > > > &dest)
> > > >        nir_ssa_values[dest.ssa.index] = dst;
> > > >        return dst;
> > > >     } else {
> > > > -      return dst_reg_for_nir_reg(this, dest.reg.reg,
> > > > dest.reg.base_offset,
> > > > +      unsigned base_offset = dest.reg.base_offset *
> > > > dest.reg.reg-
> > > > > 
> > > > > bit_size / 32;
> > > > +      return dst_reg_for_nir_reg(this, dest.reg.reg,
> > > > base_offset,
> > > >                                   dest.reg.indirect);
> > > >     }
> > > >  }
> > > > @@ -308,8 +309,9 @@ vec4_visitor::get_nir_src(const nir_src
> > > > &src,
> > > > enum brw_reg_type type,
> > > >        reg = nir_ssa_values[src.ssa->index];
> > > >     }
> > > >     else {
> > > > -     reg = dst_reg_for_nir_reg(this, src.reg.reg,
> > > > src.reg.base_offset,
> > > > -                               src.reg.indirect);
> > > > +      unsigned base_offset = src.reg.base_offset *
> > > > src.reg.reg-
> > > > > 
> > > > > bit_size / 32;
> > > > +      reg = dst_reg_for_nir_reg(this, src.reg.reg,
> > > > base_offset,
> > > > +                                src.reg.indirect);
> > > I think this wouldn't have been necessary if you had fixed the
> > > offset()
> > > helper to take into account the register type (as it does in the
> > > FS
> > > back-end)?
> > Yes, you are right.
> > 
> > I found it convenient that offset() didn't consider the type
> > because
> > that gave us a bit more flexibility to offset into the second half
> > of a
> > SIMD4x2 dvecN simply by doing offset(reg, 1). If offset takes the
> > type
> > into account we lose that ability because for a DF register the
> > minimum
> > we could offset with a delta=1 would be 2 SIMD8 registers (a full
> > SIMD4x2 dvec4). We can still avoid calling offset() in these cases
> > and
> > just increase reg_offset manually by 1 instead, I understand that
> > you
> > prefer this?
> > 
> Right, I think I'd rather have offset() mean the same thing on both
> the
> VEC4 and FS back-ends to avoid confusion (i.e. step by as many scalar
> components as the specified SIMD width).
> > 
> > Alternatively, we could also have typed and untyped versions of the
> > offset function so we can choose the one we need depending on the
> > case,
> > but maybe that would be more confusing?
> > 
> I don't think an untyped (i.e. based on an offset argument in byte or
> 32-byte units) variant of offset() would be a particularly compelling
> way to achieve the above either, in fact this seems to be the cause
> of
> the rather artificial limitation of the SIMD lowering pass that
> prevents
> it from handling cases where each half doesn't write or read exactly
> one
> register of the instruction.  If you had something like the FS
> backend's
> horiz_offset() you would just get the 4-wide vector of the register
> for
> the group index specified as argument, regardless of the type of the
> register.  The current representation of register offsets (as a 32B
> multiple) may still get in the way for non-64 bit register types, but
> that's not really the SIMD lowering pass' business and can probably
> be
> addressed separately.

I have been looking into this. It seems like a lot if use cases for the
offset() helper in the vec4 backend look like the following (this is
from the cse pass, but there are plenty more like this):

for (unsigned i = 0; i < regs_written(entry->generator); ++i) {
   vec4_instruction *copy = MOV(offset(entry->generator->dst, i),
                                offset(entry->tmp, i));
   ...
}

That is, code that is setup on a register by register basis rather than
a channel basis. For these cases I think the current offset() helper is
more useful. So how about we add a byte_offset() helper to the Vec4 IR
and we use that in cases like these and then we change the offset()
helper to operate with the channel semantics that we have in the FS IR
as you suggest and use that everywhere else?

> > 
> > Iago
> > 
> > > 
> > > > 
> > > > 
> > > >     }
> > > >  
> > > >     reg = retype(reg, type);