[Mesa-dev] [PATCH v3 19/48] i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src

Tue Oct 31 14:22:31 UTC 2017

On Tue, 2017-10-31 at 07:20 -0700, Jason Ekstrand wrote:
> On Tue, Oct 31, 2017 at 12:01 AM, Iago Toral <itoral at igalia.com>
> wrote:
> > On Mon, 2017-10-30 at 11:29 -0700, Jason Ekstrand wrote:
> > > On Mon, Oct 30, 2017 at 12:15 AM, Iago Toral <itoral at igalia.com>
> > > wrote:
> > > > On Fri, 2017-10-27 at 12:21 -0700, Jason Ekstrand wrote:
> > > > > On Thu, Oct 26, 2017 at 11:53 PM, Iago Toral <itoral at igalia.c
> > > > > om> wrote:
> > > > > > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > > > > > 
> > > > > > > ---
> > > > > > 
> > > > > > >  src/intel/compiler/brw_fs_nir.cpp | 33
> > > > > > +++++++++++++++++++++------
> > > > > > 
> > > > > > > ------
> > > > > > 
> > > > > > >  1 file changed, 21 insertions(+), 12 deletions(-)
> > > > > > 
> > > > > > >
> > > > > > 
> > > > > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > > > > 
> > > > > > > b/src/intel/compiler/brw_fs_nir.cpp
> > > > > > 
> > > > > > > index e008e2e..a441f57 100644
> > > > > > 
> > > > > > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > > > > 
> > > > > > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > > > > 
> > > > > > > @@ -1441,11 +1441,19 @@ fs_visitor::get_nir_src(const
> > > > > > nir_src &src)
> > > > > > 
> > > > > > >                     src.reg.base_offset * src.reg.reg-
> > > > > > 
> > > > > > > >num_components);
> > > > > > 
> > > > > > >     }
> > > > > > 
> > > > > > >  
> > > > > > 
> > > > > > > -   /* to avoid floating-point denorm flushing problems,
> > > > > > set the type
> > > > > > 
> > > > > > > by
> > > > > > 
> > > > > > > -    * default to D - instructions that need floating
> > > > > > point semantics
> > > > > > 
> > > > > > > will set
> > > > > > 
> > > > > > > -    * this to F if they need to
> > > > > > 
> > > > > > > -    */
> > > > > > 
> > > > > > > -   return retype(reg, BRW_REGISTER_TYPE_D);
> > > > > > 
> > > > > > > +   if (nir_src_bit_size(src) == 64 && devinfo->gen == 7)
> > > > > > {
> > > > > > 
> > > > > > > +      /* The only 64-bit type available on gen7 is DF,
> > > > > > so use that.
> > > > > > 
> > > > > > > */
> > > > > > 
> > > > > > > +      reg.type = BRW_REGISTER_TYPE_DF;
> > > > > > 
> > > > > > > +   } else {
> > > > > > 
> > > > > > > +      /* To avoid floating-point denorm flushing
> > > > > > problems, set the
> > > > > > 
> > > > > > > type by
> > > > > > 
> > > > > > > +       * default to an integer type - instructions that
> > > > > > need
> > > > > > 
> > > > > > > floating point
> > > > > > 
> > > > > > > +       * semantics will set this to F if they need to
> > > > > > 
> > > > > > > +       */
> > > > > > 
> > > > > > > +      reg.type =
> > > > > > brw_reg_type_from_bit_size(nir_src_bit_size(src),
> > > > > > 
> > > > > > >
> > > > > > +                                            BRW_REGISTER_T
> > > > > > YPE_D);
> > > > > > 
> > > > > > > +   }
> > > > > > 
> > > > > > > +
> > > > > > 
> > > > > > > +   return reg;
> > > > > > 
> > > > > > >  }
> > > > > > 
> > > > > > >  
> > > > > > 
> > > > > > >  /**
> > > > > > 
> > > > > > > @@ -1455,6 +1463,10 @@ fs_reg
> > > > > > 
> > > > > > >  fs_visitor::get_nir_src_imm(const nir_src &src)
> > > > > > 
> > > > > > >  {
> > > > > > 
> > > > > > >     nir_const_value *val = nir_src_as_const_value(src);
> > > > > > 
> > > > > > > +   /* This function shouldn't be called on anything
> > > > > > which can even
> > > > > > 
> > > > > > > +    * possibly be 64 bits as it can't do what it claims.
> > > > > > 
> > > > > > > +    */
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > What would be wrong with something like this?
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > if (nir_src_bit_size(src) == 32)
> > > > > > 
> > > > > >    return val ? fs_reg(brw_imm_d(val->i32[0])) :
> > > > > > get_nir_src(src);
> > > > > > 
> > > > > > else
> > > > > > 
> > > > > >    return val ? fs_reg(brw_imm_df(val->f64[0])) :
> > > > > > get_nir_src(src);
> > > > > > 
> > > > > 
> > > > > Because double immediates only really work on BDW+ and I
> > > > > didn't want someone to call this function and get tripped up
> > > > > by that.
> > > > 
> > > > Ok, fair enough. In  that case, maybe I'd suggest to clarify
> > > > this in the comment, since otherwise it is a bit confusing (or
> > > > maybe assert on the generation rather than the bitsize since
> > > > that would be more self-explanatory).
> > > 
> > > I'm not really clear on what you're asking for.  Do you want it
> > > to work for 64-bit immediates and just assert on gen7?  Or do you
> > > want the comment to be more clear?
> > 
> > I think either of them would be fine. My issue is with the comment
> > where it says that this cannot possibly work with 64-bit immediates
> > in general, because that is not true for gen8+.
> > I am fine with a change in the comment clarifying that such thing
> > would only work for gen8+ but we decide to only support 32-bit
> > paths. I'd also be fine if we decided to support gen8+ and just
> > assert for gen7 but I am not asking that you do that if you prefer
> > to keep things 32-bit only for this.
> 
> Alright, I'll update the comment.  How about:
> This function should not be called on any value which may be 64
> bits.  We could theoretically support 64-bit on gen8+ but we choose
> not to because it wouldn't work in general (no gen7 support) and
> there are enough restrictions in 64-bit immediates that you can't
> take the return value and treat it the same as the result of 

That looks good to me, thanks!
Iago
> > > > >  
> > > > > > > +   assert(nir_src_bit_size(src) == 32);
> > > > > > 
> > > > > > >     return val ? fs_reg(brw_imm_d(val->i32[0])) :
> > > > > > get_nir_src(src);
> > > > > > 
> > > > > > >  }
> > > > > > 
> > > > > > >  
> > > > > > 
> > > > > > > @@ -2648,8 +2660,7 @@
> > > > > > fs_visitor::nir_emit_tcs_intrinsic(const
> > > > > > 
> > > > > > > fs_builder &bld,
> > > > > > 
> > > > > > >                  */
> > > > > > 
> > > > > > >                 unsigned channel = iter * 2 + i;
> > > > > > 
> > > > > > >                 fs_reg dest =
> > > > > > shuffle_64bit_data_for_32bit_write(bld,
> > > > > > 
> > > > > > > -                  retype(offset(value, bld, 2 *
> > > > > > channel),
> > > > > > 
> > > > > > > BRW_REGISTER_TYPE_DF),
> > > > > > 
> > > > > > > -                  1);
> > > > > > 
> > > > > > > +                  offset(value, bld, channel), 1);
> > > > > > 
> > > > > > >  
> > > > > > 
> > > > > > >                 srcs[header_regs + (i + first_component)
> > > > > > * 2] = dest;
> > > > > > 
> > > > > > >                 srcs[header_regs + (i + first_component)
> > > > > > * 2 + 1] =
> > > > > > 
> > > > > > > @@ -3505,8 +3516,7 @@
> > > > > > fs_visitor::nir_emit_cs_intrinsic(const
> > > > > > 
> > > > > > > fs_builder &bld,
> > > > > > 
> > > > > > >        if (nir_src_bit_size(instr->src[0]) == 64) {
> > > > > > 
> > > > > > >           type_size = 8;
> > > > > > 
> > > > > > >           val_reg =
> > > > > > shuffle_64bit_data_for_32bit_write(bld,
> > > > > > 
> > > > > > > -            retype(val_reg, BRW_REGISTER_TYPE_DF),
> > > > > > 
> > > > > > > -            instr->num_components);
> > > > > > 
> > > > > > > +            val_reg, instr->num_components);
> > > > > > 
> > > > > > >        }
> > > > > > 
> > > > > > >  
> > > > > > 
> > > > > > >        unsigned type_slots = type_size / 4;
> > > > > > 
> > > > > > > @@ -4005,8 +4015,7 @@
> > > > > > fs_visitor::nir_emit_intrinsic(const fs_builder
> > > > > > 
> > > > > > > &bld, nir_intrinsic_instr *instr
> > > > > > 
> > > > > > >        if (nir_src_bit_size(instr->src[0]) == 64) {
> > > > > > 
> > > > > > >           type_size = 8;
> > > > > > 
> > > > > > >           val_reg =
> > > > > > shuffle_64bit_data_for_32bit_write(bld,
> > > > > > 
> > > > > > > -            retype(val_reg, BRW_REGISTER_TYPE_DF),
> > > > > > 
> > > > > > > -            instr->num_components);
> > > > > > 
> > > > > > > +            val_reg, instr->num_components);
> > > > > > 
> > > > > > >        }
> > > > > > 
> > > > > > >  
> > > > > > 
> > > > > > >        unsigned type_slots = type_size / 4;
> > > > > > 
> > > > > > > @@ -4063,7 +4072,7 @@
> > > > > > fs_visitor::nir_emit_intrinsic(const fs_builder
> > > > > > 
> > > > > > > &bld, nir_intrinsic_instr *instr
> > > > > > 
> > > > > > >        unsigned first_component =
> > > > > > nir_intrinsic_component(instr);
> > > > > > 
> > > > > > >        if (nir_src_bit_size(instr->src[0]) == 64) {
> > > > > > 
> > > > > > >           fs_reg tmp =
> > > > > > shuffle_64bit_data_for_32bit_write(bld,
> > > > > > 
> > > > > > > -            retype(src, BRW_REGISTER_TYPE_DF),
> > > > > > num_components);
> > > > > > 
> > > > > > > +            src, num_components);
> > > > > > 
> > > > > > >           src = tmp;
> > > > > > 
> > > > > > >           num_components *= 2;
> > > > > > 
> > > > > > >        }
> > > > > > 
> > > > > > 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20171031/de692ea4/attachment.html>