[Mesa-dev] [PATCH 34/59] intel/compiler: fix ddy for half-float in gen8

Iago Toral itoral at igalia.com
Fri Dec 7 14:03:41 UTC 2018


On Fri, 2018-12-07 at 15:06 +0200, Pohjolainen, Topi wrote:
> On Tue, Dec 04, 2018 at 08:16:58AM +0100, Iago Toral Quiroga wrote:
> > We use ALign16 mode for this, since it is more convenient, but the
> > PRM
> > for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register
> > region
> > restrictions', Section '1. Special Restrictions':
> > 
> >    "In Align16 mode, the channel selects and channel enables apply
> > to a
> >     pair of half-floats, because these parameters are defined for
> > DWord
> >     elements ONLY. This is applicable when both source and
> > destination
> >     are half-floats."
> > 
> > This means that we cannot select individual HF elements using
> > swizzles
> > like we do with 32-bit floats so we can't implement the required
> > regioning for this.
> > 
> > Use the gen11 path for this instead, which uses Align1 mode.
> > 
> > The restriction is not present in gen9 of gen10, where the Align16
> 
>                                          or?

Right, the issue is exclusive to gen8.

Iago

> > implementation seems to work just fine.
> > ---
> >  src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/intel/compiler/brw_fs_generator.cpp
> > b/src/intel/compiler/brw_fs_generator.cpp
> > index d8e4bae17e0..ba7ed07e692 100644
> > --- a/src/intel/compiler/brw_fs_generator.cpp
> > +++ b/src/intel/compiler/brw_fs_generator.cpp
> > @@ -1281,8 +1281,14 @@ fs_generator::generate_ddy(const fs_inst
> > *inst,
> >     const uint32_t type_size = type_sz(src.type);
> >  
> >     if (inst->opcode == FS_OPCODE_DDY_FINE) {
> > -      /* produce accurate derivatives */
> > -      if (devinfo->gen >= 11) {
> > +      /* produce accurate derivatives. We can do this easily in
> > Align16
> > +       * but this is not supported in gen11+ and gen8 Align16
> > swizzles
> > +       * for Half-Float operands work in units of 32-bit and
> > always
> > +       * select pairs of consecutive half-float elements, so we
> > can't use
> > +       * use it for this.
> > +       */
> > +      if (devinfo->gen >= 11 ||
> > +          (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF))
> > {
> >           src = stride(src, 0, 2, 1);
> >           struct brw_reg src_0  = byte_offset(src,  0 * type_size);
> >           struct brw_reg src_2  = byte_offset(src,  2 * type_size);
> > -- 
> > 2.17.1
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 



More information about the mesa-dev mailing list