[Mesa-dev] [PATCH 34/59] intel/compiler: fix ddy for half-float in gen8
Iago Toral
itoral at igalia.com
Fri Dec 7 14:03:41 UTC 2018
On Fri, 2018-12-07 at 15:06 +0200, Pohjolainen, Topi wrote:
> On Tue, Dec 04, 2018 at 08:16:58AM +0100, Iago Toral Quiroga wrote:
> > We use Align16 mode for this, since it is more convenient, but the PRM
> > for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register region
> > restrictions', Section '1. Special Restrictions':
> >
> > "In Align16 mode, the channel selects and channel enables apply to a
> > pair of half-floats, because these parameters are defined for DWord
> > elements ONLY. This is applicable when both source and destination
> > are half-floats."
> >
> > This means that we cannot select individual HF elements using swizzles
> > like we do with 32-bit floats, so we can't implement the required
> > regioning for this.
> >
> > Use the gen11 path for this instead, which uses Align1 mode.
> >
> > The restriction is not present in gen9 of gen10, where the Align16
>
> or?
Right, the issue is exclusive to gen8.
Iago
> > implementation seems to work just fine.
> > ---
> > src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
> > index d8e4bae17e0..ba7ed07e692 100644
> > --- a/src/intel/compiler/brw_fs_generator.cpp
> > +++ b/src/intel/compiler/brw_fs_generator.cpp
> > @@ -1281,8 +1281,14 @@ fs_generator::generate_ddy(const fs_inst *inst,
> > const uint32_t type_size = type_sz(src.type);
> >
> > if (inst->opcode == FS_OPCODE_DDY_FINE) {
> > - /* produce accurate derivatives */
> > - if (devinfo->gen >= 11) {
> > + /* produce accurate derivatives. We can do this easily in Align16
> > + * but this is not supported in gen11+ and gen8 Align16 swizzles
> > + * for Half-Float operands work in units of 32-bit and always
> > + * select pairs of consecutive half-float elements, so we can't
> > + * use it for this.
> > + */
> > + if (devinfo->gen >= 11 ||
> > + (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF)) {
> > src = stride(src, 0, 2, 1);
> > struct brw_reg src_0 = byte_offset(src, 0 * type_size);
> > struct brw_reg src_2 = byte_offset(src, 2 * type_size);
> > --
> > 2.17.1
> >
>
>
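
To make the restriction discussed above concrete, here is a rough sketch of why a DWord-granular swizzle cannot pick out individual half-float elements. It is a toy model, not Mesa code and not a faithful simulation of the hardware: the names select_per_element, select_per_dword and use_align1_ddy are made up for this illustration, plain ints stand in for the 16-bit values, and the pair-wise behaviour is only my reading of the PRM text quoted above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model only: this is not Mesa code and not a hardware-accurate
// simulation. All names below are hypothetical.

using Swizzle = std::array<std::size_t, 4>;   // one selector per channel

constexpr Swizzle kXYXY = {{0, 1, 0, 1}};
constexpr Swizzle kZWZW = {{2, 3, 2, 3}};

// Per-element select, as with 32-bit floats: channel i reads element swz[i].
std::array<int, 4> select_per_element(const std::array<int, 4> &src,
                                      const Swizzle &swz)
{
   std::array<int, 4> out{};
   for (std::size_t i = 0; i < 4; i++) {
      assert(swz[i] < 4);
      out[i] = src[swz[i]];
   }
   return out;
}

// DWord-granular select, modelling the quoted restriction: src holds eight
// 16-bit values packed as four pairs, and each selector moves a whole pair.
std::array<int, 8> select_per_dword(const std::array<int, 8> &src,
                                    const Swizzle &swz)
{
   std::array<int, 8> out{};
   for (std::size_t i = 0; i < 4; i++) {
      assert(swz[i] < 4);
      out[2 * i] = src[2 * swz[i]];
      out[2 * i + 1] = src[2 * swz[i] + 1];
   }
   return out;
}

// Mirrors the condition in the patch: take the Align1 path on gen11+ and
// for half-float sources on gen8.
bool use_align1_ddy(int gen, bool src_is_half_float)
{
   return gen >= 11 || (gen == 8 && src_is_half_float);
}
```

For a quad holding {10, 11, 20, 21}, select_per_element() with kXYXY yields {10, 11, 10, 11} and with kZWZW yields {20, 21, 20, 21}, so subtracting the two gives the per-column difference. With eight packed values {10, 11, 20, 21, 30, 31, 40, 41}, select_per_dword() with kXYXY yields {10, 11, 20, 21, 10, 11, 20, 21}: the third channel reads 20 where a per-element swizzle would have read 10, which is the mismatch the patch avoids by using the Align1 path.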