[Mesa-dev] [PATCH 34/59] intel/compiler: fix ddy for half-float in gen8

Pohjolainen, Topi topi.pohjolainen at gmail.com
Fri Dec 7 13:06:17 UTC 2018


On Tue, Dec 04, 2018 at 08:16:58AM +0100, Iago Toral Quiroga wrote:
> We use ALign16 mode for this, since it is more convenient, but the PRM
> for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register region
> restrictions', Section '1. Special Restrictions':
> 
>    "In Align16 mode, the channel selects and channel enables apply to a
>     pair of half-floats, because these parameters are defined for DWord
>     elements ONLY. This is applicable when both source and destination
>     are half-floats."
> 
> This means that we cannot select individual HF elements using swizzles
> like we do with 32-bit floats so we can't implement the required
> regioning for this.
> 
> Use the gen11 path for this instead, which uses Align1 mode.
> 
> The restriction is not present in gen9 of gen10, where the Align16

                                         or?

> implementation seems to work just fine.
> ---
>  src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
> index d8e4bae17e0..ba7ed07e692 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -1281,8 +1281,14 @@ fs_generator::generate_ddy(const fs_inst *inst,
>     const uint32_t type_size = type_sz(src.type);
>  
>     if (inst->opcode == FS_OPCODE_DDY_FINE) {
> -      /* produce accurate derivatives */
> -      if (devinfo->gen >= 11) {
> +      /* produce accurate derivatives. We can do this easily in Align16
> +       * but this is not supported in gen11+ and gen8 Align16 swizzles
> +       * for Half-Float operands work in units of 32-bit and always
> +       * select pairs of consecutive half-float elements, so we can't use
> +       * use it for this.
> +       */
> +      if (devinfo->gen >= 11 ||
> +          (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF)) {
>           src = stride(src, 0, 2, 1);
>           struct brw_reg src_0  = byte_offset(src,  0 * type_size);
>           struct brw_reg src_2  = byte_offset(src,  2 * type_size);
> -- 
> 2.17.1
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list