[Mesa-dev] [PATCH 34/59] intel/compiler: fix ddy for half-float in gen8

Iago Toral Quiroga itoral at igalia.com
Tue Dec 4 07:16:58 UTC 2018


We use Align16 mode for this, since it is more convenient, but the PRM
for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register region
restrictions', Section '1. Special Restrictions':

   "In Align16 mode, the channel selects and channel enables apply to a
    pair of half-floats, because these parameters are defined for DWord
    elements ONLY. This is applicable when both source and destination
    are half-floats."

This means that we cannot select individual HF elements using swizzles
like we do with 32-bit floats, so we can't implement the regioning
required for this.

Use the gen11 path for this instead, which uses Align1 mode.

The restriction is not present in gen9 or gen10, where the Align16
implementation seems to work just fine.
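
To illustrate the restriction quoted above, here is a toy model in plain
C++. It is not Mesa or hardware code: apply_swizzle() and ddy_fine_model()
are hypothetical helpers, integers stand in for channel values, and the
XYXY/ZWZW swizzles and the sign of the difference are assumptions chosen
for the sketch. With unit == 1 the swizzle selects individual elements;
with unit == 2 it selects pairs of consecutive elements, which is how the
PRM text describes half-floats in gen8 Align16.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Toy model (assumption, not Mesa code): apply a 4-component swizzle where
// each component covers `unit` consecutive elements. unit == 1 models
// per-element selection; unit == 2 models selection in DWord units, each
// covering a pair of half-floats.
std::vector<int> apply_swizzle(const std::vector<int> &src,
                               const std::array<int, 4> &swz,
                               std::size_t unit)
{
   const std::size_t group = 4 * unit;
   std::vector<int> out(src.size());
   for (std::size_t base = 0; base + group <= src.size(); base += group)
      for (std::size_t c = 0; c < 4; c++)
         for (std::size_t u = 0; u < unit; u++)
            out[base + c * unit + u] = src[base + swz[c] * unit + u];
   return out;
}

// Model of a fine ddy: bottom row minus top row of each 2x2 subspan, with
// channels laid out as {top-left, top-right, bottom-left, bottom-right}.
std::vector<int> ddy_fine_model(const std::vector<int> &src, std::size_t unit)
{
   const std::array<int, 4> xyxy = {0, 1, 0, 1}; // top row, replicated
   const std::array<int, 4> zwzw = {2, 3, 2, 3}; // bottom row, replicated
   const std::vector<int> top = apply_swizzle(src, xyxy, unit);
   const std::vector<int> bottom = apply_swizzle(src, zwzw, unit);
   std::vector<int> out(src.size());
   for (std::size_t i = 0; i < src.size(); i++)
      out[i] = bottom[i] - top[i];
   return out;
}
```

For two subspans {0, 1, 10, 11} and {100, 101, 110, 111}, the per-element
model yields the row difference 10 in every channel, while the paired
model yields 100 in every channel because it subtracts the first subspan
from the second instead of subtracting rows within each subspan.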
---
 src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index d8e4bae17e0..ba7ed07e692 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1281,8 +1281,14 @@ fs_generator::generate_ddy(const fs_inst *inst,
    const uint32_t type_size = type_sz(src.type);
 
    if (inst->opcode == FS_OPCODE_DDY_FINE) {
-      /* produce accurate derivatives */
-      if (devinfo->gen >= 11) {
+      /* Produce accurate derivatives. This is easy to do in Align16 mode,
+       * but Align16 is not supported in gen11+, and in gen8 Align16
+       * swizzles for half-float operands work in units of 32 bits and
+       * always select pairs of consecutive half-float elements, so we
+       * can't use it for this.
+       */
+      if (devinfo->gen >= 11 ||
+          (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF)) {
          src = stride(src, 0, 2, 1);
          struct brw_reg src_0  = byte_offset(src,  0 * type_size);
          struct brw_reg src_2  = byte_offset(src,  2 * type_size);
-- 
2.17.1


