[Mesa-dev] [PATCH v3 37/43] i965/fs: Enable 16-bit render target write on SKL and CHV

Jose Maria Casanova Crespo jmcasanova at igalia.com
Thu Oct 12 18:38:26 UTC 2017


Now that the infrastructure to support Render Target Messages with a
16-bit payload is available, this patch enables it on SKL and CHV platforms.

Enabling it allows 16-bit payloads that use half the register space on
SIMD16, and avoids the spurious conversion from 16-bit to 32-bit needed
on BDW, where the data is then converted back to 16-bit.

CHV has no support for UINT formats, so for those outputs the
half-precision data format is not enabled and the 32-bit payload
fallback is used.

From PRM CHV, vol 07, section "Pixel Data Port" page 260:

"Half Precision Render Target Write messages do not support UNIT
formats." where UNIT is a typo for UINT.

v2: Removed use of stride = 2 on sources (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova at igalia.com>
Signed-off-by: Eduardo Lima <elima at igalia.com>
---
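Note for reviewers: the platform check added in nir_emit_fs_intrinsic
can be read as the standalone predicate below. This is only a sketch
for illustration, not driver code. DevInfo, RegType and use_hp_rtw are
hypothetical stand-ins for gen_device_info, brw_reg_type and the
enable_hp_rtw expression, and only the fields used by the check are
modelled.

```cpp
// Hypothetical stand-in for brw_reg_type; only a few 16-bit types listed.
enum RegType { TYPE_HF, TYPE_W, TYPE_UW };

// Hypothetical stand-in for gen_device_info.
struct DevInfo {
   int gen;
   bool is_cherryview;
};

// Mirrors the enable_hp_rtw expression from this patch:
// half-precision RTW is used for 16-bit outputs on gen9+, and on
// Cherryview unless the output is an unsigned 16-bit integer (UW).
static bool use_hp_rtw(const DevInfo &dev, bool is_16bit, RegType out_type)
{
   if (!is_16bit)
      return false;
   if (dev.gen >= 9)
      return true;
   return dev.is_cherryview && out_type != TYPE_UW;
}
```

With this reading, Broadwell (gen 8, not Cherryview) always takes the
32-bit payload fallback, and Cherryview takes it only for UW outputs.
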
 src/intel/compiler/brw_fs_nir.cpp | 46 +++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 3dbdcc0955..2d0b3e139e 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -55,19 +55,24 @@ fs_visitor::nir_setup_outputs()
       return;
 
    if (stage == MESA_SHADER_FRAGMENT) {
-      /*
+      /* On HW that doesn't support half-precision render-target-write
+       * messages (e.g, some gen8 HW like Broadwell), we need a workaround
+       * to support 16-bit outputs from pixel shaders.
+       *
        * The following code uses the outputs map to save the variable's
        * original output type, so later we can retrieve it and retype
        * the output accordingly while emitting the FS 16-bit outputs.
        */
-      nir_foreach_variable(var, &nir->outputs) {
-         const enum glsl_base_type base_type =
-            glsl_get_base_type(var->type->without_array());
-
-         if (glsl_base_type_is_16bit(base_type)) {
-            outputs[var->data.driver_location] =
-               retype(outputs[var->data.driver_location],
-                      brw_type_for_base_type(var->type));
+      if (devinfo->gen == 8) {
+         nir_foreach_variable(var, &nir->outputs) {
+            const enum glsl_base_type base_type =
+               glsl_get_base_type(var->type->without_array());
+
+            if (glsl_base_type_is_16bit(base_type)) {
+               outputs[var->data.driver_location] =
+                  retype(outputs[var->data.driver_location],
+                         brw_type_for_base_type(var->type));
+            }
          }
       }
       return;
@@ -3246,14 +3251,27 @@ fs_visitor::nir_emit_fs_intrinsic(const fs_builder &bld,
       const unsigned location = nir_intrinsic_base(instr) +
          SET_FIELD(const_offset->u32[0], BRW_NIR_FRAG_OUTPUT_LOCATION);
 
+      /* This flag discriminates HW where we have support for half-precision
+       * render target write messages (aka, the data-format bit), so 16-bit
+       * render target payloads can be used. It is available since skylake
+       * and cherryview. In the case of cherryview there is no support for
+       * UINT formats.
+       */
+      bool enable_hp_rtw = is_16bit &&
+         (devinfo->gen >= 9 || (devinfo->is_cherryview &&
+                                outputs[location].type != BRW_REGISTER_TYPE_UW));
+
       if (is_16bit) {
-         /* The outputs[location] should already have the original output type
-          * stored from nir_setup_outputs.
+         /* outputs[location] should already have the original output type
+          * stored from nir_setup_outputs, in case the HW doesn't support
+          * half-precision RTW messages.
+          * If HP RTW is enabled we just use HF to copy 16-bit values.
           */
-         src = retype(src, outputs[location].type);
+         src = retype(src, enable_hp_rtw ?
+                      BRW_REGISTER_TYPE_HF : outputs[location].type);
       }
 
-      fs_reg new_dest = retype(alloc_frag_output(this, location, false),
+      fs_reg new_dest = retype(alloc_frag_output(this, location, enable_hp_rtw),
                                src.type);
 
       /* This is a workaround to support 16-bits outputs on HW that doesn't
@@ -3263,7 +3281,7 @@ fs_visitor::nir_emit_fs_intrinsic(const fs_builder &bld,
        * render target with a 16-bit surface format will force the correct
        * conversion of the 32-bit output values to 16-bit.
        */
-      if (is_16bit) {
+      if (is_16bit && !enable_hp_rtw) {
          new_dest.type = brw_reg_type_from_bit_size(32, src.type);
       }
       for (unsigned j = 0; j < instr->num_components; j++)
-- 
2.13.6


