[Mesa-dev] [PATCH v3 37/43] i965/fs: Enable 16-bit render target write on SKL and CHV
Jose Maria Casanova Crespo
jmcasanova at igalia.com
Thu Oct 12 18:38:26 UTC 2017
Now that the infrastructure to support Render Target Messages with a
16-bit payload is available, this patch enables it on SKL and CHV
platforms. Enabling it allows a 16-bit payload that uses half of the
registers on SIMD16 and avoids the spurious conversion from 16-bit to
32-bit needed on BDW, where the data is then converted back to 16-bit.
CHV does not support UINT formats with this message, so for those
outputs the half-precision data format is not enabled and the 32-bit
payload fallback is used.
From PRM CHV, vol 07, section "Pixel Data Port" page 260:
"Half Precision Render Target Write messages do not support UNIT
formats." where UNIT is a typo for UINT.
v2: Removed use of stride = 2 on sources (Jason Ekstrand)
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova at igalia.com>
Signed-off-by: Eduardo Lima <elima at igalia.com>
---
src/intel/compiler/brw_fs_nir.cpp | 46 +++++++++++++++++++++++++++------------
1 file changed, 32 insertions(+), 14 deletions(-)
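
Note for reviewers (not part of the commit): the selection logic this
patch adds can be sketched outside the driver roughly as follows. This
is a minimal, standalone illustration under stated assumptions.
DeviceInfo, RegType, use_half_precision_rtw and
payload_type_for_16bit_output are hypothetical stand-ins, not Mesa
types or functions; only the boolean condition is taken from the diff
below. The widening to 32 bits assumes the HF->F, W->D, UW->UD mapping.

```cpp
#include <cassert>

// Hypothetical stand-ins for the driver's register types and device info.
// They exist only for this sketch and are not Mesa's definitions.
enum class RegType { HF, W, UW, F, D, UD };

struct DeviceInfo {
    int gen;             // hardware generation (8 = BDW/CHV, 9 = SKL)
    bool is_cherryview;  // true only for CHV
};

// Same condition as enable_hp_rtw in the patch: gen9+ always qualifies,
// CHV qualifies unless the output is an unsigned 16-bit integer.
static bool use_half_precision_rtw(const DeviceInfo &dev, bool is_16bit,
                                   RegType output_type)
{
    return is_16bit &&
           (dev.gen >= 9 ||
            (dev.is_cherryview && output_type != RegType::UW));
}

// Assumed 16-bit to 32-bit widening used by the fallback path.
static RegType widen_to_32(RegType t)
{
    switch (t) {
    case RegType::HF: return RegType::F;
    case RegType::W:  return RegType::D;
    case RegType::UW: return RegType::UD;
    default:          return t;  // already 32-bit
    }
}

// Payload type for a 16-bit fragment output: HF when the half-precision
// message is usable, otherwise the original type widened to 32 bits.
static RegType payload_type_for_16bit_output(const DeviceInfo &dev,
                                             RegType output_type)
{
    if (use_half_precision_rtw(dev, true, output_type))
        return RegType::HF;
    return widen_to_32(output_type);
}
```

With this model, a UW output on CHV falls back to a UD payload, the
same output on SKL is copied as HF, and BDW always takes the 32-bit
path.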
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 3dbdcc0955..2d0b3e139e 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -55,19 +55,24 @@ fs_visitor::nir_setup_outputs()
return;
if (stage == MESA_SHADER_FRAGMENT) {
- /*
+ /* On HW that doesn't support half-precision render-target-write
+ * messages (e.g., some gen8 HW like Broadwell), we need a workaround
+ * to support 16-bit outputs from pixel shaders.
+ *
* The following code uses the outputs map to save the variable's
* original output type, so later we can retrieve it and retype
* the output accordingly while emitting the FS 16-bit outputs.
*/
- nir_foreach_variable(var, &nir->outputs) {
- const enum glsl_base_type base_type =
- glsl_get_base_type(var->type->without_array());
-
- if (glsl_base_type_is_16bit(base_type)) {
- outputs[var->data.driver_location] =
- retype(outputs[var->data.driver_location],
- brw_type_for_base_type(var->type));
+ if (devinfo->gen == 8) {
+ nir_foreach_variable(var, &nir->outputs) {
+ const enum glsl_base_type base_type =
+ glsl_get_base_type(var->type->without_array());
+
+ if (glsl_base_type_is_16bit(base_type)) {
+ outputs[var->data.driver_location] =
+ retype(outputs[var->data.driver_location],
+ brw_type_for_base_type(var->type));
+ }
}
}
return;
@@ -3246,14 +3251,27 @@ fs_visitor::nir_emit_fs_intrinsic(const fs_builder &bld,
const unsigned location = nir_intrinsic_base(instr) +
SET_FIELD(const_offset->u32[0], BRW_NIR_FRAG_OUTPUT_LOCATION);
+ /* This flag discriminates HW where we have support for half-precision
+ * render target write messages (aka, the data-format bit), so 16-bit
+ * render target payloads can be used. It is available since skylake
+ * and cherryview. In the case of cherryview there is no support for
+ * UINT formats.
+ */
+ bool enable_hp_rtw = is_16bit &&
+ (devinfo->gen >= 9 || (devinfo->is_cherryview &&
+ outputs[location].type != BRW_REGISTER_TYPE_UW));
+
if (is_16bit) {
- /* The outputs[location] should already have the original output type
- * stored from nir_setup_outputs.
+ /* outputs[location] should already have the original output type
+ * stored from nir_setup_outputs, in case the HW doesn't support
+ * half-precision RTW messages.
+ * If HP RTW is enabled we just use HF to copy 16-bit values.
*/
- src = retype(src, outputs[location].type);
+ src = retype(src, enable_hp_rtw ?
+ BRW_REGISTER_TYPE_HF : outputs[location].type);
}
- fs_reg new_dest = retype(alloc_frag_output(this, location, false),
+ fs_reg new_dest = retype(alloc_frag_output(this, location, enable_hp_rtw),
src.type);
/* This is a workaround to support 16-bits outputs on HW that doesn't
@@ -3263,7 +3281,7 @@ fs_visitor::nir_emit_fs_intrinsic(const fs_builder &bld,
* render target with a 16-bit surface format will force the correct
* conversion of the 32-bit output values to 16-bit.
*/
- if (is_16bit) {
+ if (is_16bit && !enable_hp_rtw) {
new_dest.type = brw_reg_type_from_bit_size(32, src.type);
}
for (unsigned j = 0; j < instr->num_components; j++)
--
2.13.6