[Mesa-dev] [PATCH] i965/fs: Follow pow(16) instructions with a NOP.
Matt Turner
mattst88 at gmail.com
Tue May 3 06:32:13 UTC 2016
Since commit 7b208a73, Unigine Valley hangs the GPU on Gen >= 8
platforms. This patch avoids the GPU hangs, but does not implement a
full workaround for the restriction: dispatch_width == 16 is an
imperfect proxy for "the next instruction requires two destination
registers".

Evidently that commit allowed the scheduler to make different choices
that finally ran afoul of a hardware bug in which POW and FDIV
instructions may not be followed by an instruction with two destination
registers (including compressed instructions). I presume the conditions
are more complex than that, but the internal hardware bug report (BDWGFX
bug_de 1696294) does not contain much more information.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94924
---
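For reference, a more precise test than dispatch_width == 16 would look
at the instruction that actually follows the POW. The sketch below is
only an illustration under stated assumptions: the Inst struct, the
Opcode enum, and the helper names are hypothetical and are not Mesa's
IR. It assumes 32-byte GRFs, a register-aligned destination, and that
"requires two destination registers" means the destination region spans
more than one GRF. None of this is confirmed by the hardware bug report.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the compiler's IR.
// None of these names exist in Mesa.
enum class Opcode { POW, FDIV, ADD, MOV, NOP };

struct Inst {
   Opcode op;
   unsigned exec_size;      // channels executed, e.g. 8 or 16
   unsigned dst_type_bytes; // bytes per destination element, 0 if none (NOP)
   unsigned dst_stride;     // horizontal stride of the destination, in elements
};

// Assumption: one GRF is 32 bytes (256 bits), as on Gen8.
constexpr unsigned REG_SIZE = 32;

// Number of GRFs the destination region spans, assuming it starts on a
// register boundary.
static unsigned dst_regs_written(const Inst &inst)
{
   const unsigned bytes =
      inst.exec_size * inst.dst_type_bytes * inst.dst_stride;
   return (bytes + REG_SIZE - 1) / REG_SIZE;
}

static bool is_pow_or_fdiv(const Inst &inst)
{
   return inst.op == Opcode::POW || inst.op == Opcode::FDIV;
}

// Returns a copy of prog with a NOP inserted after every POW/FDIV whose
// immediate successor writes two or more destination registers.
static std::vector<Inst> insert_nops(const std::vector<Inst> &prog)
{
   std::vector<Inst> out;
   for (std::size_t i = 0; i < prog.size(); i++) {
      out.push_back(prog[i]);
      if (is_pow_or_fdiv(prog[i]) && i + 1 < prog.size() &&
          dst_regs_written(prog[i + 1]) >= 2) {
         out.push_back(Inst{Opcode::NOP, 8, 0, 1});
      }
   }
   return out;
}
```

Under this model a SIMD16 float ADD after a POW gets a NOP, a SIMD16
instruction with a 16-bit destination does not, and a SIMD8 instruction
with a 64-bit destination does. The last two are cases where checking
dispatch_width == 16 alone gives a different answer.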
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 8654ca4..2b3544b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -2086,6 +2086,19 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
          assert(inst->conditional_mod == BRW_CONDITIONAL_NONE);
          if (devinfo->gen >= 7 && inst->opcode == SHADER_OPCODE_POW) {
             gen6_math(p, dst, brw_math_function(inst->opcode), src[0], src[1]);
+
+            /* From the Broadwell PRM, Volume 7, "3D-Media-GPGPU", in the
+             * "Register Region Restrictions" section: for BDW, SKL:
+             *
+             *    "A POW/FDIV operation must not be followed by an instruction
+             *     that requires two destination registers."
+             *
+             * The documentation is often lacking annotations for Atom parts,
+             * and empirically this affects CHV as well.
+             */
+            if (devinfo->gen >= 8 && dispatch_width == 16) {
+               brw_NOP(p);
+            }
          } else if (devinfo->gen >= 6) {
             generate_math_gen6(inst, dst, src[0], src[1]);
          } else {
--
2.7.3