[Mesa-dev] [PATCH 1/3] i965/vec4: fix vertical stride to avoid breaking region parameter rule
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Wed Apr 26 11:57:55 UTC 2017
>From IVB PRM, vol4, part3, "General Restrictions on Regioning
Parameters":
"If ExecSize = Width and HorzStride ≠ 0, VertStride must
be set to Width * HorzStride."
In next patch, we are going to modify the region parameter for
uniforms and vgrf. For uniforms that are the source of
DF align1 instructions, they will have <0, 4, 1> regioning and
the execsize for those instructions will be 4, so they will break
the regioning rule. This will be the same for VGRF sources where
we use the vstride == 0 exploit.
As we know we are not going to cross the GRF boundary with that
execsize and parameters (not even with the exploit), we just fix
the vstride here.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Cc: "17.1" <mesa-stable at lists.freedesktop.org>
---
src/intel/compiler/brw_reg.h | 15 +++++++++++++++
src/intel/compiler/brw_vec4.cpp | 19 +++++++++++++++++++
2 files changed, 34 insertions(+)
diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 17a51fbd655..24e09a84fce 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -914,6 +914,21 @@ static inline unsigned cvt(unsigned val)
return 0;
}
+static inline unsigned inv_cvt(unsigned val)
+{
+ switch (val) {
+ case 0: return 0;
+ case 1: return 1;
+ case 2: return 2;
+ case 3: return 4;
+ case 4: return 8;
+ case 5: return 16;
+ case 6: return 32;
+ }
+ return 0;
+}
+
+
static inline struct brw_reg
stride(struct brw_reg reg, unsigned vstride, unsigned width, unsigned hstride)
{
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index f9b805ea5a9..95f96ea69c0 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -38,6 +38,8 @@ using namespace brw;
namespace brw {
+static bool is_align1_df(vec4_instruction *inst);
+
void
src_reg::init()
{
@@ -2049,6 +2051,23 @@ vec4_visitor::convert_to_hw_regs()
apply_logical_swizzle(®, inst, i);
src = reg;
+
+ /* From IVB PRM, vol4, part3, "General Restrictions on Regioning
+ * Parameters":
+ *
+ * "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set
+ * to Width * HorzStride."
+ *
+ * We can break this rule with DF sources on DF align1
+ * instructions, because the exec_size would be 4 and width is 4.
+ * As we know we are not accessing to next GRF, it is safe to
+ * set vstride to the formula given by the rule itself.
+ */
+ if (is_align1_df(inst) && inst->exec_size == inv_cvt(src.width + 1)) {
+ const unsigned width = inv_cvt(src.width + 1);
+ const unsigned hstride = inv_cvt(src.hstride);
+ src.vstride = cvt(width * hstride);
+ }
}
if (inst->is_3src(devinfo)) {
--
2.11.0
More information about the mesa-dev
mailing list