[Mesa-dev] [PATCH 3/4] i965: Fix execution size of scalar TCS barrier setup code.
Kenneth Graunke
kenneth at whitecape.org
Wed Aug 17 14:15:06 UTC 2016
Previously, the scalar TCS backend was generating:
mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted };
and(8) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all 1Q };
shl(8) g17.2<1>UD g17.2<8,8,1>UD 0x0000000bUD { align1 WE_all 1Q };
or(8) g17.2<1>UD g17.2<8,8,1>UD 0x00008200UD { align1 WE_all 1Q };
send(8) null<1>UW g17<8,8,1>UD
gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q };
This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal
region. Not to mention it clobbers 8 channels of data when we only
wanted to touch m0.2.
Instead, we want:
mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted };
and(1) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all };
shl(1) g17.2<1>UD g17.2<0,1,0>UD 0x0000000bUD { align1 WE_all };
or(1) g17.2<1>UD g17.2<0,1,0>UD 0x00008200UD { align1 WE_all };
send(8) null<1>UW g17<8,8,1>UD
gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q };
Using component() accomplishes this.
Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers.
barrier_guarded_read_write_calls on Skylake. Probably fixes other
barrier issues on Gen8+.
Cc: mesa-stable at lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index df033e1..bb254d6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2420,7 +2420,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
break;
fs_reg m0 = bld.vgrf(BRW_REGISTER_TYPE_UD, 1);
- fs_reg m0_2 = byte_offset(m0, 2 * sizeof(uint32_t));
+ fs_reg m0_2 = component(m0, 2);
const fs_builder fwa_bld = bld.exec_all();
--
2.9.3
More information about the mesa-dev
mailing list