[Mesa-dev] [PATCH 4/4] r600: partly fix sampleMaskIn value

sroland at vmware.com sroland at vmware.com
Sun Feb 4 23:01:06 UTC 2018


From: Roland Scheidegger <sroland at vmware.com>

The hw gives us coverage for the whole pixel, not for individual fragment
shader invocations, when execution isn't per pixel. (Note that evergreen
(eg), unlike cayman (cm), cannot actually do "real" minSampleShading: it's
either per-pixel or per-fragment. That doesn't really make a difference
here, though.)
Also, with msaa disabled, the hw still gives us a mask corresponding to the
number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding to
the sampleID, if we know this shader is always executed at per-sample
granularity. (For a per-sample frequency shader with msaa disabled, the
sampleID will always be 0, so this works just fine there too.)
Fixing this for the minSampleShading case will require a shader key
(radeonsi uses the prolog part for this). On eg a single bit would be
enough; cm would need either more bits, depending on the sample/invocation
ratio, or would have to read the bits from a uniform. The alternative is to
always use a sample mask uniform, which is probably not a good idea, as it
would make the ordinary common msaa case slower for no good reason.
This fixes some parts of piglit arb_sample_shading-samplemask (needs a
fixed test), in particular those which use a sampleID, while still failing
others as expected.
---
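Note (not part of the commit): the fixup described above boils down to a
shift and an AND. The following is a minimal CPU-side sketch of that
arithmetic, assuming the hw coverage mask and the sample ID are available
as plain integers. The function name and parameters are hypothetical and
exist only for illustration; the patch itself emits the equivalent
LSHL_INT and AND_INT ALU instructions into the shader bytecode.

```c
#include <stdint.h>

/*
 * Hypothetical model of the sample mask fixup (not a Mesa function).
 *
 * pixel_coverage: coverage mask as delivered by the hw, one bit per sample
 *                 of the whole pixel.
 * sample_id:      ID of the sample this shader invocation runs for.
 *                 Assumed to be < 32 so the shift is well defined in C.
 *
 * Returns the coverage restricted to this invocation's own sample.
 */
uint32_t fixup_sample_mask_in(uint32_t pixel_coverage, uint32_t sample_id)
{
	/* Corresponds to the LSHL_INT: 1 << sampleID. */
	uint32_t invocation_bit = UINT32_C(1) << sample_id;

	/* Corresponds to the AND_INT with the hw-provided mask. */
	return pixel_coverage & invocation_bit;
}
```

With a fully covered 4x pixel (mask 0xf) the invocation for sample 2 ends
up with 0x4. With msaa disabled the sample ID is always 0, so whatever mask
the hw reports collapses to bit 0 only, which is what GL asks for.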
 src/gallium/drivers/r600/r600_shader.c | 54 ++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 1009411c62..8779f166aa 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -1138,6 +1138,11 @@ static int allocate_system_value_inputs(struct r600_shader_ctx *ctx, int gpr_off
 
 	tgsi_parse_free(&parse);
 
+	if (ctx->info.reads_samplemask &&
+	    (ctx->info.uses_linear_sample || ctx->info.uses_persp_sample)) {
+		inputs[1].enabled = true;
+	}
+
 	if (ctx->bc->chip_class >= EVERGREEN) {
 		int num_baryc = 0;
 		/* assign gpr to each interpolator according to priority */
@@ -3503,8 +3508,57 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 			r = eg_load_helper_invocation(&ctx);
 		if (r)
 			return r;
+	}
+
+	/*
+	 * XXX this relies on fixed_pt_position_gpr only being present when
+	 * this shader should be executed per sample. Should be the case for now...
+	 */
+	if (ctx.fixed_pt_position_gpr != -1 && ctx.info.reads_samplemask) {
+		/*
+		 * Fix up sample mask. The hw always gives us coverage mask for
+		 * the pixel. However, for per-sample shading, we need the
+		 * coverage for the shader invocation only.
+		 * Also, with disabled msaa, only the first bit should be set
+		 * (luckily the same fixup works for both problems).
+		 * For now, we can only do it if we know this shader is always
+		 * executed per sample (due to usage of bits in the shader
+		 * forcing per-sample execution).
+		 * If the fb is not multisampled, we'd do unnecessary work but
+		 * it should still be correct.
+		 * It will however do nothing for sample shading according
+		 * to MinSampleShading.
+		 */
+		struct r600_bytecode_alu alu;
+		int tmp = r600_get_temp(&ctx);
+		assert(ctx.face_gpr != -1);
+		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+
+		alu.op = ALU_OP2_LSHL_INT;
+		alu.src[0].sel = V_SQ_ALU_SRC_LITERAL;
+		alu.src[0].value = 0x1;
+		alu.src[1].sel = ctx.fixed_pt_position_gpr;
+		alu.src[1].chan = 3;
+		alu.dst.sel = tmp;
+		alu.dst.chan = 0;
+		alu.dst.write = 1;
+		alu.last = 1;
+		if ((r = r600_bytecode_add_alu(ctx.bc, &alu)))
+			return r;
 
+		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+		alu.op = ALU_OP2_AND_INT;
+		alu.src[0].sel = tmp;
+		alu.src[1].sel = ctx.face_gpr;
+		alu.src[1].chan = 2;
+		alu.dst.sel = ctx.face_gpr;
+		alu.dst.chan = 2;
+		alu.dst.write = 1;
+		alu.last = 1;
+		if ((r = r600_bytecode_add_alu(ctx.bc, &alu)))
+			return r;
 	}
+
 	if (ctx.fragcoord_input >= 0) {
 		if (ctx.bc->chip_class == CAYMAN) {
 			for (j = 0 ; j < 4; j++) {
-- 
2.12.3


