On 30 June 2012 11:50, Olivier Galibert <span dir="ltr"><<a href="mailto:galibert@pobox.com" target="_blank">galibert@pobox.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
There was... confusion about which register goes where. With that<br>
patch urb_setup is in line with the vue setup, even when these<br>
annoying backcolor slots are used. And in addition the stray mov into<br>
lalaland is avoided when only one of the front/back slots is used and<br>
the backface is looking at you. The code instead picks whatever slot<br>
was written to by the vertex shader. That makes most of the generated<br>
piglit tests useless to test the backface selection though.<br></blockquote><div><br>It looks like this patch does three separate things:<br>
<br>
1. modify brw_fs.cpp and brw_wm_pass2.c to properly account for the fact
that prior to Gen6, the SF stage of the pipeline doesn't remove
backfacing colors from the URB entry, so the fragment shader compiler
needs to skip over them when interpreting the contents of the thread
payload.<br>
<br>
2. rework the do_twoside_color() function so that it's easier to follow,
and doesn't emit unnecessary code when there's nothing to do.<br>
<br>
3. ensure that if the vertex program writes to gl_BackColor but not
gl_FrontColor, then the gl_BackColor value shows up in the fragment
shader, regardless of whether the polygon is front-facing or
back-facing.<br>
<br>
Can you split this into three separate patches? That will make it
easier to troubleshoot in case we find bugs with these patches in the
future.<br>
<br>
Also, I'm not convinced that #3 is necessary. Is there something in the
spec that dictates this behaviour? My reading of the spec is that if
the vertex shader writes to gl_BackColor but not glFrontColor, then the
contents of gl_Color in the fragment shader is undefined.<br>
<br>
If we *do* decide that #3 is necessary, then I think a better way to
accomplish it is to handle it in the GLSL vertex shader front-end, by
replacing gl_BackColor with gl_FrontColor in cases where gl_FrontColor
is not written to. That way our special case code to handle this
situation would be in just one place, rather than in three places (both
fragment shader back-ends, and the SF program). Also then the fix would
apply to all hardware, not just Intel Gen4-5.<br>
<br>Finally, I couldn't figure out what you meant by "the stray mov into lalaland". Can you elaborate on which piece of code used to generate that stray mov, and why it doesn't anymore? Thanks.<br><br>
I have a few additional comments below.<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Signed-off-by: Olivier Galibert <<a href="mailto:galibert@pobox.com">galibert@pobox.com</a>><br>
---<br>
src/mesa/drivers/dri/i965/brw_fs.cpp | 18 +++++-<br>
src/mesa/drivers/dri/i965/brw_sf.c | 3 +-<br>
src/mesa/drivers/dri/i965/brw_sf_emit.c | 93 +++++++++++++++++-------------<br>
src/mesa/drivers/dri/i965/brw_wm_pass2.c | 19 +++++-<br>
4 files changed, 89 insertions(+), 44 deletions(-)<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
index 6cef08a..710f2ff 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
@@ -721,8 +721,24 @@ fs_visitor::calculate_urb_setup()<br>
if (c->key.vp_outputs_written & BITFIELD64_BIT(i)) {<br>
int fp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);<br>
<br>
+ /* Special case: two-sided vertex option, vertex program<br>
+ * only writes to the back color. Map it to the<br>
+ * associated front color location.<br>
+ */<br>
+ if (i >= VERT_RESULT_BFC0 && i <= VERT_RESULT_BFC1 &&<br>
+ ctx->VertexProgram._TwoSideEnabled &&<br>
+ urb_setup[i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0] == -1)<br>
+ fp_index = i - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;<br>
+<br>
+ /* The back color slot is skipped when the front color is<br>
+ * also written to. In addition, some slots can be<br>
+ * written in the vertex shader and not read in the<br>
+ * fragment shader. So the register number must always be<br>
+ * incremented, mapped or not.<br>
+ */<br>
if (fp_index >= 0)<br>
- urb_setup[fp_index] = urb_next++;<br>
+ urb_setup[fp_index] = urb_next;<br>
+ urb_next++;<br>
}<br>
}<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c<br>
index 23a874a..7867ab5 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_sf.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_sf.c<br>
@@ -192,7 +192,8 @@ brw_upload_sf_prog(struct brw_context *brw)<br>
<br>
/* _NEW_LIGHT */<br>
key.do_flat_shading = (ctx->Light.ShadeModel == GL_FLAT);<br>
- key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide);<br>
+ key.do_twoside_color = (ctx->Light.Enabled && ctx->Light.Model.TwoSide) ||<br>
+ ctx->VertexProgram._TwoSideEnabled;<br></blockquote><div><br>You're right that this needs fixing, but I think it should just be<br><br>key.do_twoside_color = ctx->VertexProgram._TwoSideEnabled;<br><br>My reasoning is: _TwoSideEnabled is derived state. It is set by src/mesa/main/state.c's update_twoside() function to either (a) ctx->VertexProgram.TwoSideEnabled, if there is a vertex program, or (b) ctx->Light.Enabled && ctx->Light.Model.TwoSide, if there isn't. Since it already accounts correctly for both ways the user can enable two-sided lighting, there's no reason for us to consult ctx->Light.Model.TwoSide in the driver.<br>
<br>Note: brw_clip.c has some references to Light.Model.TwoSide as well. I suspect they are also wrong, but I'm not exactly sure what the correct code is. I suppose we should address that in a later patch :)<br> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
/* _NEW_POLYGON */<br>
if (key.do_twoside_color) {<br>
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c<br>
index ff6383b..9d8aa38 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c<br>
@@ -79,24 +79,9 @@ have_attr(struct brw_sf_compile *c, GLuint attr)<br>
/***********************************************************************<br>
* Twoside lighting<br>
*/<br>
-static void copy_bfc( struct brw_sf_compile *c,<br>
- struct brw_reg vert )<br>
-{<br>
- struct brw_compile *p = &c->func;<br>
- GLuint i;<br>
-<br>
- for (i = 0; i < 2; i++) {<br>
- if (have_attr(c, VERT_RESULT_COL0+i) &&<br>
- have_attr(c, VERT_RESULT_BFC0+i))<br>
- brw_MOV(p,<br>
- get_vert_result(c, vert, VERT_RESULT_COL0+i),<br>
- get_vert_result(c, vert, VERT_RESULT_BFC0+i));<br>
- }<br>
-}<br>
-<br>
-<br>
static void do_twoside_color( struct brw_sf_compile *c )<br>
{<br>
+ GLuint i, need_0, need_1;<br>
struct brw_compile *p = &c->func;<br>
GLuint backface_conditional = c->key.frontface_ccw ? BRW_CONDITIONAL_G : BRW_CONDITIONAL_L;<br>
<br>
@@ -105,12 +90,14 @@ static void do_twoside_color( struct brw_sf_compile *c )<br>
if (c->key.primitive == SF_UNFILLED_TRIS)<br>
return;<br>
<br>
- /* XXX: What happens if BFC isn't present? This could only happen<br>
- * for user-supplied vertex programs, as t_vp_build.c always does<br>
- * the right thing.<br>
+ /* If the vertex shader provides both front and backface color, do<br>
+ * the selection. Otherwise the generated code will pick up<br>
+ * whichever there is.<br>
*/<br>
- if (!(have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0)) &&<br>
- !(have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1)))<br>
+ need_0 = have_attr(c, VERT_RESULT_COL0) && have_attr(c, VERT_RESULT_BFC0);<br>
+ need_1 = have_attr(c, VERT_RESULT_COL1) && have_attr(c, VERT_RESULT_BFC1);<br>
+<br>
+ if (!need_0 && !need_1)<br>
return;<br>
<br>
/* Need to use BRW_EXECUTE_4 and also do an 4-wide compare in order<br>
@@ -121,12 +108,15 @@ static void do_twoside_color( struct brw_sf_compile *c )<br>
brw_push_insn_state(p);<br>
brw_CMP(p, vec4(brw_null_reg()), backface_conditional, c->det, brw_imm_f(0));<br>
brw_IF(p, BRW_EXECUTE_4);<br>
- {<br>
- switch (c->nr_verts) {<br>
- case 3: copy_bfc(c, c->vert[2]);<br>
- case 2: copy_bfc(c, c->vert[1]);<br>
- case 1: copy_bfc(c, c->vert[0]);<br>
- }<br>
+ for (i=0; i<c->nr_verts; i++) {<br>
+ if (need_0)<br>
+ brw_MOV(p,<br>
+ get_vert_result(c, c->vert[i], VERT_RESULT_COL0),<br>
+ get_vert_result(c, c->vert[i], VERT_RESULT_BFC0));<br>
+ if (need_1)<br>
+ brw_MOV(p,<br>
+ get_vert_result(c, c->vert[i], VERT_RESULT_COL1),<br>
+ get_vert_result(c, c->vert[i], VERT_RESULT_BFC1));<br></blockquote><div><br>This is much easier to follow as a loop. Thanks for cleaning this up.<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
}<br>
brw_ENDIF(p);<br>
brw_pop_insn_state(p);<br>
@@ -139,20 +129,27 @@ static void do_twoside_color( struct brw_sf_compile *c )<br>
*/<br>
<br>
#define VERT_RESULT_COLOR_BITS (BITFIELD64_BIT(VERT_RESULT_COL0) | \<br>
- BITFIELD64_BIT(VERT_RESULT_COL1))<br>
+ BITFIELD64_BIT(VERT_RESULT_COL1))<br>
<br>
static void copy_colors( struct brw_sf_compile *c,<br>
struct brw_reg dst,<br>
- struct brw_reg src)<br>
+ struct brw_reg src,<br>
+ int allow_twoside)<br>
{<br>
struct brw_compile *p = &c->func;<br>
GLuint i;<br>
<br>
for (i = VERT_RESULT_COL0; i <= VERT_RESULT_COL1; i++) {<br>
- if (have_attr(c,i))<br>
+ if (have_attr(c,i)) {<br>
brw_MOV(p,<br>
get_vert_result(c, dst, i),<br>
get_vert_result(c, src, i));<br>
+<br>
+ } else if(allow_twoside && have_attr(c, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0)) {<br>
+ brw_MOV(p,<br>
+ get_vert_result(c, dst, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0),<br>
+ get_vert_result(c, src, i - VERT_RESULT_COL0 + VERT_RESULT_BFC0));<br>
+ }<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
}<br>
}<br>
<br>
@@ -167,9 +164,19 @@ static void do_flatshade_triangle( struct brw_sf_compile *c )<br>
struct brw_compile *p = &c->func;<br>
struct intel_context *intel = &p->brw->intel;<br>
struct brw_reg ip = brw_ip_reg();<br>
- GLuint nr = _mesa_bitcount_64(c->key.attrs & VERT_RESULT_COLOR_BITS);<br>
GLuint jmpi = 1;<br>
<br>
+ GLuint nr;<br>
+<br>
+ if (c->key.do_twoside_color) {<br>
+ nr = ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL0) | BITFIELD64_BIT(VERT_RESULT_BFC0))) != 0) +<br>
+ ((c->key.attrs & (BITFIELD64_BIT(VERT_RESULT_COL1) | BITFIELD64_BIT(VERT_RESULT_BFC1))) != 0);<br>
+<br>
+ } else {<br>
+ nr = ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL0)) != 0) +<br>
+ ((c->key.attrs & BITFIELD64_BIT(VERT_RESULT_COL1)) != 0);<br>
+ }<br>
+<br>
if (!nr)<br>
return;<br>
<br>
@@ -186,16 +193,16 @@ static void do_flatshade_triangle( struct brw_sf_compile *c )<br>
brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr*2+1)));<br>
brw_JMPI(p, ip, ip, c->pv);<br>
<br>
- copy_colors(c, c->vert[1], c->vert[0]);<br>
- copy_colors(c, c->vert[2], c->vert[0]);<br>
+ copy_colors(c, c->vert[1], c->vert[0], c->key.do_twoside_color);<br>
+ copy_colors(c, c->vert[2], c->vert[0], c->key.do_twoside_color);<br>
brw_JMPI(p, ip, ip, brw_imm_d(jmpi*(nr*4+1)));<br>
<br>
- copy_colors(c, c->vert[0], c->vert[1]);<br>
- copy_colors(c, c->vert[2], c->vert[1]);<br>
+ copy_colors(c, c->vert[0], c->vert[1], c->key.do_twoside_color);<br>
+ copy_colors(c, c->vert[2], c->vert[1], c->key.do_twoside_color);<br>
brw_JMPI(p, ip, ip, brw_imm_d(jmpi*nr*2));<br>
<br>
- copy_colors(c, c->vert[0], c->vert[2]);<br>
- copy_colors(c, c->vert[1], c->vert[2]);<br>
+ copy_colors(c, c->vert[0], c->vert[2], c->key.do_twoside_color);<br>
+ copy_colors(c, c->vert[1], c->vert[2], c->key.do_twoside_color);<br>
<br>
brw_pop_insn_state(p);<br>
}<br>
@@ -224,10 +231,10 @@ static void do_flatshade_line( struct brw_sf_compile *c )<br>
<br>
brw_MUL(p, c->pv, c->pv, brw_imm_d(jmpi*(nr+1)));<br>
brw_JMPI(p, ip, ip, c->pv);<br>
- copy_colors(c, c->vert[1], c->vert[0]);<br>
+ copy_colors(c, c->vert[1], c->vert[0], 0);<br>
<br>
brw_JMPI(p, ip, ip, brw_imm_ud(jmpi*nr));<br>
- copy_colors(c, c->vert[0], c->vert[1]);<br>
+ copy_colors(c, c->vert[0], c->vert[1], 0);<br>
<br>
brw_pop_insn_state(p);<br>
}<br>
@@ -337,13 +344,17 @@ calculate_masks(struct brw_sf_compile *c,<br>
if (c->key.do_flat_shading)<br>
persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS) |<br>
BITFIELD64_BIT(VERT_RESULT_COL0) |<br>
- BITFIELD64_BIT(VERT_RESULT_COL1));<br>
+ BITFIELD64_BIT(VERT_RESULT_COL1) |<br>
+ BITFIELD64_BIT(VERT_RESULT_BFC0) |<br>
+ BITFIELD64_BIT(VERT_RESULT_BFC1));<br>
else<br>
persp_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_HPOS));<br>
<br>
if (c->key.do_flat_shading)<br>
linear_mask = c->key.attrs & ~(BITFIELD64_BIT(VERT_RESULT_COL0) |<br>
- BITFIELD64_BIT(VERT_RESULT_COL1));<br>
+ BITFIELD64_BIT(VERT_RESULT_COL1) |<br>
+ BITFIELD64_BIT(VERT_RESULT_BFC0) |<br>
+ BITFIELD64_BIT(VERT_RESULT_BFC1));<br>
else<br>
linear_mask = c->key.attrs;<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/brw_wm_pass2.c b/src/mesa/drivers/dri/i965/brw_wm_pass2.c<br>
index 27c0a94..48143f3 100644<br>
--- a/src/mesa/drivers/dri/i965/brw_wm_pass2.c<br>
+++ b/src/mesa/drivers/dri/i965/brw_wm_pass2.c<br>
@@ -96,9 +96,26 @@ static void init_registers( struct brw_wm_compile *c )<br>
if (c->key.vp_outputs_written & BITFIELD64_BIT(j)) {<br>
int fp_index = _mesa_vert_result_to_frag_attrib(j);<br>
<br>
+ /* Special case: two-sided vertex option, vertex program<br>
+ * only writes to the back color. Map it to the<br>
+ * associated front color location.<br>
+ */<br>
+ if (j >= VERT_RESULT_BFC0 && j <= VERT_RESULT_BFC1 &&<br>
+ intel->ctx.VertexProgram._TwoSideEnabled &&<br>
+ !(c->key.vp_outputs_written & BITFIELD64_BIT(j - VERT_RESULT_BFC0 + VERT_RESULT_COL0)))<br>
+ fp_index = j - VERT_RESULT_BFC0 + FRAG_ATTRIB_COL0;<br>
+<br>
nr_interp_regs++;<br>
+<br>
+ /* The back color slot is skipped when the front color is<br>
+ * also written to. In addition, some slots can be<br>
+ * written in the vertex shader and not read in the<br>
+ * fragment shader. So the register number must always be<br>
+ * incremented, mapped or not.<br>
+ */<br>
if (fp_index >= 0)<br>
- prealloc_reg(c, &c->payload.input_interp[fp_index], i++);<br>
+ prealloc_reg(c, &c->payload.input_interp[fp_index], i);<br>
+ i++;<br>
}<br>
}<br>
assert(nr_interp_regs >= 1);<br>
<span class="HOEnZb"><font color="#888888">--<br>
1.7.10.rc3.1.gb306<br>
<br>
_______________________________________________<br>
mesa-dev mailing list<br>
<a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
<a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev" target="_blank">http://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</font></span></blockquote></div><br>