On 15 December 2011 08:02, Eric Anholt <<a href="mailto:eric@anholt.net">eric@anholt.net</a>> wrote:
<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On Thu, 15 Dec 2011 00:00:49 +0100, Marek Olšák <<a href="mailto:maraeo@gmail.com">maraeo@gmail.com</a>> wrote:
> On Wed, Dec 14, 2011 at 11:25 PM, Paul Berry <<a href="mailto:stereotype441@gmail.com">stereotype441@gmail.com</a>> wrote:
</div><div class="im">> > (c) Do nothing, and rely on programmers to remember that RasterDiscard is an
> > exception to the usual correspondence between dirty bits and substructures
> > of gl_context.
> >
> > I'm really not comfortable with (c) because of the risk of future bugs. I
> > suppose I could be talked into (b) if there's popular support for it, but
> > it's not my favourite, because as I said earlier, I think there are actually
> > a lot of good reasons to think of rasterizer discard as related to transform
> > feedback. My preference is to do (a).
>
> (d) Rework the _NEW_* flags such that they roughly match hardware
> state groups, not OpenGL state groups. Direct3D 11 and Gallium are two
> examples of how it could be done.
</div>The problem is that everyone disagrees on what "hardware state group" a piece of state is in. On i965, rasterizer discard is really in the transform feedback state -- the SOL (transform feedback) unit on gen7, and the GS on gen6.
</blockquote></div>

I have been thinking about this more this morning, and I have an idea for how to accomplish (d) that I think would address this problem. It's not a trivial change, but it's something we could implement incrementally, so we apply it to rasterizer discard now, and over time extend it to cover other pieces of state.

Here's the idea:

The key problem is that there are so many distinct pieces of state that we could never possibly assign a separate bit to each one--we would run out of space in the bitfield.
So instead of having core Mesa decide how they are grouped (and, inevitably, wind up grouping them in a way that doesn't work well for some drivers), let each driver decide how they are grouped. The drivers communicate this grouping to core Mesa by populating a new data structure (at initialization time) called ctx->StateFlags. ctx->StateFlags has an entry for each distinct piece of state, which tells which bits in ctx->NewState should be set when that state changes.

So, for example, in BeginTransformFeedback() and EndTransformFeedback(), instead of doing this:

    FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);

We would do this:

    FLUSH_VERTICES(ctx, ctx->StateFlags->TransformFeedback_Active);

In PauseTransformFeedback() and ResumeTransformFeedback() we would do:

    FLUSH_VERTICES(ctx, ctx->StateFlags->TransformFeedback_Paused);

And in enable.c, when rasterizer discard is turned on or off, we would do:

    FLUSH_VERTICES(ctx, ctx->StateFlags->RasterizerDiscard);

In the i965 driver, where all of these features map to the GS stage of the pipeline, we would initialize TransformFeedback_Active, TransformFeedback_Paused, and RasterizerDiscard all to the same value. In the r600 driver, where rasterizer discard is implemented using face culling, StateFlags->RasterizerDiscard would indicate a totally different bit than those used for transform feedback.

In the short term, we could implement this technique just for rasterizer discard, to address the differences between r600 and i965 that we're discussing in this email thread. In the long term, our goal would be to replace all of the _NEW_* constants with a fine-grained set of values in StateFlags. Once we've done that, each driver can set up StateFlags in a way that precisely matches how state is grouped for that particular piece of hardware.

What do y'all think? If there's support for this idea I'd be glad to make an RFC patch.
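
To make the proposal above a bit more concrete, here is a rough standalone sketch of how a driver-populated StateFlags table might look. It is not Mesa code: every `sketch_`/`SKETCH_` name is hypothetical, the dirty-bit values are made up, and `sketch_flush_vertices()` is only a stand-in for FLUSH_VERTICES (it sets bits in NewState and does not flush any buffered vertices). Only the three field names and the i965/r600 groupings come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-defined dirty bits; a real driver would pick its own. */
enum {
   SKETCH_DIRTY_GS   = 1u << 0, /* i965-like: GS / transform feedback group */
   SKETCH_DIRTY_XFB  = 1u << 1, /* r600-like: transform feedback group */
   SKETCH_DIRTY_CULL = 1u << 2  /* r600-like: face culling group */
};

/* One entry per distinct piece of state: which NewState bits it dirties. */
struct sketch_state_flags {
   uint32_t TransformFeedback_Active;
   uint32_t TransformFeedback_Paused;
   uint32_t RasterizerDiscard;
};

/* Heavily reduced stand-in for gl_context (assumption, not the real layout). */
struct sketch_context {
   uint32_t NewState;
   const struct sketch_state_flags *StateFlags;
   bool RasterizerDiscard;
};

/* i965-like grouping: all three pieces of state share one dirty bit. */
static const struct sketch_state_flags sketch_i965_like_flags = {
   .TransformFeedback_Active = SKETCH_DIRTY_GS,
   .TransformFeedback_Paused = SKETCH_DIRTY_GS,
   .RasterizerDiscard        = SKETCH_DIRTY_GS,
};

/* r600-like grouping: rasterizer discard lives with face culling. */
static const struct sketch_state_flags sketch_r600_like_flags = {
   .TransformFeedback_Active = SKETCH_DIRTY_XFB,
   .TransformFeedback_Paused = SKETCH_DIRTY_XFB,
   .RasterizerDiscard        = SKETCH_DIRTY_CULL,
};

/* Driver hands its table to the core once, at initialization time. */
static void sketch_init_context(struct sketch_context *ctx,
                                const struct sketch_state_flags *flags)
{
   assert(ctx != NULL && flags != NULL);
   ctx->NewState = 0;
   ctx->StateFlags = flags;
   ctx->RasterizerDiscard = false;
}

/* Stand-in for FLUSH_VERTICES(ctx, bits): only records the dirty bits. */
static void sketch_flush_vertices(struct sketch_context *ctx, uint32_t bits)
{
   ctx->NewState |= bits;
}

/* Core-side state change: the core never names a driver bit directly. */
static void sketch_set_rasterizer_discard(struct sketch_context *ctx,
                                          bool enable)
{
   if (ctx->RasterizerDiscard == enable)
      return; /* no change, so nothing is flagged */
   sketch_flush_vertices(ctx, ctx->StateFlags->RasterizerDiscard);
   ctx->RasterizerDiscard = enable;
}
```

With these tables, toggling rasterizer discard would set SKETCH_DIRTY_GS on the i965-like context and SKETCH_DIRTY_CULL on the r600-like one, while the core code path stays identical for both.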