[Mesa-dev] [PATCH 21/22] i965/gen7: Generalize gen7_vs_state in preparation for GS.
Kenneth Graunke
kenneth at whitecape.org
Thu Sep 5 11:40:36 PDT 2013
On 09/01/2013 07:45 AM, Paul Berry wrote:
> On 29 August 2013 21:31, Kenneth Graunke wrote:
[snip]
> > I definitely don't want to share portions of 3DSTATE_VS.
>
> In an effort to help us make a more informed decision about this, I've
> investigated the following questions:
>
> 1. Are there any historical commits that would have been simpler if we
> had shared code between VS and FS in the way that we're currently
> discussing sharing code between VS and GS? Are there any that would
> have been more complex?
These are good questions to be asking - thanks for looking into this.
> Are there any historical commits where we made
> mistakes because the code wasn't shared (e.g. changed one function but
> forgot to change the other)? I looked just at gen7, and I looked at
> patches since the beginning of 2012. I considered both the code sharing
> that I've proposed in this patch as well as the counter-proposal to just
> share the code for constant emission.
>
> These two commits would have been helped by the code sharing I've
> proposed; they would also have been helped if we just shared code for
> constant emission:
>
> e6893b9 (Set MOCS L3 cacheability for IVB/BYT (v2))
> 2273b65 (Change L3 MOCS of 3DSTATE_CONSTANT_VS/PS)
>
> These two commits would have been helped by the code sharing I've
> proposed, but not by sharing code for constant emission.
>
> decc708 (Upload separate per-stage sampler state tables)
> f5a690c (i965: Split sampler count variable to be per-stage)
>
> In all of the above cases the savings would have been small--one diff
> hunk instead of two.
In the context of the whole MOCS series, the savings are fairly minimal.
We had to set MOCS fields in 10 different places:
brw_emit_vertices
gen6_blorp_emit_vertices
gen7_blorp_emit_surface_state
gen7_blorp_emit_constant_ps
gen7_blorp_emit_depth_stencil_config
gen7_emit_depth_stencil_hiz
gen7_vs_state (1 line)
gen7_wm_state
gen7_update_texture_surface
gen7_update_renderbuffer_surface
If geometry shaders had landed first, we would have had to add MOCS to
3DSTATE_GS as well. This would have meant adding 1 additional line to a
+32 -9 series.
If these are the only patches that really would have been affected in
the last year, and the savings are 1-2 lines of code, I think that shows
that the sharing is not significantly useful.
> I didn't find any commits that would have been more complex under either
> my or Eric's proposal. I didn't find any instances of past mistakes
> where we modified one function but not the other.
>
>
>
> 2. If we start sharing the code to populate the first 4 DWORDs of
> 3DSTATE_{VS,GS} now, is it likely that in the future we'll be able to
> extend the code sharing to 3DSTATE_{HS,DS,PS}? To answer this question,
> I looked at the documentation for all of these commands in Gen7 (IVB and
> HSW). I found these differences:
>
> - 3DSTATE_HS uses a completely different layout from the others; for
> example its kernel start pointer isn't until dw3, whereas the other
> stages use dw1.
>
> - dw2[26] is "denormal mode" for PS; it's MBZ for VS, DS, and GS.
>
> - dw2[17] is "thread priority" for HSW only on VS, DS, and PS; it's
> "thread priority" for both HSW and IVB on GS. I suspect this is a
> documentation bug, and it's really meant to be HSW only for all stages.
>
> - dw2[15:14] are "rounding mode" on PS only.
>
> - "Accesses UAV" (a HSW-only bit) is on dw2[12] for VS and GS, dw2[14]
> for DS, and dw4[5] for PS. I suspect this bit is related to
> ARB_shader_image_load_store functionality, but I'm not certain.
>
> - dw2[11] is "mask stack exception enable" on GS and PS; it's MBZ on VS
> and DS.
>
> At the moment we never set any of the bits that differ between VS, DS,
> GS, and PS, so we could share the code to populate dw0-3 between those 4
> stages without having to make stage-specific exceptions. It's possible
> that we might have to make stage-specific exceptions in the future if we
> ever decide to set some of those bits, but the only likely candidate
> seems to be "Accesses UAV". It's pretty clear that we couldn't share
> any code to populate dw0-3 of 3DSTATE_HS.
We will need to support UAVs eventually (they're an upcoming GL
feature), but I don't know when or on what platforms.
Since 3DSTATE_HS isn't shareable, this means that we need to add
3DSTATE_GS today, and 3DSTATE_DS in a little while.
I'm still not convinced that the added complexity is worth the benefit.
I would still like to see 3DSTATE_VS, 3DSTATE_GS emitted within a
single function, in one single BEGIN...OUT...ADVANCE block.
--Ken
More information about the mesa-dev
mailing list