[Mesa-dev] [PATCH 2/2] i965: Disable SOL buffers and decls when not doing transform feedback.
kenneth at whitecape.org
Thu Jul 3 13:52:06 PDT 2014
On Wednesday, July 02, 2014 10:09:49 AM Iago Toral wrote:
> On Wed, 2014-07-02 at 08:49 +0200, Iago Toral wrote:
> > Hello Kenneth, Steven:
> > On Tue, 2014-07-01 at 17:25 -0700, Kenneth Graunke wrote:
> > > Previously, we only emitted 3DSTATE_SO_BUFFER and 3DSTATE_SO_DECL_LIST
> > > when transform feedback was active. When it was inactive, we disabled
> > > the SOL stage in 3DSTATE_SOL so the other state would be ignored.
> > >
> > > In commit 3178d2474ae5bdd1102fb3d76a60d1d63c961ff5, Iago enabled the SOL
> > > stage universally, so we could implement the GL_PRIMITIVES_GENERATED
> > > statistics counter. This caused every Piglit test to trigger assertions
> > > in the simulator, and apparently caused GPU hangs for some users.
> > >
> > > Apparently, we're supposed to program 3DSTATE_SO_DECL_LIST to zero when
> > > output streams are inactive, but we hadn't been doing so. Now that SOL
> > > is on, we need to do that properly.
> > If that is the case then I guess the workaround I was trying to use for
> > that hardware issue I mentioned in another e-mail may not be any good,
> > since it required to configure a non-zero 3DSTATE_SO_DECL_LIST while
> > outputting to a disabled stream... :(
> My workaround patch works here in IvyBridge and keeps working with your
> patch since it only activates when TF is active, but since it relies on
> this stuff I think it would be best if someone tests it on other
> hardware when it gets reviewed.
> > I never get any GPU hangs though...
> > > Experimentally, it looks like we also need to program 3DSTATE_SO_BUFFER
> > > to zero as well, or else I get many GPU hangs on Haswell.
> > >
> > > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > > Cc: Iago Toral Quiroga <itoral at igalia.com>
> > > Cc: Chris Forbes <chrisf at ijw.co.nz>
> > > Cc: Steven Newbury <steve at snewbury.org.uk>
> > > ---
> > > src/mesa/drivers/dri/i965/brw_state.h | 3 ++-
> > > src/mesa/drivers/dri/i965/gen7_sol_state.c | 34
> > > src/mesa/drivers/dri/i965/gen8_sol_state.c | 2 +-
> > > 3 files changed, 28 insertions(+), 11 deletions(-)
> > >
> > > Iago,
> > >
> > > I noticed that pretty much every Piglit test broke in our simulation
> > > environment after your patch to turn on SOL by default. Something about
> > > SO_DECLs not being right. This patch seems to fix the issue. Does it
> > > look reasonable to you?
> > The patch is producing GPU hangs for me in the case where I have a
> > geometry shader and no TF, so it seems that it introduces other
> > problems. I'll see if I can find the reason for this.
> It looks like we have to send at least one decl per stream even if we
> are configuring the number of declarations per stream to 0, otherwise we
> get this GPU hang on IvyBridge at least.
Interesting. I'm not seeing any documentation indicating that we should have
to, but it seems like a fairly reasonable thing to do...
> This seems to contradict clearly the documentation for IvyBridge. From
> 8.5 3DSTATE_SO_DECL_LIST Command, dw2, bits 7:0, Num Entries:
> "It is legal to specify Num Entries = 0 for all four streams
> simultaneously. In this case there will be no SO_DECLs included in the
> command (only DW 0-2). Note that all Stream to Buffer Selects bits must
> be zero in this case (as no streams produce output).
Yeah - that's what I was looking at. It /ought/ to work (and seems to on
> Which is exactly what Kenneth's patch does, but that produces a GPU hang
> At the same time, there is this other statement at the beginning of that
> same chapter:
> "Errata: All 128 decls for all four streams must be included whenever
> this command is issued. The “Num Entries [n]” fields still contain the
> actual numbers of valid decls."
> Not sure if that sentence is the fix to a previous errata or the errata
> itself... but in any case, we are never sending all 128 decls but it
> looks like at least on IvyBridge we have to send one or we get a GPU
This is a documentation mistake. Looking at the latest docs, that errata
clearly only applies to early pre-production hardware. Unfortunately, when
they published the PRM, they stripped out the stepping information, and didn't
strip out the entire errata. :(
> Kenneth, following the above, this change to your patch fixes the GPU
> hang I was running into:
> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> index 2013406..6d1b93c 100644
> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> @@ -108,8 +108,10 @@ gen7_upload_3dstate_so_decl_list(struct brw_context
> * all zero (no buffers written to) and the corresponding Num
> * field to zero (no valid SO_DECLs)."
> - BEGIN_BATCH(3);
> - OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (3 - 2));
> + BEGIN_BATCH(5);
> + OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (5 - 2));
> + OUT_BATCH(0);
> + OUT_BATCH(0);
> Steven, can you try Kenneth's patch with this change too?
Cool, I can squash that in. Thanks!
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 819 bytes
Desc: This is a digitally signed message part.
More information about the mesa-dev