[Mesa-dev] [PATCH 2/2] i965: Disable SOL buffers and decls when not doing transform feedback.

Kenneth Graunke kenneth at whitecape.org
Thu Jul 3 13:52:06 PDT 2014


On Wednesday, July 02, 2014 10:09:49 AM Iago Toral wrote:
> On Wed, 2014-07-02 at 08:49 +0200, Iago Toral wrote:
> > Hello Kenneth, Steven:
> > 
> > On Tue, 2014-07-01 at 17:25 -0700, Kenneth Graunke wrote:
> > > Previously, we only emitted 3DSTATE_SO_BUFFER and 3DSTATE_SO_DECL_LIST
> > > when transform feedback was active.  When it was inactive, we disabled
> > > the SOL stage in 3DSTATE_SOL so the other state would be ignored.
> > > 
> > > In commit 3178d2474ae5bdd1102fb3d76a60d1d63c961ff5, Iago enabled the SOL
> > > stage universally, so we could implement the GL_PRIMITIVES_GENERATED
> > > statistics counter.  This caused every Piglit test to trigger assertions
> > > in the simulator, and apparently caused GPU hangs for some users.
> > > 
> > > Apparently, we're supposed to program 3DSTATE_SO_DECL_LIST to zero when
> > > output streams are inactive, but we hadn't been doing so.  Now that SOL
> > > is on, we need to do that properly.
> > 
> > If that is the case then I guess the workaround I was trying to use for
> > that hardware issue I mentioned in another e-mail may not be any good,
> > since it required to configure a non-zero 3DSTATE_SO_DECL_LIST while
> > outputting to a disabled stream... :(
> 
> My workaround patch works here in IvyBridge and keeps working with your
> patch since it only activates when TF is active, but since it relies on
> this stuff I think it would be best if someone tests it on other
> hardware when it gets reviewed.
> 
> > I never get any GPU hangs though...
> > 
> > > Experimentally, it looks like we also need to program 3DSTATE_SO_BUFFER
> > > to zero as well, or else I get many GPU hangs on Haswell.
> > > 
> > > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > > Cc: Iago Toral Quiroga <itoral at igalia.com>
> > > Cc: Chris Forbes <chrisf at ijw.co.nz>
> > > Cc: Steven Newbury <steve at snewbury.org.uk>
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_state.h      |  3 ++-
> > >  src/mesa/drivers/dri/i965/gen7_sol_state.c | 34 
++++++++++++++++++++++--------
> > >  src/mesa/drivers/dri/i965/gen8_sol_state.c |  2 +-
> > >  3 files changed, 28 insertions(+), 11 deletions(-)
> > > 
> > > Iago,
> > > 
> > > I noticed that pretty much every Piglit test broke in our simulation
> > > environment after your patch to turn on SOL by default.  Something about
> > > SO_DECLs not being right.  This patch seems to fix the issue.  Does it
> > > look reasonable to you?
> > 
> > The patch is producing GPU hangs for me in the case where I have a
> > geometry shader and no TF, so it seems that it introduces other
> > problems. I'll see if I can find the reason for this.
> 
> It looks like we have to send at least one decl per stream even if we
> are configuring the number of declarations per stream to 0, otherwise we
> get this GPU hang on IvyBridge at least.

Interesting.  I'm not seeing any documentation indicating that we should have 
to, but it seems like a fairly reasonable thing to do...

> This seems to contradict clearly the documentation for IvyBridge. From
> 8.5 3DSTATE_SO_DECL_LIST Command, dw2, bits 7:0, Num Entries[0]:
> 
> "It is legal to specify Num Entries = 0 for all four streams
> simultaneously. In this case there will be no SO_DECLs included in the
> command (only DW 0-2). Note that all Stream to Buffer Selects bits must
> be zero in this case (as no streams produce output).

Yeah - that's what I was looking at.  It /ought/ to work (and seems to on 
Haswell).

> Which is exactly what Kenneth's patch does, but that produces a GPU hang
> here.
> 
> At the same time, there is this other statement at the beginning of that
> same chapter:
> 
> "Errata: All 128 decls for all four streams must be included whenever
> this command is issued. The “Num Entries [n]” fields still contain the
> actual numbers of valid decls."
> 
> Not sure if that sentence is the fix to a previous errata or the errata
> itself... but in any case, we are never sending all 128 decls but it
> looks like at least on IvyBridge we have to send one or we get a GPU
> hang.

This is a documentation mistake.  Looking at the latest docs, that errata 
clearly only applies to early pre-production hardware.  Unfortunately, when 
they published the PRM, they stripped out the stepping information, and didn't 
strip out the entire errata. :(

> Kenneth, following the above, this change to your patch fixes the GPU
> hang I was running into:
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> index 2013406..6d1b93c 100644
> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> @@ -108,8 +108,10 @@ gen7_upload_3dstate_so_decl_list(struct brw_context 
*brw,
>         *  all zero (no buffers written to) and the corresponding Num 
Entries
>         *  field to zero (no valid SO_DECLs)."
>         */
> -      BEGIN_BATCH(3);
> -      OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (3 - 2));
> +      BEGIN_BATCH(5);
> +      OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (5 - 2));
> +      OUT_BATCH(0);
> +      OUT_BATCH(0);
>        OUT_BATCH(0);
>        OUT_BATCH(0);
>        ADVANCE_BATCH();
> 
> Steven, can you try Kenneth's patch with this change too?
> 
> Iago

Cool, I can squash that in.  Thanks!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140703/0e8db4e2/attachment.sig>


More information about the mesa-dev mailing list