[Intel-gfx] [PATCH] lib/rendercopy_gen9: Setup Push constant pointer before sending BTP commands

Daniel Vetter daniel at ffwll.ch
Fri Aug 14 01:58:27 PDT 2015


On Thu, Aug 13, 2015 at 03:49:35PM -0700, Ben Widawsky wrote:
> On Thu, Aug 13, 2015 at 10:33:00AM +0300, Joonas Lahtinen wrote:
> > Hi,
> > 
> > On ke, 2015-08-12 at 18:35 -0700, Ben Widawsky wrote:
> > > On Wed, Aug 12, 2015 at 03:10:18PM +0300, Joonas Lahtinen wrote:
> > > > On ke, 2015-08-12 at 12:26 +0100, Arun Siluvery wrote:
> > > > > From Gen9, by default push constant command is not committed to 
> > > > > the 
> > > > > shader unit
> > > > > untill the corresponding shader's BTP_* command is parsed. This 
> > > > > is 
> > > > > the
> > > > > behaviour when set shader is enabled. This patch updates the 
> > > > > batch to 
> > > > > follow
> > > > > this requirement otherwise it results in gpu hang.
> > > > > 
> > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89959
> > > > > 
> > > > > Set shader need to be disabled if legacy behaviour is required.
> > > > > 
> > > > > Cc: Ben Widawsky <benjamin.widawsky at intel.com>
> > > > > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > > > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > > > > Signed-off-by: Arun Siluvery <arun.siluvery at linux.intel.com>
> > > > 
> > > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > > 
> > > 
> > > Repeating what I said on the mesa thread:
> > > Does anyone understand why this actually causes a hang on the IGT 
> > > test? I
> > > certainly don't. The docs are pretty clear that the constant command 
> > > is not
> > > committed until the BTP command, but I can't make any sense of how it 
> > > related to
> > > a GPU hang.
> > > 
> > 
> > Changing the order makes the hang go away and come back for sure, we've
> > all been experiencing that. System validation said it is a programming
> > restriction, so I'm not sure how relevant it is what goes wrong if it's
> > not followed. And the legacy mode bits were added to support the old
> > behavior of having the order like it has been previously, so I do not
> > see why question it without visibility to the actual RTL. And enabling
> > the legacy bits makes the hang go away, too.
> > 
> > If I had the RTL sources, then it would be more relevant to take
> > educated guesses as to why a set of hundreds of thousands of
> > transistors doesn't work as it should :) Without that, if it gets
> > stuck, it gets stuck.
> > 
> > Regards, Joonas
> > 
> 
> Let me start by saying I do believe that questioning this shouldn't prevent
> merging the patch.
> 
> <rant>
> I absolutely disagree with you and think we should question these kind of things
> and get out of the mindset of, "well, it fixes a hang, let's move on."
> Understanding these kind of things is critical to writing stable drivers.  If
> the programming guide/SV team said it can lead to a hang, that's one thing, but
> AFAICT, we do not understand why it is hanging nor does any of the documentation
> we do have suggest it should hang. Without clarification, next time we have a
> similar hang signature we're going to be right back here where we started.
> 
> It was one thing when there were a handful of us working on the stuff and we
> didn't have time to get to the bottom of bugs like this. I'm guilty of patches
> like this myself. I really do not see any excuse for this any more though.
> </rant>
> 
> Could you send me the reference for where SV said it was a "programming
> restriction"? To me it all sounds very much like an implementation detail, and
> I'd like to try to understand what I am missing.

Fully agree, we can't just ignore hangs but have to bottom out on them.
And in the past we've had piles of examples where we discovered something
and didn't chase it correctly, then a few months later massive panic in
intel. At least this must be reflected in bspec.

Joonas, can you please work together with Ben to chase this down? If you
don't get traction please escalate the shit out of this,  we really must
at least get docs to accurately reflect reality.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the Intel-gfx mailing list