[Intel-gfx] [PATCH] lib/rendercopy_gen9: Setup Push constant pointer before sending BTP commands

Ben Widawsky benjamin.widawsky at intel.com
Thu Aug 13 15:49:35 PDT 2015


On Thu, Aug 13, 2015 at 10:33:00AM +0300, Joonas Lahtinen wrote:
> Hi,
> 
> On ke, 2015-08-12 at 18:35 -0700, Ben Widawsky wrote:
> > On Wed, Aug 12, 2015 at 03:10:18PM +0300, Joonas Lahtinen wrote:
> > > On ke, 2015-08-12 at 12:26 +0100, Arun Siluvery wrote:
> > > > From Gen9, by default the push constant command is not committed to
> > > > the shader unit until the corresponding shader's BTP_* command is
> > > > parsed. This is the behaviour when set shader is enabled. This patch
> > > > updates the batch to follow this requirement; otherwise it results
> > > > in a GPU hang.
> > > > 
> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89959
> > > > 
> > > > Set shader needs to be disabled if legacy behaviour is required.
> > > > 
> > > > Cc: Ben Widawsky <benjamin.widawsky at intel.com>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > > > Signed-off-by: Arun Siluvery <arun.siluvery at linux.intel.com>
> > > 
> > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > 
> > 
> > Repeating what I said on the mesa thread:
> > Does anyone understand why this actually causes a hang on the IGT test? I
> > certainly don't. The docs are pretty clear that the constant command is not
> > committed until the BTP command, but I can't make any sense of how it
> > relates to a GPU hang.
> > 
> 
> Changing the order makes the hang go away and come back for sure, we've
> all been experiencing that. System validation said it is a programming
> restriction, so I'm not sure how relevant it is what goes wrong if it's
> not followed. And the legacy mode bits were added to support the old
> behavior of having the order like it has been previously, so I do not
> see why question it without visibility to the actual RTL. And enabling
> the legacy bits makes the hang go away, too.
> 
> If I had the RTL sources, then it would be more relevant to take
> educated guesses as to why a set of hundreds of thousands of
> transistors doesn't work as it should :) Without that, if it gets
> stuck, it gets stuck.
> 
> Regards, Joonas
> 

Let me start by saying I do believe that questioning this shouldn't prevent
merging the patch.

<rant>
I absolutely disagree with you and think we should question these kinds of
things and get out of the mindset of, "well, it fixes a hang, let's move on."
Understanding these kinds of things is critical to writing stable drivers.  If
the programming guide/SV team said it can lead to a hang, that's one thing, but
AFAICT, we do not understand why it is hanging nor does any of the documentation
we do have suggest it should hang. Without clarification, next time we have a
similar hang signature we're going to be right back here where we started.

It was one thing when there were a handful of us working on the stuff and we
didn't have time to get to the bottom of bugs like this. I'm guilty of patches
like this myself. I really do not see any excuse for this any more though.
</rant>

Could you send me the reference for where SV said it was a "programming
restriction"? To me it all sounds very much like an implementation detail, and
I'd like to try to understand what I am missing.
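
For reference, the documented behaviour quoted above (a push constant command
is only committed once the corresponding shader's BTP_* command is parsed) can
be sketched as a small self-contained check. This is only an illustration under
stated assumptions: the enum values and the all_constants_committed() helper
are hypothetical names, not IGT or hardware definitions, and the sketch says
nothing about why the hardware hangs when the order is wrong.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical symbolic commands; these are not real opcodes or IGT names. */
enum cmd_kind { CMD_CONSTANT, CMD_BTP, CMD_OTHER };

/* Hypothetical subset of shader stages, just enough for the illustration. */
enum stage { STAGE_VS, STAGE_PS, STAGE_COUNT };

struct cmd {
	enum cmd_kind kind;
	enum stage stage;
};

/*
 * Model of the documented rule: a constant command stays pending until a
 * BTP command for the same stage is parsed. Returns true if no constant
 * command is left pending at the end of the batch.
 */
static bool all_constants_committed(const struct cmd *batch, size_t n)
{
	bool pending[STAGE_COUNT] = { false };

	for (size_t i = 0; i < n; i++) {
		if (batch[i].kind == CMD_CONSTANT)
			pending[batch[i].stage] = true;
		else if (batch[i].kind == CMD_BTP)
			pending[batch[i].stage] = false;
	}

	for (int s = 0; s < STAGE_COUNT; s++) {
		if (pending[s])
			return false;
	}
	return true;
}
```

With this model, a batch that emits the constant command and then the BTP
command for the same stage passes the check, while the reverse order leaves the
constant pending.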

> > [snip]
> > 
> > ---
> > Ben Widawsky, Intel Open Source Technology Center

-- 
Ben Widawsky, Intel Open Source Technology Center