[Intel-gfx] [PATCH] prime_self_import: Assure no pending requests before object counting

Daniel Vetter daniel at ffwll.ch
Fri Nov 1 20:07:52 CET 2013


On Fri, Nov 01, 2013 at 11:52:48AM -0700, Ben Widawsky wrote:
> On Fri, Nov 01, 2013 at 07:47:37PM +0100, Daniel Vetter wrote:
> > On Fri, Nov 01, 2013 at 11:44:40AM -0700, Ben Widawsky wrote:
> > > On Fri, Nov 01, 2013 at 07:42:59PM +0100, Daniel Vetter wrote:
> > > > On Fri, Nov 01, 2013 at 09:18:51AM -0700, Ben Widawsky wrote:
> > > > > On Fri, Nov 01, 2013 at 05:08:17PM +0100, Daniel Vetter wrote:
> > > > > > On Fri, Nov 01, 2013 at 12:53:42PM +0000, oscar.mateo at intel.com wrote:
> > > > > > > From: Oscar Mateo <oscar.mateo at intel.com>
> > > > > > > 
> > > > > > > We don't want a previously used object to be freed in the middle of a
> > > > > > > before/after object counting operation (or we would get a "-1 objects
> > > > > > > leaked" message). We have seen this happening, e.g., when a context
> > > > > > > from a previous run dies, but its backing object is alive waiting for
> > > > > > > a retire_work to kick in.
> > > > > > > 
> > > > > > > Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
> > > > > > > Cc: Ben Widawsky <ben at bwidawsk.net>
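
For reference, the before/after counting that this protects looks roughly
like the sketch below. This is not the actual test code: get_object_count()
and the debugfs path are illustrative stand-ins, and gem_quiescent_gpu() is
assumed to come from the igt library headers of that era.

#include <assert.h>
#include <stdio.h>
#include "drmtest.h"	/* assumed igt helper: gem_quiescent_gpu() */

/* Stand-in for the test's object counter; the real test reads the
 * i915_gem_objects debugfs file, whose path can differ per system. */
static int get_object_count(void)
{
	FILE *f = fopen("/sys/kernel/debug/dri/0/i915_gem_objects", "r");
	int count = 0;

	if (f) {
		fscanf(f, "%i objects", &count);
		fclose(f);
	}
	return count;
}

static void leak_check(int fd)
{
	int before, after;

	/* Retire leftovers (e.g. a dead context's backing object still
	 * waiting on retire_work) so a delayed free cannot land between
	 * the two counts and produce a "-1 objects leaked" result. */
	gem_quiescent_gpu(fd);
	before = get_object_count();

	/* ... exercise prime export/import here ... */

	gem_quiescent_gpu(fd);
	after = get_object_count();

	assert(after == before);
}
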
> > > > > > 
> > > > > > Nice catch. Should we do this in general as part of our gem_quiescent_gpu
> > > > > > helper? All i-g-t testcases are written under the assumption that they
> > > > > > completely own the gpu and that the gtt is completely empty besides the few
> > > > > > driver-allocated and pinned objects. So trying really hard to get rid of
> > > > > > any residual stuff sounds like a good idea.
> > > > > 
> > > > > I was going to address this in the other mail thread.... in any case, I
> > > > > think not. I believe a separate helper is the way to go, and we should
> > > > > only call it when we absolutely want to.
> > > > > 
> > > > > Though it's not the intention, I've seen many tests fail because of
> > > > > previous state, and I don't want to miss out on catching those in the
> > > > > future. It would also further slow down the run unnecessarily.
> > > > 
> > > > We already do rather egregious stuff in quiescent. So I want hard numbers
> > > > on your claim that it slows down stuff further - there really shouldn't be
> > > > much at all to retire/evict.
> > > > -Daniel
> > > 
> > > I don't like any of those arbitrary calls to quiescent either fwiw.
> > > 
> > > Can't I make the same demand for data showing it doesn't slow anything
> > > down BEFORE we merge the patch?
> > 
> > All those "arbitrary calls to quiescent" actually fixed spurious igt
> > failures. igts are written under the assumption that _nothing_ else is
> > going on in gpu-land, since otherwise it's just impossible to hit some
> > races. So this is a matter of correctness first and speed second.
> > -Daniel
> 
> They should be called where there are "spurious" errors, and with an
> understanding of why it's required. Sprinkling synchronizing code all
> over the place and calling it a fix is false. It's a "workaround" at
> best, but more likely a symptom of a dearth of time to do it properly.
> I can live with either, honestly. I can't live with the statement that
> it's the proper thing to do.
> 
> Very few tests we have will actually care that _nothing_ else is
> running, and if they do, annotations in the code via quiescent calls are
> a nice way to document it.

Atm a call to quiescent_gpu on an idle machine takes roughly 25us (averaged
over a loop of 100k calls, snb laptop). You're optimizing the wrong thing.
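
Rough sketch of how to reproduce that kind of measurement (not the exact
loop I used; drm_open_any() and gem_quiescent_gpu() are assumed from the
igt library headers):

#include <stdio.h>
#include <time.h>
#include "drmtest.h"	/* assumed igt helpers: drm_open_any(), gem_quiescent_gpu() */

int main(void)
{
	const int loops = 100000;
	struct timespec start, end;
	int fd = drm_open_any();
	double us;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < loops; i++)
		gem_quiescent_gpu(fd);
	clock_gettime(CLOCK_MONOTONIC, &end);

	/* convert the elapsed time to microseconds and average per call */
	us = (end.tv_sec - start.tv_sec) * 1e6 +
	     (end.tv_nsec - start.tv_nsec) / 1e3;
	printf("%.2f us per gem_quiescent_gpu() call\n", us / loops);

	return 0;
}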

Also, as long as everyone bitches and moans about igt tests being unstable
I'm leaning _massively_ towards stable test results. And I've really seen
too many igt tests fail spuriously, which is why I've decided to go back to
an unconditional call to quiescent_gpu (it wasn't like that originally).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


