[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

Ben Skeggs skeggsb at gmail.com
Tue Apr 24 17:32:07 PDT 2012


On Tue, 2012-04-24 at 21:31 +0200, Marcin Slusarz wrote:
> On Mon, Apr 23, 2012 at 06:56:44PM +0200, Martin Peres wrote:
> > On 23/04/2012 18:32, Marcin Slusarz wrote:
> > >
> > > Just run piglit. Even the "quick" tests can cause ~5 lockups (it eventually messes
> > > up the DDX channel, but this patchset can't fix that case).
> > > You can run the fs-discard-exit-2 test first - for me it causes an instant GPU lockup.
> > >
> > > Marcin
> > Great, Thanks.
> > 
> > Did you have a look at 
> > https://bugs.freedesktop.org/show_bug.cgi?id=40886 and 
> > http://xorg.freedesktop.org/wiki/SummerOfCodeIdeas ?
> 
> Yeah, I've seen them some time ago.
> 
> > The Ubuntu xorg devs were looking for something like this, but they also
> > wanted a lockup report. Are you also interested in working on it?
As I argued at XDC last year, I really question the usefulness of
something like this.  We have stupidly HUGE amounts of state that could
be relevant, and the situations where we even need something like this
are RARE.

I don't want this useless crap in our kernel module just because some
random distro thinks it's so useful, when it's not.  On the very, very
rare occasions we need this info (I can think of one situation where
we've wanted these register dumps, and they weren't useful even then),
we can ask people to install envytools and grab it.

We have a GPU with *very* good error reporting, and we log this to
dmesg.  This is good enough.  Random errorless lockups are much
harder, and unless you dump *all* the card state, from the memory
controllers to the clocks to PFIFO to the particular engine that's
involved, it's going to be useless.  The problem could be anything.

> 
> Yes, once this patchset is applied, I'm going to work on improving
> error reporting.
Assuming you're not talking about a register-dump style lockup report
like above, this could be good.  In particular, fleshing out and
improving/completing each engine's IRQ handlers (which will probably
have the nice side-effect of surviving a few more errors without locking
up) :)

Cheers,
Ben.

> 
> Marcin
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel



