XOrg freeze that affects a lot of people

Charles Goodwin charlie at vexi.org
Fri Mar 18 07:36:30 PST 2005


There is an XOrg freezing problem that is afflicting a _lot_ of people.

The typical symptom is that, when using Firefox, the screen freezes
although the mouse cursor still moves.  However, it's not limited to
Firefox: I have experienced it with gvim myself and have seen multiple
(but less numerous) reports of the issue with Opera and Konqueror.  (It's
a little strange that it's mainly browser usage that induces the freeze.)

You can still ssh in and kill X to recover the computer sans rebooting.

It has been reported with XFree86 too.

The problem is in fact driver related.  It afflicted me when I upgraded
to the most recent (7167) proprietary nvidia drivers, and downgrading
fixed things for my desktop.  Many of the sufferers use ATI cards, and
there appears to be no standard version of the ATI drivers that works.
That is much more problematic than with the nvidia drivers, where
versions prior to 7167 seem to have been generally stable for the
majority of users.

It seems to be limited (from what I've seen) to the 2.6 Linux kernel.

Why is this really relevant to XOrg?

Well, I described the high-level symptom.  The low-level symptom is that
a message not dissimilar to this occurs in /var/log/everything/current:

Mar 18 12:50:48 [kernel] NVRM: Xid: 13, 0000 02005600 00000056 00000c28 01be0078 00000080

And instantly XOrg freezes and consumes 99% of the CPU.
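
For anyone who wants to check whether their own freezes line up with
this message, here is a minimal Python sketch that picks such lines out
of a log.  The pattern is based only on the single sample line above, so
treat it as an assumption: other driver versions may format the message
differently.  The names parse_xid and find_xids are made up for
illustration, and no attempt is made to interpret the hex words.

```python
import re
from typing import Iterable, Iterator, List, Optional, Tuple

# Pattern derived only from the sample line quoted in this mail
# (assumption: other driver versions may word the message differently).
XID_RE = re.compile(r"NVRM: Xid: (\d+), (.*)")


def parse_xid(line: str) -> Optional[Tuple[int, List[str]]]:
    """Return (xid_number, data_words) if the line holds an NVRM Xid report."""
    match = XID_RE.search(line)
    if match is None:
        return None
    # The trailing hex words are kept as opaque strings, not decoded.
    return int(match.group(1)), match.group(2).split()


def find_xids(lines: Iterable[str]) -> Iterator[Tuple[int, List[str]]]:
    """Yield every parsed Xid report found in an iterable of log lines."""
    for line in lines:
        parsed = parse_xid(line)
        if parsed is not None:
            yield parsed


# The sample message from this mail, joined back onto one line.
SAMPLE = ("Mar 18 12:50:48 [kernel] NVRM: Xid: 13, 0000 02005600 "
          "00000056 00000c28 01be0078 00000080")
```

To scan a real log, open the file (/var/log/everything/current on my
system; the path depends on your logger setup) and pass the file object
to find_xids(), which iterates over it line by line.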

I'm just curious as to whether it might be possible for XOrg to catch
whatever-the-problem-with-the-driver-is and gracefully handle it.  If
this were possible, even if X were to exit and give a nice error which
could be used to debug the situation (by reporting the error to the
driver developers), it would save a lot of pain for a lot of people.
Most users are left having to do a hard reboot (since a lot of new Linux
users are not tech savvy enough to use ssh), and many get disillusioned
and end up going back to whatever OS they used before.

Sources:

* Official nvidia forum thread on the issue with the latest drivers:
    http://www.nvnews.net/vbulletin/showthread.php?t=47502

* Gentoo forum threads (to illustrate the wide-reaching base of people
  affected by this problem):
    http://forums.gentoo.org/viewtopic-t-198023.html
    http://forums.gentoo.org/viewtopic-t-231134.html
    http://forums.gentoo.org/viewtopic-t-309020.html

I could dig up many more links to complaints about this.  The only
solutions I've ever seen are to upgrade or downgrade to driver x.y.z or
to disable RenderAccel in xorg.conf.  I also believe this problem comes
in more guises, having seen more varied issues reported as being fixed
by these solutions.  Also, this is a problem that has been ongoing for
up to a year, at least judging by the reports I've come across.

I know this is all a bit vague, but it's difficult to give any kind of
info other than the experience / solution(s) unless it's actually
happening to a developer.

- Charlie

Charles Goodwin <charlie at vexi.org>



