[Bug 90378] GPU lockups in Left 4 Dead 2

bugzilla-daemon at freedesktop.org
Thu May 21 08:09:04 PDT 2015


https://bugs.freedesktop.org/show_bug.cgi?id=90378

--- Comment #2 from Daniel Scharrer <daniel at constexpr.org> ---
Created attachment 115951
  --> https://bugs.freedesktop.org/attachment.cgi?id=115951&action=edit
patch to revert LLVM r233366 (fixes lockups)

This seems to be a regression in LLVM:
Mesa git + LLVM svn is bad
Mesa 10.5.5 + LLVM svn is bad
Mesa git + LLVM 3.6.0 is good (no lockups, no glitches)

With Mesa git, the lockups in the L4D2 apitrace linked above bisect to LLVM
r233366:

commit 9217916725713c00f17cb64123e8dffdae843eb7
Author: Andrew Trick <atrick at apple.com>
Date:   Fri Mar 27 06:10:13 2015 +0000

    Complete the MachineScheduler fix made way back in r210390.

    "Fix the MachineScheduler's logic for updating ready times for in-order.
     Now the scheduler updates a node's ready time as soon as it is
     scheduled, before releasing dependent nodes."

    This fix was only made in one variant of the ScheduleDAGMI driver.
    Francois de Ferriere reported the issue in the other bit of code where
    it was also needed.
    I never got around to coming up with a test case, but it's an
    obvious fix that shouldn't be delayed any longer.
    I'll try to refactor this code a little better.

    I did verify performance on a wide variety of targets and saw no
    negative impact with this fix.

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233366 91177308-0d34-0410-b5e6-96231b3b80d8


I had to revert b8797a7 and a99a16a in Mesa for it to build against that LLVM
revision.
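
For anyone wanting to repeat a bisect like this, here is a minimal sketch of how
a test step could report its result to `git bisect run`. It is not necessarily
how the bisect above was done. The exit codes (0 = good, 125 = skip, any other
code from 1 to 127 = bad) are the ones git documents; the function name and the
"GPU lockup" log marker are assumptions made for illustration, and the build and
replay steps are left to the caller.

```python
# Exit codes understood by `git bisect run`.
GOOD = 0    # revision works
BAD = 1     # revision shows the regression
SKIP = 125  # revision cannot be tested (e.g. Mesa does not build against it)

# Assumption: the kernel log mentions this string when the GPU locks up.
LOCKUP_MARKER = "GPU lockup"


def classify(build_ok, kernel_log_text):
    """Map one bisect step to an exit code for `git bisect run`.

    build_ok        -- whether LLVM and Mesa built at this revision
    kernel_log_text -- kernel log captured after replaying the trace
    """
    if not build_ok:
        # Tell git to skip revisions that cannot be built instead of
        # marking them good or bad.
        return SKIP
    if LOCKUP_MARKER in kernel_log_text:
        return BAD
    return GOOD
```

A wrapper script would build the revision, replay the trace, capture the kernel
log and call `sys.exit(classify(...))`. Revisions that only build after
reverting other commits fit the skip case unless the wrapper applies those
reverts itself.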

Besides the arch-specific test files, r233366 only moves one line of code
around. Reverting that on current LLVM (see attached patch) also fixes the
lockups.

As with bug #90510, R600_DEBUG=switch_on_eop gets rid of the glitches and also
prevents the crashes. I am not sure whether that means the bug could be in Mesa
or whether the option just hides the LLVM bug.
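
To compare a replay with and without that option, something like the following
sketch could be used. It only sets the R600_DEBUG environment variable mentioned
above and runs glretrace; the function names and the trace path are hypothetical,
and any R600_DEBUG value already present in the environment is overwritten or
removed for simplicity.

```python
import os
import subprocess


def build_retrace_command(trace_path, workaround=True, base_env=None):
    """Return (command, environment) for replaying an apitrace file.

    trace_path -- path to the trace (hypothetical, supplied by the caller)
    workaround -- if True, set R600_DEBUG=switch_on_eop
    base_env   -- environment to start from; defaults to os.environ
    """
    env = dict(os.environ if base_env is None else base_env)
    if workaround:
        env["R600_DEBUG"] = "switch_on_eop"
    else:
        # Make sure a value inherited from the shell does not leak in.
        env.pop("R600_DEBUG", None)
    return ["glretrace", trace_path], env


def replay(trace_path, workaround=True):
    """Run glretrace and return its exit status."""
    cmd, env = build_retrace_command(trace_path, workaround)
    return subprocess.run(cmd, env=env).returncode
```

Calling `replay(path, workaround=False)` and then `replay(path, workaround=True)`
on the same trace gives the two cases to compare.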

While bisecting for the lockup, I noticed the glitches were also introduced in
LLVM after 3.6.0, but not by the same revision - f74b5c6 (r231401) has no
lockups but does have glitches. I'll bisect that for bug #88561 as the glitches
in the latest Talos apitrace there also seem to come from that commit range.
(The GPU faults - bug #87278 - seem to have yet another cause, being present
even with LLVM 3.6.0.)

NB: I also noticed that with compositing enabled in KWin, the system is not
able to recover from the GPU lockups (and eventually does not even respond to
SSH or SysRq). With compositing disabled the GPU is almost always reset
successfully and the game / glretrace can continue as if nothing happened.
