[Mesa-dev] Sandy Bridge heisenbug

Chris Wilson chris at chris-wilson.co.uk
Thu Mar 31 14:05:19 PDT 2011


On Thu, 31 Mar 2011 15:01:53 -0500, Ian Pilcher <arequipeno at gmail.com> wrote:
> On 03/31/2011 02:53 PM, Chris Wilson wrote:
> > https://bugs.freedesktop.org/show_bug.cgi?id=35820 perhaps?
> 
> I don't think so.  My system doesn't hang, and I don't get any X crash,
> just messages in the syslog (see below).
> 
> I'm not expecting a quick fix.  Just hoping for some ideas on gathering
> the necessary information to debug the problem.
> 
> Thanks!
> 
> Mar 30 20:47:16 ian kernel: [  233.620717] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung

Ah. You will want

kernel commit 91355834646328e7edc6bd25176ae44bcd7386c7
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Fri Mar 4 19:22:40 2011 +0000

    drm/i915: Do not overflow the MMADDR write FIFO
    
    Whilst the GT is powered down (rc6), writes to MMADDR are placed in a
    FIFO by the System Agent. This is a limited resource, only 64 entries, of
    which 20 are reserved for Display and PCH writes, and so we must take
    care not to queue up too many writes. To avoid this, there is a counter
    which we can poll to ensure there are sufficient free entries in the
    FIFO.
    
    "Issuing a write to a full FIFO is not supported; at worst it could
    result in corruption or a system hang."
    
    Reported-and-Tested-by: Matt Turner <mattst88 at gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34056
    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
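
To illustrate the idea in that commit message, here is a rough Python model
of "poll the free-entry counter before writing". It is only a sketch, not
the kernel code: the 64 and 20 come from the commit message, while the
function names, the "free entries must exceed the reserved count" rule, the
poll limit and the delay are assumptions made for illustration.

```python
import time

FIFO_TOTAL_ENTRIES = 64      # from the commit message
FIFO_RESERVED_ENTRIES = 20   # reserved for Display and PCH writes


def wait_for_fifo(read_free_entries, max_polls=500, delay_s=10e-6):
    """Poll a free-entry counter until a write would not touch the reserve.

    read_free_entries is a hypothetical callable returning the number of
    free FIFO entries. Returns True once more than the reserved number of
    entries are free, or False if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        if read_free_entries() > FIFO_RESERVED_ENTRIES:
            return True
        time.sleep(delay_s)
    return False


def guarded_write(read_free_entries, do_write, reg, value):
    """Issue do_write(reg, value) only after the FIFO has room for it."""
    if not wait_for_fifo(read_free_entries):
        raise TimeoutError("MMADDR write FIFO did not drain in time")
    do_write(reg, value)
```

With a fake counter that reports 5, then 20, then 25 free entries, the
first two polls wait and the third lets the write through.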

The trick in this case was to manually mark the GPU as wedged, which
triggers the i915_error_state capture.
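
A minimal sketch of doing that from a script, under several assumptions:
debugfs is mounted at /sys/kernel/debug, the card is DRM minor 0, the
driver exposes a writable i915_wedged file next to i915_error_state, and
the script runs as root. The function name and output file name are made
up for the example.

```python
from pathlib import Path

# Assumed location of the i915 debugfs files; adjust for your system.
DEBUGFS_DRI = Path("/sys/kernel/debug/dri/0")


def capture_error_state(dri_dir=DEBUGFS_DRI, out_file="i915_error_state.txt"):
    """Mark the GPU as wedged, then save whatever error state is reported."""
    dri_dir = Path(dri_dir)
    # Writing a non-zero value is assumed to flag the GPU as wedged.
    (dri_dir / "i915_wedged").write_text("1\n")
    # Read back the captured state and keep a copy for a bug report.
    state = (dri_dir / "i915_error_state").read_text()
    Path(out_file).write_text(state)
    return state
```

The capture may not be instantaneous, so if the saved file is empty or
reports that no error state was collected, reading i915_error_state again
a moment later is worth a try.
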
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

