[Bug 101403] New: GPU couldn't recover after running drv_hangman in dom0

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Jun 13 06:10:26 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=101403

            Bug ID: 101403
           Summary: GPU couldn't recover after running drv_hangman in dom0
           Product: DRI
           Version: DRI git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: xiong.y.zhang at intel.com
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Environment:
Xen: upstream 4.9
kernel: drm-intel-nightly
4f89fbd drm-tip: 2017y-06m-12d-19h-18m-52s UTC integration manifest
intel-gpu-tools: master branch
d1ea0c0 gem_wsim: More interesting workloads

Issues:
After running igt@drv_hangman, the GPU could not recover on dom0. A reboot
command sent over SSH then failed: dom0 could not reboot because Xorg could not
be terminated and blocked the shutdown. I had to press the power button to shut
the machine down.

How to reproduce:
1) Compile upstream Xen 4.9 and the drm-intel-nightly kernel.
2) Use drm-intel-nightly as dom0's kernel and boot it with Xen 4.9.
3) After dom0 boots, run igt@tests@drv_hangman through SSH.
4) After drv_hangman runs, the GPU cannot recover and dom0's desktop is
frozen.
5) Send a reboot command to dom0 through SSH. dom0 cannot reboot because Xorg
blocks there, so the power button has to be pressed to shut down dom0.

Experiments:
1) This only happens in a Xen environment; a native environment does not have
this issue.
2) This is a kernel regression. The 4.10 kernel does not have this issue; the
4.11 kernel and drm-intel-nightly both do.
git bisect reports the first bad commit as:
commit 4c9655436522eaf4ba35572851150ccb71f3866e
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Tue Jan 17 17:59:01 2017 +0200

    drm/i915: Move engine reset preparation to i915_gem_reset_prepare()

    Now that we have prepare/finish routines for the GEM reset, move the
    disabling of the engine->irq_tasklet into them to reduce repetition. The
    device irq enable/disable is split out to ensure it is run first and
    last always (even if the GPU reset fails).

    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala at intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
    Link: http://patchwork.freedesktop.org/patch/msgid/1484668747-9120-1-git-send-email-mika.kuoppala@intel.com
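
For readers unfamiliar with how a bisection narrows a regression such as the
4.10 to 4.11 one above, the idea can be sketched as a binary search. This is
only an illustration, not the procedure actually used for this bug: it assumes
a linear history, the commit labels are hypothetical placeholders, and
first_bad_commit() and is_bad() are names made up for this sketch. In practice
is_bad() would stand for booting the kernel under Xen and running drv_hangman.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes a linear history ordered oldest to newest, where commits[0] is
    known good, commits[-1] is known bad, and every commit after the first
    bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression was introduced after mid
    return commits[bad]


# Hypothetical history: the labels between the tags are placeholders,
# not real kernel commits.
history = ["v4.10", "c1", "c2", "c3", "c4", "v4.11"]
broken_from = history.index("c3")
print(first_bad_commit(history, lambda c: history.index(c) >= broken_from))
```

Each step halves the remaining range, so the number of kernels to build and
test grows only logarithmically with the number of commits between the good
and bad endpoints.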

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
You are the QA Contact for the bug.