[Bug 99025] [KVM][GVT-d] [BDW & SKL ]Ubuntu 16.04 guest boot up with kernel panic with the newest 4.9.0+ drm-intel kernel

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Feb 14 05:11:07 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=99025

--- Comment #16 from Terrence Xu <terrence.xu at intel.com> ---
The first bad commit as below:

commit 821ed7df6e2a1dbae243caebcfe21a0a4329fca0
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Fri Sep 9 14:11:53 2016 +0100

    drm/i915: Update reset path to fix incomplete requests

    Update reset path in preparation for engine reset which requires
    identification of incomplete requests and associated context and fixing
    their state so that engine can resume correctly after reset.

    The request that caused the hang will be skipped and head is reset to the
    start of breadcrumb. This allows us to resume from where we left-off.
    Since this request didn't complete normally we also need to cleanup elsp
    queue manually. This is vital if we employ nonblocking request
    submission where we may have a web of dependencies upon the hung request
    and so advancing the seqno manually is no longer trivial.

    ABI: gem_reset_stats / DRM_IOCTL_I915_GET_RESET_STATS

    We change the way we count pending batches. Only the active context
    involved in the reset is marked as either innocent or guilty, and not
    mark the entire world as pending. By inspection this only affects
    igt/gem_reset_stats (which assumes implementation details) and not
    piglit.

    ARB_robustness gives this guide on how we expect the user of this
    interface to behave:

     * Provide a mechanism for an OpenGL application to learn about
       graphics resets that affect the context.  When a graphics reset
       occurs, the OpenGL context becomes unusable and the application
       must create a new context to continue operation. Detecting a
       graphics reset happens through an inexpensive query.

    And with regards to the actual meaning of the reset values:

       Certain events can result in a reset of the GL context. Such a reset
       causes all context state to be lost. Recovery from such events
       requires recreation of all objects in the affected context. The
       current status of the graphics reset state is returned by

        enum GetGraphicsResetStatusARB();

       The symbolic constant returned indicates if the GL context has been
       in a reset state at any point since the last call to
       GetGraphicsResetStatusARB. NO_ERROR indicates that the GL context
       has not been in a reset state since the last call.
       GUILTY_CONTEXT_RESET_ARB indicates that a reset has been detected
       that is attributable to the current GL context.
       INNOCENT_CONTEXT_RESET_ARB indicates a reset has been detected that
       is not attributable to the current GL context.
       UNKNOWN_CONTEXT_RESET_ARB indicates a detected graphics reset whose
       cause is unknown.

    The language here is explicit in that we must mark up the guilty batch,
    but is loose enough for us to relax the innocent (i.e. pending)
    accounting as only the active batches are involved with the reset.

    In the future, we are looking towards single engine resetting (with
    minimal locking), where it seems inappropriate to mark the entire world
    as innocent since the reset occurred on a different engine. Reducing the
    information available means we only have to encounter the pain once, and
    also reduces the information leaking from one context to another.

    v2: Legacy ringbuffer submission required a reset following hibernation,
    or else we restore stale values to the RING_HEAD and walked over
    stolen garbage.

    v3: GuC requires replaying the requests after a reset.

    v4: Restore engine IRQ after reset (so waiters will be woken!)
        Rearm hangcheck if resetting with a waiter.

    Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
    Cc: Mika Kuoppala <mika.kuoppala at intel.com>
    Cc: Arun Siluvery <arun.siluvery at linux.intel.com>
    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
    Link:
http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-13-chris@chris-wilson.co.uk

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
You are the QA Contact for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx-bugs/attachments/20170214/fd806579/attachment.html>


More information about the intel-gfx-bugs mailing list