[Bug 109469] [CI][SHARDS] igt@gem_mmap_gtt@hang - fail - Failed assertion: !control->error

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon Jan 28 09:36:26 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=109469

--- Comment #2 from Chris Wilson <chris at chris-wilson.co.uk> ---
This was a deliberate regression in our GPU reset handling -- accepting that we
cannot serialise user memory access without a magic recursive mutex.

commit eb8d0f5af4ec2d172baf8b4b9a2199cd916b4e54
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Fri Jan 25 13:22:28 2019 +0000

    drm/i915: Remove GPU reset dependence on struct_mutex

    Now that the submission backends are controlled via their own spinlocks,
    with a wave of a magic wand we can lift the struct_mutex requirement
    around GPU reset. That is we allow the submission frontend (userspace)
    to keep on submitting while we process the GPU reset as we can suspend
    the backend independently.

    The major change is around the backoff/handoff strategy for performing
    the reset. With no mutex deadlock, we no longer have to coordinate with
    any waiter, and just perform the reset immediately.

    Testcase: igt/gem_mmap_gtt/hang # regresses
    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk


We think we might be able to do a "light stop_machine()" to suspend affected
userspace (along the lines of SIGSTOP) while we do the reset -- which should
avoid the recursion from inside the pagefault handlers. Actual user impact
should be low: a temporary visual glitch (userspace image shown with incorrect
tiling) on older machines should the GPU hang (so after it has already been
unresponsive for 10s). Machine stability is unaffected.
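
As a rough userspace analogy only (this is not the i915 code, and not how a
kernel-side "light stop_machine()" would be written), the idea of suspending
the affected clients around a reset "along the lines of SIGSTOP" might be
sketched as below. The names reset_with_clients_stopped, do_reset and demo are
hypothetical, and the sketch assumes a POSIX system where SIGSTOP and SIGCONT
are available.

```python
import os
import signal
import subprocess
import sys


def reset_with_clients_stopped(pids, do_reset):
    """Suspend each client, run do_reset(), then resume every client we stopped.

    Hypothetical helper: pids stands in for "affected userspace" and
    do_reset for the GPU reset itself.
    """
    stopped = []
    try:
        for pid in pids:
            os.kill(pid, signal.SIGSTOP)  # client can no longer touch memory
            stopped.append(pid)
        return do_reset()
    finally:
        # Always resume, even if the reset raised, so no client stays frozen.
        for pid in stopped:
            os.kill(pid, signal.SIGCONT)


def demo():
    """Spawn a sleeping child as a stand-in client and reset around it."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        result = reset_with_clients_stopped([child.pid], lambda: "reset done")
        still_running = child.poll() is None  # client survived the pause
    finally:
        child.terminate()
        child.wait()
    return result, still_running
```

The point of the sketch is only that a stopped client cannot fault while the
reset runs, so the reset path never has to take a lock that the client's fault
handler might already hold.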
