[Bug 93467] [bsw] execlists causes machine lockups

bugzilla-daemon at freedesktop.org
Mon Mar 21 17:35:04 UTC 2016


https://bugs.freedesktop.org/show_bug.cgi?id=93467

--- Comment #13 from Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com> ---
On my BDW I need a kernel with no debugging whatsoever to trigger this
reliably. No tracing, no lockdep, even basic spinlock debugging needs to be
turned off.

In that setup gem_exec_nop/basic generates ~340k interrupts per second and
causes numerous 10-20 second system-wide stalls.
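
For anyone wanting to compare interrupt rates on their own machine, a rate like
the one above can be estimated by sampling /proc/interrupts twice. This is a
minimal sketch, assuming the usual Linux /proc/interrupts layout (a header row
of CPU names, then one row per interrupt source). The function names are
hypothetical and this is not the method used to obtain the figure above.

```python
import time


def parse_irq_totals(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        total = 0
        # Per-CPU counters come first; stop at the first non-numeric field.
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            total += int(field)
        totals[label.strip()] = total
    return totals


def irq_rates(before, after, interval_s):
    """Interrupts per second for every source present in both samples."""
    return {k: (after[k] - before[k]) / interval_s for k in after if k in before}


def sample_rates(interval_s=1.0, path="/proc/interrupts"):
    """Read the file twice, interval_s apart, and return per-source rates."""
    with open(path) as f:
        before = parse_irq_totals(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_irq_totals(f.read())
    return irq_rates(before, after, interval_s)
```

Calling sample_rates() while the test runs and looking up the GPU's interrupt
line in the result gives a per-second figure comparable to the one quoted
above.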

I've tried to measure the durations of various sections of the code in
intel_lrc.c, and although the averages and maximums are bad (if my numbers are
correct we can spend 10-25% of elapsed test time with interrupts off), I can't
find anything which would block for 10-20 seconds in one go.
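
The bookkeeping behind numbers like these can be sketched as follows. This is
a user-space Python illustration of the arithmetic only (average, maximum, and
share of elapsed time per section), not the kernel instrumentation that was
actually used; the class and function names are hypothetical.

```python
import time


class SectionStats:
    """Accumulates durations (in nanoseconds) for one instrumented section."""

    def __init__(self):
        self.count = 0
        self.total_ns = 0
        self.max_ns = 0

    def record(self, duration_ns):
        self.count += 1
        self.total_ns += duration_ns
        self.max_ns = max(self.max_ns, duration_ns)

    def average_ns(self):
        return self.total_ns / self.count if self.count else 0.0

    def fraction_of(self, elapsed_ns):
        """Share of the elapsed test time spent inside this section."""
        return self.total_ns / elapsed_ns if elapsed_ns else 0.0


def timed(stats, fn, *args):
    """Run fn(*args) and record how long it took in stats."""
    start = time.monotonic_ns()
    try:
        return fn(*args)
    finally:
        stats.record(time.monotonic_ns() - start)
```

The point of tracking max_ns separately is that a section can account for a
large share of elapsed time through many short entries while its longest
single entry stays far below the 10-20 second range.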

And by inspecting the code I also can't figure out how it would happen.

Also, these lockups in general seem to come and go in batches. Sometimes the
system is happily chugging along with 340k irq/s, and sometimes it is stalling
all the time. What makes it latch into one of these modes I have no idea.

At one point I thought I saw a correlation with the retire worker, but then it
went away. It still feels like replacing the trylock there with a real lock
improves things (more short lockups vs long ones), but I can't figure out why
that would make sense.
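
To make the trylock vs real lock distinction concrete, here is a small
user-space sketch using Python's threading.Lock. It only illustrates the
difference in semantics: the function names are hypothetical and this is not
the i915 retire worker code. With a trylock the worker gives up immediately
when the lock is contended and has to be run again later, while with a real
lock it waits its turn and always gets the work done.

```python
import threading


def retire_with_trylock(lock, retire):
    """Try to take the lock without blocking; skip the work if it is held."""
    if not lock.acquire(blocking=False):
        return False  # contended: the caller would have to retry later
    try:
        retire()
        return True
    finally:
        lock.release()


def retire_with_lock(lock, retire):
    """Block until the lock is available, then always do the work."""
    with lock:
        retire()
    return True
```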
