<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Priority</th>
<td>high
</td>
</tr>
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [SNB Bisected]igt/gem_hangcheck_forcewake cause [drm:i915_hangcheck_elapsed] *ERROR* bsd ring: stuck on addr 0x3b4"
href="https://bugs.freedesktop.org/show_bug.cgi?id=65394">65394</a>
</td>
</tr>
<tr>
<th>CC</th>
<td>yangweix.shui@intel.com
</td>
</tr>
<tr>
<th>Assignee</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Summary</th>
<td>[SNB Bisected]igt/gem_hangcheck_forcewake cause [drm:i915_hangcheck_elapsed] *ERROR* bsd ring: stuck on addr 0x3b4
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Severity</th>
<td>major
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Reporter</th>
<td>huax.lu@intel.com
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Component</th>
<td>DRM/Intel
</td>
</tr>
<tr>
<th>Product</th>
<td>DRI
</td>
</tr></table>
<p>
<div>
<pre>System Environment:
--------------------------
Arch: i386
Platform: Sandybridge
Kernel: (drm-intel-next-queued)d7697eea3eec74c561d12887d892c53ac4380c00
Bug detailed description:
-------------------------
This happens on Sandybridge with the drm-intel-next-queued kernel. It works well
on the drm-intel-fixes kernel.
Bisect shows: 05407ff889ceebe383aa5907219f86582ef96b72 is the first bad commit.
commit 05407ff889ceebe383aa5907219f86582ef96b72
Author: Mika Kuoppala &lt;<a href="mailto:mika.kuoppala@linux.intel.com">mika.kuoppala@linux.intel.com</a>&gt;
AuthorDate: Thu May 30 09:04:29 2013 +0300
Commit: Daniel Vetter &lt;<a href="mailto:daniel.vetter@ffwll.ch">daniel.vetter@ffwll.ch</a>&gt;
CommitDate: Mon Jun 3 10:58:21 2013 +0200
drm/i915: detect hang using per ring hangcheck_score
Keep track of ring seqno progress and if there are no
progress detected, declare hang. Use actual head (acthd)
to distinguish between ring stuck and batchbuffer looping
situation. Stuck ring will be kicked to trigger progress.
This commit adds a hard limit for batchbuffer completion time.
If batchbuffer completion time is more than 4.5 seconds,
the gpu will be declared hung.
Review comment from Ben which nicely clarifies the semantic change:
"Maybe I'm just stating the functional changes of the patch, but in case
they were unintended here is what I see as potential issues:
1. "If ring B is waiting on ring A via semaphore, and ring A is making
progress, albeit slowly - the hangcheck will fire. The check will
determine that A is moving, however ring B will appear hung because
the ACTHD doesn't move. I honestly can't say if that's actually a
realistic problem to hit it probably implies the timeout value is too
low.
2. "There's also another corner case on the kick. If the seqno = 2
(though not stuck), and on the 3rd hangcheck, the ring is stuck, and
we try to kick it... we don't actually try to find out if the kick
helped"
v2: use atchd to detect stuck ring from loop (Ben Widawsky)
v3: Use acthd to check when ring needs kicking.
Declare hang on third time in order to give time for
kick_ring to take effect.
v4: Update commit msg
output:
filling ring
waiting
done waiting, check dmesg
dmesg:
[60161.225096] [drm:i915_driver_open],
[60161.225116] [drm:intel_crtc_set_config], [CRTC:3] [FB:27] #connectors=1 (x y) (0 0)
[60161.225120] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60161.225122] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[60161.225124] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60161.225129] [drm:i915_driver_open],
[60171.239225] [drm:intel_crtc_set_config], [CRTC:3] [FB:27] #connectors=1 (x y) (0 0)
[60171.239231] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60171.239233] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[60171.239235] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60171.239254] [drm:i915_driver_open],
[60171.239259] [drm:intel_crtc_set_config], [CRTC:3] [FB:27] #connectors=1 (x y) (0 0)
[60171.239261] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60171.239262] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[60171.239263] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60171.239267] [drm:i915_driver_open],
[60176.787935] [drm:i915_hangcheck_elapsed] *ERROR* bsd ring: stuck on addr 0x3b4
[60176.788236] [drm:i915_error_work_func], resetting chip
[60176.790751] [drm:gm45_get_vblank_counter], trying to get vblank count for disabled pipe B
[60176.790883] [drm:ironlake_update_plane], Writing base 00072000 00000000 0 0 5120
[60176.790940] [drm:intel_crtc_set_config], [CRTC:3] [FB:27] #connectors=1 (x y) (0 0)
[60176.790944] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60176.790946] [drm:intel_crtc_set_config], [CRTC:5] [NOFB]
[60176.790947] [drm:intel_modeset_stage_output_state], [CONNECTOR:7:VGA-1] to [CRTC:3]
[60176.801235] [drm:gmbus_xfer], GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)
[60176.801239] [drm:drm_do_probe_ddc_edid], drm: skipping non-existent adapter i915 gmbus dpc
[60176.801259] [drm:intel_ironlake_crt_detect_hotplug], ironlake hotplug adpa=0x83f40018, result 1
[60176.801261] [drm:intel_crt_detect], CRT detected via hotplug
[60176.801909] [drm:gmbus_xfer], GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)
[60176.801912] [drm:drm_do_probe_ddc_edid], drm: skipping non-existent adapter i915 gmbus dpc
[60176.801923] [drm:intel_ironlake_crt_detect_hotplug], ironlake hotplug adpa=0x83f40018, result 1
[60176.801924] [drm:intel_crt_detect], CRT detected via hotplug
[60176.802678] [drm:gmbus_xfer], GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)
[60176.802680] [drm:drm_do_probe_ddc_edid], drm: skipping non-existent adapter i915 gmbus dpc
[60176.802691] [drm:intel_ironlake_crt_detect_hotplug], ironlake hotplug adpa=0x83f40018, result 1
[60176.802692] [drm:intel_crt_detect], CRT detected via hotplug
Reproduce steps:
----------------
1. ./gem_hangcheck_forcewake
2. dmesg -r | egrep "&lt;[1-6]&gt;" | grep drm</pre>
</div>
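<div>
<pre>The dmesg check in the reproduce steps can also be approximated in a short
script. This is a minimal sketch in Python, assuming each log message sits on
a single (unwrapped) line and follows the format seen in the dmesg excerpt of
this report. HANGCHECK_RE, find_hangcheck_errors and SAMPLE are hypothetical
names introduced only for illustration.

```python
import re

# Pattern modelled on the hangcheck error line quoted in this report.
HANGCHECK_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:i915_hangcheck_elapsed\]\s+"
    r"\*ERROR\*\s+(\w+) ring: stuck on addr (0x[0-9a-fA-F]+)"
)


def find_hangcheck_errors(lines):
    """Return (timestamp, ring, address) tuples for each hangcheck error line."""
    hits = []
    for line in lines:
        match = HANGCHECK_RE.search(line)
        if match:
            timestamp, ring, addr = match.groups()
            hits.append((timestamp, ring, int(addr, 16)))
    return hits


# Three lines copied from the dmesg excerpt in this report.
SAMPLE = [
    "[60171.239267] [drm:i915_driver_open],",
    "[60176.787935] [drm:i915_hangcheck_elapsed] *ERROR* bsd ring: stuck on addr 0x3b4",
    "[60176.788236] [drm:i915_error_work_func], resetting chip",
]

for timestamp, ring, addr in find_hangcheck_errors(SAMPLE):
    print(f"{timestamp}: {ring} ring stuck at {addr:#x}")
# prints "60176.787935: bsd ring stuck at 0x3b4"
```

The pattern uses search() rather than match(), so a raw priority prefix such
as the one printed by "dmesg -r" in front of the timestamp should not prevent
a hit.</pre>
</div>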
</p>
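<div>
<pre>The detection scheme described in the commit message quoted above can be
sketched roughly as follows. This is a simplified illustration in Python, not
the i915 implementation: the 1.5 s check period is an assumption derived from
the 4.5 s limit and the "third time" rule, and RingState, hangcheck and the
verdict strings are hypothetical names.

```python
from dataclasses import dataclass

HANGCHECK_PERIOD_S = 1.5  # assumed: 4.5 s limit spread over 3 checks
HANG_THRESHOLD = 3        # "declare hang on third time"


@dataclass
class RingState:
    name: str
    seqno: int = -1   # last seqno seen by the hangcheck
    acthd: int = -1   # last active head (ACTHD) seen by the hangcheck
    score: int = 0    # consecutive checks without seqno progress


def hangcheck(ring, seqno, acthd):
    """Run one hangcheck tick for a ring and return a verdict string."""
    if seqno != ring.seqno:
        # The seqno advanced, so the ring is making progress.
        ring.score = 0
        verdict = "active"
    else:
        ring.score += 1
        if ring.score >= HANG_THRESHOLD:
            verdict = "hung"   # third check in a row without progress
        elif acthd == ring.acthd:
            verdict = "kick"   # head did not move: ring looks stuck
        else:
            verdict = "busy"   # head moved: batchbuffer may be looping
    ring.seqno, ring.acthd = seqno, acthd
    return verdict


# A ring whose seqno and ACTHD never change after the first sample.
bsd = RingState("bsd")
for tick in range(4):
    verdict = hangcheck(bsd, seqno=2, acthd=0x3B4)
    print(f"t={tick * HANGCHECK_PERIOD_S:.1f}s {verdict}")
# prints: t=0.0s active, t=1.5s kick, t=3.0s kick, t=4.5s hung (one per line)
```

A ring that is only waiting on a semaphore would present the same unchanged
seqno and ACTHD to this check, which is the first concern raised in the quoted
review comment.</pre>
</div>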
</body>
</html>