more intel drm issues (was Re: [git pull] drm intel only fixes)

Chris Wilson chris at
Thu Jan 20 02:25:40 PST 2011

On Wed, 19 Jan 2011 22:22:48 -0800, Linus Torvalds <torvalds at> wrote:
> On Wed, Jan 19, 2011 at 8:55 PM, Jeff Chua <jeff.chua.linux at> wrote:
> >
> > Rafael send out two patches earlier. Could be related. I was facing
> > issue during resume.
> No, I'm aware of the rcu-synchronize thing, this isn't it. This is
> really at the suspend stage, and I had bisected it down to the drm
> changes.
> In fact, by now I have bisected it down to a single commit. It's
> another merge commit, which makes me a bit nervous (I bisected another
> issue today, and it turned out to simply not be repeatable).
> But this time the merge commit actually has a real conflict that got
> fixed up in the merge, and the code around the conflict waits for
> three seconds, and three seconds is also exactly how long the delay at
> suspend time is. So I get the feeling that this time it's a real
> issue, and what happened was that the merge may have been a mismerge.
> Chris: as of commit 8d5203ca6253 ("Merge branch 'drm-intel-fixes' into
> drm-intel-next") I'm getting that 3-second delay at suspend time. And
> the merge diff looks like this:
>  +	struct drm_device *dev = ring->dev;
>  +	struct drm_i915_private *dev_priv = dev->dev_private;
>   	unsigned long end;
>  -	drm_i915_private_t *dev_priv = dev->dev_private;
>   	u32 head;
> - 	head = intel_read_status_page(ring, 4);
> - 	if (head) {
> - 		ring->head = head & HEAD_ADDR;
> - 		ring->space = ring->head - (ring->tail + 8);
> - 		if (ring->space < 0)
> - 			ring->space += ring->size;
> - 		if (ring->space >= n)
> - 			return 0;
> - 	}
> -
>   	trace_i915_ring_wait_begin (dev);
>   	end = jiffies + 3 * HZ;
>   	do {
> and that whole do-loop with a 3-second timeout makes me *very*
> suspicious. It used to have (in _one_ of the parent branches) that
> code before it to return early if there was space in the ring, now it
> doesn't any more - and that merge co-incides with my suspend suddenly
> taking 3 seconds.
> The same check that is deleted does exist inside the loop too, but
> there it has some extra code it in (compare to "actual_head" and so
> on), so I wonder if the fast-case was somehow hiding this issue.

Right, the autoreported HEAD may have been already reset to 0 and so hit
the wraparound bug which caused it to exit early without actually
quiescing the ringbuffer.

Another possibility is that I added a 3s timeout waiting for a request if
IRQs were suspended:

commit b5ba177d8d71f011c23b1cabec99fdaddae65c4d
Author: Chris Wilson <chris at>
Date:   Tue Dec 14 12:17:15 2010 +0000

    drm/i915: Poll for seqno completion if IRQ is disabled

Both of those I think are symptoms of another problem, that perhaps during
suspend we are shutting down parts of the chip before idling?

Chris Wilson, Intel Open Source Technology Centre

More information about the dri-devel mailing list