[PATCH 2/2] drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.

Mario Kleiner mario.kleiner.de at gmail.com
Fri Feb 19 01:06:39 UTC 2016


This fixes a regression introduced in Linux 4.4.

This is a port of the same fix for radeon-kms in the
patch "drm/radeon: Don't hang in radeon_flip_work_func
on disabled crtc. (v2)"

Limit the amount of time amdgpu_flip_work_func can
delay programming a page flip, by both limiting the
maximum amount of time per wait cycle and the maximum
number of wait cycles. Continue the flip if the limit
is exceeded, even if that may result in a visual or
timing glitch.

This is to prevent a hang of page flips, as reported
in fdo bug #93746: Disconnecting a DisplayPort display
in parallel to a kms pageflip getting queued can cause
the following hang of page flips and thereby an unusable
desktop:

1. kms pageflip ioctl() queues pageflip -> queues execution
   of amdgpu_flip_work_func.

2. Hotunplug of display causes the driver to DPMS OFF
   the unplugged display. Display engine shuts down,
   scanout no longer moves, but stays at its resting
   position at start line of vblank.

3. amdgpu_flip_work_func executes while crtc is off, and
   due to the non-moving scanout position, the new flip
   delay code introduced into Linux 4.4 by
   commit 8e36f9d33c13 ("drm/amdgpu: Fixup hw vblank counter/ts..")
   enters an infinite wait loop.

4. After reconnecting the display, the pageflip continues
   to hang in 3. and the display doesn't update its view
   of the desktop.

This patch fixes the Linux 4.4 regression from fdo bug #93746

<https://bugs.freedesktop.org/show_bug.cgi?id=93746>

Reported-by: Bernd Steinhauser <linux at bernd-steinhauser.de>
Signed-off-by: Mario Kleiner <mario.kleiner.de at gmail.com>

Cc: <stable at vger.kernel.org> # 4.4+
Cc: Michel Dänzer <michel.daenzer at amd.com>
Cc: Alex Deucher <alexander.deucher at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index acd066d0..8297bc3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -72,8 +72,8 @@ static void amdgpu_flip_work_func(struct work_struct *__work)
 
 	struct drm_crtc *crtc = &amdgpuCrtc->base;
 	unsigned long flags;
-	unsigned i;
-	int vpos, hpos, stat, min_udelay;
+	unsigned i, repcnt = 4;
+	int vpos, hpos, stat, min_udelay = 0;
 	struct drm_vblank_crtc *vblank = &crtc->dev->vblank[work->crtc_id];
 
 	amdgpu_flip_wait_fence(adev, &work->excl);
@@ -96,7 +96,7 @@ static void amdgpu_flip_work_func(struct work_struct *__work)
 	 * In practice this won't execute very often unless on very fast
 	 * machines because the time window for this to happen is very small.
 	 */
-	for (;;) {
+	while (amdgpuCrtc->enabled && repcnt--) {
 		/* GET_DISTANCE_TO_VBLANKSTART returns distance to real vblank
 		 * start in hpos, and to the "fudged earlier" vblank start in
 		 * vpos.
@@ -114,10 +114,22 @@ static void amdgpu_flip_work_func(struct work_struct *__work)
 		/* Sleep at least until estimated real start of hw vblank */
 		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
 		min_udelay = (-hpos + 1) * max(vblank->linedur_ns / 1000, 5);
+		if (min_udelay > vblank->framedur_ns / 2000) {
+			/* Don't wait ridiculously long - something is wrong */
+			repcnt = 0;
+			break;
+		}
 		usleep_range(min_udelay, 2 * min_udelay);
 		spin_lock_irqsave(&crtc->dev->event_lock, flags);
 	};
 
+	if (!repcnt)
+		DRM_DEBUG_DRIVER("Delay problem on crtc %d: min_udelay %d, "
+				 "framedur %d, linedur %d, stat %d, vpos %d, "
+				 "hpos %d\n", work->crtc_id, min_udelay,
+				 vblank->framedur_ns / 1000,
+				 vblank->linedur_ns / 1000, stat, vpos, hpos);
+
 	/* do the flip (mmio) */
 	adev->mode_info.funcs->page_flip(adev, work->crtc_id, work->base);
 	/* set the flip status */
-- 
1.9.1



More information about the dri-devel mailing list