[Intel-gfx] Panic after S3 resume and modeset with MST
Takashi Iwai
tiwai at suse.de
Wed Mar 29 13:54:43 UTC 2017
On Wed, 29 Mar 2017 15:34:15 +0200,
Ville Syrjälä wrote:
>
> On Wed, Mar 29, 2017 at 03:10:09PM +0200, Takashi Iwai wrote:
> > On Mon, 27 Mar 2017 18:02:13 +0200,
> > Takashi Iwai wrote:
> > >
> > > Hi,
> > >
> > > the upstream fix a16b7658f4e0d4aec9bc3e75a5f0cc3f7a3a0422
> > > drm/i915: Call intel_dp_mst_resume() before resuming displays
> > >
> > > seems to trigger a kernel panic when some modeset change happens after
> > > S3 resume. The details are found in openSUSE bugzilla,
> > > https://bugzilla.suse.com/show_bug.cgi?id=1029634
> > >
> > > In short, the following procedure causes a kernel panic (supposedly)
> > > almost 100% on Dell Latitude with Skylake with MST DP on dock:
> > >
> > > - Boot with a docking station, DP-1 connected.
> > > - Login on X
> > > - xrandr --output eDP-1 --primary --auto --output DP-1-1 --auto --left-of eDP-1
> > > ==> This changes the mode.
> > > - Suspend ("systemctl suspend" in my case), and close the lid.
> > > - Remove from the dock (keep the lid closed).
> > > - Open the lid, which resumes automatically. It works.
> > > - Suspend again.
> > > - Connect to the dock again (keep the lid closed).
> > > - Open the lid, which resumes automatically. It's still OK.
> > > - xrandr --output eDP-1 --primary --auto --output DP-1-1 --auto --left-of eDP-1
> > > ==> Now the kernel freezes.
> > >
> > > Reverting the commit mentioned above fixes the problem.
> > >
> > > The problem is present in all versions I tested. The reported kernel
> > > in the Bugzilla is 4.4.x-based one, but the issue is seen in 4.11-rc3,
> > > too. Note that the S3 resume itself works in 4.11-rc3; the kernel
> > > panic happens when invoking xrandr manually after that.
> > >
> > > Unfortunately, I couldn't get a kernel panic message so far. kdump
> > > didn't work well in this case for some reason. There are some
> > > screenshots taken by the original reporter (who could switch VT
> > > beforehand), but I don't know whether they help.
> > >
> > > If you have any hints for further debugging, it'd be highly
> > > appreciated.
> >
> > It seems that the patch below works around the problem.
> > Can anyone enlighten me about what's going on there?
> >
> >
> > thanks,
> >
> > Takashi
> >
> > -- 8< --
> > From: Takashi Iwai <tiwai at suse.de>
> > Subject: [PATCH] drm/i915: Fix crash after S3 resume with DP MST mode change
> >
> > We've got a bug report showing that Skylake Dell machines with a
> > docking station cause a kernel panic after S3 resume and modeset.
> > The details are found in the openSUSE bugzilla entry below. The
> > typical test procedure is:
> >
> > - Laptop is Dell Latitude with eDP (1366x768)
> > - Boot with docking station connected to a DP (1920x1080)
> > - Login, change the mode via
> > xrandr --output eDP-1 --auto --output DP-1-1 --auto --left-of eDP-1
> > - Suspend, and close the lid after the suspend
> > (or close the lid to trigger the suspend)
> > - Undock while keeping the lid closed.
> > - Open the lid, which triggers the resume;
> > the machine wakes up well, and X shows up. No problem, so far.
> > - Suspend again, close the lid.
> > - Dock again while keeping the lid closed.
> > - Open the lid, triggering the resume; this wakes up still fine.
> > - At this moment, run xrandr again to re-setup DP-1
> > xrandr --output eDP-1 --auto --output DP-1-1 --auto --left-of eDP-1
> > ==> This triggers a hard crash.
> >
> > I could bisect it, and this led to the commit a16b7658f4e0
> > ("drm/i915: Call intel_dp_mst_resume() before resuming displays").
> >
> > Basically the commit just shuffles the calls of intel_display_resume()
> > and intel_dp_mst_resume(). So as a workaround, I tried to split
> > intel_dp_mst_resume() call to postpone the suspected code (the
> > invocation of intel_dp_check_mst_status()), then bingo, this cured the
> > problem.
> >
> > But don't ask me *why* this fixes it. It's still in a cargo-cult state.
> >
> > Fixes: a16b7658f4e0 ("drm/i915: Call intel_dp_mst_resume() before resuming displays")
> > Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1029634
> > Signed-off-by: Takashi Iwai <tiwai at suse.de>
> > ---
> > drivers/gpu/drm/i915/i915_drv.c | 5 ++++-
> > drivers/gpu/drm/i915/intel_dp.c | 20 +++++++++++++++++++-
> > drivers/gpu/drm/i915/intel_drv.h | 3 ++-
> > 3 files changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 1c75402a59c1..62c40090ceed 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1559,6 +1559,7 @@ static int i915_suspend_switcheroo(struct drm_device *dev, pm_message_t state)
> > static int i915_drm_resume(struct drm_device *dev)
> > {
> > struct drm_i915_private *dev_priv = to_i915(dev);
> > + int mst_pending;
> > int ret;
> >
> > disable_rpm_wakeref_asserts(dev_priv);
> > @@ -1608,10 +1609,12 @@ static int i915_drm_resume(struct drm_device *dev)
> > dev_priv->display.hpd_irq_setup(dev_priv);
> > spin_unlock_irq(&dev_priv->irq_lock);
> >
> > - intel_dp_mst_resume(dev);
> > + mst_pending = intel_dp_mst_resume(dev);
> >
> > intel_display_resume(dev);
> >
> > + intel_dp_mst_resume_post(dev, mst_pending);
> > +
> > drm_kms_helper_poll_enable(dev);
> >
> > /*
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > index d1670b8afbf5..fc5ea900e6f3 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -6027,9 +6027,10 @@ void intel_dp_mst_suspend(struct drm_device *dev)
> > }
> > }
> >
> > -void intel_dp_mst_resume(struct drm_device *dev)
> > +int intel_dp_mst_resume(struct drm_device *dev)
> > {
> > struct drm_i915_private *dev_priv = to_i915(dev);
> > + int pending = 0;
> > int i;
> >
> > for (i = 0; i < I915_MAX_PORTS; i++) {
> > @@ -6041,6 +6042,23 @@ void intel_dp_mst_resume(struct drm_device *dev)
> >
> > ret = drm_dp_mst_topology_mgr_resume(&intel_dig_port->dp.mst_mgr);
> > if (ret)
> > + pending |= 1 << i;
> > + }
> > +
> > + return pending;
> > +}
> > +
> > +void intel_dp_mst_resume_post(struct drm_device *dev, int pending)
> > +{
> > + struct drm_i915_private *dev_priv = to_i915(dev);
> > + int i;
> > +
> > + for (i = 0; i < I915_MAX_PORTS; i++) {
> > + struct intel_digital_port *intel_dig_port =
> > + dev_priv->hotplug.irq_port[i];
> > + if (!intel_dig_port || !intel_dig_port->dp.can_mst)
> > + continue;
> > + if (pending & (1 << i))
> > intel_dp_check_mst_status(&intel_dig_port->dp);
> > }
> > }
>
> The whole MST resume is a bit of chicken and egg type of situation. We
> need the HPD interrupts to resume the previous state, but we don't want
> to actually process real hotplugs until we've done the resume. The
> current code is definitely broken IMO.
>
> But I'm not really sure why this patch fixes things because the HPD
> processing that will occur when we talk to the sink during the display
> resume should also call intel_dp_check_mst_status().

Actually, just dropping intel_dp_check_mst_status() calls in
intel_dp_mst_resume() seems enough to fix the problem.
Below is the v2 patch doing that. Does it make more sense?

thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai at suse.de>
Subject: [PATCH] drm/i915: Fix crash after S3 resume with DP MST mode change
(v2)

We've got a bug report showing that Skylake Dell machines with a
docking station cause a kernel panic after S3 resume and modeset.
The details are found in the openSUSE bugzilla entry below. The
typical test procedure is:

- Laptop is Dell Latitude with eDP (1366x768)
- Boot with docking station connected to a DP (1920x1080)
- Login, change the mode via
  xrandr --output eDP-1 --auto --output DP-1-1 --auto --left-of eDP-1
- Suspend, and close the lid after the suspend
  (or close the lid to trigger the suspend)
- Undock while keeping the lid closed.
- Open the lid, which triggers the resume;
  the machine wakes up well, and X shows up. No problem, so far.
- Suspend again, close the lid.
- Dock again while keeping the lid closed.
- Open the lid, triggering the resume; this wakes up still fine.
- At this moment, run xrandr again to re-setup DP-1
  xrandr --output eDP-1 --auto --output DP-1-1 --auto --left-of eDP-1
  ==> This triggers a hard crash.

I could bisect it, and this led to the commit a16b7658f4e0
("drm/i915: Call intel_dp_mst_resume() before resuming displays").

This patch tries to work around the crash by ignoring the failed ports
from drm_dp_mst_topology_mgr_resume(). They should be handled by the
HPD processing later anyway.

v1->v2: just ignore the drm_dp_mst_topology_mgr_resume() error codes
instead of postponing.

Fixes: a16b7658f4e0 ("drm/i915: Call intel_dp_mst_resume() before resuming displays")
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1029634
Signed-off-by: Takashi Iwai <tiwai at suse.de>
---
drivers/gpu/drm/i915/intel_dp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index d1670b8afbf5..a6c0f0ac16eb 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -6041,6 +6041,7 @@ void intel_dp_mst_resume(struct drm_device *dev)
ret = drm_dp_mst_topology_mgr_resume(&intel_dig_port->dp.mst_mgr);
if (ret)
- intel_dp_check_mst_status(&intel_dig_port->dp);
+ DRM_DEBUG_KMS("DP MST resume failed for port-%c\n",
+ port_name(intel_dig_port->port));
}
}
--
2.11.1
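
For comparison with v2, the "postpone" idea from v1 (remember which
ports failed to resume in a bitmask, then run the status check only
for those ports after the display resume) can be sketched in isolation
as below. This is only an illustration under stated assumptions, not
the driver code: struct mock_port, MAX_PORTS, mock_mst_resume() and
mock_mst_resume_post() are hypothetical stand-ins, and the
resume_fails flag merely simulates an error return from
drm_dp_mst_topology_mgr_resume().

```c
#include <stdio.h>

#define MAX_PORTS 6 /* hypothetical; stands in for I915_MAX_PORTS */

/* Hypothetical stand-in for a digital port; not the i915 structure. */
struct mock_port {
	int present;        /* a port is registered in this slot */
	int can_mst;        /* the port supports MST */
	int resume_fails;   /* simulates an error from the topology manager resume */
	int status_checked; /* set once the deferred status check has run */
};

/* Pass 1: try to resume each MST port and record failures in a bitmask. */
static unsigned int mock_mst_resume(struct mock_port *ports)
{
	unsigned int pending = 0;
	int i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (!ports[i].present || !ports[i].can_mst)
			continue;
		if (ports[i].resume_fails)
			pending |= 1u << i;
	}
	return pending;
}

/* Pass 2: after the display resume, check only the ports flagged in pass 1. */
static void mock_mst_resume_post(struct mock_port *ports, unsigned int pending)
{
	int i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (!ports[i].present || !ports[i].can_mst)
			continue;
		if (pending & (1u << i)) {
			ports[i].status_checked = 1;
			printf("port %d: deferred MST status check\n", i);
		}
	}
}
```

A caller would invoke mock_mst_resume() first, do the display resume,
and then pass the returned mask to mock_mst_resume_post(). v2 drops
the second pass entirely and only logs the failure.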