[Freedreno] [PATCH] drm/msm/dpu: ensure device suspend happens during PM sleep

Doug Anderson dianders at chromium.org
Wed May 27 22:11:00 UTC 2020


Hi,

On Fri, May 15, 2020 at 9:37 AM Doug Anderson <dianders at chromium.org> wrote:
>
> Hi,
>
> On Fri, May 15, 2020 at 5:06 AM <kalyan_t at codeaurora.org> wrote:
> >
> > On 2020-05-14 21:47, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Fri, May 1, 2020 at 6:31 AM Kalyan Thota <kalyan_t at codeaurora.org>
> > > wrote:
> > >>
> > >> "The PM core always increments the runtime usage counter
> > >> before calling the ->suspend() callback and decrements it
> > >> after calling the ->resume() callback"
> > >>
> > >> DPU and DSI are managed as runtime devices. When
> > >> suspend is triggered, PM core adds a refcount on all the
> > >> devices and calls device suspend, since usage count is
> > >> already incremented, runtime suspend was not getting called
> > >> and it kept the clocks on which resulted in target not
> > >> entering into XO shutdown.
> > >>
> > >> Add changes to force suspend on runtime devices during pm sleep.
> > >>
> > >> Changes in v1:
> > >>  - Remove unnecessary checks in the function
> > >>     _dpu_kms_disable_dpu (Rob Clark).
> > >>
> > >> Changes in v2:
> > >>  - Avoid using suspend_late to reset the usagecount
> > >>    as suspend_late might not be called during suspend
> > >>    call failures (Doug).
> > >>
> > >> Changes in v3:
> > >>  - Use force suspend instead of managing device usage_count
> > >>    via runtime put and get API's to trigger callbacks (Doug).
> > >>
> > >> Changes in v4:
> > >>  - Check the return values of pm_runtime_force_suspend and
> > >>    pm_runtime_force_resume API's and pass appropriately (Doug).
> > >>
> > >> Changes in v5:
> > >
> > > Can you please put the version number properly in your subject?  It's
> > > really hard to tell one version of your patch from another.
> > >
> > >
> > >>  - With v4 patch, test cycle has uncovered issues in device resume.
> > >>
> > >>    On bubs: cmd tx failures were seen as SW is sending panel off
> > >>    commands when the dsi resources are turned off.
> > >>
> > >>    Upon suspend, DRM driver will issue a NULL composition to the
> > >>    dpu, followed by turning off all the HW blocks.
> > >>
> > >>    v5 changes will serialize the NULL commit and resource unwinding
> > >>    by handling them under PM prepare and PM complete phases there by
> > >>    ensuring that clks are on when panel off commands are being
> > >>    processed.
> > >
> > > I'm still most definitely not an expert in how all the DRM pieces all
> > > hook up together, but the solution you have in this patch seems wrong
> > > to me.  As far as I can tell the "prepare" state isn't supposed to be
> > > actually doing the suspend work and here that's exactly what you're
> > > doing.  I think you should find a different solution to ensure
> > > ordering is correct.
> > >
> > > -Doug
> > >
> >
> > Hi,
>
> Quite honestly I'm probably not the right person to be reviewing this
> code.  I mostly just noticed one of your early patches and it looked
> strange to me.  Hopefully someone with actual experience in how all
> the DRM components work together can actually review and see if this
> makes sense.  Maybe Sean would know better?
>
> That being said, let me at least look at what you're saying...
>
>
> > Prepare and Complete are callbacks defined as part of Sleep and Resume
> > sequence
> >
> > Entering PM SUSPEND the phases are : prepare --> suspend -->
> > suspend_late --> suspend_noirq.
> > While leaving PM SUSPEND the phases are: resume_noirq --> resume_early
> > --> resume --> complete.
>
> Sure, it's part of the sequence.  It's also documented in pm.h as:
>
>  * The principal role of this callback is to prevent new children of
>  * the device from being registered after it has returned (the driver's
>  * subsystem and generally the rest of the kernel is supposed to prevent
>  * new calls to the probe method from being made too once @prepare() has
>  * succeeded).
>
> It does not feel like that matches your usage of this call.
>
>
> > The reason to push drm suspend handling to PM prepare phase is that
> > parent here will trigger a modeset to turn off the timing and
> > subsequently the panel.
> > the child devices should not turn of their clocks before parent unwinds
> > the composition. Hence they are serialized as per the sequence mentioned
> > above.
>
> So the general model in Linux is that children suspend before their
> parents, right?  So you're saying that, in this case, the parent needs
> to act on the child before the child suspends.  Is that correct?
>
> Rather than hijacking the prepare/complete, I'd be at least slightly
> inclined to move the other driver to turn off its clocks in
> suspend_late and to turn them back on in resume_early?  That seems to
> be what was done in "analogix_dp-rockchip.c" to solve a similar
> problem.
>
>
> > A similar approach is taken by other driver that use drm framework. In
> > this driver, the device registers for prepare and complete callbacks to
> > handle drm_suspend and drm_resume.
> > https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/exynos/exynos_drm_drv.c#L163
>
> OK, if there is another driver in DRM then I guess I won't object too
> strongly.  Note that when searching for other drivers I noticed this
> bit in todo.rst:
>
> * Most drivers (except i915 and nouveau) that use
> * drm_atomic_helper_suspend/resume() can probably be converted to use
> * drm_mode_config_helper_suspend/resume(). Also there's still open-coded version
> * of the atomic suspend/resume code in older atomic modeset drivers.
>
> Does anything get fixed if you do that?  It seems like it'd cleanup
> your code a bit so maybe worth doing anyway...
>
> ---
>
> I guess the last question I'd want resolved is why you have this asymmetry:
>
> SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, msm_pm_resume)
>
> Why couldn't you use pm_runtime_force_resume()?

I'm curious if you had answers to any of the questions I posed in my review.

-Doug


More information about the Freedreno mailing list