[PATCH v2 8/8] drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time
Alex Deucher
alexdeucher at gmail.com
Thu Jun 20 13:00:23 UTC 2024
On Thu, Jun 20, 2024 at 3:10 AM Maxime Ripard <mripard at kernel.org> wrote:
>
> Hi,
>
> On Wed, Jun 19, 2024 at 09:53:12AM GMT, Alex Deucher wrote:
> > On Wed, Jun 19, 2024 at 9:50 AM Alex Deucher <alexdeucher at gmail.com> wrote:
> > >
> > > On Tue, Jun 18, 2024 at 7:53 PM Doug Anderson <dianders at chromium.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Tue, Jun 18, 2024 at 3:00 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > >
> > > > > On Tue, Jun 18, 2024 at 5:40 PM Doug Anderson <dianders at chromium.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > >
> > > > > > On Mon, Jun 17, 2024 at 8:01 AM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jun 12, 2024 at 6:37 PM Douglas Anderson <dianders at chromium.org> wrote:
> > > > > > > >
> > > > > > > > Based on grepping through the source code this driver appears to be
> > > > > > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > > > > > time. Among other things, this means that if a panel is in use that it
> > > > > > > > won't be cleanly powered off at system shutdown time.
> > > > > > > >
> > > > > > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > > > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > > > > > instance overview" in drm_drv.c.
> > > > > > > >
> > > > > > > > Suggested-by: Maxime Ripard <mripard at kernel.org>
> > > > > > > > Cc: Alex Deucher <alexander.deucher at amd.com>
> > > > > > > > Cc: Christian König <christian.koenig at amd.com>
> > > > > > > > Cc: Xinhui Pan <Xinhui.Pan at amd.com>
> > > > > > > > Signed-off-by: Douglas Anderson <dianders at chromium.org>
> > > > > > > > ---
> > > > > > > > This commit is only compile-time tested.
> > > > > > > >
> > > > > > > > ...and further, I'd say that this patch is more of a plea for help
> > > > > > > > than a patch I think is actually right. I'm _fairly_ certain that
> > > > > > > > drm/amdgpu needs this call at shutdown time but the logic is a bit
> > > > > > > > hard for me to follow. I'd appreciate if anyone who actually knows
> > > > > > > > what this should look like could illuminate me, or perhaps even just
> > > > > > > > post a patch themselves!
> > > > > > >
> > > > > > > > I'm not sure whether this patch makes sense. The driver doesn't really
> > > > > > > do a formal tear down in its shutdown routine, it just quiesces the
> > > > > > > hardware. What are the actual requirements of the shutdown function?
> > > > > > > In the past when we did a full driver tear down in shutdown, it
> > > > > > > delayed the shutdown sequence and users complained.
> > > > > >
> > > > > > The "inspiration" for this patch is to handle panels properly.
> > > > > > Specifically, panels often have several power/enable signals going to
> > > > > > them and often have requirements that these signals are powered off in
> > > > > > the proper order with the proper delays between them. While we can't
> > > > > > always do so when the system crashes / reboots in an uncontrolled way,
> > > > > > panel manufacturers / HW Engineers get upset if we don't power things
> > > > > > off properly during an orderly shutdown/reboot. When panels are
> > > > > > powered off badly it can cause garbage on the screen and, so I've been
> > > > > > told, can even cause long term damage to the panels over time.
> > > > > >
> > > > > > In Linux, some panel drivers have tried to ensure a proper poweroff of
> > > > > > the panel by handling the shutdown() call themselves. However, this is
> > > > > > ugly and panel maintainers want panel drivers to stop doing it. We
> > > > > > have removed the code doing this from most panels now [1]. Instead the
> > > > > > assumption is that the DRM modeset drivers should be calling
> > > > > > drm_atomic_helper_shutdown() which will make sure panels get an
> > > > > > orderly shutdown.
> > > > > >
> > > > > > For a lot more details, see the cover letter [2] which then contains
> > > > > > links to even more discussions about the topic.
> > > > > >
> > > > > > [1] https://lore.kernel.org/r/20240605002401.2848541-1-dianders@chromium.org
> > > > > > [2] https://lore.kernel.org/r/20240612222435.3188234-1-dianders@chromium.org
> > > > >
> > > > > I don't think it's an issue. We quiesce the hardware as if we were
> > > > > about to suspend the system (e.g., S3). For the display hardware we
> > > > > call drm_atomic_helper_suspend() as part of that sequence.
> > > >
> > > > OK. It's no skin off my teeth and we can drop this patch if you're
> > > > convinced it's not needed. From the point of view of someone who has
> > > > no experience with this driver it seems weird to me that it would use
> > > > drm_atomic_helper_suspend() at shutdown time instead of the documented
> > > > drm_atomic_helper_shutdown(), but if it works for everyone then I'm
> > > > not gonna complain.
> > >
> > > I think the problem is that it is not clear exactly what the
> > > expectations are around the PCI shutdown callback. The documentation
> > > says:
> > >
> > > "Hook into reboot_notifier_list (kernel/sys.c). Intended to stop any
> > > idling DMA operations. Useful for enabling wake-on-lan (NIC) or
> > > changing the power state of a device before reboot. e.g.
> > > drivers/net/e100.c."
> >
> > Arguably, there is no requirement to even touch the display hardware
> > at all. In theory you could just leave the display hardware as is in
> > the current state. The system will either be rebooting or powering
> > down anyway.
>
> I think it mostly boils down to a cultural mismatch :)
>
> Doug works on panels for ARM systems, where devices need (and need to
> handle) many more resources than is typical on a system with an AMD
> GPU.
>
> So, for the kind of hardware Doug usually deals with, we definitely need
> the shutdown hook to make sure the regulators, GPIOs, etc. supplying the
> panel are properly shut down.
>
> And panels usually tied to AMD GPUs probably don't need any of that.

Makes sense. I think drm_atomic_helper_suspend() is a viable
alternative if drivers want to leverage their existing suspend code.
I could write up a doc patch unless there is reason to prefer the
shutdown variant.

Alex