Expecting to revert commit 55285e21f045 "fbdev/efifb: Release PCI device ..."
Deucher, Alexander
Alexander.Deucher at amd.com
Tue Dec 21 17:01:10 UTC 2021
[Public]
> -----Original Message-----
> From: Linus Torvalds <torvalds at linux-foundation.org>
> Sent: Monday, December 20, 2021 5:05 PM
> To: Imre Deak <imre.deak at intel.com>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Kai-Heng Feng
> <kai.heng.feng at canonical.com>
> Subject: Re: Expecting to revert commit 55285e21f045 "fbdev/efifb: Release
> PCI device ..."
>
> On Mon, Dec 20, 2021 at 1:33 PM Imre Deak <imre.deak at intel.com> wrote:
> >
> > amdgpu.runpm=0
>
> Hmmm.
>
> This does seem to "work", but not very well.
>
> With this, what seems to happen is odd: I lock the screen, wait, it goes "No
> signal, shutting down", but then doesn't actually shut down but stays black
> (with the backlight on). After _another_ five seconds or so, the monitor goes
> "No signal, shutting down" _again_, and at that point it actually does it.
>
> So it solves my immediate problem - in that yes, the backlight finally does
> turn off in the end - but it does seem to be still broken.
>
> I'm very surprised if no AMD drm developers can see this exact same thing.
> This is a very simple setup. The only possibly slightly less common thing is that
> I have two monitors, but while that is not necessarily the _most_ common
> setup in an absolute sense, I'd expect it to be very common among DRM
> developers..
>
> I guess I can just change the revert to just a
>
> -int amdgpu_runtime_pm = -1;
> +int amdgpu_runtime_pm = 0;
>
> instead. The auto-detect is apparently broken. Maybe it should only kick in
> for LVDS screens on actual laptops?
>
> Note: on my machine, I get that
>
> amdgpu 0000:49:00.0: amdgpu: Using BACO for runtime pm
>
> so maybe the other possible runtime pm models (ARPX and BOCO) are ok,
> and it's only that BACO case that is broken.
>
> I have no idea what any of those three things are - I'm just looking at the
> uses of that amdgpu_runtime_pm variable.
>
> amdgpu people: if you don't want that amdgpu_runtime_pm turned off by
> default, tell me something else to try.
For a little background, runtime PM support was added about 10 years ago, originally to support laptops with multiple GPUs (integrated and discrete). It's not specific to the display hardware. When the GPU is idle, it can be powered down completely. In the case of these laptops, it's D3 cold (managed by ACPI; we call this BOCO in AMD parlance - Bus Off, Chip Off), which powers off the dGPU completely (i.e., it disappears from the bus).

A few years ago we extended this to support desktop dGPUs as well, which support their own version of runtime D3 (called BACO in AMD parlance - Bus Active, Chip Off). The driver can put the chip into a low power state where everything except the bus interface is powered down (to avoid the device disappearing from the bus).

So this has worked for almost 2 years now on BACO capable parts and for a decade or more on BOCO systems. Unfortunately, changing the default runpm parameter setting would cause a flood of bug reports about runtime power management breaking and systems suddenly using more power.
Imre's commit (55285e21f045) fixes another commit (a6c0fd3d5a8b). Runtime pm was working on amdgpu prior to that commit. Is it possible there is still some race when amdgpu takes over from efifb? Does it work properly when all pm_runtime calls in efifb are removed, or if efifb is not enabled? Runtime pm for Polaris boards has been enabled by default since 4fdda2e66de0b, which predates both of those patches.
Alex