[Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate

Imre Deak imre.deak at intel.com
Wed Feb 25 10:33:34 PST 2015


On ti, 2015-02-24 at 20:00 +0100, Bjørn Mork wrote:
> Imre Deak <imre.deak at intel.com> writes:
> 
> > The poweroff handlers undo the actions of the thaw handlers. As the
> > original commit stated saving the registers is not needed there, but
> > it's also not a big overhead and there should be no problem doing it. We
> > are planning to optimize the hibernation sequence by for example not
> > shutting down the display between freeze and thaw, and also getting rid
> > of unnecessary steps at the power off phase. But before that we want to
> > actually unify things rather than having special cases, as maintaining
> > the special code paths caused already quite a lot of problems for us so
> > far.
> 
> That sounds like a worthy goal.  I don't understand what you hope to
> achieve by having a poweroff_late hook, since there aren't really
> anything useful left to do at the point it is called, but if you want a
> dummy callback there for code structure reasons then fine.

There are the following reasons for shutting down the device properly
during hibernation:
- ACPI mandates that the OSPM (the kernel in our case) puts all devices
into D3 that are not wake-up sources (i915 is not) (Kudos to Ville for
pointing this out)
- Embedded panels have a well defined shutdown sequence. We don't have
any good reason to not follow this, in fact for some panels the
subsequent reinitialization could be problematic in case of a hard
power-off. (Thanks to Jani for this info)

In fact the only reason why we didn't put the device into PCI D3 before
the patch you bisected, is kind of arbitrary. The PCI core puts every
device into D3 unless its driver saves the device's PCI state on its
own. Since before that patch we did save the state, but haven't put the
device into D3 it stayed in D0. That was definitely not intentional and
as such we depended on the BIOS to correct this for us afterwards.

> But you cannot just run around breaking stuff while slowly moving
> towards this goal over multiple releases... v3.19 is currently broken
> and that seems totally unnecessary.
> 
> In any case: You should have noticed this problem while testing your
> patches.  The breakage is 100% reproducible. Unfortunately I had to do a
> bisect to realize what you had done to the i915 driver, something I
> unfortunately didn't find time to do before v3.19 was released.  But I
> do find it unnecessary to release with such bugs.  Any attempt to
> exercise the code path you modified would have revealed the bug.

We tested these patches on numerous platforms and haven't seen the issue
you reported. Based on your feedback the current assumption is that this
is a bug in your BIOS, which tries to access the device despite it being
powered down. We're trying our best to test each change we commit on a
big set of platforms, but - especially on older hardware with random
BIOS versions/configurations - full coverage is not possible. So we
depend on reports like yours a lot to fill in the gaps.

> > Reverting the commit may hide some other issue, so before doing that
> > could you try the following patch:
> > http://lists.freedesktop.org/archives/intel-gfx/2015-February/060529.html
> 
> Makes no difference.  I assume that patch fixes an unrelated bug? The
> age and reported symptoms indicates so.  Note that I am reporting a
> regression introduced after v3.18, while that seems to fix a bug
> introduced in v3.17. Both v3.17 and v3.18 (including v3.18.6), as
> well as earlier releases, work fine for me.

Ok, thanks for trying.

> > If with that you still get the hang could you try on top of that the
> > patch below, first having only pci_set_power_state uncommented, then
> > both pci_set_power_state and pci_disable_device uncommented?
> 
> That patch fixes the problem, with only pci_set_power_state commented
> out.  Do you still want me to try with pci_disable_device() commented
> out as well?

No, but it would help if you could still try the two attached patch
separately, without any of the previous workarounds. Based on the
result, we'll follow up with a fix that adds for your specific platform
either of the attached workarounds or simply avoids putting the device
into D3 (corresponding to the patch you already tried).

Thanks,
Imre
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-i915-zero-PCI_COMMAND-at-the-end-of-hibernation.patch
Type: text/x-patch
Size: 783 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/intel-gfx/attachments/20150225/59ff0ced/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-drm-i915-use-D3cold-instead-of-D3hot-during-suspend-.patch
Type: text/x-patch
Size: 804 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/intel-gfx/attachments/20150225/59ff0ced/attachment-0001.bin>


More information about the Intel-gfx mailing list