Dealing with suspend/resume failures

Holger Macht hmacht at suse.de
Sat Nov 4 08:38:56 PST 2006


On Fri 03. Nov - 11:56:20, David Zeuthen wrote:
> 
> Hi,
> 
> For your consideration here's a proposal to make the desktop bits better
> deal with Suspend/Resume failures. It's the output of some discussions
> with mclasen, rstrode, jrb and myself over the last few days.
> 
> Traditionally what happens today is that invoking Suspend() on the
> o.fd.Hal.Device.SystemPowerManagement interface either throws the
> exception 
> 
>  o.fd.Hal.Device.SystemPowerManagement.Unsupported
> 
> if it's not supported and probably something bad like
> 
>  o.fd.Hal.Device.Error
> 
> if the return code of the tool called (e.g. pm-suspend) is not zero.
> 
> In particular, we don't do anything intelligent to report back if
> anything goes wrong during the attempt to suspend; what's worse (as most
> bugs are with resume) we have no way of reporting errors back when the
> user has rebooted because resume doesn't work. This is pretty hard, I
> mean, there is no way to figure out if e.g. video came back.
> 
> This is a proposal to rectify that. 
> 
> I propose to 
> 
>  - For Suspend() let HAL capture the output of the tool being invoked
>    to a file /var/lib/hal/suspend-output
> 
>  - If the tool fails with exit code != 0, make Suspend() throw the
>    exception o.fd.Hal.Device.SystemPowerManagement.SuspendFailed.
>    Desktop policy managers can use the new SuspendGetLastError()
>    method to get details, see below.
> 
>  - Provide a new method "void SuspendClearLastError()". It will
>    delete the file /var/lib/hal/suspend-output
> 
>    This is to be called by the desktop policy manager (e.g. g-p-m) once
>    we know that the system is back in a workable state. How does this
>    work? g-p-m should call this when the session is unlocked since
>    this is evidence that the user can use his system. In the event
>    where the session is not locked... I don't know.. perhaps after
>    5 minutes or when g-p-m terminates?
> 
>  - Provide a new method "string SuspendGetLastError()" on the
>    o.fd.Hal.Device.SystemPowerManagement interface. If there is
>    no file /var/lib/hal/suspend-output then this method throws
>    an exception. Otherwise the output of /var/lib/hal/suspend-output
>    is returned.
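
To make the semantics of the quoted proposal concrete, here is a minimal
sketch in Python. It is not HAL code: the class SuspendLog, the exception
NoLastError and the method names are hypothetical stand-ins for Suspend(),
SuspendGetLastError() and SuspendClearLastError(), and the output file lives
in a temporary directory instead of /var/lib/hal/suspend-output.

```python
import os
import subprocess
import sys
import tempfile


class SuspendFailed(Exception):
    """Stand-in for o.fd.Hal.Device.SystemPowerManagement.SuspendFailed."""


class NoLastError(Exception):
    """Hypothetical name: raised when there is no suspend-output file."""


class SuspendLog:
    """Models the three proposed methods around one output file."""

    def __init__(self, path):
        # In the proposal this would be /var/lib/hal/suspend-output.
        self.path = path

    def suspend(self, command):
        """Run the suspend tool, capture its output, raise on exit code != 0."""
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        with open(self.path, "w") as output_file:
            output_file.write(result.stdout)
        if result.returncode != 0:
            raise SuspendFailed("tool exited with %d" % result.returncode)

    def get_last_error(self):
        """Return the captured output, or raise if the file does not exist."""
        try:
            with open(self.path) as output_file:
                return output_file.read()
        except FileNotFoundError:
            raise NoLastError("no suspend output recorded") from None

    def clear_last_error(self):
        """Delete the output file; called once the system is known to be usable."""
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass


if __name__ == "__main__":
    # A fake "suspend tool" that prints a message and fails.
    failing_tool = [
        sys.executable,
        "-c",
        "import sys; print('video did not resume'); sys.exit(1)",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        log = SuspendLog(os.path.join(tmp, "suspend-output"))
        try:
            log.suspend(failing_tool)
        except SuspendFailed:
            print(log.get_last_error(), end="")
        # What g-p-m would do after the session is unlocked.
        log.clear_last_error()
```

Because the file is only removed by clear_last_error(), output from a
suspend whose resume never completed is still there after a reboot, which
is the case the proposal is mainly after.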

I am currently thinking about the same issues because we need a
replacement for this due to the switch from powersaved to pm-utils. What I
don't like too much here is that it depends on HAL and adds new methods to
the SystemPowerManagement interface. I thought it would be enough to do
something like this:

When pm-utils starts up, it clears the suspend-output file. On suspend,
pm-utils writes a log to suspend-output, and on error it sends a signal
over the system bus, maybe on the interface org.freedesktop.PMUtils with a
signal called SuspendError (better names appreciated), passing the error
message written to the suspend-output file as the content of the signal.
This way, desktop applications can just catch the signal, display the
error message, create a bug report, etc. When the light doesn't go on
anymore, you still have the suspend-output file lying around, which people
can attach to their bug reports.
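
As a rough sketch of that flow (an illustration only, not pm-utils code):
the function names below are hypothetical, and emit_signal is a plain
Python callable standing in for whatever would really put the signal on
the system bus. The interface and signal names are the tentative ones
suggested above.

```python
import subprocess
import sys

# Tentative names from this mail; nothing here is an existing API.
INTERFACE = "org.freedesktop.PMUtils"
SIGNAL_NAME = "SuspendError"


def clear_log(log_path):
    """Done once at startup: truncate the suspend-output file."""
    open(log_path, "w").close()


def suspend_with_signal(command, log_path, emit_signal):
    """Run the suspend command, log its output, signal on failure.

    emit_signal(interface, name, message) is an assumed callback that
    stands in for sending a D-Bus signal on the system bus.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    # The log is written whether or not the command succeeded, so it is
    # still available if the machine never comes back properly.
    with open(log_path, "w") as log_file:
        log_file.write(result.stdout)
    if result.returncode != 0:
        # The error log itself is the content of the signal.
        emit_signal(INTERFACE, SIGNAL_NAME, result.stdout)
    return result.returncode


if __name__ == "__main__":
    import os
    import tempfile

    def print_signal(interface, name, message):
        # A desktop application would show a dialog here instead.
        print("%s.%s: %s" % (interface, name, message), end="")

    failing_tool = [
        sys.executable,
        "-c",
        "import sys; print('module unload failed'); sys.exit(1)",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "suspend-output")
        clear_log(log_path)
        suspend_with_signal(failing_tool, log_path, print_signal)
```

The desktop side only needs a signal handler; it never has to call back
into HAL to fetch or clear the error.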

I'm not sure if it makes a lot of sense to generate automated bug
reports. There are just too many scenarios where suspend fails in
pm-utils due to misconfiguration of the system.

What do you think about this signal approach? To me, it seems cleaner
because you don't have to add any new methods to HAL. And IMO it's even
simpler for desktop applications to catch the error log.

Regards,
	Holger

