Dealing with suspend/resume failures

Richard Hughes hughsient at gmail.com
Fri Nov 3 09:21:47 PST 2006


On Fri, 2006-11-03 at 11:56 -0500, David Zeuthen wrote:
> Hi,
> 
> For your consideration, here's a proposal to make the desktop bits deal
> better with Suspend/Resume failures. It's the output of some discussions
> with mclasen, rstrode, jrb and myself over the last few days.

Sweet, thanks.

> Traditionally what happens today is that invoking Suspend() on the
> o.fd.Hal.Device.SystemPowerManagement interface either throws the
> exception 
> 
>  o.fd.Hal.Device.SystemPowerManagement.Unsupported
> 
> if it's not supported and probably something bad like
> 
>  o.fd.Hal.Device.Error
> 
> if the return code of the tool called (e.g. pm-suspend) is not zero.
> 
> In particular, we don't do anything intelligent to report back if
> anything goes wrong during the attempt to suspend; what's worse (as most
> bugs are with resume), we have no way of reporting errors back when the
> user has rebooted because resume didn't work. This is a hard problem:
> there is no way to figure out if e.g. video came back.
> 
> This is a proposal to rectify that. 
> 
> I propose to 
> 
>  - For Suspend() let HAL capture the output of the tool being invoked
>    to a file /var/lib/hal/suspend-output

You mean have pm-suspend quit with exit code 3 and report "Device failed
to sync", as an example?

>  - If the tool fails with exit code != 0, make Suspend() throw the
>    exception o.fd.Hal.Device.SystemPowerManagement.SuspendFailed.
>    Desktop policy managers can use the new SuspendGetLastError()
>    method to get details, see below.

Nice.
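
A minimal sketch of the two points above (capture the tool's output to a
file, and turn a non-zero exit code into SuspendFailed), written in Python
purely for illustration. The function name run_suspend_tool, the Python
exception class and the fake tool are assumptions; the demo writes to a
temporary path rather than /var/lib/hal/suspend-output, and nothing here
is HAL's actual implementation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class SuspendFailed(Exception):
    """Stand-in for o.fd.Hal.Device.SystemPowerManagement.SuspendFailed."""


def run_suspend_tool(argv, output_path):
    """Run the suspend tool, saving its stdout and stderr to output_path.

    Raises SuspendFailed if the tool exits with a non-zero code. The
    output file is left in place either way, so it can be read later.
    """
    with open(output_path, "wb") as log:
        result = subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT)
    if result.returncode != 0:
        raise SuspendFailed(f"tool exited with code {result.returncode}")


if __name__ == "__main__":
    # Hypothetical stand-in for pm-suspend that fails with exit code 3.
    fake_tool = [
        sys.executable,
        "-c",
        "import sys; print('Device failed to sync'); sys.exit(3)",
    ]
    out = Path(tempfile.mkdtemp()) / "suspend-output"
    try:
        run_suspend_tool(fake_tool, out)
    except SuspendFailed as err:
        print(err, "->", out.read_text().strip())
```

Note that this only covers failures the tool can report itself. If the
machine never comes back from resume, the tool never returns, and all that
is left is whatever was written to the file before the hang.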

>  - Provide a new method "void SuspendClearLastError()". It will
>    delete the file /var/lib/hal/suspend-output

Ick. What's wrong with just deleting this file the next time we do a
system action like shutting down, suspending, hibernating etc? If we
didn't fail, then the file won't exist.

Programs like g-p-m could just call "GetResumeStatus()" and if the file
exists, then this is reported to the desktop.
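
To make the alternative concrete, here is a rough Python sketch of the
auto-clear idea. The class PowerManagement and the method
begin_system_action are made up for illustration, and get_resume_status
is only a guess at what a "GetResumeStatus()" call might return (the
saved output, or nothing if the file does not exist).

```python
import tempfile
from pathlib import Path


class PowerManagement:
    """Toy model of the auto-clear alternative; not the HAL API."""

    def __init__(self, output_path):
        self.output_path = Path(output_path)

    def get_resume_status(self):
        """Return the saved output, or None if no failure is recorded."""
        if self.output_path.exists():
            return self.output_path.read_text()
        return None

    def begin_system_action(self):
        """Call at the start of a shutdown, suspend or hibernate.

        Any output left over from an earlier attempt is dropped here,
        so no separate clear method is needed.
        """
        self.output_path.unlink(missing_ok=True)


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "suspend-output"
    pm = PowerManagement(path)
    path.write_text("Device failed to sync\n")  # pretend a suspend failed
    print(repr(pm.get_resume_status()))
    pm.begin_system_action()                    # next suspend clears it
    print(repr(pm.get_resume_status()))
```

With this shape the desktop side only ever reads; the file's lifetime is
handled entirely by the next system action.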

>    This is to be called by the desktop policy manager (e.g. g-p-m) once
>    we know that the system is back in a workable state. How does this
>    work? g-p-m should call this when the session is unlocked since
>    this is evidence that the user can use his system. In the event
>    where the session is not locked... I don't know.. perhaps after
>    5 minutes or when g-p-m terminates?

Seems a bit messy to me.

>  - Provide a new method "string SuspendGetLastError()" on the
>    o.fd.Hal.Device.SystemPowerManagement interface. If there is
>    no file /var/lib/hal/suspend-output then this method throws
>    an exception. Otherwise the output of /var/lib/hal/suspend-output
>    is returned.
> 
> So desktop policy managers such as g-p-m would use this as follows:
> 
>  1. On startup, call SuspendGetLastError(). If something is returned
>     do what you need to do, e.g. show a dialog saying something along
>     the lines of
> 
>        Your system didn't come up after suspending it. This
>        might be a hardware or software problem.          
>                                        [Close]  [File bug]

Sure, this lets us have a common text:

Your system did not resume correctly.
This might be a hardware or software problem.

>     Notably the desktop policy manager might want to include key
>     information such as the smbios.* properties or whatever. When
>     the user closes the dialog the policy manager calls the method
>     SuspendClearLastError()
> 
>     (OK, so only the first user to login after suspend gets to see the
>      error. I think that's OK.)

Not if you autoclear that on shutdown, next suspend or hibernate.
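
For reference, the startup check from step 1 could look roughly like the
Python sketch below. FakeHal is an in-memory stand-in for the two proposed
methods, NoLastError is a hypothetical name for the exception thrown when
there is no suspend-output file, and show_dialog is an assumed callback
that returns once the user has closed the dialog. With the auto-clear
variant, the SuspendClearLastError() call would simply go away.

```python
class NoLastError(Exception):
    """Hypothetical exception for 'no suspend-output file'."""


class FakeHal:
    """In-memory stand-in for the proposed HAL methods."""

    def __init__(self, last_error=None):
        self._last_error = last_error

    def SuspendGetLastError(self):
        if self._last_error is None:
            raise NoLastError()
        return self._last_error

    def SuspendClearLastError(self):
        self._last_error = None


def check_on_startup(hal, show_dialog):
    """Report a failed suspend/resume once, then clear it.

    Returns True if a failure was reported, False otherwise.
    """
    try:
        details = hal.SuspendGetLastError()
    except NoLastError:
        return False
    show_dialog(
        "Your system did not resume correctly.\n"
        "This might be a hardware or software problem.",
        details,
    )
    # Assumed: show_dialog returns only after the user closes the dialog.
    hal.SuspendClearLastError()
    return True


if __name__ == "__main__":
    hal = FakeHal("Device failed to sync\n")
    check_on_startup(hal, lambda message, details: print(message, details))
```

Because the error is cleared after the first dialog, a second call to
check_on_startup on the same object reports nothing, which matches the
"only the first user to login" behaviour in the proposal.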

Otherwise, this is great.

Richard.



