Dealing with suspend/resume failures

David Zeuthen david at fubar.dk
Fri Nov 3 08:56:20 PST 2006


Hi,

For your consideration, here's a proposal to make the desktop bits deal
better with Suspend/Resume failures. It's the result of some discussions
between mclasen, rstrode, jrb and myself over the last few days.

What happens today is that invoking Suspend() on the
o.fd.Hal.Device.SystemPowerManagement interface either throws the
exception

 o.fd.Hal.Device.SystemPowerManagement.Unsupported

if suspend is not supported, or probably something bad like

 o.fd.Hal.Device.Error

if the return code of the tool called (e.g. pm-suspend) is not zero.

In particular, we don't do anything intelligent to report back if
anything goes wrong during the attempt to suspend. What's worse (as most
bugs are with resume), we have no way of reporting errors back when the
user has rebooted because resume didn't work. This is pretty hard; I
mean, there is no way to figure out if e.g. video came back.

This is a proposal to rectify that. 

I propose the following:

 - For Suspend() let HAL capture the output of the tool being invoked
   to a file /var/lib/hal/suspend-output

 - If the tool fails with exit code != 0, make Suspend() throw the
   exception o.fd.Hal.Device.SystemPowerManagement.SuspendFailed.
   Desktop policy managers can use the new SuspendGetLastError()
   method to get details, see below.

 - Provide a new method "void SuspendClearLastError()". It will
   delete the file /var/lib/hal/suspend-output

   This is to be called by the desktop policy manager (e.g. g-p-m) once
   we know that the system is back in a workable state. How does this
   work? g-p-m should call this when the session is unlocked since
   this is evidence that the user can use his system. In the event
   where the session is not locked... I don't know.. perhaps after
   5 minutes or when g-p-m terminates?

 - Provide a new method "string SuspendGetLastError()" on the
   o.fd.Hal.Device.SystemPowerManagement interface. If there is
   no file /var/lib/hal/suspend-output then this method throws
   an exception. Otherwise the contents of /var/lib/hal/suspend-output
   are returned.
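
To make the semantics of the three methods above concrete, here is a
minimal sketch in Python. It is not HAL code, only an illustration under
some assumptions: the class SuspendLog and the exception NoLastError are
hypothetical names (the proposal only says that SuspendGetLastError()
"throws an exception"), stdout is captured along with stderr, and
clearing when no file exists is treated as a no-op, which the proposal
does not specify.

```python
import subprocess
from pathlib import Path


class SuspendFailed(Exception):
    """Stand-in for o.fd.Hal.Device.SystemPowerManagement.SuspendFailed."""


class NoLastError(Exception):
    """Hypothetical name; the proposal does not name this exception."""


class SuspendLog:
    """File-backed model of Suspend / SuspendGetLastError / SuspendClearLastError."""

    def __init__(self, path):
        # In the proposal this would be /var/lib/hal/suspend-output.
        self.path = Path(path)

    def suspend(self, tool_argv):
        # Capture everything the tool prints into the output file.
        with open(self.path, "wb") as out:
            rc = subprocess.call(tool_argv, stdout=out, stderr=subprocess.STDOUT)
        # The file is kept either way; it is only removed by clear_last_error().
        if rc != 0:
            raise SuspendFailed(f"tool exited with code {rc}")

    def get_last_error(self):
        if not self.path.exists():
            raise NoLastError("no suspend output recorded")
        return self.path.read_text(errors="replace")

    def clear_last_error(self):
        # Assumption: clearing twice is harmless.
        self.path.unlink(missing_ok=True)
```

Note that the file survives a successful Suspend() as well. That is the
point: if the machine never becomes usable again after resume, the file
is still there on the next boot.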

Desktop policy managers such as g-p-m would use this as follows:

 1. On startup, call SuspendGetLastError(). If something is returned
    do what you need to do, e.g. show a dialog saying something along
    the lines of

       Your system didn't come up after suspending it. This
       might be a hardware or software problem.          
                                       [Close]  [File bug]

    Notably the desktop policy manager might want to include key
    information such as the smbios.* properties or whatever. When
    the user closes the dialog the policy manager calls the method
    SuspendClearLastError()

    (OK, so only the first user to login after suspend gets to see the
     error. I think that's OK.)

 2. On resume (e.g. when Suspend() from HAL returns), the desktop policy
    manager calls SuspendClearLastError() when he sees intelligent input
    from the user such as the user being able to unlock his session or
    something else.

 3. When the desktop policy manager shuts down he also calls the method
    SuspendClearLastError() just for good measure.
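
The three steps above could be sketched roughly like this on the policy
manager side. This is only an illustration in Python, not g-p-m code:
PolicyManager, FakeHal, NoLastError and the show_dialog callback are
hypothetical names, and FakeHal stands in for the real D-Bus calls so
the flow can be followed without a running HAL.

```python
class NoLastError(Exception):
    """Hypothetical; stands for whatever SuspendGetLastError() throws."""


class FakeHal:
    """In-memory stand-in for the proposed HAL methods."""

    def __init__(self, last_error=None):
        self.last_error = last_error

    def SuspendGetLastError(self):
        if self.last_error is None:
            raise NoLastError()
        return self.last_error

    def SuspendClearLastError(self):
        self.last_error = None


class PolicyManager:
    def __init__(self, hal, show_dialog):
        self.hal = hal
        self.show_dialog = show_dialog  # callback that presents the error

    def on_startup(self):
        # Step 1: leftover output means the last suspend/resume went wrong.
        try:
            output = self.hal.SuspendGetLastError()
        except NoLastError:
            return
        self.show_dialog(output)
        # Called once the user has closed the dialog.
        self.hal.SuspendClearLastError()

    def on_session_unlocked(self):
        # Step 2: the user can use the system again, so resume worked.
        self.hal.SuspendClearLastError()

    def on_shutdown(self):
        # Step 3: clear on exit, just for good measure.
        self.hal.SuspendClearLastError()
```

A real dialog would run asynchronously, so the clear in on_startup()
would happen in the handler for the Close button rather than inline.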

All this would apply to Hibernate also.

I'm adding the pm-utils list as Cc as I want to make sure we can get
useful logging output. I think HAL would just use an option
--log-verbose-output-to-stderr when invoking pm-suspend and
pm-hibernate.

Thoughts? Flames? Praises? If it looks good I'm going to add this
feature soon since it's needed for our next release. Thanks.

     David



