[Pm-utils] Re: Dealing with suspend/resume failures

Holger Macht hmacht at suse.de
Sat Nov 11 14:32:51 PST 2006


On Thu 09. Nov - 14:48:29, David Zeuthen wrote:
> 
> Hi,
> 
> Sorry for being slow in responding. 
> 
> Attached are the patches I put in the RHEL beta, been tested somewhat
> thoroughly and appears to work fine insorfar that the user gets a dialog
> if suspend fails.
> 
>  diffstat hal-0.5.8.1-powermgmt-output-4.patch 
>  fdi/policy/10osvendor/10-power-mgmt-policy.fdi |   10 ++++++++++
>  tools/Makefile.am                              |    6 ++++++
>  tools/hal-system-power-hibernate-clear-error   |    3 +++
>  tools/hal-system-power-suspend-clear-error     |    3 +++
>  tools/linux/hal-system-power-hibernate-linux   |   13 +++++++++++--
>  tools/linux/hal-system-power-suspend-linux     |   13 +++++++++++--
>  6 files changed, 44 insertions(+), 4 deletions(-)
> 
>  diffstat gnome-power-manager-2.16.0-resume-failed-ui-3.patch 
>  gpm-hal.c         |   45 +++++++++++++++++++++++++++++++++++++++++
>  gpm-hal.h         |    5 ++++
>  gpm-main.c        |   16 ++++++++++++--
>  gpm-manager.c     |   59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  gpm-manager.h     |    2 +
>  gpm-screensaver.c |   12 ++++++++++
>  6 files changed, 137 insertions(+), 2 deletions(-)
> 
> Richard: sorry, the patch is a bit rough but does the idea look fine to you?
> 
> > 1. Desktop calls PM-utils via HAL to suspend. The suspend preparation/
> >    resume cleanup hooks fail due to whatever reason. Two possibilities:
> > 
> >    (1) PM-utils returns with error. Desktop calls SuspendGetLastError() on
> >        HAL to get the failure reason.
> > 
> >    (2) PM-utils sends out a D-Bus signal with the reason the destop can
> >        catch
> 
> We already do one but right; we throw the exception
> 
>  org.freedesktop.Hal.Device.Error
> 
> with a useless message we may want to clean up. We now (with the patch)
> _always_ create the file /var/lib/hal/system-power-suspend-output in
> addition that can be inspected when the caller of Suspend() is told the
> method returns.

Why can't we use /var/log/pm-{suspend,hibernate}.log created by pm-utils
instead? We are currently working on pm-utils being more verbose.

Additionally, can we let pm-utils return one descriptive message about the
reason for the failure, if one is present? Nothing very verbose, just
something like 'Module xz failed to unload' (just an example, of
course). This would be something that can be shown to expirenced users.

> > 2. Desktop calls PM-utils via HAL to suspend. The suspend preparation/
> >    resume cleanup hooks succeed. However, the backlight doesn't come back
> >    although the system is fully operational. Usually the user just
> >    reboots/powers the system off. The desktop has to figure out that the
> >    session didn't got unlocked properly, the user wasn't active anymore or
> >    whatever. But that's absolutely in the scope of the desktop to check
> >    this, because PM-utils is clueless and SuspendGetLastError() wouldn't
> >    contain an error.
> 
> I wasn't clear on this, but I think what you're missing is that
> Suspend() _always_ create the file
> 
>  /var/lib/hal/system-power-suspend-output
> 
> and Hibernate() always create the file
> 
>  /var/lib/hal/system-power-hibernate-output
> 
> these files contains important information about the state of the system
> (such as what kernel modules are loaded etc.) so it's easier to figure
> out what the problem it. 
> 
> I've attached an example output file (when pm-utils gains real logging
> we can replace the ugly 'bash -x /usr/sbin/pm-suspend' hack but right
> now it does the trick.). We might want to include more information,
> change the formatting, whatever, keep in mind this information is
> collected on _every_ suspend or hibernate request will generate this so
> we want the process to be very quick.

Ok, see above for using the pm-utils logging files.

> There are two new methods on HAL on the SystemPowerManagement interface
> 
>  SuspendClearError()
>  HibernateClearError()
> 
> that respectively just delete the files
> 
>  /var/lib/hal/system-power-suspend-output
>  /var/lib/hal/system-power-hibernate-output
> 
> (we should probably change the semantics to instead of deleting them
> they are moved away in e.g. system-power-suspend-output.1 etc.).

Hm, maybe we should just copy the pm-utils logfile to
/var/lib/hal/system-power-suspend-output? This way it would be ok to
delete it because it would also be something like a state file.

> So how does this work? It's simple
> 
>  1. on startup the desktop policy manager sees if any of
> 
>      /var/lib/hal/system-power-suspend-output
>      /var/lib/hal/system-power-hibernate-output
> 
>     exist and if they do they show a dialog, read the file and include
>     it in the dialog or whatever (my g-p-m patch currently only just
>     displays a dialog but that is subject to improvement). When the
>     desktop policy manager have notified the user it calls *ClearError()
>     to move the files out of the way 

Ok.

>  2. When seeing intelligent activity from the user the desktop policy
>     manager also calls *ClearError(). That's the right thing to do
>     because if we're just resumed the files will get clear; if we aren't
>     there is nothing to clear and it's all good.
> 
>     Right now my g-p-m does this in two ways
> 
>     - when the user unlocks his session (he needs to provide credentials
>       into the gnome-screensaver dialog so this is valid)
> 
>     - when g-p-m terminates (ok, so if the system is configure to
>       automatically shut down without asking the user on power button
>       presses we errornously clear the file but such a weird
>       configuration is rare and not default. The right thing is to
>       to make g-p-m only clear the files when we know that the session
>       terminates as a result of the user ending the session except
>       when he presses the power button and we shut down without
>       user intervention. I'll leave that for hughsie to implement :-)
> 
> 
> > 3. Very rare case: The suspend preparation/ resume cleanup hooks fail
> >    _and_ the backlight doesn't come back anymore. Two possibilities:
> > 
> >    (1) PM-utils returns with error. Desktop calls SuspendGetLastError() on
> >        HAL to get the failure reason. But it's not able to show that
> >        something went wrong and to inform the user. So it doesn't clear
> >        the error and shows it the next time the system is up properly.
> 
> Wrong, because we always leave the /var/lib/hal error files there.
> 
> > If the job ob SuspendGetLastError() is to return the suspend log to the
> > desktop, then it's ok, but we wouldn't need it for the information if
> > suspend failed last time. Either pm-utils fails, sends a signal and
> > desktops catches it. Or, backlight doesn't come back, then pm-utils is
> > clueless anyway and desktop has to deal with it on its own.
> > 
> > And in fact, I think the goal is simplicity here, doing a simple dbus-send
> > in PM-utils would be much more simpler then adding multiple new methods to
> > HAL.
> > 
> > So as a conclusion, if we agree that we need a method in HAL to get the
> > suspend log, not only on error (which is similiar to the backlight case),
> > then ok, let's do it, but I doubt that we need it. Sending out a signal
> > would still be possible. SuspendGetLastError() would only serve one
> > purpose, getting the suspend log. Sending the signal would serve the
> > purpose to inform the desktop that suspension failed. In case PM-Utils
> > doesn't know of a failure and doesn't send the signal, desktop is on its
> > own anyway.
> > 
> > I'm not against going the HAL-methods-way, but I think the signal-way is
> > much simpler. Both will do their job.
> 
> I think I've previously failed at explaining exactly what the intention
> was but I hope this mail + patches shows much more clearly what we're
> trying to achieve and how simple it is. Comments?

Yes, I always understood what you're trying to achieve. And in general I'm
fine with this proposal.

[...]

> +gboolean
> +gpm_hal_has_suspend_error (GpmHal *hal)
> +{
> +	/* TODO: should call a method on HAL and also return the ouput of the file */
> +	return g_file_test ("/var/lib/hal/system-power-suspend-output", G_FILE_TEST_EXISTS);
> +}
> +
> +gboolean
> +gpm_hal_has_hibernate_error (GpmHal *hal)
> +{
> +	/* TODO: should call a method on HAL and also return the ouput of the file */
> +	return g_file_test ("/var/lib/hal/system-power-hibernate-output", G_FILE_TEST_EXISTS);
> +}

Yes, we should definitely add a HAL method to get the output.

[...]

> diff --git a/tools/linux/hal-system-power-suspend-linux b/tools/linux/hal-system-power-suspend-linux
> index 59cde73..8be9755 100755
> --- a/tools/linux/hal-system-power-suspend-linux
> +++ b/tools/linux/hal-system-power-suspend-linux
> @@ -62,8 +62,17 @@ elif [ -f "/etc/redhat-release" ] || [ -
>  	# TODO: fixup pm-suspend to define erroc code (see alarm above) and throw
>  	#	   the appropriate exception
>  	if [ -x "/usr/sbin/pm-suspend" ] ; then
> -		/usr/sbin/pm-suspend
> +		rm -f /var/lib/hal/system-power-suspend-output
> +		touch /var/lib/hal/system-power-suspend-output
> +		chmod 644 /var/lib/hal/system-power-suspend-output
> +		echo "==== Kernel version: ====" >> /var/lib/hal/system-power-suspend-output
> +		/bin/uname -a >> /var/lib/hal/system-power-suspend-output
> +		echo "==== Modules loaded before suspending: ====" >> /var/lib/hal/system-power-suspend-output
> +		/sbin/lsmod >> /var/lib/hal/system-power-suspend-output
> +		echo "==== Output of /usr/sbin/pm-suspend ====" >> /var/lib/hal/system-power-suspend-output
> +		/bin/bash -x /usr/sbin/pm-suspend >> /var/lib/hal/system-power-suspend-output 2>&1
>  		RET=$?
> +		echo "==== DONE ====" >> /var/lib/hal/system-power-suspend-output
>  	else
>  		# TODO: add support
>  		unsupported

We should definitely use the pm-utils logfiles for this. I will send a
logging hook to the pm-utils list in a few minutes.

> @@ -103,7 +112,7 @@ do
>  	for device in $devices
>  	do
>  		dbus-send --system --print-reply --dest=org.freedesktop.Hal \
> -			  $device org.freedesktop.Hal.Device.Rescan
> +			  $device org.freedesktop.Hal.Device.Rescan > /dev/null 2>&1
>  	done
>  done
>  
> --- /dev/null	2006-11-08 11:09:00.112654773 -0500
> +++ b/tools/hal-system-power-suspend-clear-error	2006-11-06 16:44:56.000000000 -0500
> @@ -0,0 +1,3 @@
> +#!/bin/bash
> +
> +rm -f /var/lib/hal/system-power-suspend-output
> --- /dev/null	2006-11-08 11:09:00.112654773 -0500
> +++ b/tools/hal-system-power-hibernate-clear-error	2006-11-06 16:45:09.000000000 -0500
> @@ -0,0 +1,3 @@
> +#!/bin/bash
> +
> +rm -f /var/lib/hal/system-power-hibernate-output

> ==== Kernel version: ====
> Linux daxter.boston.redhat.com 2.6.18-1.2798.fc6 #1 SMP Mon Oct 16 14:37:32 EDT 2006 i686 i686 i386 GNU/Linux
> ==== Modules loaded before suspending: ====
> Module                  Size  Used by
> rfcomm                 46041  0 
> hidp                   24129  2 
> l2cap                  31681  10 rfcomm,hidp
> bluetooth              58917  5 rfcomm,hidp,l2cap
> button                 10961  0 
> tun                    15425  1 
> radeon                109665  2 
> drm                    72789  3 radeon
> ipv6                  267489  13 
> autofs4                25413  2 
> cpufreq_ondemand       11085  0 
> dm_mirror              33041  0 
> dm_mod                 61529  1 dm_mirror
> video                  21061  0 
> sbs                    20225  0 
> ibm_acpi               31809  0 
> i2c_ec                  9281  1 sbs
> dock                   12377  0 
> battery                14405  0 
> asus_acpi              20697  0 
> ac                      9541  0 
> parport_pc             31205  1 
> lp                     17033  0 
> parport                40841  2 parport_pc,lp
> intel_rng               7105  0 
> snd_intel8x0m          21325  0 
> snd_intel8x0           36445  1 
> snd_ac97_codec         95073  2 snd_intel8x0m,snd_intel8x0
> snd_ac97_bus            6593  1 snd_ac97_codec
> snd_seq_dummy           8133  0 
> snd_seq_oss            37185  0 
> snd_seq_midi_event     11841  1 snd_seq_oss
> snd_seq                57137  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
> aes                    31873  1 
> snd_seq_device         12621  3 snd_seq_dummy,snd_seq_oss,snd_seq
> snd_pcm_oss            46561  0 
> joydev                 13697  0 
> snd_mixer_oss          20545  1 snd_pcm_oss
> snd_pcm                80325  4 snd_intel8x0m,snd_intel8x0,snd_ac97_codec,snd_pcm_oss
> snd_timer              27077  2 snd_seq,snd_pcm
> ide_cd                 42337  2 
> snd                    57029  12 snd_intel8x0m,snd_intel8x0,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
> soundcore              14113  1 snd
> i2c_i801               11853  0 
> serio_raw              11205  0 
> airo                   77281  0 
> snd_page_alloc         14281  3 snd_intel8x0m,snd_intel8x0,snd_pcm
> i2c_core               25537  2 i2c_ec,i2c_i801
> cdrom                  38625  1 ide_cd
> floppy                 61284  1 
> pcspkr                  7361  0 
> e1000                 123601  0 
> ext3                  135369  1 
> jbd                    63209  1 ext3
> ehci_hcd               35533  0 
> ohci_hcd               25181  0 
> uhci_hcd               27725  0 
> ==== Output of /usr/sbin/pm-suspend ====
> + '[' -n 0 -a 0 '!=' 0 ']'
> + export LC_COLLATE=C
> + LC_COLLATE=C
> + . /etc/pm/functions
> ++ . /etc/rc.d/init.d/functions
> +++ TEXTDOMAIN=initscripts
> +++ umask 022
> +++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
> +++ export PATH
> +++ '[' -z '' ']'
> +++ COLUMNS=80
> +++ '[' -z '' ']'
> ++++ /sbin/consoletype
> +++ CONSOLETYPE=serial
> +++ '[' -f /etc/sysconfig/i18n -a -z '' ']'
> +++ . /etc/profile.d/lang.sh
> ++++ sourced=0
> ++++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
> ++++ '[' -f /etc/sysconfig/i18n ']'
> ++++ . /etc/sysconfig/i18n
> +++++ LANG=en_US.UTF-8
> +++++ SYSFONT=latarcyrheb-sun16
> ++++ sourced=1
> ++++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
> ++++ '[' -f /.i18n ']'
> ++++ '[' -n '' ']'
> ++++ '[' 1 = 1 ']'
> ++++ '[' -n en_US.UTF-8 ']'
> ++++ export LANG
> ++++ '[' -n '' ']'
> ++++ unset LC_ADDRESS
> ++++ '[' -n '' ']'
> ++++ unset LC_CTYPE
> ++++ '[' -n C ']'
> ++++ export LC_COLLATE
> ++++ '[' -n '' ']'
> ++++ unset LC_IDENTIFICATION
> ++++ '[' -n '' ']'
> ++++ unset LC_MEASUREMENT
> ++++ '[' -n '' ']'
> ++++ unset LC_MESSAGES
> ++++ '[' -n '' ']'
> ++++ unset LC_MONETARY
> ++++ '[' -n '' ']'
> ++++ unset LC_NAME
> ++++ '[' -n '' ']'
> ++++ unset LC_NUMERIC
> ++++ '[' -n '' ']'
> ++++ unset LC_PAPER
> ++++ '[' -n '' ']'
> ++++ unset LC_TELEPHONE
> ++++ '[' -n '' ']'
> ++++ unset LC_TIME
> ++++ '[' -n '' ']'
> ++++ unset LC_ALL
> ++++ '[' -n '' ']'
> ++++ unset LANGUAGE
> ++++ '[' -n '' ']'
> ++++ unset LINGUAS
> ++++ '[' -n '' ']'
> ++++ unset _XKB_CHARSET
> +++++ /sbin/consoletype
> ++++ consoletype=serial
> ++++ '[' -n '' ']'
> ++++ '[' -n '' ']'
> ++++ '[' -n en_US.UTF-8 ']'
> ++++ case $LANG in
> ++++ '[' dumb = linux ']'
> ++++ unset SYSFONTACM SYSFONT
> ++++ unset sourced
> ++++ unset langfile
> +++ '[' -z '' ']'
> +++ '[' -f /etc/sysconfig/init ']'
> +++ . /etc/sysconfig/init
> ++++ BOOTUP=color
> ++++ GRAPHICAL=yes
> ++++ RES_COL=60
> ++++ MOVE_TO_COL='echo -en \033[60G'
> ++++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
> ++++ SETCOLOR_FAILURE='echo -en \033[0;31m'
> ++++ SETCOLOR_WARNING='echo -en \033[0;33m'
> ++++ SETCOLOR_NORMAL='echo -en \033[0;39m'
> ++++ LOGLEVEL=3
> ++++ PROMPT=yes
> ++++ AUTOSWAP=no
> +++ '[' serial = serial ']'
> +++ BOOTUP=serial
> +++ MOVE_TO_COL=
> +++ SETCOLOR_SUCCESS=
> +++ SETCOLOR_FAILURE=
> +++ SETCOLOR_WARNING=
> +++ SETCOLOR_NORMAL=
> +++ '[' serial '!=' verbose ']'
> +++ INITLOG_ARGS=-q
> +++ __sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'
> ++ HIBERNATE_RESUME_POST_VIDEO=no
> ++ SUSPEND_MODULES=
> ++ '[' -f /etc/pm/config ']'
> ++ . /etc/pm/config
> +++ SUSPEND_MODULES=button
> +++ HIBERNATE_RESUME_POST_VIDEO=no
> ++ export HIBERNATE_RESUME_POST_VIDEO
> ++ export SUSPEND_MODULES
> ++ GLOBAL_CONFIG_VARIABLES='HIBERNATE_RESUME_POST_VIDEO SUSPEND_MODULES'
> ++ source_configs
> +++ ls -1 '/etc/pm/config.d/*'
> + '[' -f /sys/power/disk ']'
> + '[' -f /sys/power/state ']'
> ++ basename /usr/sbin/pm-suspend
> + ACTION=pm-suspend
> + ACTION=suspend
> + pm_main suspend
> + take_suspend_lock
> ++ /usr/bin/fgconsole
> + VT=7
> + chvt 63
> + '[' -f /.suspended ']'
> + echo 22034
> + rm -f /var/run/pm-suspend
> + touch /var/run/pm-suspend
> + return 0
> + run_hooks suspend
> + '[' -z suspend ']'
> + '[' -f /var/run/pm-suspend ']'
> + . /var/run/pm-suspend
> + rm -f /var/run/pm-suspend
> + files='/etc/pm/hooks/*'
> + '[' '' = reverse ']'
> + for file in '$files'
> + '[' -x /etc/pm/hooks/00clear ']'
> + /etc/pm/hooks/00clear suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/01grub ']'
> + /etc/pm/hooks/01grub suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/05led ']'
> + /etc/pm/hooks/05led suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/10NetworkManager ']'
> + /etc/pm/hooks/10NetworkManager suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/20video ']'
> + /etc/pm/hooks/20video suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/49bluetooth ']'
> + /etc/pm/hooks/49bluetooth suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/50modules ']'
> + /etc/pm/hooks/50modules suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/90clock ']'
> + /etc/pm/hooks/90clock suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/94cpufreq ']'
> + /etc/pm/hooks/94cpufreq suspend
> + for file in '$files'
> + '[' -x /etc/pm/hooks/95led ']'
> + /etc/pm/hooks/95led suspend
> + sync
> + sync
> + sync
> + case "$1" in
> + pm-pmu --suspend
> open: No such file or directory
> + echo -n mem
> + run_hooks resume reverse
> + '[' -z resume ']'
> + '[' -f /var/run/pm-suspend ']'
> + . /var/run/pm-suspend
> ++ export hidd_SERVICE_ACTIVATE=yes
> ++ hidd_SERVICE_ACTIVATE=yes
> ++ export bluetooth_SERVICE_ACTIVATE=yes
> ++ bluetooth_SERVICE_ACTIVATE=yes
> ++ export ibm_bluetooth_STATE=not
> ++ ibm_bluetooth_STATE=not
> ++ export rfcomm_MODULE_LOAD=yes
> ++ rfcomm_MODULE_LOAD=yes
> ++ export hidp_MODULE_LOAD=yes
> ++ hidp_MODULE_LOAD=yes
> ++ export button_MODULE_LOAD=yes
> ++ button_MODULE_LOAD=yes
> + rm -f /var/run/pm-suspend
> + files='/etc/pm/hooks/*'
> + '[' reverse = reverse ']'
> ++ echo /etc/pm/hooks/00clear /etc/pm/hooks/01grub /etc/pm/hooks/05led /etc/pm/hooks/10NetworkManager /etc/pm/hooks/20video /etc/pm/hooks/49bluetooth /etc/pm/hooks/50modules /etc/pm/hooks/90clock /etc/pm/hooks/94cpufreq /etc/pm/hooks/95led
> ++ awk '{ for (i=NF; i>=1; i--) print $i }'
> + files='/etc/pm/hooks/95led
> /etc/pm/hooks/94cpufreq
> /etc/pm/hooks/90clock
> /etc/pm/hooks/50modules
> /etc/pm/hooks/49bluetooth
> /etc/pm/hooks/20video
> /etc/pm/hooks/10NetworkManager
> /etc/pm/hooks/05led
> /etc/pm/hooks/01grub
> /etc/pm/hooks/00clear'
> + for file in '$files'
> + '[' -x /etc/pm/hooks/95led ']'
> + /etc/pm/hooks/95led resume
> + for file in '$files'
> + '[' -x /etc/pm/hooks/94cpufreq ']'
> + /etc/pm/hooks/94cpufreq resume
> + for file in '$files'
> + '[' -x /etc/pm/hooks/90clock ']'
> + /etc/pm/hooks/90clock resume
> + for file in '$files'
> + '[' -x /etc/pm/hooks/50modules ']'
> + /etc/pm/hooks/50modules resume
> + for file in '$files'
> + '[' -x /etc/pm/hooks/49bluetooth ']'
> + /etc/pm/hooks/49bluetooth resume
> + for file in '$files'
> + '[' -x /etc/pm/hooks/20video ']'
> + /etc/pm/hooks/20video resume
> + for file in '$files'

Regards,
	Holger



More information about the Pm-utils mailing list