questions about hal callout timeout

Thu Apr 9 09:29:25 PDT 2009

Hello,

I have been working with a project that has been developing an appliance
solution that relies quite a bit on hal. They use callout scripts to
assist in the configuration of devices with this appliance.

Recently, one of their callout scripts received a SIGTERM after about 10
seconds of execution. Reviewing the hal code, we traced this to the
HAL_HELPER_TIMEOUT value defined in hald/hald_runner.h. The
run_request_run() routine in hald-runner/runner.c appears to create a
glib timeout that invokes a routine called run_timedout(), which sends a
SIGTERM to the process if it is still running when the timeout fires
after the HAL_HELPER_TIMEOUT period.
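
To make sure we are describing the same behaviour, here is a minimal
sketch of the pattern as we understand it. It is written in Python rather
than taken from the hal source, and the names run_with_timeout and
HELPER_TIMEOUT are hypothetical stand-ins, not hal identifiers.

```python
import signal
import subprocess
import threading

# Hypothetical stand-in for HAL_HELPER_TIMEOUT, expressed here in seconds.
HELPER_TIMEOUT = 10.0


def run_with_timeout(argv, timeout=HELPER_TIMEOUT):
    """Start a helper and send it SIGTERM if it outlives the timeout.

    Returns the helper's exit status as reported by subprocess.
    """
    child = subprocess.Popen(argv)

    def timed_out():
        # Similar in spirit to run_timedout(): only signal a live child.
        if child.poll() is None:
            child.send_signal(signal.SIGTERM)

    timer = threading.Timer(timeout, timed_out)
    timer.start()
    try:
        return child.wait()
    finally:
        # A helper that finished in time must not be signalled later.
        timer.cancel()
```

With this sketch, a helper that finishes in time returns its normal exit
status, while one that is still running at the deadline is terminated by
SIGTERM, which subprocess reports on POSIX as a negative return code.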

It looks as if the original timeout value was increased to 20 seconds
not long ago.

We have a few questions about the implementation.

First, are there plans to change the implementation, either to make the
timeout value configurable or to allow it to be disabled?

Second, is there documentation that warns callout script authors about
this run-time limit?

Third, assuming no change to the current implementation, what is the
recommended programming practice for a hal callout that will inevitably
take longer than the arbitrary HAL_HELPER_TIMEOUT period? For example, a
script may need to install and configure additional software for a newly
detected device, which could be a lengthy process. Currently, one
workaround is to install a SIGTERM handler that ignores the signal, but
this does not seem like proper programming practice. Does the hal
community recommend forking a separate process instead? Whatever
recommendation the hal community settles on, could it also be documented
for other hal callout authors?
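
For discussion, this is roughly what we mean by forking a separate
process: a minimal Python sketch in which the callout hands the long job
to a detached grandchild and returns at once. The name spawn_detached is
hypothetical, and we have not verified that a process detached this way
is out of reach of the signal hald-runner sends, so please treat that as
an assumption.

```python
import os


def spawn_detached(argv):
    """Run argv in its own session and return without waiting for it."""
    pid = os.fork()
    if pid > 0:
        # Callout process: reap the short-lived intermediate child, then
        # return so the callout itself can exit well before the timeout.
        os.waitpid(pid, 0)
        return

    # Intermediate child: leave the callout's session and process group.
    os.setsid()
    if os.fork() > 0:
        os._exit(0)

    # Grandchild: detach stdio from the callout and run the long job.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
    try:
        os.execvp(argv[0], argv)
    finally:
        # Only reached if exec failed; never fall back into the caller.
        os._exit(127)
```

The obvious cost of this approach is that the callout can no longer
report the job's success or failure through its own exit status, so the
detached job would need some other way to record its result.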

Lastly, I suspect the timeout was originally introduced to terminate
child processes that would otherwise never complete. However, if a child
is stuck because it is blocked in uninterruptible I/O, the SIGTERM is
unlikely to make it exit right away. Also, from glancing at the source,
the callout processes appear to run asynchronously, so a running callout
does not necessarily block the execution of the next one, right?

I hope someone in the hal community can explain the original design and
whether terminating helpers after the HAL_HELPER_TIMEOUT period is still
needed, as opposed to running them as independent processes not subject
to a timeout.

Thank you for your time if you read all the way through.

regards,
-- 
Luciano Chavez <lnx1138 at linux.vnet.ibm.com>
IBM Linux Technology Center