Tracking whether a connection is alive

Mon Jul 12 02:24:08 PDT 2004

In a message of Mon, 12-07-2004, 10:44, David Zeuthen wrote:
> On Mon, Jul 12, 2004 at 08:32:00AM +0200, Owen Fraser-Green wrote:
> > Hi,
> > 
> > On Mon, 2004-07-12 at 00:54, David Zeuthen wrote:
> > > On Sun, 2004-07-11 at 23:46 +0100, Mike Hearn wrote:
> > > > What I'd suggest is that while a peer has a device locked HAL sends Ping
> > > > messages to that service every few seconds until it's unlocked. One
> > > > downside is that by requiring this I guess you make it harder to write
> > > > clients because if the task you're doing isn't easily interruptible (is CD
> > > > writing? I guess it must be ....) you need threading and such to respond
> > > > to pings.
> > > > 
> > > 
> > > OK, any reasonable CD writer application probably has a thread to update
> > > the UI anyway, so this seems like an acceptable thing to do.
> > 
> > But this would only help to prove that the thread handling UI is alive
> > and well but nothing much about the thread doing the actual job which
> > owns the lock. What if that thread gets into an infinite loop or the
> > library it calls locks up? If you then decide to make monitoring thread
> > intelligent enough to detect these situations then it might as well have
> > just told HAL when the thread went AWOL instead of HAL polling it
> > all the time.
> > 
> > I don't think anything can really detect locking up better than the
> > user...

There is one problem with relying on the user to detect a lock-up. As
can be seen today, many apps (especially heavily bonoboized ones, like
Galeon) often fail to die completely when the user chooses to terminate
them after the WM prompt. That is, the GUI part is killed, but parts of
the process remain and prevent new instances from starting, or keep
resources locked. I know what to do about that, but for Joe
$Random_User the computer just breaks and cannot be fixed other than by
rebooting.
Similarly, apps that block on opening a busy /dev/dsp simply "hang".
Tracking down who holds /dev/dsp open is a big, big PITA, and that is
only if you know exactly what happened. Recreating that in a 21st
century HAL is going to be, ugh, even more of a PITA.
We really don't want the user to face a situation where he has killed a
hung application (the WM asked him to do so), but is nevertheless unable
to access his CD-R drive, has not the slightest clue what is happening,
and is unable to do _anything_ about it (rebooting doesn't really count
as "something"). Plus, some HAL apps don't need to have a GUI, so there
is no WM to assist the user, or even to give him a clue that something
is running.

> Mmm, okay. So your point is that detecting and killing buggy
> applications should better be left to e.g. the WM. Makes sense to
> me. And then I don't have to ping, which makes my life a bit easier.

One solution slightly more elegant than pinging periodically is to ping
only when someone else attempts to acquire the lock. This, however,
introduces up to $TIMEOUT of latency without removing the fundamental
issues, so maybe it isn't all that great an optimization.
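
To make the idea concrete, here is a rough sketch of "ping only on
contention". It is plain Python, not HAL or D-BUS code: LockTable,
fake_ping and the device path are names made up for illustration, and
the ping callback stands in for whatever mechanism would really be used
to reach the lock holder.

```python
from typing import Callable, Dict

# Hypothetical callback: (owner, timeout_seconds) -> True if the owner
# answered the ping in time. Not a real HAL or D-BUS API.
PingFn = Callable[[str, float], bool]


class LockTable:
    """Per-device locks that are checked for staleness only on contention."""

    def __init__(self, ping: PingFn, timeout: float = 5.0) -> None:
        self._ping = ping
        self._timeout = timeout
        self._holders: Dict[str, str] = {}  # device -> current owner

    def acquire(self, device: str, requester: str) -> bool:
        holder = self._holders.get(device)
        if holder is None or holder == requester:
            self._holders[device] = requester
            return True
        # Contended: only now do we ask whether the holder is still there.
        # A real ping may block the requester for up to `timeout` seconds.
        if self._ping(holder, self._timeout):
            return False  # holder answered, so the lock stays where it is
        # No answer: treat the lock as stale and hand it over.
        self._holders[device] = requester
        return True

    def release(self, device: str, owner: str) -> None:
        if self._holders.get(device) == owner:
            del self._holders[device]


# Stand-in for a real ping: look the owner up in a table of live peers.
alive = {"burner": True, "player": True}


def fake_ping(owner: str, timeout: float) -> bool:
    return alive.get(owner, False)


table = LockTable(fake_ping)
first = table.acquire("/dev/cdrw", "burner")   # lock is free
second = table.acquire("/dev/cdrw", "player")  # burner still answers
alive["burner"] = False                        # burner stops answering
third = table.acquire("/dev/cdrw", "player")   # stale lock is reclaimed
```

Note that the second caller pays the wait, and that a holder which still
answers pings but whose worker thread is stuck keeps the lock forever,
which is exactly the fundamental issue that remains.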

Cheers,
Maciej

-- 
"Tautologism is something tautological"
   Maciej Katafiasz <mnews2 at wp.pl>
       http://mathrick.blog.pl