More information about hung Jenkins builds
Stephan Bergmann
sbergman at redhat.com
Thu Sep 3 07:49:04 UTC 2020
On 30/06/2020 11:38, Stephan Bergmann wrote:
> On 19/06/2020 14:51, Stephan Bergmann wrote:
>> On 28/05/2020 22:19, Stephan Bergmann wrote:
>>> For now, I have updated
>>> <https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil/> to use
>>> the new kill-wrapper timeout feature instead of Jenkins' "Abort the
>>> build if it's stuck" option. (And am planning to roll it out to
>>> other Linux Jenkins jobs that could benefit from it, once it has
>>> proven sufficiently stable.)
>>
>> I have rolled out the kill-wrapper and its timeout feature now also
>> for
>> <https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil_branch/>,
>> <https://ci.libreoffice.org/job/gerrit_linux_gcc_release/>, and
>> <https://ci.libreoffice.org/job/lo_ubsan/>.
>
> Just to note down the semi-obvious somewhere: One scenario that
> kill-wrapper apparently doesn't prevent is leftover processes after
> Jenkins "has lost the connection" (for whatever reason, maybe a bug in
> Jenkins itself?).
>
> <https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil/62736/> had
> gone down with
>
> [...]
>> [build JUT] linguistic_unoapi
>> FATAL: command execution failed
>> java.io.EOFException
>> at
>> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2738)
[...]
That issue now hit again on tb79, where
<https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil/67758/> "lost
the connection" and left behind zombies that then broke later builds.
(I have now killed those manually.)
I don't know how such lost-connection issues get fixed: do they
magically self-heal within the Jenkins framework, or does it involve
manual intervention? If the latter, would it be possible to include a
step that removes such leftover zombie processes?
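
To make the idea concrete, here is a rough sketch of what such a cleanup step might look like. It is not taken from any existing Jenkins job or from kill-wrapper. The function names, the user name, and the workspace path are all hypothetical. It assumes that leftover processes have been reparented to PID 1 and still mention the workspace path in their command line, which may not hold on machines where a per-user subreaper adopts orphans.

```python
import os
import signal
import subprocess


def find_leftovers(ps_output, user, marker):
    """Return PIDs of orphaned processes owned by `user` whose
    command line contains `marker` (e.g. the workspace path)."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, owner, args = fields
        # Assumption: a process left behind by a dead build has been
        # reparented to PID 1; a running build's processes have not.
        if owner == user and ppid == "1" and marker in args:
            pids.append(int(pid))
    return pids


def kill_leftovers(user, marker):
    """Send SIGTERM to every leftover process found by find_leftovers."""
    # Note: ps may truncate long user names in the USER column.
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,user,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_leftovers(out, user, marker):
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # the process exited in the meantime
```

Something like kill_leftovers("tdf", "/home/tdf/ws/") (both values made up) could then run before a build starts. Restricting the match to PPID 1 is a deliberately conservative choice, so that a build still running on the same machine is not touched.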