linux dbgutil tinderbox stuck -> backtrace

Norbert Thiebaud nthiebaud at gmail.com
Fri Apr 1 11:22:18 UTC 2016


On Fri, Apr 1, 2016 at 2:59 AM, Stephan Bergmann <sbergman at redhat.com> wrote:
> On 03/31/2016 03:17 PM, Norbert Thiebaud wrote:
>>
>> On Thu, Mar 31, 2016 at 7:59 AM, Michael Stahl <mstahl at redhat.com> wrote:
>>>
>>> it's a pretty rare deadlock, i've hit it once and sberg too once AFAIK.
>
>
> Ah, <https://bugs.documentfoundation.org/show_bug.cgi?id=96387> "deadlock in
> HSQLDB" predates
> <https://cgit.freedesktop.org/libreoffice/core/commit/?id=03a271901c39d60e4519e67e258d565ad5e1e085>
> "Guard against globally shared UNO ref accessed from wrong UNO env", which
> was the only change that came to mind when I saw Norbert's original post.
>
> I've started to run into this a couple of times now, too.  But at least the
> one time I was alert enough to run jstack on the deadlocked process, all it
> gave me was an internal failure in jstack.
>
>> What I really wish for is a reliable hard timeout on all these tests.
>
>
> I think a better approach would be to let the bots do containerized builds
> that get automatically killed

Two problems:
1/ we _do_ have such a global-level timeout, but Jenkins being Java,
these are unreliable (the Jenkins plugin that provides the feature even
explains in detail that it is unreliable).
2/ the Linux debug build also rebuilds the doc once a week, which
takes a long time, so I had to bump that global timeout to the
max time a full build plus building and uploading the doc can take,
which makes the timeout kick in in the 5-6 hour range... not great.

> [...] or passed on to someone who can debug the problem, or...

There is a flow of tasks to do. I cannot block a slave for an
undetermined amount of time waiting for someone to take a look, or
the build queue piles up.
What is needed is a hard timeout, preferably with automatic
generation of useful and relevant diag info. The latter makes
per-test timeouts more useful, since the watchdog can then first try to
gather diag info from the running test, which will depend on the nature of
the test.
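
To make the idea concrete, here is a rough sketch of such a per-test watchdog. It is only an illustration, not what the tinderboxes run: the names run_with_watchdog and WatchdogResult, the {pid} placeholder, and the default timeouts are all made up for this example. It kills only the direct child process, so a real harness would also need to deal with any grandchildren the test spawns.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchdogResult:
    timed_out: bool
    returncode: Optional[int]
    diagnostics: str


def run_with_watchdog(cmd, timeout_s, diag_cmd=None, diag_timeout_s=60):
    """Run cmd with a hard timeout (hypothetical helper).

    If the timeout expires, first run diag_cmd against the still-running
    process (every "{pid}" in its arguments is replaced by the child's PID),
    then kill the child unconditionally.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_s)
        return WatchdogResult(False, proc.returncode, "")
    except subprocess.TimeoutExpired:
        diagnostics = ""
        if diag_cmd is not None:
            argv = [arg.replace("{pid}", str(proc.pid)) for arg in diag_cmd]
            try:
                # The diag step gets its own timeout so that a hung
                # diagnostic tool cannot block the slave either.
                done = subprocess.run(
                    argv, capture_output=True, text=True,
                    timeout=diag_timeout_s,
                )
                diagnostics = done.stdout + done.stderr
            except (OSError, subprocess.TimeoutExpired) as exc:
                diagnostics = f"diagnostics failed: {exc}"
        # Hard kill, whether or not diagnostics could be gathered.
        proc.kill()
        proc.wait()
        return WatchdogResult(True, proc.returncode, diagnostics)
```

For a native test, diag_cmd could be something like ["gdb", "-p", "{pid}", "-batch", "-ex", "thread apply all bt"] to get a backtrace of every thread; a test that runs Java code would need a different command, which is the "depends on the nature of the test" part.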

Norbert



