multi-threading task under SolarMutex -> deadlock

Thu May 19 14:19:33 UTC 2016

Hi Norbert,

thanks a lot for the extensive answer. Comments inline...

Am 18.05.2016 um 20:22 schrieb Norbert Thiebaud:
> On Wed, May 18, 2016 at 3:43 AM, Armin Le Grand <armin_le_grand at me.com> wrote:
>> Hi Norbert,
>>
>> thanks for also having an eye on this - I am looking for the failure reports
>> on ci.libreoffice.org currently, too.
>> Last is from http://ci.libreoffice.org/job/lo_tb_master_linux_dbg/7195/, so
>> last is from Friday, 13th (uhhh...)
>>
>> Have you seen such or similar stacks anywhere else? In the meantime I tried
>> ChartTest massively locally on Linux and Win, but could never locally
>> reproduce.
> I noticed, and I still had to cancel a hung job few minutes ago, that
> things started to hang regularely.
> I only look at that particular case I sent the backtrace about.

Yes, that is not acceptable for long time, I agree.

>
>> I knew that and made sure that the multithreaded 3DRenderer WorkerThreads do
>> not need the SolarMutex for their work. I did not know yet that the memory
>> fail handler tries to get the SolarMutex, too, but is logic when it wants to
>> bring up a dialog in some form.
> yeah but that _is_ a major flaw. code under a signal handler are only
> allow to call async-safe-signal functions.
> Ignoring that _will_ cause trouble. it is not your doing. it just
> happen that your multithreading case seems to cause memory starvation
> which in turn trigger a signal that exhibit clearly why all the thing
> we try to do in a signal handler are not a good idea.
> Not to mention that, even if that was allowed, trying to bring about a
> dialog when we have already ran out of memory is never going to end
> well anyway.
>
> for reference:
>
>    Async-signal-safe functions
>         A signal handling routine established by sigaction(2) or
> signal(2) must be very careful, since processing elsewhere may be
> interrupted at some arbitrary point in the execution of the program.
> POSIX has the concept of "safe function".  If a signal interrupts the
> execution of an unsafe function, and handler calls an  unsafe
> function,  then
>         the behavior of the program is undefined.
>
>         POSIX.1-2004 (also known as POSIX.1-2001 Technical Corrigendum
> 2) requires an implementation to guarantee that the following
> functions can be safely called inside a signal handler:
>
>             _Exit()
>             _exit()
>             abort()
>             accept()
>             access()
>             aio_error()
>             aio_return()
>             aio_suspend()
>             alarm()
>             bind()
>             cfgetispeed()
>             cfgetospeed()
>             cfsetispeed()
>             cfsetospeed()
>             chdir()
>             chmod()
>             chown()
>             clock_gettime()
>             close()
>             connect()
>             creat()
>             dup()
>             dup2()
>             execle()
>             execve()
>             fchmod()
>             fchown()
>             fcntl()
>             fdatasync()
>             fork()
>             fpathconf()
>             fstat()
>             fsync()
>             ftruncate()
>             getegid()
>             geteuid()
>             getgid()
>             getgroups()
>             getpeername()
>             getpgrp()
>             getpid()
>             getppid()
>             getsockname()
>             getsockopt()
>             getuid()
>             kill()
>             link()
>             listen()
>             lseek()
>             lstat()
>             mkdir()
>             mkfifo()
>             open()
>             pathconf()
>             pause()
>             pipe()
>             poll()
>             posix_trace_event()
>             pselect()
>             raise()
>             read()
>             readlink()
>             recv()
>             recvfrom()
>             recvmsg()
>             rename()
>             rmdir()
>             select()
>             sem_post()
>             send()
>             sendmsg()
>             sendto()
>             setgid()
>             setpgid()
>             setsid()
>             setsockopt()
>             setuid()
>             shutdown()
>             sigaction()
>             sigaddset()
>             sigdelset()
>             sigemptyset()
>             sigfillset()
>             sigismember()
>             signal()
>             sigpause()
>             sigpending()
>             sigprocmask()
>             sigqueue()
>             sigset()
>             sigsuspend()
>             sleep()
>             sockatmark()
>             socket()
>             socketpair()
>             stat()
>             symlink()
>             sysconf()
>             tcdrain()
>             tcflow()
>             tcflush()
>             tcgetattr()
>             tcgetpgrp()
>             tcsendbreak()
>             tcsetattr()
>             tcsetpgrp()
>             time()
>             timer_getoverrun()
>             timer_gettime()
>             timer_settime()
>             times()
>             umask()
>             uname()
>             unlink()
>             utime()
>             wait()
>             waitpid()
>             write()
>
>         POSIX.1-2008 removes fpathconf(), pathconf(), and sysconf()
> from the above list, and adds the following functions:
>
>             execl()
>             execv()
>             faccessat()
>             fchmodat()
>             fchownat()
>             fexecve()
>             fstatat()
>             futimens()
>             linkat()
>             mkdirat()
>             mkfifoat()
>             mknod()
>             mknodat()
>             openat()
>             readlinkat()
>             renameat()
>             symlinkat()
>             unlinkat()
>             utimensat()
>             utimes()
>
>
>> But the deeper problem is that allocation - here extending a vector of
>> pointers to a helper class from 1 to 2 entries - fails.
> the straw that broke the camel's back....
> or as the french put it: the drop that made the water overflow...
> bear in mind, iirc the default allocator strategy for a vector is to
> 'double' on out-of-space.
> iow if you had a vector of 128 capacity.. when you try to push a 129th
> elements you end up with a 256-sized vector

True - I know. In the case of the stacktrace it is a vector of pointers, 
so not much mem involved. All objects used in 3D are ref-counted or 
cow-wrapped. Even the target mem for pixels is shared - thus I doubt 
that rendering in paralell uses much more than rendering in one thread. 
I would state that it will not need double the mem as it needs in a 
single thread. Still, of course, a single byte makes the water overflow 
at the end.

>
> also, tests are run in parallel.. so up to 32 test maybe running in // already.
> if each one want to run 32 threads.. that will seriouly over-commit
> the machine.. with 32^32 = 1024 threads fighting for cpu.

That might be more a problem. Do the UnitTests run in one mem space? I 
doubt so. The overhead for a single WorkerTask should not be too big in 
itself - RegisterSet, Stack, ..?
BTW, that global TreadPool and the WorkerThreads used get created when 
the office starts, and other (few for now) usages use it already, so 
that problem should have shown earier...?

>
>
>> Sometimes. And that
>> only on many cores on that machine (up to now).
> I do not mind the out-of-memory thing... unless it is a bug (run-away
> allocation or something)
> out of memory are bound to happen... what I mind is the consequences..
> as bad as a crash is... a hung process is worse
>
> fyi. tb75 has 2x8 cores, 32 threads and 64GB or RAM.

It probably indeed is a run out-of-mem in extreme situations, even when 
I can hardly imagine that. The 3D rasterconversion is hard limited to 
upper bounds in pixels, as said, it uses referenced objects, not really 
local memory objects of remarkable size - I have no idea yet.
Interestingly, in stack 7273, the crash is probably related, but happens 
in freeing one object. Does this not speak against the out-of-memory 
theory completely?

>
>
>> I checked all involved classes, their refcounting and that the used
>> o3tl::cow_wrapper uses the ThreadSafeRefCountingPolicy, looks good so far.
>> It is also not the case that the WorkerThreads need massive amounts of own
>> memory, so I doublt that limiting to e.g. 8 thredads would change this,
>> except maybe making it less probable to happen. I looked at
>> o3tl::cow_wrapper itself, and the basic B2D/B3DPrimitive implementations
>> which internally use a comphelper::OBaseMutex e.g. for creating buffered
>> decompositions.
>>
>> I found no concrete reason until now, any tipps/help much appreciated.
> I honestly did not look at all _why_ it signaled. I only investigated
> why that resulted in a deadlock.

If I understand you correct, your main concern is to have a clean crash 
instead of the office hanging? That would definitely be better. But how 
to do that? I see in the stack 7195:

#9  0x00002aea35f04670 in <signal handler called> () at /lib64/libc.so.6
#10 0x00002aea35f045f7 in raise () at /lib64/libc.so.6
#11 0x00002aea35f05ce8 in abort () at /lib64/libc.so.6
#12 0x00002aea35f4a1b8 in  () at /lib64/libc.so.6
#13 0x00002aea35f4d877 in _int_malloc () at /lib64/libc.so.6
#14 0x00002aea35f4dae6 in malloc_check () at /lib64/libc.so.6
#15 0x00002aea3570e0cd in operator new(unsigned long) () at 
/lib64/libstdc++.so.6

So how to come in-between and not have operator new when it fails to 
call signal handler, but e.g. exit()..?
I see now way to do that. In consequence that would mean that operator 
new is not alowed in WorkerThreads at all - that would be a too big 
limitation I guess...?

>
>> I keep watching this - at least it did not happen in all the builds since
>> 13th and on no other machine, so the thread now is to somehow nail it to get
>> it reproducable.
> It very likely happened again on
> http://ci.libreoffice.org/job/lo_tb_master_linux_dbg/7260/
> which I cancelled after more than 2 hours or runtime

Checked 7260, no trace to 'B3D' contained, this one has no 3D 
multihreaded render included. Must be someting else, of course maybe 
related to the general signal handler in WorkerThread problem. Or not - 
'worker' is also not included.

>
>> If someone has other traces, please send them! I would hate
>> to take this back, esp. because we will need multithreading more and more
>> since Moore's law is tilting.
> If we want to use multi-threading. we will have, at least, to solve
> the signal handler situation, since as it stand it makes the 'promise
> that this worker thread won't try to hold the solarmutex' impossible
> to uphold.

True, agreed. How?

I already tried to use SolarMutexReleaser as inverse to SolarMutexGuard 
around waitUntilEmpty(), but that makes the next message to be 
processed, not good.

The only thing I could think of is to change the implementation of 
waitUntilEmpty() to not just wait for asll ThreadTasks to finish, but to 
do this in a timer loop. This is polling with timer, but would in that 
way - AFAIK - release the SolarMutex from time to time and thus allow a 
deadlock to be resolved and let the signal handler do it's thing -> and 
allow the crash. Not nice, but pragmatic.

> Maybe a way around that may be to make sure that all worker thread use
> a pthread_sigmask to essentially prevent any of them to be interrupted
> for signal handling, by blocking all signal in these thread.
> in the same vein we could have a dedicated signal handler thread
> looping on a sigwait... then we can do the crazy thing we do in a
> handler without as many limitation

Interesting, but too complicated as long as we have the SolarMutex 
mayhem...?

>
>
> The whole SolarMutex craziness is yet another story altogether....

Yes, what a big can of worms, sigh...

-- 
--
ALG (PGP Key: EE1C 4B3F E751 D8BC C485 DEC1 3C59 F953 D81C F4A2)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/libreoffice/attachments/20160519/54584700/attachment.html>