Cause of the Cygwin terminal getting stuck during the build

Hossein Nourikhah hossein at libreoffice.org
Mon Oct 21 14:18:35 UTC 2024


Hello,

If you use the latest version of Cygwin (3.5.x) or Git for Windows 
(2.47) to build LibreOffice, you may have run into the problem that the 
terminal sometimes gets stuck during the build.

This is the result of my effort to understand the problem:

The issue first appeared in the Cygwin 3.3 shell a while ago and became 
worse in Cygwin 3.5. With the recent update of "Git for Windows", it is 
now also visible in "Git Bash".

Running "make -d" prints a lot of debugging information. The build may 
hang in many places; here is one such hang, which occurs while launching 
the mktemp utility:

$ make -d
...
[build DEP] LNK:Library/unobootstrapprotector.dll.d
CreateProcess(C:\cygwin64\bin\mktemp.exe,mktemp --tmpdir=C:/cygwin64/tmp gbuild.XXXXXX,...)

[ Terminal hangs here ]

I found the PID of the hanging mktemp with the following command (the 
fourth column, 10708, is the Windows PID that gdb is attached to below):

$ ps ax | grep mktemp
     26144       1   23307      10708  cons1     197609 23:38:01 /usr/bin/mktemp
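
The lookup above can be scripted. This is only a minimal sketch, assuming 
the Cygwin ps column layout seen in the output above (the last five 
columns being WINPID, TTY, UID, STIME and COMMAND). The function name 
winpid_of is hypothetical. It reads ps output on standard input and 
prints the WINPID of every row whose command has the given base name:

```shell
#!/bin/sh
# winpid_of NAME: read Cygwin `ps` output on stdin and print the WINPID of
# every row whose COMMAND column has the base name NAME.
# Assumption: the row ends with the columns WINPID TTY UID STIME COMMAND,
# so WINPID is the fifth field counted from the end. Counting from the end
# tolerates an optional status letter at the start of a row.
winpid_of() {
    awk -v name="$1" '
        NF >= 5 {
            # Take the base name of the last field (the command path).
            n = split($NF, parts, "/")
            if (parts[n] == name) print $(NF - 4)
        }'
}

# Example (on Cygwin):  ps ax | winpid_of mktemp
```

Command paths containing spaces would break the field counting, so treat 
the result as a hint and check it against the full ps output.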


After several tries, I got a meaningful backtrace from the hanging 
mktemp by attaching gdb to it. I installed make-debuginfo and 
coreutils-debuginfo alongside gdb-14 to be able to get the backtrace:

$ gdb -p 10708
GNU gdb (GDB) (Cygwin 14.2-1) 14.2
...
Attaching to process 10708
[New Thread 10708.0x80b0]
[New Thread 10708.0x81e8]
Reading symbols from /usr/bin/mktemp.exe...
Reading symbols from /usr/lib/debug//usr/bin/mktemp.exe.dbg...
(gdb) interrupt
(gdb) bt
#0  0x00007ff8519e7a36 in fhandler_console::set_input_mode (m=m@entry=tty::cygwin,
     t=0x1a0030028, p=p@entry=0x800008da8)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/fhandler/console.cc:817
#1  0x00007ff8519f175c in fhandler_console::post_open_setup (this=0x800008ba8,
     fd=<optimized out>) at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/fhandler/console.cc:1910
#2  0x00007ff851948796 in dtable::init_std_file_from_handle (this=this@entry=0x800004870,
     fd=fd@entry=0, handle=0xffffffffffffffff, handle@entry=0x424)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/dtable.cc:425
#3  0x00007ff851948a61 in dtable::stdio_init (this=0x800004870)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/dtable.cc:162
#4  0x00007ff8519370e7 in dll_crt0_1 ()
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/dcrt0.cc:929
#5  0x00007ff851935d51 in _cygtls::call2 (this=0x7ffffce00,
     func=0x7ff851936f10 <dll_crt0_1(void*)>, arg=0x0, buf=buf@entry=0x7ffffcdf0)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/cygtls.cc:41
#6  0x00007ff851935dca in _cygtls::call (func=<optimized out>, arg=<optimized out>)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/cygtls.cc:28
#7  0x0000000000000000 in ?? ()
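
Because the hang is intermittent, capturing the backtrace without an 
interactive session can save some retries. The following is a sketch 
only, using gdb's standard -p, -batch and -ex options. The function name 
dump_backtrace and the GDB override variable are hypothetical and not 
part of any tool:

```shell
#!/bin/sh
# dump_backtrace WINPID: attach gdb to the given process, print a full
# backtrace of every thread, then let gdb exit (batch mode).
# GDB is a hypothetical override so that a different gdb binary can be used.
dump_backtrace() {
    "${GDB:-gdb}" -p "$1" -batch -ex "thread apply all backtrace full"
}

# Example:  dump_backtrace 10708 > mktemp-backtrace.txt 2>&1
```

The debuginfo packages are still needed for the symbols to resolve.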

As the backtrace shows, the process is stuck in 
fhandler_console::set_input_mode() at console.cc:817. The console hangs 
after attach_console() is invoked. There are known issues with multiple 
processes trying to access the console at the same time, and this hang 
appears to be the same kind of problem: multiple processes want to write 
to the console simultaneously, and a concurrency problem, possibly a 
deadlock, occurs.

This is one of the patches that was supposed to fix the problem:

[PATCH] Cygwin: console: Fix race issue on allocating console simultaneously.
https://cygwin.com/pipermail/cygwin-patches/2024q3/012722.html

More race-related fixes can be found by searching for "race" in the 
newlib-cygwin commit log:

https://cygwin.com/cgit/newlib-cygwin/log/?qt=grep&q=race

The local sources of Cygwin 3.5.4-1 show that fixes 
b160b690b6ace93ee4225f14a9287549e37f4a71 and 
10477d95ec401213d5bded5ae3600ab0d2d5ed94 are already applied, yet the 
problem persists.

The issue is not limited to Cygwin; it also happens in the recent 
version of the "Git for Windows" shell. To see which runtime version 
Git Bash is based on, run 'uname -a' in Git Bash, which shares some 
sources with Cygwin.

On Git Bash version 2.46, you'll get 3.4.10-2e2ef940.x86_64, but with 
the latest, 2.47, you'll get 3.5.4-1e8cf1a5.x86_64. On Git Bash 2.46, 
you may not face the problem, but on Git Bash 2.47, you may face it 
immediately after invoking "make" for the LibreOffice core source code.
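
The version check described above can be sketched as a small script. It 
assumes that 'uname -r' prints just the release field shown above (for 
example 3.5.4-1e8cf1a5.x86_64). The function names are hypothetical. It 
only flags the 3.5 series that this post reports hangs with, and it does 
not claim that other series are free of the issue:

```shell
#!/bin/sh
# runtime_series RELEASE: print the MAJOR.MINOR series of a Cygwin/MSYS2
# runtime release string such as "3.5.4-1e8cf1a5.x86_64".
runtime_series() {
    printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\)\..*/\1/p'
}

# is_reported_series RELEASE: succeed if RELEASE belongs to the 3.5 series.
is_reported_series() {
    [ "$(runtime_series "$1")" = "3.5" ]
}

# Example:
#   is_reported_series "$(uname -r)" && echo "runtime is in the 3.5 series"
```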

This is from Git for Windows v2.47.0 release notes:
"Comes with the MSYS2 runtime (Git for Windows flavor) based on Cygwin 
v3.5.4, which drops Windows 7 and Windows 8 support."
https://github.com/git-for-windows/build-extra/blob/main/ReleaseNotes.md

Another observation, from LibreOffice developer Michael W, is that the 
output of different processes is interleaved character by character on 
the terminal, which should not happen with a buffered STDOUT (standard 
output).

One last note: the more parallelism you use, the more likely the build 
is to get stuck. I use 20 parallel processes; you can use 
--with-parallelism=1 to avoid the issue, or set a larger value to 
reproduce the problem.
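
When trying to reproduce the hang with a high parallelism value, a 
watchdog can tell a stuck build from a slow one. This is a minimal 
sketch, assuming the coreutils timeout utility is available; the 
function name run_with_watchdog and the time limit are hypothetical:

```shell
#!/bin/sh
# run_with_watchdog SECONDS CMD...: run CMD under coreutils `timeout` and
# print "hung" if the time limit was reached, "finished" otherwise.
# The command's own output is left untouched so that it still goes to the
# console, which is where the hang is observed.
run_with_watchdog() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    # coreutils timeout exits with status 124 when the limit is reached.
    if [ "$status" -eq 124 ]; then
        echo "hung"
    else
        echo "finished"
    fi
}

# Example, with a hypothetical one-hour limit:
#   run_with_watchdog 3600 make
```

"finished" only means that the command ended within the limit, not that 
the build succeeded. Child processes of a timed-out make may need to be 
cleaned up by hand.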

This is a more detailed backtrace:

(gdb) backtrace full
#0  0x00007ff8519e7a36 in fhandler_console::set_input_mode (m=m@entry=tty::cygwin,
     t=0x1a0030028, p=p@entry=0x800008da8)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/fhandler/console.cc:817
         unit = 0
         oflags = 4294967295
         resume_pid = <optimized out>
         flags = <optimized out>
#1  0x00007ff8519f175c in fhandler_console::post_open_setup (this=0x800008ba8,
     fd=<optimized out>) at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/fhandler/console.cc:1910
No locals.
#2  0x00007ff851948796 in dtable::init_std_file_from_handle (this=this@entry=0x800004870,
     fd=fd@entry=0, handle=0xffffffffffffffff, handle@entry=0x424)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/dtable.cc:425
         fh = 0x800008ba8
         io = {{Status = 6, Pointer = 0x6}, Information = 140704499238208}
         fai = {AccessFlags = 8}
         openflags = 65538
         tp = {c_buf_old = 0, w_buf_old = 0}
         buf = {dwSize = {X = 69, Y = 0}, dwCursorPosition = {X = 7, Y = 0},
           wAttributes = 16576, srWindow = {Left = 20929, Top = 32760, Right = 0,
             Bottom = 1024}, dwMaximumWindowSize = {X = 20914, Y = 32760}}
         dcb = {DCBlength = 4294953784, BaudRate = 7, fBinary = 1, fParity = 0,
           fOutxCtsFlow = 0, fOutxDsrFlow = 0, fDtrControl = 0, fDsrSensitivity = 0,
           fTXContinueOnXoff = 0, fOutX = 1, fInX = 0, fErrorChar = 1, fNull = 0,
           fRtsControl = 0, fAbortOnError = 0, fDummy2 = 0, wReserved = 0, XonLim = 1280,
           XoffLim = 21, ByteSize = 0 '\000', Parity = 0 '\000', StopBits = 45 '-',
           XonChar = -58 '\306', XoffChar = -22 '\352', ErrorChar = 54 '6',
           EofChar = -89 '\247', EvtChar = -53 '\313', wReserved1 = 24591}
         bin = 65536
         dev = {<_device> = {_name = 0x7ff851b5c2f5 <msg1+11749> "/dev/console", d = {
               devn = 327681, devn_fh_devices = FH_CONSOLE, {minor = 1, major = 5}},
             _native = 0x7ff851b5c2f5 <msg1+11749> "/dev/console",
             exists_func = 0x7ff8519412bc <exists_console(device const&)>, _mode = 8192,
             lives_in_dev = true, dev_on_fs = false, name_allocated = false,
             native_allocated = false}, <No data fields>}
         access = <optimized out>
         ft = <optimized out>
         name = <optimized out>
         __PRETTY_FUNCTION__ = "void dtable::init_std_file_from_handle(int, HANDLE)"
#3  0x00007ff851948a61 in dtable::stdio_init (this=0x800004870)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/dtable.cc:162
         in = 0x424
         out = 0x18c
         err = 0x230
         __PRETTY_FUNCTION__ = "void dtable::stdio_init()"
#4  0x00007ff8519370e7 in dll_crt0_1 ()
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/dcrt0.cc:929
         __PRETTY_FUNCTION__ = "void dll_crt0_1(void*)"
#5  0x00007ff851935d51 in _cygtls::call2 (this=0x7ffffce00,
     func=0x7ff851936f10 <dll_crt0_1(void*)>, arg=0x0, buf=buf@entry=0x7ffffcdf0)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/cygtls.cc:41
         res = <optimized out>
#6  0x00007ff851935dca in _cygtls::call (func=<optimized out>, arg=<optimized out>)
     at /usr/src/debug/cygwin-3.5.4-1/winsup/cygwin/cygtls.cc:28
         buf = '\000' <repeats 16 times>, "\r\000\000\000\000\000\000\000`=\301Q\370\177\000\000\030>\301Q\370\177\000\000\320>\301Q\370\177", '\000' <repeats 58 times>, "pX\223Q\370\177", '\000' <repeats 138 times>...
         protect = <optimized out>
#7  0x0000000000000000 in ?? ()
No symbol table info available.


Regards,
Hossein

-- 
Hossein Nourikhah, Ph.D., Developer Community Architect
Tel: +49 30 5557992-65 | Email: hossein at libreoffice.org
The Document Foundation, Winterfeldtstraße 52, 10781 Berlin, DE
Gemeinnützige rechtsfähige Stiftung des bürgerlichen Rechts
Legal details: https://www.documentfoundation.org/imprint

