<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - pulseaudio --start triggers undefined behaviour in pthread_join()"
href="https://bugs.freedesktop.org/show_bug.cgi?id=71738">71738</a>
</td>
</tr>
<tr>
<th>CC</th>
<td>lennart@poettering.net
</td>
</tr>
<tr>
<th>Assignee</th>
<td>pulseaudio-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Summary</th>
<td>pulseaudio --start triggers undefined behaviour in pthread_join()
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>pulseaudio-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr>
<tr>
<th>OS</th>
<td>OpenBSD
</td>
</tr>
<tr>
<th>Reporter</th>
<td>freedesktop-bugs@stsp.name
</td>
</tr>
<tr>
<th>Hardware</th>
<td>Other
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Component</th>
<td>daemon
</td>
</tr>
<tr>
<th>Product</th>
<td>PulseAudio
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=89406" name="attach_89406" title="Proposed patch to fix the problem. Don't call pthread_join() after fork().">attachment 89406</a></span>
Proposed patch to fix the problem. Don't call pthread_join() after fork().

When 'pulseaudio --start' runs, it uses a lock file to ensure that two
concurrently spawned 'pulseaudio --start' processes start only one daemon.
This procedure involves two processes, one of which also spawns threads.

With OpenBSD's pthread implementation, 'pulseaudio --start' hangs after
obtaining the lock file. The stuck process is blocked in pthread_join(),
waiting for another thread, even though that process contains only a single
thread!

The problem is fixed by calling pthread_join() only in the process that spawns
the threads.

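For illustration only (this is not the attached patch): a minimal C sketch of
the rule "join only in the process that created the thread". It assumes the
creating process records its pid next to the thread ID and compares it with
getpid() before joining. The names owned_thread, owned_thread_start(),
owned_thread_join() and demo_join_after_fork() are hypothetical and do not
exist in PulseAudio.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not a PulseAudio type: remembers which process
 * created the thread, so that only that process ever joins it. */
struct owned_thread {
    pthread_t id;
    pid_t owner;   /* pid of the process that called pthread_create() */
};

static void *worker(void *arg) {
    (void)arg;     /* placeholder for real work */
    return NULL;
}

int owned_thread_start(struct owned_thread *t) {
    t->owner = getpid();
    return pthread_create(&t->id, NULL, worker, NULL);
}

/* Returns 1 if the thread was joined, 0 if the join was skipped because
 * the caller is not the creating process, -1 if pthread_join() failed. */
int owned_thread_join(struct owned_thread *t) {
    if (getpid() != t->owner)
        return 0;  /* e.g. a forked child: the thread does not exist here */
    return pthread_join(t->id, NULL) == 0 ? 1 : -1;
}

/* Returns 0 if the forked child skipped the join and the parent joined. */
int demo_join_after_fork(void) {
    struct owned_thread t;
    int status = 0;
    pid_t child;

    if (owned_thread_start(&t) != 0)
        return -1;

    child = fork();
    if (child == 0)
        _exit(owned_thread_join(&t) == 0 ? 0 : 1);  /* child: must skip */

    /* Parent (also on fork failure): it owns the thread, so it joins. */
    if (owned_thread_join(&t) != 1 || child < 0)
        return -1;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling demo_join_after_fork() from a main() exercises both sides of the
fork(). A pid comparison is only one way to express the rule; restructuring
the code so that the child never reaches the join path works as well.
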
Below is a quote from Philip Guenther, who helped me analyze the problem, with
more details:
[[[
The output of
kdump -Hrf pulseaudio-kdump-4-hanging-process.out
confirms that it's bad code in pulseaudio. From the end of that:
8149/1008666 pulseaudio.orig GIO fd 4 wrote 1 bytes
"x"
8149/1008666 pulseaudio.orig RET write 1
8149/1008149 pulseaudio.orig RET poll 1
8149/1008149 pulseaudio.orig CALL read(0x3,0x7f7fffff669f,0x1)
8149/1008149 pulseaudio.orig GIO fd 3 read 1 bytes
"x"
8149/1008149 pulseaudio.orig RET read 1
So, process 8149 contains at least two threads, with tids 1008666 and
1008149. The gdb session from your previous email shows that the former
thread, 1008666, is the one that's the target of the hung pthread_join().
8149/1008149 pulseaudio.orig CALL read(0x3,0x7f7fffff6680,0x10)
8149/1008149 pulseaudio.orig RET read -1 errno 35 Resource temporarily unavailable
8149/1008149 pulseaudio.orig CALL pipe(0x7f7fffff67b0)
8149/1008149 pulseaudio.orig RET pipe 0
8149/1008149 pulseaudio.orig CALL sigprocmask(SIG_BLOCK,~0<>)
8149/1008149 pulseaudio.orig RET sigprocmask 0<>
8149/1008149 pulseaudio.orig CALL fork()
8149/1008149 pulseaudio.orig RET fork 15791/0x3daf
One of the threads forks, creating process 15791...
8149/1008149 pulseaudio.orig CALL sigprocmask(SIG_SETMASK,0<>)
8149/1008149 pulseaudio.orig RET sigprocmask ~0x10100<SIGKILL|SIGSTOP>
8149/1008149 pulseaudio.orig CALL close(0x7)
8149/1008149 pulseaudio.orig RET close 0
8149/1008149 pulseaudio.orig CALL read(0x6,0x7f7fffff6810,0x4)
8149/1008666 pulseaudio.orig CALL __threxit(0x3d59798e810)
...while the other thread in the original process, the target of the
problem pthread_join(), exits.
15791/1015791 pulseaudio.orig RET fork 0
15791/1015791 pulseaudio.orig CALL sigprocmask(SIG_SETMASK,0<>)
15791/1015791 pulseaudio.orig RET sigprocmask ~0x10100<SIGKILL|SIGSTOP>
15791/1015791 pulseaudio.orig CALL getthrid()
15791/1015791 pulseaudio.orig RET getthrid 1015791/0xf7fef
15791/1015791 pulseaudio.orig CALL sendto(0x4,0x7f7fffff66af,0x1,0x400<MSG_NOSIGNAL>,0,0)
15791/1015791 pulseaudio.orig RET sendto -1 errno 38 Socket operation on non-socket
15791/1015791 pulseaudio.orig CALL write(0x4,0x7f7fffff66af,0x1)
15791/1015791 pulseaudio.orig GIO fd 4 wrote 1 bytes
"x"
15791/1015791 pulseaudio.orig RET write 1
15791/1015791 pulseaudio.orig CALL __thrsleep(0x3d59798e804,CLOCK_REALTIME,0,0x3d59798e800,0x3d5999f88e0)
The new process, which only has one thread, then tries to pthread_join()
that thread in the original process. That's Wrong. To quote XSH 2.9.2:
2.9.2 Thread IDs
Although implementations may have thread IDs that are unique in a
system, applications should only assume that thread IDs are usable
and unique within a single process. The effect of calling any of the
functions defined in this volume of POSIX.1-2008 and passing as an
argument the thread ID of a thread from another process is
unspecified.
]]]</pre>
</div>
</p>
</body>
</html>