[pulseaudio-tickets] [Bug 71738] New: pulseaudio --start triggers undefined behaviour in pthread_join()

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon Nov 18 05:17:06 PST 2013


https://bugs.freedesktop.org/show_bug.cgi?id=71738

          Priority: medium
            Bug ID: 71738
                CC: lennart at poettering.net
          Assignee: pulseaudio-bugs at lists.freedesktop.org
           Summary: pulseaudio --start triggers undefined behaviour in
                    pthread_join()
        QA Contact: pulseaudio-bugs at lists.freedesktop.org
          Severity: normal
    Classification: Unclassified
                OS: OpenBSD
          Reporter: freedesktop-bugs at stsp.name
          Hardware: Other
            Status: NEW
           Version: unspecified
         Component: daemon
           Product: PulseAudio

Created attachment 89406
  --> https://bugs.freedesktop.org/attachment.cgi?id=89406&action=edit
Proposed patch to fix the problem. Don't call pthread_join() after fork().

When pulseaudio --start runs, it uses a lockfile to ensure that two
concurrently spawned 'pulseaudio --start' processes only spawn one daemon.
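
For readers unfamiliar with the lockfile approach, here is a minimal sketch of the idea. It is not PulseAudio's actual locking code. The names try_lock_file() and lock_demo() and the temporary path are hypothetical, and it assumes POSIX fcntl() advisory locks. While the first process holds the lock, a second process that tries to take it is refused and so knows not to spawn a daemon.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not PulseAudio code: try to take an exclusive
 * advisory lock on `path` without blocking.
 * Returns an open fd that holds the lock, -2 if another process holds
 * the lock, or -1 on any other error. */
static int try_lock_file(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;    /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "the whole file" */

    if (fcntl(fd, F_SETLK, &fl) == -1) {
        int busy = (errno == EACCES || errno == EAGAIN);
        close(fd);
        return busy ? -2 : -1;
    }
    return fd;
}

/* Demo: the parent takes the lock, then a forked child (standing in for
 * a second 'pulseaudio --start') must be refused.
 * Returns 0 if that is what happened. */
static int lock_demo(void) {
    char path[] = "/tmp/lockdemo-XXXXXX";   /* assumed scratch location */
    int tmp = mkstemp(path);
    if (tmp < 0)
        return -1;
    close(tmp);

    int fd = try_lock_file(path);
    if (fd < 0) {
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0)   /* child: fcntl locks are not inherited across fork */
        _exit(try_lock_file(path) == -2 ? 0 : 1);

    int status = 0;
    int ok = pid > 0 && waitpid(pid, &status, 0) == pid &&
             WIFEXITED(status) && WEXITSTATUS(status) == 0;

    close(fd);      /* closing the fd releases the lock */
    unlink(path);
    return ok ? 0 : 1;
}
```

Calling lock_demo() should return 0 on a system with working fcntl() locks.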

This procedure involves two processes, one of which also spawns threads.

With OpenBSD's pthread implementation, 'pulseaudio --start' hangs hard after
obtaining the lock file. The stuck process is waiting in pthread_join() for
another thread, even though that process contains only a single thread!
The problem is fixed by calling pthread_join() only in the process that
spawned the threads.
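
To illustrate the kind of guard this implies, here is a minimal sketch. It is not the attached patch. The struct owned_thread wrapper and the function names are hypothetical. The idea is to record the pid of the process that created the thread and to skip pthread_join() in any other process, such as a fork() child, where that thread ID does not refer to a live thread.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper: remembers which process created the thread. */
struct owned_thread {
    pthread_t id;
    pid_t owner;    /* pid of the process that called pthread_create() */
    int joined;
};

static void *worker(void *arg) {
    (void)arg;
    return NULL;
}

static int owned_thread_start(struct owned_thread *t) {
    assert(t != NULL);
    t->owner = getpid();
    t->joined = 0;
    return pthread_create(&t->id, NULL, worker, NULL);
}

/* Returns 1 if the thread was joined, 0 if the join was skipped (wrong
 * process or already joined), -1 if pthread_join() failed. */
static int owned_thread_join(struct owned_thread *t) {
    assert(t != NULL);
    if (t->owner != getpid() || t->joined)
        return 0;   /* e.g. a fork() child: the thread does not exist here */
    if (pthread_join(t->id, NULL) != 0)
        return -1;
    t->joined = 1;
    return 1;
}

/* Demo: create a thread, fork, and check that only the parent joins.
 * Returns 0 if the child skipped the join and the parent performed it. */
static int demo_join_after_fork(void) {
    struct owned_thread t;
    if (owned_thread_start(&t) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        owned_thread_join(&t);
        return -1;
    }
    if (pid == 0)   /* child: has a single thread, must not join */
        _exit(owned_thread_join(&t) == 0 ? 0 : 1);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    int child_skipped = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int parent_joined = owned_thread_join(&t) == 1;
    return (child_skipped && parent_joined) ? 0 : 1;
}
```

Calling demo_join_after_fork() should return 0. The child only calls getpid() and _exit() after fork(), both of which are async-signal-safe.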

Below is a quote from Philip Guenther, who helped me analyze the problem, with
more details:

[[[
The output of
        kdump -Hrf pulseaudio-kdump-4-hanging-process.out

confirms that it's bad code in pulseaudio.  From the end of that:

  8149/1008666 pulseaudio.orig GIO   fd 4 wrote 1 bytes
       "x"
  8149/1008666 pulseaudio.orig RET   write 1
  8149/1008149 pulseaudio.orig RET   poll 1
  8149/1008149 pulseaudio.orig CALL  read(0x3,0x7f7fffff669f,0x1)
  8149/1008149 pulseaudio.orig GIO   fd 3 read 1 bytes
       "x"
  8149/1008149 pulseaudio.orig RET   read 1


So, process 8149 contains at least two threads, with tids 1008666 and
1008149.  The gdb session from your previous email shows that the former
thread, 1008666, is the one that's the target of the hung pthread_join().

  8149/1008149 pulseaudio.orig CALL  read(0x3,0x7f7fffff6680,0x10)
  8149/1008149 pulseaudio.orig RET   read -1 errno 35 Resource temporarily unavailable
  8149/1008149 pulseaudio.orig CALL  pipe(0x7f7fffff67b0)
  8149/1008149 pulseaudio.orig RET   pipe 0
  8149/1008149 pulseaudio.orig CALL  sigprocmask(SIG_BLOCK,~0<>)
  8149/1008149 pulseaudio.orig RET   sigprocmask 0<>
  8149/1008149 pulseaudio.orig CALL  fork()
  8149/1008149 pulseaudio.orig RET   fork 15791/0x3daf

One of the threads forks, creating process 15791...

  8149/1008149 pulseaudio.orig CALL  sigprocmask(SIG_SETMASK,0<>)
  8149/1008149 pulseaudio.orig RET   sigprocmask ~0x10100<SIGKILL|SIGSTOP>
  8149/1008149 pulseaudio.orig CALL  close(0x7)
  8149/1008149 pulseaudio.orig RET   close 0
  8149/1008149 pulseaudio.orig CALL  read(0x6,0x7f7fffff6810,0x4)
  8149/1008666 pulseaudio.orig CALL  __threxit(0x3d59798e810)

...while the other thread in the original process, the target of the
problem pthread_join(), exits.

 15791/1015791 pulseaudio.orig RET   fork 0
 15791/1015791 pulseaudio.orig CALL  sigprocmask(SIG_SETMASK,0<>)
 15791/1015791 pulseaudio.orig RET   sigprocmask ~0x10100<SIGKILL|SIGSTOP>
 15791/1015791 pulseaudio.orig CALL  getthrid()
 15791/1015791 pulseaudio.orig RET   getthrid 1015791/0xf7fef
 15791/1015791 pulseaudio.orig CALL  sendto(0x4,0x7f7fffff66af,0x1,0x400<MSG_NOSIGNAL>,0,0)
 15791/1015791 pulseaudio.orig RET   sendto -1 errno 38 Socket operation on non-socket
 15791/1015791 pulseaudio.orig CALL  write(0x4,0x7f7fffff66af,0x1)
 15791/1015791 pulseaudio.orig GIO   fd 4 wrote 1 bytes
       "x"
 15791/1015791 pulseaudio.orig RET   write 1
 15791/1015791 pulseaudio.orig CALL  __thrsleep(0x3d59798e804,CLOCK_REALTIME,0,0x3d59798e800,0x3d5999f88e0)

The new process, which only has one thread, then tries to pthread_join()
that thread in the original process.  That's Wrong.  To quote XSH 2.9.2:

2.9.2 Thread IDs
      Although implementations may have thread IDs that are unique in a
      system, applications should only assume that thread IDs are usable
      and unique within a single process. The effect of calling any of the
      functions defined in this volume of POSIX.1-2008 and passing as an
      argument the thread ID of a thread from another process is
      unspecified.
]]]
