[PATCH] client: Add lock-fd API to avoid requiring a polling main thread

Kristian Høgsberg hoegsberg at gmail.com
Mon Mar 25 10:33:58 PDT 2013


On Sun, Mar 24, 2013 at 01:49:05PM +0100, Uli Schlachter wrote:
> Hi again,
> 
> On 22.03.2013 02:29, Kristian Høgsberg wrote:
> > On Thu, Mar 21, 2013 at 05:13:47PM +0100, Uli Schlachter wrote:
> >> On 21.03.2013 15:20, Kristian Høgsberg wrote:
> [...]
> >>> + * Calling wl_display_lock_fd() ensures that no other thread will read
> >>> + * the display fd.  This means that no events will be queued between
> >>> + * dispatching pending events and going to sleep in the event loop and
> >>> + * that no other thread will read from the fd when data becomes
> >>> + * available.  Calling wl_display_dispatch() will read events, unlock
> >>> + * the fd and then dispatch new main queue events, if any.
> >>
> >> I wonder if it really is a good idea to have wl_display_dispatch()
> >> automatically unlock the fd. From the implementation, it looks like
> >> it doesn't make things more complicated if users always have to
> >> match wl_display_lock_fd() with a suitable call to
> >> wl_display_release_fd()?
> >>
> >> This would mean that just a single implementation of
> >> wl_display_release_fd() is needed and the duplicate in read_events()
> >> below can be removed.
> > 
> > We could share the code internally, that's not a big deal.  The reason
> > for unlocking the fd in dispatch is that we only want to read the
> > events with the fd lock held, not dispatch them.  Reading the events and
> > queuing them should be instant, but the callbacks we do when
> > dispatching events could take a long time.  Other threads should be
> > allowed to read events while we dispatch.
> 
> Ah, I see. I guess I missed that there are still other things happening after the
> fd got unlocked. Never mind then.
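
(For reference, the usage this is meant to enable looks roughly like
the sketch below.  It assumes the names from this patch,
wl_display_lock_fd() and wl_display_release_fd(), together with the
existing wl_display_get_fd(), wl_display_flush(),
wl_display_dispatch_pending() and wl_display_dispatch(); it is only an
illustration of the intended flow, not part of the patch.)

	#include <poll.h>
	#include <wayland-client.h>

	/* Integrate the display with an external poll()-based loop:
	 * take the fd lock so no other thread reads (and queues)
	 * events behind our back, dispatch what is already queued,
	 * flush requests and go to sleep in poll().  On wakeup,
	 * wl_display_dispatch() reads the new events, releases the fd
	 * lock and only then runs the callbacks. */
	static int
	wait_and_dispatch(struct wl_display *display)
	{
		struct pollfd pfd;

		wl_display_lock_fd(display);
		wl_display_dispatch_pending(display);
		wl_display_flush(display);

		pfd.fd = wl_display_get_fd(display);
		pfd.events = POLLIN;
		if (poll(&pfd, 1, -1) == -1) {
			wl_display_release_fd(display);
			return -1;
		}

		return wl_display_dispatch(display);
	}
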
> 
> [...]
> >>> @@ -847,45 +973,107 @@ dispatch_event(struct wl_display *display, struct wl_event_queue *queue)
> >>>  }
> >>>  
> >>>  static int
> >>> -dispatch_queue(struct wl_display *display,
> >>> -	       struct wl_event_queue *queue, int block)
> >>> +read_events(struct wl_display *display, struct wl_event_queue *queue)
> >>>  {
> >>> -	int len, size, count, ret;
> >>> -
> >>> -	pthread_mutex_lock(&display->mutex);
> >>> -
> >>> -	if (display->last_error)
> >>> -		goto err_unlock;
> >>> -
> >>> -	ret = wl_connection_flush(display->connection);
> >>> -	if (ret < 0 && errno != EAGAIN) {
> >>> -		display_fatal_error(display, errno);
> >>> -		goto err_unlock;
> >>> +	struct pollfd pfd[2];
> >>> +	int len, size, ret;
> >>> +	struct waiter waiter, *w;
> >>> +	char c;
> >>> +
> >>> +	if (display->reader_state == NO_READER) {
> >>> +		/* A thread gets here when it's the first reader to
> >>> +		 * try to read and there's LOCKED_READER.  We set
> >>> +		 * VOLUNTEER_READER and poll on the fd and the
> >>> +		 * reader_pipe, so that a locked reader can preempt if
> >>> +		 * it needs to. */
> >>> +		display->reader = pthread_self();
> >>> +		display->reader_state = VOLUNTEER_READER;
> >>> +
> >>> +		pthread_mutex_unlock(&display->mutex);
> >>> +
> >>> +		pfd[0].fd = display->fd;
> >>> +		pfd[0].events = POLLIN;
> >>> +		pfd[1].fd = display->reader_pipe[0];
> >>> +		pfd[1].events = POLLIN;
> >>> +		ret = poll(pfd, 2, -1);
> >>> +		if (pfd[1].revents & POLLIN)
> >>> +			read(display->reader_pipe[0], &c, 1);
> >>> +
> >>> +		pthread_mutex_lock(&display->mutex);
> >>> +
> >>> +		if (ret == -1)
> >>> +			return -1;
> >>> +
> >>> +		/* If we come back from poll in NO_READER state,
> >>> +		 * another thread locked and then released the fd.
> >>> +		 * Nobody is going to read anything in this state
> >>> +		 * so return instead of blocking without a reader. */
> >>> +		if (display->reader_state == NO_READER)
> >>> +			return 0;
> >>
> >> Let's get creative:
> >>
> >> - Thread 1 becomes a VOLUNTEER_READER and sits in poll().
> >> [reader = 1, reader_state = VOLUNTEER]
> >> - Thread 2 locks the fd and writes to the pipe.
> >> [reader = 2, reader_state = LOCKED]
> >> - Thread 2 does its thing and unlocks the fd again.
> >> [reader = 2, reader_state = NO_READER]
> >> - Thread 3 becomes a VOLUNTEER_READER, calls poll(), sees that the reader_pipe
> >> is readable, reads from the pipe, proceeds to read from the FD
> >> [reader = 3, reader_state = NO_READER]
> >> - Finally, Thread 1 gets scheduled again and is blocked in read()ing from the pipe.
> >>
> >> The above had a thread other than the VOLUNTEER_READER read from the
> >> pipe and thus a "wakeup byte" got lost.
> > 
> > Yes, I see... nice.
> > 
> >> After thinking about this for 0.5 seconds, I don't have any good
> >> suggestions on how to handle this. And I don't feel like thinking
> >> right now... :-)
> > 
> > We could make a pipe per volunteer reader, I guess, but that seems a
> > little wasteful.  Another idea would be to not let anybody else become
> > volunteer reader until the last volunteer has left the
> > building... something with an old-volunteer-still-around flag and
> > another pthread cond...
> 
> I think that wl_display_lock_fd() should synchronize with the volunteer reader
> and only return once the reader is out of the way. This shouldn't cause much
> latency, especially since the participating thread would likely be blocked in
> poll() anyway.
> 
> For a more concrete proposal of what wl_display_lock_fd() could do:
> 
>   while (display->reader_state == VOLUNTEER_READER) {
>     display->reader_state = VOLUNTEER_READER_EXITING;
>     /* Does this need any magic to make sure the above write is seen by all
>        threads before the following write()? */
>     write(display->reader_pipe[1], &c, 1);
>     while (display->reader_state == VOLUNTEER_READER_EXITING)
>       pthread_cond_wait(<some cond, possibly the one you introduce for the
>                         new pthread_cond_broadcast()?>)
>     /* Read our byte back, doing this here solves some races */
>     read(display->reader_pipe[0], &c, 1);
>   }
>   display->reader_state = LOCKED_READER;
> 
> (The above is a "while" now instead of an "if", because new readers could
> show up while the last one is exiting, since the reader resets to NO_READER.
> Hm... :-/ )
> 
> And in read_events() something like this could work:
> 
>   ret = poll(pfd, 2, -1);
> 
>   pthread_mutex_lock(&display->mutex)
> 
>   if (display->reader_state == VOLUNTEER_READER_EXITING) {
>     display->reader_state = NO_READER;
>     pthread_cond_signal(<the condition that lock_fd() is waiting on>)
>     return 0;
>   }
>   if (ret == -1)
>     return -1;
> 
> The above should solve any possible race between poll() returning and
> wl_display_lock_fd() making it return. I hope.
> 
> Could this work? Did I miss anything?

I think you got the core idea right: wl_display_lock_fd() needs to
synchronize with the volunteer reader exiting, but I think it can be a
bit simpler.  wl_display_lock_fd() can just set LOCKED_READER before
writing to the pipe, and then wait on a cond (pipe_cond) for the
volunteer reader to exit the poll and read the token on the pipe.
Before we wait on pipe_cond and release the mutex, we've already set
LOCKED_READER, so no other thread is going to try to become reader.
Everybody else, including the current volunteer reader, is just going
to go straight to the pthread_cond_wait on the reader_cond.
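
Roughly, as a sketch of what I mean (pipe_cond as above; the
token_pending flag is only there to make the hand-off explicit, the
updated patch may well track this differently, and error handling and
the case where another thread already holds the lock are left out):

	void
	wl_display_lock_fd(struct wl_display *display)
	{
		char c = 0;

		pthread_mutex_lock(&display->mutex);

		if (display->reader_state == VOLUNTEER_READER) {
			/* Claim the fd before poking the pipe: once
			 * we're LOCKED_READER nobody else volunteers,
			 * so only the current volunteer can consume
			 * the token. */
			display->reader_state = LOCKED_READER;
			display->token_pending = 1;
			write(display->reader_pipe[1], &c, 1);

			/* Wait for the volunteer to come out of poll()
			 * and read the token off the pipe. */
			while (display->token_pending)
				pthread_cond_wait(&display->pipe_cond,
						  &display->mutex);
		} else {
			display->reader_state = LOCKED_READER;
		}

		display->reader = pthread_self();

		pthread_mutex_unlock(&display->mutex);
	}

and in read_events(), after poll() returns and the mutex has been
retaken, the volunteer checks whether a locked reader preempted it:

	if (display->reader_state == LOCKED_READER &&
	    display->token_pending) {
		read(display->reader_pipe[0], &c, 1);
		display->token_pending = 0;
		pthread_cond_signal(&display->pipe_cond);
		/* ...then fall through and wait on reader_cond like
		 * any other non-reading thread. */
	}
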

I'll send out the updated patch now.

Kristian

> [...]
> >>> +		wl_list_init(&display->read_waiter_list);
> >>> +	} else {
> >>> +		/* Some other thread signed up to read from the fd, so
> >>> +		 * just add ourselves to the waiter list and wait for
> >>> +		 * the reader to do the work. */
> >>> +		pthread_cond_init(&waiter.cond, NULL);
> >>> +		wl_list_insert(display->read_waiter_list.prev, &waiter.link);
> >>> +		pthread_cond_wait(&waiter.cond, &display->mutex);
> >>
> >> I think this needs to remove the new entry from the list
> >> again. Otherwise the list contains pointers to random addresses on
> >> the stack and Bad Things(tm) will happen.
> > 
> > This should be OK.  When we come out of the wait, we never access the
> > waiter again and it goes out of scope when we return from the
> > read_events().  The reader will call wl_list_init() after signalling
> > all waiters on the list.  This is done under the mutex so all the
> > waiters on the list will be valid until we signal the condition and
> > release the mutex.
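
In code, the reader side described here is roughly the following (w and
read_waiter_list as in the patch, still holding display->mutex):

	/* Wake every waiter, then reset the list head.  The on-stack
	 * struct waiter entries stay valid because the waiting threads
	 * cannot return from pthread_cond_wait() until the mutex is
	 * released, which only happens after wl_list_init(). */
	wl_list_for_each(w, &display->read_waiter_list, link)
		pthread_cond_signal(&w->cond);
	wl_list_init(&display->read_waiter_list);
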
> [...]
> 
> Ah, that does make sense, but it is enough magic to warrant a comment IMHO.
> However, replacing this with pthread_cond_broadcast() makes all of that magic go
> away anyway.
> 
> Cheers,
> Uli
> -- 
> "Do you know that books smell like nutmeg or some spice from a foreign land?"
>                                                   -- Faber in Fahrenheit 451

