[PATCH 3/4] tests: add display test that highlights another bug in read_events
Marek Chalupa
mchqwerty at gmail.com
Tue Sep 9 06:39:35 PDT 2014
On 9 September 2014 13:10, Marek Chalupa <mchqwerty at gmail.com> wrote:
>
>
> On 4 September 2014 15:17, Pekka Paalanen <ppaalanen at gmail.com> wrote:
>
>> On Fri, 29 Aug 2014 11:21:30 +0200
>> Marek Chalupa <mchqwerty at gmail.com> wrote:
>>
>> > When wl_display_read_events() returns with errno == EAGAIN, we
>> > naturally try to call it again. But this next call results in deadlock.
>> >
>> > Signed-off-by: Marek Chalupa <mchqwerty at gmail.com>
>> > ---
>> > tests/display-test.c | 88
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > 1 file changed, 88 insertions(+)
>> >
>> > diff --git a/tests/display-test.c b/tests/display-test.c
>> > index 1289866..e1cf830 100644
>> > --- a/tests/display-test.c
>> > +++ b/tests/display-test.c
>> > @@ -32,6 +32,7 @@
>> > #include <sys/types.h>
>> > #include <sys/stat.h>
>> > #include <pthread.h>
>> > +#include <poll.h>
>> >
>> > #include "wayland-private.h"
>> > #include "wayland-server.h"
>> > @@ -480,6 +481,93 @@ TEST(threading_cancel_read_tst)
>> > display_destroy(d);
>> > }
>> >
>> > +/* set this to 1 when it is right that
>> > + * the thread is woken up */
>> > +static int threads_can_be_woken_up = 0;
>> > +
>> > +static void *
>> > +thread_eagain_func(void *data) {
>> > + struct client *c = data;
>> > +
>> > + register_reading(c->wl_display);
>> > +
>> > + c->display_stopped = 1;
>> > + assert(wl_display_read_events(c->wl_display) == 0
>> > + && errno != EAGAIN);
>> > + /* if thread was sleeping in read_events, this assert
>> > + * must pass now */
>> > + assert(threads_can_be_woken_up && "Thread woke up too early");
>> > +
>> > + pthread_exit(NULL);
>> > +}
>> > +
>> > +static void
>> > +threading_read_eagain(void)
>> > +{
>> > + struct client *c = client_connect();
>> > + pthread_t th1, th2, th3;
>> > + struct pollfd fds;
>> > + int i;
>> > +
>> > + register_reading(c->wl_display);
>> > +
>> > + th1 = create_thread(c, thread_eagain_func);
>> > + th2 = create_thread(c, thread_eagain_func);
>> > + th3 = create_thread(c, thread_eagain_func);
>> > +
>> > + /* All the threads are sleeping, waiting until read or cancel
>> > + * is called. Since we have no data on socket waiting,
>> > + * the wl_connection_read should end up with error and set errno
>> > + * to EAGAIN => wl_display_read_events() will return 0 and set
>> errno
>> > + * to EAGAIN. It should not wake up the sleeping threads, so that
>> > + * this thread can try to read again */
>> > + alarm(3);
>> > + assert(wl_display_read_events(c->wl_display) == 0);
>> > + assert(errno == EAGAIN);
>>
>> I think there is a problem here. wl_display_read_events() returning 0
>> means success, and in that case errno may not be set if it was a real
>> success. There is no EAGAIN to be handled, it's already done inside
>> read_events().
>>
>> In this test it won't be a real success, but you cannot code clients to
>> check for errno after getting 0 from wl_display_read_events().
>>
>
>> > +
>> > + /* the same should happen next times - but there was a bug when
>> > + * the main thread got blocked! */
>> > + for (i = 0; i < 5; ++i) {
>> > + assert(wl_display_read_events(c->wl_display) == 0);
>>
>> Therefore you are here calling wl_display_read_events() again after it
>> succeeded! And that IMO is a developer error.
>>
>
> yes
>
>
>> Besides, I cannot see anywhere where it says that it is ok to call
>> wl_display_read_events() again even if it failed, without calling
>> wl_display_prepare_read() first.
>>
>>
> Since original piece of code returned 0 on EAGAIN, then I suppose the
> author
> considered this as a success, so yes, wl_display_read_events() is not
> probably meant
> to be called again. But consider this:
> if the thread that got EAGAIN (the only thread that is not sleeping) calls
> wl_display_prepare_read()
> + wl_display_read_events() again, it works. Problem is that it does not
> have to call it
> right after the failed wl_display_read_events and the other threads are
> still sleeping.
> Moreover, if there are not events on the socket for some time, this one
> thread is looping in cycle
> while the other threads are sleeping.
>
> What I meant to do was to give the programmer the choice whether to loop
> or let
> all threads try it again (cancel read). But as you said before, 0 is wrong
> return value for that.
> See end of the e-mail for conclusions.
>
>
>> But I see something else: wl_display_read_events() does not decrement
>> reader_count if it returns due to last_error. I think that needs a fix
>> to guarantee the reader_count is decremented, so that the side-effects
>> are consistent.
>>
>
> Yes, we should fix this. There are cases that can cause a problem.
>
Actually, I can not make up any use-case where it is problem... do you have
any?
> > + assert(errno == EAGAIN);
>> > + }
>> > +
>> > + /* generate something to read and wait until the event comes */
>> > + wl_proxy_marshal((struct wl_proxy *) c->tc,
>> > + 0 /* STOP DISPLAY */, 1 /* number */);
>> > + assert(wl_display_flush(c->wl_display) > 0);
>> > +
>> > + fds.fd = wl_display_get_fd(c->wl_display);
>> > + fds.events = POLLIN;
>> > + assert(poll(&fds, 1, 3) == 1);
>>
>> Is timeout of 3 milliseconds intended?
>>
>
> Nope, my bad.
>
>
>>
>> > +
>> > + /* threads can be woken up now */
>> > + threads_can_be_woken_up = 1;
>> > +
>> > + errno = 0;
>> > + assert(wl_display_read_events(c->wl_display) == 0);
>> > + assert(errno != EAGAIN);
>> > + assert(wl_display_dispatch_pending(c->wl_display) == 1);
>> > +
>> > + pthread_join(th1, NULL);
>> > + pthread_join(th2, NULL);
>> > + pthread_join(th3, NULL);
>> > +
>> > + client_disconnect(c);
>> > +}
>> > +
>> > +TEST(threading_read_eagain_tst)
>> > +{
>> > + struct display *d = display_create();
>> > +
>> > + client_create(d, threading_read_eagain);
>> > +
>> > + display_run(d);
>> > + display_resume(d);
>> > +
>> > + display_destroy(d);
>> > +}
>> > +
>> > static void *
>> > thread_prepare_and_read2(void *data)
>> > {
>>
>>
>> Thanks,
>> pq
>>
>
> First, thanks for review.
>
> Summing it up, in my point of view we have these options:
>
> * Let it as it is:
> this can lead to starving of other threads for some time. Moreover,
> if the thread that
> got EAGAIN decides to exit (or simply not to read again) instead of
> another trial to read, the other threads will block.
> * return -1 and set EAGAIN
> programmer can either try to read again, or cancel read and let all
> threads do another cycle. Good thing is
> that programmer knows that there was an error and can cancel read in
> the case that the thread is exiting
> instead of another cycle.
> * wake up threads on EAGAIN
> this solution will force all threads to do another cycle (or whatever
> they want) and should not have any side effects. It can lead to
> big overhead if there are no events on the socket for some time
> (waking up threads again and again in cycle)
>
> + fix the bug with last_error, of course
>
> Is there something that I'm missing?
>
> Thanks,
> Marek
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/wayland-devel/attachments/20140909/187551ce/attachment.html>
More information about the wayland-devel
mailing list