[PATCH] client: Introduce functions to allocate and marshal proxies atomically
Kristian Høgsberg
hoegsberg at gmail.com
Tue Nov 19 11:46:56 PST 2013
On Tue, Nov 19, 2013 at 07:38:52AM +0200, Kalle Vahlman wrote:
> 2013/11/19 Bryce W. Harrington <b.harrington at samsung.com>:
> > On Fri, Nov 15, 2013 at 08:51:50PM -0800, Kristian Høgsberg wrote:
> >> The server requires clients to only allocate one ID ahead of the previously
> >> highest ID in order to keep the ID range tight. Failure to do so will
> >> make the server close the client connection. However, the way we allocate
> >> new IDs is racy. The generated code looks like:
> >>
> >> new_proxy = wl_proxy_create(...);
> >> wl_proxy_marshal(proxy, ... new_proxy, ...);
> >>
> >> If two threads do this at the same time, there's a chance that thread A
> >> will allocate a proxy, then get pre-empted by thread B which then allocates
> >> a proxy and then passes it to wl_proxy_marshal(). The ID for thread As
> >> proxy will be one higher that the currently highest ID, but the ID for
> >> thread Bs proxy will be two higher. But since thread B prempted thread A
> >> before it could send its new ID, B will send its new ID first, the server
> >> will see the ID from thread Bs proxy first, and will reject it.
> >>
> >> We fix this by introducing wl_proxy_marshal_constructor(). This
> >> function is identical to wl_proxy_marshal(), except that it will
> >> allocate a wl_proxy for NEW_ID arguments and send it, all under the
> >> display mutex. By introducing a new function, we maintain backwards
> >> compatibility with older code from the generator, and make sure that
> >> the new generated code has an explicit dependency on a new enough
> >> libwayland-client.so.
> >>
> >> A virtual Wayland merit badge goes to Kalle Vahlman, who tracked this
> >> down and analyzed the issue.
> >>
> >> Reported-by: Kalle Vahlman <kalle.vahlman at movial.com>
> >
> > Is there a test case available for checking this problem?
> >
> > Also, is it a theoretical issue or does it crop up in actual use? If
> > the latter, then might be worth including a test case in the test
> > suite.
>
> This has been spotted in a real system with threaded rendering.
> Reproducing the issue there was a bit of a challenge, as anything from
> the simplest printf to a full tracing would increase the amount of
> iterations required for this to trigger. Each iteration took 10-15s
> and the iteration count with debug prints soared to hundreds or
> thousands... You get the idea.
>
> However, reproducing the issue with an artificial example is very easy:
>
> a = wl_proxy_create(...);
> b = wl_proxy_create(...);
> wl_proxy_marshal(..., b);
> wl_proxy_marshal(..., a);
> wl_display_flush(...);
>
> I'm not sure if creating a test case for the artificial case is worth
> it though, this patch fixes things so that we can not end up with that
> sequence.
Were you able to verify that the patch fixes the problem for you?
Kristian
More information about the wayland-devel
mailing list