[PATCH wayland 8/9] client: Replace the singleton zombie with bespoke zombies

Wed Apr 12 14:41:43 UTC 2017

On 12/04/17 09:07 AM, Pekka Paalanen wrote:
> On Wed, 12 Apr 2017 11:53:02 +0800
> Jonas Ådahl <jadahl at gmail.com> wrote:
>
>> On Tue, Apr 11, 2017 at 10:37:28AM -0500, Derek Foreman wrote:
>>> On 07/04/17 03:27 PM, Derek Foreman wrote:
>>>> Using the singleton zombie object doesn't allow us to posthumously retain
>>>> object interface information, which makes it difficult to properly inter
>>>> future events destined for the recently deceased proxy.
>>>>
>>>> Notably, this makes it impossible for zombie proxy destined file
>>>> descriptors to be properly consumed.
>>>>
>>>> Instead of having a singleton zombie object, we add a new wl_map flag
>>>> (valid only on the client side where zombies roam - the existing
>>>> "legacy" flag was only ever used on the server side) to indicate that a
>>>> map entry is now a zombie.
>>>>
>>>> We still have to refcount and potentially free the proxy immediately or
>>>> we're left with situations where it can be leaked forever.  If we clean
>>>> up on delete_id, for example, a forced disconnect will result in a leaked
>>>> proxy (breaking much of the test suite).
>>>>
>>>> So, since we must free the zombied proxy immediately, we store just the
>>>> interface for it in its previous map location, so signature information
>>>> can be retained for zombie-destined events.
>>>
>>> So even with this in place it's still possible for a malicious application
>>> to consume 1000fds (the number of fds that fit in fds_in) and go to sleep
>>> and hold them - on client disconnect they should be freed.  I don't really
>>> see a way to prevent that sort of crap anyway - a client could use sendmsg
>>> directly, send a single byte of regular data (ie: not enough to trigger
>>> demarshalling, the server will assume there's more to come), and a buffer's
>>> worth of file descriptors, then never speak again.
>>>
>>> This appears indistinguishable from reasonable behaviour?
>
> Possibly, unless we can prove that we always send complete messages so
> that we could require receiving complete messages (and AF_UNIX
> guarantees that) in a single recvmsg() call.

As far as I can tell there's no promise of that - it looks like a 
SOCK_STREAM is a byte stream with no atomicity/boundary guarantees...

If someone out there has deeper knowledge and can quote spec that gets 
us out of this, please speak up. :)

If this is really a concern we could also just kill any client with > 
50(arbitrary number) fds lingering in its fds_in, because that's just 
anti-social behaviour, right?

I think a dmabuf using client could force the compositor to hold open a 
large amount of fds as well, without being bounded by the 1024 fd 
capacity of the fds_in buffer anyway - so I'm really not sure how much 
we have to care about this...

>>> There's also an interesting side effect to this patch that I noticed
>>> yesterday - it places a requirement on the client to keep the wl_interfaces
>>> around in memory long after it destroys the proxy - up until the delete_id
>>> occurs.  We have no way to hook delete_id in the client.  This turned into a
>>> crash in EFL clients using the text input protocol, as the input method code
>>> in EFL is a module that's unloaded on shutdown.  It was easily fixed in EFL,
>>> but I'd like some input - is a client allowed to unmap the memory that a
>>> wl_interface is stored in at the moment it deletes the relevant proxy?
>>>
>>> This isn't terribly difficult to fix, but I won't bother if the behaviour is
>>> universally concluded to be a client bug. :)
>
> Even though it's EFL, I think it's actually something people might be
> using. Unloading plugins, or libEGL.so or whatever, while keeping the
> Wayland connection alive.
>
> Hence I would be hesitant to add the requirement that wl_interface data
> needs to be kept around until the connection is closed.

Ok, we're up to two votes that this is an abi break, countered by my own 
apathy and a powerful shrug from an anonymous third party.  I'm going to 
treat this as a real bug I've introduced that needs to be properly 
addressed.

>> We could instead introduce a struct wl_zombie_object that constains the
>> data needed for doing a proper cleanup. This data could be for example
>> be number of fds per event opcode, or strdup:ed event signatures or
>> something. All of this would be known at zombification, and we'd avoid
>> any ABI change as this patch, as you say, seems to introduce.

This seems reasonable...  With this and a map walk at client disconnect 
to reap any zombies that hadn't received delete_id yet, I think we're good.

>> If we'd care about minimizing allocations, we could also insert the old
>> WL_ZOMBIE_OBJECT when there are no fds to clean up, and handle that as a
>> another special case on clean up.

Are we that concerned about allocations/frees?  It seems a little 
confusing to do this.

> If we need to allocate stuff to ensure clean-up, that allocation should
> be done at the time the wl_proxy is created. Allocation may
> theoretically fail, so destruction paths shouldn't allocate.
> Essentially that means we should copy what we need at the time of
> setting the interface data.

I hadn't actually considered that, I'll keep it in mind for the next 
revision.

> Since we can receive multiple messages in a single recvmsg() call, we
> need to parse them all before we can figure out which fds belong to
> which message. I wish there was a way around that, but it looks like we
> need the message fd counts in any case. Maybe copying just the fd
> counts only for opcodes that have any is the best we can do. Too bad
> our wire format does not include the number of fds along with the
> message length.

Yeah, I don't think there's any flexibility in the wire format at all, 
anything we might try to do will break something.

I'm going to go with fd counts for opcodes for v2, and see what that 
looks like...

>
> Thanks,
> pq
>