[RFC wayland] wayland-server: assert instead of posting events with wrong client objects
Derek Foreman
derekf at osg.samsung.com
Thu Dec 8 22:01:55 UTC 2016
On 08/12/16 02:54 PM, Pekka Paalanen wrote:
> On Thu, 8 Dec 2016 13:28:41 -0600
> Derek Foreman <derekf at osg.samsung.com> wrote:
>
>> On 02/12/16 02:07 AM, Pekka Paalanen wrote:
>>> On Thu, 1 Dec 2016 17:19:41 -0600
>>> Derek Foreman <derekf at osg.samsung.com> wrote:
>>>
>>>> Check that all the objects in an event belong to the same client as
>>>> the resource posting it. This prevents a compositor from accidentally
>>>> mixing client objects and posting an event that causes a client to
>>>> kill itself.
>>>>
>>>> It's intended that the compositor killing itself be easier to debug
>>>> than the client killing itself for something that it's completely
>>>> innocent of.
>>>>
>>>> Signed-off-by: Derek Foreman <derekf at osg.samsung.com>
>>>> ---
>>>>
>>>> Recently caught a bug in Enlightenment where we were posting keyboard enter
>>>> events with a surface from another client. In most cases this results in
>>>> the client disconnecting itself due to the target object being the wrong type.
>>>> It might be more hilarious if the types just magically happened to match.
>>>>
>>>> Are we interested in this level of footgun protection?
>>>>
>>>> It's also been suggested that this might be better if conditional on
>>>> WAYLAND_DEBUG.
>>>
>>> Hi Derek,
>>>
>>> this is something I have been wanting for a long time. So far I only
>>> wrote down what we should have with libwayland-client:
>>> https://phabricator.freedesktop.org/T50
>>
>> That is a nice list. :)
>>
>> Glad I'm not the only one that wants this.
>
> Hi,
>
> a quick reply in the middle of the night...
>
>>> IMO, your check should not be an assert, it should be an unconditional
>>> abort() with no way to disable it and a clear log message. Like a
>>> segfault is, only better.
>>>
>>> A compositor that handles SEGVs and ABRTs will then also provide a
>>> backtrace that should be very helpful in squashing the bug.
>>>
>>> However, it seems our existing convention is to just kill the client on
>>> compositor programming errors (e.g. wl_closure_marshal() failing on
>>> nullable violation). I don't really like that, but it's a precedent and
>>> the very least we can detect the problem and yell rather than have
>>> clients mysteriously fail.
>>
>> I think it would be trivial to catch nullable violation from the
>> compositor in the same place where I'm testing for accidental mixing of
>> client objects.
>>
>> I don't think we should remove any client side sanity checks, but adding
>> that check on the compositor side seems like a win for debugging to me.
>> Should I do this too?
>
> Sure. The nullable violation is already checked, it might only miss
> a good log message. Server makes a mistake, the client is
> disconnected when the error is detected...
Ah, ok, that's tested in wl_closure_marshal - so it tests for nullable
violation in both client and server in the same place.
The log message is already concise, IMHO.
I am left wondering if I should have a client object mix-up react in the
same way - log something and disconnect the client.
I'm surprised that dup failure and unhandled arg type abort, but
nullable violation, too many args, and malloc failure keep marching on.
Were these prioritized this way intentionally?
>>> libwayland-client does call wl_abort() on such failures.
>>>
>>>> src/wayland-server.c | 23 +++++++++++++++++++++++
>>>> 1 file changed, 23 insertions(+)
>>>>
>>>> diff --git a/src/wayland-server.c b/src/wayland-server.c
>>>> index 9d7d9c1..bf1719a 100644
>>>> --- a/src/wayland-server.c
>>>> +++ b/src/wayland-server.c
>>>> @@ -160,6 +160,27 @@ log_closure(struct wl_resource *resource,
>>>> }
>>>> }
>>>>
>>>> +static bool
>>>> +clients_match(struct wl_resource *resource, uint32_t opcode,
>>>> + union wl_argument *args)
>>>> +{
>>>> + struct wl_object *object = &resource->object;
>>>> + const char *sig = object->interface->events[opcode].signature;
>>>> +
>>>> + for (;*sig;sig++) {
>>>> + if (*sig == '?')
>>>> + continue;
>>>
>>> How about skipping version numbers as well?
>>
>> Are there actually version numbers that have a place in the signature
>> but no associated argument? My read of get_next_argument left me
>> thinking that only ? was present without an argument.
>>
>> This is really an academic question because...
>>
>>> I believe there should be similar parsing functions elsewhere to either
>>> follow or even reuse/refactor.
>>
>> Yes, there's get_next_argument, which I've re-written my patch to use.
>> So I think that should handle any appropriate skipping that I may have
>> otherwise missed?
>
> I'd hope so.
>
>>>> +
>>>> + if (*sig == 'o') {
>>>
>>> I think you would also want to handle new_id args. Events with new_id
>>> are rare, but they do exist (wl_data_device.data_offer).
>>
>> I'm not sure how to do this. new_id will just give me a number, and I
>> think it would be complicated to ensure that the id is for the correct
>> client?
>
> I thought that when sending events with new_id, the compositor
> passes in wl_resource, not an uint32_t. It needs to create the
> wl_resource right before sending the event, IIRC.
Oh, that should be trivial then, I just need to treat 'n' exactly the
same as 'o', they'll both set args[i].o in wl_argument_from_va_list.
So my next revision will add 'n'
and will either use wl_abort() or wl_log() and set
resource->client->error = 1 instead of sending a closure.
I'll wait a bit to see if anyone else has opinions on server side self
destruct before posting. :)
Thanks,
Derek
>
> Thanks,
> pq
>
>> I wouldn't mind putting this down for "further work at a later date". :)
>>
>> I'll follow up shortly with a new patch.
>>
>> Thanks,
>> Derek
>>
>>>> + struct wl_resource *res = (struct wl_resource *) (args->o);
>>>> + if (res && res->client != resource->client)
>>>> + return false;
>>>> + }
>>>> + args++;
>>>> + }
>>>> + return true;
>>>> +}
>>>> +
>>>> WL_EXPORT void
>>>> wl_resource_post_event_array(struct wl_resource *resource, uint32_t opcode,
>>>> union wl_argument *args)
>>>> @@ -167,6 +188,7 @@ wl_resource_post_event_array(struct wl_resource *resource, uint32_t opcode,
>>>> struct wl_closure *closure;
>>>> struct wl_object *object = &resource->object;
>>>>
>>>> + assert(clients_match(resource, opcode, args));
>>>> closure = wl_closure_marshal(object, opcode, args,
>>>> &object->interface->events[opcode]);
>>>>
>>>> @@ -206,6 +228,7 @@ wl_resource_queue_event_array(struct wl_resource *resource, uint32_t opcode,
>>>> struct wl_closure *closure;
>>>> struct wl_object *object = &resource->object;
>>>>
>>>> + assert(clients_match(resource, opcode, args));
>>>> closure = wl_closure_marshal(object, opcode, args,
>>>> &object->interface->events[opcode]);
>>>>
>>>
>>> A good idea, all in all.
>>>
>>>
>>> Thanks,
>>> pq
>>>
>>
More information about the wayland-devel
mailing list