[pulseaudio-discuss] Access control

Mon Feb 13 12:58:34 UTC 2017

On Fri, 2017-01-27 at 17:15 +0100, Wim Taymans wrote:
> Hi All,
> 
> I took another look at the access control patches.
> 
> There was a desire from Arun and Tanu to have the access control
> checks more integrated
> into the core objects instead of just some checks in the native protocol.

I don't remember what I've said before, but I care more about getting
this done than getting this done my way.

> I was a bit reluctant to start this because it would involve passing
> around a lot of pa_client
> objects to many functions in order to do access control. This week, I
> put my brain to 0 and
> started this anyway, You can find the result in this branch:
> 
> (1)  https://cgit.freedesktop.org/~wtay/pulseaudio/log/?h=access-hooks-arg
> 
> I then spent some time thinking about using TLS to pass the client
> around, more like a context
> to evaluate the functions in than an argument of the function. Tanu
> then suggested to put the
> current_client in the pa_core object where it can be picked up from
> anywhere. The trick is then
> to set and clear the current_client in the various protocols. The
> result can be seen here:
> 
> (2) https://cgit.freedesktop.org/~wtay/pulseaudio/log/?h=access-hooks-core
> 
> You can still find the old way here:
> 
> (3) https://cgit.freedesktop.org/~wtay/pulseaudio/log/?h=access-hooks
> 
> I think (1) is quite ugly and requires you to modify many functions
> with a new parameter that
> is not at all related to what the function does.

I disagree with the characterization that the client parameter is not
related to what the function does. If the function does access control,
then the client parameter is very much related to that.

> It does however give
> you good control over what
> you can check. One example is pa_sink_update_rate() which might
> suspend and unsuspend
> the source to implement the rate change. It a new client does not have
> permission to suspend a
> sink, it will not be able to change the rate.

If I understood correctly, you're comparing options (1) and (2) here,
and you're saying that option (2) will prevent sink rate changes
for streams that are played by clients that don't have the permission
to suspend sinks. I don't think that's true. pa_sink_update_rate() can
set current_client to NULL before it calls pa_sink_suspend().

> (2) looks quite ok. Setting and clearing the current_client is
> straightforward and easily verifiable.
> You can also be certain that when you add new checks, you can just get
> the client from there
> instead of having to change a method. The downside is that if ever a
> non-main thread tries to
> do some access control, you get random behaviour. using TLS will, of
> course fix this.

I don't know why you're worried about other threads. I don't see the
need to do access control from other threads than the main thread.

> (3) is probably still the most simple solution but only deals with the
> native-protocol. (1) and (2)
> now also automatically work for the cli, DBus, esound, http, simple
> and native protocol.
> 
> What do you think? Which option do you like best?

I feel I need to have a close look at all the implementations to give a
definite opinion. So far I've only looked at the implementation of (3).
At the moment I'm not strongly opposed to any of the alternatives, but
if I had to choose right now, I'd prefer (1), then (2) and last (3).
Option (3) is last, because it's easier to forget access checks when
the core doesn't enforce it. (1) and (2) are functionally equivalent.
Both are about enforcing access control in the core, and in both cases
the core functions need the client object as a parameter. Option (1)
does the parameter passing in the usual way, and option (2) does the
parameter passing in an unusual way. Option (1) makes it obvious which
functions involve access control, and forces the programmer to think
about how access control should be done in each case, which I believe
reduces the possibility for errors.

Below are my notes for (3), the "access-hooks" branch. Some of the
notes are perhaps too nitpicky when considering that this is just
prototype code, not yet intended to be ready for merging. I will
hopefully soon review the other branches.

creds: add pid to pa_creds and use store it in pa_client
--------------------------------------------------------

This patch seems to have three separate changes that should probably be
in separate patches: adding pid to pa_creds, adding creds to pa_client
and adding the CLIENT_AUTH hook.

protocol-native: add access checks
----------------------------------

+#define CHECK_ACCESS_STMT(c, command, tag, idx, name, async, denied) { \
+  pa_hook_result_t res = check_access(pd, command, tag, t, userdata, idx, 0, name); \
+  if (res == PA_HOOK_STOP) { \
+      denied; \
+  } else if (res == PA_HOOK_CANCEL) { \
+      async; \
+  } \
+};

Why does the macro take the c parameter when it's not used?

check_access() is passed arguments pd, t and userdata that are not
declared anywhere - are they assumed to be present in the macro calling
context?

The macros should have some comments about what they do, and what the
parameters are.

check_access() takes parameters that are interpreted differently based
on the command. There should be a table that can be used as a reference
for looking up the parameter meanings.

When creating streams, there's no filtering based on which device is
requested for the stream. Maybe that's good enough for now. Such
filtering exists for sample playing, however. Why this inconsistency?

check_access() takes a void pointer "userdata". It's always a
pa_native_connection object, so change the parameter type to
pa_native_connection poiner.

subscription_cb():

+    if (check_access (NULL, PA_COMMAND_SUBSCRIBE_EVENT, 0, NULL, userdata, idx, e, NULL) != PA_HOOK_OK)
+      return;

Tag is set to 0, but (uint32_t) -1 would be better. There are
whitespace issues too.

command_set_volume():

     CHECK_VALIDITY(c->pstream, si || so || sink || source, tag, PA_ERR_NOENTITY);

     client_name = pa_strnull(pa_proplist_gets(c->client->proplist, PA_PROP_APPLICATION_PROCESS_BINARY));

+    CHECK_ACCESS(c, command, tag, idx, name);

It would be better to add the check before the client_name assignment,
since client_name is not used in the check.

command_set_default_sink_or_source():

         source = pa_namereg_get(c->protocol->core, s, PA_NAMEREG_SOURCE);
         CHECK_VALIDITY(c->pstream, source, tag, PA_ERR_NOENTITY);
+        CHECK_ACCESS(c, command, tag, source->index, s);

Since the goal seems to be completely hide the existence of sinks and
sources (e.g. server info hides the default sink and source names),
it's probably not a good idea to respond with PA_ERR_NOENTITY before
doing the access check, or the access check should respond with
PA_ERR_NOENTITY too. If the client can't see what devices exist, it
should be impossible to distinguish between "access denied" and "no
such entity".

The same comment applies to command_kill(), command_unload_module(),
command_move_stream(), command_suspend(), command_extension(),
command_set_card_profile(), command_set_sink_or_source_port(),
command_set_port_latency_offset() (and possibly elsewhere too, it's
easy to miss this particular issue).

map_table needs to be updated whenever the set of commands changes.
Therefore, a comment should be added to the command enum.

+typedef struct pa_protocol_native_access_data {
+    pa_access_data d;
+
+    pa_pdispatch *pd;
+    uint32_t command;
+    uint32_t tag;
+    pa_tagstruct *tc;
+    void *userdata;
+} pa_protocol_native_access_data;

The struct is not exported outside the .c file, so typedef should not
be used, and it doesn't need the "pa_protocol_native_" prefix.
https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/Developer/CodingStyle/

"tc" is a weird abbreviation of "tagstruct", IMHO. I think "tagstruct"
would be a better name for the field.

The userdata field should be changed to "pa_native_connection
*connection".

Reusing the existing pa_hook_result_t type is a bit ugly. I'd define a
new type with more descriptive names than OK, STOP and CANCEL. Also, I
think it would be good to not allow more than one callback per hook,
because the current stacking semantics are weird: if the first callback
returns OK and the second returns STOP, the result is that access is
denied, which makes sense, but if the first callback returns CANCEL
(i.e. async check), the second callback won't be called at all. If the
second callback would return STOP if it were called, then shouldn't
that override the previous CANCEL result?

+static void check_access_finish_cb(pa_access_data *data, bool res) {

What's "res"? The result of the authorization? "access_granted" would
be a better name, except that the parameter doesn't affect the
authorization end result when the value is true, so inverting the
meaning and renaming the parameter to "access_denied" would perhaps
make sense. It's not clear to me why the parameter even exists.

+finish:
+    pa_xfree((char *)d->d.name);
+    if (d->pd)
+        pa_pdispatch_unref(d->pd);
+    if (d->tc)
+        pa_tagstruct_free(d->tc);
+    pa_xfree(d);

It would be good to encapsulate this in access_data_free().

Is it necessary to hold a reference to the pdispatch object? The
pdispatch is guaranteed to stay around as long as the
pa_native_connection object exists. Can we cancel all pending access
checks (and free the access_data structs) when a connection dies,
before pa_native_connection drops its reference to the pdispatch
object?

Should check_access() default to deny instead of allow when
map_table[command] is NULL?

module-access: add example access module
----------------------------------------

+typedef struct access_policy access_policy;
+typedef struct event_item event_item;
+typedef struct client_data client_data;
+typedef struct userdata userdata;

What's the purpose of these typedefs? If you need to forward declare
something (like the access_policy struct references the userdata struct
before the userdata struct has been defined) you can just do "struct
foo;", no need for typedefs.

+struct client_data {
+    uint32_t index;
+    uint32_t policy;
+
+    PA_LLIST_HEAD(event_item, events);
+};

Why is the policy referenced by index rather than by pointer?

+struct userdata {
+    pa_core *core;
+
+    pa_hook_slot *hook[PA_ACCESS_HOOK_MAX];
+
+    pa_idxset *policies;
+    uint32_t default_policy;
+
+    pa_hashmap *clients;
+    pa_hook_slot *client_put_slot;
+    pa_hook_slot *client_proplist_changed_slot;
+    pa_hook_slot *client_unlink_slot;
+};

Why is the default policy referenced by index rather than by pointer?

It would be nice to have a comment that documents the clients hashmap's
key and value types. It seems that the keys are client indexes, why not
pa_client pointers?

Why does the module follow client proplist changes? It looks like the
intention is to update the access policy based on the proplist, but the
module actually never changes it. Maybe that's because the code is only
at a prototype phase, but I don't think the code should ever evolve to
use the proplist as a basis for choosing the access policy. The
proplist is controlled by the client, so it can't be trusted.

You can use pa_module_hook_connect() for avoiding the need to manually
free the client put, unlink and proplist_changed hook slots.

+        u->hook[i] = pa_hook_connect(&u->core->access[i], PA_HOOK_EARLY - 1, cb, u);

Setting the priority to super high doesn't make much sense to me. This
is the only module using the hooks, and you don't know what other
modules there are going to be, and how they should be prioritized
relative to this one. Stacking access control modules doesn't make much
sense anyway with the current hook semantics (as I explained earlier).

module-access: add async handler for record
-------------------------------------------

struct client_data {
+    struct userdata *u;
+
+    uint32_t index;
+    uint32_t policy;
+
+    struct async_cache cached[PA_ACCESS_HOOK_MAX];
+    pa_time_event *time_event;
+    pa_access_data *access_data;
+
+    PA_LLIST_HEAD(event_item, events);
+};

pa_access_data is a per-request object, but this struct allows having
only one pa_access_data per client. Shouldn't the module be prepared to
handle multiple simultaneous requests?

It would be nice to have a comment about what an "event_item"
represents. I read the code yesterday, but now I have already forgotten
what the purpose of those event items is... Ok, now I re-read the code,
and an "event item" actually is just an identifier for an object
(consisting of the type and object index). The events list in
client_data contains the identifiers of all objects that the client is
allowed to see. If an object isn't in the list, then subscription
events for that object are prevented from reaching the client. I think
"visible_objects" or something like that would be a better name for the
list.

-- 
Tanu

https://www.patreon.com/tanuk