[pulseaudio-discuss] [PATCH 6/8] message-params: Allow parameter strings to contain escaped curly braces

Tanu Kaskinen tanuk at iki.fi
Sat Jul 21 18:17:39 UTC 2018


On Mon, 2018-04-09 at 19:35 +0200, Georg Chini wrote:
> The patch adds the possibility to escape curly braces within parameter strings
> and introduces several new functions that can be used for writing parameters.
> 
> For writing, the structure pa_message_param, which is a wrapper for pa_strbuf
> has been created. Following new write functions are available:
> 
> pa_message_param_new() - creates a new pa_message_param structure
> pa_message_param_to_string() - converts a pa_message_param to string and frees
> the structure

I'd like to have the _free suffix in the function name, because for the
uninitiated the current name looks like just another "to_string"
function with no surprising side effects.

I think we need pa_message_params_free() as well in case an application
runs into an error while constructing the parameters and it wants to
just get rid of the pa_message_params struct without converting it to a
string first.

> pa_message_param_begin_list() - starts a list
> pa_message_param_end_list() - ends a list
> pa_message_param_write_string() - writes a string to a pa_message_param structure
> 
> For string parameters that contain curly braces, those braces must be escaped
> by adding a "\" before them. This however means that a trailing backslash would
> falsely escape the closing bracket. To avoid this, single quotes must be added
> at start and end of the string.

Why not solve this by the usual method: require escaping of "\" in
input with "\\" in output?

> The function pa_message_param_write_string()
> has a parameter do_escape.

Why not do escaping always?

> If true, the necessary escaping is added. Escaping
> is only needed if a string might fulfill one of the following conditions:
> 
> - It contains curly braces
> - It contains a trailing "\"
> - It is enclosed in single quotes
> 
> Other strings can be passed without modification.
> 
> For reading, pa_message_param_read_string() reverts the changes that
> pa_message_param_write_string() might have introduced.
> 
> The patch also adds more restrictions on the object path name. Now only
> alphanumeric characters and one of "_", ".", "-" and "/" are allowed.
> The path name may not end with a /. If the user specifies a trailing / when
> sending a message, it will be removed.
> ---
>  doc/messaging_api.txt           |  34 +++++++-
>  src/Makefile.am                 |   3 +-
>  src/map-file                    |   5 ++
>  src/pulse/message-params.c      | 181 ++++++++++++++++++++++++++++++++++++++--
>  src/pulse/message-params.h      |  23 ++++-
>  src/pulsecore/core.c            |  20 ++---
>  src/pulsecore/message-handler.c |  53 +++++++-----
>  src/utils/pactl.c               |   4 +-
>  8 files changed, 279 insertions(+), 44 deletions(-)
> 
> diff --git a/doc/messaging_api.txt b/doc/messaging_api.txt
> index 431a5df2..0e6be53f 100644
> --- a/doc/messaging_api.txt
> +++ b/doc/messaging_api.txt
> @@ -14,10 +14,42 @@ look like that:
>  {{Integer} {{1st float} {2nd float} ...}}{...}
>  Any characters that are not enclosed in curly braces are ignored (all characters
>  between { and {, between } and } and between } and {). The same syntax is used
> -to specify message parameters. The following reference lists available messages,
> +to specify message parameters. The reference further down lists available messages,
>  their parameters and return values. If a return value is enclosed in {}, this
>  means that multiple elements of the same type may be returned.
>  
> +There are several functions that simplify reading and writing message parameter
> +strings. For writing, the structure pa_message_param can be used. Following
> +functions are available:
> +pa_message_param_new() - creates a new pa_message_param structure
> +pa_message_param_to_string() - converts a pa_message_param to string and frees
> +the structure
> +pa_message_param_begin_list() - starts a list
> +pa_message_param_end_list() - ends a list
> +pa_message_param_write_string() - writes a string to a pa_message_param structure
> +
> +For string parameters that contain curly braces, those braces must be escaped
> +by adding a "\" before them. This however means that a trailing backslash would
> +falsely escape the closing bracket. To avoid this, single quotes must be added
> +at start and end of the string. The function pa_message_param_write_string()
> +has a parameter do_escape. If true, the necessary escaping is added. Escaping
> +is only needed if a string might fulfill one of the following conditions:
> +
> +- It contains curly braces
> +- It contains a trailing "\"
> +- It is enclosed in single quotes
> +
> +Other strings can be passed without modification.
> +
> +For reading, the following functions are available:
> +pa_message_param_split_list() - parse message parameter string
> +pa_message_param_read_string() - read a string from a parameter list
> +
> +pa_message_param_read_string() also reverts the changes that
> +pa_message_param_write_string() might have introduced.

I think the doxygen documentation is better suited for the details of
the C API. messaging_api.txt should contain information that is common
to both libpulse and pactl users.

> +
> +Reference:
> +
>  Object path: /core
>  Message: list-handlers
>  Parameters: None
> diff --git a/src/Makefile.am b/src/Makefile.am
> index ccdad8ff..72d8cf22 100644
> --- a/src/Makefile.am
> +++ b/src/Makefile.am
> @@ -675,6 +675,7 @@ libpulsecommon_ at PA_MAJORMINOR@_la_SOURCES = \
>  		pulse/timeval.c pulse/timeval.h \
>  		pulse/rtclock.c pulse/rtclock.h \
>  		pulse/volume.c pulse/volume.h \
> +		pulse/message-params.c pulse/message-params.h \
>  		pulsecore/atomic.h \
>  		pulsecore/authkey.c pulsecore/authkey.h \
>  		pulsecore/conf-parser.c pulsecore/conf-parser.h \
> @@ -884,6 +885,7 @@ libpulse_la_SOURCES = \
>  		pulse/mainloop-api.c pulse/mainloop-api.h \
>  		pulse/mainloop-signal.c pulse/mainloop-signal.h \
>  		pulse/mainloop.c pulse/mainloop.h \
> +		pulse/message-params.c pulse/message-params.h \
>  		pulse/operation.c pulse/operation.h \
>  		pulse/proplist.c pulse/proplist.h \
>  		pulse/pulseaudio.h \
> @@ -896,7 +898,6 @@ libpulse_la_SOURCES = \
>  		pulse/timeval.c pulse/timeval.h \
>  		pulse/utf8.c pulse/utf8.h \
>  		pulse/util.c pulse/util.h \
> -		pulse/message-params.c pulse/message-params.h \
>  		pulse/volume.c pulse/volume.h \
>  		pulse/xmalloc.c pulse/xmalloc.h
>  
> diff --git a/src/map-file b/src/map-file
> index 385731dc..372d190d 100644
> --- a/src/map-file
> +++ b/src/map-file
> @@ -225,8 +225,13 @@ pa_mainloop_quit;
>  pa_mainloop_run;
>  pa_mainloop_set_poll_func;
>  pa_mainloop_wakeup;
> +pa_message_param_begin_list;
> +pa_message_param_end_list;
> +pa_message_param_new;
>  pa_message_param_read_string;
>  pa_message_param_split_list;
> +pa_message_param_to_string;
> +pa_message_param_write_string;
>  pa_msleep;
>  pa_operation_cancel;
>  pa_operation_get_state;
> diff --git a/src/pulse/message-params.c b/src/pulse/message-params.c
> index 964e29d0..d68ec59d 100644
> --- a/src/pulse/message-params.c
> +++ b/src/pulse/message-params.c
> @@ -28,20 +28,55 @@
>  #include <pulse/xmalloc.h>
>  
>  #include <pulsecore/macro.h>
> +#include <pulsecore/strbuf.h>
>  
>  #include "message-params.h"
>  
> +/* Message parameter structure, a wrapper for pa_strbuf */
> +struct pa_message_param {
> +    pa_strbuf *buffer;
> +};
> +
> +/* Remove escaping from a string */
> +static char *unescape(char *value) {
> +    char *tmp;
> +
> +    if (!value)
> +        return NULL;
> +
> +    /* If the string is enclosed in single quotes, remove them */
> +    if (*value == '\'' && value[strlen(value) - 1] == '\'') {
> +
> +        memmove(value, value + 1, strlen(value));
> +        value[strlen(value) - 1] = 0;
> +    }
> +
> +    /* Remove escape character from curly braces if present. */
> +    while ((tmp = strstr(value, "\\{")))
> +        memmove(tmp, tmp + 1, strlen(value) - (size_t)(tmp  - value));
> +    while ((tmp = strstr(value, "\\}")))
> +        memmove(tmp, tmp + 1, strlen(value) - (size_t)(tmp  - value));
> +
> +    /* Return the pointer that was passed in. */
> +    return value;
> +}

core-util already contains pa_unescape() that does the same thing more
efficiently (if you drop the single quote thing).

> +
> +/* Read functions */
> +
>  /* Split the specified string into elements. An element is defined as
>   * a sub-string between curly braces. The function is needed to parse
>   * the parameters of messages. Each time it is called returns the position
>   * of the current element in result and the state pointer is advanced to
> - * the next list element.
> + * the next list element. On return, the parameter *is_unpacked indicates
> + * if the string is plain text or contains a sub-list. is_unpacked may
> + * be NULL.

is_unpacked looks like unnecessary complexity.
pa_message_params_read_string() should always unescape the value.

>   * The variable state points to, should be initialized to NULL before
>   * the first call. The function returns 1 on success, 0 if end of string
>   * is encountered and -1 on parse error. */
> -int pa_message_param_split_list(char *c, char **result, void **state) {
> +int pa_message_param_split_list(char *c, char **result, bool *is_unpacked, void **state) {
>      char *current = *state ? *state : c;
>      uint32_t open_braces;
> +    bool found_backslash = false;
>  
>      pa_assert(result);
>  
> @@ -54,29 +89,52 @@ int pa_message_param_split_list(char *c, char **result, void **state) {
>      /* Find opening brace */
>      while (*current != 0) {
>  
> -        if (*current == '{')
> +        /* Skip escaped curly braces. */
> +        if (*current == '\\') {

With my proposed escaping rules, this should be

    if (*current == '\\' && !found_backslash)

because if we have two backslashes in a row, the next character is not
escaped.

> +            found_backslash = true;
> +            current++;
> +            continue;
> +        }
> +
> +        if (*current == '{' && !found_backslash)
>              break;
>  
>          /* unexpected closing brace, parse error */
> -        if (*current == '}')
> +        if (*current == '}' && !found_backslash)
>              return -1;
>  
> +        found_backslash = false;
>          current++;
>      }
>  
>      /* No opening brace found, end of string */
>      if (*current == 0)
> -         return 0;
> +        return 0;
>  
> +    if (is_unpacked)
> +        *is_unpacked = true;
>      *result = current + 1;
> +    found_backslash = false;
>      open_braces = 1;
>  
>      while (open_braces != 0 && *current != 0) {
>          current++;
> -        if (*current == '{')
> +
> +        /* Skip escaped curly braces. */
> +        if (*current == '\\') {

Same as above, with my proposed unescaping rules this should be

    if (*current == '\\' && !found_backslash)

> +            found_backslash = true;
> +            continue;
> +        }
> +
> +        if (*current == '{' && !found_backslash) {
>              open_braces++;
> -        if (*current == '}')
> +            if (is_unpacked)
> +                *is_unpacked = false;
> +        }
> +        if (*current == '}' && !found_backslash)
>              open_braces--;
> +
> +        found_backslash = false;
>      }
>  
>      /* Parse error, closing brace missing */
> @@ -96,21 +154,126 @@ int pa_message_param_split_list(char *c, char **result, void **state) {
>  /* Read a string from the parameter list. The state pointer is
>   * advanced to the next element of the list. If allocate is
>   * true, a newly allocated string will be returned, else a
> - * pointer to a sub-string within c. */
> + * pointer to a sub-string within c. Escaping will be removed
> + * if the string does not contain another list. */

A string is a string, it's not a list. I think this function should
always do unescaping. If a string value is read that is missing
escaping for curly braces, that could be reported as an error (I'm not
saying "should", but I think it would be good to do the validation in
this function).

>  int pa_message_param_read_string(char *c, char **result, bool allocate, void **state) {
>      char *start_pos;
>      int err;
> +    bool is_unpacked;
>  
>      pa_assert(result);
>  
>      *result = NULL;
>  
> -    if ((err = pa_message_param_split_list(c, &start_pos, state)) != 1)
> +    if ((err = pa_message_param_split_list(c, &start_pos, &is_unpacked, state)) != 1)
>          return err;
>  
>      *result = start_pos;
> +
> +    if (is_unpacked)
> +        *result = unescape(*result);
> +
>      if (allocate)
>          *result = pa_xstrdup(*result);
>  
>      return err;
>  }
> +
> +/* Write functions. The functions are wrapper functions around pa_strbuf,
> + * so that the client does not need to use pa_strbuf directly. */
> +
> +/* Creates a new pa_message_param structure */
> +pa_message_param *pa_message_param_new(void) {
> +    pa_message_param *param;
> +
> +    param = pa_xnew(pa_message_param, 1);
> +    param->buffer = pa_strbuf_new();
> +
> +    return param;
> +}
> +
> +/* Converts a pa_message_param structure to string and frees the structure */
> +char *pa_message_param_to_string(pa_message_param *param) {
> +    char *result;
> +
> +    pa_assert(param);
> +
> +    result = pa_strbuf_to_string_free(param->buffer);
> +
> +    pa_xfree(param);
> +    return result;
> +}
> +
> +/* Writes an opening curly brace */
> +void pa_message_param_begin_list(pa_message_param *param) {
> +
> +    pa_assert(param);
> +
> +    pa_strbuf_putc(param->buffer, '{');
> +}
> +
> +/* Writes a closing curly brace */
> +void pa_message_param_end_list(pa_message_param *param) {
> +
> +    pa_assert(param);
> +
> +    pa_strbuf_putc(param->buffer, '}');
> +}
> +
> +/* Writes a string to a message_param structure, adding curly braces
> + * around the string. If do_escape is true, leading and trailing single
> + * quotes and also a "\" before curly braces are added to the input string.
> + * Escaping only needs to be used if the original string might fulfill any
> + * of the following conditions:
> + * - It contains curly braces
> + * - It is enclosed in single quotes
> + * - It ends with a "\" */
> +void pa_message_param_write_string(pa_message_param *param, const char *value, bool do_escape) {

I would drop the do_escape flag, and if there's need for appending
unescaped raw data to pa_message_params, there could be a separate
function for that.

> +    const char *s;
> +    char *output, *out_char;
> +    size_t brace_count;
> +
> +    pa_assert(param);
> +
> +    /* Null value is written as empty element */
> +    if (!value)
> +        value = "";
> +
> +    if (!do_escape) {
> +        pa_strbuf_printf(param->buffer, "{%s}", value);
> +        return;
> +    }
> +
> +    /* Using pa_strbuf_putc() to write to the strbuf while iterating over
> +     * the input string would cause the allocation of a linked list element
> +     * for each character of the input string. Therefore the output string
> +     * is constructed locally before writing it to the buffer */

This is an interesting point. I think you should improve pa_escape(),
which currently uses the naive pa_strbuf_putc() method. With my
proposed escaping rules, you could then replace this code with a call
to pa_escape().

> +
> +    /* Count number of characters to escape */
> +    brace_count = 0;
> +    for (s = value; *s; ++s) {
> +        if (*s == '{' || *s == '}')
> +            brace_count++;
> +    }

You could at the same time count all characters to save the strlen()
call.

> +
> +    /* allocate output string */
> +    output = pa_xmalloc(strlen(value) + brace_count + 5);

It would be good to have a comment for that magic value 5.

> +    out_char = output;
> +
> +    *out_char++ = '{';
> +    *out_char++ = '\'';
> +
> +    for (s = value; *s; ++s) {
> +        if (*s == '{' || *s == '}')
> +            *out_char++ = '\\';
> +
> +        *out_char++ = *s;
> +    }
> +
> +    *out_char++ = '\'';
> +    *out_char++ = '}';
> +    *out_char = 0;
> +
> +    pa_strbuf_puts(param->buffer, output);
> +    pa_xfree(output);
> +}
> diff --git a/src/pulse/message-params.h b/src/pulse/message-params.h
> index 02e08363..c0675c6e 100644
> --- a/src/pulse/message-params.h
> +++ b/src/pulse/message-params.h
> @@ -30,12 +30,33 @@
>  
>  PA_C_DECL_BEGIN
>  
> +typedef struct pa_message_param pa_message_param;
> +
> +/** Read functions */

While wondering how this kind of lone comments show up in doxygen, I
noticed that message-params.h is not listed in the INPUT list in
doxygen/doxygen.conf.in, so no documentation is generated.

> +
>  /** Split message parameter string into list elements */
> -int pa_message_param_split_list(char *c, char **result, void **state);
> +int pa_message_param_split_list(char *c, char **result, bool * is_unpacked, void **state);
>  
>  /** Read a string from the parameter list. */
>  int pa_message_param_read_string(char *c, char **result, bool allocate, void **state);

It's worth noting in the documentation that the function unescapes the
string.

>  
> +/** Write functions */
> +
> +/** Create a new pa_message_param structure */
> +pa_message_param *pa_message_param_new(void);
> +
> +/** Convert pa_message_param to string, free pa_message_param structure */
> +char *pa_message_param_to_string(pa_message_param *param);

It should be documented that the returned string should be freed with
pa_xfree(). (Also remember the \since tags.)

> +
> +/** Write an opening brace */
> +void pa_message_param_begin_list(pa_message_param *param);
> +
> +/** Write a closing brace */
> +void pa_message_param_end_list(pa_message_param *param);
> +
> +/** Append string to parameter list */
> +void pa_message_param_write_string(pa_message_param *param, const char *value, bool do_escape);

The documentation should mention that the function does escaping,
escpecially if you drop the do_escape flag as I hope, but even if the
flag stays, it should be described.

> +
>  PA_C_DECL_END
>  
>  #endif
> diff --git a/src/pulsecore/core.c b/src/pulsecore/core.c
> index e95657f0..acd47cbb 100644
> --- a/src/pulsecore/core.c
> +++ b/src/pulsecore/core.c
> @@ -29,6 +29,7 @@
>  #include <pulse/rtclock.h>
>  #include <pulse/timeval.h>
>  #include <pulse/xmalloc.h>
> +#include <pulse/message-params.h>
>  
>  #include <pulsecore/module.h>
>  #include <pulsecore/core-rtclock.h>
> @@ -65,25 +66,24 @@ static void core_free(pa_object *o);
>  
>  /* Returns a list of handlers. */
>  static char *message_handler_list(pa_core *c) {
> -    pa_strbuf *buf;
> +    pa_message_param *param;
>      void *state = NULL;
>      struct pa_message_handler *handler;
>  
> -    buf = pa_strbuf_new();
> +    param = pa_message_param_new();
>  
> -    pa_strbuf_putc(buf, '{');
> +    pa_message_param_begin_list(param);
>      PA_HASHMAP_FOREACH(handler, c->message_handlers, state) {
> -        pa_strbuf_putc(buf, '{');
> +        pa_message_param_begin_list(param);
>  
> -        pa_strbuf_printf(buf, "{%s} {", handler->object_path);
> -        if (handler->description)
> -            pa_strbuf_puts(buf, handler->description);
> +        pa_message_param_write_string(param, handler->object_path, false);
> +        pa_message_param_write_string(param, handler->description, true);
>  
> -        pa_strbuf_puts(buf, "}}");
> +        pa_message_param_end_list(param);
>      }
> -    pa_strbuf_putc(buf, '}');
> +    pa_message_param_end_list(param);
>  
> -    return pa_strbuf_to_string_free(buf);
> +    return pa_message_param_to_string(param);
>  }
>  
>  static int core_message_handler(const char *object_path, const char *message, char *message_parameters, char **response, void *userdata) {
> diff --git a/src/pulsecore/message-handler.c b/src/pulsecore/message-handler.c
> index 427186db..860f3b8e 100644
> --- a/src/pulsecore/message-handler.c
> +++ b/src/pulsecore/message-handler.c
> @@ -31,15 +31,29 @@
>  
>  #include "message-handler.h"
>  
> -/* Check if a string does not contain control characters. Currently these are
> - * only "{" and "}". */
> -static bool string_is_valid(const char *test_string) {
> +/* Check if a path string starts with a / and only contains valid characters */
> +static bool object_path_is_valid(const char *test_string) {
>      uint32_t i;
>  
> +    if (!test_string)
> +        return false;
> +
> +    /* Make sure the string starts with / and does not end with a / */
> +    if (test_string[0] != '/' || test_string[strlen(test_string) - 1] == '/')
> +        return false;
> +
>      for (i = 0; test_string[i]; i++) {
> -        if (test_string[i] == '{' ||
> -            test_string[i] == '}')
> -            return false;
> +
> +        if ((test_string[i] >= 'a' && test_string[i] <= 'z') ||
> +            (test_string[i] >= 'A' && test_string[i] <= 'Z') ||
> +            (test_string[i] >= '0' && test_string[i] <= '9') ||
> +            test_string[i] == '.' ||
> +            test_string[i] == '_' ||
> +            test_string[i] == '-' ||
> +            test_string[i] == '/')
> +            continue;
> +
> +        return false;
>      }

You could save the strlen() call if you checked the trailing slash
after this for loop when you're at the end of the string.
	
By the way, do you perhaps want to disallow two subsequent slashes in
object paths? (I don't care myself, because I doubt that invalid paths
will ever be encountered anyway when registering message handlers.)

>  
>      return true;
> @@ -56,13 +70,8 @@ void pa_message_handler_register(pa_core *c, const char *object_path, const char
>      pa_assert(cb);
>      pa_assert(userdata);
>  
> -    /* Ensure that the object path is not empty and starts with "/". */
> -    pa_assert(object_path[0] == '/');
> -
> -    /* Ensure that object path and description are valid strings */
> -    pa_assert(string_is_valid(object_path));
> -    if (description)
> -        pa_assert(string_is_valid(description));
> +    /* Ensure that object path is valid */
> +    pa_assert(object_path_is_valid(object_path));
>  
>      handler = pa_xnew0(struct pa_message_handler, 1);
>      handler->userdata = userdata;
> @@ -91,7 +100,7 @@ void pa_message_handler_unregister(pa_core *c, const char *object_path) {
>  int pa_message_handler_send_message(pa_core *c, const char *object_path, const char *message, const char *message_parameters, char **response) {
>      struct pa_message_handler *handler;
>      int ret;
> -    char *parameter_copy;
> +    char *parameter_copy, *path_copy;
>  
>      pa_assert(c);
>      pa_assert(object_path);
> @@ -100,8 +109,16 @@ int pa_message_handler_send_message(pa_core *c, const char *object_path, const c
>  
>      *response = NULL;
>  
> -    if (!(handler = pa_hashmap_get(c->message_handlers, object_path)))
> +    path_copy = pa_xstrdup(object_path);
> +
> +    /* Remove trailing / from path name if present */
> +    if (path_copy[strlen(path_copy) - 1] == '/')
> +        path_copy[strlen(path_copy) - 1] = 0;

If you want to allow some slack for applications, would it be better to
do this extra work in libpulse rather than in the daemon?

I would personally not bother with this at all, but I don't mind it
that much (even if it's done in the daemon).

-- 
Tanu

https://www.patreon.com/tanuk
https://liberapay.com/tanuk


More information about the pulseaudio-discuss mailing list