[pulseaudio-discuss] [PATCH 6/8] message-params: Allow parameter strings to contain escaped curly braces

Tanu Kaskinen tanuk at iki.fi
Mon Jul 30 09:18:08 UTC 2018


On Sun, 2018-07-29 at 22:36 +0200, Georg Chini wrote:
> On 29.07.2018 21:47, Tanu Kaskinen wrote:
> > On Fri, 2018-07-27 at 10:51 +0200, Georg Chini wrote:
> > > On 27.07.2018 10:08, Tanu Kaskinen wrote:
> > > > On Thu, 2018-07-26 at 18:02 +0200, Georg Chini wrote:
> > > > > On 26.07.2018 12:37, Tanu Kaskinen wrote:
> > > > > > On Sun, 2018-07-22 at 21:11 +0200, Georg Chini wrote:
> > > > > > > On 22.07.2018 17:48, Tanu Kaskinen wrote:
> > > > > > > > On Sun, 2018-07-22 at 16:02 +0200, Georg Chini wrote:
> > > > > > > > > On 21.07.2018 20:17, Tanu Kaskinen wrote:
> > > > > > > > > > On Mon, 2018-04-09 at 19:35 +0200, Georg Chini wrote:
> > > > > > > > > > > +
> > > > > > > > > > > +/* Read functions */
> > > > > > > > > > > +
> > > > > > > > > > >       /* Split the specified string into elements. An element is defined as
> > > > > > > > > > >        * a sub-string between curly braces. The function is needed to parse
> > > > > > > > > > >        * the parameters of messages. Each time it is called returns the position
> > > > > > > > > > >        * of the current element in result and the state pointer is advanced to
> > > > > > > > > > > - * the next list element.
> > > > > > > > > > > + * the next list element. On return, the parameter *is_unpacked indicates
> > > > > > > > > > > + * if the string is plain text or contains a sub-list. is_unpacked may
> > > > > > > > > > > + * be NULL.
> > > > > > > > > > 
> > > > > > > > > > is_unpacked looks like unnecessary complexity.
> > > > > > > > > > pa_message_params_read_string() should always unescape the value.
> > > > > > > > > 
> > > > > > > > > It may be possible, that the string you read is a list. Consider the
> > > > > > > > > following
> > > > > > > > > parameter list: {string1}{some nested structure}{string2}. You can now
> > > > > > > > > read this list as three strings and then continue to read the elements of
> > > > > > > > > the nested structure from the second string. You might even create a
> > > > > > > > > function
> > > > > > > > > that takes a string and outputs a structure. So you are not forced to go
> > > > > > > > > to the full depth of nesting on the first pass. This makes it much easier
> > > > > > > > > to handle deeply nested parameter lists. For me this behavior is an
> > > > > > > > > important
> > > > > > > > > feature and I do not want to drop it. See also my comment on why I do
> > > > > > > > > not always want escaping.
> > > > > > > > 
> > > > > > > > Doesn't split_list() already allow this, why do you want to use
> > > > > > > > read_string() to do the same thing as split_list()?
> > > > > > > 
> > > > > > > read_string() and split_list() are very similar and we could live
> > > > > > > without read_string(). It is provided as a counterpart to write_string()
> > > > > > > and for convenience additionally does the unescaping if necessary
> > > > > > > like write_string does the escaping.
> > > > > > > I don't see why this is a problem. It depends on the context which
> > > > > > > is the better function to use.
> > > > > > 
> > > > > > Again, in my mind a structure is not a string, they are different
> > > > > > types, and I think read_string() should only deal with the string type.
> > > > > > is_unpacked makes the API more complicated, so I'd like to get rid of
> > > > > > it.
> > > > > > 
> > > > > 
> > > > > I don't see your point. is_unpacked is not part of the read_string()
> > > > > arguments
> > > > > or return value. In split_list() it is definitively needed to indicate
> > > > > if the returned
> > > > > string (in the C sense) contains another list. I can imagine many
> > > > > situations where
> > > > > you might either pass an array or just a single value or even NULL.
> > > > > is_unpacked
> > > > > allows to differentiate between the situations.
> > > > 
> > > > Can you give an example? You say is_unpacked is definitely needed, but
> > > > so far the only use case has been read_string(), which you wanted to be
> > > > used for reading not only string values but everything else too. If
> > > > read_string() is changed to only read string values, then it doesn't
> > > > need is_unpacked from split_list().
> > > > 
> > > > The parameter types are known beforehand, so i don't see the need for
> > > > looking at the data to figure out the type. If introspection support is
> > > > desired, then that's a separate project (the is_unpacked flag isn't
> > > > sufficient for proper introspection).
> > > > 
> > > 
> > > This is not about parameter type, it is just to distinguish between
> > > a list and a simple value. One example comes to my mind immediately:
> > > Consider a parameter list that consists of strings and a couple of
> > > arrays. Now you can read this list as an array of strings (patch 8
> > > provides a function for that) and then examine those strings that
> > > contain arrays separately. Having the flag (and using it in read_string())
> > > provides a more flexible and convenient way to parse a parameter list.
> > > 
> > > The flag may not be strictly necessary at the moment, but I would still
> > > like to keep it.
> > 
> > To continue a familiar theme of my review: if there's a
> > read_string_array() function, I want that to be used only for string
> > arrays, not a mishmash of random types. There could be a separate
> > function split_list_into_array() that does something similar to
> > what you wanted to do with read_string_array().
> > split_list_into_array() wouldn't do special string handling,
> > though, so unescaping would be left to the application. I find that
> > only logical, since other basic types don't get special handling
> > either (i.e. floats aren't converted to C floats).
> > 
> > Your use case could be served with a vararg function that instead of
> > producing a string array would read into separate variables, like this:
> > 
> > pa_message_params_read(c, state,
> >                         PA_TYPE_STRING, &param1,
> >                         PA_TYPE_FLOAT, &param2,
> >                         PA_TYPE_RAW, &param3,
> >                         0);
> > 
> > PA_TYPE_RAW would be useful for reading a list (or anything else) into
> > a C string without unescaping or other processing. There could be also
> > PA_TYPE_IGNORE for uninteresting variables, and PA_TYPE_*_ARRAY for
> > arrays of basic types.
> > 
> > (Now it occurred to me that the 'c' argument name that is used in the
> > parsing functions is a bit weird and unhelpful. Maybe "param_list"
> > would be good?)
> > 
> 
> I still don't see your point. Again, the use of is_unpacked is 
> transparent for the
> user of read_string(), so the user just reads a string without having to 
> care about
> unescaping. The flag does not complicate the API, it simplifies it 
> because the
> unescaping is done automatically.

The flag complicates the split_list() function parameters, but also the
read_string() semantics: you need to explain to the user that unlike
all the other read_foo() functions, read_string() can read any value
and unescaping becomes conditional because of this.

> Your approach seems unnecessary
> complicated to me. A string is a string, even if it contains 
> sub-structures. Your
> split_list_into_array() function would basically return an array of 
> strings, so what
> would be the advantage? It would only make parsing more cumbersome, because
> the task of unescaping is given to the user instead of doing it 
> automatically where
> necessary.

That's why I came up with the vararg function. I agree that
split_list_into_array() is unlikely to be very useful.

> Also there is no "mishmash of random types", they are all strings It is 
> only the
> difference between a "simple" string which needs no further processing and a
> "complex" string which needs further parsing.

The API defines types for the parameters. There are no separate
"simple" and "complex" strings in the type system. The string type is
different than the list type. What you call complex strings are in my
mind just the raw unprocessed serialized data, which is a different
abstraction level than the values that the various read_foo() functions
return. It feels just very wrong to have a function that returns values
in both domains: unparsed raw data and parsed values. Especially when
the function is a family of read_foo() functions where all other
functions only operate on their designated types.

Compare to D-Bus: if you say that "they are all strings", then in D-Bus 
the same would be "they are all byte arrays", since that's what they
are in the message buffer. I doubt you would suggest that the
dbus_message_iter_read_byte_array() function (if such thing existed)
should be usable on arbitrarily typed values rather than just byte
array values, and should return the raw serialized data regardless of
its type. Here you're suggesting that read_string() should be able to
return the raw serialized data regardless of its type.

Do you understand what I'm getting at? You may disagree that having
strict separation of concerns between raw serialized data and parsed
values is beneficial for the user-friendliness of the API, but if you
don't even understand what I mean by that separation, then discussing
this is very hard.

-- 
Tanu

https://www.patreon.com/tanuk
https://liberapay.com/tanuk


More information about the pulseaudio-discuss mailing list