[pulseaudio-discuss] [PATCH 6/8] message-params: Allow parameter strings to contain escaped curly braces

Sun Jul 29 20:36:56 UTC 2018

On 29.07.2018 21:47, Tanu Kaskinen wrote:
> On Fri, 2018-07-27 at 10:51 +0200, Georg Chini wrote:
>> On 27.07.2018 10:08, Tanu Kaskinen wrote:
>>> On Thu, 2018-07-26 at 18:02 +0200, Georg Chini wrote:
>>>> On 26.07.2018 12:37, Tanu Kaskinen wrote:
>>>>> On Sun, 2018-07-22 at 21:11 +0200, Georg Chini wrote:
>>>>>> On 22.07.2018 17:48, Tanu Kaskinen wrote:
>>>>>>> On Sun, 2018-07-22 at 16:02 +0200, Georg Chini wrote:
>>>>>>>> On 21.07.2018 20:17, Tanu Kaskinen wrote:
>>>>>>>>> On Mon, 2018-04-09 at 19:35 +0200, Georg Chini wrote:
>>>>>>>>>> +
>>>>>>>>>> +/* Read functions */
>>>>>>>>>> +
>>>>>>>>>>       /* Split the specified string into elements. An element is defined as
>>>>>>>>>>        * a sub-string between curly braces. The function is needed to parse
>>>>>>>>>>        * the parameters of messages. Each time it is called returns the position
>>>>>>>>>>        * of the current element in result and the state pointer is advanced to
>>>>>>>>>> - * the next list element.
>>>>>>>>>> + * the next list element. On return, the parameter *is_unpacked indicates
>>>>>>>>>> + * if the string is plain text or contains a sub-list. is_unpacked may
>>>>>>>>>> + * be NULL.
>>>>>>>>> is_unpacked looks like unnecessary complexity.
>>>>>>>>> pa_message_params_read_string() should always unescape the value.
>>>>>>>> It may be possible, that the string you read is a list. Consider the
>>>>>>>> following
>>>>>>>> parameter list: {string1}{some nested structure}{string2}. You can now
>>>>>>>> read this list as three strings and then continue to read the elements of
>>>>>>>> the nested structure from the second string. You might even create a
>>>>>>>> function
>>>>>>>> that takes a string and outputs a structure. So you are not forced to go
>>>>>>>> to the full depth of nesting on the first pass. This makes it much easier
>>>>>>>> to handle deeply nested parameter lists. For me this behavior is an
>>>>>>>> important
>>>>>>>> feature and I do not want to drop it. See also my comment on why I do
>>>>>>>> not always want escaping.
>>>>>>> Doesn't split_list() already allow this, why do you want to use
>>>>>>> read_string() to do the same thing as split_list()?
>>>>>> read_string() and split_list() are very similar and we could live
>>>>>> without read_string(). It is provided as a counterpart to write_string()
>>>>>> and for convenience additionally does the unescaping if necessary
>>>>>> like write_string does the escaping.
>>>>>> I don't see why this is a problem. It depends on the context which
>>>>>> is the better function to use.
>>>>> Again, in my mind a structure is not a string, they are different
>>>>> types, and I think read_string() should only deal with the string type.
>>>>> is_unpacked makes the API more complicated, so I'd like to get rid of
>>>>> it.
>>>>>
>>>> I don't see your point. is_unpacked is not part of the read_string()
>>>> arguments
>>>> or return value. In split_list() it is definitively needed to indicate
>>>> if the returned
>>>> string (in the C sense) contains another list. I can imagine many
>>>> situations where
>>>> you might either pass an array or just a single value or even NULL.
>>>> is_unpacked
>>>> allows to differentiate between the situations.
>>> Can you give an example? You say is_unpacked is definitely needed, but
>>> so far the only use case has been read_string(), which you wanted to be
>>> used for reading not only string values but everything else too. If
>>> read_string() is changed to only read string values, then it doesn't
>>> need is_unpacked from split_list().
>>>
>>> The parameter types are known beforehand, so i don't see the need for
>>> looking at the data to figure out the type. If introspection support is
>>> desired, then that's a separate project (the is_unpacked flag isn't
>>> sufficient for proper introspection).
>>>
>> This is not about parameter type, it is just to distinguish between
>> a list and a simple value. One example comes to my mind immediately:
>> Consider a parameter list that consists of strings and a couple of
>> arrays. Now you can read this list as an array of strings (patch 8
>> provides a function for that) and then examine those strings that
>> contain arrays separately. Having the flag (and using it in read_string())
>> provides a more flexible and convenient way to parse a parameter list.
>>
>> The flag may not be strictly necessary at the moment, but I would still
>> like to keep it.
> To continue a familiar theme of my review: if there's a
> read_string_array() function, I want that to be used only for string
> arrays, not a mishmash of random types. There could be a separate
> function split_list_into_array() that does something similar to what you wanted to do with read_string_array(). split_list_into_array() wouldn't do special string handling, though, so unescaping would be left to the application. I find that only logical, since other basic types don't get special handling either (i.e. floats aren't converted to C floats).
>
> Your use case could be served with a vararg function that instead of
> producing a string array would read into separate variables, like this:
>
> pa_message_params_read(c, state,
>                         PA_TYPE_STRING, &param1,
>                         PA_TYPE_FLOAT, &param2,
>                         PA_TYPE_RAW, &param3,
>                         0);
>
> PA_TYPE_RAW would be useful for reading a list (or anything else) into
> a C string without unescaping or other processing. There could be also
> PA_TYPE_IGNORE for uninteresting variables, and PA_TYPE_*_ARRAY for
> arrays of basic types.
>
> (Now it occurred to me that the 'c' argument name that is used in the
> parsing functions is a bit weird and unhelpful. Maybe "param_list"
> would be good?)
>
I still don't see your point. Again, the use of is_unpacked is 
transparent for the
user of read_string(), so the user just reads a string without having to 
care about
unescaping. The flag does not complicate the API, it simplifies it 
because the
unescaping is done automatically. Your approach seems unnecessary
complicated to me. A string is a string, even if it contains 
sub-structures. Your
split_list_into_array() function would basically return an array of 
strings, so what
would be the advantage? It would only make parsing more cumbersome, because
the task of unescaping is given to the user instead of doing it 
automatically where
necessary.
Also there is no "mishmash of random types", they are all strings It is 
only the
difference between a "simple" string which needs no further processing and a
"complex" string which needs further parsing.

I am still not willing to drop this - for me it simplifies parsing. How 
do we proceed?