[systemd-devel] [systemd-commits] load-fragment: use UNESCAPE_RELAX flag to parse exec directives

Tue Jun 2 03:55:02 PDT 2015

Hi Filipe,

On 06/02/2015 07:34 AM, Filipe Brandenburger wrote:
> On Mon, Jun 1, 2015 at 9:10 AM, Daniel Mack
>> diff --git a/src/core/load-fragment.c b/src/core/load-fragment.c
>> index c95c110..df5fe6f 100644
>> --- a/src/core/load-fragment.c
>> +++ b/src/core/load-fragment.c
>> @@ -610,7 +610,7 @@ int config_parse_exec(
>>                          else
>>                                  skip = strneq(word, "\\;", MAX(l, 1U));
>>
>> -                        r = cunescape_length(word + skip, l - skip, 0, &c);
>> +                        r = cunescape_length(word + skip, l - skip, UNESCAPE_RELAX, &c);
>>                          if (r < 0) {
>>                                  log_syntax(unit, LOG_ERR, filename, line, r, "Failed to unescape command line, ignoring: %s", rvalue);
>>                                  r = 0;
> 
> So, my problem with it is that the bug's expectation is that
> backslashes inside single quotes will remain as backslashes, as the
> example is a regexp '\w+@\K[\d.]+'.
> 
> But this is not true here!!! It's only fixing it for the particular
> cases that are not escape sequences yet.
> 
> For instance, what if I'm doing a parameter that is a regexp that is
> looking for a word boundary and I want to use '\b'? systemd with the
> current patch will (still) turn this into a backspace character.

Yes, you're right. So far, the implementation has been that systemd does
C unescaping on these directives by converting every escape sequence
that it understands and leaving the rest untouched. If a unit file used
a word-end boundary \b sequence, it wouldn't ever have worked. In such
cases, users are expected to use \\b.

The problem here is that cunescape() wasn't as strict in the past as it
should have been, and now there are unit files in the wild which contain
escape sequences that slip through the C unescaping mechanism.

So, my primary motivation was to fix the obvious regression at hand
first, but I agree the actual problem goes deeper.

> Right now the systemd quoting rules do *not* match the shell quoting
> rules. (In fact, this is akin to a bug complaining that variables in
> systemd do not match shell variables. That's indeed the case, but it
> doesn't make it a bug. It's working as documented and as intended.)
> 
> I'd be ok with changing the rules so that backslash inside single
> quotes remains a literal backslash, as I think we have the two kinds
> of quotes (single quotes and double quotes) and I don't think it would
> hurt to make them work a little bit closer to how the shell works...

...

> In that case (of making backslashes stay literal inside single quotes)
> I think the best way forward is complete the conversion to
> unquote_first_word and then update unquote_first_word to introduce
> those rules (essentially, just get rid of the SINGLE_QUOTE_ESCAPE rule
> would do.)

I think that makes sense, but we need to make sure not to break existing
setups. So for instance, both

  ExecStart=-/bin/sh -c "echo foo at 0 | grep -Po '\w+@\K[\d.]+'"
  ExecStart=-/bin/sh -c "echo foo at 0 | grep -Po '\\w+@\\K[\\d.]+'"

are currently allowed and lead to the same result. The former because
with UNESCAPE_RELAX, all sequences in the string (\w, \K, \d) are
unrecognized and copied literal, and the latter because double
backslashes become single ones.

Martin already provided a test case the cases in the bug reports, and we
should probably add more to make sure we don't cause more regressions
with your rework.

Do you have any patches in the queue for changes you proposed?

Thanks,
Daniel