[systemd-devel] [RFC] load-fragment: use unquote_first_word in config_parse_exec

Sat May 30 23:29:29 PDT 2015

This is an attempt to convert it from using FOREACH_WORD_QUOTED into a
loop using unquote_first_word repeatedly.

Additionally, we're looping through the arguments only once and using
_cleanup_ functions to manage objects owned by this function.

Some notes:
- There is some difference in how unquote_first_word handles invalid
  quotes or escapes. In particular, for some cases where we used to
  return 0 (but set the command to NULL) we are now returning -EINVAL
  instead (still setting the command to NULL.)  I don't know if that's a
  problem anywhere...

- The test cases were updated to reflect that change.

- Handling a \; is ugly, it looks like a hack... unquote_first_word is
  not equipped to recognize that sequence, so I had to move it outside
  unquote_first_word looking for those two characters followed by
  whitespace explicitly. But then, something like ';' or ";" will be
  recognized as a command separator, is that OK?

- We do something different for empty string (clear the command list)
  than we do for just whitespace. This is pre-existing. Maybe we need to
  fix that? I put a comment on that case, that branch is triggered both
  in the "just whitespace" case as well as right after a semicolon
  command separator.

- Now we're logging more from this function, in particular if we run out
  of memory we return log_oom(), but that might not be the case for
  instance if we get -ENOMEM from unquote_first_word. Should we wrap
  around that if (r < 0) return r; in something that logs errno?

- Not sure whether we need to track "semicolon" on a variable of its own
  or if we can just check whether *p is the \0 end of the string. I
  noticed the original code had a case logging "... ignoring trailing
  garbage", but I'm not sure where that's needed... Do we need that here
  as well or is that something that will not happen with
  unquote_first_word?

Tested it by running the test suite.

I also built a systemd with this patch, installed and ran it
successfully on a VM running Arch Linux.  I didn't notice anything too
weird (but maybe I just didn't look at all the right places...)

Comments welcome!

Cheers,
Filipe
---
 src/core/load-fragment.c  | 209 ++++++++++++++++++++++------------------------
 src/test/test-unit-file.c |  14 +++-
 2 files changed, 108 insertions(+), 115 deletions(-)

diff --git a/src/core/load-fragment.c b/src/core/load-fragment.c
index c95c11014c72..dd2cf6e3257f 100644
--- a/src/core/load-fragment.c
+++ b/src/core/load-fragment.c
@@ -520,9 +520,9 @@ int config_parse_exec(
                 void *data,
                 void *userdata) {
 
-        ExecCommand **e = data, *nce;
-        char *path, **n;
-        unsigned k;
+        ExecCommand **e = data;
+        const char *p;
+        bool semicolon;
         int r;
 
         assert(filename);
@@ -538,150 +538,137 @@ int config_parse_exec(
                 return 0;
         }
 
+        rvalue += strspn(rvalue, WHITESPACE);
+        p = rvalue;
+
         /* We accept an absolute path as first argument, or
          * alternatively an absolute prefixed with @ to allow
          * overriding of argv[0]. */
-        for (;;) {
+        do {
                 int i;
-                const char *word, *state, *reason;
-                size_t l;
+                _cleanup_strv_free_ char **n = NULL;
+                _cleanup_free_ ExecCommand *nce = NULL;
+                _cleanup_free_ char *path = NULL;
+                _cleanup_free_ char *firstword = NULL;
+                char *f;
                 bool separate_argv0 = false, ignore = false;
 
-                path = NULL;
-                nce = NULL;
-                n = NULL;
+                semicolon = false;
 
-                rvalue += strspn(rvalue, WHITESPACE);
+                r = unquote_first_word(&p, &firstword, UNQUOTE_CUNESCAPE);
+                if (r < 0)
+                        return r;
+                if (r == 0)
+                        /* We get here in two situations:
+                         * 1. If we have only whitespace (shouldn't this reset
+                         *    the list same as an empty string?)
+                         * 2. With a trailing semicolon and no command after that.
+                         */
+                        return 0;
 
-                if (rvalue[0] == 0)
-                        break;
+                f = firstword;
+                for (i = 0; i < 2; i++) {
+                        if (*f == '-' && !ignore) {
+                                ignore = true;
+                                f ++;
+                        } else if (*f == '@' && !separate_argv0) {
+                                separate_argv0 = true;
+                                f ++;
+                        }
+                }
 
-                k = 0;
-                FOREACH_WORD_QUOTED(word, l, rvalue, state) {
-                        if (k == 0) {
-                                for (i = 0; i < 2; i++) {
-                                        if (*word == '-' && !ignore) {
-                                                ignore = true;
-                                                word ++;
-                                        }
-
-                                        if (*word == '@' && !separate_argv0) {
-                                                separate_argv0 = true;
-                                                word ++;
-                                        }
-                                }
-                        } else if (strneq(word, ";", MAX(l, 1U)))
-                                goto found;
+                if (!*f) {
+                        /* First word is either "-" or "@" with no command. */
+                        log_syntax(unit, LOG_ERR, filename, line, EINVAL,
+                                   "Empty path in command line, ignoring: \"%s\"", rvalue);
+                        return 0;
+                }
 
-                        k++;
+                if (!string_is_safe(f)) {
+                        log_syntax(unit, LOG_ERR, filename, line, EINVAL,
+                                        "Executable path contains special characters, ignoring: %s", rvalue);
+                        return 0;
                 }
-                if (!isempty(state)) {
-                        log_syntax(unit, LOG_ERR, filename, line, EINVAL, "Trailing garbage, ignoring.");
+                if (!path_is_absolute(f)) {
+                        log_syntax(unit, LOG_ERR, filename, line, EINVAL,
+                                        "Executable path is not absolute, ignoring: %s", rvalue);
+                        return 0;
+                }
+                if (endswith(f, "/")) {
+                        log_syntax(unit, LOG_ERR, filename, line, EINVAL,
+                                        "Executable path specifies a directory, ignoring: %s", rvalue);
                         return 0;
                 }
 
-        found:
-                /* If separate_argv0, we'll move first element to path variable */
-                n = new(char*, MAX(k + !separate_argv0, 1u));
-                if (!n)
+                path = strdup(f);
+                if (!path)
                         return log_oom();
 
-                k = 0;
-                FOREACH_WORD_QUOTED(word, l, rvalue, state) {
-                        char *c;
-                        unsigned skip;
-
-                        if (separate_argv0 ? path == NULL : k == 0) {
-                                /* first word, very special */
-                                skip = separate_argv0 + ignore;
-
-                                /* skip special chars in the beginning */
-                                if (l <= skip) {
-                                        log_syntax(unit, LOG_ERR, filename, line, EINVAL,
-                                                   "Empty path in command line, ignoring: \"%s\"", rvalue);
-                                        r = 0;
-                                        goto fail;
-                                }
+                if (!separate_argv0) {
+                        r = strv_extend(&n, path);
+                        if (r < 0)
+                                return r;
+                }
 
-                        } else if (strneq(word, ";", MAX(l, 1U)))
-                                /* new commandline */
-                                break;
+                path_kill_slashes(path);
 
-                        else
-                                skip = strneq(word, "\\;", MAX(l, 1U));
+                for (;;) {
+                        _cleanup_free_ char *word = NULL;
 
-                        r = cunescape_length(word + skip, l - skip, 0, &c);
-                        if (r < 0) {
-                                log_syntax(unit, LOG_ERR, filename, line, r, "Failed to unescape command line, ignoring: %s", rvalue);
-                                r = 0;
-                                goto fail;
+                        /* unquote_first_word chokes on \; so let's check for
+                         * it explicitly here.  Only do something if it's a
+                         * single token.  In that case, add a literal ";", skip
+                         * the \; and go to the next iteration.  */
+                        if (p[0] == '\\' && p[1] == ';' && strchr(WHITESPACE, p[2])) {
+                                r = strv_extend(&n, ";");
+                                p += 2;
+                                continue;
                         }
 
-                        if (!utf8_is_valid(c)) {
-                                log_invalid_utf8(unit, LOG_ERR, filename, line, EINVAL, rvalue);
-                                r = 0;
-                                goto fail;
-                        }
+                        r = unquote_first_word(&p, &word, UNQUOTE_CUNESCAPE);
+                        if (r < 0)
+                                return r;
+                        if (r == 0)
+                                break;
 
-                        /* where to stuff this? */
-                        if (separate_argv0 && path == NULL)
-                                path = c;
-                        else
-                                n[k++] = c;
+                        if (streq(word, ";")) {
+                                semicolon = true;
+                                break;
+                        }
+                        r = strv_extend(&n, word);
+                        if (r < 0)
+                                return r;
                 }
 
-                n[k] = NULL;
-
-                if (!n[0])
-                        reason = "Empty executable name or zeroeth argument";
-                else if (!string_is_safe(path ?: n[0]))
-                        reason = "Executable path contains special characters";
-                else if (!path_is_absolute(path ?: n[0]))
-                        reason = "Executable path is not absolute";
-                else if (endswith(path ?: n[0], "/"))
-                        reason = "Executable path specifies a directory";
-                else
-                        goto ok;
-
-                log_syntax(unit, LOG_ERR, filename, line, EINVAL, "%s, ignoring: %s", reason, rvalue);
-                r = 0;
-                goto fail;
-
-ok:
-                if (!path) {
-                        path = strdup(n[0]);
-                        if (!path) {
-                                r = log_oom();
-                                goto fail;
-                        }
+                if (!n || !n[0]) {
+                        log_syntax(unit, LOG_ERR, filename, line, EINVAL,
+                                        "Empty executable name or zeroeth argument, ignoring: %s", rvalue);
+                        return 0;
                 }
 
                 nce = new0(ExecCommand, 1);
-                if (!nce) {
-                        r = log_oom();
-                        goto fail;
-                }
+                if (!nce)
+                        return log_oom();
 
                 nce->argv = n;
                 nce->path = path;
                 nce->ignore = ignore;
 
-                path_kill_slashes(nce->path);
-
                 exec_command_append_list(e, nce);
 
-                rvalue = state;
-        }
-
-        return 0;
+                /* Do not _cleanup_free_ these. */
+                n = NULL;
+                path = NULL;
+                nce = NULL;
 
-fail:
-        n[k] = NULL;
-        strv_free(n);
-        free(path);
-        free(nce);
+                rvalue = p;
+        } while (semicolon);
+        /* Or:  while (*p) ?
+         * That seems to work as well, at least the test cases pass... In which
+         * case will there be trailing garbage that needs to be ignored here?
+         */
 
-        return r;
+        return 0;
 }
 
 DEFINE_CONFIG_PARSE_ENUM(config_parse_service_type, service_type, ServiceType, "Failed to parse service type");
diff --git a/src/test/test-unit-file.c b/src/test/test-unit-file.c
index a9711ac9f558..91f4ff3c22d6 100644
--- a/src/test/test-unit-file.c
+++ b/src/test/test-unit-file.c
@@ -275,7 +275,7 @@ static void test_config_parse_exec(void) {
         check_execcommand(c1,
                           "/PATH WITH SPACES/daemon", NULL, "-1 -2", NULL, false);
 
-        for (ccc = "abfnrtv\\\'\"x"; *ccc; ccc++) {
+        for (ccc = "abfnrtv\\\'\""; *ccc; ccc++) {
                 /* \\x is an incomplete hexadecimal sequence, invalid because of the slash */
                 char path[] = "/path\\X";
                 path[sizeof(path) - 2] = *ccc;
@@ -287,6 +287,12 @@ static void test_config_parse_exec(void) {
                 assert_se(r == 0);
                 assert_se(c1->command_next == NULL);
         }
+        log_info("/* invalid character: \\x */");
+        r = config_parse_exec(NULL, "fake", 4, "section", 1,
+                              "LValue", 0, "/path\\x",
+                              &c, NULL);
+        assert_se(r == -EINVAL);
+        assert_se(c1->command_next == NULL);
 
         log_info("/* valid character: \\s */");
         r = config_parse_exec(NULL, "fake", 4, "section", 1,
@@ -301,21 +307,21 @@ static void test_config_parse_exec(void) {
         r = config_parse_exec(NULL, "fake", 4, "section", 1,
                               "LValue", 0, "/path\\",
                               &c, NULL);
-        assert_se(r == 0);
+        assert_se(r == -EINVAL);
         assert_se(c1->command_next == NULL);
 
         log_info("/* missing ending ' */");
         r = config_parse_exec(NULL, "fake", 4, "section", 1,
                               "LValue", 0, "/path 'foo",
                               &c, NULL);
-        assert_se(r == 0);
+        assert_se(r == -EINVAL);
         assert_se(c1->command_next == NULL);
 
         log_info("/* missing ending ' with trailing backslash */");
         r = config_parse_exec(NULL, "fake", 4, "section", 1,
                               "LValue", 0, "/path 'foo\\",
                               &c, NULL);
-        assert_se(r == 0);
+        assert_se(r == -EINVAL);
         assert_se(c1->command_next == NULL);
 
         log_info("/* invalid space between modifiers */");
-- 
2.4.2