[systemd-devel] [PATCH] systemd: add RDTCacheReservation= option to support CAT (Cache Allocation Technology)

Mantas Mikulėnas grawity at gmail.com
Fri Jan 6 15:26:36 UTC 2017


On Fri, Jan 6, 2017 at 3:59 PM, Marcelo Tosatti <mtosatti at redhat.com> wrote:

>
>
> Cache Allocation Technology is a feature on selected recent Intel Xeon
> processors which allows control over L3 cache allocation.
>
> Kernel support has been merged to the upstream kernel, via a filesystem
> resctrlfs.
>
> On top of that, a userspace utility, resctrltool has been written
> to facilitate writing applications and using the filesystem
> interface (see the rationale at
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1300792.html).
>
> This patch adds a new option to systemd, RDTCacheReservation,
> to allow configuration of CAT via resctrltool.
>
> See the first hunk of the patch for a description of the option


This really doesn't look pretty, neither the approach nor the
implementation...

Is the option actually so complex that calling resctrltool is the only way
to adjust it? What about writing to the resctrlfs directly?

Also, it would be nicer to submit the patches as GitHub pull requests
instead.


>
> +static char* convert_rdt_back(char *argm[], int argmidx)
> +{
> +       int i, ret;
> +       char *buf, *str;
> +       int size = 0;
> +       int printcomma = 0;
> +       int printsemicolon = 0;
> +       int localmode = 0;
> +
> +       for (i=0; i < argmidx; i++) {
> +               /* ',', ';', '=' */
> +               size += strlen(argm[i]) + 3;
> +       }
> +
> +       buf = malloc(size);
> +       if (buf == NULL)
> +               return NULL;
> +       memset(buf, 0, size);
> +
> +       str = buf;
> +
> +       /* cache-id specified */
> +       for (i=0; i < argmidx; i++) {
> +               if (strlen(argm[i]) == 10) {
> +                       if (strcmp(argm[i], "--cache-id") == 0)
> +                               localmode = 1;
> +               }
> +       }
> +
> +       for (i=0; i<argmidx; i++) {
> +               char *cur = argm[i];
> +
> +               if (strlen(cur) == 0)
> +                       return buf;
> +
> +               if (strlen(cur) == 6) {
> +                       if (strcmp(cur, "--type") == 0) {
> +                               ret = sprintf(str, "type=");
> +                               str = str + ret;
> +                               printcomma = 1;
> +                               continue;
> +                       }
> + etc.
> +       }
> +
> +       return buf;
> +}
> +
>

Converting from foo;bar to argv[] and back is a bit horrible. It seems to
me that it would be *much* simpler and easier to maintain if you stored
only actual parameters in the unit (e.g. as an array of
{type,size,cache_id}) and only generated argv at the moment it was
necessary, i.e. right before spawning resctrltool. After all, that's the
only place you need it. *And* that way you wouldn't need the above function
at all.


>
> Index: systemd/src/core/execute.c
> ===================================================================
> --- systemd.orig/src/core/execute.c
> +++ systemd/src/core/execute.c
> @@ -36,6 +36,10 @@
>  #include <sys/un.h>
>  #include <unistd.h>
>  #include <utmpx.h>
> +#include <fcntl.h>
> +#include <sys/select.h>
> +#include <sys/types.h>
> +#include <sys/wait.h>
>
>  #ifdef HAVE_PAM
>  #include <security/pam_appl.h>
> @@ -2201,6 +2205,161 @@ static int apply_working_directory(const
>          return 0;
>  }
>
> +
> +static int fork_exec_resctrltool(Unit *u, char *argv[])
> +{
> +        int outpipe[2];
> +        int errpipe[2];
> +        pid_t cpid;
> +
> +        pipe(outpipe);
> +        pipe(errpipe);
> +        cpid = fork();
> +
> +        if(cpid == 0) {
> +                int r;
> +
> +                dup2(errpipe[1], STDERR_FILENO);
> +                dup2(outpipe[1], STDOUT_FILENO);
> +
> +                r = execve(argv[0], argv, NULL);
> +                if (r == -1) {
> +                        log_open();
> +                        log_unit_error_errno(u, r, "Failed to execve
> resctrltool, ignoring: %m");
> +                        log_close();
> +                        return -errno;
> +                }
> +        } else {
> +                char buffer[100];
> +                fd_set fdset;
> +                int count;
> +                int ret;
> +                int nfds;
> +                int status = 0;
> +                int retst;
> +
> +                ret = waitpid(cpid, &status, 0);
> +                if (ret == -1) {
> +                        log_open();
> +                        log_unit_error_errno(u, ret, "Failed to waitpid
> resctrltool, ignoring: %m");
> +                        log_close();
> +                        return -errno;
> +                }
> +
> +                if (!WIFEXITED(status)) {
> +                        log_open();
> +                        log_unit_error_errno(u, ret, "resctrltool exits
> abnormally, ignoring: %m");
> +                        log_close();
> +                        return -1;
> +                } else {
> +                        retst = WEXITSTATUS(status);
> +
> +                        if (retst == 0) {
> +                               return 0;
> +                        }
> +                }
> +
> +                /*
> +                 * error, read stderr and stdout and log.
> +                 */
> +
> +                FD_ZERO(&fdset);
> +                FD_SET(outpipe[0], &fdset);
> +                FD_SET(errpipe[0], &fdset);
> +
> +                if (outpipe[0] > errpipe[0])
> +                        nfds = outpipe[0] + 1;
> +                else
> +                        nfds = errpipe[0] + 1;
> +
> +                ret = select(nfds, &fdset, NULL, NULL, NULL);
> +                if (ret == -1) {
> +                        log_open();
> +                        log_unit_error_errno(u, ret, "select error,
> ignoring RDTCacheReservation: %m");
> +                        log_close();
> +                        return -1;
> +                }
> +
> +                if (FD_ISSET(outpipe[0], &fdset)) {
> +                        count = read(outpipe[0], buffer,
> sizeof(buffer)-1);
> +                        if (count >= 0) {
> +                                buffer[count] = 0;
> +                                log_open();
> +                                log_unit_error(u, "resctrltool stdout:
> %s", buffer);
> +                                log_close();
> +                        } else {
> +                                log_open();
> +                                log_unit_error(u, "resctrltool io
> error\n");
> +                                log_close();
> +                        }
> +                }
> +
> +                if (FD_ISSET(errpipe[0], &fdset)) {
> +                        count = read(errpipe[0], buffer,
> sizeof(buffer)-1);
> +                        if (count >= 0) {
> +                                buffer[count] = 0;
> +                                log_open();
> +                                log_unit_error(u, "resctrltool stderr:
> %s", buffer);
> +                                log_close();
> +                        } else {
> +                                log_open();
> +                                log_unit_error(u, "resctrltool io
> error\n");
> +                                log_close();
> +                        }
> +                }
> +
> +                return retst;
> +        }
> +
> +    return -1;
> +}
> +
>

It rather looks to me like your pipes are leaking?


> +static int setup_rdt_cat(Unit *u, const ExecContext *c, pid_t pid)
> +{
> +        int ret, i;
> +        const char *arg[5];
> +        char pidstr[100];
> +        char *name = u->id;
> +        const char *largm[ARGMSIZE + 4];
>

What's the difference between arg and largm?


> +
> +        sprintf(pidstr, "%d", pid);
>

Process IDs are unsigned.


> +
> +        arg[0] = "/usr/bin/resctrltool";
> +        arg[1] = "delete";
> +        arg[2] = name;
> +        arg[3] = 0;
> +
> +        ret = fork_exec_resctrltool(u, (char **)arg);
> +        /* we want to ignore 'reservation does not exist'
> +         * errors, so skip error checking entirely.
> +         * any other errors will be caught when trying
> +         * to execute create below
> +         */
> +
> +        memset(largm, 0, sizeof(largm));
> +        largm[0] = "/usr/bin/resctrltool";
> +        largm[1] = "create";
> +        largm[2] = "--name";
> +        largm[3] = name;
> +        for (i = 0; i < c->argmidx; i++)
> +                largm[i+4] = c->argm[i];
> +
> +        ret = fork_exec_resctrltool(u, (char **)largm);
> +        if (ret)
> +                return ret;
> +
> +        arg[0] = "/usr/bin/resctrltool";
> +        arg[1] = "assign";
> +        arg[2] = pidstr;
> +        arg[3] = name;
> +        arg[4] = 0;
> +        ret = fork_exec_resctrltool(u, (char **)arg);
>

Three calls per process seems excessive; couldn't resctrltool have a
dedicated subcommand for delete+create+assign?

>


>
> +#define STATE_NEXT_TYPE 0
> +#define STATE_NEXT_SIZE 1
> +#define STATE_NEXT_CACHEID_OR_END 2
> +
> +static void free_argm(ExecContext *c)
> +{
> +        int i;
> +
> +        for (i=0; i < c->argmidx; i++) {
> +                free(c->argm[i]);
> +                c->argm[i] = NULL;
> +        }
> +        c->argmidx = 0;
> +}
> +
> +int config_parse_rdt_cache_reservation(const char* unit,
> +                                       const char *filename,
> +                                       unsigned line,
> +                                       const char *section,
> +                                       unsigned section_line,
> +                                       const char *lvalue,
> +                                       int ltype,
> +                                       const char *rvalue,
> +                                       void *data,
> +                                       void *userdata) {
> +
> +        ExecContext *c = data;
> +        int curstate = STATE_NEXT_TYPE;
> +        char *buf = (char *)rvalue;
> +
> +        assert(filename);
> +        assert(lvalue);
> +        assert(rvalue);
> +        assert(data);
> +
> +        if (isempty(rvalue)) {
> +                /* An empty assignment resets the list */
> +                free_argm(c);
> +                return 0;
> +        }
> +
> +        do {
> +            if (c->argmidx > ARGMSIZE - 3) {
> +                free_argm(c);
> +                log_syntax(unit, LOG_ERR, filename, line, ENOMEM,
> +                           "argmidx overflow, ignoring: %s", rvalue);
> +                return 0;
> +            }
> +
> +            switch (curstate) {
> +            case STATE_NEXT_TYPE: {
> +                char *string, *buf2;
> +
> +                buf = strstr(buf, "type=");
> +                if (buf == NULL) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           "type= not found when expected, ignoring: %s",
> rvalue);
> +                    return 0;
> +                }
> +
> +                buf2 = strstr(buf, ",");
> +                if (buf2 == NULL) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           ", not found when parsing type, ignoring: %s",
> rvalue);
> +                    return 0;
> +                }
> +                if (strlen(buf) <= 5) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           "invalid type, ignoring: %s", rvalue);
> +                    return 0;
> +                }
> +                /* skip type= */
> +                buf = buf + 5;
> +                if (strlen(buf) < 4) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           "invalid type, ignoring: %s", rvalue);
> +                    return 0;
> +                }
> +                string = strndup(buf, 4);
> +                if (string == NULL) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, ENOMEM,
> +                           "enomem for strndup, ignoring: %s", rvalue);
> +                    return 0;
> +                }
> +
> +                c->argm[c->argmidx++] = strdup("--type");
> +                c->argm[c->argmidx++] = string;
> +
> +                /* skip {code,data,both}, */
> +                buf = buf + 5;
> +                curstate = STATE_NEXT_SIZE;
> +                break;
> +            }
> +            case STATE_NEXT_SIZE: {
> +                char *string, *buf2, *commabuf, *dotcommabuf;
> +                int len, skip = 0;
> +
> +                buf = strstr(buf, "size=");
> +                if (buf == NULL) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           "size= not found when expected, ignoring: %s",
> rvalue);
> +                    return 0;
> +                }
> +                if (strlen(buf) <= 5) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           "invalid size, ignoring: %s", rvalue);
> +                    return 0;
> +                }
> +                /* skip size= */
> +                buf = buf + 5;
> +
> +                commabuf = strstr(buf, ",");
> +                dotcommabuf = strstr(buf, ";");
> +
> +                buf2 = NULL;
> +
> +                /* the end, neither , or ; follow */
> +                if (commabuf == NULL && dotcommabuf == NULL) {
> +                    buf2 = buf + strlen(buf);
> +                    skip = 0;
> +                /* only ',': cache-id must follow */
> +                } else if (commabuf && dotcommabuf == NULL) {
> +                    buf2 = commabuf;
> +                    skip = 1;
> +                /* trailing ';' at the end */
> +                } else if (commabuf == NULL && dotcommabuf) {
> +                    buf2 = dotcommabuf;
> +                    skip = 1;
> +                } else if (commabuf && dotcommabuf) {
> +                    if (commabuf > dotcommabuf)
> +                        buf2 = dotcommabuf;
> +                    else
> +                        buf2 = commabuf;
> +                    skip = 1;
> +                }
> +                len = buf2 - buf;
> +                string = strndup(buf, len);
> +                if (string == NULL) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, ENOMEM,
> +                           "enomem for strndup, ignoring: %s", rvalue);
> +                    return 0;
> +                }
> +
> +                c->argm[c->argmidx++] = strdup("--size");
> +                c->argm[c->argmidx++] = string;
> +                curstate = STATE_NEXT_CACHEID_OR_END;
> +
> +                buf = buf2;
> +                buf = buf + skip;
> +                break;
> +            }
> +            case STATE_NEXT_CACHEID_OR_END: {
> +                char *buf2, *string, *dotcommabuf;
> +                int len, skip = 0;
> +
> +                if (strlen(buf) < 4)
> +                    break;
> +
> +                if (strncmp(buf, "type", 4) == 0) {
> +                    curstate = STATE_NEXT_TYPE;
> +                    break;
> +                }
> +
> +                buf = strstr(buf, "cache-id=");
> +                if (buf == NULL) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           "cache-id= not found when expected, ignoring:
> %s", rvalue);
> +                    return 0;
> +                }
> +                if (strlen(buf) <= 9) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, EINVAL,
> +                           "invalid cache-id=, ignoring: %s", rvalue);
> +                    return 0;
> +                }
> +                /* skip cache-id= */
> +                buf = buf + 9;
> +
> +                dotcommabuf = strstr(buf, ";");
> +                /* the end */
> +                if (dotcommabuf == NULL) {
> +                    buf2 = buf + strlen(buf);
> +                    skip = 0;
> +                } else {
> +                    buf2 = dotcommabuf;
> +                    skip = 1;
> +                }
> +                len = buf2 - buf;
> +                string = strndup(buf, len);
> +                if (string == NULL) {
> +                    free_argm(c);
> +                    log_syntax(unit, LOG_ERR, filename, line, ENOMEM,
> +                           "enomem for strndup, ignoring: %s", rvalue);
> +                    return 0;
> +                }
> +
> +                c->argm[c->argmidx++] = strdup("--cache-id");
> +                c->argm[c->argmidx++] = string;
> +
> +                buf = buf + skip;
> +                curstate = STATE_NEXT_TYPE;
> +                break;
> +            }
> +            default:
> +                break;
> +            }
> +        } while (strlen(buf) > 4);
> +
> +        c->argm[c->argmidx] = 0;
> +
> +        c->rdt_cache_reservation_set = true;
> +
> +        return 0;
> +}
>

This looks like it could be simplified *a lot* by using the strv functions,
e.g. strv_split(buf, ";") and just nicely iterating over each reservation.

Also, again, don't generate the command line until you actually need it --
isolate it to the "call resctrltool" function, and store raw parameters
until then.

                 r = sd_bus_message_append(m, "v", "i", oa);
+        } else if (streq(field, "RDTCacheReservation")) {
+                int argmidx = 0;
+#define ARGMSIZE 100
+#define STATE_NEXT_TYPE 0
+#define STATE_NEXT_SIZE 1
+#define STATE_NEXT_CACHEID_OR_END 2
+                int curstate = STATE_NEXT_TYPE;
+                char *buf = (char *)eq;
+
+                r = sd_bus_message_open_container(m, 'v', "as");
+                if (r < 0)
+                        return r;
+
+                r = sd_bus_message_open_container(m, 'a', "s");
+                if (r < 0)
+                        return r;
+
+                do {
+                        if (argmidx > ARGMSIZE - 3) {
+                                log_error("argmidx overflow");
+                                return -EINVAL;
+                        }
+
+                        switch (curstate) {
+                        case STATE_NEXT_TYPE: {
+                        char *string, *buf2;
+


Didn't you already do the same parsing earlier? (With the same memory leaks
and all.)

Overall, even though I'm not a committer, IMHO this needs a lot of
improvement before getting close to mergeable.

-- 
Mantas Mikulėnas <grawity at gmail.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/systemd-devel/attachments/20170106/27d2323c/attachment-0001.html>


More information about the systemd-devel mailing list