[systemd-devel] Unit Names and Environment File Variable Names - Inconsistent Character Sets and Shortcomings with Unit Name Specifiers

James Feeney james at nurealm.net
Tue Apr 26 23:19:57 UTC 2022


I thought perhaps to file this as a bug report, since the shortcomings are so glaring, and the three proposed fixes - and documentation updates - are so specific.  But instead, maybe some discussion would be appropriate.  I want to address three issues and corresponding fixes: 1) adding a specifier for the leading string of the Unit Name prefix, 2) similarly, adding a specifier for the trailing string of the Template Unit Name instance, and 3) conforming the Unit Name character set and the Environment Variable Name character set.  Of course, there will be associated updates to the man pages, addressing these changes.

Addressing the first issue, under systemd's "There's more than one way to do it" philosophy, there are two methods which allow passing variable values to systemd Unit Files, regardless of whether "drop-in" directories and files are being used: 1) Unit Names, and 2) Environment Variable Names.

We note that Unit Name "specifiers" can be used to parse the Unit Names, using "%p", "%j", and "%i", as well as using their "with escaping undone" equivalents, with "%P", "%J", and "%I".

We also note that these Unit Name specifiers may be used as values in Unit dependency declarations, and also especially, that Environment Variable Names *cannot* be so used, as values in Unit dependency declarations.  This difference, then, places an emphasis upon the actual utility of these Unit Name specifiers, and, in particular, upon the utility of specifiers when making use of Template Unit Files.

Unfortunately, the number and function of the Unit Name specifiers, in their current form, has become noticeably incomplete since the introduction of the useful Unit Name prefix dash "-" formalism.  In particular, we note that there is no Unit Name specifier to select the *leading* string in the Unit Name prefix, and that - as best as I can tell, or at least for my purposes - the "%p" and "%P" specifiers have become essentially useless and pointless, given the existence now of the "%j" and "%J" specifiers, where 'If there is no "-", this is the same as "%p"'.  It may be that other people have found an actual use case for "%p" and "%P".

Specifically, from systemd.unit(5), note that the "%p" specifiers refer to 'the string before the first "@" character of the unit name', which thereby means, *all* of the components of any Unit Name prefix of the form, for example, "foo-bar-baz".  Note here that the component "baz" may be taken as a component in any dependency declaration using the "%j" specifiers, but that there is no way to select the component "foo" alone as a component in a dependency declaration.  It is extremely frustrating, not to be able to access this initial prefix component, for use in a dependency declaration.  Furthermore, it is my impression that, in this example, the explicit entire string "foo-bar-baz" will be useless as a component in any dependency declaration.
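To make the current behavior concrete, here is a minimal Python sketch of how the existing specifiers split a unit name, following only the systemd.unit(5) wording quoted above.  The function name "split_unit_name" is hypothetical, escaping is ignored entirely, and this is an illustration, not how systemd itself implements specifier expansion.

```python
def split_unit_name(unit_name):
    """Return the strings that %p, %j and %i would select (escaping ignored)."""
    # Strip the type suffix, e.g. ".service".
    stem, _, _suffix = unit_name.rpartition(".")
    # %p: everything before the first "@"; %i: everything after it.
    prefix, _, instance = stem.partition("@")
    # %j: the string after the last "-" in the prefix, or the whole
    # prefix when it contains no "-".
    final = prefix.rpartition("-")[2]
    return {"%p": prefix, "%j": final, "%i": instance}


parts = split_unit_name("foo-bar-baz@bif.service")
for specifier, value in parts.items():
    print(specifier, "=", value)
```

Nothing in the returned values selects "foo" alone, which is the gap described above.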

I point out that I have found that referencing Template Unit File dependencies and interdependencies using related Template Unit Names can be extremely useful, for expressiveness and clarity in Unit File naming, when, for example, building complex network interface configurations and connected network processes.  This occurs, for instance, when building hierarchies of hot-swapped bonding and bridging interfaces, or chains of network tunnels and network namespaces.  It then becomes useful to be able to infer and derive the name of a dependent Unit File from the name of its related Unit File.  In particular, if there were to exist a new - or modified - specifier referring to the *leading* string in a Unit Name prefix, it would then become possible to 1) utilize this leading string as an argument in any Unit File Service section option, *in addition to* utilizing the trailing string as an argument, using the "%j" specifiers, and to 2) also construct a related Template Unit File Name using the leading and trailing strings in the prefix in combination, as, for example, inferring a related Unit File Name "foo@baz.service" from some Unit File with the name "foo-bar-baz@bif.service".

This utility can be achieved with the simple expedient of creating a new Unit File Name specifier defined as the "Initial component of the prefix", 'the string before the first dash "-" of the prefix name'.
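As a rough sketch of what this proposed specifier would make possible, the following Python models the derivation of a related Template Unit Name from the leading and trailing prefix components.  The function names are hypothetical, the leading-component specifier does not exist in systemd today, and escaping is ignored.

```python
def leading_component(prefix):
    # Proposed (hypothetical) specifier: the string before the first "-".
    return prefix.partition("-")[0]


def final_component(prefix):
    # Existing "%j" behaviour: the string after the last "-".
    return prefix.rpartition("-")[2]


def related_unit(unit_name):
    # Split off the type suffix, then take the prefix before the first "@".
    stem, _, suffix = unit_name.rpartition(".")
    prefix = stem.partition("@")[0]
    # Combine leading and trailing components into a related unit name.
    return f"{leading_component(prefix)}@{final_component(prefix)}.{suffix}"


print(related_unit("foo-bar-baz@bif.service"))
```

With the example name used in this message, the derived name is "foo@baz.service".  When the prefix contains no dash, both components are the same string, so the leading component matches what "%p" and "%j" select today.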

Either the definition of the "%p" and "%P" specifiers can be modified to reference only the leading string in the prefix, rather than the *entire* prefix, or, if anyone believes that there is a real use case for these "%p" specifiers referring to the *entire* prefix string, then another letter can be chosen to represent this new leading string prefix specifier.  I suggest *not* using "q", since there exist only two sets of four sequential letters unused as specifiers, "c d e f" and "q r s t", and these may be better reserved for some future use case.  Instead, since I will suggest another use for the currently unused letter "k" as a specifier, this leaves "x", "y", and "z" as possible choices.

Note that the current effect of the "%p" specifier, which references the entire prefix string, is not lost in the simple case when redefining the specifier as suggested, where a prefix "foo-baz" could then be expressed in its entirety as literally "%p-%j", with no loss of utility.

A counter argument to losing a reference to the *entire* prefix string, with "%p" as currently defined, is that then it would no longer be possible to extract additional sub-strings from the prefix name, using, for example, bash parameter expansion.  However, this is not a meaningful loss in functionality, because it is *already* possible to pass arbitrary strings as Unit File Service section options using Environment File variable value assignments.  In contrast, passing arguments to Template Unit File Unit section dependency declarations *cannot* be done in any other way than by using the Unit Name components.

As with current usage, the effect of "%p" and "%P" does not change if no dash "-" is being used in the prefix name.  The function is the same after this proposed redefinition.

It does not appear that the availability of these additional specifiers would interfere in any way with current usage, existing Unit File names, or with "drop-in" file and directory functionality.

Addressing the second issue, similarly, with respect to the Template Unit File instance specifiers, using the "at per se" "@" delimiter, we note again from systemd.unit(5), that the "%i" specifiers allow parsing 'the string between the first "@" character and the type suffix'.  While these specifiers have utility, as above, both as arguments to Unit File service section options, especially for use with the "Exec*" options, and for inferring and deriving dependent Unit File Names for use in the various Unit File unit section dependency declarations, there is also a glaring lost opportunity here, to infer and derive an additional *Template* Unit File Name from another purposely crafted Unit File Name.

Yes, there are use cases in which I am tempted to do this.

Note that, for instance, "foo-bar-baz@bif@baf@boof.service" is a valid Template Unit Name, where the "%i" specifiers will select the *entire* string "bif@baf@boof".  However, with the simple expedient of defining a pair of instance specifiers, similarly as above for the prefix, it becomes possible then to pass an additional variable to a Template Unit File, allowing an additional related Unit File dependency to be expressed, as for instance "bif@boof.service".  And then also, an additional argument, "boof", becomes available to pass to Unit File service options.

The letter "k" could then be used as the "%k" and "%K" specifiers, defined as "Final component of the instance", 'the string between the last "@" character and the type suffix', reflecting an equivalent usage with respect to the "%j" prefix specifiers.

The "%i" specifiers should then also be redefined, as "First component of the Instance name", 'For instantiated units this is the string between the first "@" character and the type suffix or any additional "@" character'.

Again, note that these additional "%k" specifiers and modified "%i" specifiers do not interfere in any way with current usage, existing Unit File names, or with "drop-in" file and directory functionality.  With current Template Unit File naming, where there is no instance string, or where there is only one instance string, the modified "%i" and "%I" specifiers will behave in exactly the same way.
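The instance-side proposal can be sketched the same way.  The Python below models the redefined "%i" (first component of the instance) and the proposed "%k" (final component of the instance).  Both behaviours are proposals from this message, not current systemd behaviour, the function name "split_instance" is hypothetical, and escaping is ignored.

```python
def split_instance(unit_name):
    """Return (first, last) instance components under the proposal."""
    # Strip the type suffix, then split prefix from instance at the first "@".
    stem, _, _suffix = unit_name.rpartition(".")
    _prefix, _, instance = stem.partition("@")
    # Proposed "%i": up to the type suffix or any additional "@".
    first = instance.partition("@")[0]
    # Proposed "%k": between the last "@" and the type suffix.
    last = instance.rpartition("@")[2]
    return first, last


first, last = split_instance("foo-bar-baz@bif@baf@boof.service")
print(f"{first}@{last}.service")

# With a single instance string, both components are the same string,
# so existing names keep their current "%i" value.
print(split_instance("bridge@bridge0.service"))
```

For the multi-"@" example this yields the components "bif" and "boof", and so the related name "bif@boof.service".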

Addressing the third issue, inconsistent character sets for Unit Names and Environment File Variable Names, I find it very useful and clarifying to provide Environment Variables to Template Unit files by prefixing shared template Environment Variable Names with an identical Unit Name prefix or instance name.  These variables can then be referenced, for instance in "Exec*" service options, by simply prefixing the variable with a "%p", "%j", or "%i" specifier, as appropriate.  In this way, the Environment File might specify, for instance, "bridge0STP= 'on'" and "bridge1STP= 'off'" for some Template Unit File named "bridge@.service", for each instance of "bridge@bridge0.service" and "bridge@bridge1.service".

I had considered resolving the problem with accessing the initial string in the Unit File name prefix by parsing the *entire* prefix string given by the "%p" specifier for the initial string, before the first dash "-" character.  However, to retain the convention of matching the specifier "%p" and the Environment File Variable prefix naming just described, the Environment Variable prefix must then include the dash character "-", as implied in systemd.unit(5), for the "%p" specifier, 'For instantiated units, this refers to the string before the first "@" character of the unit name', meaning the *entire* prefix string.  This case might result, for example, from using some Unit File named "sit1-bridge@wlan0.service", where the matching "%p" Environment File Variable Name prefix becomes "sit1-bridge".

However, attempting this, an error message will be encountered immediately: "Ignoring invalid environment assignment".  Why is that?

We take note of the Environment File variable name character set described in the man page systemd.exec(5), and also, the Unit Name character set described in systemd.unit(5):

systemd.exec
The names of the variables can contain ASCII letters, digits, and the underscore character. Variable names cannot be empty or start with a digit.

systemd.unit
The "unit prefix" must consist of one or more valid characters (ASCII letters, digits, ":", "-", "_", ".", and "\").

This character list for the Unit Name is incomplete, due to the overly specific reference to "unit prefix".  The man page goes on to say: 'The name of the full unit is formed by inserting the instance name between "@" and the unit type suffix'.  So the complete Unit Name character set then also includes the character "@".

Of course, that is the cause of the "invalid environment assignment" error.  The dash "-" character, which is allowed and required in the Unit Name character set, is *not* included or allowed in the Environment File character set.
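The mismatch can be illustrated with a small Python sketch that encodes only the two man page sentences quoted above as regular expressions.  The names "ENV_NAME" and "UNIT_PREFIX" and the two helper functions are hypothetical, and systemd's real validation may apply further rules (length limits, for example) that are not modelled here.

```python
import re

# systemd.exec(5): ASCII letters, digits and underscore; not empty,
# and not starting with a digit.
ENV_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# systemd.unit(5): one or more of ASCII letters, digits,
# ":", "-", "_", "." and "\".
UNIT_PREFIX = re.compile(r"[A-Za-z0-9:\-_.\\]+")


def is_valid_env_name(name):
    return ENV_NAME.fullmatch(name) is not None


def is_valid_unit_prefix(prefix):
    return UNIT_PREFIX.fullmatch(prefix) is not None


for candidate in ("bridge0", "sit1-bridge"):
    print(candidate,
          "unit prefix:", is_valid_unit_prefix(candidate),
          "env name with STP suffix:", is_valid_env_name(candidate + "STP"))
```

Under these two rules, "sit1-bridge" is accepted as a unit prefix, but "sit1-bridgeSTP" is rejected as a variable name, while "bridge0STP" is accepted.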

Why is that?  That seems like a pointless and unnecessary limitation, having an Environment File Variable Name character set *different from* the Unit Name character set.

Furthermore, notice that the "escaping" name specifiers, "%P", "%J", and "%I", cannot be used with the Environment File variable names generally, and cannot be used to work-around the dash "-" character limitation, because the backslash "\" character is also required and used with systemd String Escaping in Unit Names, and is also *not* included or allowed in the Environment File Variable Name character set.

So, finally, unless there is some use case that can be discovered prohibiting this, it would be best to conform the Unit Name and the Environment File Variable Name character sets, by simply expanding the Environment File Variable Name character set to include and allow those additional characters, ":", "-", ".", "\", and "@", from the Unit Name character set.

James

