A few questions about escaping in desktop files
Simon McVittie
smcv at collabora.com
Mon Aug 22 15:49:51 UTC 2022
On Mon, 22 Aug 2022 at 09:41:45 +0200, meator wrote:
> I've noticed that a lot of the special characters in the second phase have
> something to do with the shell. Is shell supposed to be involved in the
> execution of the program? I'm currently executing it directly (I'm not doing
> sh -c <args>). If there isn't a shell involved, why are there so many
> restricted characters? They look pretty useless.
I didn't write this specification, but I believe the intention is that you
can either do the word-splitting and unquoting from first principles, or
construct a string to pass to a shell and let the shell do the unquoting.
Reserving characters like $ is a way to be nice to implementations that
want to delegate the word-splitting and unquoting to a shell.
GLib does almost the same as you are doing, but reversing your second
and third phases: first it decodes the .desktop file (using GKeyFile
and g_key_file_get_string() to expand the escape sequences \s, \n, \t,
\r, \\), then it replaces field codes like %f with a shell-style-quoted
version of their expansion (so for example %f expands to the result of
g_shell_quote(filename), similar to Python shlex.quote(filename)), then it
does the equivalent of shell word-splitting and unquoting to get an array
of arguments (g_shell_parse_argv(), similar to Python shlex.split()), and
finally it passes that array to an argv-based API similar to execve()
or posix_spawn(). This is considered to be a valid implementation.
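As a rough illustration of the three steps described above, here is a minimal Python sketch. It is not GLib's code: the function names are hypothetical, Python's shlex is only similar to g_shell_quote() and g_shell_parse_argv(), unknown escape sequences are left untouched, and field codes other than %f and %% are simply dropped. All of those are assumptions made to keep the example short.

```python
import re
import shlex

# Escape sequences listed in the message for string values.
_ESCAPES = {"s": " ", "n": "\n", "t": "\t", "r": "\r", "\\": "\\"}


def decode_value(raw):
    """Step 1: expand \\s, \\n, \\t, \\r and \\\\ in a raw key value.

    Assumption: unknown escapes are left as-is in this sketch.
    """
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(0)), raw)


def expand_field_codes(command, filename):
    """Step 2: replace %f with a shell-quoted filename, and %% with %."""
    def repl(match):
        code = match.group(1)
        if code == "f":
            return shlex.quote(filename)
        if code == "%":
            return "%"
        return ""  # assumption: other field codes are dropped here
    return re.sub(r"%(.)", repl, command)


def build_argv(raw_exec, filename):
    """Step 3: shell-style word-splitting and unquoting into an argv list."""
    return shlex.split(expand_field_codes(decode_value(raw_exec), filename))


# The resulting list is what would be handed to an argv-based API.
print(build_argv('my-viewer "--title=My Viewer" %f', "/tmp/my file.txt"))
```

Because the filename is quoted before the word-splitting step, a filename containing spaces still ends up as a single argument.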
I haven't done any Qt/KDE development for a while, but if I remember
correctly, Qt/KDE does the first two phases very similarly to
GLib, but then instead of doing the word-splitting itself, it passes
the entire command to a shell via something similar to system() or
{"sh", "-c", ...} and lets the shell figure out what to do next. This is
also considered to be a valid implementation. (Sorry, I don't remember
whether this code is in Qt or in a KDE library.)
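The shell-delegating approach can be sketched in the same spirit. This is not Qt/KDE code: build_shell_command() is a hypothetical helper, and the naive str.replace() ignores details such as %% that a real implementation would have to handle.

```python
import shlex


def build_shell_command(exec_value, filename):
    """Return an argv that hands the whole command line to sh."""
    # Quote the expansion so the shell sees the filename as a single word.
    command = exec_value.replace("%f", shlex.quote(filename))
    return ["sh", "-c", command]


print(build_shell_command("my-viewer %f", "/tmp/my file.txt"))
```

The returned list could be passed to something like subprocess.run(). In this model the shell interprets everything in the Exec value that is not quoted, which is presumably why characters such as $ are reserved.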
If there is an existing high-quality implementation that is suitable for
your language/dependency/etc. requirements (for instance GDesktopAppInfo
in GLib, or whatever part of the Qt/KDE stack is their closest
equivalent), then I'd recommend using that instead of implementing
your own.
Or, if you have to use your own implementation, then if in doubt, I would
suggest checking what GNOME and KDE would do with a particular .desktop
file. If they both parse it without error and get the same result, then
that is probably the result to be aiming for. Other GLib-based desktops
will do the same as GNOME, and other Qt-based desktops are likely to do
the same as KDE, so looking at GNOME and KDE will cover most users of
.desktop files.
Where the spec is ambiguous, the right resolution for the ambiguity is
likely to be one of:
- it is considered valid, and the right interpretation is what GNOME and
  KDE both do;
- it is considered invalid: applications should not install .desktop files
  that do this, and if they do, the interpretation is unspecified.
> > Arguments are separated by a space.
>
> A literal space ' ' 0x20? For example the shell uses $IFS which allows a
> space, a tab and a newline as separators.
I believe the intention is that separating arguments with anything
other than a single 0x20 has unspecified behaviour: applications should
not install a .desktop file that does this, and if they do, different
implementations are not guaranteed to parse it the same way.
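To see why other separators cannot be relied on, compare two plausible splitting strategies on a hypothetical Exec value that uses a tab. Neither is claimed to be what any particular desktop does; the point is only that they disagree.

```python
import shlex

exec_value = "echo a\tb"  # hypothetical Exec value using a tab as separator

# A shell-like splitter treats the tab as whitespace between words.
shell_like = shlex.split(exec_value)

# A splitter that only recognises a literal 0x20 keeps "a\tb" as one word.
space_only = exec_value.split(" ")

print(shell_like)
print(space_only)
```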
> How many arguments of echo would
>
> Exec=echo prefix%isuffix
>
> produce? 2 or 4?
The intention is clearly that %i appears as an unquoted word in its own
right, rather than being concatenated with other words, so the answer is
"don't do that, it's silly" :-)
However, if an implementation wants to be able to support something like
Exec=my-utility --input-from-file=%f
then it is forced to have a parsing model where that works; and if its
parsing model makes that work, then that probably implies that
prefix%isuffix must be expanded to two arguments, "prefix--icon" and
"my-icon-namesuffix". Certainly it looks like this is what GLib would do.
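A small Python sketch of that parsing model, under the assumption that %i is substituted textually as "--icon" followed by the shell-quoted Icon value and the result is then word-split. expand_icon() is a hypothetical helper, and Python's shlex stands in for GLib's functions.

```python
import shlex


def expand_icon(command, icon_name):
    """Replace %i with two words: --icon and the shell-quoted icon name."""
    return command.replace("%i", "--icon " + shlex.quote(icon_name))


# The surrounding text sticks to the first and last word of the expansion.
argv = shlex.split(expand_icon("echo prefix%isuffix", "my-icon-name"))
print(argv)  # ['echo', 'prefix--icon', 'my-icon-namesuffix']

# The same textual model is what makes --input-from-file=%f work.
option = "my-utility --input-from-file=" + shlex.quote("/tmp/a b")
print(shlex.split(option))  # ['my-utility', '--input-from-file=/tmp/a b']
```

Under this model echo receives two arguments.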
> So when I have
>
> Exec=true $test > 1
>
> It should be quoted as
>
> Exec=true "\\$test" ">" 1
>
> And the program would receive three arguments (argc = 4), "$test", ">" and
> "1" or JHRlc3Q=, Pg==, MQ== in base64 (I'm including the base64 encoded
> string because it is unambiguous). Is this correct?
Honestly, put your "difficult" strings in a script somewhere else and
then put the path to the script in the .desktop file - that'll be a lot
more reliable and also easier to read.
smcv