A few questions about escaping in desktop files
meator
meator.dev at gmail.com
Mon Aug 22 19:40:58 UTC 2022
Thanks for your quick response!
On 8/22/22 17:49, Simon McVittie wrote:
> I didn't write this specification, but I believe the intention is that you
> can either do the word-splitting and unquoting from first principles, or
> construct a string to pass to a shell and let the shell do the unquoting.
> Reserving characters like $ is a way to be nice to implementations that
> want to delegate the word-splitting and unquoting to a shell.
Let the shell do the unquoting?! To quote (pun intended) the standard:
> Implementations must undo quoting before expanding field codes and
> before passing the argument to the executable program.
Isn't this forbidden? I guess the universal as-if rule means that if
it doesn't change the behavior then it's allowed, but this is strange.
When I first read this, I completely ruled out using a shell with Exec,
but now I see that the "before passing the argument to the executable
program" part doesn't strictly have to mean that a shell can't do its
thing before the argument is passed to the program.
But then there are field codes. As I said, if it behaves correctly then
it can probably do this in a different order, but I think this conflicts
with the two implementations you have described.
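To make the ordering question concrete, here is a minimal sketch comparing the two orders. It uses Python's shlex module as a stand-in for the quoting rules (an assumption: shlex follows POSIX shell rules, which are close to but not identical to the Exec quoting rules), handles only %f, and ignores %% escaping. The function names are made up for illustration.

```python
import shlex

def split_then_expand(exec_value: str, filename: str) -> list[str]:
    """Spec order: undo quoting first, then expand %f inside each word."""
    words = shlex.split(exec_value)  # stand-in for the spec's unquoting rules
    return [w.replace("%f", filename) for w in words]

def expand_then_split(exec_value: str, filename: str) -> list[str]:
    """GLib-like order: insert a shell-quoted %f, then split and unquote."""
    return shlex.split(exec_value.replace("%f", shlex.quote(filename)))

exec_value = 'viewer --title "My Viewer" %f'
filename = "/tmp/my file's copy.txt"

# Both orders should produce the same argument vector for this input.
print(split_then_expand(exec_value, filename))
print(expand_then_split(exec_value, filename))
```

For an unquoted %f both orders give the same argv, which is the as-if argument in practice. They can diverge when a field code sits inside a quoted argument, which is presumably why the specification leaves that case undefined.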
> GLib does almost the same as you are doing, but reversing your second
> and third phases: first it decodes the .desktop file (using GKeyFile
> and g_key_file_get_string() to expand the escape sequences \s, \n, \t,
> \r, \\), then it replaces field codes like %f with a shell-style-quoted
> version of their expansion (so for example %f expands to the result of
> g_shell_quote(filename), similar to Python shlex.quote(filename)), then it
> does the equivalent of shell word-splitting and unquoting to get an array
> of arguments (g_shell_parse_argv(), similar to Python shlex.split()), and
> finally it passes that array to an argv-based API similar to execve()
> or posix_spawn(). This is considered to be a valid implementation.
A little disclaimer: I am working on a program that does this from the
ground up. It's in C++ and it uses neither toolkits nor GLib. I am not
very familiar with them or with Python, so my questions might be trivial.
Why is it quoting the evaluated field code? The filenames and URLs are
pretty unambiguous and could be copied verbatim into the argument
without fear of something misinterpreting it. If I understand it
correctly, it's going to be unquoted by the g_shell_parse_argv() you
mentioned. Is this done just to make the parsing with
g_shell_parse_argv() simpler?
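One possible answer can be seen in a small experiment. This sketch uses Python's shlex.quote() and shlex.split() as stand-ins for g_shell_quote() and g_shell_parse_argv(), following the analogy in the quoted text; the template and file name are invented for illustration.

```python
import shlex

filename = "/home/user/My Documents/notes.txt"
template = "editor %f"

# Verbatim substitution: the space in the file name becomes a word boundary
# once the whole string is split into arguments.
naive = shlex.split(template.replace("%f", filename))

# Quoted substitution: shlex.quote() protects the name and shlex.split()
# removes the quoting again, so the name survives as one argument.
quoted = shlex.split(template.replace("%f", shlex.quote(filename)))

print(naive)   # ['editor', '/home/user/My', 'Documents/notes.txt']
print(quoted)  # ['editor', '/home/user/My Documents/notes.txt']
```

So when expansion happens before word-splitting, the quoting looks necessary for correctness rather than a convenience: file names may contain spaces, quotes or backslashes, and the later splitting step cannot tell them apart from syntax in the Exec line.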
> Or, if you have to use your own implementation, then if in doubt, I would
> suggest checking what GNOME and KDE would do with a particular .desktop
> file. If they both parse it without error and get the same result, then
> that is probably the result to be aiming for. Other GLib-based desktops
> will do the same as GNOME, and other Qt-based desktops are likely to do
> the same as KDE, so looking at GNOME and KDE will cover most users of
> .desktop files.
This is probably the best way to test this. I'll have to either set up
GNOME and KDE on my system or find the specific program each of them
uses for this, but it should work.
> I believe the intention is that separating arguments with anything
> other than a single 0x20 has unspecified behaviour: applications should
> not install a .desktop file that does this, and if they do, different
> implementations are not guaranteed to parse it the same way.
Ok.
> However, if an implementation wants to be able to support something like
>
> Exec=my-utility --input-from-file=%f
>
> then it is forced to have a parsing model where that works; and if its
> parsing model makes that work, then that probably implies that
> prefix%isuffix must be expanded to two arguments, "prefix--icon" and
> "my-icon-namesuffix". Certainly it looks like this is what GLib would do.
I didn't think of --arg=%f. This makes sense. Field codes would be nicer
to parse if each field code had to be its own argument, but this would
break your example.
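A rough sketch of a per-word expansion that supports both cases from the quoted text might look like this. It assumes the word has already been split and unquoted, handles only %f, %i and %%, and assumes a non-empty icon name; expand_word and its parameters are hypothetical names, not part of any library.

```python
def expand_word(word: str, filename: str, icon: str) -> list[str]:
    """Expand field codes in one already-unquoted word; may return several arguments."""
    args = [""]
    i = 0
    while i < len(word):
        ch = word[i]
        if ch == "%" and i + 1 < len(word):
            code = word[i + 1]
            i += 2
            if code == "%":
                args[-1] += "%"          # %% is a literal percent sign
            elif code == "f":
                args[-1] += filename     # stays inside the current argument
            elif code == "i":
                # %i stands for two arguments: "--icon" and the icon name.
                # Text before it sticks to the first, text after it to the second.
                args[-1] += "--icon"
                args.append(icon)
            # Other field codes are dropped in this sketch.
        else:
            # Ordinary character (or a lone trailing '%', kept literally here).
            args[-1] += ch
            i += 1
    return args

print(expand_word("--input-from-file=%f", "/tmp/a b", "my-icon-name"))
print(expand_word("prefix%isuffix", "/tmp/a b", "my-icon-name"))
```

With this model, --input-from-file=%f stays a single argument, and prefix%isuffix becomes the two arguments described in the quoted text.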
> Honestly, put your "difficult" strings in a script somewhere else and
> then put the path to the script in the .desktop file - that'll be a lot
> more reliable and also easier to read.
I'm trying to come up with the most difficult, weird and ambiguous
strings so that my implementation doesn't segfault on them. If I
actually wanted to execute a program with such peculiar arguments,
then I would try to come up with something more reliable.