Dozens of incomplete stuff in Desktop Entry Standard

Owen Taylor otaylor at
Thu Jun 26 22:54:33 EEST 2003

I'm just going to respond to some questions about encoding here. I would
strongly suggest you look through the archives of this mailing list.
There have been large numbers of comments since the last revision
of the desktop file spec on such subjects as the interpretation
of Exec: lines.

If you wanted to pick up the editorial ball here, that would be
wonderful. A lot of the necessary changes are trivial and obvious,
though there are definitely other issues where no consensus has been
reached yet (such as the questions about Exec:).

On Thu, 2003-06-26 at 13:31, Koblinger Egmont wrote:

> - "value for a string key must contain only ASCII characters". A
> typical key that takes a string is Exec or Icon. It's clear that these
> take string instead of localestring: the value is not shown to the
> user in human-readable form but is only used for verbatim copying,
> opening a file, execing something. However, filenames might contain
> accented characters, and I do want to be able to open or exec them.
> Hence I recommend that values to the string should be able to contain
> 8-bit characters. They should be treated as-is without any charset
> conversion, without even requiring it to be valid UTF-8 or anything
> similar.

Allowing uninterpreted 8-bit I think is not really sensible because
we don't know the encoding in which the command will be executed
and we don't know the encoding of the user's file system. Plus,
as Havoc said, it cannot be displayed in a tool such as a desktop 
file editor.

A solution beyond ASCII for Exec may not be possible until
filesystem encodings for Linux are resolved, and that seems
unlikely to happen until everybody uses UTF-8 locales.
The Icon theme spec defines icon names to be ASCII only so there
is no issue there. (The desktop spec probably needs updating
to specify that Icon: is interpreted as in the Icon theme spec -
that spec wasn't defined when we first wrote the desktop spec.)
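
As a rough illustration of the ASCII-only rule for string keys, a
desktop file checker could flag offending lines along these lines.
This is only a sketch: the function name and the choice of Exec and
Icon as the keys to check are assumptions made for the example, not
something taken from the spec or from any existing implementation.

```python
def non_ascii_string_values(lines, string_keys=("Exec", "Icon")):
    """Return (line_number, key) pairs whose value contains bytes > 0x7F.

    `lines` is an iterable of raw byte strings, one per line of the file.
    `string_keys` is a hypothetical list of keys of type "string".
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        # Skip comments, group headers and anything that is not key=value.
        if raw.lstrip().startswith(b"#") or b"=" not in raw:
            continue
        key, _, value = raw.partition(b"=")
        key = key.strip().decode("ascii", errors="replace")
        if key in string_keys and any(byte > 0x7F for byte in value):
            problems.append((number, key))
    return problems
```

Localized keys such as Name[hu] are deliberately not checked here,
since they are of type localestring rather than string.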

> - Should lines such as Encoding[hu]=Legacy-Mixed be supported? This would
> have the meaning that all Key[hu]=Value lines are encoded in ISO-8859-2
> but it doesn't specify the encoding of other languages.

No. Even if it were sensible, Legacy-Mixed is deprecated, so adding
new extensions here is something we wouldn't do.

> - Does an Encoding field specify the encoding of all the lines in the
> file, even the preceding ones and the lines of other groups? Can it occur
> anywhere in the group (even after the Names and Comments encoded in the
> charset mentioned here) or does it have to occur before the very first key
> of type localestring?

Since the spec doesn't say anything about ordering, I think it's
pretty clear that the Encoding line can appear anywhere in the file,
even if that's a bit inconvenient for parsers.

(The obvious thing to do is to store localestring entries in the
file encoding as they are read and do the encoding conversion later.
GNOME does the reading in two passes instead, partly because it wants
to check if the entire file is valid UTF-8 for autodetection when
Encoding: is missing.)
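
A minimal sketch of the deferred-conversion approach, assuming a
single group and treating only keys with a [locale] suffix as needing
conversion (a simplification). The function name and the
legacy_charsets table are hypothetical; the one hu entry comes from
the ISO-8859-2 example in the question above. This is not how GNOME
or any other implementation actually does it.

```python
def parse_desktop_entry(data, legacy_charsets=None):
    """Parse key=value lines from raw bytes, converting encodings last.

    Values are kept as bytes until the whole file has been read, so an
    Encoding line may appear anywhere, even after the localized keys.
    """
    # Hypothetical locale -> charset table used for Legacy-Mixed files.
    legacy_charsets = legacy_charsets or {"hu": "iso-8859-2"}
    raw, encoding = {}, None
    for line in data.splitlines():
        if b"=" not in line or line.lstrip().startswith(b"#"):
            continue
        key, _, value = line.partition(b"=")
        key = key.strip().decode("ascii", errors="replace")
        if key == "Encoding":
            encoding = value.strip().decode("ascii", errors="replace")
        else:
            raw[key] = value.strip()

    if encoding is None:
        # Autodetection: treat the file as UTF-8 only if all of it is valid.
        try:
            data.decode("utf-8")
            encoding = "UTF-8"
        except UnicodeDecodeError:
            encoding = "Legacy-Mixed"

    entries = {}
    for key, value in raw.items():
        has_locale = key.endswith("]") and "[" in key
        locale = key[key.find("[") + 1:-1] if has_locale else None
        if locale is None:
            entries[key] = value  # non-localized value: left as bytes
        elif encoding == "UTF-8":
            entries[key] = value.decode("utf-8")
        else:
            charset = legacy_charsets.get(locale)
            # Unknown locales are left undecoded rather than guessed at.
            entries[key] = value.decode(charset) if charset else value
    return encoding, entries
```

Because nothing is decoded until the loop over the file has finished,
the position of the Encoding line does not matter to this parser.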

Encoding is defined to only apply to lines of type localestring, so
it's irrelevant for other groups, unless those groups are specified
(by some other specification) to have localestring lines as defined
in the desktop entry spec.

