[PATCH] Add long comments to shared-mime-info XML file

Christian Neumair chris at gnome-de.org
Sat Oct 15 12:18:17 PDT 2005

On Thu, 2005-10-13 at 11:44 -0400, Andrew J. Montalenti wrote:
> Dear Christian,
> On Thu, 2005-10-13 at 12:17 +0200, Christian Neumair wrote:
> > > I don't know about this patch.  Some of your long forms are a bit too
> > > long.  "Cascading Style Sheets stylesheet" and "Microsoft Windows Media
> > > Video video", for example, are particularly long and redundant, and
> > > would look stupid should a programmer ever actually employ the long
> > > comment form.
> > 
> > Maybe you could come up with alternate proposals?
> I guess it wasn't clear that I was pointing to two separate problems.
> I'll make it clear here:
> (1) If we are to have long comments, they shouldn't be redundant.
> "Cascading Style Sheets stylesheet" is just silly, as is "Windows
> Media Video video".  This doesn't help the user, and looks like a
> mistake.  If I were seeing a dialog asking me with what
> application I'd like to open this "Cascading Style Sheets stylesheet",
> I'd immediately think the programmer blindly expanded a "CSS" acronym to
> its full name in a string that, prior, had "CSS stylesheet" in it, and
> didn't realize there'd be redundancies.

Yes, I totally agree here.

> (2) Are long comments even necessary or useful in the form you have
> enumerated them?

> o Your proposal is to have two categories of comments--long and
> short--whose strings are used in different contexts.
> o My proposal is to have one category of comments, which has a
> contextual clue (<abbr> tag) whenever parts of those strings may need to
> offer an acronym expansion.

I have three problems with your proposal:

a) Backwards compatibility. I know, we haven't reached 1.0 yet and
don't make semantic guarantees, but it's plain unfriendly to introduce
markup in a previously markup-free string that could be interpreted
as plain text. GNOME 2.10 and 2.12 would display the abbr tag verbatim,
which is a no-no.

b) i18n. I used to be the translation coordinator for the German l10n
crew at GNOME for a few years, and my experience with translators was
and is that they often don't compile the software they translate, or
don't know how to figure out how a po string maps to the final
application. You can't rely on them being properly informed about the
technology they use.

Having markup tags inside gettext strings is perceived as being very,
very i18n unfriendly and confusing, especially since it involves string
merging that is not obvious, at least not to the average translator, and
will lead to dozens of problems. I'm quite surprised that Behdad
proposed the same, although I think he is involved in the Persian
GNOME i18n.

c) I don't like the semantics. I think the <abbr/> tag isn't appropriate
here, just because "HTML uses it". Its naming just isn't quite right in
this context, because "Windows Media Video" is not the expansion of
"WMV video", but of "WMV", i.e. if you take <abbr/> seriously
semantically, it's meant to be used like

<_comment><abbr title="Windows Media Video">WMV</abbr> video</_comment>

and not like

<_comment><abbr title="WMV video">Windows Media Video</abbr></_comment>

if I'm not mistaken. Maybe I've missed a point you made; if so,
please enlighten me.
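
To make point a) concrete, here is a rough sketch of what every consumer
of the comment string would have to do before display if markup were
introduced. The helper name strip_tags is made up for illustration and
is not part of any existing library; a consumer that does not do
something like this (i.e. any already released version) would simply
show the raw tags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes),
 * dropping everything between '<' and the next '>'.
 * Returns the number of characters written, excluding the NUL. */
static size_t
strip_tags (const char *src, char *dst, size_t dst_size)
{
  size_t n = 0;
  int in_tag = 0;

  assert (src != NULL && dst != NULL && dst_size > 0);

  for (; *src != '\0'; src++) {
    if (*src == '<')
      in_tag = 1;                 /* tag starts, stop copying */
    else if (*src == '>' && in_tag)
      in_tag = 0;                 /* tag ends, resume copying */
    else if (!in_tag && n + 1 < dst_size)
      dst[n++] = *src;            /* ordinary text, keep it */
  }
  dst[n] = '\0';
  return n;
}
```

This is deliberately naive (no entity handling, no validation), it only
shows that markup-free display stops being free once tags are in the
string.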

> But this way, later on, if GtkLabel's get some feature equivalent to
> <abbr> tags, we'll be able to use them.

i18n people will kill you if you do run-time merging of marked up text,
as mentioned above and below. Rather use the complete strings, and
explicitly offer all possible combinations, i.e. use

/* Each variant is a complete, separately translatable string. */
if (abbr_html && abbr_css) {
  string = g_strdup (_("HTML and CSS"));
} else if (abbr_html) {
  string = g_strdup (_("HTML and Cascading Style Sheets"));
} else if (abbr_css) {
  string = g_strdup (_("HyperText Markup Language and CSS"));
} else {
  string = g_strdup (_("HyperText Markup Language and Cascading Style Sheets"));
}

> And for now, we can also do
> other things like "sort by acronym" in a display of recognizable file
> types, or "search acronyms," without having to do dirty hacks like a strcmp
> between the <long- and non-<long- comments.  As is, even your regular
> <comment>'s don't have any notion of pointing out where the acronym is.

Ugh, what does sorting by acronym mean? The sort order should be almost
precisely the same, because

Foo Bar Video
foo bar video

Bar Foo Video
bar foo video

shouldn't really make a difference.

I must admit that the old <long-comment/> semantics kind of implied
redundancy as well, and that the naming was not very well thought-out.
Maybe we could have something like:

<_format-name>Windows Media Video</_format-name>
<_comment>WMV video</_comment>

or

<_comment _short="WMV video">Windows Media Video</_comment>

but I don't like the latter because it has the same shortcomings as long
comment/comment or comment/abbr, i.e. some mismatch between the obvious
interpretation of a tag and the relation of the tag contents to other
tags' contents.
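
As a minimal sketch of how a consumer could use the first variant, with
the two descriptions kept as fully separate strings: the struct, enum
and function names below are invented for illustration, and the MIME
type in the usage note is only an example value. Nothing here is an
existing API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: both descriptions are complete strings that
 * would be translated independently of each other. */
typedef struct {
  const char *mime_type;
  const char *format_name;  /* expanded, e.g. "Windows Media Video" */
  const char *comment;      /* short, e.g. "WMV video" */
} MimeDescription;

typedef enum {
  DESCRIPTION_SHORT,
  DESCRIPTION_EXPANDED
} DescriptionKind;

/* Pick one of the two strings; no run-time merging of fragments.
 * Falls back to the short comment if no format name is available. */
static const char *
mime_description_get (const MimeDescription *desc, DescriptionKind kind)
{
  assert (desc != NULL);

  if (kind == DESCRIPTION_EXPANDED && desc->format_name != NULL)
    return desc->format_name;
  return desc->comment;
}
```

A caller would fill a MimeDescription with, say, "video/x-ms-wmv",
"Windows Media Video" and "WMV video", and ask for whichever form fits
the context. The translator sees two plain strings and never has to
guess how they are combined.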

I'm sorry, but because of b) I really have to insist on the unexpanded
and the expanded version of a MIME description being two totally
separate strings. I guarantee you that if we CCed gnome-i18n at gnome.org,
many people would support my POV.

I'm also very sorry for the long followup; things like this can usually
be discussed better by two people on the phone. It's really a PITA to
read and/or write long mails.

Christian Neumair <chris at gnome-de.org>