File URI Specification ?
alexl at redhat.com
Mon Jun 12 10:54:33 EEST 2006
On Fri, 2006-06-09 at 21:23 +0200, Thiago Macieira wrote:
> Alexander Larsson wrote:
> >I'm not sure what happened to the copy on the freedesktop site. Attached
> >is a version I had in my homedir (I wrote the thing). Hopefully it's the
> >latest version. It might be possible to do some archeological digging
> >and find the old one on the site too.
> I'm sorry to bring back this old discussion, but I believe the spec is
> actually recommending something that is wrong (file:///). It says:
> >Some current apps generate URIs of the form "file:/<path>". These
> >are not correct according to RFC1738, so they should not be
> >generated. However for backwards compatibility, it is recommended that
> >such URIs are interpreted as file URIs with an empty hostname.
> Unfortunately, the more-recent RFC 3986 (Jan 2005) clearly indicates that
> this recommendation is, actually, wrong. It shows that, if
> the "authority" part of the URI is empty, the leading two slashes must be
> dropped as well.
> See the ABNF syntax:
>    URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
>    hier-part     = "//" authority path-abempty
>                  / path-absolute
>                  / path-rootless
>                  / path-empty
>    authority     = [ userinfo "@" ] host [ ":" port ]
>    path-absolute = "/" [ segment-nz *( "/" segment ) ]
> This indicates that, if // is present, it must be followed by a non-empty
> hostname. The alternative is to use a path-absolute, which starts with a
> single leading slash.
> Technically, the URI also allows for relative paths (without the root),
> but that's relative to the originating document (i.e., file:bin/ls could
> be /usr/local/bin/ls)
Ugh. This is sort of tricky. RFC 3986 defines the "general" URI
recommendations, and it can be read as you do above.
However, it also says about itself: "It excludes portions of RFC 1738
that defined the specific syntax of individual URI schemes; those
portions will be updated as separate documents."
One could read this to mean that the file:-specific parts of 1738 still
stand until there is a new document overriding it. And 1738 explicitly
says:
  As a special case, <host> can be the string "localhost" or the empty
  string; this is interpreted as `the machine from which the URL is
  being interpreted'.
Even more confusing, 3986 has a reference to file: URIs in section 1.1:
  URIs that identify in relation to the end-user's local context should
  only be used when the context itself is a defining aspect of the
  resource, such as when an on-line help manual refers to a file on the
  end-user's file system (e.g., "file:///etc/hosts").
Note that it uses the 1738 format for the example URI...
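To make the competing spellings concrete, here is a minimal Python sketch
of the lenient reading the spec recommends: accept an empty host,
"localhost", and the legacy "file:/<path>" form, and reject everything
else. The function name local_path_from_file_uri is hypothetical and is
not part of the spec; the sketch assumes Python's standard urllib.parse
and ignores non-local hosts entirely.

```python
from urllib.parse import urlsplit, unquote


def local_path_from_file_uri(uri):
    """Return the local absolute path named by a file: URI.

    Hypothetical helper, not taken from the spec. It accepts
    file:///path, file://localhost/path and the legacy file:/path,
    treating all three as "empty hostname", i.e. the local machine.
    """
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError("not a file: URI: %r" % uri)

    # hostname is None when there is no "//authority" part, or when
    # the authority is empty (as in file:///etc/hosts).
    host = parts.hostname or ""
    if host not in ("", "localhost"):
        raise ValueError("non-local host %r in %r" % (host, uri))

    # file:bin/ls is a relative reference; without a base document
    # there is nothing to resolve it against, so refuse it here.
    if not parts.path.startswith("/"):
        raise ValueError("relative path in %r" % uri)

    return unquote(parts.path)


if __name__ == "__main__":
    for uri in ("file:///etc/hosts",
                "file://localhost/etc/hosts",
                "file:/etc/hosts"):
        print(uri, "->", local_path_from_file_uri(uri))
```

All three spellings end up naming the same path, which is the backwards
compatible behaviour the spec text quoted above asks for, regardless of
which RFC one considers authoritative for generating URIs.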
All this just strengthens my view that using "web" URIs for a VFS was a
gigantic mistake. There is just no way "normal" apps (i.e.
non-webbrowsers) can cope with pathname semantics that vary over time
(as RFCs change) and between targets (e.g. root is "/" except on ftp,
where it is "/%2F"). It's pretty OK when all you need to do is parse a
URI given to you and download the file it references. However, when you
need to do directory traversals and the other more complicated things
that show up when using a VFS, things go bad pretty fast.
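As an illustration of the ftp case, here is a small Python sketch of how
an ftp: URL path breaks down into directory steps as I read RFC 1738
section 3.2.2. The helper name ftp_path_segments is hypothetical, and the
sketch ignores details such as a ";type=" suffix.

```python
from urllib.parse import urlsplit, unquote


def ftp_path_segments(url):
    """Split an ftp: URL path into decoded segments.

    Hypothetical helper. Under RFC 1738 the "/" right after the host
    only separates host from url-path; it is not part of the path.
    Each remaining segment is decoded on its own, so "%2F" at the
    start of the first segment is the only way to say "start at /".
    """
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp: URL: %r" % url)
    raw = parts.path[1:] if parts.path.startswith("/") else parts.path
    return [unquote(segment) for segment in raw.split("/")]


if __name__ == "__main__":
    # Relative to the login directory: segments "etc", then "motd".
    print(ftp_path_segments("ftp://host/etc/motd"))
    # Absolute: the first segment decodes to "/etc".
    print(ftp_path_segments("ftp://host/%2Fetc/motd"))
```

A VFS that wants to "go up one directory" therefore cannot just chop the
last "/"-separated component off the URI the way it would for a file:
path; it has to know which scheme it is dealing with.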
Alexander Larsson Red Hat, Inc
alexl at redhat.com alla at lysator.liu.se