File URI Specification ?

Alexander Larsson alexl at redhat.com
Mon Jun 12 10:54:33 EEST 2006


On Fri, 2006-06-09 at 21:23 +0200, Thiago Macieira wrote:
> Alexander Larsson wrote:
> >I'm not sure what happened to the copy on the freedesktop site. Attached
> >is a version I had in my homedir (I wrote the thing). Hopefully it's the
> >latest version. It might be possible to do some archeological digging
> >and find the old one on the site too.
> 
> I'm sorry to bring back this old discussion, but I believe the spec is 
> actually recommending something that is wrong (file:///). It says:
> 
> >Some current apps generate URIs of the form "file:/<path>". These
> >are not correct according to RFC1738, so they should not be
> >generated. However for backwards compatibility, it is recommended that
> >such URIs are interpreted as file URIs with an empty hostname.
> 
> Unfortunately, the more-recent RFC 3986 (Jan 2005) clearly indicates that 
> this recommendation is, actually, wrong. It shows that, if 
> the "authority" part of the URI is empty, the leading two slashes must be 
> dropped as well.
> 
> See the ABNF syntax: 
> http://www.gbiv.com/protocols/uri/rfc/rfc3986.html#collected-abnf
> 
>  URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
>  hier-part     = "//" authority path-abempty
>                / path-absolute
>                / path-rootless
>                / path-empty
>  authority     = [ userinfo "@" ] host [ ":" port ]
>  path-absolute = "/" [ segment-nz *( "/" segment ) ]
> 
> This indicates that, if // is present, it must be followed by a non-empty 
> hostname. The alternative is to use a path-absolute, which starts with a 
> single leading slash.
> 
> Technically, the URI also allows for relative paths (without the root),
> but that's relative to the originating document (e.g., file:bin/ls could
> be /usr/local/bin/ls).

Ugh. This is sort of tricky. RFC 3986 defines the "general" URI
recommendations, and it can be read as you do above.
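
To make the forms under discussion concrete, here is a minimal Python
sketch using the standard urllib.parse module. The helper name
describe() is made up for illustration. It only shows how each spelling
splits into authority and path, and does not settle which RFC reading
is correct.

```python
from urllib.parse import urlsplit

def describe(uri):
    """Return (has_authority, host, path) for a URI string."""
    parts = urlsplit(uri)
    # urlsplit() does not record whether "//" was present when the host
    # is empty, so look at the raw text right after "scheme:".
    has_authority = uri[len(parts.scheme) + 1:].startswith("//")
    return has_authority, parts.netloc, parts.path

for uri in ("file:/etc/hosts",             # path-absolute, no authority
            "file:///etc/hosts",           # "//" with an empty host
            "file://localhost/etc/hosts",  # "//" with an explicit host
            "file:bin/ls"):                # path-rootless (relative)
    print(uri, describe(uri))
```

The first three forms all yield the path /etc/hosts. They differ only
in whether an authority is present and what it contains.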

However, it also says about itself: "It excludes portions of RFC 1738
that defined the specific syntax of individual URI schemes; those
portions will be updated as separate documents." 

One could read this to mean that the file:-specific parts of RFC 1738
still stand until a new document overrides them. And RFC 1738
explicitly says:
   As a special case, <host> can be the string "localhost" or the empty
   string; this is interpreted as `the machine from which the URL is
   being interpreted'.
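
A lenient reader that follows this rule, and also accepts the
"file:/<path>" form as the spec recommends, could be sketched as
follows. The function name local_path_from_file_uri() is hypothetical,
and the policy of rejecting remote hosts and relative paths is an
assumption made for the sketch, not something either RFC mandates.

```python
from urllib.parse import urlsplit, unquote

def local_path_from_file_uri(uri):
    """Map a file: URI to a local path, treating an empty host and
    "localhost" as the local machine (the RFC 1738 special case)."""
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError("not a file URI: " + uri)
    # .hostname is None when no host is given; it is lowercased otherwise.
    host = parts.hostname or ""
    if host not in ("", "localhost"):
        raise ValueError("file URI names a remote host: " + host)
    if not parts.path.startswith("/"):
        # e.g. "file:bin/ls" can only be resolved against a base document.
        raise ValueError("relative file URI needs a base: " + uri)
    # Percent-escapes are decoded only after the URI has been split.
    return unquote(parts.path)
```

With this policy, "file:/etc/hosts", "file:///etc/hosts" and
"file://localhost/etc/hosts" all map to the same local path.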

Even more confusing, RFC 3986 refers to file: URIs in section 1.1:

  URIs that identify in relation to the end-user's local context should
  only be used when the context itself is a defining aspect of the
  resource, such as when an on-line help manual refers to a file on the
  end-user's file system (e.g., "file:///etc/hosts").
 
Note that it uses the 1738 format for the example URI...

All this just strengthens my view that using "web" URIs for a VFS was a
gigantic mistake. There is just no way "normal" apps (i.e.
non-web-browsers) can cope with pathname semantics that vary over time
(as RFCs change) and between targets (e.g. root is "/", except on ftp
where it is "/%2F"). It's pretty OK when all you need to do is parse a
URI given to you and download the file it references. However, when you
need to do directory traversals and other more complicated things that
show up when using a VFS, things go bad pretty fast.
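
The ftp root difference can be illustrated with a short Python sketch.
It assumes the RFC 1738 reading in which the "/" after the host is only
a separator and each remaining segment is decoded on its own, so a
leading "%2F" is what makes the first segment absolute. The helper name
ftp_path_segments() is made up for this example.

```python
from urllib.parse import urlsplit, unquote

def ftp_path_segments(url):
    """Split an ftp: URL path into decoded segments."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL: " + url)
    # Drop the separator slash that follows the host, then decode each
    # segment separately so an escaped "/" stays inside its segment.
    raw = parts.path[1:] if parts.path.startswith("/") else parts.path
    return [unquote(segment) for segment in raw.split("/")]

# Relative to the login directory:
print(ftp_path_segments("ftp://host/etc/motd"))
# Relative to the server's root:
print(ftp_path_segments("ftp://host/%2Fetc/motd"))
```

A VFS layer that joins and normalizes paths as if they were local file
names would lose exactly this distinction.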

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl at redhat.com    alla at lysator.liu.se 
He's a notorious misogynist firefighter with nothing left to lose. She's a 
man-hating punk vampire looking for love in all the wrong places. They fight 
crime! 



