'file' URI scheme

Alexander Larsson alexl at redhat.com
Mon Mar 31 03:17:01 PDT 2008

On Mon, 2008-03-31 at 00:58 +0200, Thiago Macieira wrote:
> Luke -Jr wrote:
> >On Sunday 30 March 2008, Thiago Macieira wrote:
> >> >file:///etc;foo/config;bar/some.conf;hi
> >>
> >> What parameters?
> >
> >Section 3.3 of RFC 2396 (Uniform Resource Identifiers (URI): Generic
> > Syntax)
> RFC 3986 is newer. But we're talking about URLs here, which are a specific 
> subset of URIs. The parameters do not apply.
> >
> >> That URL above maps to the file:
> >>
> >> 	/etc;foo/config;bar/some.conf;hi
> >>
> >> I doubt that file exists in your system. If there's any such thing as
> >> parameters, they must be handled by the open(2) system call.
> >
> >Path component parameters are defined in the URI standard, but (AFAIK)
> > do not exist in any POSIX standard-- nor does Linux itself provide
> > support for URIs, as far as I am aware.
> There's an XDG standard. It requires that the contents of the URL be the 
> exact byte value passed to the system calls, regardless of encoding or 
> locale.
> [RFC 3987 requires URIs to be in UTF-8, which means that practically the 
> only valid encoding on Linux now is UTF-8.]

I think there is some sort of misunderstanding here. All valid URIs are
ASCII (non-ASCII bytes need to be percent-escaped, which makes the
result ASCII). ASCII is a subset of UTF-8, so all valid URIs are UTF-8.

RFC 3987 isn't about URIs at all, but IRIs, and it does not *require*
things to be UTF-8. All it does is *allow* UTF-8 to be in an IRI without
having to escape it. You can still create a valid IRI for a filename
that has non-UTF-8 bytes in the pathname; it will just contain hex
escapes for those bytes.
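
As a rough illustration of that last point, here is a minimal sketch in
Python (standard library only). The helper names path_to_file_uri and
file_uri_to_path are made up for this example, and the host part of the
URI is assumed to be empty. It is not how any particular desktop library
does it; it only shows that a filename with non-UTF-8 bytes still ends
up as a pure-ASCII URI and round-trips byte for byte.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

FILE_PREFIX = "file://"  # assumption: empty host component


def path_to_file_uri(raw_path: bytes) -> str:
    """Percent-escape a raw byte path into an ASCII-only file: URI.

    Hypothetical helper: every byte outside the unreserved set
    (except '/') is written as a %XX hex escape.
    """
    return FILE_PREFIX + quote_from_bytes(raw_path, safe="/")


def file_uri_to_path(uri: str) -> bytes:
    """Undo the hex escapes and return the exact bytes of the path.

    Hypothetical helper: no locale or encoding is applied, so the
    result can be handed to the system calls unchanged.
    """
    if not uri.startswith(FILE_PREFIX):
        raise ValueError("not a file: URI with an empty host")
    return unquote_to_bytes(uri[len(FILE_PREFIX):])


# 0xE9 is "e acute" in Latin-1; on its own it is not valid UTF-8.
raw = b"/tmp/caf\xe9.txt"
uri = path_to_file_uri(raw)

# The URI contains only ASCII characters, so this cannot raise.
uri.encode("ascii")

# Decoding gives back the original bytes.
roundtrip = file_uri_to_path(uri)

# Semicolons are not treated as parameters here; they stay in the path.
semi = file_uri_to_path("file:///etc;foo/config;bar/some.conf;hi")
```

Here uri contains the hex escape %E9 in place of the raw byte, and semi
is the literal path /etc;foo/config;bar/some.conf;hi from earlier in the
thread.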
