File URI quandy

Stephan Bergmann sbergman at
Mon May 29 08:03:06 UTC 2017

On 05/20/2017 01:38 PM, Chris Sherlock wrote:
> I’ve lately been looking at how the OSL handles URIs (as an aside, our 
> codebase is ancient, so ancient we still call them file URLs, not URIs).

I don't think there's anything wrong with using the term "file URL".
(RFC 3986 classifies URIs as locators (URLs), names (URNs), or both, and
historically URIs with "file" scheme have typically been considered URLs,
whether or not that distinction is actually useful.)

> However, I’ve hit a genuine quandry. When we convert from system paths 
> to file URIs, the RFC that details the file URI spec (RFC 8089) handles 
> everything except for system paths on POSIX systems that start with 
> double slashes. POSIX defines the behaviour of initial double slashes as 
> implementation specific, however I cannot see anywhere in the RFC where 
> it describes how to handle initial double slashes in file URIs.

I don't understand the word "quandry".

All RFC 8089 says is:  "The path component represents the absolute path 
to the file in the file system."  How exactly POSIX pathnames map to URI 
path-absolute syntax remains under-specified and open to interpretation.
(Another issue, e.g., is the semantic mismatch between ".." POSIX
filenames in the face of directory symlinks and ".." path segments as
per RFC 3986's reference resolution algorithm.)  And there are issues with
more real-world impact, like OOo/LO's design decision of always
interpreting file URL path content as UTF-8.
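
To make the double-slash question concrete, here is a rough Python sketch
using only the standard library's urllib.parse.  It is not how OSL does the
conversion; the helper name naive_file_url and the choice to percent-encode
the path as UTF-8 are assumptions made purely for illustration.  It shows one
possible reading: keep an explicit empty authority, so that a leading "//" in
the pathname ends up in the URL path rather than being parsed as a host.

```python
from urllib.parse import quote, urlsplit


def naive_file_url(posix_path: str) -> str:
    """Hypothetical helper (not LibreOffice code): percent-encode an
    absolute POSIX pathname as UTF-8 and place it after "file://",
    i.e. after an explicit, empty authority component."""
    if not posix_path.startswith("/"):
        raise ValueError("expected an absolute POSIX pathname")
    return "file://" + quote(posix_path, safe="/", encoding="utf-8")


# Ordinary pathname: empty authority, path is "/foo/bar".
single = urlsplit(naive_file_url("/foo/bar"))

# Leading double slash: the URL gets four slashes, and the path
# component keeps both leading slashes.
double = urlsplit(naive_file_url("//foo/bar"))

# Without the explicit empty authority, the same pathname parses as
# authority "foo" plus path "/bar", which is a different resource.
ambiguous = urlsplit("file:" + "//foo/bar")

# Non-ASCII filename ("\u00e4" is a-umlaut), encoded as UTF-8 octets
# by assumption; a file system using another encoding would not match.
umlaut = naive_file_url("/tmp/\u00e4")
```

Whether "file:////foo/bar" is the right URL for the pathname "//foo/bar" is
exactly the part that RFC 8089 leaves open; the sketch only shows that this
reading survives a round trip through a generic RFC 3986 style parser, while
the variant without the empty authority does not.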
