devkit-disks: Allowed ntfs-3g mount options

David Zeuthen david at fubar.dk
Mon Jun 15 14:33:37 PDT 2009


Hi,

On Mon, 2009-06-15 at 11:37 -0700, Dan Nicholson wrote:
> What about the codepage/iocharset/utf8 type options? Probably everyone
> agrees that everything should be unicode, but that doesn't govern
> what's already on someone's USB key today. I know this was a big issue
> for some people.

Hmm, I remember thinking hard about this some time ago and came to the
conclusion that it's not about what's _on disk_, it's about people using
non-UTF8 locales. And it's not something we need to throw UI real estate
at.

Let's look at file systems one by one:

 - FAT
   - short file names: 8-bit, using the OEM codepage
   - long file names: uses 16-bit unicode (UCS-2, not UTF-16 IIRC)

 - NTFS
   - using 16-bit unicode (UTF-16, not UCS-2 I think)

 - ISO9660
   - IIRC plain iso9660 is ASCII
   - Most people use Joliet which means filenames are in UCS-2

 - UDF
   - Specifies nine different encodings that can be used

 - HFS+
   - some form of UTF-16

 - Linux/Unix filesystems (e.g. ext2, ext3, btrfs, ZFS, xfs, ...)
   - the encoding is not specified...
   - ...and there's no way to specify mount options to
     translate file names
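
To make the FAT entry concrete, here is a small sketch of one name stored
both ways. The example filename and the choice of cp850 as the OEM code
page are assumptions for illustration only; a real disk does not record
which code page was used.

```python
# Long file name: 16-bit units on disk. For characters in the Basic
# Multilingual Plane, UCS-2 and UTF-16 produce the same bytes.
long_name = "Rødgrød.txt"
lfn_bytes = long_name.encode("utf-16-le")
assert lfn_bytes.decode("utf-16-le") == long_name

# Short file name: one byte per character in the OEM code page.
# cp850 is an assumed code page here.
short_name = "RØDGRØD.TXT"
oem_bytes = short_name.encode("cp850")

# Decoding with the code page that wrote the name recovers it...
right = oem_bytes.decode("cp850")
# ...while guessing a different OEM code page silently gives other characters.
wrong = oem_bytes.decode("cp437")

print(right)
print(wrong)
```

The long name survives because its encoding is fixed by the format; the
short name depends on a guess.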

So the moral here is something like this: If we know that the system is
using a UTF-8 locale (e.g. like most modern distros) then filesystems
mounted by DeviceKit-disks with no options will work out of the box with
the following two caveats:

 a. FAT short file names may be wrong (we don't _know_ what the OEM
    code page was)

 b. Linux file systems are as-is; filename encoding is not specified
    so if you use a filesystem with filenames in ISO-8859-1
    then they will look funny on a UTF-8 system [1].
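
Caveat b. can be reproduced without any filesystem at all. The sketch
below assumes a made-up filename that was written as ISO-8859-1 bytes and
is then read by a program that expects UTF-8:

```python
# Bytes as they would sit in a directory entry written under ISO-8859-1.
name = "blåbærgrød.txt"
on_disk = name.encode("iso-8859-1")

# A UTF-8 system tries to interpret the same bytes as UTF-8. The bytes for
# å, æ and ø are not valid UTF-8 here, so each becomes U+FFFD.
shown = on_disk.decode("utf-8", errors="replace")

# Only someone who knows the original encoding can recover the name.
recovered = on_disk.decode("iso-8859-1")

print(shown)
print(recovered)
```

Nothing on disk says which decoding is the right one, which is the whole
problem.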

I would argue that a. doesn't really matter. We can't really do anything
at all about b. - personally I think it's totally busted that
filesystems don't specify filename encoding. I guess that's just
UNIX/Linux for you :-/

Anyway, people still complain. Problems typically arise only when
people are using non-UTF8 locales.

E.g. suppose I'm running in the da_DK.ISO8859-1 locale. Then I insert a
USB disk and the automounter in my session invokes FilesystemMount() on
DeviceKit-disks. Then the FAT file system is mounted using 'utf8' and I
end up seeing weird characters in my terminal and file manager.

The solution for that is simple. The automounter (e.g. Nautilus if you
are using GNOME) should simply pass the mount options

 utf8=0 iocharset=iso8859-1

as arguments to the FilesystemMount() call. It knows how to do this
because it knows what locale it is running in. [2]
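
A minimal sketch of how an automounter might derive those options from
its locale. Everything here is hypothetical: vfat_mount_options() is an
invented helper, it only handles UTF-8 and the ISO-8859-N family, and it
spells the charset as iso8859-N on the assumption that this is the name
the Linux NLS tables expect.

```python
import re


def vfat_mount_options(codeset):
    """Hypothetical helper: map a locale codeset to FAT mount options.

    Returns a list of option strings that a caller could pass along
    with a mount request.
    """
    normalized = codeset.strip().lower()
    if normalized in ("utf-8", "utf8"):
        # UTF-8 locale: pass nothing and rely on the defaults.
        return []
    match = re.fullmatch(r"iso-?8859-(\d{1,2})", normalized)
    if match:
        # Assumed kernel NLS spelling: "iso8859-1", "iso8859-15", ...
        return ["utf8=0", "iocharset=iso8859-" + match.group(1)]
    # Other codesets would need their own mapping; refuse to guess.
    raise ValueError("no known mount options for codeset: " + codeset)


print(vfat_mount_options("ISO-8859-1"))
print(vfat_mount_options("UTF-8"))
```

In a real session the codeset would come from the locale itself, e.g.
locale.nl_langinfo(locale.CODESET) in Python after calling
locale.setlocale(locale.LC_ALL, "").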

In conclusion, I don't think we need to expose any options for this.

     David

[1] : That's why there are things like G_FILENAME_ENCODING and
G_BROKEN_FILENAMES to support GLib and GTK+ apps, cf.

http://library.gnome.org/devel/glib/unstable/glib-running.html

so at least programmers only need to care about the one true encoding:
UTF-8 ;-)

[2] : Of course this isn't implemented. But file a bug and add me
(david at fubar.dk) and I'll help make this happen.



More information about the devkit-devel mailing list