[patch] filter invalid utf-8 characters from volume labels

Sjoerd Simons sjoerd at luon.net
Wed Jul 28 14:26:20 PDT 2004


On Wed, Jul 28, 2004 at 10:33:15AM -0400, Joe Shaw wrote:
> On Wed, 2004-07-28 at 09:12 +0200, Sjoerd Simons wrote:
> > >   Attached patch replaces the invalid utf-8 characters in a volume label with ?.
> > >   I believe David is thinking about a better long-term solution, but he can 
> > >   better comment on that himself.  And of course there is the traditional
> > >   screenshot[1] :)
> 
> If we don't know what encoding the label is in, we'll never be able to
> know definitively.  Maybe it would make sense to have a filename
> encoding setting in hal.conf.  In the meantime though I'd recommend just
> assuming ISO-8859-15, and doing something like this:

This is one of those cases where I don't really care how it gets fixed as long
as it gets fixed :). Although I don't know whether guessing that it's -15 when
it's not valid utf-8 is better than assuming there is garbage in a utf-8
string.
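
To make the "replace with ?" option concrete, here is a rough sketch in plain
C. It is not the attached patch; the names utf8_seq_len and
utf8_replace_invalid are made up for illustration, and it only assumes a
NUL-terminated label that may be modified in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the well-formed UTF-8 sequence starting at s (n bytes are
 * available), or 0 if the bytes at s do not start a well-formed sequence. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;
    unsigned long cp, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
    else
        return 0;               /* stray continuation byte or invalid lead */

    if (len > n)
        return 0;               /* sequence truncated by end of string */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values above U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    return len;
}

/* Overwrite every byte that is not part of a well-formed UTF-8 sequence
 * with '?'.  Works in place; returns the number of bytes replaced. */
static size_t utf8_replace_invalid(char *str)
{
    unsigned char *p = (unsigned char *)str;
    size_t n, replaced = 0;

    assert(str != NULL);
    n = strlen(str);
    while (n > 0) {
        size_t len = utf8_seq_len(p, n);
        if (len == 0) {
            *p = '?';
            replaced++;
            len = 1;
        }
        p += len;
        n -= len;
    }
    return replaced;
}
```

Since each bad byte becomes exactly one '?', the string never grows and no
allocation is needed, which is the main attraction over transcoding.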

The problem seems very general to hal atm though. One Debian user apparently
has an MS usb mouse which has a 0xAE char (latin1 registered sign) in its
description[0]. 

Any ideas how to solve it more generally? It's possible to add some code to 
hal_device_set_property_string to ensure that the string value is always valid 
utf-8. But that doesn't feel right; on the other hand, ``fixing'' every place
where hal sets a string property with information from the outside is a lot of
work and probably error-prone.
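
A rough sketch of what such a central check could look like, in plain C. The
helper name property_string_to_utf8 is hypothetical and is not part of hal; the
idea is that a setter like hal_device_set_property_string would pass incoming
values through it. The fallback here assumes ISO-8859-1 rather than -15,
because the Latin-1 mapping is a simple bit shuffle; -15 would additionally
need a small table for the eight positions where it differs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the NUL-terminated string s is well-formed UTF-8, else 0. */
static int is_valid_utf8(const unsigned char *s)
{
    while (*s) {
        unsigned long cp, min;
        int extra;

        if (*s < 0x80) { s++; continue; }
        else if ((*s & 0xE0) == 0xC0) { extra = 1; cp = *s & 0x1F; min = 0x80; }
        else if ((*s & 0xF0) == 0xE0) { extra = 2; cp = *s & 0x0F; min = 0x800; }
        else if ((*s & 0xF8) == 0xF0) { extra = 3; cp = *s & 0x07; min = 0x10000; }
        else
            return 0;
        s++;
        while (extra-- > 0) {
            /* A NUL terminator fails this test too, so we never overrun. */
            if ((*s & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (*s++ & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: return a newly allocated, valid UTF-8 copy of value.
 * Valid input is copied unchanged; anything else is assumed to be
 * ISO-8859-1 and transcoded.  The caller frees the result. */
static char *property_string_to_utf8(const char *value)
{
    const unsigned char *in = (const unsigned char *)value;
    unsigned char *out, *p;

    assert(value != NULL);
    /* Worst case: every input byte becomes two output bytes. */
    out = malloc(2 * strlen(value) + 1);
    if (out == NULL)
        return NULL;
    if (is_valid_utf8(in)) {
        strcpy((char *)out, value);
        return (char *)out;
    }
    for (p = out; *in; in++) {
        if (*in < 0x80) {
            *p++ = *in;
        } else {
            *p++ = (unsigned char)(0xC0 | (*in >> 6));
            *p++ = (unsigned char)(0x80 | (*in & 0x3F));
        }
    }
    *p = '\0';
    return (char *)out;
}
```

With this, the mouse description byte 0xAE would come out as the two bytes
0xC2 0xAE. The catch is the one already mentioned: a string that mixes valid
utf-8 with a stray garbage byte gets treated as Latin-1 as a whole, so its
valid multi-byte characters end up double-encoded.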

  Sjoerd
0: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=261462
-- 
Those who can, do; those who can't, write.
Those who can't write work for the Bell Labs Record.
_______________________________________________
hal mailing list
hal at freedesktop.org
http://freedesktop.org/mailman/listinfo/hal


