File encoding

Fri Dec 3 16:27:38 PST 2004

Keith Packard wrote on 2004-12-03 21:37 UTC:
> Around 21 o'clock on Dec 3, Markus Kuhn wrote:
> 
> > I'll have a go at converting xc to UTF-8. Little harm can be done where
> > only C comments and README.txt files are concerned, which is the vast
> > majority of where non-ASCII ISO 8859 characters are found in the code at
> > present.
> 
> I know you'll be careful to avoid converting files which are already in 
> UTF-8; all of my recent changes have used that encoding...

At the moment, I'm having some fun with ChangeLog files that contain ISO
8859-1, ISO 8859-2 and UTF-8 entries simultaneously! The encoding
question has clearly become messy. It is time to move to a single
encoding: UTF-8. So far I have found a mixture of ISO 8859-1, ISO 8859-2,
ISO 8859-15, ISO 8859-16, CP1257, EUC-JP and UTF-8 files in xc. In one
case, I had to contact the author to ask what the non-ASCII characters
in his name are meant to look like.
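
One way to avoid re-converting files that are already in UTF-8 is to
test whether the bytes form well-formed UTF-8 before touching them. The
following is only a minimal sketch of such a check, not the procedure
actually used for xc; the function name is_valid_utf8 is hypothetical.
It is a heuristic: pure ASCII passes trivially, and a legacy-encoded
file could in rare cases happen to be valid UTF-8 as well.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if buf[0..len) is well-formed UTF-8,
 * 0 otherwise. Rejects overlong forms, surrogates and code points
 * above U+10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t n;          /* number of continuation bytes expected */
        unsigned long cp;  /* code point decoded so far */
        unsigned long min; /* smallest code point allowed for this length */
        size_t k;

        if (c < 0x80) {                    /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {   /* 2-byte sequence */
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {   /* 3-byte sequence */
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {   /* 4-byte sequence */
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;  /* stray continuation byte or invalid lead byte */
        }

        if (len - i <= n)
            return 0;  /* sequence is truncated at end of buffer */

        for (k = 1; k <= n; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;  /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return 0;  /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;  /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return 0;  /* beyond the Unicode range */

        i += n + 1;
    }
    return 1;
}
```

A file for which this returns 0 is certainly not UTF-8 and still needs
its real encoding identified; the check cannot tell ISO 8859-1 from ISO
8859-2 or CP1257, which is where looking at the text (or asking the
author) comes in.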

There were a couple of C files that used ISO 8859-1 in char
constants. I've replaced these with hex integers.
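
To illustrate the kind of change meant here, a rough sketch follows; the
macro and function names are made up for the example and are not taken
from xc. A character constant written as a raw non-ASCII byte only has
the intended value while the source file stays in ISO 8859-1, whereas a
hex integer keeps the source pure ASCII and survives any re-encoding.

```c
/* Before (hypothetical): an e-acute typed directly between single
 * quotes. That is the single byte 0xE9 in an ISO 8859-1 source file,
 * but it becomes a two-byte sequence once the file is converted to
 * UTF-8, so the constant no longer means what was intended.
 *
 * After: the ISO 8859-1 code written as a hex integer. */
#define LATIN1_EACUTE 0xE9 /* LATIN SMALL LETTER E WITH ACUTE */

/* Hypothetical helper: compares as unsigned char so that the result
 * does not depend on whether plain char is signed on the platform. */
int is_latin1_eacute(unsigned char c)
{
    return c == LATIN1_EACUTE;
}
```

Taking the argument as unsigned char matters: with a signed plain char,
a byte of 0xE9 would be negative and would never compare equal to the
integer 0xE9.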

Markus

-- 
Markus Kuhn, Computer Lab, Univ of Cambridge, GB
http://www.cl.cam.ac.uk/~mgk25/ | __oo_O..O_oo__