[CREATE] Clipping Path names in TIFF files

Simon Budig simon at gimp.org
Sun May 27 17:15:07 PDT 2007


Hi all.

As some of you may be aware, I've been implementing support for clipping
paths over the last few days. It seems to more or less work: Scribus can
read the paths written by GIMP, and GIMP can read paths written by e.g.
Photoshop.

Now a small problem has come up regarding the names of the paths. The
background is that TIFF stores these paths in the same format as the old
PSD6 file format does. That format specifies the names as "Pascal
strings" (a length byte followed by the string data), which basically
shows how old the format is.

In particular, there is no information available about the encoding used
to store non-ASCII characters in these up to 255 bytes.

In fact, not even the current Photoshop CS3 beta is able to handle this
problem properly. If you have a path that has e.g. Cyrillic letters in
its name, store it in a TIFF file and load that file back into
Photoshop, all your Cyrillic letters are replaced with a '?'.

I think we can do better, but we need to agree on a handling of the
names.

While discussing this with Franz Schmid - who is responsible for the
path handling in Scribus - we came up with this idea:

- try to encode the name as iso8859-15, which seems to be what Photoshop
  uses by default (or is it windows 1250? That is a bit hazy at the
  moment and needs some more tests).

- if this is not possible, because the string contains characters
  outside the iso8859-15 range, we just use UTF-8 and prepend
  0xff 0xfe (the byte sequence of the UTF-16 little-endian byte order
  mark) to make it a bit easier to detect a "free software style"
  encoded path name.
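
The two steps above could be sketched roughly like this in Python. This is
only an illustration of the proposal, not code from GIMP or Scribus: the
function names are made up, iso8859-15 is assumed as the legacy encoding
(which the tests mentioned above may still change), and any padding the
file format may require around the Pascal string is left out.

```python
MARKER = b"\xff\xfe"   # marker bytes from the proposal
LEGACY = "iso8859_15"  # assumed legacy encoding, still to be confirmed


def encode_path_name(name: str) -> bytes:
    """Return the name as a Pascal string (length byte + payload)."""
    try:
        # Step 1: prefer the legacy 8-bit encoding.
        payload = name.encode(LEGACY)
    except UnicodeEncodeError:
        # Step 2: fall back to UTF-8, flagged by the marker bytes.
        payload = MARKER + name.encode("utf-8")
    if len(payload) > 255:
        raise ValueError("path name does not fit into a Pascal string")
    return bytes([len(payload)]) + payload


def decode_path_name(data: bytes) -> str:
    """Inverse of encode_path_name for data starting at the length byte."""
    length = data[0]
    payload = data[1:1 + length]
    if payload.startswith(MARKER):
        return payload[len(MARKER):].decode("utf-8")
    return payload.decode(LEGACY)
```

A name such as "Pfad 1" stays in the legacy encoding, while a Cyrillic
name like "Путь" gets the marker and is stored as UTF-8. Both survive a
round trip through these two functions.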

I'd like to hear what other people think about this proposal. Any
alternative suggestions? Do you like the idea?

I could use some help with this problem:

The name "€name¤§@"  (EURO, "name", CURRENCY, PARAGRAPH, @) gets encoded
as 80 6E 61 6D 65 A4 A7 40. Can someone identify this encoding? Encoding
the EURO as 0x80 seems to hint at some windows code page, but the
currency symbol seems to contradict this.
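
One way to narrow this down is to encode the name with a list of candidate
encodings and compare the result with the observed bytes. The sketch below
does that in Python; the candidate list and the function name are
arbitrary choices for illustration and nowhere near exhaustive.

```python
# Bytes observed in the file, as quoted above.
observed = bytes.fromhex("80 6E 61 6D 65 A4 A7 40")

# EURO, "name", CURRENCY, PARAGRAPH (section sign), @
name = "\u20acname\u00a4\u00a7@"

# Candidate codecs to try (an assumption; extend as needed).
candidates = ["cp1250", "cp1252", "iso8859_1", "iso8859_15", "mac_roman"]


def matching_encodings(text, raw, encodings):
    """Return the encodings under which text encodes exactly to raw."""
    matches = []
    for enc in encodings:
        try:
            if text.encode(enc) == raw:
                matches.append(enc)
        except UnicodeEncodeError:
            # The encoding cannot represent every character of the name.
            pass
    return matches


matches = matching_encodings(name, observed, candidates)
print(matches)
```

With Python's codec tables, cp1252 is among the matches: it has the EURO
at 0x80 and keeps the currency sign at 0xA4, as iso8859-1 does. It is
iso8859-15 that puts the EURO at 0xA4 instead, so it cannot encode the
currency sign at all.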

If this turns out to be a non-standard encoding, I suggest we just try
encoding to iso8859-1, since this is a pretty close "standard" encoding
(we might need to do some more tests here to really settle this).

Also: Could someone try to give clipping paths complicated names,
create TIFF files on Windows and Mac, and check whether the names show
up the same on the other platform?

On a more theoretical note, the current proposal has a problem with path
names starting with ÿþ (YDIAERESIS, THORN), since these two characters
encode to exactly the 0xff 0xfe bytes we use as the Unicode marker. Can
anyone imagine this being a real problem?
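
To make the clash concrete, here is a small Python sketch. It again assumes
iso8859-15 as the legacy encoding, and the helper names are hypothetical.
The `safe_encode` variant shows one possible workaround, not something
that has been agreed on: a writer can always use the marked UTF-8 form
when the legacy bytes would start with the marker.

```python
MARKER = b"\xff\xfe"   # marker bytes from the proposal
LEGACY = "iso8859_15"  # assumed legacy encoding


def naive_decode(payload: bytes) -> str:
    """Decode a name payload the way the proposal describes."""
    if payload.startswith(MARKER):
        return payload[len(MARKER):].decode("utf-8")
    return payload.decode(LEGACY)


def safe_encode(name: str) -> bytes:
    """Encode a name, avoiding legacy output that looks like the marker."""
    try:
        payload = name.encode(LEGACY)
    except UnicodeEncodeError:
        return MARKER + name.encode("utf-8")
    if payload.startswith(MARKER):
        # Legacy bytes would be mistaken for the marker: use UTF-8 instead.
        return MARKER + name.encode("utf-8")
    return payload


clash = "\u00ff\u00fename"             # the name "ÿþname"
legacy_bytes = clash.encode(LEGACY)    # FF FE 6E 61 6D 65
misread = naive_decode(legacy_bytes)   # the leading "ÿþ" is lost
roundtrip = naive_decode(safe_encode(clash))
```

Here `misread` comes out as "name", while `roundtrip` is the original
"ÿþname". This only helps for files we write ourselves; a legacy-encoded
name starting with ÿþ written by other software would still be misread.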

Thanks,
        Simon
-- 
              simon at budig.de              http://simon.budig.de/

