[Clipart] SVG::Metadata re-engineering

Bryce Harrington bryce at bryceharrington.com
Tue Apr 5 10:34:21 PDT 2005


On Mon, Apr 04, 2005 at 07:19:34PM -0400, Jonadab the Unsightly One wrote:
> Bryce Harrington <bryce at bryceharrington.com> writes:
> > Btw, iirc, that encode_entities() bit was added in there at someone
> > else's request (Jonadab?) to fix a different issue
> 
> That rings a bell.  I will grep my archives...
> 
> Well, I can't find the specific message at the moment, but no matter.
> I'm pretty sure the problem we intended to solve with this was the
> character set issue, and it fundamentally isn't solving that.
>
> Note, however, that unless I'm missing something, we still do need to
> encode angle brackets, ampersands, and quote marks.  This can be done
> by passing encode_entities a second argument that is a string
> consisting of the characters that should be encoded, i.e., q('"<>&)

Ah-ha, that was it - we had some images with &'s in them that were
screwing things up.  E.g., 'Fruits & Vegetables'.  As long as the new
code also accounts for that, it should be good.  Thanks for looking into
it.
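
For reference, a minimal sketch of the restricted encoding described
above, assuming HTML::Entities is the module providing encode_entities().
The sample title and the variable names are made up for illustration;
this is not the actual SVG::Metadata code.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical title, modeled on the 'Fruits & Vegetables' case.
my $title = q{Fruits & Vegetables <"fresh">};

# The second argument lists the characters to encode. Anything not
# listed, including non-ASCII characters, is passed through unchanged.
my $safe = encode_entities($title, q{'"<>&});

print "$safe\n";   # prints: Fruits &amp; Vegetables &lt;&quot;fresh&quot;&gt;
```

Restricting the set this way keeps the XML well-formed without turning
accented characters into named HTML entities.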

> The best solution in technical terms would be to make the whole world
> switch to English and do away with non-ASCII characters altogether,
> but I think for political reasons we will have to go with some other
> solution, even if it's a suboptimal workaround ;-)

;-)

> In all seriousness, what I really need to do is hunt down a module
> that converts charsets to UTF8.  Umm...  lesse...  Unicode::MapUTF8
> seems to have something to do with that.  I'm putting its
> documentation on my reading list.

Sounds good.  Fwiw, early on Inkscape also had a lot of problems with
UTF-8 encoding errors (they caused a lot of crashes when it was run with
non-English language settings).  Jon Cruz and others added a lot of
glib's UTF-8 handling code to fix these things.  Hopefully
Unicode::MapUTF8 will be sufficient to take care of this for us.
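
A minimal sketch of the kind of conversion being discussed, using Perl's
core Encode module as a stand-in rather than Unicode::MapUTF8 (whose
interface isn't shown in this thread).  It assumes the input is known to
be ISO-8859-1; the sample string and variable names are made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed input: ISO-8859-1 bytes, here "cafe" with an e-acute (0xE9).
my $latin1_bytes = "caf\xE9";

# Decode the bytes into a Perl character string...
my $chars = decode('ISO-8859-1', $latin1_bytes);

# ...then encode those characters as UTF-8 bytes.
my $utf8_bytes = encode('UTF-8', $chars);

printf "%d bytes in, %d bytes out\n",
    length($latin1_bytes), length($utf8_bytes);   # 4 bytes in, 5 bytes out
```

The hard part in practice is knowing the source charset in the first
place; the sketch sidesteps that by assuming it.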

Bryce