[packagekit] Character encoding woes

Wed Nov 7 23:44:52 PST 2007

On Thu, 2007-11-08 at 00:38 -0500, Matthias Clasen wrote:
> On Nov 7, 2007 11:54 PM, James Bowes <jbowes at dangerouslyinc.com> wrote:
> > Hi all:
> >
> > The package iwl3945-firmware in Fedora contains a copyright symbol in
> > the description. By default, when writing to a pipe, Python will encode
> > in ASCII, raising an exception when it hits this character. We can get
> > around this by using some hacks to wrap sys.stdout/stderr in
> > codecs.getwriter('utf-8'), but even then the backend or frontend does
> > not display the results.
> >
> > I'm guessing this will require changing some things from vanilla
> > character pointers to Unicode strings, but I'm not sure. Does anyone
> > have any ideas?
> 
> Not being a Python hacker, I don't have any ideas for a solution, but
> I just wanted to confirm that PackageKit clearly needs to handle
> summaries, descriptions and file lists containing non-ASCII UTF-8
> text. That is inevitable when dealing with translated descriptions.

Totally agree. Can we add some unit tests for this in PkSpawn? I admit
I'm a bit of a newbie with UTF-8 and Unicode, so an in-depth explanation
would be terrific. Thanks.
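
For reference, here is a minimal sketch of the codecs.getwriter workaround
James describes, written for a current Python 3 interpreter rather than
the Python of this thread. It is only an illustration, not the backend's
actual code: the helper name utf8_writer, the sample description string
and the in-memory BytesIO stand-in for the pipe are all assumptions made
for the example.

```python
import codecs
import io


def utf8_writer(binary_stream):
    """Wrap a byte stream so that text written to it is encoded as UTF-8.

    Hypothetical helper for illustration; not part of PackageKit.
    """
    return codecs.getwriter("utf-8")(binary_stream)


# Made-up description containing a copyright sign (U+00A9).
description = "Firmware \u00a9 Intel Corporation"

# Encoding to ASCII fails on the copyright sign, which is the
# exception described at the top of the thread.
try:
    description.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# An in-memory byte buffer stands in for the pipe to the daemon.
pipe = io.BytesIO()
out = utf8_writer(pipe)
out.write(description + "\n")

# The reading side has to decode the bytes as UTF-8 to recover the text.
received = pipe.getvalue().decode("utf-8")
```

In a real script the same wrapper would presumably go around the process's
standard output (sys.stdout.buffer on Python 3). The reading side still has
to treat the incoming bytes as UTF-8, which may be why the results are not
displayed yet.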

Richard.