[packagekit] Backend unicode improvements

Mon Mar 17 04:38:24 PDT 2008

Luke Macken wrote:
> Hey guys,
> 
> So I've made an attempt to clean up the unicode mess in the yum backend.

Thanks for doing this, Luke; it has almost driven me insane :)
> Some issues that I noticed with the current code:
>     
>     _toUTF behavior is inconsistent.  If you give it a unicode string,
>     it returns a unicode string.  If you give it a byte string, it
>     returns a byte string.  I assume we want to always return a unicode
>     object?
>     
>     We catch UnicodeDecodeErrors in multiple places.  These should never
>     happen, yet they currently do.
> 
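
For illustration only, here is a minimal sketch of a conversion helper that always returns a unicode object, whatever it is given. This is not the code from the patch: the name to_unicode and its parameters are hypothetical, and it assumes Python 3, where the thread's str/unicode pair corresponds to bytes/str. It also assumes that undecodable bytes should be replaced rather than raise, which is one way to avoid catching UnicodeDecodeError in callers.

```python
def to_unicode(value, encoding="utf-8", errors="replace"):
    """Hypothetical helper: always return text (str), never bytes."""
    if isinstance(value, str):
        # Already text: return it unchanged.
        return value
    if isinstance(value, (bytes, bytearray)):
        # Byte string: decode it. With errors="replace", invalid bytes
        # become U+FFFD instead of raising UnicodeDecodeError.
        return bytes(value).decode(encoding, errors)
    # Anything else (numbers, None, ...): fall back to its text form.
    return str(value)
```

With this shape, to_unicode(b"caf\xc3\xa9") and to_unicode("café") both give the same text object, so callers never need to check which type came back.
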
> Ideally, we should be following the 3 golden rules for handling unicode
> in Python:
> 
>     - decode (str->unicode) early
>     - unicode everywhere
>     - encode (unicode->str) late
> 
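
As a rough sketch of the three rules quoted above: decode once where bytes enter, work only on text in between, and encode once where bytes leave. The function names and the tab-separated line format are made up for the example and are not taken from the yum backend. It assumes Python 3 (bytes/str instead of the str/unicode named in the rules) and UTF-8 at both boundaries.

```python
import io


def read_summary(raw_stream, encoding="utf-8"):
    # Rule 1, decode early: turn bytes into text at the input boundary.
    return raw_stream.read().decode(encoding)


def format_line(name, summary):
    # Rule 2, unicode everywhere: internal code only ever sees text.
    return "%s\t%s" % (name, summary)


def write_line(out_stream, line, encoding="utf-8"):
    # Rule 3, encode late: turn text back into bytes at the output boundary.
    out_stream.write(line.encode(encoding) + b"\n")


if __name__ == "__main__":
    source = io.BytesIO("Éditeur de texte".encode("utf-8"))
    sink = io.BytesIO()
    write_line(sink, format_line("gedit", read_summary(source)))
```

Because encoding and decoding happen only in read_summary and write_line, format_line never mixes bytes and text, which is the situation where decode errors tend to surface.
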
> To comply with the first rule, we should probably be decoding our
> bytestrings to unicode in each helper script.  Since yum2 is in the
> works, it's probably not worth the effort to modify all of them, so my
> patch simply decodes in a couple of places in the yumBackend.
> 
> There is still a unicode hack or two lying around due to yum/rpmdb not
> handling unicode properly.  The comments suggest that these fixes will be going
> upstream to yum, so hopefully those are/will be underway.
> 
> Attached is a patch that attempts to clean up a lot of the unicode mess in the
> yum/yum2 backends.
> 
> I've tested it locally with the issues that were raised on this list
> in the past, and haven't seen any regressions.  I'd be glad to commit
> this patch, but would really appreciate some extra testing.  I'm
> currently at PyCon, and my rawhide vm seems to be on the fritz, so I
> haven't been able to test it with the yum2 backend, or with
> gnome-packagekit.

Looks good.

Tim