[PATCH] dim: decode email message content charset to unicode

Jani Nikula jani.nikula at intel.com
Tue Dec 15 12:31:03 UTC 2020


On Tue, 15 Dec 2020, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Tue, Dec 15, 2020 at 12:26 PM Jani Nikula <jani.nikula at intel.com> wrote:
>>
>> On Tue, 15 Dec 2020, Daniel Vetter <daniel at ffwll.ch> wrote:
>> > Adding Thomas too.
>> >
>> > On Tue, Dec 15, 2020 at 10:23 AM Daniel Vetter <daniel at ffwll.ch> wrote:
>> >>
>> >> On Wed, Nov 4, 2020 at 9:33 AM Jani Nikula <jani.nikula at intel.com> wrote:
>> >> >
>> >> > On Wed, 04 Nov 2020, Dave Airlie <airlied at gmail.com> wrote:
>> >> > > is this why I get
>> >> > > dim apply-pull drm-next < /tmp/PULL-drm-intel-next-queued.patch
>> >> > > Traceback (most recent call last):
>> >> > >   File "<stdin>", line 9, in <module>
>> >> > >   File "<stdin>", line 7, in print_msg
>> >> > > UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
>> >> > > position 1256: ordinal not in range(128)
>> >> > >
>> >> > > now?
>> >> > >
>> >> > > just taking the pull request patch from patchwork
>> >> > > https://patchwork.freedesktop.org/patch/398659/
>> >> >
>> >> > *sigh*
>> >> >
>> >> > When the message left here, and also when a copy arrived through a round
>> >> > trip via the mailing list, it had Content-Transfer-Encoding:
>> >> > quoted-printable, and the decoding works fine on the local copies, on
>> >> > both python2 and python3.
>> >> >
>> >> > The message from patchwork has Content-Transfer-Encoding: 8bit,
>> >> > i.e. patchwork modified the encoding, and the decoding fails on
>> >> > python3 due to invalid characters. Python2 is less picky.
>> >> >
>> >> > With the change reverted, message_print_body() prints the message as
>> >> > binary without decoding on python3. I don't know if that works by
>> >> > coincidence.
>> >> >
>> >> > Everything also seems to work on the mbox downloaded from Lore [1], can
>> >> > you please use that in the meantime?
>> >>
>> >> gmail seems to do the same mangling, at least my local mailbox also
>> >> has issues. And it's with all of Thomas' pull requests. Pulling from
>> >> lore is kinda awkward.
>> >>
>> >> Any ideas?
>>
>> Isn't this fixed by
>>
>> commit 03f281de0f9175875b8d4da0a43d9d288debb228
>> Author: Jani Nikula <jani.nikula at intel.com>
>> Date:   Wed Nov 18 15:11:03 2020 +0200
>>
>>     dim: replace message characters leading to decoding errors with U+FFFD
>
> Nope, I had that one already. Simon debugged it, apparently the problem
> is even earlier in the python magic.

Odd. I tested a number of combinations, and that fixed it for me.

What's the failure mode? Backtrace? What?
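
For reference when comparing notes, here is a minimal Python 3 sketch of the
kind of decoding the commit subject above describes: undo the
Content-Transfer-Encoding, then decode with the declared charset and
substitute U+FFFD for bytes that do not decode. This is not the actual dim
code; the decode_body name and the fallback to utf-8 are assumptions made
for illustration only.

```python
import email


def decode_body(raw_bytes):
    """Return the first text part of a raw message as str.

    Hypothetical helper, not taken from dim. Bytes that are invalid in
    the declared charset become U+FFFD instead of raising an error.
    """
    msg = email.message_from_bytes(raw_bytes)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes quoted-printable/base64 and returns bytes
        payload = part.get_payload(decode=True)
        # Assumption: fall back to utf-8 when no charset is declared
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name in the header
            return payload.decode("utf-8", errors="replace")
    return ""


if __name__ == "__main__":
    # An 8bit body carrying a byte that is invalid in the declared charset,
    # similar in spirit to the patchwork copy discussed above.
    raw = (b"Content-Type: text/plain; charset=us-ascii\n"
           b"Content-Transfer-Encoding: 8bit\n\n"
           b"caf\xe9\n")
    print(repr(decode_body(raw)))
```

If a message still fails with something like this in place, the exception is
presumably raised before the payload is decoded, e.g. while reading or
printing the message, which is why the backtrace would help.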

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center

