[PATCH] dim: decode email message content charset to unicode
Jani Nikula
jani.nikula at intel.com
Wed Oct 28 07:46:39 UTC 2020
On Tue, 27 Oct 2020, Rodrigo Vivi <rodrigo.vivi at intel.com> wrote:
> On Mon, Oct 26, 2020 at 12:21:24PM +0200, Jani Nikula wrote:
>> On Wed, 16 Sep 2020, Rodrigo Vivi <rodrigo.vivi at intel.com> wrote:
>> > On Wed, Sep 16, 2020 at 12:57:43PM +0300, Jani Nikula wrote:
>> >> Email messages need two levels of decoding: First, content transfer
>> >> encoding, such as base64 or quoted-printable. Second, charset decoding.
>> >>
>> >> We've done the first (with part.get_payload(decode=True)), but we've
>> >> ignored the charset. Mostly, it has not mattered, since most email is
>> >> ascii or utf-8 anyway, and python2 has been relaxed about it. However,
>> >> python3 part.get_payload(decode=True) gives us binary instead of
>> >> unicode, so we also need to do the charset decoding to get the result we
>> >> want.
>> >>
>> >> The problem has likely been observed only now that 'python' no longer
>> >> exists or points at python3 instead of python2.
>> >>
>> >> Use part.get_content_charset() for charset decoding, defaulting to
>> >> 'us-ascii' source charset if nothing is specified.
>> >>
>> >> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> >> Cc: Daniel Vetter <daniel at ffwll.ch>
>> >> Signed-off-by: Jani Nikula <jani.nikula at intel.com>
>> >
>> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> > Tested-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> >
>> > (Although it continues to fail with the encoded email)
>>
>> Thanks, pushed, though still work to do I guess. :/
>
> yeap... it also fails with recent gvt-fixes pull request :(
Except this is an altogether different issue. The mail parsing works
just fine.
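
For reference, the two decoding levels from the patch quoted below can be
tried standalone. This is only a minimal sketch, not the code in dim:
text_parts() and the sample message are made up for illustration, and it
parses from a string rather than a file.

```python
import base64
import email

def text_parts(raw):
    """Yield the decoded text of every text/plain part (hypothetical helper)."""
    msg = email.message_from_string(raw)
    for part in msg.walk():
        if part.get_content_type() == 'text/plain':
            # Level 1: undo the content transfer encoding; python3 returns bytes.
            payload = part.get_payload(decode=True)
            # Level 2: charset decoding, falling back to us-ascii if unspecified.
            charset = part.get_content_charset(failobj='us-ascii')
            yield payload.decode(charset)

# Made-up sample: a base64 encoded utf-8 body containing a non-ascii character.
body = base64.b64encode('J\u00e4ni'.encode('utf-8')).decode('ascii')
raw = (
    'Content-Type: text/plain; charset="utf-8"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    + body + '\n'
)

for text in text_parts(raw):
    print(text)
```

Without the second step, python3 would print the bytes object instead of
the text.
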
> Pulling https://github.com/intel/gvt-linux tags/gvt-fixes-2020-10-27 ...
> From https://github.com/intel/gvt-linux
> * tag gvt-fixes-2020-10-27 -> FETCH_HEAD
> dim: 401ccfa87856 ("drm/i915/gvt: Only pin/unpin intel_context along with workload"): Subject in fixes line doesn't match referenced commit:
> dim: e6ba76480299 (drm/i915: Remove i915->kernel_context)
> dim: ERROR: issues in commits detected, aborting
>
>
> $ git log e6ba76480299 -1 --format="%s"
> drm/i915: Remove i915->kernel_context
This is a valid complaint.
This is what's in the pull request:
$ git show 401ccfa87856 | grep Fixes
Fixes: e6ba76480299 (drm/i915: Remove i915->kernel_context)
And this is what it should have:
$ dim fixes e6ba76480299 | grep Fixes
Fixes: e6ba76480299 ("drm/i915: Remove i915->kernel_context")
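
The only difference is the double quotes around the subject. As a rough
illustration (this is not how dim implements the check; FIXES_RE,
format_fixes() and subject_matches() are hypothetical names), the expected
format could be sketched as:

```python
import re

# Assumed format: abbreviated sha, then the subject in parentheses AND
# double quotes, as in the 'dim fixes' output above.
FIXES_RE = re.compile(r'^Fixes: ([0-9a-f]{12,40}) \("(.*)"\)$')

def format_fixes(sha, subject):
    """Build a Fixes: line with a 12 character sha and a quoted subject."""
    return 'Fixes: {} ("{}")'.format(sha[:12], subject)

def subject_matches(line, subject):
    """True if the line has the quoted form and the subject is identical."""
    m = FIXES_RE.match(line)
    return m is not None and m.group(2) == subject

subject = 'drm/i915: Remove i915->kernel_context'
bad = 'Fixes: e6ba76480299 (drm/i915: Remove i915->kernel_context)'
good = format_fixes('e6ba76480299', subject)

print(subject_matches(bad, subject))
print(subject_matches(good, subject))
```

The line from the pull request fails such a match because of the missing
quotes, while the generated line passes.
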
BR,
Jani.
>
>>
>> BR,
>> Jani.
>>
>>
>> >
>> > Thanks,
>> > Rodrigo.
>> >
>> >> ---
>> >> dim | 2 +-
>> >> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/dim b/dim
>> >> index c3a048db8956..3f489976c6bc 100755
>> >> --- a/dim
>> >> +++ b/dim
>> >> @@ -447,7 +447,7 @@ def print_msg(file):
>> >>      msg = email.message_from_file(file)
>> >>      for part in msg.walk():
>> >>          if part.get_content_type() == 'text/plain':
>> >> -            print(part.get_payload(decode=True))
>> >> +            print(part.get_payload(decode=True).decode(part.get_content_charset(failobj='us-ascii')))
>> >>
>> >>  print_msg(open('$1', 'r'))
>> >>  EOF
>> >> --
>> >> 2.20.1
>> >>
>>
>> --
>> Jani Nikula, Intel Open Source Graphics Center
--
Jani Nikula, Intel Open Source Graphics Center