[PATCH] dim: replace message characters leading to decoding errors with U+FFFD

Daniel Vetter daniel at ffwll.ch
Tue Nov 24 14:01:11 UTC 2020


On Mon, Nov 23, 2020 at 10:37 AM Jani Nikula <jani.nikula at intel.com> wrote:
>
>
> Anyone care to review, please?
>
> For convenience, see [1] and [2] for what's going on.
>
> BR,
> Jani.
>
>
> [1] https://docs.python.org/3/howto/unicode.html#the-string-type
> [2] https://docs.python.org/3/library/stdtypes.html#bytes.decode

Thanks for the pointers.

Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
>
>
> On Wed, 18 Nov 2020, Jani Nikula <jani.nikula at intel.com> wrote:
> > The character set decoding added in commit b66d07db11e5 ("dim: decode
> > email message content charset to unicode") started raising Unicode
> > decoding errors under certain conditions (specifically, Python 3 with
> > mboxes downloaded from patchwork).
> >
> > Instead of raising UnicodeDecodeError, replace bytes that can't be
> > decoded with U+FFFD (REPLACEMENT CHARACTER, �).
> >
> > Reported-by: Dave Airlie <airlied at gmail.com>
> > Cc: Dave Airlie <airlied at gmail.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > Signed-off-by: Jani Nikula <jani.nikula at intel.com>
> > ---
> >  dim | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/dim b/dim
> > index 1be1435a1a52..1572cf33f25c 100755
> > --- a/dim
> > +++ b/dim
> > @@ -460,7 +460,7 @@ def print_msg(file):
> >      msg = email.message_from_file(file)
> >      for part in msg.walk():
> >          if part.get_content_type() == 'text/plain':
> > -            print(part.get_payload(decode=True).decode(part.get_content_charset(failobj='us-ascii')))
> > +            print(part.get_payload(decode=True).decode(part.get_content_charset(failobj='us-ascii'), 'replace'))
> >
> >  print_msg(open('$1', 'r'))
> >  EOF
>
> --
> Jani Nikula, Intel Open Source Graphics Center
> _______________________________________________
> dim-tools mailing list
> dim-tools at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dim-tools
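
For anyone who wants to see the effect outside of dim, here is a minimal
standalone sketch of the same decode call. The sample message, the
plain_text_parts() helper and its errors parameter are made up for
illustration and are not part of dim; the sample assumes a part that
declares us-ascii but carries Latin-1 bytes, which is one way to hit the
failure.

```python
import email

# Hypothetical sample: the part declares us-ascii, but the
# quoted-printable body carries two Latin-1 bytes (=E4).
SAMPLE = (
    "From: someone@example.invalid\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Ville Syrj=E4l=E4\n"
)


def plain_text_parts(msg, errors="replace"):
    """Return the decoded text/plain parts of msg as a list of str."""
    texts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            charset = part.get_content_charset(failobj="us-ascii")
            # decode=True undoes the transfer encoding and yields bytes
            payload = part.get_payload(decode=True)
            texts.append(payload.decode(charset, errors))
    return texts


msg = email.message_from_string(SAMPLE)

# Strict decoding (the pre-patch behaviour) raises on the 0xE4 bytes.
try:
    plain_text_parts(msg, errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With 'replace', each undecodable byte becomes U+FFFD instead.
lossy = plain_text_parts(msg)
```

The 'replace' handler only covers bytes that the named codec cannot
decode. A charset name that Python does not know at all would still raise
LookupError from bytes.decode(), which this sketch does not handle.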



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

