[poppler] pdftotext needs support for surrogates outside the BMP plane

Albert Astals Cid aacid at kde.org
Sun Jun 1 08:28:11 PDT 2008


On Thursday 29 May 2008, Koji Otani wrote:
> Hi, All.
>
> I'd like to commit this patch to the trunk tree.
> Should I register this to Bugzilla before doing it?

No, but I'd like to confirm that "it works" before committing it. I can see
that your patch gives different output, but I don't have any font installed
on my system that can "draw" the characters. What font are you using?

Albert

> --------------
> Koji Otani.
>
> From: Ross Moore <ross at ics.mq.edu.au>
> Subject: Re: [poppler] pdftotext needs support for surrogates outside the BMP plane
> Date: Thu, 29 May 2008 09:06:24 +1000
> Message-ID: <29E1BEE5-11BA-4A5F-A881-29DFA63A7E8A at maths.mq.edu.au>
>
> ross>
> ross> On 28/05/2008, at 6:25 PM, Koji Otani wrote:
> ross> > Hi.
> ross> >
> ross> > ross> There are many pieces of software that do not regard the 6-byte
> ross> > ross> sequences as being valid UTF-8. Thus there needs to be an extra
> ross> > ross> step that translates these 2 x 3 = 6-byte sequences into the
> ross> > ross> proper UTF-8 4-byte sequence.
> ross> > ross>
> ross> > ross> Is anybody working on this kind of thing?
> ross> > ross>
> ross> >
> ross> > I've made a patch that fixes this bug, and attached it to this mail.
> ross>
> ross> Thank you very much for this.
> ross> It works brilliantly.
> ross>
> ross> The attached image shows the result of using
> ross>
> ross>       pdftotext -layout testmath.pdf
> ross>
> ross> on the example PDF from my previous message,
> ross> viewed with a standard Mac text-editor application.
> ross>
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler
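
For readers following the thread, the translation Ross describes (two 3-byte sequences, one per UTF-16 surrogate, merged into a single 4-byte UTF-8 sequence) can be sketched roughly as below. This is only an illustration in Python, not Koji's patch and not poppler code; the function names merge_surrogates and fix_six_byte_sequences are made up for the example, and the input is assumed to be otherwise valid UTF-8.

```python
def merge_surrogates(text: str) -> str:
    """Combine each high/low surrogate pair into one supplementary code point."""
    out = []
    i = 0
    while i < len(text):
        hi = ord(text[i])
        if 0xD800 <= hi <= 0xDBFF and i + 1 < len(text):
            lo = ord(text[i + 1])
            if 0xDC00 <= lo <= 0xDFFF:
                # Standard UTF-16 arithmetic: 10 bits from each surrogate,
                # offset by 0x10000 to land outside the BMP.
                out.append(chr(0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)))
                i += 2
                continue
        out.append(text[i])  # ordinary character or unpaired surrogate
        i += 1
    return "".join(out)


def fix_six_byte_sequences(data: bytes) -> bytes:
    """Rewrite 2 x 3 = 6-byte surrogate sequences as 4-byte UTF-8."""
    # "surrogatepass" lets the decoder accept the 3-byte encodings of
    # U+D800..U+DFFF, which strict UTF-8 decoding rejects.
    text = data.decode("utf-8", errors="surrogatepass")
    # Unpaired surrogates, if any, are written back unchanged.
    return merge_surrogates(text).encode("utf-8", errors="surrogatepass")
```

For instance, U+1D51E (a mathematical Fraktur letter of the kind found in a math PDF) is the surrogate pair D835 DD1E in UTF-16. Encoded one surrogate at a time it becomes the six bytes ED A0 B5 ED B4 9E, and the function above turns that into the four bytes F0 9D 94 9E.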



