[poppler] Regression in text extraction

Adrian Johnson ajohnson at redneon.com
Mon Jun 30 05:42:41 PDT 2008


Albert Astals Cid wrote:
> Right, the attached patch should fix the problem, can you test?

Works for me.

> Also can you please send an url to a pdf where ActualText gives a different 
> output than "classical" text extraction?

There are some sample PDFs at:

   http://www.unicode.org/udhr/

such as:

   http://www.unicode.org/udhr/d/udhr_san.pdf

Or, if you prefer something small, simple, and in English for testing, I 
have created a PDF that uses ActualText so that a couple of the words in 
the extracted text differ from what is displayed:

   http://people.freedesktop.org/~ajohnson/actualtext.pdf
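
A rough way to check a file like this is to compare the words a viewer displays against the words the extraction returns. The sketch below is only an illustration: `extract_text` and `differing_words` are hypothetical helper names, it assumes the `pdftotext` utility is on the PATH, and the sample strings in the note after the code are made up, not taken from actualtext.pdf.

```python
import difflib
import subprocess


def extract_text(pdf_path):
    """Return the text pdftotext extracts from pdf_path.

    Assumes the pdftotext command-line tool is installed; passing "-"
    as the output file makes it write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def differing_words(displayed, extracted):
    """List (displayed, extracted) word runs that do not match.

    Both arguments are plain strings; the displayed text has to be
    typed in by hand from what the viewer shows.
    """
    a = displayed.split()
    b = extracted.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs
```

For example, `differing_words("The quick brown fox", "The slow brown cat")` returns `[("quick", "slow"), ("fox", "cat")]`. With ActualText honoured, the words it overrides should show up in that list; with "classical" extraction the list should be empty.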
