[poppler] Regression in text extraction
Adrian Johnson
ajohnson at redneon.com
Mon Jun 30 05:42:41 PDT 2008
Albert Astals Cid wrote:
> Right, the attached patch should fix the problem, can you test?
Works for me.
> Also can you please send a URL to a PDF where ActualText gives a different
> output than "classical" text extraction?
There are some sample PDFs at:
http://www.unicode.org/udhr/
such as:
http://www.unicode.org/udhr/d/udhr_san.pdf
Or if you prefer something small, simple, and in English for testing, I
have created a PDF that uses ActualText to change a couple of the words
in the extracted text from what is displayed:
http://people.freedesktop.org/~ajohnson/actualtext.pdf
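
For readers unfamiliar with the feature, here is a minimal sketch of why the two extraction modes can disagree. It assumes a simplified, hypothetical content-stream fragment (not taken from actualtext.pdf) in which a marked-content span carries an /ActualText entry. The function names and regular expressions are illustrative only and are not poppler's implementation; real PDFs also use escaped parentheses, hex strings, TJ arrays, and UTF-16BE ActualText values, none of which this toy handles.

```python
import re

# Hypothetical, simplified content-stream fragment (an assumption for illustration).
# The glyphs shown on the page spell "planet", but the span's ActualText says "world".
CONTENT = (
    "BT /F1 12 Tf 72 720 Td "
    "(Hello ) Tj "
    "/Span <</ActualText (world)>> BDC (planet) Tj EMC "
    "ET"
)

# A literal string operand followed by the Tj (show text) operator.
SHOW_RE = re.compile(r"\(([^()]*)\)\s*Tj")

# A marked-content span whose property dictionary holds only /ActualText.
SPAN_RE = re.compile(
    r"/Span\s*<<\s*/ActualText\s*\(([^()]*)\)\s*>>\s*BDC(.*?)EMC", re.S
)


def displayed_text(stream):
    """'Classical' extraction: concatenate every Tj operand, ignoring ActualText."""
    return "".join(SHOW_RE.findall(stream))


def actual_text(stream):
    """Substitute each span's ActualText for its shown text, then extract."""
    replaced = SPAN_RE.sub(lambda m: "(" + m.group(1) + ") Tj", stream)
    return displayed_text(replaced)


if __name__ == "__main__":
    print("displayed:", displayed_text(CONTENT))
    print("extracted:", actual_text(CONTENT))
```

Under these assumptions the two functions return different strings for the same fragment, which is the kind of difference a test PDF for ActualText is meant to expose.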