[poppler] Incompatible number of glyphs from glib get_text{, layout}

Peter Waller peter at scraperwiki.com
Wed May 27 09:04:04 PDT 2015


If I drop the strings in `Gfx::doShowText()` whose font has no ToUnicode
CMap (`!font->hasToUnicodeCMap()`), I get the desired output from
poppler_page_get_text() and poppler_page_get_layout().


I do that by just returning early from `Gfx::doShowText()`.

Would a patch be welcomed that does this? I propose that OutputDev
would grow a `needUnicodeText()` which would default to false (so that
we don't influence renderers) and TextOutputDev would return true.
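
Roughly, the idea could be sketched as below. This is only an illustration
using stand-in classes, not the real poppler ones: `SketchFont`,
`SketchOutputDev`, `SketchTextOutputDev` and `sketchDoShowText` are
hypothetical names, and only `needUnicodeText()`, `hasToUnicodeCMap()` and
the early return correspond to what I describe above.

```cpp
#include <string>
#include <vector>

// Stand-in for a font; NOT the real poppler GfxFont.
struct SketchFont {
    bool toUnicode;
    bool hasToUnicodeCMap() const { return toUnicode; }
};

// Stand-in for OutputDev; NOT the real poppler class.
class SketchOutputDev {
public:
    virtual ~SketchOutputDev() = default;
    // Proposed hook: defaults to false so renderers are not influenced.
    virtual bool needUnicodeText() const { return false; }
    // Records every string that reaches the device.
    virtual void drawString(const std::string &s) { strings.push_back(s); }
    std::vector<std::string> strings;
};

// Stand-in for TextOutputDev: it opts in to Unicode-only text.
class SketchTextOutputDev : public SketchOutputDev {
public:
    bool needUnicodeText() const override { return true; }
};

// Mirrors the early return from Gfx::doShowText() (assumed shape).
void sketchDoShowText(SketchOutputDev &out, const SketchFont &font,
                      const std::string &s) {
    if (out.needUnicodeText() && !font.hasToUnicodeCMap()) {
        return; // no ToUnicode CMap: drop the string for text extraction
    }
    out.drawString(s);
}
```

With this shape, a rendering device still receives every string, while the
text device only receives strings whose font has a ToUnicode CMap.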

This would fix cutting and pasting for approximately 10% of our users
and enable us to get text from documents via the poppler API.

I note that Adobe Reader on Windows produced junk when copy-pasting
those characters from my example PDF (but it didn't break copying the
rest of the text).


- Peter
