[poppler] Confusion about poppler_page_get_text and poppler_page_get_text_layout
rswarbrick at gmail.com
Mon Dec 19 13:03:58 PST 2011
I'm messing around with a lisp FFI binding to poppler (via the glib
interface) and have bumped into a strange situation.
If I've understood the documentation correctly, poppler_page_get_text
and poppler_page_get_text_layout should give me a string and an array of
rectangles, respectively. The nth rectangle should be the position on
the page of the nth character of the string.
If I'm right there, I'm confused. For the first PDF with which I've
tried this, I get a string of length 1541 and an array of rectangles of
length 1477. This is... mystifying!
I presume that I've misunderstood what's supposed to happen (since I
can't imagine that Evince would work on this system if I were
right!). Can anyone clear up what I'm getting wrong?
PS If I have understood the documentation correctly, then I've probably
made a programming error, but I can't really see how I could end up
with something "almost right" like this, so I thought I should ask a