[poppler] New selection algorithm
Albert Astals Cid
aacid at kde.org
Mon Sep 6 12:30:14 PDT 2010
On Monday, 6 September 2010, Daniel Garcia Moreno wrote:
> Poppler does not select tables in "order". It detects tables as
> columns, because Poppler uses the distance between pieces of text to
> decide what is a column, so tables are selected in column order when
> the logical way is by rows.
>
> Another selection problem caused by that heuristic appears when you
> have a PDF with columns that are close together or text with spaces.
>
> I looked at acroread to see how it handles column and table selection,
> and I realized that it selects text in "order", I mean, in the order in
> which the text was put into the PDF file. To check that, I created a
> text PDF file with Inkscape.
>
> So the selection logic is simple: we select the nearest word to the
> first selection point and the nearest word to the last selection point,
> and every word between those two words (in text order, no matter where
> the words are on screen) is selected too.
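
The proposal quoted above can be sketched roughly as follows. This is only
an illustration, not Poppler code: it assumes "text order" means the order
in which words appear in the file, and the names Word, nearest_index and
select, as well as the coordinates, are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Word:
    text: str
    x: float  # hypothetical centre of the word on the page
    y: float


def nearest_index(words, px, py):
    """Return the index of the word whose centre is closest to (px, py)."""
    return min(range(len(words)),
               key=lambda i: hypot(words[i].x - px, words[i].y - py))


def select(words, start, end):
    """Select both endpoint words and everything between them in list order.

    `words` is assumed to already be in file order; on-screen position is
    used only to find the two endpoint words.
    """
    a = nearest_index(words, *start)
    b = nearest_index(words, *end)
    lo, hi = sorted((a, b))
    return [w.text for w in words[lo:hi + 1]]


# A 2x2 table whose cells were written to the file row by row.
words = [
    Word("A1", 0, 0), Word("B1", 100, 0),
    Word("A2", 0, 20), Word("B2", 100, 20),
]

# Dragging down the first column still picks up B1, because B1 sits
# between A1 and A2 in file order.
print(select(words, (2, 1), (5, 19)))
```

With these assumptions the result depends entirely on the order in which
the producing application wrote the words, not on their layout.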
What is "text order"?
Albert