[poppler] New selection algorithm

Albert Astals Cid aacid at kde.org
Mon Sep 6 12:30:14 PDT 2010


On Monday, 6 September 2010, Daniel Garcia Moreno wrote:
> Poppler does not select tables in "order". It detects tables as
> columns, because it uses the distance between pieces of text to decide
> what is a column, so tables are selected in column order when the
> logical way is by rows.
> 
> Another selection problem caused by that heuristic appears when you
> have a PDF with columns that are close together or text with spaces.
> 
> I looked at acroread to see how it handles column and table selection
> and I realized that it selects text in "order", I mean, in the order
> in which you put it in the PDF file. To check that, I created a text
> PDF file with Inkscape.
> 
> So the selection logic is simple: we select the word nearest to the
> first selection point and the word nearest to the last selection
> point, and every word between those two words (in text order, no
> matter where the words are on screen) is selected too.
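
For reference, the proposal quoted above could be sketched roughly as
follows. This is only an illustration, not poppler code: the Word class,
nearest_index() and select() are hypothetical names, word positions are
reduced to a single centre point, and "text order" is assumed here to
mean the order in which the words appear in the list (for example, the
order in which they were emitted in the page content).

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Word:
    # Hypothetical stand-in for a word extracted from a page.
    text: str
    x: float  # centre of the word, in arbitrary page units
    y: float


def nearest_index(words, px, py):
    """Return the index of the word whose centre is closest to (px, py)."""
    return min(
        range(len(words)),
        key=lambda i: hypot(words[i].x - px, words[i].y - py),
    )


def select(words, start, end):
    """Select the words nearest to both points and everything between
    them in list order, ignoring where the words sit on the page."""
    a = nearest_index(words, *start)
    b = nearest_index(words, *end)
    lo, hi = sorted((a, b))
    return [w.text for w in words[lo:hi + 1]]


# A 2x2 table whose cells are stored row by row (assumed "text order").
words = [
    Word("a1", 0, 0), Word("b1", 100, 0),
    Word("a2", 0, 20), Word("b2", 100, 20),
]

# Dragging from the top-left cell to the bottom-right cell.
selection = select(words, (1, 1), (99, 19))
```

With the words stored row by row, the drag above selects the cells row
by row regardless of the gap between the two columns. If the file stored
the same cells column by column, the same drag would select them in
column order, so the result depends entirely on what "text order" is.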

What is "text order"?

Albert

