[poppler] New selection algorithm

Lorenzo Gil lgs at yaco.es
Mon Sep 6 12:49:35 PDT 2010


On Mon, Sep 06, 2010 at 12:18:19PM -0700, Leonard Rosenthol wrote:
> I won't comment on the patch itself, but I will make two comments.
> 
> 1) Your assumptions about how Acrobat/Reader work are incorrect.

Could you elaborate a little on this? Roughly speaking, we think Acrobat Reader
behaves that way. Our quick-and-dirty tests went basically like this:

 1. Open a vector drawing tool (in our case we used Inkscape)
 2. Draw some text in different funny positions
 3. Generate a PDF file from that
 4. Open it with Acrobat Reader and see how the selection works

In all cases Acrobat Reader selected the text in the same order in which we had
added it in Inkscape.

I guess Inkscape doesn't put any structure/tagging into the PDF file, so in this
simple case our assumptions may be right.


> 2) You should consider taking PDF structure/tagging into account when present.

Right, the question is whether this information is available in Poppler. If not, we
should make it available first and then use it in the selection algorithm. I don't
think the current selection code uses this information anyway, so I don't think we
are introducing a regression here.
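
To make the idea concrete, here is a minimal sketch of the fallback I have in mind.
It is written in Python rather than Poppler's C++ and is only an illustration:
TextChunk, stream_index, struct_index and selection_order are hypothetical names,
not existing Poppler API, and it assumes that the order in which text was drawn is
the order in which it appears in the page content stream.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextChunk:
    """Hypothetical piece of page text; not a Poppler type."""
    text: str
    stream_index: int                   # assumed position in the page content stream
    struct_index: Optional[int] = None  # assumed position in the structure tree, if tagged


def selection_order(chunks: List[TextChunk]) -> List[TextChunk]:
    """Order chunks for selection.

    Use the structure/tagging order only when every chunk carries it;
    otherwise fall back to content stream order.
    """
    if chunks and all(c.struct_index is not None for c in chunks):
        return sorted(chunks, key=lambda c: c.struct_index)
    return sorted(chunks, key=lambda c: c.stream_index)


if __name__ == "__main__":
    untagged = [TextChunk("world", 1), TextChunk("hello", 0)]
    print([c.text for c in selection_order(untagged)])
```

With untagged input, as in our Inkscape tests, this just follows the stream order;
the tagged branch would only be taken once Poppler exposes that information.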


Best regards,

Lorenzo Gil

