[poppler] Multicolumn select

Albert Astals Cid aacid at kde.org
Wed Nov 18 13:30:27 PST 2009


On Monday, 16 November 2009, Baz wrote:
> 2009/11/15 Albert Astals Cid <aacid at kde.org>:
> > On Friday, 13 November 2009, Baz wrote:
> >> Hi,
> >> I uploaded a new version of my multicolumn select patches to
> >> https://bugs.freedesktop.org/show_bug.cgi?id=3188 this morning, as you
> >> might've seen. This version uses a similar algorithm to ocropus to
> >> determine reading order, and tries to make the selection follow this
> >> reading order. It's looking fairly good now, I think: for all but one
> >> of the documents I tested, it picked a reasonable order, and
> >> selection doesn't jump all over the place. Of course, I've only tested
> >> on the handful of docs that were in the bug reports so I might've made
> >> things worse elsewhere :(
> >>
> >> I was wondering what I can do to get these patches into an acceptable
> >> state. There are still some obvious issues to iron out, e.g. RTL (see
> >> http://bugs.kde.org/show_bug.cgi?id=156380 ,
> >> http://bugs.kde.org/show_bug.cgi?id=184399) and handling blocks with
> >> non-zero rotation; also the new depth_first_visit method I added is in
> >> the wrong class - should probably be in TextBlock. I'll fix this up.
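
A minimal sketch of the kind of reading-order pass described above, not the code from the patch: it assumes the ocropus-style rules (blocks that overlap horizontally read top to bottom; a block entirely to the left of another reads first unless a wider block sits between them vertically), followed by a depth-first topological sort. The `Block` type, `comes_before` and `reading_order` are hypothetical names for illustration; only the name `depth_first_visit` is borrowed from the message, and y is assumed to grow downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical text block: a bounding box with y growing downward."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def x_overlap(a, b):
    """True if the horizontal extents of a and b overlap."""
    return a.x_min < b.x_max and b.x_min < a.x_max


def comes_before(a, b, blocks):
    """Pairwise ordering rule: should a be read before b?"""
    # Rule 1: horizontally overlapping blocks are read top to bottom.
    if x_overlap(a, b):
        return a.y_min < b.y_min
    # Rule 2: a is entirely to the left of b, and no block that overlaps
    # both horizontally lies vertically between them.
    if a.x_max <= b.x_min:
        upper_bottom = min(a.y_max, b.y_max)
        lower_top = max(a.y_min, b.y_min)
        for c in blocks:
            if c is a or c is b:
                continue
            if (x_overlap(c, a) and x_overlap(c, b)
                    and c.y_min >= upper_bottom and c.y_max <= lower_top):
                return False
        return True
    return False


def reading_order(blocks):
    """Topologically sort blocks so every predecessor is emitted first."""
    # Start from a plain top-to-bottom, left-to-right order for determinism.
    start = sorted(blocks, key=lambda blk: (blk.y_min, blk.x_min))
    ordered = []
    visited = set()

    def depth_first_visit(b):
        if b in visited:
            return
        # Marking before recursing also stops a cyclic relation from looping.
        visited.add(b)
        for a in start:
            if a is not b and comes_before(a, b, blocks):
                depth_first_visit(a)
        ordered.append(b)

    for b in start:
        depth_first_visit(b)
    return ordered


title = Block("title", 0, 0, 100, 8)
left = Block("left", 0, 12, 45, 50)
right = Block("right", 55, 10, 100, 50)
print([b.name for b in reading_order([right, title, left])])
```

In this example the right column starts slightly higher than the left one, so a plain sort by top edge would put it first; the depth-first visit pulls the left column ahead of it because of the second rule.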
> >>
> >> But beyond that, these patches might be problematic because they
> >> remove the old selection behaviour. The new behaviour is much better
> >> for multicolumn documents, but is likely to be worse at selecting data
> >> out of tables, for example. Should the new selection mode introduce
> >> new API, so as not to change the current behaviour of Evince &
> >> Okular[1]?
> >
> > What was the [1] supposed to mean here?
> 
> Typo. A reference to a footnote about me not using Okular that was
> left over when I moved that into the text... ignore.
> 
> > As Carlos said, we use an Okular-coded
> > algorithm for text selection, so I'm not sure this should affect us much. On
> > the other hand, we still have the same problem with columns, so if this
> > works we should probably apply a similar solution to Okular.
> 
> Ok.
> 
> >> In older versions of Acrobat, they had 'table select' and
> >> 'text select' modes, covering these two uses, but more recently table
> >> select has been dropped entirely. I suspect that they now just follow
> >> the tags in tagged PDF, with the fallback behaviour being something
> >> like what I've coded up here.
> >>
> >> Also, testing. At the moment, testing for me consists of opening a
> >> bunch of documents in Evince and selecting stuff randomly (I don't
> >> have Okular, but since they use the same API for text selection I
> >> presume the bug is the same). I have no idea if I'm introducing
> >> regressions. Is there a plan to integrate the unit test framework that
> >> was discussed previously?
> >> http://lists.freedesktop.org/archives/poppler/2009-March/004535.html .
> >
> > The Qt4 frontend already has unit tests. The ones in that thread are unit
> > tests for the glib frontend, not a test suite, which is what you want.
> 
> Ok, I see that now. The Qt4 tests refer to documents that aren't in git
> though?

Yes, they are: http://cgit.freedesktop.org/poppler/test/

Albert

