[poppler] recent defect with page.get_text

Albert Astals Cid aacid at kde.org
Wed Sep 28 06:12:30 PDT 2011


Brian, any chance you can have a look at this regression?

Albert

On Wednesday, 28 September 2011, alex bodnaru wrote:
> hello albert, brian, other friends,
> 
> i did the research on the bug reported here using git bisect, and found the
> following. i will look at the commit.
> 
> db014ffb357e760d9397544c5a8fe747cdb497ab is the first bad commit
> 
> commit db014ffb357e760d9397544c5a8fe747cdb497ab
> Author: Brian Ewins <brian.ewins at gmail.com>
> Date:   Mon Nov 23 08:58:19 2009 +0000
> 
>     Select top right to bottom left in RTL mode
>    
>     This makes pure RTL selection work. Bidi is not handled at all.
>     Rendering of the selection is poor and the dumped text appears
>     to still be in reverse order to me.
> 
> :040000 040000 fbc5ddcd87a559cd94f20119eaee2af2fa9dc257 e58e22d3707422029f1ca753868164eb22cf8bb4 M    poppler
> 
> On 09/27/2011 04:52 AM, alex bodnaru wrote:
> 
> hello,
> On 09/26/2011 03:26 PM, Albert Astals Cid wrote:
> 
> On Saturday, 24 September 2011, alex bodnaru wrote:
> hello albert, friends,
> Hi
> 
> about the "recent" defect, it got to me since the *recent* debian upgrade of
> libpoppler* from 0.12.4 straight to 0.16.7.
> 
> after comparing the releases in between, i found the problem occurred from
> 0.13.2 to 0.13.3. the diff between these 2 is attached.
> That is not recent at all, that is older than a year ;-)
> no comment ;)
> In that release there were the following commits that touched TextOutputDev.
> I do not know if you know how to compile from git, but if you do it would be
> great if you could try going back to
> 9c5612f6e013a8698eff6531ec388a7e6c1fb89a
> db014ffb357e760d9397544c5a8fe747cdb497ab
> b1d43fa052d9160c4f319a67415ecf3ebf2cf9b3
> f83b677a8eb44d65698b77edb13a5c7de3a72c0f
> a2191a4d45e0abaec97c19aacae37c4c5824bd36
> 345ed51af9b9e7ea53af42727b91ed68dcc52370
> 12d83931ae1b899b70c7ea5c01f03f123b1bb9a8
> 
> thanks a lot. i'll certainly checkout each of them to get closer to the
> problem.
> 
> And compile for each of them and see in which of those the bug is present and
> in which of them is not.
> 
> Albert
> 
> P.S: You still send html email ;-)
> 
> hope to also have fixed this now. sorry again.
> 
> 
> best regards,
> 
> alex
> just please look at the glib/demo/poppler-glib-demo get text output for the
> attached pdf, even just the first page.
> 
> On 09/18/2011 05:09 PM, Albert Astals Cid wrote:
> 
> Please do not email me, email the list.
> 
> On Sunday, 18 September 2011, you wrote:
> On 09/18/2011 02:41 PM, Albert Astals Cid wrote:
> On Sunday, 18 September 2011, alex bodnaru wrote:
> 
> hello friends,
> 
> Hi
> thanks a lot albert for considering my problem.
> I am not considering your problem, I am complaining about the lack of
> information in your original mail ;-)
> 
> i'm using poppler through python (that invokes glib interface).
> 
> a recent change (probably together with get_text separation) broke the glib
> interface.
> 
> what does recent mean? 0.16.7? 0.17.x? git master?
> 0.16.7.
> So 0.16.7 does not work. Which version do you know works?
> 
> Albert
> 
> P.S: Would it be possible for you not to send HTML email?
> 
> Albert
> thanks again,
> alex
> 
> i can't load the entire page text with get_text (see the glib demo) of one
> pdf i have, but pdftotext does output the entire text.
> 
> my pdf is attached. i apologize for the language, but i promise it's an
> inoffensive cadastre report. please see that not all text lines are being
> output by get_text.
> 
> could you help?
> 
> thanks in advance,
> 
> alex
> 
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler
> 

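For readers unfamiliar with the bisection mentioned in the quoted thread, the idea can be sketched in a few lines. This only illustrates the binary search that git bisect performs over a linear history; it is not git or poppler code. The function name `first_bad_commit`, the commit labels, and the `is_bad` predicate are all hypothetical. The sketch assumes the oldest commit is known good, the newest is known bad, and every commit after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in `commits` (ordered oldest to newest).

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    once a commit is bad every later commit is bad too.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1  # indices of known-good / known-bad commits
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression was introduced at mid or earlier
        else:
            good = mid  # the regression was introduced after mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical history in which "c4" introduces the regression.
    history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6"]
    print(first_bad_commit(history, lambda c: int(c[1:]) >= 4))
```

In practice `is_bad` would mean "build this commit and check whether get_text drops lines", which is the manual step git bisect asks for at each iteration.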

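To make "not all text lines are being output by get_text" concrete, the two outputs could be compared line by line. This is a minimal sketch, assuming both texts are already available as strings (for example one saved from pdftotext and one from the glib get_text call). `missing_lines` is a hypothetical helper, not part of poppler, and the sample strings are made up.

```python
from collections import Counter
from typing import List


def missing_lines(reference: str, candidate: str) -> List[str]:
    """Return the lines of `reference` that do not appear in `candidate`.

    Lines are compared after stripping surrounding whitespace and blank
    lines are ignored. Duplicates are counted, so a line that occurs twice
    in the reference but once in the candidate is reported once.
    """
    available = Counter(
        line.strip() for line in candidate.splitlines() if line.strip()
    )
    missing = []
    for line in reference.splitlines():
        key = line.strip()
        if not key:
            continue
        if available[key] > 0:
            available[key] -= 1  # consume one matching line
        else:
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Made-up sample texts standing in for the two extraction results.
    reference_text = "line one\nline two\nline three\n"
    candidate_text = "line one\nline three\n"
    print(missing_lines(reference_text, candidate_text))
```

The comparison ignores ordering, so it reports only text that is absent, not text that comes out in a different order.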