[poppler] recent defect with page.get_text

alex bodnaru alexbodn.groups at gmail.com
Sun Oct 9 19:54:53 PDT 2011


hello albert, marc, other friends,

i sincerely hope this message has at least a textual copy.

the patch i proposed successfully bypasses the text selection problem
when the entire page text is requested (with get_text).
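
for anyone who wants to see which lines go missing, a rough check is to compare
the get_text output against the pdftotext output line by line. below is a
minimal python sketch of that comparison. it is not part of the patch:
missing_lines is a hypothetical helper name, and it assumes both texts were
already captured as strings (for instance from page.get_text() and from
pdftotext). the two tools may break lines differently, so treat the result as a
hint only.

```python
def missing_lines(reference_text, candidate_text):
    """Return non-empty lines of reference_text that candidate_text lacks.

    Hypothetical helper for illustration: reference_text would be the
    pdftotext output, candidate_text the string returned by get_text.
    """
    # Collapse runs of whitespace so spacing differences do not count as misses.
    seen = {" ".join(line.split()) for line in candidate_text.splitlines()}
    missing = []
    for line in reference_text.splitlines():
        normalised = " ".join(line.split())
        if normalised and normalised not in seen:
            missing.append(normalised)
    return missing


if __name__ == "__main__":
    # Made-up sample strings, not taken from the attached pdf.
    reference = "parcel 12\nowner: someone\narea: 500\n"
    candidate = "parcel 12\narea:   500\n"
    for line in missing_lines(reference, candidate):
        print("missing:", line)
```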

is there a chance my patch could be committed to git?
the patch is re-attached.

best regards,
alex

On 09/24/2011 02:38 PM, alex bodnaru wrote:
> On 09/24/2011 01:02 PM, Marc J. Driftmeyer wrote:
> 
>> Much better, thanks. lol.
>>
>> I don't mind off-green and off-yellow faintly in engineering paper, but not in
>> an email app.
> 
> now that it's readable ...
> 
> any idea about my question, marc?
> 
> alex
> 
>>
>> - Marc
>>
>> On 09/24/2011 01:55 AM, alex bodnaru wrote:
>>>
>>> On 09/24/2011 11:28 AM, Marc J. Driftmeyer wrote:
>>>
>>>> Alex,
>>>>
>>>> Is it at all possible for you to change your background color from yellow to white?
>>>
>>> sorry marc. does it match your taste now ;) ?
>>>
>>> alex
>>>
>>>> Sincerely,
>>>>
>>>> Marc J. Driftmeyer
>>>>
>>>> On 09/24/2011 01:19 AM, alex bodnaru wrote:
>>>>>
>>>>>
>>>>> hello albert, other friends,
>>>>>
>>>>> about the "recent" defect, it got to me since the *recent* debian upgrade
>>>>> of libpoppler* from 0.12.4 straight to 0.16.7.
>>>>>
>>>>> after comparing the releases in between, i found the problem appeared
>>>>> between 0.13.2 and 0.13.3. couldn't attach the diff, since the message
>>>>> seems to get scrubbed.
>>>>>
>>>>> i see it's a great update. any advice would be welcome.
>>>>>
>>>>> please just look at the get text output of glib/demo/poppler-glib-demo for
>>>>> the attached pdf, even for the first page.
>>>>>
>>>>>
>>>>> thanks in advance,
>>>>>
>>>>> alex
>>>>>
>>>>> On 09/18/2011 05:09 PM, Albert Astals Cid wrote:
>>>>>
>>>>>> Please do not email me, email the list.
>>>>>>
>>>>>> On Sunday, 18 September 2011, you wrote:
>>>>>>> On 09/18/2011 02:41 PM, Albert Astals Cid wrote:
>>>>>>>> On Sunday, 18 September 2011, alex bodnaru wrote:
>>>>>>>
>>>>>>> hello friends,
>>>>>>>
>>>>>>>> Hi
>>>>>>> thanks a lot albert for considering my problem.
>>>>>> I am not considering your problem, I am complaining about the lack of 
>>>>>> information in your original mail ;-)
>>>>>>
>>>>>>> i'm using poppler through python (that invokes glib interface).
>>>>>>>
>>>>>>> a recent change (probably together with get_text separation) broke the glib
>>>>>>> interface.
>>>>>>>
>>>>>>>> what does recent mean? 0.16.7? 0.17.x? git master?
>>>>>>> 0.16.7.
>>>>>> So 0.16.7 does not work; which version do you know works?
>>>>>>
>>>>>> Albert
>>>>>>
>>>>>> P.S: Would it be possible for you not to send HTML email?
>>>>>>
>>>>>>>> Albert
>>>>>>> thanks again,
>>>>>>> alex
>>>>>>>
>>>>>>> i can't load the entire page text with get_text (see the glib demo) of one
>>>>>>> pdf i have, but pdftotext does output the entire text.
>>>>>>>
>>>>>>> my pdf is attached. i apologize for the language, but i promise it's an
>>>>>>> inoffensive cadastre report. please see that not all text lines are being
>>>>>>> output by get_text.
>>>>>>>
>>>>>>> could you help?
>>>>>>>
>>>>>>> thanks in advance,
>>>>>>>
>>>>>>> alex
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> poppler mailing list
>>>>>>> poppler at lists.freedesktop.org
>>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>>>
>>>>
>>>> -- 
>>>> Marc J. Driftmeyer
>>>> Email :: mjd at reanimality.com <mailto:mjd at reanimality.com>
>>>> Web :: http://www.reanimality.com
>>>> Cell :: (509) 435-5212
>>>
>>
> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: page_get_text1.diff
Type: text/x-diff
Size: 863 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20111010/f88d149f/attachment.diff>

