[poppler] c++ ustring encoding still completely broken

Adam Reichold adam.reichold at t-online.de
Sun Dec 2 11:51:38 UTC 2018


Hello,

On 02.12.18 at 00:06, Albert Astals Cid wrote:
> On Saturday, 1 December 2018, at 23:20:46 CET, Jeroen Ooms wrote:
>> I maintain the poppler bindings for the R programming language and get
>> a lot of bug reports about corrupted text extracted with poppler.
>> Below a minimal example that illustrates the problem:
>>
>>   git clone https://github.com/jeroen/popplertest
>>   cd popplertest
>>   g++ -std=c++11 encoding.cpp -o encoding $(pkg-config --cflags --libs poppler-cpp)
>>   ./encoding hello.pdf
>>
>> The output shows a lot of Chinese characters, which is incorrect (all
>> text in the PDF is English).
>>
>> Back in March 2018, Suzuki Toshiya posted a patch with at least a
>> partial solution:
>> https://lists.freedesktop.org/archives/poppler/2018-March/012962.html
>> I hope we can revisit this.
> 
> Can someone please post a patch as a merge request on the new GitLab? It's muuuuuch easier to keep track of what needs reviewing if we have it all there.

Created !129 [1]. It is probably a big improvement, but I am not completely
convinced that this is all there is to do.

Best regards,
Adam

[1] https://gitlab.freedesktop.org/poppler/poppler/merge_requests/129

> Cheers,
>   Albert
> 
>> _______________________________________________
>> poppler mailing list
>> poppler at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/poppler
>>
