[poppler] reconsideration about iconv() in cpp frontend

Adam Reichold adam.reichold at t-online.de
Wed Apr 18 05:31:47 UTC 2018


Hello,

Am 18.04.2018 um 06:35 schrieb suzuki toshiya:
> Albert Astals Cid wrote:
>>>> Sure, is there any system where sizeof(unsigned short) != 2 anyway?
>>> Maybe people feel "such system is out of scope", but there were:
>>> https://web.archive.org/web/20090408221917/http://www.zib.de/benger/hdf5/Dat
>>> atypes.html according to the description for "size_t H5Tget_precision (hid_t
>>> type)", there is a note saying
>>> "For instance, a short on a Cray is 32 significant bits in an eight-byte
>>> field."
>>
>> Yeah we don't care about Cray :D
> 
> So, I attached a revised patch which replacing ustring as
> std::basic_string<uint16_t>, and removing conditionalized
> codes for the platform where (uint16_t != unsigned short).
> 
> How about this?

I like it. Some minor remarks:

* Is this the only user of iconv within Poppler? If so, we could remove
it from the build system as wel.

* In "ustring::to_utf8":
  - "utf16_end" is only used to compute the size once, maybe just use
"size()" to call "utf16ToUtf8"
  - "utf16ToUtf8" returns how many bytes were written, so no need to
call "std::strlen" again

* In "ustring::to_latin1":
  - for readability, I'd suggest either iterating both "me" and "ret"
using a parallel index "i" or via pointer arithmetic, but not mixing it

* In "has_bom_utf8":
  - The spacing of the first "if" is a bit off
  - Instead of "if(foo) { return true; } return false;", just do "return
foo;"

* In "ustring::from_utf8":
  - Three times "skip BOM" is probably a bit on the noisy side ;-)
  - Why no additional call to "ret.resize" as for "to_utf8"? Or why the
additional call there?

Just for my understanding, this also contains another important fix that
is unrelated to the iconv usage?

-            u = data[i] & 0xff;
+            u = pdfDocEncoding[ *data & 0xff ];

Maybe the changes to "detail::unicode_GooString_to_ustring" could be a
separate commit?

> Regards,
> mpsuzuki

Best regards,
Adam

> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/poppler
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/poppler/attachments/20180418/83c5fce7/attachment.sig>


More information about the poppler mailing list