[poppler] Utils - Fix UTF-16 file name on Windows environment

Thibaut Brard thibaut.brard at gmail.com
Mon Jun 25 06:37:52 UTC 2018

So I guess we can't use the patch as is.
On my side, I haven't run into any problems on Windows since applying it,
but I only use pdftohtml -xml from poppler. I might amend the patch to
limit the change to the part I use.
Regarding VMS (I did not even know what it was before this conversation!),
I could handle the 'VMS' directive in gfile.cc so that fopen is still used
in that case.
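
To make the VMS idea concrete, here is a minimal sketch of what I have in mind. It is not the actual openFile() from gfile.cc: the name openFileSketch is hypothetical, and it assumes the caller passes a UTF-8 file name on Windows. The VMS branch keeps the three-argument fopen from the existing code, and the Windows branch converts through MultiByteToWideChar before calling _wfopen.

```cpp
#include <cstdio>
#include <cwchar>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not poppler's openFile): opens a file by name,
// treating the name as UTF-8 on Windows and keeping fopen elsewhere.
static FILE *openFileSketch(const char *path, const char *mode)
{
#if defined(VMS)
    // Keep the extra option so the file is opened as a stream file.
    return fopen(path, mode, "ctx=stm");
#elif defined(_WIN32)
    // First call returns the required length, terminator included.
    int wlen = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                   path, -1, nullptr, 0);
    if (wlen <= 0) {
        return nullptr; // the name was not valid UTF-8
    }
    std::wstring wpath(static_cast<size_t>(wlen), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        path, -1, &wpath[0], wlen);
    // Mode strings such as "rb" are plain ASCII, so widening is enough.
    std::wstring wmode;
    for (const char *p = mode; *p; ++p) {
        wmode.push_back(static_cast<wchar_t>(*p));
    }
    return _wfopen(wpath.c_str(), wmode.c_str());
#else
    return fopen(path, mode);
#endif
}
```

A caller would use it like fopen, e.g. openFileSketch(fileName->getCString(), "rb"), and check for a null result as before.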

I just had a look at GooFile. As I understand it, Martin suggests using it
instead of gfile.cc, am I right?

Cordialement / Regards
Thibaut Brard

2018-06-24 23:43 GMT+02:00 William Bader <williambader at hotmail.com>:

> I haven't used VMS for about two decades, but I remember that the extra
> options to fopen() are important. Without them, I think that you end up
> with a record oriented file instead of a stream file, and stdio might not
> work as you expect, for example, writing '\n' might close the current
> record instead of writing a 0x0A byte. The full list of parameters is in
> the entry for creat() at
> http://h41379.www4.hpe.com/commercial/c/docs/5763p021.html#index_x_600
> HP sold VMS to a company that is porting it to Intel x86-64, so VMS might
> soon be easier to run. http://www.openvms.org/node/107
> Regards, William
> ------------------------------
> *From:* poppler <poppler-bounces at lists.freedesktop.org> on behalf of
> Martin (gzlist) <gzlist at googlemail.com>
> *Sent:* Sunday, June 24, 2018 8:05 AM
> *To:* poppler at lists.freedesktop.org
> *Subject:* Re: [poppler] Utils - Fix UTF-16 file name on Windows
> environment
> On 19/06/2018, Albert Astals Cid <aacid at kde.org> wrote:
> >
> > It does make some sense given that function exists in the first place but my
> > experience with windows is veeeeeeeeeeeeery limited so I'd like for someone
> > else to vouch for this before landing it.
> Overall the change might be, but there are some tricky aspects.
> On windows fopen takes a narrow string of variable codepage, and
> openFile takes (sort-of*) utf-8 but the types are not distinguished in
> the codebase so it's tricky to see what callers are providing.
> In cases where the string being used comes straight from say, the
> command line or environment block, it will not be utf-8 so non-ascii
> characters will be mangled. That's probably best fixed by ensuring on
> GooString construction that it's converted to utf-8 but at present
> that's entirely unvalidated.
> Also:
>  #ifdef VMS
> -    f = fopen(fileName->getCString(), "rb", "ctx=stm");
> +    f = openFile(fileName->getCString(), "rb", "ctx=stm");
>  #else
> -    f = fopen(fileName->getCString(), "rb");
> +    f = openFile(fileName->getCString(), "rb");
>  #endif
> Breaks compilation on VMS (if that's still a platform that matters) as
> openFile takes two args only. Oddly, GooFile::open already includes
> this logic but openFile does not. Can just drop the first branch
> change for now.
> Martin
> *sort-of utf-8: gfile.cc has a pretty half-assed utf-8 to utf-16
> conversion algorithm in several places that only correctly handles a
> subset of inputs.
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/poppler
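
On Martin's footnote about the UTF-8 to UTF-16 conversion only handling a subset of inputs: below is a minimal portable sketch of a stricter conversion, for comparison. It is not the code in gfile.cc; utf8ToUtf16Sketch is a hypothetical name, and returning false on malformed input is an assumption about the desired behaviour. It rejects overlong forms, truncated sequences and encoded surrogates, and emits surrogate pairs for code points above U+FFFF.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decodes UTF-8 into UTF-16 code units.
// Returns false (leaving 'out' partially filled) on malformed input.
static bool utf8ToUtf16Sketch(const std::string &in, std::u16string &out)
{
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp;
        std::size_t extra; // number of continuation bytes expected
        if (b0 < 0x80) {
            cp = b0; extra = 0;
        } else if (b0 >= 0xC2 && b0 <= 0xDF) {
            cp = b0 & 0x1F; extra = 1; // 0xC0/0xC1 would be overlong
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            cp = b0 & 0x0F; extra = 2;
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            cp = b0 & 0x07; extra = 3;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
        if (in.size() - i <= extra) {
            return false; // sequence is truncated
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                return false; // not a continuation byte
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (extra == 2 && cp < 0x800) return false;       // overlong
        if (extra == 3 && (cp < 0x10000 || cp > 0x10FFFF)) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // encoded surrogate
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000; // split into a high/low surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return true;
}
```

On Windows the same result could instead come from MultiByteToWideChar with CP_UTF8 and MB_ERR_INVALID_CHARS; the hand-written version is only meant to show which cases a correct conversion has to cover.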