[poppler] pdftohtml patch: restore old "raw" command-line option

Albert Astals Cid aacid at kde.org
Wed Jan 7 05:20:25 PST 2009


On Sunday 05 October 2008, Warren Toomey wrote:
> pdftohtml used to have a "raw" mode which has been removed. In "raw" mode,
> text from a PDF document is processed in the order that it occurs. However,
> the current version of pdftohtml reorders the text to be in increasing
> y-value, i.e. from the top of a page going down to the bottom.
>
> This text reordering plays merry havoc with multi-column pages, as the text
> from the columns becomes interleaved instead of remaining separate.
> The attached patch restores the -raw command-line option to pdftohtml. The
> program retains its current behaviour if the -raw option is not used, but
> reverts to the "text as it appears" behaviour with the -raw option enabled.
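
To make the interleaving described above concrete, here is a minimal sketch in
Python. It is not pdftohtml code: the fragment tuples, coordinates and function
names are hypothetical, and it assumes y grows from the top of the page
downwards, as in the description above.

```python
# Hypothetical text fragments as (x, y, text); y increases down the page.
# They are listed in stream order: the left column first, then the right one.
fragments = [
    (50, 100, "L1"),
    (50, 120, "L2"),
    (300, 100, "R1"),
    (300, 120, "R2"),
]


def raw_order(frags):
    """Keep the fragments in the order they occur in the document."""
    return [text for _, _, text in frags]


def y_sorted_order(frags):
    """Reorder the fragments by increasing y, then by x within a row."""
    return [text for _, _, text in sorted(frags, key=lambda f: (f[1], f[0]))]


print(raw_order(fragments))       # columns stay separate
print(y_sorted_order(fragments))  # columns become interleaved
```

With these made-up coordinates, the stream order keeps each column together,
while sorting by y alternates between the left and right columns.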

I've had a look at all the pdftohtml tarballs available at
http://sourceforge.net/project/showfiles.php?group_id=45839 and none of them
exposed the raw option to the user. Are you sure it is OK to enable it?

Albert

>
> Cheers,
>         Warren



