[poppler] line breaks and layout for multi-column texts ...

Albert Astals Cid aacid at kde.org
Wed Feb 5 20:27:18 UTC 2020


On Wednesday, 5 February 2020, at 12:20:10 CET, Albretch Mueller wrote:
>  pdftotext has the option
> 
> -layout              : maintain original physical layout
> 
>  but pdftohtml doesn't

pdftotext and pdftohtml use different code and algorithms, so pdftohtml has no equivalent of -layout. You'd have to see whether the code of one can be adapted or improved for the other.
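
As a possible workaround (a rough sketch only, not something pdftohtml does itself), one could run pdftotext -layout and wrap its output in a <pre> element, so the physical layout survives in an HTML page. The function names below are made up for illustration, and the sketch assumes pdftotext is on the PATH and writes UTF-8 to standard output when "-" is given as the output file.

```python
import html
import subprocess


def layout_text_to_html(text: str, title: str = "") -> str:
    """Wrap layout-preserved plain text in a minimal HTML page (hypothetical helper)."""
    # Escape &, < and > so the text is shown literally inside <pre>,
    # which keeps the spacing that -layout produced.
    body = html.escape(text, quote=False)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body><pre>{body}</pre></body></html>\n"
    )


def pdf_to_layout_html(pdf_path: str) -> str:
    """Run pdftotext -layout on pdf_path and return an HTML page (hypothetical helper)."""
    # "-" as the output file asks pdftotext to write to standard output.
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    # Assumption: the output is UTF-8; undecodable bytes are replaced.
    text = result.stdout.decode("utf-8", errors="replace")
    return layout_text_to_html(text, title=pdf_path)
```

Calling pdf_to_layout_html("file.pdf") would then return the page as a string. This gives monospaced text only: no images, links or fonts, so it is not a replacement for what pdftohtml produces.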

Cheers,
  Albert

> 
>  $ pdftohtml --help
> pdftohtml version 0.48.0
> Copyright 2005-2016 The Poppler Developers - http://poppler.freedesktop.org
> Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
> Copyright 1996-2011 Glyph & Cog, LLC
> 
> Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
>   -f <int>              : first page to convert
>   -l <int>              : last page to convert
>   -q                    : don't print any messages or errors
>   -h                    : print usage information
>   -?                    : print usage information
>   -help                 : print usage information
>   --help                : print usage information
>   -p                    : exchange .pdf links by .html
>   -c                    : generate complex document
>   -s                    : generate single document that includes all pages
>   -i                    : ignore images
>   -noframes             : generate no frames
>   -stdout               : use standard output
>   -zoom <fp>            : zoom the pdf document (default 1.5)
>   -xml                  : output for XML post-processing
>   -hidden               : output hidden text
>   -nomerge              : do not merge paragraphs
>   -enc <string>         : output text encoding name
>   -fmt <string>         : image file format for Splash output (png or jpg)
>   -v                    : print copyright and version info
>   -opw <string>         : owner password (for encrypted files)
>   -upw <string>         : user password (for encrypted files)
>   -nodrm                : override document DRM settings
>   -wbt <fp>             : word break threshold (default 10 percent)
>   -fontfullname         : outputs font full name
> $
> ~
>   Is it some sort of "hidden" parameter? If not, how do I work around it?
> 
>   lbrtchx