[poppler] pdftotext -bbox page size
Albert Astals Cid
aacid at kde.org
Fri Nov 6 23:21:44 UTC 2020
Tom, you suggested the reverse change 9 years ago:
https://gitlab.freedesktop.org/poppler/poppler/-/commit/807c1df2bf79c7c6378390b41dc230d80533ae3f
Do you remember why? Anything that William may be missing?
Cheers,
Albert
On Friday, 6 November 2020, at 21:22:40 CET, William Bader wrote:
> I have a PDF with a TrimBox and a CropBox that do not start at the origin.
> It looks like pdftotext -bbox writes the maximum extent of the MediaBox into the <page> element instead of writing the page size.
> Can I submit a patch to change it, or to add an option to change it, or to write more values in the <page width=... height=...> line?
> For example, I have a PDF where pdfinfo -box says
> Page size: 84.95 x 2210.75 pts
> Page rot: 0
> MediaBox: 0.00 0.00 84.95 2310.50
> CropBox: 0.00 99.75 84.95 2310.50
> BleedBox: 0.00 99.75 84.95 2310.50
> TrimBox: 0.00 99.75 84.95 2310.50
> ArtBox: 0.00 99.75 84.95 2310.50
> but pdftotext -bbox writes
> <page width="84.950000" height="2310.500000">
> <word xMin="13.350000" yMin="0.322500" xMax="34.018450" yMax="6.466000">NOTICE</word>
> ...
> <word xMin="22.548900" yMin="2197.922500" xMax="36.779600" yMax="2204.066000">2010)</word>
> </page>
> so when I assemble the page from the words, I have an extra 99.75 points of empty space at the bottom.
>
> Regards, William
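
The arithmetic behind the report can be checked with a small sketch. It assumes each box line from pdfinfo -box lists the lower-left and upper-right corners as "x1 y1 x2 y2"; the sample text is copied from the message above, and the helper names parse_boxes and box_size are hypothetical, not part of poppler.

```python
import re

# Sample copied from the pdfinfo -box output quoted in the message.
PDFINFO_BOX_OUTPUT = """\
Page size: 84.95 x 2210.75 pts
Page rot: 0
MediaBox: 0.00 0.00 84.95 2310.50
CropBox: 0.00 99.75 84.95 2310.50
"""


def parse_boxes(text):
    """Return {box name: (x1, y1, x2, y2)} for every '<Name>Box:' line.

    Assumption: the four numbers are the lower-left and upper-right corners.
    """
    boxes = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+Box):\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)", line)
        if m:
            boxes[m.group(1)] = tuple(float(v) for v in m.groups()[1:])
    return boxes


def box_size(box):
    """Return (width, height) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return (x2 - x1, y2 - y1)


boxes = parse_boxes(PDFINFO_BOX_OUTPUT)
media_w, media_h = box_size(boxes["MediaBox"])
crop_w, crop_h = box_size(boxes["CropBox"])

# Difference between what <page height=...> reports and the cropped height.
extra = media_h - crop_h

print(f"MediaBox size: {media_w:.2f} x {media_h:.2f}")
print(f"CropBox size:  {crop_w:.2f} x {crop_h:.2f}")
print(f"Extra height:  {extra:.2f}")
```

Under that assumption the CropBox height (2210.75) matches the "Page size" line, and the difference from the MediaBox height (2310.50) is the 99.75 points of empty space William describes.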