[poppler] extracting images with pdfimages 0.87.0

Valerio Messina efa at iol.it
Sat Mar 28 16:45:52 UTC 2020


Hi,
I have some "strange" PDFs; here is an example:

https://drive.google.com/file/d/1ef_twhqJWRbF54meN5vt8bu1gmTLWkto/view?usp=sharing

that appear not to contain any images:

$ pdfimages0.87.0 -list TheStoryGarden_pag002.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
$ pdfimages0.87.0 -all TheStoryGarden_pag002.pdf TheStoryGarden_pag002
$

No images are extracted.
The PDF does not seem to be protected in any way.
Inkscape and LibreOffice can open it and extract the (low-res, 72 dpi)
full-page 624x794 bitmap image, but they are not easily scriptable.


For now I have worked around it with a simple:

$ lp -d pdf -o media=custom220x280mm TheStoryGarden_pag002.pdf

which generates another PDF with an image that pdfimages0.87.0 can extract
(and repages it to the right book format).
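
To script this, the empty listing can be detected first, so that the lp
workaround is applied only to the PDFs that need it. This is only a sketch:
the helper name count_listed_images is hypothetical, and it assumes the
listing always starts with a header line followed by a dashed separator
line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: count the image rows in `pdfimages -list` output
# read from stdin. Assumption: every non-empty line after the dashed
# separator line describes exactly one image.
count_listed_images() {
    awk 'seen && NF { n++ } /^-+$/ { seen = 1 } END { print n + 0 }'
}

# Possible usage (not run here; file and printer names are the ones from this post):
#   n=$(pdfimages0.87.0 -list TheStoryGarden_pag002.pdf | count_listed_images)
#   if [ "$n" -eq 0 ]; then
#       lp -d pdf -o media=custom220x280mm TheStoryGarden_pag002.pdf
#   fi
```

For the PDF above the helper prints 0, because the listing contains only
the header and the separator.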

I do not understand why pdfimages0.87.0 can't extract the bitmap.
What is strange about these PDFs? Any hints?


Note: pdfimages0.87.0 is a freshly built binary, but some previous versions
behave the same way.

-- 
Valerio


