[poppler] Scanned images in PDFs and DPI

Adrian Johnson ajohnson at redneon.com
Thu Feb 9 00:26:59 PST 2012


On 09/02/12 11:30, Ralph wrote:
> Hi Folks,
> 
> This isn't really a poppler issue, but I was hoping that someone on
> the list might have some experience dealing with this edge case.
> 
> The PDFs that we process with pdftoppm are regularly sized 42" x 30"
> and often contain scans of rasterized data.  Once in a while,
> whoever made the scans screws up and sets the page size to 8.5" x 11",
> even though it's *actually* 42" x 30".  At that page size, the 150 dpi
> we render at is far too low and the output quality is horrible.
> 
> There's a slight chance that the PDFs might actually be 8.5" x 11" so
> doing a brute force DPI increase might not be a good idea.  Does
> anyone have any good workflows for these sorts of mess-ups?  Or is
> scaling the DPI based on the size difference (544 x 388 -> 3168 x 2448)
> the only approach?

You could use pdfimages to extract the embedded images and check their
resolution.
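
As a rough sketch of that check (nothing here is a poppler API): once you
know an embedded image's pixel width, dividing it by the page width in
inches gives the resolution the PDF implies for the scan.  The function
names, the 600 dpi threshold and the 6300-pixel example width are
assumptions for illustration, and the sketch assumes the scan fills the
whole page.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space units per inch


def effective_dpi(image_px, page_pts):
    """Resolution implied by an image of image_px pixels spanning page_pts points."""
    return image_px / (page_pts / POINTS_PER_INCH)


def page_size_suspicious(image_px, page_pts, max_plausible_dpi=600.0):
    """True if the implied scan resolution is higher than we believe possible.

    max_plausible_dpi is a hypothetical threshold; pick one that matches
    the scanners actually in use.
    """
    return effective_dpi(image_px, page_pts) > max_plausible_dpi


if __name__ == "__main__":
    # Hypothetical 6300-pixel-wide scan on the 544.32 pt wide page.
    print(effective_dpi(6300, 544.32))
    print(page_size_suspicious(6300, 544.32))
```

If the page size does look wrong, the same effective_dpi value is a
reasonable candidate for pdftoppm's -r option, since rendering at that
resolution yields roughly one output pixel per scanned pixel.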

> 
> Here's the pdfinfo dump of the failure:
> ----
> Tagged: no
> Pages: 1
> Encrypted: no
> Page size: 544.32 x 388.8 pts
> Page rot: 0
> File size: 881061 bytes
> Optimized: no
> PDF version: 1.6
> ----
> 
> Here's a pdfinfo dump of what we expect:
> ----
> Pages: 1
> Encrypted: no
> Page size: 3168 x 2448 pts
> Page rot: 0
> File size: 523131 bytes
> Optimized: no
> PDF version: 1.4
> ----
> 
> I'd be happy to mail anyone the original files if needed (they're
> semi-sensitive, so I didn't want to post them to the list).  Thanks
> for anything :)  We *really* enjoy using poppler.
> 
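
If you do end up scaling the DPI from the page size, the arithmetic on
the two dumps above could look roughly like the following.  This is only
a sketch: the helper names are made up, the regular expression assumes
the "Page size: W x H pts" line format shown in the dumps, and taking
the smaller of the two axis ratios is one possible choice, not the only
one.

```python
import re

# Matches the "Page size: W x H pts" line as printed in the dumps above.
PAGE_SIZE_RE = re.compile(r"Page size:\s*([\d.]+)\s*x\s*([\d.]+)\s*pts")


def page_size_pts(pdfinfo_text):
    """Return (width, height) in points from pdfinfo's text output."""
    match = PAGE_SIZE_RE.search(pdfinfo_text)
    if match is None:
        raise ValueError("no 'Page size' line found")
    return float(match.group(1)), float(match.group(2))


def scaled_dpi(base_dpi, actual_pts, expected_pts):
    """Scale base_dpi by the ratio of expected to actual page size.

    The two axes may not agree exactly, so use the smaller ratio to
    avoid overshooting the expected pixel dimensions.
    """
    scale_x = expected_pts[0] / actual_pts[0]
    scale_y = expected_pts[1] / actual_pts[1]
    return base_dpi * min(scale_x, scale_y)


if __name__ == "__main__":
    actual = page_size_pts("Page size: 544.32 x 388.8 pts")
    expected = page_size_pts("Page size: 3168 x 2448 pts")
    print(scaled_dpi(150, actual, expected))
```

Note that the width and height ratios differ for these two files, so
the undersized page is not an exact scaled copy of the expected one.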


