[poppler] Scanned images in PDFs and DPI

Ralph ralph at plangrid.com
Wed Feb 8 17:00:33 PST 2012


Hi Folks, 

This isn't really a poppler issue, but I was hoping that someone on the list might have some experience dealing with this edge case.

The PDFs that we process with pdftoppm are regularly sized 42" x 30" and often contain scans of rasterized data.  Once in a while, whoever made the scans screws up and sets the page size to 8.5" x 11", even though the sheet is *actually* 42" x 30".  For a page that small, the 150 dpi we render at is far too low and the output quality is horrible.

There's a slight chance that a PDF really is 8.5" x 11", so a brute-force DPI increase might not be a good idea.  Does anyone have any good workflows for these sorts of mess-ups?  Or is scaling the DPI based on the size difference (544 x 388 -> 3168 x 2448) the only approach?
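To make the scaling idea concrete, here is a minimal Python sketch of that arithmetic. The function name scaled_dpi and its parameters are made up for illustration; it assumes 150 dpi is the normal render setting and that the expected page size is known in advance. It cannot tell a mislabeled scan from a page that really is small, so that decision still has to come from somewhere else.

```python
def scaled_dpi(page_pts, expected_pts, base_dpi=150.0):
    """Return a DPI that renders a too-small page at roughly the pixel
    size the expected page would get at base_dpi.

    page_pts and expected_pts are (width, height) tuples in PDF points.
    """
    width, height = page_pts
    exp_w, exp_h = expected_pts
    # The two pages may not share an aspect ratio, so take the smaller
    # ratio: neither output dimension then exceeds the expected one.
    scale = min(exp_w / width, exp_h / height)
    return base_dpi * scale


# Page sizes taken from the two pdfinfo dumps in this message.
dpi = scaled_dpi((544.32, 388.8), (3168, 2448))
print(round(dpi))  # about 873
```

The result could then be passed to pdftoppm through its -r option.  Note that the two page sizes here have slightly different aspect ratios (1.40 vs. 1.29), which is why the sketch picks the smaller of the two ratios rather than assuming they match.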

Here's the pdfinfo dump of the failure:
----
Tagged: no
Pages: 1
Encrypted: no
Page size: 544.32 x 388.8 pts
Page rot: 0
File size: 881061 bytes
Optimized: no
PDF version: 1.6
----

Here's the pdfinfo dump of what we expect:
----
Pages: 1
Encrypted: no
Page size: 3168 x 2448 pts
Page rot: 0
File size: 523131 bytes
Optimized: no
PDF version: 1.4
----
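For anyone who wants to check page sizes in bulk, here is a rough Python sketch that pulls the "Page size" line out of pdfinfo text like the dumps above and converts it to inches (72 points per inch).  The helper names page_size_pts and page_size_inches are hypothetical, and the regex only assumes the "W x H pts" layout shown here.

```python
import re

# Matches lines such as "Page size: 544.32 x 388.8 pts"
PAGE_SIZE_RE = re.compile(
    r"^Page size:\s*([\d.]+)\s*x\s*([\d.]+)\s*pts", re.MULTILINE
)


def page_size_pts(pdfinfo_text):
    """Return (width, height) in points from pdfinfo output text."""
    match = PAGE_SIZE_RE.search(pdfinfo_text)
    if match is None:
        raise ValueError("no 'Page size' line found")
    return float(match.group(1)), float(match.group(2))


def page_size_inches(pdfinfo_text):
    """Return (width, height) in inches; one inch is 72 PDF points."""
    width, height = page_size_pts(pdfinfo_text)
    return width / 72.0, height / 72.0


sample = "Pages: 1\nPage size: 3168 x 2448 pts\nPage rot: 0\n"
print(page_size_inches(sample))  # (44.0, 34.0)
```

A size check like this could flag pages well below the usual sheet size for manual review instead of rescaling them blindly.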

I'd be happy to mail anyone the original files if needed (they're semi-sensitive, so I didn't want to post them to the list).  Thanks for anything :)  We *really* enjoy using poppler.

-- 
Ralph



