[poppler] Scanned images in PDFs and DPI
Adrian Johnson
ajohnson at redneon.com
Thu Feb 9 00:26:59 PST 2012
On 09/02/12 11:30, Ralph wrote:
> Hi Folks,
>
> This isn't really a poppler issue, but I was hoping that someone on
> the list might have some experience dealing with this edge case.
>
> The PDFs that we process with pdftoppm are regularly sized 42" x 30"
> and often contain scans of rasterized data. Once in a while,
> whoever made the scans screws up and puts in 8.5 x 11, even though
> it's *actually* 42" x 30". Obviously the 150 dpi is far too low and
> the output quality is horrible.
>
> There's a slight chance that the PDFs might actually be 8.5" x 11" so
> doing a brute force DPI increase might not be a good idea. Does
> anyone have any good workflows for these sorts of mess-ups? Or is
> scaling the DPI based on the size difference (544 x 388 -> 3168x2448)
> the only approach?
You could use pdfimages to extract the embedded images and check their
resolution.
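
A rough sketch of that check in Python, assuming the scan fills the
whole page and the default of 72 points per inch. The function names
and the 6300 x 4500 pixel image size are made up for illustration;
only the page sizes come from the pdfinfo dumps quoted below.

```python
import re

POINTS_PER_INCH = 72.0  # default PDF user-space units per inch


def parse_page_size(pdfinfo_text):
    """Return (width_pts, height_pts) from the first 'Page size' field."""
    m = re.search(r"Page size:\s*([\d.]+)\s*x\s*([\d.]+)\s*pts", pdfinfo_text)
    if m is None:
        raise ValueError("no 'Page size' field found")
    return float(m.group(1)), float(m.group(2))


def effective_dpi(image_px, page_pts):
    """Per-axis DPI of an image stretched over the whole page."""
    (px_w, px_h), (pt_w, pt_h) = image_px, page_pts
    return (px_w * POINTS_PER_INCH / pt_w, px_h * POINTS_PER_INCH / pt_h)


def scaled_render_dpi(page_pts, expected_pts, base_dpi=150.0):
    """Render DPI that gives a mis-sized page the same pixel width
    the expected page size would get at base_dpi (width only)."""
    return base_dpi * expected_pts[0] / page_pts[0]


bad_page = parse_page_size("Page size: 544.32 x 388.8 pts")
good_page = (3168.0, 2448.0)

# Hypothetical scan: 42" x 30" at 150 dpi = 6300 x 4500 pixels.
dpi_x, dpi_y = effective_dpi((6300, 4500), bad_page)
print("effective image dpi: %.1f x %.1f" % (dpi_x, dpi_y))
print("render dpi to try: %.1f" % scaled_render_dpi(bad_page, good_page))
```

An effective resolution far above what the scanner could plausibly
produce suggests the page size is wrong rather than the scan, and the
scaled value could then be passed to pdftoppm with -r. A page that
really is 8.5" x 11" would show an ordinary image resolution and could
be left at 150 dpi.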
>
> Here's the pdfinfo dump of the failure:
> ----
> Tagged: no
> Pages: 1
> Encrypted: no
> Page size: 544.32 x 388.8 pts
> Page rot: 0
> File size: 881061 bytes
> Optimized: no
> PDF version: 1.6
> ----
>
> Here's a pdf dump of what we expect:
> ----
> Pages: 1
> Encrypted: no
> Page size: 3168 x 2448 pts
> Page rot: 0
> File size: 523131 bytes
> Optimized: no
> PDF version: 1.4
> ----
>
> I'd be happy to mail anyone the original files if needed (they're
> semi-sensitive, so I didn't want to post them to the list). Thanks
> for anything :) We *really* enjoy using poppler.
>