[poppler] Help with pdfimages on USGS maps

Albert Astals Cid aacid at kde.org
Thu Mar 19 15:08:32 PDT 2009


On Thursday, 19 March 2009, Phil Endecott wrote:
> Dear All,
>
> USGS maps can be downloaded, free, from their web site at
> http://store.usgs.gov/.  Here's an example of what you can get (17
> MByte file): http://chezphil.org/tmp/Boston_South_K42071C1_geo.PDF.
> It's a PDF from which pdfimages will happily extract a few hundred JPEGs.
>
> What I'd like to do is to assemble a single large raster image (TIFF,
> JPEG, whatever) at the natural resolution of those embedded images.
> That means assembling those few hundred JPEG images in the right
> pattern.  And I'd like to be able to do that automatically for a large
> number of these files.  So:
>
> - Does pdfimages write out the images in an order that has some
> guaranteed relationship to the position of the images on the page?

I'm almost sure it just outputs them in the order they are found in the PDF's
drawing commands, which has nothing to do with their position on the page.

> - Can pdfimages be hacked to output some hint of the positions of each
> image?

That should not be very difficult, but you would need to be a coder to do it.
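
For what it's worth, here is a rough sketch of the geometry such a hack would
have to report. In PDF an image is painted into the unit square of user space,
so the current transformation matrix (CTM) in effect when the image is drawn
determines where it lands on the page. The function name and the example
matrix below are made up for illustration; this is not existing pdfimages code.

```python
def image_bbox(ctm):
    """Return (xmin, ymin, xmax, ymax) in page units for an image drawn with ctm.

    ctm is the PDF matrix [a, b, c, d, e, f]; a point (x, y) maps to
    (a*x + c*y + e, b*x + d*y + f).
    """
    a, b, c, d, e, f = ctm
    # An image always occupies the unit square before the CTM is applied.
    corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
    xs = [a * x + c * y + e for x, y in corners]
    ys = [b * x + d * y + f for x, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical tile: 256 x 128 points, lower-left corner at (72, 500).
print(image_bbox([256, 0, 0, 128, 72, 500]))  # (72, 500, 328, 628)
```

Sorting the extracted images by these boxes would give the tiling pattern,
assuming the tiles do not overlap.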

> - Alternatively, if I have to use e.g. pdftoppm rather than
> pdfimages, can I somehow determine the correct resolution to tell pdftoppm
> to use to get the natural resolution of the embedded images?

I don't think so, because each image could have a different natural resolution.
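
To make that concrete: an image's natural resolution follows from its size in
pixels and the size it is drawn at on the page, where PDF user space defaults
to 72 units per inch. A minimal sketch with hypothetical numbers (the function
name and the tile sizes are invented, not taken from the USGS file):

```python
def natural_dpi(pixel_width, pixel_height, drawn_width_pt, drawn_height_pt):
    """Return (x_dpi, y_dpi) at which one image pixel maps to one output pixel."""
    points_per_inch = 72.0  # default PDF user-space unit is 1/72 inch
    return (pixel_width * points_per_inch / drawn_width_pt,
            pixel_height * points_per_inch / drawn_height_pt)


# Two hypothetical images on the same page with different natural resolutions.
print(natural_dpi(1024, 512, 256, 128))  # (288.0, 288.0)
print(natural_dpi(600, 600, 288, 288))   # (150.0, 150.0)
```

If every image on a page gives the same value, that would be the number to
pass to pdftoppm with -r; if they differ, there is no single correct answer.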

Albert

>
> Any suggestions would be much appreciated.  If you're curious, a few
> years ago before the USGS entered the "internet age" a large number of
> digitised maps were obtained in TIFF format on DVDs by the Libre Map
> Project.  Although the intention was to get everything there were a few
> gaps, including the whole of the state of Massachusetts; one theory is
> that that DVD was lost somewhere along the line.  Now, USGS has an
> online store where the maps can be downloaded free - but they're now
> PDFs, not TIFFs.  So I'd like to be able to convert these PDFs into
> TIFFs that can be used alongside the old ones.
>
> There is also the question of the geo-location data which is embedded
> in there too, somehow.  But I'll worry about that later.
>
>
> Many thanks,
>
> Phil.



