[poppler] Problem with pdfimages

Adrian Johnson ajohnson at redneon.com
Tue Dec 30 03:02:15 PST 2008


Jean-Claude REPETTO wrote:
> Adrian Johnson wrote :
>>
>> The PDF files do not contain any images (except for the thumbnail 
>> images). The scanned image has been converted to a vector format by 
>> drawing lots of closely spaced parallel lines. You can see this if you 
>> zoom in on the page. I assume this is for printing on a pen plotter.
>>
>> You can create an image from each PDF page with pdftoppm. After 
>> converting to PNG, this gives a much smaller image that renders 
>> faster, but it loses the shading effects that were created by the 
>> hatching.
>>
> 
> Hello Adrian,
> 
> Thanks for the explanation. Is there a tool I could use to display the 
> number of parallel lines and their coordinates?
> 
> Thanks,
> Jean-Claude

I used pdftk to uncompress the PDF, then opened it in emacs. The content 
stream contains several million lines similar to the following:

   1420.32 1983.558 m
   1420.56 1983.558 l
   1420.32 1983.318 m
   1420.56 1983.318 l
   1542 1981.158 m
   1542.24 1981.158 l
   747.36 1978.279 m
   747.6 1978.279 l
   S

"x y m" starts a new subpath, setting the current point to (x, y)
"x y l" appends a straight line from the current point to (x, y)
"S"     strokes the path
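
If you want the count and the coordinates rather than just a look at the
stream, a short script can probably do it. The following is only a rough
Python sketch, not a proper PDF parser. It assumes the file has already been
uncompressed (for example with "pdftk in.pdf output out.pdf uncompress",
where the file names are placeholders) and that every operator sits on its
own line as in the excerpt above. The name extract_segments is my own, and
the sketch ignores the current transformation matrix and every path operator
other than m, l and S, so the coordinates it reports are in the content
stream's own user space.

```python
import re
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

# Matches a line holding exactly "x y m" or "x y l", as in the excerpt above.
_NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)"
_PATH_OP = re.compile(rf"^\s*({_NUMBER})\s+({_NUMBER})\s+([ml])\s*$")


def extract_segments(stream_text: str) -> List[Segment]:
    """Collect the straight segments drawn by m/l pairs (hypothetical helper)."""
    segments: List[Segment] = []
    current: Optional[Point] = None  # current point of the path, if any
    for line in stream_text.splitlines():
        match = _PATH_OP.match(line)
        if match is None:
            if line.strip() == "S":
                current = None  # stroking ends the path
            continue  # every other operator is ignored in this sketch
        point = (float(match.group(1)), float(match.group(2)))
        if match.group(3) == "l" and current is not None:
            segments.append((current, point))
        current = point  # both m and l move the current point
    return segments


sample = """\
   1420.32 1983.558 m
   1420.56 1983.558 l
   1420.32 1983.318 m
   1420.56 1983.318 l
   S
"""

segments = extract_segments(sample)
print(len(segments), "segments")
for (x0, y0), (x1, y1) in segments:
    print(x0, y0, "->", x1, y1)
```

To run it on a real file, reading the text with
open(path, encoding="latin-1").read() should be enough, since that encoding
accepts any byte values left in the uncompressed PDF. With several million
lines, printing every segment will be slow, so writing them to a file or
only printing the count may be more practical.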



