[poppler] Using PDF tools to extract links from images?

Wed Jan 9 21:59:47 UTC 2019

On Wednesday, 9 January 2019, at 21:29:59 CET, John LaCour wrote:
> I have a number of PDFs that contain images that are hyperlinks to URIs. For example, I see this in the PDF file:
> 
> <</Subtype/Link/Rect[ 167.3 515.83 429.43 561.1] /BS<</W 0>>/F 4/A<</Type/Action/S/URI/URI(https://host.com/path/file.php) >>/StructParent 1>>
> 
> I want to extract the URI: https://host.com/path/file.php
> 
> It would be easy enough to parse out the URL if I could get the tools to output something containing the URI, but none of the poppler PDF tools seems to output it.

Maybe pdftohtml?
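
A minimal sketch of how that could be tried from Python, assuming pdftohtml is installed and on the PATH. The helper names (LinkCollector, extract_links, pdf_links) are hypothetical and only for illustration. Whether links placed over images actually show up in the pdftohtml output is an assumption that needs checking against the real files.

```python
import subprocess
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(markup):
    """Return all href targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links


def pdf_links(pdf_path):
    """Run pdftohtml on pdf_path and return the link targets in its output.

    -stdout writes the HTML to standard output, -noframes produces a single
    document, and -i skips writing image files.
    """
    result = subprocess.run(
        ["pdftohtml", "-stdout", "-noframes", "-i", pdf_path],
        capture_output=True,
        check=True,
    )
    html = result.stdout.decode("utf-8", errors="replace")
    return extract_links(html)
```

Calling pdf_links("file.pdf") (a placeholder file name) would return a list of strings. If the list comes back empty, the links are probably not being emitted for image-only regions, and a different approach is needed.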

Cheers,
  Albert

> Is there any way to extract this information with the poppler PDF tools? Am I missing something?
> 
> Is there a PDF tool that will just give a raw dump of the contents (handling the decompression, field markers, etc.)?
> 
> Thanks
> John
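
For the raw-dump question, one rough way to look for URI actions without any PDF library is to scan the file bytes directly and also try to inflate each stream. The sketch below is not a poppler feature and not a real PDF parser. It assumes the file is unencrypted, that compressed objects use plain Flate compression without predictors, and that URIs are written as literal strings in parentheses, as in the example above. The names find_uris and scan_pdf_bytes are hypothetical.

```python
import re
import zlib

# "/URI" followed by a literal string in parentheses; escaped characters
# such as "\)" are allowed inside the string.
URI_RE = re.compile(rb"/URI\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)

# The bytes between the "stream" and "endstream" keywords of an object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def find_uris(data):
    """Return the URI strings found in a chunk of uncompressed PDF bytes."""
    found = []
    for match in URI_RE.finditer(data):
        # Undo only the simplest escapes: \( \) and \\
        raw = re.sub(rb"\\([()\\])", rb"\1", match.group(1))
        found.append(raw.decode("latin-1"))
    return found


def scan_pdf_bytes(data):
    """Scan raw PDF bytes, plus every stream that inflates, for URI strings."""
    uris = find_uris(data)
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the end-of-line bytes that usually
            # sit between the compressed data and "endstream".
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data, or a filter this sketch ignores
        uris.extend(find_uris(inflated))
    return uris
```

Reading a file with open("file.pdf", "rb").read() (a placeholder file name) and passing the bytes to scan_pdf_bytes would return the URIs it can see. Anything stored with other filters or encryption would be missed.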