[poppler] Compatibility between poppler's pdfunite and JHOVE.

Russell McOrmond Russell.McOrmond at canadiana.ca
Fri Apr 7 19:44:59 UTC 2017


Replying to https://lists.freedesktop.org/archives/poppler/2017-April/012147.html

On Fri, Apr 7, 2017 at 1:33 PM, Leonard Rosenthol
<lrosenth at adobe.com> wrote:

> Can I assume that you are aware that JHOVE is NOT a PDF validator in any way?  In addition, its support for modern PDF features is quite out of date!  And their own site (<http://jhove.openpreservation.org/modules/pdf/>) says as much.  I suspect that if you ran these files through a more thorough PDF validation, such as the one in Adobe Acrobat Pro, it would not report any problems.
>
> Leonard


  Canadiana runs a preservation platform. We want to identify and
disallow files that aren't encoded in the publicly documented format
or that use features that aren't appropriate for long-term
preservation (PDF/A).  What we need is a tool to take multiple PDF/A
files and join them together, with the result also being a PDF/A file.
This is something I presumed pdfunite could do, but that might not be
the case.

  If it turns out that poppler is using features that are
inappropriate for preservation, then we would need to discontinue
our use of poppler.  In that case it would help if JHOVE's messages
identified the specific feature poppler is using that causes the
problem (the current messages aren't very helpful).  At this point I
do not know if the problem is in poppler or JHOVE (or both).


http://verapdf.org/software/ is far more verbose in its XML output,
but I didn't think its messages would be as helpful.  The text format
output offers a simple pass/fail.

russell at russell-desktop:/opt/wip/Temp/rwm$ verapdf --format text MississaugaNews_2/0001.pdf
PASS /opt/wip/Temp/rwm/MississaugaNews_2/0001.pdf
russell at russell-desktop:/opt/wip/Temp/rwm$ verapdf --format text MississaugaNews_2/0002.pdf
PASS /opt/wip/Temp/rwm/MississaugaNews_2/0002.pdf
russell at russell-desktop:/opt/wip/Temp/rwm$ verapdf --format text pdfunite.pdf
FAIL /opt/wip/Temp/rwm/pdfunite.pdf
russell at russell-desktop:/opt/wip/Temp/rwm$
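
The checks above could be scripted along the following lines. This is
only a rough sketch: it assumes the text output always consists of lines
of the form "PASS <path>" or "FAIL <path>", as in the session above, and
the function names (parse_verapdf_text, check_files) are made up for
illustration and are not part of veraPDF.

```python
import subprocess
from typing import Dict, List


def parse_verapdf_text(output: str) -> Dict[str, bool]:
    """Map each reported path to True (PASS) or False (FAIL).

    Assumes one "PASS <path>" or "FAIL <path>" line per file;
    any other line is ignored.
    """
    results: Dict[str, bool] = {}
    for line in output.splitlines():
        # Split on the first space only, so paths containing spaces survive.
        status, _, path = line.strip().partition(" ")
        if status in ("PASS", "FAIL") and path:
            results[path] = status == "PASS"
    return results


def check_files(paths: List[str], verapdf: str = "verapdf") -> Dict[str, bool]:
    """Run `verapdf --format text <path>` for each path and collect results."""
    results: Dict[str, bool] = {}
    for path in paths:
        proc = subprocess.run(
            [verapdf, "--format", "text", path],
            capture_output=True,
            text=True,
            check=False,
        )
        parsed = parse_verapdf_text(proc.stdout)
        if parsed:
            results.update(parsed)
        else:
            # No recognizable PASS/FAIL line: treat the file as a failure.
            results[path] = False
    return results
```

Calling check_files() with the two source pages and the pdfunite output
would then show in one step whether the inputs pass while the joined file
fails. It still gives no detail about which feature caused the failure.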

-- 
System Administration and software developer,
Canadiana.org   http://www.canadiana.ca

