[Poppler-bugs] [Bug 27450] New: fails to save PDF form data properly when PDF has object streams

bugzilla-daemon at freedesktop.org
Sun Apr 4 06:26:11 PDT 2010


https://bugs.freedesktop.org/show_bug.cgi?id=27450

           Summary: fails to save PDF form data properly when PDF has
                    object streams
           Product: poppler
           Version: unspecified
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: general
        AssignedTo: poppler-bugs at lists.freedesktop.org
        ReportedBy: carlosgc at gnome.org


Bug forwarded from Evince: https://bugzilla.gnome.org/show_bug.cgi?id=614740

"I haven't tried reproducing this in a newer version of evince (my Debian
unstable system has GNOME 2.28), but the problem is easy to reproduce and test.

evince saves PDF form data by appending to the PDF, which is a perfectly valid
way to do it, but it makes a few mistakes in appending to the file when object
streams are used.  The effect is that the resulting file loads into evince
without the form data, and Adobe Reader can't open the file at all.  This bug
report uses qpdf (http://qpdf.sourceforge.net) to check and manipulate the PDF
file.  qpdf is available in Debian and Ubuntu or can be downloaded from
sourceforge.  It has only pcre and zlib as external dependencies.

The first attached pdf file (form1.pdf) can be downloaded from here:

http://www.soest.hawaii.edu/gg/isotope_biogeochem/Samplerequest.htm

This file contains no object streams even though it is a PDF 1.5 file.  Filling
in the form and saving it works fine.  The resulting file is appended, and the
/Size field of the trailer dictionary is set properly to 1 more than the
highest numbered object.  Everything is fine.  The file is attached as
form1-saved.pdf.

Now consider the same file with object streams.  You can get this with

qpdf --object-streams=generate form1.pdf form2.pdf

This time, filling in the form and saving it (as form2-saved.pdf) causes
several problems.  For one thing, the /Size field in the new trailer
dictionary is wrong: it is equal to the highest object number instead of one
above it.  If you run qpdf --check form2-saved.pdf, you get

WARNING: /home/ejb/Documents/form2-saved.pdf: reported number of objects (237)
inconsistent with actual number of objects (238)
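
As a rough illustration of the /Size rule described above, here is a minimal
Python sketch.  It is not poppler or qpdf code; the function names are made up
for this example, and the object range is assumed from the counts in the qpdf
warning (objects 0 through 237).

```python
def expected_size(object_numbers):
    """Return the /Size a trailer should report: highest object number + 1."""
    return max(object_numbers) + 1


def size_is_consistent(reported_size, object_numbers):
    """Compare a trailer's /Size with the value implied by the objects present."""
    return reported_size == expected_size(object_numbers)


# Objects 0 through 237, assumed from the counts in the qpdf warning.
objects = range(0, 238)
print(expected_size(objects))            # the value /Size should have
print(size_is_consistent(237, objects))  # checks the /Size that was written
```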

When you open the file with evince, you get lots of errors about referencing
invalid or non-existent objects, and the file opens without the form data. 
This happens even if you manually edit the file to change /Size to 238.

The xref table is also wrong.  The generation numbers appear to be object
stream offset values carried over from the original PDF.  In
form2-saved.pdf, observe lines like

0000053156 00064 n

in the xref table and corresponding objects like 66 64 obj.  If you manually
change all the generation numbers to 0, both in the xref table and in the
object headers in the PDF file itself, the file becomes correct and the saved
form data is accessible.
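
To spot such entries without reading the table by eye, something like the
following Python sketch could be used.  It assumes the classic xref entry
layout (10-digit offset, 5-digit generation, then n or f); the helper names
are hypothetical.  A nonzero generation is legal in PDF in general, so the
check is only a heuristic for files like this one, where every generation was
expected to be 0.

```python
import re

# Classic xref entry: 10-digit byte offset, 5-digit generation, 'n' or 'f'.
XREF_ENTRY = re.compile(r"^(\d{10}) (\d{5}) ([nf])\s*$")


def parse_xref_entry(line):
    """Split one xref table entry into (offset, generation, kind)."""
    match = XREF_ENTRY.match(line)
    if match is None:
        raise ValueError("not an xref entry: %r" % line)
    return int(match.group(1)), int(match.group(2)), match.group(3)


def has_suspicious_generation(line):
    """Flag in-use entries whose generation is not 0 (heuristic only)."""
    _offset, generation, kind = parse_xref_entry(line)
    return kind == "n" and generation != 0


print(parse_xref_entry("0000053156 00064 n"))
print(has_suspicious_generation("0000053156 00064 n"))
```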

So whatever generates the appended data apparently needs to be updated to
support object streams and to understand the meaning of the fields in the
xref stream.
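
For reference, the field meanings in question can be sketched as below.  This
follows the cross-reference stream layout in the PDF specification (entry
types 0, 1 and 2, with field widths taken from the /W array); the function
name, the [1, 2, 1] widths and the sample bytes are assumptions made for
illustration, not values read from the attached files.  The point is that
for a type 2 entry the third field is an index within an object stream, not
a generation number, so copying it into the generation slot of an appended
xref table would plausibly produce entries like the one shown above.

```python
def decode_xref_stream_entry(raw, widths):
    """Decode one xref stream entry given the byte widths from /W."""
    fields = []
    pos = 0
    for width in widths:
        # A zero-width field decodes to 0 here; the type default is fixed below.
        fields.append(int.from_bytes(raw[pos:pos + width], "big"))
        pos += width
    entry_type, second, third = fields
    if widths[0] == 0:
        entry_type = 1  # the type field defaults to 1 when its width is 0

    if entry_type == 0:
        return {"kind": "free", "next_free": second, "generation": third}
    if entry_type == 1:
        return {"kind": "in_use", "offset": second, "generation": third}
    if entry_type == 2:
        # Objects stored in object streams always have generation 0.
        return {"kind": "compressed", "stream_object": second,
                "index": third, "generation": 0}
    raise ValueError("unknown xref stream entry type %d" % entry_type)


# Hypothetical entries using /W [1 2 1].
print(decode_xref_stream_entry(bytes([2, 0, 10, 64]), [1, 2, 1]))
print(decode_xref_stream_entry(bytes([1, 0xCF, 0xA4, 0]), [1, 2, 1]))
```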

My manually repaired file is form2-fixed.pdf.

I will attach the five PDF files momentarily."

I confirm that this is reproducible with current git master.  The test cases
are attached to the original bug report.
