[poppler] Black boxes & Poppler

Thomas Freitag Thomas.Freitag at kabelmail.de
Sat Feb 18 07:28:44 PST 2012


On 15.02.2012 09:15, Thomas Freitag wrote:
> On 14.02.2012 23:17, Albert Astals Cid wrote:
>> On Tuesday, 14 February 2012, at 20:35:29, Thomas Freitag wrote:
>>> Hi Albert!
>>>
>>> On 14.02.2012 18:07, Ralph Gootee wrote:
>>>> Hi Thomas!
>>>>
>>>> Thanks for the help!
>>>>
>>>> Steps to reproduce
>>>>
>>>> 1) split the pdf with pdfseparate
>>>> 2) use pdftoppm to convert the output to png
>>>>
>>>> Also, Acrobat reports an error on the PDF after separation.  It's a little
>>>> confusing, but there are already black boxes in the PDF (from redaction);
>>>> the black boxes will show up in the middle after pdftoppm.
>>>>
>>>> We're really, really happy with poppler. Thanks for helping to make such
>>>> an awesome lib!
>>> We have two problems with it; one is a general problem coming from the
>>> merge:
>>>
>>> a) xRef->getNumObjects() no longer works in PDFDoc::writePageObjects
>>> after the changes from our/my merge, because last is not set there. We
>>> need to use xRef->getSize().
>> We use getNumObjects in a lot of other places; aren't those affected too?
>> Shouldn't we just revert getNumObjects to do what it did, i.e. kill the
>> last variable and just return size? What's the benefit of this last
>> variable?
> In the other places getNumObjects() will work: last is filled while the
> XRef table is created in readXRefTable, and holds the number of the
> last valid xref, whereas size is the number of allocated xrefs.
> This optimization comes from the merge. The problem in
> writePageObjects is that the xref table wasn't read but is created
> indirectly, so we must use size there, also because the xrefs are not
> necessarily created in ascending order there.
> In short: in all other cases it is okay to use getNumObjects.
I ran the regression tests on the patch. All tests pass:
1124 tests passed (100.00%)
For completeness I attach the regression-tested patch once again.
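
As a sketch of the last-versus-size distinction in the quoted explanation: the following is a toy model only, not poppler's actual XRef class. The names ToyXRef, readTable and add are hypothetical, and returning last + 1 from getNumObjects() is an assumption made for illustration.

```cpp
#include <cassert>
#include <vector>

// Toy model only: NOT poppler's XRef class. All names here are hypothetical.
struct ToyXRef {
    struct Entry { bool used = false; };
    std::vector<Entry> entries;  // entries.size() plays the role of "size"
    int last = -1;               // number of the last valid entry; only readTable() sets it

    // Sequential read, as when an existing xref table is parsed: last is maintained.
    void readTable(int count) {
        entries.assign(count, Entry{});
        for (int i = 0; i < count; ++i) {
            entries[i].used = true;
            last = i;
        }
    }

    // Indirect creation in arbitrary order: the table grows, last is never touched.
    void add(int num) {
        if (num >= static_cast<int>(entries.size())) entries.resize(num + 1);
        entries[num].used = true;
    }

    // Assumption for this sketch: the object count is derived from last.
    int getNumObjects() const { return last + 1; }
    int getSize() const { return static_cast<int>(entries.size()); }
};

// Small self-check contrasting the two ways of filling the table.
inline void checkToyXRef() {
    ToyXRef parsed;
    parsed.readTable(5);
    assert(parsed.getNumObjects() == parsed.getSize());  // both agree after a read

    ToyXRef built;
    built.add(7);
    built.add(2);
    assert(built.getNumObjects() == 0);  // last was never set
    assert(built.getSize() == 8);        // size still covers every entry
}
```

In this toy, a loop bounded by getNumObjects() would visit nothing for the table built with add(), while a loop bounded by getSize() reaches every allocated entry.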

Cheers,
Thomas
>
> Thomas
>>
>> Albert
>>
>>> b) CCITTFaxStream and DCTStream inherit from FlateStream, and
>>> FlateStream::reset "eats" the first two bytes. Therefore a call to
>>> unfilteredReset will not work in pdfseparate. As far as I can see,
>>> unfilteredReset is only called by PDFDoc::writeRawStream (or in
>>> Stream.cc itself), so I think my changes in the attached patch
>>> are safe.
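
A minimal sketch of why a reset that consumes header bytes breaks a raw copy. This is a toy model, not poppler's Stream hierarchy: ToyFlateStream, getRawChar and copyRaw are hypothetical names, and the two-byte header is simply skipped rather than parsed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy model only: NOT poppler's Stream classes. All names here are hypothetical.
class ToyFlateStream {
public:
    explicit ToyFlateStream(const std::string &raw) : raw_(raw) {}

    // Decoding reset: rewinds, then consumes the two header bytes.
    void reset() { pos_ = raw_.size() < 2 ? raw_.size() : 2; }

    // Raw reset: rewinds only, so a raw copy starts at the very first byte.
    void unfilteredReset() { pos_ = 0; }

    // Returns the next raw byte, or -1 at end of data.
    int getRawChar() {
        return pos_ < raw_.size() ? static_cast<unsigned char>(raw_[pos_++]) : -1;
    }

private:
    std::string raw_;
    std::size_t pos_ = 0;
};

// Copies all remaining raw bytes from the current position.
inline std::string copyRaw(ToyFlateStream &s) {
    std::string out;
    for (int c = s.getRawChar(); c != -1; c = s.getRawChar()) {
        out.push_back(static_cast<char>(c));
    }
    return out;
}

// Small self-check: the copy after reset() is two bytes short.
inline void checkToyFlateStream() {
    ToyFlateStream s("HDpayload");  // "HD" stands in for the two header bytes
    s.reset();
    assert(copyRaw(s) == "payload");
    s.unfilteredReset();
    assert(copyRaw(s) == "HDpayload");
}
```

If the raw reset is routed through the decoding reset, the copied stream loses its first two bytes, which is the failure mode described in b).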
>>>
>>> a) is the reason why I am sending this patch immediately instead of
>>> waiting until the weekend: pdfseparate and pdfunite no longer work on
>>> the HEAD revision.
>>>
>>> @Ralph: You need to check out the HEAD revision and apply this patch
>>> if you want to test it immediately.
>>>
>>> Cheers,
>>> Thomas
>>>
>>>> Cheers,
>>>> Ralph G.
>> _______________________________________________
>> poppler mailing list
>> poppler at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/poppler

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pdfseparate.diff
Type: text/x-patch
Size: 2413 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20120218/5f848894/attachment.bin>

