[poppler] There is a flaw with poppler that needs to be fixed. Deleted annotations are not actually deleted. I require assistance in fixing this.

Zeke Williams lakeleaf8 at gmail.com
Fri Sep 23 19:21:00 UTC 2022


Thank you, Leonard, for the clarification. Perhaps poppler could offer
this option as well: in addition to incremental updates, a "Save As" with
garbage collection that cleans out obsolete parts of the document.
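
To make the idea concrete, here is a rough sketch of what such a garbage
collection pass amounts to: keep only the objects that are still reachable
from the document root and drop everything else when rewriting the file.
This is a toy model under my own assumptions, not poppler code. ToyObject,
ToyDocument, reachableFrom and collectGarbage are hypothetical names, and
an object is reduced to the list of object numbers it references.

```cpp
#include <map>
#include <set>
#include <vector>

// Toy stand-in for a PDF indirect object: only the object numbers it
// references. (Hypothetical model; poppler's real classes look different.)
struct ToyObject {
    std::vector<int> refs;
};

// Object number -> object, standing in for the cross-reference table.
using ToyDocument = std::map<int, ToyObject>;

// "Mark" phase: collect every object number reachable from `root`.
std::set<int> reachableFrom(const ToyDocument &doc, int root)
{
    std::set<int> seen;
    std::vector<int> stack{root};
    while (!stack.empty()) {
        int num = stack.back();
        stack.pop_back();
        auto it = doc.find(num);
        if (it == doc.end() || !seen.insert(num).second) {
            continue; // dangling reference, or already visited
        }
        for (int ref : it->second.refs) {
            stack.push_back(ref);
        }
    }
    return seen;
}

// "Sweep" phase: build a fresh document holding only reachable objects.
ToyDocument collectGarbage(const ToyDocument &doc, int root)
{
    ToyDocument out;
    for (int num : reachableFrom(doc, root)) {
        out.emplace(num, doc.at(num));
    }
    return out;
}
```

In this model, an annotation object whose reference has been removed from
the page's Annots array is no longer reachable from the root, so it would
not be copied into the rewritten document.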

On Fri, Sep 23, 2022 at 2:56 PM Leonard Rosenthol <lrosenth at adobe.com> wrote:
>
> Zeke – the issue here isn’t specific to deletion of annotations but is related to the PDF file format and its support for “incremental updates”.
>
>
>
> When saving changes to a PDF, they can either be saved by simply appending them as part of an incremental update section (which includes not only new or changed objects, but also a list of deleted objects).  This is the most common way to save things because it is faster.  You will find that 99% of all PDF processing tools do this by default.
>
>
>
> Alternatively, software could do a “full save” or a “Save As”, where objects no longer in use are “garbage collected”.  Poppler does not offer this option.
>
>
>
> Leonard
>
>
>
> From: poppler <poppler-bounces at lists.freedesktop.org> on behalf of Zeke Williams <lakeleaf8 at gmail.com>
> Date: Friday, September 23, 2022 at 9:20 AM
> To: poppler at lists.freedesktop.org <poppler at lists.freedesktop.org>
> Subject: [poppler] There is a flaw with poppler that needs to be fixed. Deleted annotations are not actually deleted. I require assistance in fixing this.
>
>
> I need assistance with this poppler issue, as I am not a very proficient
> C++ programmer. When you delete an annotation in a viewer such as okular
> or evince, poppler removes the part of the PDF document that displays the
> annotation, but the actual contents are stored in a separate part of the
> document and are not deleted. In other words, they are still there. That
> is a privacy violation that should be fixed. I believe this is the part
> of poppler that removes the annotation:
>
> bool Annots::removeAnnot(Annot *annot)
> {
>     auto idx = std::find(annots.begin(), annots.end(), annot);
>
>     if (idx == annots.end()) {
>         return false;
>     } else {
>         annot->decRefCnt();
>         annots.erase(idx);
>         return true;
>     }
> }
>
> And from another PDF reader (PDF4QT) here is how it removes them:
>
> void PDFDocumentBuilder::removeAnnotation(PDFObjectReference page, PDFObjectReference annotation)
> {
>     PDFDocumentDataLoaderDecorator loader(&m_storage);
>
>     if (const PDFDictionary* pageDictionary = m_storage.getDictionaryFromObject(m_storage.getObjectByReference(page)))
>     {
>         std::vector<PDFObjectReference> annots = loader.readReferenceArrayFromDictionary(pageDictionary, "Annots");
>         annots.erase(std::remove(annots.begin(), annots.end(), annotation), annots.end());
>
>         PDFObjectFactory factory;
>         factory.beginDictionary();
>         factory.beginDictionaryItem("Annots");
>         if (!annots.empty())
>         {
>             factory << annots;
>         }
>         else
>         {
>             factory << PDFObject();
>         }
>         factory.endDictionaryItem();
>         factory.endDictionary();
>
>         mergeTo(page, factory.takeObject());
>     }
>
>     setObject(annotation, PDFObject());
> }
>
> PDF4QT can be found here: https://github.com/JakubMelka/PDF4QT
>
> What can we do to solve this? I think we should mimic how PDF4QT does
> it. What do you think?

