[poppler] There is a flaw with poppler that needs to be fixed. Deleted annotations are not actually deleted. I require assistance in fixing this.

Albert Astals Cid aacid at kde.org
Sun Sep 25 18:47:42 UTC 2022


On Friday, 23 September 2022, at 20:56:09 (CEST), Leonard Rosenthol
wrote:
> Zeke – the issue here isn’t specific to deletion of annotations but is
> related to the PDF file format and its support for “incremental updates”.
> 
> Changes to a PDF can be saved by simply appending them as an incremental
> update section (which includes not only new or changed objects, but also a
> list of deleted objects).  This is the most common way to save things
> because it is faster.  You will find that 99% of all PDF processing tools
> do this by default.
> 
> Alternatively, software could do a “full save” or a “Save As”, where objects
> no longer in use are “garbage collected”.  Poppler does not offer this
> option.

Poppler does have a non-incremental save option.
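
To make the difference between the two save modes concrete, here is a
minimal sketch in a toy model. None of these names (ToyPdf, Section,
saveIncremental, saveFullRewrite, rawContains) are poppler API; they are
made up for illustration, and the model ignores the real xref and trailer
structure of a PDF file.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only (hypothetical types, not poppler code).
// One Section stands for one block of data written to the file.
struct Section {
    std::map<int, std::string> written; // new or changed objects
    std::set<int> freed;                // objects marked as deleted
};

struct ToyPdf {
    std::vector<Section> sections;

    // Incremental save: append a section, leave earlier data untouched.
    void saveIncremental(const Section &changes) { sections.push_back(changes); }

    // What a viewer sees: later sections override earlier ones.
    std::map<int, std::string> liveObjects() const {
        std::map<int, std::string> live;
        for (const Section &s : sections) {
            for (const auto &entry : s.written) {
                live[entry.first] = entry.second;
            }
            for (int num : s.freed) {
                live.erase(num);
            }
        }
        return live;
    }

    // Full rewrite: keep only the live objects in one fresh section.
    void saveFullRewrite() {
        Section fresh;
        fresh.written = liveObjects();
        sections.assign(1, fresh);
    }

    // Does the text still occur anywhere in the stored data?
    bool rawContains(const std::string &needle) const {
        for (const Section &s : sections) {
            for (const auto &entry : s.written) {
                if (entry.second.find(needle) != std::string::npos) {
                    return true;
                }
            }
        }
        return false;
    }
};
```

In this model, appending a section that frees an object hides it from
liveObjects(), yet rawContains() still finds its old text until
saveFullRewrite() is called.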

Cheers,
  Albert

> 
> Leonard
> 
> From: poppler <poppler-bounces at lists.freedesktop.org> on behalf of Zeke
> Williams <lakeleaf8 at gmail.com>
> Date: Friday, September 23, 2022 at 9:20 AM
> To: poppler at lists.freedesktop.org <poppler at lists.freedesktop.org>
> Subject: [poppler] There is a flaw with poppler that needs to be fixed.
> Deleted annotations are not actually deleted. I require assistance in
> fixing this.
> 
> 
> I need help with this poppler issue, as I am not a very proficient C++
> programmer. When you delete an annotation in a viewer such as Okular or
> Evince, poppler removes the part of the PDF document that displays the
> annotation, but the actual contents are stored in a separate part of the
> document and are not deleted. In other words, they are still there. That
> is a privacy violation that should be fixed. I believe this is the part
> of poppler that removes the annotation:
> 
> bool Annots::removeAnnot(Annot *annot)
> {
>     auto idx = std::find(annots.begin(), annots.end(), annot);
> 
>     if (idx == annots.end()) {
>         return false;
>     } else {
>         annot->decRefCnt();
>         annots.erase(idx);
>         return true;
>     }
> }
> 
> And here is how another PDF reader (PDF4QT) removes them:
> 
> void PDFDocumentBuilder::removeAnnotation(PDFObjectReference page,
>                                           PDFObjectReference annotation)
> {
>     PDFDocumentDataLoaderDecorator loader(&m_storage);
> 
>     if (const PDFDictionary* pageDictionary =
>             m_storage.getDictionaryFromObject(m_storage.getObjectByReference(page)))
>     {
>         std::vector<PDFObjectReference> annots =
>             loader.readReferenceArrayFromDictionary(pageDictionary, "Annots");
>         annots.erase(std::remove(annots.begin(), annots.end(), annotation),
>                      annots.end());
> 
>         PDFObjectFactory factory;
>         factory.beginDictionary();
>         factory.beginDictionaryItem("Annots");
>         if (!annots.empty())
>         {
>             factory << annots;
>         }
>         else
>         {
>             factory << PDFObject();
>         }
>         factory.endDictionaryItem();
>         factory.endDictionary();
> 
>         mergeTo(page, factory.takeObject());
>     }
> 
>     setObject(annotation, PDFObject());
> }
> 
> PDF4QT can be found here:
> https://github.com/JakubMelka/PDF4QT
> 
> What can we do to solve this? I think we should mimic how PDF4QT does
> it. What do you think?
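
The two snippets quoted above differ in one step: the first only drops the
annotation from an in-memory list, while the second also overwrites the
annotation object itself with a null object. A minimal sketch of that
contrast follows, using a toy document. ToyDocument, removeReferenceOnly
and removeAndClearObject are hypothetical names, not poppler or PDF4QT API,
and the sketch does not establish what poppler's higher-level code does
beyond the quoted function.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy document (hypothetical, for illustration only).
struct ToyDocument {
    std::map<int, std::string> objects; // object number -> content
    std::vector<int> pageAnnots;        // the page's annotation references
};

// Strategy A: drop only the reference from the page's annotation list.
// The annotation object itself stays in the document.
bool removeReferenceOnly(ToyDocument &doc, int annot) {
    auto it = std::find(doc.pageAnnots.begin(), doc.pageAnnots.end(), annot);
    if (it == doc.pageAnnots.end()) {
        return false;
    }
    doc.pageAnnots.erase(it);
    return true;
}

// Strategy B: drop the reference and also clear the object itself.
bool removeAndClearObject(ToyDocument &doc, int annot) {
    if (!removeReferenceOnly(doc, annot)) {
        return false;
    }
    doc.objects.erase(annot);
    return true;
}
```

Even with strategy B, an incremental save only appends the change, so the
earlier copy of the object remains in the file until it is rewritten in
full.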





