<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"> <head> <meta http-equiv="Content-Type" content="text/html; charset=Windows-1252"> <meta name="Generator" content="Microsoft Word 15 (filtered medium)"> <style></style> </head> <body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word"> <div class="WordSection1">
Zeke, the issue here isn’t specific to deleting annotations; it comes from the PDF file format and its support for “incremental updates”.

When changes to a PDF are saved, they can simply be appended to the file as an incremental update section, which contains the new or changed objects plus a list of deleted objects. The earlier bytes of the file are left in place, so the data of a deleted object is still physically present. This is the most common way to save because it is faster. You will find that 99% of all PDF processing tools do this by default.

Alternatively, software could do a “full save” or a “Save As”, where objects no longer in use are “garbage collected”. Poppler does not offer this option.

Leonard

<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
From: poppler &lt;poppler-bounces@lists.freedesktop.org&gt; on behalf of Zeke Williams &lt;lakeleaf8@gmail.com&gt;
Date: Friday, September 23, 2022 at 9:20 AM
To: poppler@lists.freedesktop.org &lt;poppler@lists.freedesktop.org&gt;
Subject: [poppler] There is a flaw with poppler that needs to be fixed. Deleted annotations are not actually deleted. I require assistance in fixing this.
</div> <div>
I require assistance with this issue in poppler, as I am not a very proficient C++ programmer.
When you delete an annotation in a viewer such as Okular or Evince, poppler removes the part of the PDF document that displays the annotation, but the actual contents are stored in a separate part of the document and are not deleted. In other words, the data is still there. That is a privacy violation that should be fixed.

I believe this is the part of poppler that removes the annotation:
<pre>
bool Annots::removeAnnot(Annot *annot)
{
    auto idx = std::find(annots.begin(), annots.end(), annot);
    if (idx == annots.end()) {
        return false;
    } else {
        annot-&gt;decRefCnt();
        annots.erase(idx);
        return true;
    }
}
</pre>
And here is how another PDF reader (PDF4QT) removes them:
<pre>
void PDFDocumentBuilder::removeAnnotation(PDFObjectReference page, PDFObjectReference annotation)
{
    PDFDocumentDataLoaderDecorator loader(&amp;m_storage);

    if (const PDFDictionary* pageDictionary = m_storage.getDictionaryFromObject(m_storage.getObjectByReference(page)))
    {
        std::vector&lt;PDFObjectReference&gt; annots = loader.readReferenceArrayFromDictionary(pageDictionary, "Annots");
        annots.erase(std::remove(annots.begin(), annots.end(), annotation), annots.end());

        PDFObjectFactory factory;
        factory.beginDictionary();
        factory.beginDictionaryItem("Annots");
        if (!annots.empty())
        {
            factory &lt;&lt; annots;
        }
        else
        {
            factory &lt;&lt; PDFObject();
        }
        factory.endDictionaryItem();
        factory.endDictionary();

        mergeTo(page, factory.takeObject());
    }

    setObject(annotation, PDFObject());
}
</pre>
PDF4QT can be found here: <a href="https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FJakubMelka%2FPDF4QT&data=05%7C01%7Clrosenth%40adobe.com%7C20fe244e683442cbddc008da9d665505%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C637995360231784319%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=NMLVc%2Fbwtyjm0UxtXlqtIEs9eaBU%2BO%2F%2FNaCevw%2F%2Bz8E%3D&reserved=0">
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FJakubMelka%2FPDF4QT&data=05%7C01%7Clrosenth%40adobe.com%7C20fe244e683442cbddc008da9d665505%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C637995360231784319%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=NMLVc%2Fbwtyjm0UxtXlqtIEs9eaBU%2BO%2F%2FNaCevw%2F%2Bz8E%3D&reserved=0</a>

What can we do to solve this? I think we should mimic how PDF4QT does it. What do you think?
</div> </div> </body> </html>