[Libreoffice-bugs] [Bug 142359] New: ACCESSIBILITY: Language tagging is lost when merging LO generated PDFs with Acrobat

bugzilla-daemon at bugs.documentfoundation.org
Tue May 18 17:37:00 UTC 2021


https://bugs.documentfoundation.org/show_bug.cgi?id=142359

            Bug ID: 142359
           Summary: ACCESSIBILITY: Language tagging is lost when merging
                    LO generated PDFs with Acrobat
           Product: LibreOffice
           Version: 3.3.0 release
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Keywords: accessibility, filter:pdf
          Severity: normal
          Priority: medium
         Component: Printing and PDF export
          Assignee: libreoffice-bugs at lists.freedesktop.org
          Reporter: devseppala at gmail.com

When multilingual accessible PDF files generated by LibreOffice are merged
using Adobe Acrobat, the language information in the document tag structure is
lost.

To my understanding, this happens because there are two ways to do language
tagging in PDF files:

https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=619

* Structure elements of any type, through a Lang entry in the structure element
dictionary. 

* Marked-content sequences that are not in the structure hierarchy, through a
Lang entry in a property list attached to the marked-content sequence with a
Span tag.
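
As a rough illustration of the difference between the two methods, the sketch
below prints a simplified PDF fragment for each. The helper names and the
fi-FI / "Hei" sample values are hypothetical, and the fragments are not the
exact output of LibreOffice or Word (the structure element, for instance,
omits its parent and page entries).

```python
def struct_elem_with_lang(struct_type: str, lang: str, mcid: int) -> str:
    """Method 1: a Lang entry in the structure element dictionary.

    Simplified: a real structure element also carries /P (parent) and
    usually /Pg (page) entries, which are left out here.
    """
    return f"<< /Type /StructElem /S /{struct_type} /Lang ({lang}) /K {mcid} >>"


def marked_content_with_lang(lang: str, text: str) -> str:
    """Method 2: a Lang entry in the property list of a Span
    marked-content sequence inside the page content stream.

    Assumes `text` contains no parentheses or backslashes, which would
    need escaping in a PDF literal string.
    """
    return f"/Span << /Lang ({lang}) >> BDC ({text}) Tj EMC"


if __name__ == "__main__":
    print(struct_elem_with_lang("Span", "fi-FI", 0))
    print(marked_content_with_lang("fi-FI", "Hei"))
```

In the first form the language lives in the structure tree; in the second it
travels with the page content itself.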

I think that LibreOffice uses the former strategy, whereas Word uses the
latter. When Word-generated PDF files are merged with Acrobat, the language
information is retained; when LibreOffice-generated files are merged, it is
lost.

The real problem is of course that Acrobat does not support the PDF standard
properly, and Adobe should fix its software.

However, Acrobat is the de facto tool for editing PDF files, and I think many
users have to merge their LibreOffice-generated PDF documents with other
documents using Acrobat. This Acrobat incompatibility will result in a lot of
multilingual documents not being properly accessible. It is also problematic
because common accessibility checkers cannot detect that a multilingual
document is not properly language tagged; they only check that a
document-level language property exists. So, in many cases, language tagging
will be silently lost.
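
A check that looks beyond the document-level property could be sketched
roughly as follows. This is only a heuristic under stated assumptions: the
function name is hypothetical, the input must be an already decompressed page
content stream, and only inline property lists on Span marked-content
sequences are recognised. Lang entries on structure elements would need a
separate walk of the structure tree.

```python
import re
from typing import List

# Matches a Span marked-content operator whose inline property list has a
# Lang entry, e.g. "/Span << /Lang (fi-FI) /MCID 3 >> BDC".
# Simplified: no nested dictionaries, no hex strings, and no named property
# lists referenced through the /Properties resource dictionary.
SPAN_LANG = re.compile(
    rb"/Span\s*<<[^<>]*?/Lang\s*\(([^)]*)\)[^<>]*?>>\s*BDC"
)


def span_languages(content_stream: bytes) -> List[str]:
    """Return the Lang values found on Span marked-content sequences."""
    return [m.decode("ascii", "replace") for m in SPAN_LANG.findall(content_stream)]


if __name__ == "__main__":
    sample = (
        b"/P <</MCID 0>> BDC (Hello) Tj EMC "
        b"/Span << /Lang (fi-FI) /MCID 1 >> BDC (Hei) Tj EMC"
    )
    print(span_languages(sample))
```

An empty result for a document known to be multilingual would suggest that
this form of language tagging is absent from the stream examined.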

Could LibreOffice also support the language tagging method favoured by Acrobat,
in addition to the current method? I think this would resolve this issue.
