<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Tagged-PDF: LBody tag is not supported"
href="https://bugs.freedesktop.org/show_bug.cgi?id=67710">67710</a>
</td>
</tr>
<tr>
<th>Assignee</th>
<td>poppler-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Summary</th>
<td>Tagged-PDF: LBody tag is not supported
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Reporter</th>
<td>apinheiro@igalia.com
</td>
</tr>
<tr>
<th>Hardware</th>
<td>Other
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Component</th>
<td>general
</td>
</tr>
<tr>
<th>Product</th>
<td>poppler
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=83578" name="attach_83578" title="Dumb test that can be used to reproduce the bug">attachment 83578</a></span>
Dumb test that can be used to reproduce the bug
STEPS TO REPRODUCE IT:
1. Apply the patches from <a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [TAGGEDPDF] Provide some way of dumping the document structure"
href="show_bug.cgi?id=64816">bug 64816</a> to get a tool that can scan Tagged PDF (note:
the support needed for scanning is already on master).
2. Run one of those tools (e.g. pdfinfo -struct-text) on the document
attached to this bug report.
EXPECTED OUTCOME:
The document is parsed without warnings, and its structure and content are
printed correctly.
ACTUAL OUTCOME:
Running pdfinfo -struct-text (and, for what it's worth, pdfstructhtml) prints the
following warnings:
Syntax Error: StructElem object is wrong type (LBody)
Syntax Error: StructElem object is wrong type (LBody)
Syntax Error: StructElem object is wrong type (LBody)
The text of the list items is not properly extracted or printed.
EXTRA NOTES:
I have already checked that the problem is not in the tools but in the core
Tagged-PDF code. Specifically, <a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [TAGGEDPDF] Parse the Tagged-PDF document structure tree when present"
href="show_bug.cgi?id=64815">bug 64815</a> added StructElement, with a
typeMap structure listing all the supported tags. LBody is missing from it. LBody is a
valid tag, defined on page 586 of the reference (PDF32000_2008.pdf).</pre>
</div>
</p>
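<p>The failure mode described in the notes above (a tag name that is valid in the specification but absent from the lookup table) can be illustrated with a small sketch. It is written in Python for brevity and is not poppler's code: TYPE_MAP, check_tags, and the descriptions stored in the table are hypothetical names chosen for illustration. Only the tag names and the wording of the warning come from the report.</p>
<pre>
```python
# Hypothetical, simplified model of a tag-name lookup table.
# This is an illustration only, not poppler's actual typeMap.
TYPE_MAP = {
    "L": "list",
    "LI": "list item",
    "Lbl": "list item label",
    # "LBody" is deliberately absent to mimic the reported gap.
}


def check_tags(tags, type_map):
    """Return one warning string per tag name missing from type_map."""
    return [
        "Syntax Error: StructElem object is wrong type ({})".format(tag)
        for tag in tags
        if tag not in type_map
    ]


tags = ["L", "LI", "Lbl", "LBody"]

# With the incomplete table, the valid LBody tag is rejected.
before = check_tags(tags, TYPE_MAP)

# Adding the missing entry makes the same input pass cleanly.
fixed_map = dict(TYPE_MAP, LBody="list item body")
after = check_tags(tags, fixed_map)

print(before)
print(after)
```
</pre>
<p>Under these assumptions, the sketch suggests that the fix belongs in the table of supported tags rather than in the tools that print the structure.</p>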
</body>
</html>