[Poppler-bugs] [Bug 64815] [TAGGEDPDF] Parse the Tagged-PDF document structure tree when present

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sun Jun 16 02:25:39 PDT 2013


https://bugs.freedesktop.org/show_bug.cgi?id=64815

--- Comment #10 from Carlos Garcia Campos <carlosgc at gnome.org> ---
(In reply to comment #9)
> > Maybe you can split the patch in those 3 things? to make it a bit easier to
> > review?
> 
> Ouch, the commit message does not really describe well the patch after
> the rebasing/squashing done prior to uploading. But I can split it up
> in three logical parts:
> 
> - Defining the StructElement class and parsing of the corresponding
>   objects from the PDF.
> - Defining the Attribute class and parsing the corresponding objects
>   from the PDF.
> - Text content extraction, a.k.a. the machinery needed for
>   StructElement::getText() and StructElement::getMCOps()
> 
> If there are no objections, I will soon upload an updated version with
> the patch split into the three parts mentioned above.

Sounds good to me. I think I would split it even more, since the patch is
mixing two things: document logical structure and tagged PDF. The first patch
could define the StructTreeRoot and StructTreeElement without any tagged PDF
info. Then, on top of the logical structure support, add the tagged PDF
implementation. I'll do a first review of the current patch anyway, but please
upload a new version that is split.
