[Poppler-bugs] [Bug 64815] [TAGGEDPDF] Parse the Tagged-PDF document structure tree when present

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Jun 20 03:59:51 PDT 2013


https://bugs.freedesktop.org/show_bug.cgi?id=64815

--- Comment #25 from Carlos Garcia Campos <carlosgc at gnome.org> ---
Comment on attachment 81010
  --> https://bugs.freedesktop.org/attachment.cgi?id=81010
[PATCH v5 03/10] Tagged-PDF: Implement parsing of StructElem objects

Review of attachment 81010:
-----------------------------------------------------------------

Please, try to make sure patches build before submitting them.

::: poppler/StructElement.cc
@@ +82,5 @@
> +  { StructElement::Figure,     "Figure",     elementTypeUndefined },
> +  { StructElement::Formula,    "Formula",    elementTypeUndefined },
> +  { StructElement::Form,       "Form",       elementTypeUndefined },
> +  { StructElement::TOC,        "TOC",        elementTypeUndefined },
> +  { StructElement::TOCI,       "TOCI",       elementTypeUndefined },

Isn't all this also part of the standard attributes defined for tagged PDFs? If
we leave the tagged pdf support for a different patch, maybe you can merge this
and the StructTreeRoot into a single patch and you don't need to dummy impl of
StructElement in the previous patch.

@@ +237,5 @@
> +
> +GooString* StructElement::getText(GooString *string, GBool recursive) const
> +{
> +  // TODO: Dummy implementation, complete
> +  return NULL;

Do we really need this? or can it be added in a follow up patch with the proper
code instead?

@@ +338,5 @@
> +
> +  // Language (optional).
> +  if (element->lookup("Lang", &obj)->isString()) {
> +    s->language = obj.getString();
> +    obj.initNull(); // The StructElement takes ownership of the GooString

Same concern I had in previous review.

::: poppler/StructElement.h
@@ +74,5 @@
> +  const GooString *getID() const { return isContent() ? NULL : s->id; }
> +  GooString *getID() { return isContent() ? NULL : s->id; }
> +
> +  // Optional ISO language name, e.g. en_US
> +  GooString *getLang()

getLanguage() is not that long :-)

@@ +75,5 @@
> +  GooString *getID() { return isContent() ? NULL : s->id; }
> +
> +  // Optional ISO language name, e.g. en_US
> +  GooString *getLang()
> +  { return (!isContent() && s->language) ? s->language : (parent ? parent->getLang() : NULL); }

We normally indent this.

@@ +85,5 @@
> +  void setRevision(Guint revision) { if (isContent()) s->revision = revision; }
> +
> +  // Optional element title, in human-readable form.
> +  const GooString *getTitle() const { return isContent() ? NULL : s->title; }
> +  GooString *getTitle() { return isContent() ? NULL : s->title; }

Do we really need the two versions of every getter? I prefer to only add the
code that is actually used and add more later when needed.

@@ +98,5 @@
> +
> +  void appendElement(StructElement *element)
> +  { if (!isContent() && element && element->isOk()) s->elements.push_back(element); }
> +
> +<<<<<<< HEAD

Please, at least try to build the patches before submitting them.

@@ +133,5 @@
> +  //
> +  // The text will be appended to the passed GooString. If NULL is passed,
> +  // a new string is returned, and the ownership passed to the caller.
> +  //
> +  GooString* getText(GooString *string = NULL, GBool recursive = gTrue) const;

Don't include this in this first patch if it's unused and unimplemented.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/poppler-bugs/attachments/20130620/2da9ea0a/attachment.html>


More information about the Poppler-bugs mailing list