<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [TAGGEDPDF] Parse the Tagged-PDF document structure tree when present"
href="https://bugs.freedesktop.org/show_bug.cgi?id=64815#c11">Comment # 11</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [TAGGEDPDF] Parse the Tagged-PDF document structure tree when present"
href="https://bugs.freedesktop.org/show_bug.cgi?id=64815">bug 64815</a>
from <span class="vcard"><a class="email" href="mailto:carlosgc@gnome.org" title="Carlos Garcia Campos <carlosgc@gnome.org>"> <span class="fn">Carlos Garcia Campos</span></a>
</span></b>
<pre>Comment on <span class=""><a href="attachment.cgi?id=80407" name="attach_80407" title="[PATCH v3 2/7] Tagged-PDF: Interpret the document structure">attachment 80407</a> <a href="attachment.cgi?id=80407&action=edit" title="[PATCH v3 2/7] Tagged-PDF: Interpret the document structure">[details]</a></span> <a href='page.cgi?id=splinter.html&bug=64815&attachment=80407'>[review]</a>
[PATCH v3 2/7] Tagged-PDF: Interpret the document structure
Review of <span class=""><a href="attachment.cgi?id=80407" name="attach_80407" title="[PATCH v3 2/7] Tagged-PDF: Interpret the document structure">attachment 80407</a> <a href="attachment.cgi?id=80407&action=edit" title="[PATCH v3 2/7] Tagged-PDF: Interpret the document structure">[details]</a></span> <a href='page.cgi?id=splinter.html&bug=64815&attachment=80407'>[review]</a>:
-----------------------------------------------------------------
Ok, I have just a few comments for now, I'll do a full review when the patch is
split. In general, * are placed wrongly (close to the type instead of the
argument/variable name). Some parts of the code are hard to follow with all
those abbreviations.
::: poppler/Catalog.h
@@ +124,4 @@
<span class="quote">> GooString *readMetadata();
>
> // Return the structure tree root object.
> + StructTreeRoot* getStructTreeRoot();</span >
Note that this is used by pdfinfo, you need to update it too
::: poppler/StructElement.cc
@@ +197,5 @@
<span class="quote">> +{
> + return value->isNum();
> +}
> +
> +static GBool isNumber_or_AutoName(Object* value)</span >
isNumerOrAutoName I guess or even just isNumberOrAuto
@@ +231,5 @@
<span class="quote">> + } \
> + return okay; \
> + }
> +
> +ARRAY_CHECKER(isRGBColor_or_OptX4, isRGBColor, 4, gTrue, gTrue );</span >
what is OptX4?
@@ +247,5 @@
<span class="quote">> +// Maps attributes to their names and whether the attribute can be inherited.
> +struct AttributeMapEntry {
> + Attribute::Type type;
> + const char* name;
> + const Object* defval;</span >
Move the * to the variable name.
@@ +862,5 @@
<span class="quote">> +StructElement::StructElement(Dict *element, StructTreeRoot *treeRootA, StructElement *parentA):
> + type(Unknown),
> + treeRoot(treeRootA),
> + parent(parentA),
> + pageRef(),</span >
Do you really need this? I guess it does nothing.
@@ +863,5 @@
<span class="quote">> + type(Unknown),
> + treeRoot(treeRootA),
> + parent(parentA),
> + pageRef(),
> + s(new StructData())</span >
Why having this as a separate struct?
@@ +900,5 @@
<span class="quote">> +{
> + if (isContent())
> + delete c;
> + else
> + delete s;</span >
I see why using separate structs now. I would just keep the mcid or object ref
in the same StructElement and parse it on demand. I find very confusing using
variable names like c and s
@@ +995,5 @@
<span class="quote">> + assert(map);
> +
> + const MCOpArray& ops(getMCOps());
> + if (!ops.size())
> + return NULL;</span >
This leaks the map, GlobalParams::getTextEncoding() returns a new reference. I
would get the map below right before you need it.
@@ +1149,5 @@
<span class="quote">> +
> + // Language (optional).
> + if (element->lookup("Lang", &obj)->isString()) {
> + s->language = obj.getString()->getCString();
> + obj.initNull(); // The StructElement takes ownership of the GooString</span >
This is weird, I guess you are trying to avoid another string copy. I wonder
why you didn't do the same for title. Maybe we could add "take" methods to
Object and do something like:
language = obj.takeString();
hmm, but here you are getting the CString from the GooString and leaking the
GooString.
@@ +1160,5 @@
<span class="quote">> + if (element->lookup("Alt", &obj)->isString()) {
> + s->altText = obj.getString();
> + obj.initNull(); // The StructElement takes ownership of the GooString
> + } else if (!obj.isNull()) {
> + error(errSyntaxWarning, -1, "Alt object is wrong type ({0:s})", obj.getTypeName());</span >
I think we can just ignore invalid objects for optional fields.</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>