[Poppler-bugs] [Bug 64821] [TAGGEDPDF] Expose the structure tree and attributes in poppler-glib
bugzilla-daemon at freedesktop.org
bugzilla-daemon at freedesktop.org
Wed Dec 4 09:37:44 PST 2013
https://bugs.freedesktop.org/show_bug.cgi?id=64821
--- Comment #41 from Adrian Perez de Castro <aperez at igalia.com> ---
(In reply to comment #40)
> Comment on attachment 89919 [details] [review]
> [PATCH v11 07/11] glib: Expose inline attributes of structure elements
>
> Review of attachment 89919 [details] [review]:
> -----------------------------------------------------------------
>
> My main concern is that we already have both a way to provide font
> information and text attributes, but I think none of them fit in this API.
> We have PopplerFontInfo, but it's actually an iterator, and we have
> PopplerTextAttributes, that contains the offsets of the text in the page
> text.
Yes, I looked into PopplerTextAttributes first but it is not a good fit
for this kind of access :-(
> ::: glib/poppler-structure-element.cc
> @@ +718,5 @@
> > + * <informalexample><programlisting>
> > + * GList *text_spans = poppler_structure_element_get_text_spans (element);
> > + * /<!-- -->* Use the text spans *<!-- -->/
> > + * g_list_free_full (text_spans,
> > + * (GDestroyNotify) poppler_text_span_free);
>
> We usually provide a convenient method to do this, see
> poppler_page_free_link_mapping, poppler_page_free_text_attributes, etc.
I have added a poppler_structure_element_free_text_spans() function
> @@ +743,5 @@
> > + {
> > + PopplerTextSpan *span = g_slice_new0 (PopplerTextSpan);
> > + span->text = _poppler_goo_string_to_utf8 (i->getText ());
> > +
> > + GfxRGB rgb = i->getColor();
>
> Is this copying the struct? should it return a const reference instead or a
> pointer?
In the last version of the patch for Poppler core the TextSpan::getColor()
method returns a reference.
> @@ +761,5 @@
> > + span->flags |= POPPLER_TEXT_SPAN_SERIF_FONT;
> > + if (i->getFont ()->isItalic ())
> > + span->flags |= POPPLER_TEXT_SPAN_ITALIC;
> > + if (i->getFont ()->isBold ())
> > + span->flags |= POPPLER_TEXT_SPAN_BOLD;
>
> Why not doing this in the core too and here simply do i->getFlags()?
Adding TextSpan::getFlags() in core seems redundant to me because
TextSpan provides access to the GfxFont. Also, to be able to do a
plain “span->flags = i->getFlags()” would need to either manually
type the same bits for the flags in the poppler-glib header (not
nice if it changes in the future), or to include the core header
(not an option), or to check each flag from core and set it with
the corresponding poppler-glib value... IMHO it is not worth the
effort to have a TextSpan::getFlags() in core.
> @@ +775,5 @@
> > + span->flags |= POPPLER_TEXT_SPAN_BOLD;
> > + default:
> > + break;
> > + }
> > + }
>
> I would move this block to a helper function that creates a PopplerTextSpan
> from a given TextSpan
Moved.
> @@ +796,5 @@
> > + */
> > +PopplerTextSpan *
> > +poppler_text_span_copy (PopplerTextSpan *poppler_text_span)
> > +{
> > + PopplerTextSpan *new_span = g_slice_new0 (PopplerTextSpan);
>
> You can use g_slice_dup and then you only need to manually copy the heap
> allocated members.
Ah, nice. I didn't know about g_slice_dup(), it is quite handy :)
> @@ +938,5 @@
> > + return poppler_text_span->font_name;
> > +}
> > +
> > +/**
> > + * poppler_text_span_get_link_target:
>
> What is a link target? a named destination?
This is removed now, it does not make sense as links are exposed as
a PopplerStructureElement object.
--
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/poppler-bugs/attachments/20131204/6fd216c2/attachment-0001.html>
More information about the Poppler-bugs
mailing list