[Poppler-bugs] [Bug 64821] [TAGGEDPDF] Expose the structure tree and attributes in poppler-glib

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Jun 6 07:49:41 PDT 2013


https://bugs.freedesktop.org/show_bug.cgi?id=64821

Adrian Perez de Castro <aperez at igalia.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #79993|0                           |1
        is obsolete|                            |

--- Comment #3 from Adrian Perez de Castro <aperez at igalia.com> ---
Created attachment 80416
  --> https://bugs.freedesktop.org/attachment.cgi?id=80416&action=edit
[PATCH v3 5/6] Tagged-PDF: Expose the structure tree in poppler-glib

Attached updated version of the 5/6 patch, with the following additions
on top of the previous version:

* Changes and API additions to handle object reference structure elements:

  - poppler_structure_element_is_reference()
  - poppler_structure_element_get_reference_type()

* API additions to get PopplerLinkMapping structures from object reference
  structure elements:

  - poppler_structure_element_get_reference_link()

  ...and to search for the reference and obtain the PopplerLinkMapping
  from a POPPLER_STRUCTURE_ELEMENT_LINK element:

  - poppler_structure_element_find_link()

* New poppler_structure_element_get_page() function. Obtains the number of
  the page with the content described by the structure element.

* New poppler_structure_element_get_id() function. Returns the identifier
  of a structure element (or NULL if not defined).

* New poppler_structure_element_get_title() function: Returns the title
  of a structure element (or NULL if not defined).

* New poppler_structure_element_get_abbreviation() function: for
  POPPLER_STRUCTURE_ELEMENT_SPAN elements which contain an abbreviation,
  the function returns the expanded form of the abbreviation (or NULL
  if not defined or the element is not an abbreviation).

* New poppler_structure_element_get_alt_text() function: Returns the
  alternate text for an element (or NULL if not defined).

* New poppler_structure_element_get_actual_text() function: Returns
  the actual text, that is, the textual representation of a text-like
  graphic element (or NULL if not defined).

* Function poppler_structure_element_get_language() no longer has an
  argument to specify whether it should find the language by looking
  up recursively in the structure tree. According to the PDF spec, the
  language must always be inherited from parent elements.
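
The inheritance rule in the last item can be illustrated with a small
stand-alone sketch. It does not use poppler and is not taken from the
patch: Elem and effective_language() are hypothetical names made up for
this example, and the lookup simply walks up the parent chain until it
finds an element that defines a language.

```c
#include <stddef.h>

/* Hypothetical stand-in for a structure element; not a poppler type. */
typedef struct Elem {
    const char        *lang;    /* language set on this element, or NULL */
    const struct Elem *parent;  /* NULL for the root of the tree */
} Elem;

/* Return the language that applies to `e`: its own, or else the one
 * inherited from the nearest ancestor that defines it.  Returns NULL
 * when no element up to the root defines a language. */
const char *effective_language(const Elem *e)
{
    for (; e != NULL; e = e->parent)
        if (e->lang != NULL)
            return e->lang;
    return NULL;
}
```

Since the walk up the tree always happens, a caller has nothing to
choose, which is consistent with dropping the extra argument.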

My plan is to update this patch further to add new functions to obtain
form fields from the structure tree, in a similar way to how the link
mappings are obtained. Tentatively, I would be adding:

* Definition of POPPLER_STRUCTURE_REFERENCE_FORM_FIELD.

* poppler_structure_element_get_reference_form_field(), to be used in an
  object reference structure element, returning a PopplerFormFieldMapping*.

* poppler_structure_element_find_form_field(), to be used in an element
  of type POPPLER_STRUCTURE_FORM, returning a PopplerFormFieldMapping*.
