[poppler] Extract title from pdf file.

Leonard Rosenthol lrosenth at adobe.com
Thu Nov 10 03:36:39 PST 2011


EXCEPT that Poppler (and by extension, pdftoxml) does NOT process the
tagging & structure of the PDF :(.   That's why I was hoping that you were
ADDING THIS FEATURE to Poppler's core.

Leonard

On 11/9/11 10:44 PM, "Alec Taylor" <alec.taylor6 at gmail.com> wrote:

>Running pdftohtml -xml, analysing XML, processing information back into PDF
>
>On Thu, Nov 10, 2011 at 2:01 PM, Leonard Rosenthol <lrosenth at adobe.com>
>wrote:
>> On 11/9/11 10:02 AM, "Alec Taylor" <alec.taylor6 at gmail.com> wrote:
>>>> Are you also submitting patches to read & process any tags &
>>>> structure in the PDF?  If the PDF is already tagged, then it will
>>>> have any headers/footers already identified accordingly.  You
>>>> should be using this when present.
>>>
>>>Yes, I am using the RapidXML library, which I specifically chose for
>>>speed and that it is header only.
>>
>> What does an XML library have to do with processing PDF structure &
>> tagging (ISO 32000-1:2008, 14.7-14.9)???
>>
>>
>> Leonard
>>
>>


