[poppler] poppler util pdftohtml

Leonard Rosenthol lrosenth at adobe.com
Fri Sep 23 05:18:56 PDT 2011

And what is the primary reading order for any document?  That matters
not just for semantic analysis but also for things such as
text-to-speech or screen readers (i.e. accessibility).


On 9/23/11 7:59 AM, "Jonathan Kew" <jfkthame at googlemail.com> wrote:

>On 23 Sep 2011, at 12:44, Peter A. Kerzum wrote:
>> Actually consistent To-Unicode mapping should be a good compromise, as
>> level software can really segment text into regions of different
>> based solely on their alphabets and then detect and correct text flow
>> for each particular region.
>> This way the example
>>   english WERBEH
>> should generally work being decomposed into 2 regions with the latter
>But what is the order of those "2 regions"? You cannot tell unless you
>have some higher-level info... the purely visual presentation is
>inherently ambiguous.
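
To make the point in the quoted exchange concrete, here is a minimal
sketch (not poppler code) of the "segment by alphabet, then fix the flow
per region" idea. Everything in it is an assumption for illustration: the
function names are hypothetical, the first word of the Unicode character
name stands in for a real script property, and the input is assumed to be
one line extracted in visual (left-to-right) order. It recovers each
region's internal order, but it can only enumerate the possible orders of
the regions, not choose between them.

```python
import unicodedata
from itertools import permutations


def script_of(ch):
    """Rough script label: first word of the Unicode character name.

    This is a stand-in for a real script property (assumption).
    """
    if not ch.isalpha():
        return None  # spaces, digits and punctuation are treated as neutral
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def segment_by_script(text):
    """Split visually ordered text into (script, run) pairs.

    Neutral characters are attached to the preceding run and then stripped.
    """
    runs = []
    current_script, buf = None, []
    for ch in text:
        s = script_of(ch)
        if s is not None and current_script is not None and s != current_script:
            runs.append((current_script, "".join(buf).strip()))
            buf = []
        if s is not None:
            current_script = s
        buf.append(ch)
    if buf and current_script is not None:
        runs.append((current_script, "".join(buf).strip()))
    return runs


def to_logical(script_run):
    """Reverse a run if its first letter has a right-to-left bidi class."""
    _script, run = script_run
    first_letter = next(c for c in run if c.isalpha())
    rtl = unicodedata.bidirectional(first_letter) in ("R", "AL")
    return run[::-1] if rtl else run


def candidate_orders(logical_runs):
    """Every ordering of the regions; the visual layout alone picks none."""
    return [" ".join(p) for p in permutations(logical_runs)]


if __name__ == "__main__":
    hebrew = "\u05e2\u05d1\u05e8\u05d9\u05ea"   # a Hebrew word, logical order
    visual = "english " + hebrew[::-1]          # as laid out left to right
    runs = segment_by_script(visual)
    logical = [to_logical(r) for r in runs]
    for candidate in candidate_orders(logical):
        print(ascii(candidate))
```

Both candidates are consistent with the same glyph positions: a
left-to-right paragraph with an embedded Hebrew word, or a right-to-left
paragraph that starts with the Hebrew word. Choosing between them needs
information the page geometry does not carry, such as the paragraph
direction or an explicit reading order.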
