[poppler] For Accessibility of pdf document: changes required in pdftohtml complex output

Albert Astals Cid aacid at kde.org
Sun Jun 13 14:32:59 PDT 2010

On Wednesday, 9 June 2010, leena chourey wrote:
> Dear poppler developers,
> I am new to this list and am working on GNOME accessibility. To read a PDF
> document, a visually impaired person uses a screen reader, but the open
> source communities provide very little support for this. We are working in
> this area and trying to make PDF documents accessible with the Orca screen
> reader. We have analysed various options in this regard, including the
> Evince document viewer and Orca's accessibility features for PDF
> documents. As a first step, we have decided to use the pdftohtml utility
> to provide PDF content in HTML format, so that Orca can read the PDF
> content in HTML form.
> Observations while exploring poppler-0.12.4 (utils):
>    - poppler-utils provides pdftohtml to generate an HTML file for a PDF
>    document. With the -c option it generates formatted HTML for the
>    corresponding PDF: file_ind.html, file_outline.html, file.html, and one
>    .html and one .png file for each page of the PDF. (please confirm)
>    - While working with file.html in Firefox, we observed that it links
>    to or contains only the index file (file_ind.html) and file1.html (the
>    first page). To move to another page, I have to click that page in the
>    index, which opens the corresponding page in a new Firefox tab. So a
>    new tab opens for every page. (please confirm)
>    - I cannot find a way to return to the previous page or to jump to a
>    particular page.
> For a person with perfect vision, there are no issues in reading PDF
> content in the complex HTML format. But to ensure that the complex HTML
> output is as similar as possible to the PDF document as displayed by a
> document viewer, and to make the HTML more accessible and usable for a
> blind person, we found that the following issues need to be resolved.
> When a blind person reads file.html, the issues are:
>    1. Because file.html uses frameset/frame, Orca is not able to shift
>    control from one frame to another. Focus moves to the next frame only
>    after the full content of the current frame has been read (using Tab).
>    A sighted person can move from frame to frame with the mouse, but with
>    Tab it is not possible to skip a number of tab stops.
>    2. If a blind person wants to read or move to another page, it opens
>    in a new tab. Handling a number of tabs (one for each page) will be
>    confusing for her/him.
>    3. There are some more issues related to content format, which can be
>    discussed in further communication.
> To resolve the first two issues, changes are required in pdftohtml -c
> that will make the HTML document more accessible and usable for a
> visually impaired person.
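
To make the first two issues concrete, here is a rough sketch of one possible workaround done as post-processing outside pdftohtml. It is not how pdftohtml works and not a proposed patch. It adds previous/next/index links at the top of each page file, so that a keyboard user could move between pages in the same tab without going through the frameset. The file naming (file1.html, file2.html, ..., file_ind.html) is taken from the observations above and is an assumption. The helper names nav_bar, insert_nav and add_navigation are hypothetical.

```python
import re
from pathlib import Path


def nav_bar(page: int, total: int, stem: str = "file") -> str:
    """Build an HTML navigation paragraph for a 1-based page number."""
    links = []
    if page > 1:
        links.append(f'<a href="{stem}{page - 1}.html">Previous page</a>')
    if page < total:
        links.append(f'<a href="{stem}{page + 1}.html">Next page</a>')
    # Assumed index file name, as reported in the message above.
    links.append(f'<a href="{stem}_ind.html">Index</a>')
    return "<p>" + " | ".join(links) + "</p>"


def insert_nav(html: str, page: int, total: int, stem: str = "file") -> str:
    """Insert the navigation bar right after the opening <body> tag."""
    bar = nav_bar(page, total, stem)
    # Match the first opening body tag in either upper or lower case.
    new_html, count = re.subn(
        r"(<body[^>]*>)",
        lambda m: m.group(1) + "\n" + bar,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    # If no body tag is found, fall back to putting the bar first.
    return new_html if count else bar + "\n" + html


def add_navigation(directory: str, stem: str, total: int) -> None:
    """Rewrite every per-page file in place (assumes UTF-8 page files)."""
    for page in range(1, total + 1):
        path = Path(directory) / f"{stem}{page}.html"
        text = path.read_text(encoding="utf-8")
        path.write_text(insert_nav(text, page, total, stem), encoding="utf-8")
```

For a three-page document converted into the directory out, this would be called as add_navigation("out", "file", 3). Because the links carry no target attribute, they should open in the current tab, but whether Orca then reads the pages comfortably is something that would need to be tested.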

Hi, I wonder if you know the people working on
https://bugs.freedesktop.org/show_bug.cgi?id=28276 ? It would make sense to
join efforts, since it seems you are trying to achieve the same thing.


> With regards
> Leena
