[poppler] Accessibility of PDF documents (corrected patch attached)

leena chourey leenagour at gmail.com
Sun Jul 11 21:55:16 PDT 2010


Dear Albert and list members,

We have submitted the related patch a week before and waiting for your
feedback on 'pdftohtml -s <filename>' (see the below mail and attached
patch).
If its ok then please commit this with the coming version of poppler lib.

With best regards
Leena C
(for CDAC Accessibility Team)


On Tue, Jul 6, 2010 at 4:49 PM, leena chourey <leenagour at gmail.com> wrote:

> Dear Albert,
>
> Thanks for your response.
>
> As discussed in the last mail, we have modified the patch so that:
>
>    - There is no behavioural change in pdftohtml -c <filename> means it
>    produces exactly the same output it did before.
>    - Defined new option as pdftohtml -s <filename> to generate a single
>    html file corresponding to a pdf file.
>
> Please check and give your feedback if any further change is required.
>
> With best regard
> Leena C
>
>
> On Wed, Jun 23, 2010 at 1:19 AM, Albert Astals Cid <aacid at kde.org> wrote:
>
>> A Dimarts, 22 de juny de 2010, leena chourey va escriure:
>> > Dear Albert,
>> >
>> > Thanks for giving detail comment to patch.
>> > Please check updates given inline:
>>
>> Please do not forget to CC the poppler mailing list.
>>
>> >
>> > On Thu, Jun 17, 2010 at 4:14 AM, Albert Astals Cid <aacid at kde.org>
>> wrote:
>> > > A Dimecres, 16 de juny de 2010, omkar va escriure:
>> > > > Dear Albert,
>> > > >
>> > > > Please find the corrected patch for "accessibility of pdf document "
>> > > > and give your feedback.
>> > >
>> > > Hi, some comments:
>> > >  * The comments like
>> > >  // One more parameter(int j) is added in the getCSStyle function by
>> CDAC
>> > >
>> > > developer Team
>> > >
>> > >   need to be removed, if each line had near it who coded it, the code
>> > >   will
>> > >
>> > > be
>> > > twice as big and much more unreadable
>> >
>> > Done, deleted all unwanted comments
>> >
>> > >   * The spacing of your patches could be better, that is
>> > >
>> > > GooString* HtmlFontAccu::getCSStyle(int i, GooString* content ,int j){
>> > > should be
>> > > +GooString* HtmlFontAccu::getCSStyle(int i, GooString* content, int
>> j){
>> > > but that's nothing huge, i can fix it
>> >
>> > Updated accordingly.
>> >
>> > >   * You are leaking (i.e. not deleting) jStr in both
>> > >
>> > > HtmlFontAccu::getCSStyle
>> > > and HtmlFontAccu::CSStyle
>> >
>> > Deleted jStr
>> >
>> > >  * I see that the new HtmlPage::complexHtml and the old
>> > >
>> > > HtmlPage::dumpComplex
>> > > are very simple, i if you reused the code instead of copying it
>> > >
>> > >  * This introduces a behavioural change that is unaccetable, i
>> understand
>> > >
>> > > you
>> > > want pdftohtml to produce a different (in your opinion better) output,
>> > > for that you'll have to introduce a new comandline option to pdftohtml
>> > > (something
>> > > like --singlehtml) or something like that
>> >
>> > For last 2 point we want some clarification.
>> > As you said behavioural change is unacceptable and also suggested to
>> > introduce a new command line option to generate single html. So if we do
>> as
>> > following, will it be acceptable?
>> >
>> >    - *Existing is:*
>> >    Command line option: pdftohtml -c  <filename>
>> >    Function called:
>> >
>> >
>> >                       dumpComplex
>> >    ()
>> >    {
>> >        Read from input file
>> >        Write into file to Generates pagewise html format
>> >    }
>> >
>> >
>> >    - *Proposed changes:*
>> >    New Command line option : pdftohtml -s <filename>         //Checked,
>> >    nothing  is already defined for -s            (pdftohtml -c
>>  <filename>
>> > will exists as it is)
>> >
>> >
>> >    - Function called:
>> >
>> >                          dumpSingle()  //new function similar to
>> > dumpComplex {
>> >        Read from input file
>> >        Write into file to append single html format
>> >    }
>> >
>> >    - A function to “Read from input file” can be defined and call it in
>> > both dumpComplex() and dumpSingle(), So that code duplication can be
>> > removed (for second last point of your mail).
>> >    - And with -s option (for --single Html) behavioural change will be
>> >    defined separately. (-c will not be affected)
>>
>> To be clear, pdftohtml -c should produce exactly the same output it did
>> before
>> your patch, pdftohtml -s you can output your version.
>>
>> So yes, i think i kind of agree with your proposal.
>>
>> Albert
>>
>>
>> >
>> >
>> > For your opinion
>> >
>> > With Regards
>> > Leena C & Onkar P
>> > (for CDAC Accessibility Team)
>>
>
>
>
> --
> Leena C
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20100712/12402ca1/attachment.html>


More information about the poppler mailing list