[poppler] Accessibility of PDF documents (corrected patch attached)
Albert Astals Cid
aacid at kde.org
Thu Sep 16 15:24:31 PDT 2010
A Dijous, 16 de setembre de 2010, vàreu escriure:
> Dear Poppler Developers,
>
> Thanks for response.
>
> For copyright information:
>
> *Copyright OSSD CDAC Mumbai
>
> Changed 2010 by Leena Chourey (leenac at cdacmumbai.in) & Onkar Potdar (
> onkar at cdacmumbai.in)
> *
I've commited your code, it'll be in poppler >= 0.15.1
Albert
>
> With regards
> Leena Chourey
> For Accessibility Team
> CDAC Mumbai
>
> On Thu, Sep 16, 2010 at 12:55 AM, Albert Astals Cid <aacid at kde.org> wrote:
> > A Dimarts, 7 de setembre de 2010, leena chourey va escriure:
> > > Thanks Albert to review this patch again.
> >
> > The patch seems ok.
> >
> > Before commiting i need the names and e-mail addresses of the people the
> > copyright of this patch belongs to.
> >
> > Albert
> >
> > > Comments are inline:
> > >
> > > On Sun, Jul 25, 2010 at 9:27 PM, Albert Astals Cid <aacid at kde.org>
> >
> > wrote:
> > > > A Dimarts, 6 de juliol de 2010, leena chourey va escriure:
> > > > > Dear Albert,
> > > >
> > > > Hi
> > > >
> > > > > Thanks for your response.
> > > > >
> > > > > As discussed in the last mail, we have modified the patch so that:
> > > > > - There is no behavioural change in pdftohtml -c <filename>
> > > > > means
> >
> > it
> >
> > > > > produces exactly the same output it did before.
> > > > > - Defined new option as pdftohtml -s <filename> to generate a
> >
> > single
> >
> > > > > html file corresponding to a pdf file.
> > > > >
> > > > > Please check and give your feedback if any further change is
> >
> > required.
> >
> > > > You are using a variable you deleted (tmp) in this chunk of code
> > > >
> > > > ***********************
> > > >
> > > > delete tmp;
> > >
> > > This (delete tmp) was from the original development code only, we
> > > didn't made changes regarding tmp. I have checked, it is in the recent
> >
> > development
> >
> > > version also.
> > >
> > > > - fprintf(pageFile,"%s\n<HTML>\n<HEAD>\n<TITLE>Page
> > > > %d</TITLE>\n\n",
> > > >
> > > > - DOCTYPE, page);
> > > > + if (!singleHtml)
> > > > + fprintf(pageFile,"%s\n<HTML>\n<HEAD>\n<TITLE>Page
> > > > %d</TITLE>\n\n",
> > > > DOCTYPE, page);
> > > > + else
> > > > + fprintf(pageFile,"%s\n<HTML>\n<HEAD>\n<TITLE>
> >
> > %s</TITLE>\n\n",
> >
> > > > DOCTYPE, tmp->getCString());////file name
> > > > ***********************
> > > >
> > > > I'm also concerned about you adding various <HTML> to the same .html
> > > > page, my
> > > > limited HTML knowledge says you can only have one of those.
> > >
> > > For the above: I would like to say that every page has different
> > > heading details as well as title. This should not be changed for
> > > pages.
> > >
> > > > Also it would be necessary that you update the pdftohtml.1 file (the
> >
> > man
> >
> > > > page)
> > > > adding the new option.
> > >
> > > pdftohtml.1 is updated.
> > >
> > > Please find the latest patch for "pdftohtml -s <file.pdf> " and give
> > > feedback.
> > >
> > > > Albert
> > > >
> > > > > With best regard
> > > > > Leena C
> > > > >
> > > > > On Wed, Jun 23, 2010 at 1:19 AM, Albert Astals Cid <aacid at kde.org>
> > > >
> > > > wrote:
> > > > > > A Dimarts, 22 de juny de 2010, leena chourey va escriure:
> > > > > > > Dear Albert,
> > > > > > >
> > > > > > > Thanks for giving detail comment to patch.
> > > > > >
> > > > > > > Please check updates given inline:
> > > > > > Please do not forget to CC the poppler mailing list.
> > > > > >
> > > > > > > On Thu, Jun 17, 2010 at 4:14 AM, Albert Astals Cid <
> >
> > aacid at kde.org>
> >
> > > > > > wrote:
> > > > > > > > A Dimecres, 16 de juny de 2010, omkar va escriure:
> > > > > > > > > Dear Albert,
> > > > > > > > >
> > > > > > > > > Please find the corrected patch for "accessibility of pdf
> > > >
> > > > document
> > > >
> > > > > > > > > " and give your feedback.
> > > > > > > >
> > > > > > > > Hi, some comments:
> > > > > > > > * The comments like
> > > > > > > > // One more parameter(int j) is added in the getCSStyle
> >
> > function
> >
> > > > by
> > > >
> > > > > > CDAC
> > > > > >
> > > > > > > > developer Team
> > > > > > > >
> > > > > > > > need to be removed, if each line had near it who coded it,
> >
> > the
> >
> > > > code
> > > >
> > > > > > > > will
> > > > > > > >
> > > > > > > > be
> > > > > > > > twice as big and much more unreadable
> > > > > > >
> > > > > > > Done, deleted all unwanted comments
> > > > > > >
> > > > > > > > * The spacing of your patches could be better, that is
> > > > > > > >
> > > > > > > > GooString* HtmlFontAccu::getCSStyle(int i, GooString* content
> > > > > > > > ,int j){ should be
> > > > > > > > +GooString* HtmlFontAccu::getCSStyle(int i, GooString*
> > > > > > > > content, int j){ but that's nothing huge, i can fix it
> > > > > > >
> > > > > > > Updated accordingly.
> > > > > > >
> > > > > > > > * You are leaking (i.e. not deleting) jStr in both
> > > > > > > >
> > > > > > > > HtmlFontAccu::getCSStyle
> > > > > > > > and HtmlFontAccu::CSStyle
> > > > > > >
> > > > > > > Deleted jStr
> > > > > > >
> > > > > > > > * I see that the new HtmlPage::complexHtml and the old
> > > > > > > >
> > > > > > > > HtmlPage::dumpComplex
> > > > > > > > are very simple, i if you reused the code instead of copying
> > > > > > > > it
> > > > > > > >
> > > > > > > > * This introduces a behavioural change that is unaccetable,
> > > > > > > > i
> > > > > >
> > > > > > understand
> > > > > >
> > > > > > > > you
> > > > > > > > want pdftohtml to produce a different (in your opinion
> > > > > > > > better) output, for that you'll have to introduce a new
> > > > > > > > comandline
> >
> > option
> >
> > > > to
> > > >
> > > > > > > > pdftohtml (something
> > > > > > > > like --singlehtml) or something like that
> > > > > > >
> > > > > > > For last 2 point we want some clarification.
> > > > > > > As you said behavioural change is unacceptable and also
> > > > > > > suggested to introduce a new command line option to generate
> > > > > > > single html.
> >
> > So
> >
> > > > > > > if we do
> > > > > >
> > > > > > as
> > > > > >
> > > > > > > following, will it be acceptable?
> > > > > > >
> > > > > > > - *Existing is:*
> > > > > > > Command line option: pdftohtml -c <filename>
> > > > > > >
> > > > > > > Function called:
> > > > > > > dumpComplex
> > > > > > >
> > > > > > > ()
> > > > > > > {
> > > > > > >
> > > > > > > Read from input file
> > > > > > > Write into file to Generates pagewise html format
> > > > > > >
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > - *Proposed changes:*
> > > > > > > New Command line option : pdftohtml -s <filename>
> > > >
> > > > //Checked,
> > > >
> > > > > > > nothing is already defined for -s (pdftohtml -c
> > > > > >
> > > > > > <filename>
> > > > > >
> > > > > > > will exists as it is)
> > > > > > >
> > > > > > > - Function called:
> > > > > > > dumpSingle() //new function similar
> > > > > > > to
> > > > > > >
> > > > > > > dumpComplex {
> > > > > > >
> > > > > > > Read from input file
> > > > > > > Write into file to append single html format
> > > > > > >
> > > > > > > }
> > > > > > >
> > > > > > > - A function to “Read from input file” can be defined and
> > > > > > > call it
> > > >
> > > > in
> > > >
> > > > > > > both dumpComplex() and dumpSingle(), So that code duplication
> > > > > > > can be removed (for second last point of your mail).
> > > > > > >
> > > > > > > - And with -s option (for --single Html) behavioural change
> >
> > will
> >
> > > > be
> > > >
> > > > > > > defined separately. (-c will not be affected)
> > > > > >
> > > > > > To be clear, pdftohtml -c should produce exactly the same output
> > > > > > it did before
> > > > > > your patch, pdftohtml -s you can output your version.
> > > > > >
> > > > > > So yes, i think i kind of agree with your proposal.
> > > > > >
> > > > > > Albert
> > > > > >
> > > > > > > For your opinion
> > > > > > >
> > > > > > > With Regards
> > > > > > > Leena C & Onkar P
> > > > > > > (for CDAC Accessibility Team)
> > >
> > > With best regards
> > > Leena C
> >
> > _______________________________________________
> > poppler mailing list
> > poppler at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/poppler
More information about the poppler
mailing list