[poppler] images in pdftohtml -xml mode
Igor Slepchin
igor.slepchin at gmail.com
Mon Nov 14 19:59:37 PST 2011
On 11/14/2011 07:38 PM, Albert Astals Cid wrote:
> A Dilluns, 14 de novembre de 2011, Igor Slepchin vàreu escriure:
>> <...>
>> The change is small enough that I hope it won't be very controversial
>> but comments are certainly appreciated.
>
> I'm a bit confused: you add encoding="US-ASCII" to the first line of
> pdf2xml.dtd and then you remove it altogether?
Oops, thanks for noticing - removing it was a typo. I added it back now
- xmllint doesn't like the DTD without the encoding and it does no harm
to have it there (encoding is theoretically required in external text
entities that have the text declaration). I also changed the encoding
there to UTF-8 just in case it matters to anyone (all XML processors are
required to understand UTF-8).
> I'm wondering why you did not make GfxState *state a parameter of the
> constructor. It seems to be mandatory for calling the transform method.
Yeah, could be done that way as well - I sorta had the idea that
(0,0)-(1,1) user space coordinates could somehow be useful on their own
but they are clearly not at the moment.
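For illustration only, here is roughly what passing the state to the
constructor could look like. This is a minimal sketch, not the code from
the patch: DemoState and DemoImage are made-up names standing in for
GfxState and HtmlImage, and the matrix layout of DemoState is an
assumption chosen to keep the example self-contained.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the graphics state (NOT poppler's GfxState):
// an affine matrix [a b c d e f] mapping user space to device space.
struct DemoState {
    double ctm[6];

    void transform(double x, double y, double *outX, double *outY) const {
        *outX = ctm[0] * x + ctm[2] * y + ctm[4];
        *outY = ctm[1] * x + ctm[3] * y + ctm[5];
    }
};

// Hypothetical image record: the device-space bounding box is computed in
// the constructor, so an instance never holds untransformed coordinates.
class DemoImage {
public:
    double xMin, yMin, xMax, yMax;

    explicit DemoImage(const DemoState &state) {
        // An image occupies the unit square (0,0)-(1,1) in user space.
        const double corners[4][2] = {{0, 0}, {1, 0}, {0, 1}, {1, 1}};
        state.transform(corners[0][0], corners[0][1], &xMin, &yMin);
        xMax = xMin;
        yMax = yMin;
        // Fold in the remaining corners so rotated or flipped matrices
        // still yield a correct bounding box.
        for (int i = 1; i < 4; ++i) {
            double x, y;
            state.transform(corners[i][0], corners[i][1], &x, &y);
            xMin = std::min(xMin, x);
            xMax = std::max(xMax, x);
            yMin = std::min(yMin, y);
            yMax = std::max(yMax, y);
        }
        assert(xMin <= xMax && yMin <= yMax);
    }
};
```

With this shape there is no separate transform step for the caller to
forget, which is the point of the suggestion.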
> I'd prefer if you make HtmlImage a class.
Sure, I'll change that - I used struct since I wanted everything there
to be public anyway.
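For anyone following along, the only language-level difference is the
default member access, so the switch is cosmetic. A small sketch with
made-up types (BoxAsStruct and BoxAsClass are hypothetical, not the names
used in the patch):

```cpp
#include <cassert>

// With struct, members are public unless stated otherwise.
struct BoxAsStruct {
    double xMin = 0, xMax = 0;
};

// With class, members are private unless stated otherwise, so the
// equivalent declaration needs an explicit public: label.
class BoxAsClass {
public:
    double xMin = 0, xMax = 0;
};

// Callers access the members of either type in exactly the same way.
template <typename Box>
double width(const Box &box) {
    assert(box.xMax >= box.xMin);
    return box.xMax - box.xMin;
}
```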
> It'd be cool if next time you attach the patches instead of making me
> go and lose time trying to navigate github ;-)
Here you go, with your suggested changes - sorry, I assumed you would
prefer github :p Let me know if you want me to rebase the branch there
so that you could pull it without intermediate commits.
Igor
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: xml_images.diff
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20111114/56b38921/attachment.ksh>