[poppler] Poppler - SVG Device
todd.hubers at alivate.com.au
Fri Nov 4 04:58:09 PDT 2011
You can probably tell me :) I'm not claiming to be a poppler genius. Please
do elaborate on the suitability of the CairoOutputDevice to generate an SVG
(remembering that SVGs are favoured for their vector ability for text,
lines and filled shapes).
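
To make the "vector ability" point concrete, here is a minimal sketch of the
kind of single-file SVG page the thread is talking about: text, a line, a
gradient-filled shape, and an image embedded as a data URI so no extra server
round trips are needed. This is not poppler code and does not use
CairoOutputDevice; build_page_svg, svg_tag, the "fade" gradient id, and the
placeholder image bytes are all hypothetical names chosen for illustration,
and only the Python standard library is assumed.

```python
import base64
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Serialize SVG as the default namespace and xlink under its usual prefix.
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def svg_tag(name):
    """Qualify an element name with the SVG namespace."""
    return f"{{{SVG_NS}}}{name}"


def build_page_svg(width, height, label, png_bytes):
    """Hypothetical helper: return one self-contained SVG page as a string."""
    svg = ET.Element(svg_tag("svg"), {
        "width": str(width),
        "height": str(height),
        "viewBox": f"0 0 {width} {height}",
    })

    # Gradient definition, referenced by id from the filled shape below.
    defs = ET.SubElement(svg, svg_tag("defs"))
    grad = ET.SubElement(defs, svg_tag("linearGradient"), {"id": "fade"})
    ET.SubElement(grad, svg_tag("stop"), {"offset": "0", "stop-color": "#ffffff"})
    ET.SubElement(grad, svg_tag("stop"), {"offset": "1", "stop-color": "#3366cc"})

    # Filled shape: stays a vector rectangle, never rasterized.
    ET.SubElement(svg, svg_tag("rect"), {
        "x": "10", "y": "10",
        "width": str(width - 20), "height": "30",
        "fill": "url(#fade)",
    })

    # Line: a stroked vector primitive.
    ET.SubElement(svg, svg_tag("line"), {
        "x1": "10", "y1": "50", "x2": str(width - 10), "y2": "50",
        "stroke": "black",
    })

    # Text: kept as real text, so it remains selectable and scalable.
    text = ET.SubElement(svg, svg_tag("text"), {"x": "10", "y": "70", "font-size": "12"})
    text.text = label

    # Image: embedded inline as a base64 data URI instead of a separate file.
    data_uri = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
    ET.SubElement(svg, svg_tag("image"), {
        f"{{{XLINK_NS}}}href": data_uri,
        "x": "10", "y": "75", "width": "20", "height": "20",
    })

    return ET.tostring(svg, encoding="unicode")
```

Calling build_page_svg(200, 100, "Hello PDF", image_bytes) returns one string
that can be written to a .svg file. A real output device would of course emit
these elements from the PDF drawing operations rather than from fixed values.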
On 4 November 2011 22:55, Dominic Lachowicz <domlachowicz at gmail.com> wrote:
> Just out of curiosity, how would the proposed SVGOutputDevice differ
> from using (say) the existing CairoOutputDevice that was configured to
> write to SVG? That can already be accomplished today.
> On Fri, Nov 4, 2011 at 7:38 AM, Todd Hubers <todd.hubers at alivate.com.au> wrote:
> > Alec, I'm quite sold on the SVG idea. It is self-contained and can even
> > be used outside the browser.
> > Josh, it would seem that the HTMLOutputDevice is the better candidate for
> > SVG. HTML would be a good interim solution as well, however with SVG,
> > everything is packaged into a single file. With HTML the
> > browser is making repeated calls back to the web server (for image
> > resources), but with SVG it's naturally all together. You can also do
> > effects like gradients in SVG quite easily, and it is better supported by
> > browsers than alternative approaches to getting PDF into the browser.
> > I am interested in seeing the latest version of the HTML solution. I may
> > attempt some preliminary SVG rendering.
> > Back on the topic of "Data" output device. I'm already using XML for RTF
> > output (I'm doing this in my language of choice - C# though so it's not
> > an easy task to contribute this back to poppler). It's true that direct
> > implementations of device drivers are more efficient, however XML or the
> > like do provide a convenient interface very accessible for many programming
> > languages. I would not expect such a "data" output device to be used by
> > viewing applications. However, it would be good for all other purposes:
> > such implementations are usually performed in batch processes, and the
> > processing in the presence of multi-threading is readily accepted in
> > exchange for flexibility - that is, a larger community can make use of poppler.
> > Cheers,
> > Todd
> > On 4 November 2011 17:24, Josh Richardson <jric at chegg.com> wrote:
> >> Hi Todd,
> >> Some of us who are working on the pdftohtml utility have had similar
> >> thoughts.
> >> It's on my wish list to completely remove the need for a poppler output
> >> device by utilizing the SVG toolset available in modern browsers. In any
> >> case, we are achieving high accuracy on Gecko and Webkit browsers with the
> >> current version (not merged into the Poppler main repo yet, but I can send
> >> you an invite for a git repo that Alec Taylor made, which has all those
> >> latest changes). I think it might meet your needs as-is, or with some
> >> tweaks to make it work better on other browsers.
> >> We are currently extracting the text and fonts for the browser to render
> >> directly, but still must rely on Splash, Cairo, etc. to rasterize other
> >> graphic operations. With the way we've done it, we have an easy path to
> >> change over to SVG, one graphic operation at a time, if you'd be interested
> >> in doing that.
> >> The idea of a separate "data" device is interesting, but I don't think
> >> it's the right way to go. In effect, you are talking about changing the PDF
> >> data to XML, and from there to other formats. I can appreciate the
> >> sentiment, since PDF is such a difficult format to work with, but adding a
> >> layer of abstraction is just going to make things more complex,
> >> and slow. To note, the current version of pdftohtml creates a valid
> >> XML-compliant HTML format (actually there's a small bug, but you
> >> get the point). You can always use the XML-compliant HTML as your
> >> easier-to-digest "data" format, which also allows us to represent more
> >> semantics than are available in the original PDF document, and you can
> >> always extend it with whatever XML tags you need. For example, I extended
> >> it with an attribute describing bounding boxes for all of the text.
> >> Let me know if you want the repo invite.
> >> Best, --josh
> >> From: Todd Hubers <todd.hubers at alivate.com.au>
> >> Date: Thu, 3 Nov 2011 18:13:52 -0700
> >> To: "poppler at lists.freedesktop.org" <poppler at lists.freedesktop.org>
> >> Subject: [poppler] Poppler - SVG Device
> >> I'm currently using Poppler for Text extraction and using GhostScript for
> >> PDF to Image functionality, all for viewing PDFs online without requiring a
> >> PDF plugin in the browser.
> >> I noticed Mozilla was working on an interesting project, PDF.js
> >> [https://wiki.mozilla.org/PDF.js]. It loads PDF files with pure JavaScript
> >> (on an HTML5 compatible browser - probably needs canvas).
> >> This is an opportunity for poppler to steam ahead and get some headline
> >> grabbing exposure. The SVG format is well supported by browsers. PDFs are
> >> portable across systems, however SVGs are very portable (and fast) on
> >> the web.
> >> I propose the building of an SVG Device - PDF to SVG. I am currently
> >> considering using PDF to XML, to then perform XML to SVG. Given the status
> >> quo, I believe it's time for PDF to SVG.
> >> I see SVG as a very efficient and therefore powerful web format, and I hope
> >> others in the poppler community will see the potential as I do.
> >> Thanks,
> >> Todd Hubers (BBIT Hons)
> >> Alivate
> >> PS. Perhaps we could then have PDF>Cairo, PDF>SVG, and then tools for
> >> SVG>XML, SVG>HTML, SVG>Text. In any case it would be good to have simply one
> >> direct rendering device and one "data" device.
> "I like to pay taxes. With them, I buy civilization." -- Oliver Wendell