[cairo] PDF backend starts to get interesting

Fri Apr 21 09:31:29 PDT 2006

On Fri, 21 Apr 2006 09:30:24 -0400, "Mike Shaver" wrote:
>
> On 4/21/06, Michael Sweet <mike at easysw.com> wrote:
> > All very cool!  Just to throw out a feature request that would be
> > particularly useful for web browsers and other PDF creators using
> > Cairo - support for hyperlinks and forms.  Not sure what the API
> > should look like - links and form controls need a bounding box, some
> > appearance data, and a URL...
> 
> This is something that's quite interesting to Firefox, indeed.

Yes, there are lots of interesting things that are desirable here.

We're not planning on touching this kind of thing for the 1.2 release,
but I have been keeping it in the back of my mind for a while.

The trick is going to be coming up with a few well-selected
PDF-backend-specific functions that provide the bulk of the necessary
functionality. I definitely don't want to provide the be-all and
end-all PDF-generation API, since that would quickly grow to be larger
than the primary API of cairo itself.

(Though if someone did want to work on that API, or knows of something
similar already, then it would make sense to make these new
PDF-backend-specific hooks work well with it. And then, cairo's PDF
backend implementation might even take advantage of this itself).

Fortunately the ability to name and later reference objects in PDF
should come in quite handy here. For starters we might add API to
allocate an object ID, and API to allow for providing chunks of PDF
to define the objects themselves. What would be left would be the few
hooks to tie object IDs to existing regions/objects already in the PDF
output. Or something like that. Hopefully you get the idea.
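
As a rough sketch of the idea above (not an existing or proposed cairo
API), the following toy C code allocates an object ID first and accepts
the caller's chunk of PDF for that object later. Every name here
(pdf_sketch_t, pdf_sketch_alloc_id, pdf_sketch_define_object) is
hypothetical, and the fixed-size buffer merely stands in for the
backend's real output stream.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only: none of these names exist in cairo. */
typedef struct {
    char buf[4096];   /* accumulated PDF object text (stand-in for a stream) */
    size_t len;       /* bytes currently used in buf */
    int next_id;      /* next unallocated object number */
} pdf_sketch_t;

void pdf_sketch_init(pdf_sketch_t *doc)
{
    assert(doc != NULL);
    doc->len = 0;
    doc->buf[0] = '\0';
    doc->next_id = 1;   /* object number 0 is reserved in PDF */
}

/* Reserve an object ID now; the object body can be supplied later, so
 * other objects may refer to "<id> 0 R" before the body exists. */
int pdf_sketch_alloc_id(pdf_sketch_t *doc)
{
    assert(doc != NULL);
    return doc->next_id++;
}

/* Wrap a caller-supplied chunk of PDF in "<id> 0 obj ... endobj".
 * Returns 0 on success, -1 if the ID was never allocated or the
 * buffer is too small. */
int pdf_sketch_define_object(pdf_sketch_t *doc, int id, const char *body)
{
    size_t room;
    int n;

    assert(doc != NULL && body != NULL);
    if (id < 1 || id >= doc->next_id)
        return -1;

    room = sizeof doc->buf - doc->len;
    n = snprintf(doc->buf + doc->len, room,
                 "%d 0 obj\n%s\nendobj\n", id, body);
    if (n < 0 || (size_t) n >= room) {
        doc->buf[doc->len] = '\0';   /* discard the partial write */
        return -1;
    }
    doc->len += (size_t) n;
    return 0;
}
```

A caller could allocate an ID for, say, a link annotation, pass a
dictionary such as "<< /Type /Annot /Subtype /Link ... >>" as the body,
and leave it to a separate (not shown) hook to attach that ID to a
region of a page.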

I've also spent some time talking to Craig Ringer from the Scribus
project about what its needs are for PDF output. That application has
the wonderful feature of demanding more than any other candidate
consumer of cairo's PDF API that I've heard of. So I think if we can
make Scribus happy we'll be able to make most anyone happy.

Hopefully Craig won't mind me quoting below a list of things that
Scribus might need/want in terms of PDF output which he had sent to me
in a private email earlier. Some of these things will require more direct
support in cairo than others. And some probably need to be placed much
further down cairo's roadmap than others. But anyway, I think this is
a useful list for starting discussion and starting some API design.

The last time Craig and I talked he said he'd give this some more
thought and try to come back at some point with some more concrete API
suggestions. Any progress on that front, Craig?

-Carl

On Thu, 23 Mar 2006 at 21:07:47 +0800, Craig Ringer wrote:
>
> The really tricky one if Scribus was to try to use Cairo would be PDF. 
> Scribus's PDF export needs a lot of control over the output, and needs 
> to generate some rather advanced PDF features. In particular, Scribus 
> needs (to/to have):
> 	- Support for CMYK, RGB, spot colour, true greyscale
> 	  (ie /DeviceCMYK, /DeviceRGB, /DeviceN, /DeviceGray)
> 	- Mixed colour space support (input colours, images, etc
> 	  in different colour spaces and formats)
> 	- Output mixed colour space PDF, convert all to
> 	  one colour space/format (using managed transforms), and/or
> 	  to convert only some elements (eg "make everything CMYK, but
> 	  don't mess with existing CMYK elements" or "Desaturate
> 	  all colours and convert to /DeviceN, but retain spots").
> 	  Some/all of this should be app's problem, but may be
> 	  reasonably implemented within the lib.
> 	- Tag images with colour profiles
> 	- Embed colour profiles in document
> 	- Embed whole fonts, font subsets, or convert text to outlines
> 	  on export
> 	- Add non-graphical PDF features/objects such as PDF forms,
> 	  XObjects, annotations, etc. Some may need graphical elements.
> 	- Constrain PDF feature use by version or ext spec.
> 	  eg limit to PDF 1.3 (most notably: no transparency)
> 	  or PDF 1.4 (notably: transparency, but no layers);
> 	  require PDF/X-3 (mixed colour space tagged objects,
> 	  some feature limits, version limit), PDF/X-1a, PDF/A (?), etc.
> 	  I'm not currently familiar enough with the details of these
> 	  specs to tell you exactly what they demand (Franz has done
> 	  most of that work to date) but all are subsets of the full
> 	  PDF functionality.
> 	- Support PDF layers
> 	- Export large documents with reasonable memory use. Potential
> 	  for VERY large images, text runs. Stream things direct into
> 	  the document where practical (ie JPEG images). Need to avoid
> 	  building whole doc in memory (even whole page if possible)
> 	  but still share resources such as images, fonts, etc.
> 	- Re-use PDF resources such as images. An image should be
> 	  included only once, and referenced. Different parts of the
> 	  image may actually be used (cropped, rotated, scaled),
> 	  but should all come from the same original where possible.
> 	  If only a small part of a large image is used, the app
> 	  should (where resampling settings etc permit) extract just
> 	  that part, but that's probably not Cairo's problem.
> 	- Filters - compression, ASCII encodings
> 	- Encryption (inc security controls; need both old and new
> 	  schemes for compatibility)
> 	- Control over metadata
> 	- Generation of pre-separated output (again, maybe app's
> 	  problem... but it may be much better done in the lib, I
> 	  suspect). Example: document in C, M, Y, K, spot plates. Not
> 	  too important for PDF right now, but needed for PS output,
> 	  on-screen preview.
> 	- Extensible for future PDF capabilities where possible
> 
> That's rather a big list, and unfortunately most of the entries are 
> things Scribus does _now_ (Some, eg PDF/X-1a, PDF/A, transparency 
> flattening, are not). I'm not sure most of it is really that big a deal 
> though. Filters and encryption, for example, can be done with a rather 
> simple abstract interface, then you can just plug in the particular 
> filter chain / encryption method you need.
> 
> One of the likely sticking points will be handling of mixed colour 
> spaces and formats, especially combined with blends, composition, and 
> flattening. Cairo currently assumes simple 8-bit RGB, which just won't 
> work in the DTP space, but I understand that you're considering changing 
> that.
> 
> Some app-enforced limits may need to be imposed on (eg) using a spot 
> colour in a blend to make some problems sane. Scribus has a preflight 
> checker that should be able to do much of that.
> 
> Hopefully most of the other advanced things can be done by low-level 
> access to the PDF backend - perhaps hooks/callbacks to permit the app to 
> fiddle with dictionaries being output, insert new objects, etc. I need 
> to think about this one some more, as I haven't done enough work with 
> features like forms and annotations to be able to think how this might work.
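
To illustrate Craig's remark that filters and encryption could sit
behind a rather simple abstract interface, here is a minimal sketch in
C. It assumes a filter is just a function from one byte buffer to
another; filter_fn, filter_ascii_hex, filter_xor_demo and
run_filter_chain are all hypothetical names, not cairo API. The XOR
stage is only a placeholder for "some other stage" and is not real
encryption, and the ordering shown is not meant to reflect how PDF
itself orders filters and encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: a filter transforms in[0..in_len) into out and
 * returns the number of bytes written, or -1 if out_cap is too small. */
typedef long (*filter_fn)(const unsigned char *in, size_t in_len,
                          unsigned char *out, size_t out_cap);

/* ASCII hex encoding: two hex digits per byte, terminated by '>'. */
long filter_ascii_hex(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t i;

    if (out_cap < in_len * 2 + 1)
        return -1;
    for (i = 0; i < in_len; i++) {
        out[2 * i]     = (unsigned char) digits[in[i] >> 4];
        out[2 * i + 1] = (unsigned char) digits[in[i] & 0x0F];
    }
    out[in_len * 2] = '>';
    return (long) (in_len * 2 + 1);
}

/* Placeholder stage: XOR every byte with a constant. NOT encryption. */
long filter_xor_demo(const unsigned char *in, size_t in_len,
                     unsigned char *out, size_t out_cap)
{
    size_t i;

    if (out_cap < in_len)
        return -1;
    for (i = 0; i < in_len; i++)
        out[i] = in[i] ^ 0x5A;
    return (long) in_len;
}

/* Run the data through each filter in order, alternating between two
 * scratch buffers, then copy the final result into out. */
long run_filter_chain(const filter_fn *chain, size_t n_filters,
                      const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    unsigned char a[1024], b[1024];
    const unsigned char *src = in;
    size_t src_len = in_len;
    unsigned char *dst = a;
    size_t i;

    for (i = 0; i < n_filters; i++) {
        long n = chain[i](src, src_len, dst, sizeof a);
        if (n < 0)
            return -1;
        src = dst;
        src_len = (size_t) n;
        dst = (dst == a) ? b : a;
    }
    if (src_len > out_cap)
        return -1;
    memcpy(out, src, src_len);
    return (long) src_len;
}
```

With an interface of this shape, the application picks the chain (an
array of function pointers) and the backend only needs to know how to
run it over each stream.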