[cairo] Re: Embeding JPG in PDF

Tue Jan 9 13:39:30 PST 2007

Emmanuel Pacaud wrote:

> That would be better, and even better without a specific surface. A
> cairo_image_surface_create_from_jpeg () would load a jpeg image in
> memory and keep it as jpeg data, with some internal cairo API that would
> allow other backend to get directly these raw data. Only an
> acquire_image would trigger jpeg decompression.

If this is going to be done, it is probably a good idea to match what 
most such libraries do today, which is to understand several image 
types (jpeg, png, etc.).

cairo_image_surface_create_from_file(filename) would identify the type 
of file and make an image surface containing its contents.

There also MUST be a cairo_image_surface_create_from_data(data,length) 
which takes a memory-mapped file image. This allows the image to be 
embedded in the program or in another database.

Typically there is also an implementation that takes caller-defined 
read-block and close callbacks, similar to how the pdf surface can write 
a file. This allows it to accept already-opened C++ streams or C FILE 
objects. This version must also accept the data that is passed to the 
register function described below.
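
A minimal sketch of what such a callback-based source could look like. 
None of these names (img_stream, img_read_func, mem_source, and so on) 
exist in cairo; they are made up for illustration. It shows one adapter 
for an already-opened C FILE and one for a block of memory, plus a 
helper that pulls the first bytes off the stream for type 
identification.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback types; not part of cairo. */
typedef size_t (*img_read_func)(void *closure, unsigned char *buf, size_t len);
typedef void (*img_close_func)(void *closure);

/* A source is just the two callbacks plus the caller's closure pointer. */
typedef struct {
    img_read_func read;
    img_close_func close;
    void *closure;
} img_stream;

/* Adapter for an already-opened C FILE object. */
size_t file_read(void *closure, unsigned char *buf, size_t len)
{
    return fread(buf, 1, len, (FILE *)closure);
}

void file_close(void *closure)
{
    fclose((FILE *)closure);
}

img_stream img_stream_from_file(FILE *fp)
{
    img_stream s = { file_read, file_close, fp };
    return s;
}

/* Adapter for a block of memory (embedded image, mmap'ed file, ...). */
typedef struct {
    const unsigned char *data;
    size_t length;
    size_t pos;
} mem_source;

size_t mem_read(void *closure, unsigned char *buf, size_t len)
{
    mem_source *m = closure;
    size_t left = m->length - m->pos;
    if (len > left)
        len = left;
    memcpy(buf, m->data + m->pos, len);
    m->pos += len;
    return len;
}

void mem_close(void *closure)
{
    (void)closure;              /* the caller owns the memory */
}

img_stream img_stream_from_memory(mem_source *m)
{
    img_stream s = { mem_read, mem_close, m };
    return s;
}

/* Read up to len bytes from the start of the stream, e.g. the block
 * handed to the type-identification code. Returns the bytes read. */
size_t img_stream_read_header(img_stream *s, unsigned char *buf, size_t len)
{
    size_t total = 0;
    while (total < len) {
        size_t n = s->read(s->closure, buf + total, len - total);
        if (n == 0)
            break;              /* end of stream or read error */
        total += n;
    }
    return total;
}
```

A C++ stream adapter would have the same shape, with the closure 
pointing at the stream object.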

To avoid having cairo link every image library known, the solution seems 
to be to "register" a constructor function. This involves calling some 
cairo API with a pointer to a function that takes a filename (used only 
for pattern-matching; the file is not opened), a block of data that is 
the first 512 or so bytes of the file, and whatever arguments are passed 
to the base class constructor, such as the read and close callbacks. 
This function figures out from the block of data (or from the filename, 
if it is really stupid) whether it can read the data, and if so 
constructs the proper subclass and returns it. It returns null if the 
test fails. It helps a lot if platform-specific tricks are used so that, 
for instance, the jpeg subclass can be in the cairo library, yet a 
program that never registers it works whether or not libjpeg exists on 
the machine. Add a "register_common_images" function that gets you jpeg 
and png.
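
A rough sketch of the registration scheme, to make the shape concrete. 
Every name here (img_register, img_identify, img_kind, try_png, 
try_jpeg) is hypothetical and nothing like this exists in cairo. To 
keep it short, the "constructor" returns a static descriptor rather 
than allocating a real subclass, it omits the read and close callback 
arguments, and it does not attempt the platform-specific linking 
tricks.

```c
#include <stddef.h>
#include <string.h>

#define IMG_MAX_CONSTRUCTORS 16

/* Stand-in for the image subclass a real constructor would build. */
typedef struct {
    const char *format;
} img_kind;

/* A constructor gets the filename (never opened) and the first bytes
 * of the file. It returns NULL if it does not recognize the data. */
typedef const img_kind *(*img_constructor)(const char *filename,
                                           const unsigned char *header,
                                           size_t header_len);

static img_constructor registry[IMG_MAX_CONSTRUCTORS];
static size_t registry_count;

/* Returns 1 on success, 0 if the table is full. */
int img_register(img_constructor fn)
{
    if (registry_count == IMG_MAX_CONSTRUCTORS)
        return 0;
    registry[registry_count++] = fn;
    return 1;
}

const img_kind *try_png(const char *filename,
                        const unsigned char *header, size_t header_len)
{
    static const img_kind png = { "png" };
    static const unsigned char sig[8] =
        { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n' };
    (void)filename;
    if (header_len >= 8 && memcmp(header, sig, 8) == 0)
        return &png;
    return NULL;
}

const img_kind *try_jpeg(const char *filename,
                         const unsigned char *header, size_t header_len)
{
    static const img_kind jpeg = { "jpeg" };
    (void)filename;
    /* JPEG files start with the SOI marker FF D8, then another marker. */
    if (header_len >= 3 &&
        header[0] == 0xFF && header[1] == 0xD8 && header[2] == 0xFF)
        return &jpeg;
    return NULL;
}

void img_register_common_images(void)
{
    img_register(try_png);
    img_register(try_jpeg);
}

/* Ask each registered constructor in turn; the first match wins. */
const img_kind *img_identify(const char *filename,
                             const unsigned char *header, size_t header_len)
{
    for (size_t i = 0; i < registry_count; i++) {
        const img_kind *k = registry[i](filename, header, header_len);
        if (k != NULL)
            return k;
    }
    return NULL;
}
```

A program that never calls img_register_common_images() simply gets 
NULL for every file, which is the behavior wanted when the image 
libraries are not available.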

A huge annoyance is that for most image file libraries it is quite 
inefficient to measure the image without reading it at the same time, if 
you assume there is any chance that after measuring the image the caller 
will want the decompressed data. The result has usually been that it is 
impossible to avoid the decompression when the object is constructed, or 
at least when it is first used as a source. So the pdf example will 
likely result in the image being decompressed into memory even if that 
data is not used. Some file types can get the width/height from the 
block of data used to identify the file, so they don't have this problem.
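
PNG is one such type: the IHDR chunk must come first, so the width and 
height sit at fixed offsets within the first 24 bytes. A small sketch 
(png_size_from_header is a made-up name, not a cairo or libpng 
function):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit big-endian integer. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Layout: 8-byte signature, 4-byte chunk length, "IHDR",
 * 4-byte width, 4-byte height. Returns 1 on success, 0 otherwise. */
int png_size_from_header(const unsigned char *header, size_t len,
                         uint32_t *width, uint32_t *height)
{
    static const unsigned char sig[8] =
        { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n' };

    if (len < 24)
        return 0;
    if (memcmp(header, sig, 8) != 0)
        return 0;
    if (memcmp(header + 12, "IHDR", 4) != 0)
        return 0;
    *width = be32(header + 16);
    *height = be32(header + 20);
    return 1;
}
```

No decompression is needed here, so the 512-byte identification block 
is enough to measure the image.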