[cairo] Re: Embeding JPG in PDF

Fri Jan 12 18:05:04 PST 2007

Hello,

On 1/13/07, Bill Spitzak <spitzak at d2.com> wrote:
>
>
> Pierre wrote:
> > Hello Bill,
> >
> > On 1/12/07, Bill Spitzak <spitzak at d2.com> wrote:
> >
> >> However the original problem is not solved. Even with the above, a
> >> program reading a jpeg in and drawing it on a pdf surface using cairo
> >> will result in the jpeg being decompressed and then recompressed. This
> >> is lossy and slow. The only way it will happen is if the pdf surface can
> >> look at the surface it is copying from and identify that it can get raw
> >> jpeg data from it.
> >
> > My last reply was certainly unclear, let me try to explain it again.
> > PDF can embed images (like jpeg) without uncompressing it and without
> > doing anything but embed the full jpeg file in an object. All you have
> > to do is to detect the format, the color scheme, bpp and dimensions.
> > With this information you can create an XObject and embed the
> > _complete_ jpeg file in a stream element without uncompressing it.
>
> Your plan means that any program that wants to write pdfs efficiently
> must have special code to draw a jpeg into it. I think the original
> request was to make a method by which simple code that displays the
> document on screen can, without changes, produce an efficient pdf file.

From what I understand, a program that prints an image to PDF ends up
embedding RGB buffers (see emit_image in the PDF code, for example).
Also, the first sentence of the bug report describes exactly this plan:

"PDF files can embed PNG and JPG files as is. This would be a useful feature"

The example used to explain the problem (an image viewer) is exactly
why I used this method in PDF documents in the past: to create photo
catalogs as PDF documents without having to worry about
compression/decompression. I took the files and "simply" embedded them.
It worked with both files and buffers.

If the app draws a grid of images on the screen, the only change it
needs in order to draw to a PDF is to pass a file path or a buffer
instead of loading a jpeg/png/gif/etc. into a cairo image surface. Once
an image has been embedded, the app can display it many times, at any
position and at any size.
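
To illustrate the "embed once, display many times" point, here is a
minimal sketch of the PDF content-stream operators that paint one
already-embedded image XObject several times. The function names and
the resource name "Im1" are hypothetical and are not taken from the
patch; only the PDF operators (q, cm, Do, Q) come from the PDF format
itself.

```python
def place_image(name, x, y, width, height):
    """Return content-stream operators that paint an image XObject.

    An image XObject fills the unit square, so the matrix passed to
    `cm` scales it to width x height and moves it to (x, y).
    """
    return "q %g 0 0 %g %g %g cm /%s Do Q" % (width, height, x, y, name)


def image_grid(name, columns, rows, cell, margin=10.0):
    """Paint the same XObject in a columns x rows grid of square cells."""
    ops = []
    for row in range(rows):
        for col in range(columns):
            x = margin + col * (cell + margin)
            y = margin + row * (cell + margin)
            ops.append(place_image(name, x, y, cell, cell))
    return "\n".join(ops)
```

Each placement costs one short line in the content stream, while the
JPEG data itself is stored only once in the file.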

I fail to see a more efficient way to embed images (PNG, JPEG or GIF
all work) than reading 500-1024 bytes, fetching the info and storing
the complete binary data in the PDF. There is no expensive operation
except the I/O if the data comes from a file.
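
As a rough sketch of that process for the JPEG case only: scan the
header markers until the frame header (SOF) is found, read the
dimensions, bit depth and component count from it, and wrap the
untouched file bytes in an image XObject with the DCTDecode filter.
This is an illustration under my own assumptions, not the code from the
patch; jpeg_info and image_xobject are hypothetical names, and the
mapping from component count to color space is a simplification.

```python
import struct

# SOF markers hold the frame header. 0xC4 (DHT), 0xC8 (JPG) and
# 0xCC (DAC) sit in the same range but are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}

# Assumption: the component count alone decides the color space.
# Some CMYK JPEGs also need a /Decode array, which is omitted here.
COLORSPACES = {1: "/DeviceGray", 3: "/DeviceRGB", 4: "/DeviceCMYK"}


def jpeg_info(data):
    """Return (width, height, bits_per_component, components)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte in front of a marker: skip it.
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            # Standalone markers carry no length field.
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            bits, height, width, comps = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10])
            return width, height, bits, comps
        # The length field counts itself but not the two marker bytes.
        pos += 2 + length
    raise ValueError("no SOF marker found")


def image_xobject(data):
    """Wrap the complete, still-compressed JPEG in an image XObject."""
    width, height, bits, comps = jpeg_info(data)
    header = (
        "<< /Type /XObject /Subtype /Image"
        " /Width %d /Height %d"
        " /ColorSpace %s /BitsPerComponent %d"
        " /Filter /DCTDecode /Length %d >>\nstream\n"
        % (width, height, COLORSPACES[comps], bits, len(data))
    ).encode("ascii")
    return header + data + b"\nendstream"
```

Nothing is decoded here: only a few header fields are read, and the
stream body is the original file byte for byte.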

> The idea was that cairo provide a convenient way to make the source be a
> jpeg image, and that the pdf writer have code to recognize it and skip
> the cairo uncompressing and compositing of the pixels.

That's what this patch aims to do. It uses a file for now, but it is
possible to pass a buffer containing JPEG data instead; the process
remains the same.

--Pierre