[cairo] Size of PDF with lots of images

Simon Sapin simon.sapin at exyr.org
Fri Jan 17 06:59:48 PST 2014

On 17/01/2014 03:44, Behdad Esfahbod wrote:
> On 14-01-17 04:20 AM, Adrian Johnson wrote:
>> On 17/01/14 06:45, Simon Sapin wrote:
>>> On 16/01/2014 19:57, Adrian Johnson wrote:
>>>>> If your images were in a format that the PDF backend supports [2] (which
>>>>> includes JPEG but not PNG), you could use cairo_surface_set_mime_data()
>>>>> to have cairo store the original image data (almost) as-is in PDF,
>>>>> without re-compressing. Although I expect that lossy JPEG may not look
>>>>> nice for these specific images.
>>>> Yes, jpeg images are the most likely reason for the increase in size.
>>>> I'm not sure what you mean by "lossy JPEG may not look
>>>> nice for these specific images". The jpeg data is stored exactly as
>>>> provided to cairo_surface_set_mime_data() so there will be no loss of
>>>> image quality.
>>> Behdad’s images are PNG here, not JPEG. They are computer-generated with
>>> sharp edges, the kind where JPEG does not do well.
>> I see two jpegs in his git repo.
> Yes, the background image is a 0.5MB JPEG. But the other 13.5MB are PNGs.
> Now it occurred to me that I also use a surface cache and that's why the
> background image wasn't reembedded 150 times!  I was under the impression that
> cairo's being smart...  I still think it should try to be smart by hashing
> images and reusing them.

Behdad, I’d give cairocffi and Surface.set_mime_data() a try for the
JPEG image.
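
Roughly what I have in mind is below. It is a minimal sketch, not tested against your code: looks_like_jpeg(), jpeg_surface() and the dict-based cache are hypothetical helpers, not part of cairo or cairocffi, and I am assuming the caller already knows the image's pixel size. The cache is keyed by a hash of the bytes, so identical images share one surface, in the spirit of your surface cache.

```python
import hashlib

JPEG_MAGIC = b"\xff\xd8"  # a JPEG stream starts with the SOI marker


def looks_like_jpeg(data):
    """Cheap sanity check before labelling bytes as image/jpeg."""
    return data[:2] == JPEG_MAGIC


def jpeg_surface(jpeg_bytes, width, height, cache):
    """Return a surface carrying the raw JPEG, one per distinct image.

    `cache` is a plain dict keyed by the SHA-256 of the bytes
    (an assumption of this sketch, not a cairo feature).
    """
    if not looks_like_jpeg(jpeg_bytes):
        raise ValueError("not JPEG data")

    # Imported lazily so the check above works without cairocffi.
    import cairocffi

    key = hashlib.sha256(jpeg_bytes).hexdigest()
    if key not in cache:
        # The pixels are left blank here. Backends that ignore MIME
        # data would draw them, so real code should also decode the
        # image into the surface.
        surface = cairocffi.ImageSurface(
            cairocffi.FORMAT_RGB24, width, height)
        surface.set_mime_data("image/jpeg", jpeg_bytes)
        cache[key] = surface
    return cache[key]
```

You would then draw the returned surface on a PDFSurface context as usual, with set_source_surface() followed by paint().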

Simon Sapin
