[poppler] Toward to JBIG2 support in CairoOutputDev

Adrian Johnson ajohnson at redneon.com
Wed Dec 31 15:34:49 PST 2014

On 31/12/14 18:59, suzuki toshiya wrote:
> Cairo interface to manage JBIG2Globals
> --------------------------------------
> In cairo, we can pass 3 kinds related to JBIG2 data
> via cairo_surface_set_mime_data() API;
> 1) JBIG2 data itself (the stream in "5 0 obj" itself, in
> above example),
> 2) JBIG2 global data (the stream in "6 0 R" in above example),
> 3) Unique ID to specify which JBIG2 global data should be
> used in the decoding process.
> Yet I'm not fully understanding the official design in cairo,
> it seems that: unique-id (3) is passed for first, and JBIG2
> image (1) is passed in next, and finally JBIG2 global data
> (2) is passed - when JBIG2 image is passed, cairo bind it
> with the latest declaration of the unique-id, and, when
> JBIG2 global data (2) is passed to cairo, cairo binds it
> with the latest declared unique-id. Therefore, even if
> we repeat sending same JBIG2 global data (2), as far as
> we don't change unique-id (3), only 1 JBIG2 global data
> is emitted to PDF output.

Usage of the cairo JBIG2 API would go something like this:

For each JBIG2 image encountered by CairoOutputDev:
1) Create a cairo image surface (CairoOutputDev already does this).

2) Set CAIRO_MIME_TYPE_UNIQUE_ID on the image to ensure only one
instance of each image is embedded (CairoOutputDev already does this).

3) Set CAIRO_MIME_TYPE_JBIG2 on the image to the JBIG2 data (the 5 0
stream in your example above).

4) If the JBIG2 stream uses global data:

4a) Set CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID on the image to some unique
identifier. For your example I suggest "6-0". The namespace is unique to
the CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID mime type so you do not need any
prefix like "pdf-jbig2-globals-".

4b) Set CAIRO_MIME_TYPE_JBIG2_GLOBAL on the image to the global data.
You only need to do this for one of the images that share the same
CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID. Setting it on more than one is
harmless. Cairo will only embed one copy.

5) Paint the image (CairoOutputDev already does this).
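
As a rough illustration of step 4a, the identifier could be built from the
object and generation numbers of the globals stream. This is only a sketch:
the helper name makeJBIG2GlobalsId and its two integer parameters are
hypothetical and are not part of poppler or cairo. The resulting string is
what would be set as the CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID data.

```cpp
#include <string>

// Hypothetical helper (not a poppler or cairo function): build the
// identifier for a JBIG2 globals stream from the object number and
// generation number of the reference that points at it.
std::string makeJBIG2GlobalsId(int objNum, int objGen)
{
    // No "pdf-jbig2-globals-" style prefix is added, since the ID
    // namespace is private to CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID.
    return std::to_string(objNum) + "-" + std::to_string(objGen);
}
```

For the "6 0 R" globals stream in the example, makeJBIG2GlobalsId(6, 0)
gives "6-0". Every image that refers to that stream gets the same string,
which is what lets cairo embed the global data only once.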

> The problem is "how we can determine the unique-id for
> JBIG2 global data?".
> Problem to make a unique-id for JBIG2Globals in PDF
> ---------------------------------------------------
> The easiest & straight-forward idea would be using the
> object reference and generation number (referring the
> JBIG2 global data) to form a unique-id. In above example,
> we can declare as "pdf-jbig2-globals-6-0".
> But, it seems that current design of JBIG2Stream hold
> the stream itself, not the indirect object referring
> to the stream (in above example, JBIG2Stream class
> could access to the content of "6 0 R" stream, but
> could not know how it is referred - the reference number
> (=6) and generation number (=0)).

You could
 1) add a function to JBIG2Stream to return the ref of the stream, or
 2) walk through the image stream dictionary (provided to
CairoOutputDev::drawImage) and pick out the global stream.

> Furthermore, we could imagine a worse case, differently
> chained reference to same object;

The poppler lookup functions should follow the chain and return the
final ref unless you use the *NF (no follow) variants.

Object::dictLookup / Object::dictLookupNF
Object::arrayGet / Object::arrayGetNF
Dict::lookup / Dict::lookupNF
