[poppler] Color Management
Hal V. Engel
hvengel at astound.net
Thu Aug 21 18:51:59 PDT 2008
On Thursday 07 August 2008 06:58:13 pm Leonard Rosenthol wrote:
> Correct. You'd need to make MAJOR changes to the rendering
> architecture of Poppler to do proper color management - especially
> where transparency is involved.
> But it would be a very good thing...
(This is kind of long; please bear with me.)
I have been looking at the PDF spec and at the poppler code. Leonard is
correct that the code in its present form is not designed to support color
management and that it will require major changes.
The PDF spec calls for all CIE based color spaces (CalGray, CalRGB, Lab and
ICCBased) to undergo the following transformations:
original color space -> XYZ -> output color space
This implies that there is clear separation between the first conversion and
the second conversion and that one set of routines can handle the second
conversion for all CIE based color transforms.
It also implies that there is some way for calling applications to specify a
CIE based output color space, such as an ICC profile, and related information
(rendering intent, black point compensation and output channel depth). This is
currently not possible.
The code as it exists does a direct conversion to XYZ in only one location,
GfxLabColorSpace::getRGB(), and then does a second conversion to an arbitrary
RGB** color space in the same function. None of the other CIE based color
space conversions produce an intermediate XYZ value, and of course there
are no generic XYZ to output color space routines.
The gray and CMYK code for Lab conversions uses the getRGB() function and then
does some kind of generic conversion using the RGB values. Since these RGB
values are in an arbitrary RGB** color space, the conversion to gray and CMYK
is also arbitrary.
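
For reference, the Lab to XYZ stage given in the PDF spec can be written on
its own, with no output conversion attached. The following is a minimal Python
sketch, not poppler code; the function name is made up for illustration and
the white point is whatever the color space's WhitePoint entry supplies.

```python
def lab_to_xyz(l_star, a_star, b_star, white_point):
    """Convert L*a*b* to XYZ using the Lab stage formulas from the PDF spec.

    white_point is the (Xw, Yw, Zw) tuple from the color space's WhitePoint
    entry. The function name is hypothetical; it is not a poppler routine.
    """
    def g(x):
        # Cube above the 6/29 threshold, linear segment below it.
        if x >= 6.0 / 29.0:
            return x ** 3
        return (108.0 / 841.0) * (x - 4.0 / 29.0)

    m = (l_star + 16.0) / 116.0
    l = m + a_star / 500.0
    n = m - b_star / 200.0
    xw, yw, zw = white_point
    return (xw * g(l), yw * g(m), zw * g(n))
```

With L* = 100 and a* = b* = 0 this returns the white point itself, which is a
quick sanity check. The result is still XYZ and would need a separate XYZ to
output conversion.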
CalGray does not do gamma compensation in getGray() as called for in the PDF
spec. In addition, the CalGray functions only support 8 bit depth. What
happens if there is a 16 bit/channel gray image in a PDF file, or if a user
wants 16 bit/channel output for his/her printer?
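
As an illustration of what the spec asks for here, the CalGray stage raises
the gray component to the Gamma entry and scales the white point by the
result. This is a minimal Python sketch under the assumption that samples are
first decoded to the 0..1 range; both function names are hypothetical and not
part of poppler.

```python
def decode_sample(value, bits):
    """Map an integer sample of the given bit depth to the range 0.0..1.0."""
    return value / float((1 << bits) - 1)


def calgray_to_xyz(a, white_point, gamma=1.0):
    """CalGray to XYZ: X = Xw * A^G, Y = Yw * A^G, Z = Zw * A^G.

    a is the gray component in 0.0..1.0, white_point is (Xw, Yw, Zw) and
    gamma is the color space's Gamma entry (1.0 when it is absent).
    """
    ag = a ** gamma
    xw, yw, zw = white_point
    return (xw * ag, yw * ag, zw * ag)
```

Because the gamma step works on floating point values, the same routine
serves 8 bit and 16 bit samples once they have been decoded, for example
calgray_to_xyz(decode_sample(32768, 16), white_point, 2.2).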
CalRGB passes RGB values directly through to the output and makes no attempt
to convert these into an intermediate XYZ color space or into the actual
output color space. It also does not apply a gamma correction to the data as
called for in the PDF spec.
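
For comparison, the CalRGB stage in the PDF spec applies a per-channel gamma
and then a 3x3 matrix to reach XYZ. The following is a minimal Python sketch,
not poppler code; the function name is hypothetical, and the defaults mirror
the spec's defaults for the Gamma and Matrix entries.

```python
def calrgb_to_xyz(a, b, c,
                  gamma=(1.0, 1.0, 1.0),
                  matrix=(1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0)):
    """CalRGB to XYZ.

    a, b, c are the components in 0.0..1.0. gamma is (GR, GG, GB). matrix is
    the nine-number Matrix entry in the order
    [XA YA ZA XB YB ZB XC YC ZC].
    """
    ag = a ** gamma[0]
    bg = b ** gamma[1]
    cg = c ** gamma[2]
    xa, ya, za, xb, yb, zb, xc, yc, zc = matrix
    x = xa * ag + xb * bg + xc * cg
    y = ya * ag + yb * bg + yc * cg
    z = za * ag + zb * bg + zc * cg
    return (x, y, z)
```

Note that the WhitePoint entry does not appear in these formulas; it would be
used by the later XYZ to output conversion.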
I don't think that I am pointing out anything new to most on this list. When
I first started to look at the code I was hoping that it would not be too
difficult to find the places where CM hooks could be put in place, and that
perhaps I could spend a few days putting together a set of patches that would
provide a starting place. But it appears that the code needs significant
restructuring before the actual color management specific work can even
start. Fortunately the PDF specification has enough detail that it
should be possible for the poppler team to do much of the restructuring work
without too much involvement from someone with color management expertise.
Mostly what is needed is:
1. An API for applications to specify the output color space, rendering
intent, black point compensation and channel depth for a document. These are
needed to correctly set up the XYZ to output color space transform.
2. The CIE related code (CalRGB, CalGray, Lab and ICCBased) needs to be
restructured so that it has a clean division between the code that produces
the intermediate XYZ values and the code that does the XYZ to output color
space conversion. See the diagram on pages 238 and 239 of the version 1.7 PDF
spec.
Initially the ICCBased code could be left alone, and CalRGB, CalGray and Lab
would be set up to convert to XYZ but would still do the current simplistic
conversions to the output color space (i.e. these would all look somewhat like
GfxLabColorSpace::getRGB() & friends).
Then the XYZ to output color space routines would be added, and the CalRGB,
CalGray and Lab routines would call them to handle the output conversion in a
more correct way. It is at this point that the code would start making use
of a CMS like LCMS as well as the API that was created in #1. This
functionality is documented in the diagram on page 239 of the PDF version 1.7
spec as "Conversion from CIE-based to device color space (not specified by
3. At this point the ICCBased routines would be set up to create intermediate
XYZ results. Once this is in place poppler would have a
functioning color managed system for all of the CIE based color space types.
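
To make items 1 and 2 a little more concrete, here is a rough Python sketch of
the shape such an arrangement might take. Everything in it is hypothetical:
OutputSettings, convert_color and quantize_placeholder are names invented for
illustration, and the placeholder output stage only clamps and quantizes. A
real XYZ to output stage would hand the values to a CMS such as LCMS together
with the profile and intent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputSettings:
    """Hypothetical bundle of what a calling application would supply."""
    profile_path: str                       # ICC profile of the output device
    rendering_intent: str = "relative-colorimetric"
    black_point_compensation: bool = False
    bits_per_channel: int = 8


def convert_color(components, to_xyz, from_xyz, settings):
    """Two-stage conversion: source color space -> XYZ -> output color space.

    to_xyz is specific to the source color space (CalGray, CalRGB, Lab, ...).
    from_xyz is shared by all of them and depends only on the settings.
    """
    xyz = to_xyz(components)
    return from_xyz(xyz, settings)


def quantize_placeholder(xyz, settings):
    """Stand-in for the XYZ -> output stage; NOT a color conversion.

    It only clamps to 0..1 and scales to the requested bit depth, to show
    where the channel depth from the settings would be applied.
    """
    max_code = (1 << settings.bits_per_channel) - 1
    return tuple(round(min(max(v, 0.0), 1.0) * max_code) for v in xyz)
```

The point of the split is that each source color space only has to provide
its own to_xyz routine, while the single from_xyz routine is the one place
that knows about the output profile, intent and bit depth.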
I have not looked at the PDF specs for "Special color spaces" such as
Separation, DeviceN, Indexed and Pattern, so I do not know what implications
there are for these in a color managed system. I have also not looked at
transparency issues, but I didn't see much in the spec that related to ICCBased
objects other than that it is necessary to have both AtoB and BtoA tables in
the profiles to support this, that all blending must be either in device
space or in CIE space, and that the spec advises that CIE is preferred.
Many of you are probably asking "Why should we give a rat's behind about this?"
The reason I am looking at this is that the printing community is in the
process of converting from PostScript to PDF as the standard document type for
printing for *nix systems. Because of this they are in the process of writing
a new pdftoraster filter for CUPS as well as putting together a PDF based
"Common Printing Dialog" (this is a Google Summer of Code project). I asked
on the printing email lists about color management in the new printing
workflow. Specifically, did the new pdftoraster filter have CM support? I
didn't get an answer, so I found the code and had a look. Guess what (most
of you probably know this): it is using poppler, and as a result it does not
support color management.
Printing is an important piece of the infrastructure in general and
particularly for anyone doing color critical work. It is currently badly
broken with respect to color management, and it appears that until poppler has
CM support or another similar library with CM support becomes available it
will remain broken. So now you know what the story is and why this is
I have color management expertise (I am the maintainer of LProf) and I have
connections to the open source color management community, where there are many
others with CM expertise, some of them with way more expertise than I have. As
a community we have been working with the printing community, Xorg and the
DEs, trying to get the missing CM components in place.
There has been major progress with color management on monitors, and we now
have calibration and profiling software for these devices as well as support
in Xorg (with more on the way), and work is underway to extend this to include
better user tools in KDE and GNOME. We also have good open source tools for
profiling input devices like cameras and scanners, and these profiles are now
well supported by things like XSane, UFRaw and other software in this area.
We also have high quality profiling software for output devices like printers.
One of the major things that is missing is a viable color managed path for
printing. That is, I can create very high quality profiles for my printers, but
using them with the existing printing tools is at best difficult even for
someone who really understands CM and nearly impossible for a normal user. If
the CUPS pdftoraster filter had CM support, much of the printing part of this
would start falling into place and would become accessible to a much wider
audience. So from the point of view of the CM community poppler is now a very
important piece of software.
PS: For those interested, this web page from the ICC has a link to a PDF file
from Adobe that demonstrates CM or a lack of CM as the case may be:
**It looks like it might end up being sort of sRGB since it uses a matrix
conversion and then a gamma conversion that is too simple to give actual sRGB
results since sRGB has a compound gamma curve.
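
To show the difference this footnote refers to, here is a minimal Python
sketch comparing the sRGB compound encoding curve with a plain power curve.
The function names and the exponent of 2.2 are assumptions chosen for
illustration; they are not taken from the poppler source.

```python
def srgb_encode(linear):
    """sRGB compound curve: a linear segment near black, then a power law."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1.0 / 2.4) - 0.055


def simple_gamma_encode(linear, gamma=2.2):
    """Plain power curve, the kind of shortcut the footnote describes."""
    return linear ** (1.0 / gamma)


if __name__ == "__main__":
    # Print both encodings side by side for a few linear values.
    for v in (0.001, 0.01, 0.18, 0.5, 1.0):
        print(v, srgb_encode(v), simple_gamma_encode(v))
```

The two curves agree at 0 and 1 but diverge most in the shadows, where the
sRGB curve is linear and a plain power curve rises much faster.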