[RFC wayland-protocols] Color management protocol

Thu Jan 5 02:30:23 UTC 2017

Daniel Stone wrote:

Hi Daniel,

> For the purposes of this discussion, I'd like to park the topic of
> calibration for now. It goes without saying that providing facilities
> for non-calibration clients is useless without calibration existing,

I'm puzzled by what you mean by "non-calibration clients".

Taken literally, that translates to applications that don't
perform calibration, which is almost all of them, and I guess
that is not what you mean.

> but the two are surprisingly different from a window-system-mechanism
> point of view; different enough that my current thinking tends towards
> a total compositor bypass for calibration, and just having it drive
> DRM/KMS directly. I'd like to attack and bottom out the
> non-calibration usecase without muddying those waters, though ...

There are dangers in bypasses that go outside the normal
rendering pipeline - if not very carefully understood, they
can lead to the situation where the test values are not
being processed in the same way that normal rendering is
processed, leading to invalid test patch results.

And in any case, this approach always strikes me as really hacky -
if there is a well thought out color management pipeline, then
the same mechanisms used to configure the steps in the pipeline
are the very ones that should work to configure it into a state suitable
for measurement. This is natural and elegant, and much safer.

("Safer" in color management terms typically translates to
 "less likely to lead to difficult to diagnose failures to
  get the expected color, due to processing steps that are
  modifying color in hard to scrutinize ways".)

Chris Murphy wrote:
>>>> The holy grail is as Richard Hughes describes, late binding color
>>>> transforms. In effect every pixel that will go to a display is going
>>>> to be transformed. Every button, every bit of white text, for every
>>>> application. There is no such thing as opt in color management, the
>>>> dumbest program in the world will have its pixels intercepted and
>>>> transformed to make sure it does not really produce 255,255,255
>>>> (deviceRGB) white on the user display.

I wouldn't call this a "holy grail", nor am I sure this really
falls into what is normally regarded as a "late binding" color
workflow.

In color workflows, "late binding" typically refers to delaying
the rendering of original (assumed photographic) source material
into an intermediate device dependent colorspace until it actually
gets to the point where this is necessary (i.e. printing
or display). But this is not automatically better or easier,
it actually comes down to where the rendering intent information
(creative judgment) resides, as well as where the gamut definitions reside.
[At this point I'll omit a whole discussion about the nuances and pros
 and cons of early & late binding.]

What Chris is talking about above is simply providing a mechanism
in a display server to manage, by default, the non-color-managed "RGB"
output from applications. That's very desirable in a world
full of non-color-aware applications (including most desktop
software itself) and native wide gamut displays.

> I completely agree with you! Not only on the 'opt in' point quoted
> just here, but on this. When I said 'opt out' in my reply, I was
> talking about several proposals in the discussion that applications be
> able to 'opt out of colour management' because they already had the
> perfect pipeline created.

Right. A completely different case to dealing with
non-color aware/managed applications.

Daniel Stone wrote:
>>> As arguments to support his solution, Graeme presents a number of
>>> cases such as complete perfect colour accuracy whilst dragging
>>> surfaces between multiple displays, and others which are deeply
>>> understandable coming from X11. The two systems are so vastly
>>> different in their rendering and display pipelines that, whilst the
>>> problems he raises are valid and worth solving, I think he is missing
>>> an endless long tail of problems with his decreed solution caused by
>>> the difference in processing pipelines.

Sorry, I'm not at all convinced that I fail to understand
many of the differences between X11 and Wayland.

>>> Put briefly, I don't believe it's possible to design a system which
>>> makes these guarantees, without the compositor being intimately aware
>>> of the detail.

I think using device links is one way to make this possible,
since this allows decoupling the setting up of the color transform
from when (and by whom) the corresponding pixel transformation is executed.
So the application can then (if it chooses to manage its own color)
set up the color transformations for each output, while the compositor
can execute the pixel transformation on whichever of the surface
pixels it chooses to whichever of the outputs it needs to render to.
Note the implications about what range of source colorspace widths
the compositor then needs to handle though!
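
As a rough illustration of that split (a minimal sketch, not a proposal
for the protocol): the application builds one link per output ahead of
time, and the compositor later applies whichever link matches the output
it is rendering to. Each link is modelled here as a single 3x3 matrix on
linear-light values; real device links are usually multi-dimensional
LUTs. The output names, the XYZ_TO_OUTPUT values and the function names
are all hypothetical.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, rgb):
    """Apply a 3x3 matrix to one linear-light pixel."""
    return [sum(m[i][k] * rgb[k] for k in range(3)) for i in range(3)]

# Linear sRGB -> CIE XYZ (D65), the commonly published matrix.
SRGB_TO_XYZ = [[0.4124, 0.3576, 0.1805],
               [0.2126, 0.7152, 0.0722],
               [0.0193, 0.1192, 0.9505]]

# ASSUMPTION: illustrative XYZ -> device matrices for two outputs.
# "laptop" is sRGB-like, "desktop" is a rounded wide-gamut-like example.
XYZ_TO_OUTPUT = {
    "laptop":  [[3.2406, -1.5372, -0.4986],
                [-0.9689, 1.8758, 0.0415],
                [0.0557, -0.2040, 1.0570]],
    "desktop": [[2.04, -0.56, -0.34],
                [-0.97, 1.88, 0.04],
                [0.01, -0.12, 1.02]],
}

def build_links(source_to_xyz, outputs):
    """Application side: pre-compose source -> output, once per output."""
    return {name: matmul(to_dev, source_to_xyz)
            for name, to_dev in outputs.items()}

def compositor_render(links, output_name, pixels):
    """Compositor side: apply the ready-made link; no profile knowledge needed."""
    link = links[output_name]
    return [apply(link, p) for p in pixels]

links = build_links(SRGB_TO_XYZ, XYZ_TO_OUTPUT)
surface = [[0.2, 0.5, 0.8], [1.0, 1.0, 1.0]]
on_desktop = compositor_render(links, "desktop", surface)
```

The point of the sketch is only the division of labour: the transform is
set up where the color knowledge lives, and executed wherever the pixels
end up being rendered.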

Chris Murphy wrote:
>> I'm not fussy about the architectural details. But there's a real
>> world need for multiple display support, and for images to look
>> correct across multiple displays, and for images to look correct when
>> split across displays. I'm aware there are limits how good this can
>> be, which is why the high end use cases involve self-calibrating
>> displays with their own high bit internal LUT, and the video card LUT
>> is set to linear tone response. In this case, the display color space
>> is identical across all display devices. Simple.

Except it's not, unless all the displays have identical
chromaticity primaries and mix the colors in the
same way. So it is conceivable, yes, for high quality displays that
are notionally identical. But even high end displays
with more complex internal calibration machinery
have the same limitations with regard to how to
match gamuts if the primaries are not identical, the same as
attempting to use video card lut-matrix-lut machinery
- you can't square the circle. You either reduce all
to a common smaller gamut, or lose any control over
how clipping occurs. This may be perfectly OK for
displays that are close to being the same, or for
users that are prepared to sacrifice some gamut,
but it is distinctly not so good when mixing different
types of displays, which happens all the time when someone
docks their laptop.

> Similarly, I'd like to park the discussion about surfaces split across
> multiple displays; it's a red herring.

I'm not sure why you say that, as it seems to lie at the
heart of what's different between (say) X11 and Wayland.

> Again, in X11, your pixel
> content exists in one single flat buffer which is shared between
> displays. This is not a restriction we have in Wayland, and a lot of
> the discussion here has rat-holed on the specifics of how to achieve
> this based on assumptions from X11. It's entirely possible that the
> best solution to this (a problem shared with heterogeneous-DPI
> systems) is to provide multiple buffers.

I'm certainly not assuming anything like a single buffer shared
between displays - all I'm interested in is who sets up and who
does the transformation between the source color spaces
and the output colorspaces. Spatial transformation is orthogonal.

> Or maybe, as you suggest
> below, normalised to an intermediate colour system of perhaps wider
> gamut.

Doesn't solve the problem. Stuffing color into a wide gamut
colorspace is easy, what to do with it after that is hard,
since the transformation of those colors to a specific
output may depend on the source gamut and the destination
gamut. i.e. stuffing things into a wide gamut space doesn't
decouple the transformation without also sacrificing
the color outcome due to loss of control over clipping
or gamut mapping behavior.

Chris Murphy wrote:
>> The video card LUT is a fast transform. It applies to video playback
>> the same as anything else, and has no performance penalty.
>>
>> So then I wonder where the real performance penalty is these days?
>> Video card LUT is a simplistic 2D transform. Maybe the "thing" that
>> ultimately pushes pixels to each display, can push those pixels
>> through a software 2D LUT instead of the hardware one, and do it on 10
>> bits per channel rather than on full bit data.

Daniel Stone wrote:
> Some of the LUTs/matrices in display controllers (see a partial
> enumeration in reply to Mattias) can already handle wide-gamut colour,
> with caveats. Sometimes they will be perfectly appropriate to use, and
> sometimes the lack of granularity will destroy much of their value.

Interesting speculation of course, but I'm not actually sure it
helps, except for specific situations. For instance, it
may help the video situation if there is a hardware decode
pipeline that renders directly to the display buffer (or
if further processing via the CPU or GPU is undesirable
from a processing overhead or power consumption point of view).
Doing accurate color in this situation is hard, because
simple machinery (i.e. matrix and 1D LUTs) assumes a perfectly
behaved output device in terms of additivity, and either
a wider gamut than the video space or a willingness to put
up with whatever clipping the machinery implements (typically
per component clipping, which changes the hue of clipped colors).
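
The hue change from per component clipping can be seen with a small
sketch. It assumes a crude device-space hue angle (not a perceptual
measure), and the function names and the example color are made up for
illustration. The second clipping function is just one simple
alternative that keeps this hue angle fixed; it is not a claim about
what any particular hardware or CMM does.

```python
import math

def hue_deg(rgb):
    """Crude hue angle in degrees of an RGB triple (not perceptual)."""
    r, g, b = rgb
    angle = math.atan2(math.sqrt(3.0) * (g - b), 2.0 * r - g - b)
    return math.degrees(angle) % 360.0

def clip_per_component(rgb):
    """Clamp each channel independently to [0, 1]."""
    return [min(1.0, max(0.0, c)) for c in rgb]

def clip_toward_gray(rgb):
    """Scale the color toward its own gray level until every channel fits."""
    gray = min(1.0, max(0.0, sum(rgb) / 3.0))
    t = 1.0
    for c in rgb:
        d = c - gray
        if d > 0.0 and c > 1.0:
            t = min(t, (1.0 - gray) / d)
        elif d < 0.0 and c < 0.0:
            t = min(t, (0.0 - gray) / d)
    return [gray + t * (c - gray) for c in rgb]

# A value that a matrix might produce for a color outside the display gamut.
out_of_gamut = [1.4, 0.5, -0.2]

hue_before = hue_deg(out_of_gamut)
hue_clamped = hue_deg(clip_per_component(out_of_gamut))
hue_scaled = hue_deg(clip_toward_gray(out_of_gamut))
print(hue_before, hue_clamped, hue_scaled)
```

For this input the per component clamp moves the hue angle by several
degrees, while scaling toward gray leaves it unchanged at the cost of
saturation.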

In general (i.e. for application supplied color), such machinery
doesn't offer much of interest over the far more flexible
and capable mechanisms that an ICC profile (and corresponding CMM)
offer, and that can be applied per application element rather than over the
whole output.

Even if it comes to the point where graphic card hardware routinely
offers a reasonable resolution multi-dimensional LUT
within a CRTC, I can't actually see that as being very useful,
apart from some very specific situations (for some reason you want
your whole desktop to be in a specific emulated colorspace
such as Rec709, rather than leaving it up to each application to take
full advantage of the display's capabilities).

> If
> the compositor is using the GPU for composition, then doing colour
> transformations is extremely cheap, because we're rarely bound on the
> GPU's ALU capacity.

Yes - programmable beats fixed pipelines for flexibility and
possible quality. I'm not so confident that super high quality
color management can't tax a GPU though - a high end
video 3D LUT box supports a 64 x 64 x 64 resolution LUT
(that's something like 1.6 Mbytes of table for a single
transform @ 16 BPC), and MadVR uses a 256 x 256 x 256 table for
its color, i.e. 100 Mbytes on the GPU. And multiple
applications may use multiple transforms.
In practice I wouldn't expect apps to normally push these limits,
because most of this stuff originated on systems with
much more limited CPU and memory, so 17^4 and 33^3 table
resolutions or similar are much more common.
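
The table sizes above follow from simple arithmetic, sketched here
assuming 3 output channels and 16 bits (2 bytes) per value. The reading
of 17^4 as a table with four input channels (e.g. CMYK) is also an
assumption, and lut_bytes is a made-up helper name.

```python
def lut_bytes(grid_points, in_channels=3, out_channels=3, bytes_per_value=2):
    """Size in bytes of a uniformly sampled multi-dimensional LUT."""
    return grid_points ** in_channels * out_channels * bytes_per_value

sizes = {
    "64^3": lut_bytes(64),
    "256^3": lut_bytes(256),
    "33^3": lut_bytes(33),
    "17^4": lut_bytes(17, in_channels=4),
}

for name, n in sizes.items():
    print(f"{name}: {n} bytes ({n / 1e6:.2f} MB)")
```

Under these assumptions the 64^3 table is about 1.6 MB and the 256^3
table about 100 MB, while the 33^3 and 17^4 tables stay well under 1 MB
each.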

> Mind you, I see an ideal steady state for non-alpha-blended
> colour-aware applications on a calibrated display, as involving no
> intermediate transformations other than a sufficiently capable display
> controller LUT/matrix. Efficiency is important, after all. But I think
> designing a system to the particular details of a subset of hardware
> capability today is unnecessarily limiting, and we'd be kicking
> ourselves further down the road if we did so.

Yep. Future project, as HW and possible usage becomes clearer.

Regards,

Graeme Gill.