[PATCH] unstable: add HDR Mastering Meta-data Protocol

Fri Mar 1 14:05:56 UTC 2019

On Fri, 01 Mar 2019 15:22:17 +0530
Harish Krupo <harish.krupo.kps at intel.com> wrote:

> Hi Pekka,
> 
> I have some comments to add to Ankit's. Please find them inline.
> 
> Nautiyal, Ankit K <ankit.k.nautiyal at intel.com> writes:

Hi Ankit and Harish,

comments below.

> > Hi Pekka,
> >
> > Thanks for the comments and suggestions.
> >
> > Please my responses inline:
> >
> > On 2/28/2019 7:51 PM, Pekka Paalanen wrote:  
> >> On Wed, 27 Feb 2019 10:27:16 +0530
> >> "Nautiyal, Ankit K"<ankit.k.nautiyal at intel.com>  wrote:
> >>  
> >>> From: Ankit Nautiyal<ankit.k.nautiyal at intel.com>
> >>>
> >>> This protcol enables a client to send the hdr meta-data:
> >>> MAX-CLL, MAX-FALL, Max Luminance and Min Luminance as defined by
> >>> SMPTE ST.2086.
> >>> The clients get these values for an HDR video, encoded for a video
> >>> stream/file. MAX-CLL (Maximum Content Light Level) tells the brightest
> >>> pixel in the entire stream/file in nits.
> >>> MAX-FALL (Maximum Frame Average Light Level) tells the highest frame
> >>> average brightness in nits for a single frame. Max and Min Luminance
> >>> tells the max/min Luminance for the mastering display.
> >>> These values give an idea of the brightness of the video which can be
> >>> used by displays, so that they can adjust themselves for a better
> >>> viewing experience.
> >>>
> >>> The protocol depends on the Color Management Protocol which is still
> >>> under review. There are couple of propsed protocols by Neils Ole [1],
> >>> and Sebastian Wick [2], which allow a client to select a color-space
> >>> for a surface, via ICC color profile files.
> >>> The client is expected to set the color space using the icc files and
> >>> the color-management protocol.
> >>>  
> 
> I believe this part should be removed as the hdr_surface object wraps
> over the wl_surface and doesn't interact with colorspace objects.
> More on this below.
> 
> >>> [1]https://patchwork.freedesktop.org/patch/134570/
> >>> [2]https://patchwork.freedesktop.org/patch/286062/
> >>>
> >>> Co-authored-by: Harish Krupo<harish.krupo.kps at intel.com>
> >>> Signed-off-by: Ankit Nautiyal<ankit.k.nautiyal at intel.com>  
> >> Hi Ankit,
> >>
> >> thanks for working on this, comments inline.
> >>
> >> I do wonder if this should just be baked into a color management
> >> extension directly. More on that below.  
> >
> > We had this in mind initially with this line of thought in couple of early
> > interactions with Niels and Sebastian.
> > Later I had a discussion in #wayland with Sebastian, and it seemed that the
> > brightness values should be handled separately,
> > as these do not directly come under ICC profiles as far as I understand.
> > As we know HDR  comprises of :
> > * Transfer functions
> > * Color primaries
> > * Mastering meta-data : MAX_CLL, MAX_FALL etc.
> > Out of these, the first two, can be handled by color-manager and would not
> > change often.
> > HDR mastering meta-data, as is discussed is coming from the video stream, and
> > with the dynamic HDR mastering metadata,
> > these brightness/luminance values might change frame by frame.
> > So it makes sense to have a separate protocol for this, where a client can tell
> > the compositor about the color-space once with the color-manager protocol,
> > and these luminance values, can be sent to the compositor as per the video
> > frames.

We could have a separate request in the color management protocol
instead of having this whole another protocol extension, that is what I
meant. If ICC profile does not contain all the parameters HDR needs,
then add a request for those that were missed. Being a separate
request, a client can choose to send them or not.

The per-frame change is not a problem. Both color management proposals
already support changing the color profile for each frame. Not changing
the color profile while changing the HDR mastering parameters is not a
problem either.

The only question is, do color management and HDR support conceptually
make sense as independent features, or would implementations always
support both with essentially the same effort?

"Supporting HDR" here means only that the compositor is able to process
HDR content from clients, it does not include the capability to drive
HDR monitors. IOW, a compositor that only converts HDR content to SDR
is a valid HDR implementation from protocol perspective. It merely
advertises all outputs as SDR.

> 
> Agreed. The two protocols do not interact with each other and can be used
> independently.
> 
> >>> ---
> >>>   Makefile.am                                        |  1 +
> >>>   unstable/hdr-mastering-metadata/README             |  5 ++
> >>>   .../hdr-mastering-metadata-unstable-v1.xml         | 95 ++++++++++++++++++++++
> >>>   3 files changed, 101 insertions(+)
> >>>   create mode 100644 unstable/hdr-mastering-metadata/README
> >>>   create mode 100644 unstable/hdr-mastering-metadata/hdr-mastering-metadata-unstable-v1.xml

> >>> diff --git a/unstable/hdr-mastering-metadata/hdr-mastering-metadata-unstable-v1.xml b/unstable/hdr-mastering-metadata/hdr-mastering-metadata-unstable-v1.xml
> >>> new file mode 100644
> >>> index 0000000..aeddf39
> >>> --- /dev/null
> >>> +++ b/unstable/hdr-mastering-metadata/hdr-mastering-metadata-unstable-v1.xml  
> >> Could it be named hdr-mastering rather than hdr-mastering-metadata?
> >> Shorter C function names wouldn't hurt.  
> > Yes I guess, thanks for the suggestion, or perhaps "hdr-metadata" ?
> >  
> >>> @@ -0,0 +1,95 @@
> >>> +<?xml version="1.0" encoding="UTF-8"?>
> >>> +<protocol name="hdr_mastering_metadata_unstable_v1">
> >>> +
> >>> +  <copyright>
> >>> +    Copyright © 2019 Intel
> >>> +

> >>> +  </copyright>
> >>> +
> >>> +  <description summary="hdr mastering meta data protocol">  
> >> I think this chapter should explicitly refer to the SMPTE
> >> specification. You did it in the commit message, but I think it would be
> >> appropriate here.
> >>
> >> The commit message explains a lot of what this is. The commit message
> >> should concentrate on why this extension is needed and why it is like
> >> this, and leave the what for the protocol documentation.  
> >
> > Point taken. Will make the commit concise and elaborate more in protocol
> > documentation.
> >
> >  
> >>> +    This protocol provides the ability to specify the mastering color volume
> >>> +    metadata for an HDR video, for a given surface.
> >>> +    These values give an idea of the brightness of the video, which can be
> >>> +    used by the display so that it can adjust itself for a better viewing
> >>> +    experience.
> >>> +
> >>> +    The hdr-metadata values are enocoded in the video and the client can
> >>> +    retreive these values and provide them to the compositor.
> >>> +
> >>> +    A client need to first get the color-space interface for the HDR
> >>> +    color-space using the color-manager protocol, via ICC profile.
> >>> +    Then it needs to get the hdr_mastering_surface using
> >>> +    hdr_mastering_metadata interface. The get_hdr_surface(), provides the
> >>> +    hdr_surface interface which can be used to set the hdr mastering meta-data.  
> >> Is this interface, or what it provides, completely useless without an
> >> explicit color space / color image encoding definition?
> >>
> >> Can one not apply the HDR parameters to the usual assumed sRGB content
> >> with 8 bits per channel? Sure, it would probably look ugly, but so does
> >> using a RGB332 pixel format for instance. I mean, mathematically the
> >> definition of what should happen exists, right?  
> > You are absolutely right that these can be for other color space, and is not
> > tied to HDR color space.
> > Had discussion with Harish from my team, who brought more clarity.
> >
> > Harish, can you elaborate on this?
> >  
> 
> The two components, colorspace and HDR information are independent.
> Although I haven't seen one but it is technically possible to have HDR
> metadata associated with REC 709/sRGB image.

Right, that means it makes sense to attach HDR mastering parameters to
a wl_surface, as opposed to a color image encoding.

> >> OTOH, if you do want to tie this to a color management extension, it
> >> should probably use an object from the color management extension as
> >> the base instead of wl_surface for instance. Or, you'd need an error
> >> code for color management not set on the wl_surface.  
> 
> We wouldn't need this as we will be using the wl_surface as our base
> object.

Sorry, that is cause and effect reversed.

If you do require color management (because it allows you to define the
color space for instance), then you should integrate with the color
management extension, not wl_surface.

If you do not require color management, then you can hook up to
wl_surface directly.

> >
> > Yes this probably needs more discussion.
> > Initially I was discussing the possibility of having the color-space interface
> > as defined by Niel Ole's or Sebastian's proposed protocols.
> > As we understand, that these are not tied to HDR, but still HDR meta-data
> > comprises of the color-primaries as well, so whether we
> > can use this protocol, without setting the color-space, I am not entirely sure.
> >  
> 
> The color primaries that are provided as a part of the HDR metadata are
> the mastering display's primaries. It is possible that the primaries of
> the mastering display did not exactly match with that of a standard
> colorspace's (BT2020 / DCI-P3) primaries. Right now, for our use case
> this information is sufficient. If we need that data in the future we
> could add it as set_metadata_primaries and bump the protocol's version
> to v2.

ICC profiles can define the primaries, that I am pretty sure of. If you
add another way to define primaries, you add confusion. Which extension
should win when both are used? Do you use anything else from the ICC
profile if you use the HDR primaries?

If you add the request to set the primaries, then I suggest you make
the HDR extension and color management extension mutually exclusive: a
wl_surface can only have one, never both.

The big question here is: does a HDR video, with reasonable defaults if
not explicitly defined, allow to programmatically manufacture a custom
ICC profile with the HDR primaries?

So far the color management extension proposals use ICC profiles as the
encoding of the color parameters. Either you leverage that or you
invent your own, everything in between is just a bog (or a bug).

> >> Is there a case for compositors that implement color management but not
> >> HDR support? Would it be unreasonable to make HDR parameters a part of
> >> the color management extension? Such that it would be optional for a
> >> client to set the HDR parameters.  
> 
> It is possible to just implement colospace conversion. The image quality
> might not be very great if the luminance range in the image exceeds that
> of the display. HDR handling will at least try to ensure that the
> content's luminance matches that of the display.

If a compositor implemented color management but not HDR (assuming the
two are separate), a client cannot even provide HDR content to begin
with, so I don't understand the comment about quality.

What I mean is, if a compositor implements color management, is there
any good reason to not implement HDR luminance mapping as well? At
least the implementation would not differ, just the parameters or
tables. That is why I suspect supporting HDR will be easy on top of a
good color conversion implementation.

Hence the question: is it worthwhile in protocol design to support
compositors that choose to implement color management but not HDR
luminance mapping?

It very well could be, which necessitates independent protocol
extensions.

> >
> > As discussed above, seems best to have a separate protocol for these parameters.
> >  
> >> What I mean by that is that color management already needs to be able
> >> to handle out-of-gamut pixels somehow. Could out-of-dynamic-range be
> >> handled with the exact same code, in case the compositor does not
> >> support driving HDR monitors? Is there a significant implementation
> >> effort to deal with the HDR parameters?  
> 
> Out-of-gamut pixels are handled completely separately compared to
> out-of-dynamic-range pixels. Out of gamut can be solved either clamping
> the color after conversion or by using a nearest approximation of the
> color available within the colorspace.
> Out-of-dynamic-range OTOH, requires applying tone mapping to adjust the
> luminance ranges.

I thought that some of the rendering intents of color management do the
same as HDR luminance mapping. Do they not? Only few parameters more to
adjust the ranges of the mapping curves.

> > I dont think that would be possible as I believe, there is significant
> > implementation effort for HDR parameters
> > example tonemapping via Libva for HDR->SDR or SDR->HDR before blending.
> > Though some of the activities might be overlapping with color-management
> > protocol implementation.
> > As per discussion with Sebastian on IRC, a separate protocol seemed a better.

Libva is a pure implementation detail, and I'm starting to think that
it would be better to not use it at all. If we have a capable color
conversion implementation for color management, it seems that HDR<->SDR
mapping would be just different parameter values to the same code.

We're going to need to do the conversion from client color image
encoding to the blending color space in a single pass, and we need to
do the conversion from blending space to output space in a single pass.
Can you do the conversions defined by arbitrary ICC profiles in libva?
I wouldn't expect so. Besides, the the former conversion is part of the
compositing, which I don't think is suitable for libva.

Tone mapping is not a separate step, it has to be built in to the color
space conversion shaders for memory bandwidth reasons.

I guess I should clarify that I am talking about the reference
implementation which we should aim for first. If there are special
cases where we can use libva to gain performance or save power, those
would come later.

> >  
> >> Erwin raised the issue of the client sometimes needing to know what the
> >> outputs are capable of in terms of HDR. Can that include "not capable
> >> of HDR" as well? Either as numerical values or special events. Probably
> >> as special events, because you'd have to make random guesses on what
> >> the HDR parameters of a SDR monitor are.
> >>
> >> Knowing the capabilities of the outputs would also let video players
> >> warn the user, if the played content exceeds their hardware
> >> capabilities, or maybe even switch to a different video stream.  
> >
> > Yes that makes sense, we can have an event from the compositor to the client,
> > signifying that
> > the outputs on which the surface is visible, are "not HDR capable".

That is a fairly awkward formulation. It would be better to have an
extension object created from a wl_output that then delivers the
per-output events describing the HDR capabilities of the output.
Clients already know which outputs are involved for a wl_surface, and
for color management there was the idea of additionally exposing which
output the client should optimize for.

It is also likely that color management will need a wl_output extension
object of its own, so if HDR information can be integrated in that, it
would be simpler to use.

> > this information as far as I understand can be taken from edid of the displays,
> > at compositor side.
> >  
> 
> Yes, this indeed would be useful. Players like vlc implicitly convert
> HDR videos to SDR before playing. If the player had information that the
> display supports HDR, the player could skip the conversion and directly
> present the buffer to the compositor. One way this could be implemented
> is by adding an event in the HDR-METADATA interface to get this
> information once the client binds to the interface. But this might not
> be as simple as adding an event, as we would also need to ensure that
> the client is always displayed on the same output/head which has the hdr
> support.

There is no need to ensure at the protocol level. If the compositor
advertises HDR support and there exists at least one HDR output, the
player should always produce HDR frames if possible, especially if it
is cheaper for the player than SDR. The compositor will take care of the
SDR conversion if necessary, and there won't be glitches if the window
moves between HDR and SDR monitors as there is no need for the client
to catch up.

> Currently the scope of this protocol is to set information about the
> surface not the other way around. Maybe we can discuss this once we
> are done with one side of the communcation? :)

I don't see any difficulty, we should include the output information
from the start too.

> >> Btw. do HDR videos contain ICC profiles? Where would a client get the
> >> ICC profile for a video it wants to show? Do we need to require the
> >> compositor to expose some commonly used specific profiles?  
> 
> HDR videos contain the primaries/luminance information of the display
> which was used while mastering the content. FFMPEG lets us extract this
> data per frame, by getting the frame's side data. More on this here [1].

Right. As I wrote earlier, if that is enough to manufacture a valid
custom ICC profile enough for the color conversions, it's good.

> >
> > HDR videos do contain the color primaries, white point etc, which are covered in
> > ICC profiles.
> > I guess both Niel's and Sebastian's solution provide a colorspace interface
> > which can expose the built in specific profiles
> > to the client.
> > Whether these values, change during video playback, I am again not sure.

Change is not a problem.

> > In that case perhaps we need to create an ICC profile from the values retrieved
> > from the frame and provide to the compositor for the given surface.
> >  
> >>> +  </description>
> >>> +
> >>> +  <interface name="zwp_hdr_mastering_metadata_v1" version="1">
> >>> +    <description summary="hdr mastering metadata">
> >>> +      The hdr matering metadata is a singleton global object that provides
> >>> +      the extenstion hdr_surface for a given surface.
> >>> +    </description>
> >>> +
> >>> +    <enum summary="error">
> >>> +      <entry name="hdr_surface_exists" value="0"
> >>> +	      summary="The hdr surface exists for given surface."/>
> >>> +    </enum>
> >>> +
> >>> +    <request name="destroy" type="destructor">
> >>> +      <description summary="destroy the hdr_mastering_metadata object">
> >>> +        Destroy the HDR mastering metadata object.  
> >> What side-effects does this have?  
> >
> > When the client calls destroy, the hdr mastering metatadata object gets
> > destroyed.
> > I believe the hdr_surface objects for that client should be destroyed as well.
> > What ever the surface/s the client have, will retain the last set values.
> > Client should call destroy only when it does not anticipate that it needs to
> > send the value to client any more.
> > In case it needs to send the values again, it needs to bind again, which will be
> > overhead I suppose.
> >  
> 
> Maybe not. When the client calls destroy for the the mastering metadata
> object, the hdr_surface objects need not be destroyed as they dont
> interact with the mastering metadata interface. The only obvious side
> effect would be that the client will not be able to create more hdr_surface
> objects.

Yes, exactly as Harish says. It's a common convention that the child
objects already created are not affected. Anything else would be just
more error cases.

> Probably "release" would be a better request name than "destroy".

No, "destroy" is the canonical name.

The only reason why some interfaces have "release" is that it was added
after the fact. Originally the interface did not have a destructor at
all, then we found out that we need a destructor for the compositor to
be able to free resources while the client is still connected. We added
the destructor, but we could not call it "destroy" because
wayland-scanner generates a *_destroy() function in any case, and we
must not change the behaviour of it. So the destructor was named
"release", to ever remind us of our mistake.

> 
> >>> +      </description>
> >>> +    </request>
> >>> +
> >>> +    <request name="get_hdr_surface">
> >>> +      <description summary="get the interface for hdr_surface">
> >>> +        This interface is created for a surface and the hdr mastering metadata
> >>> +	should be attached to this surface.
> >>> +      </description>
> >>> +      <arg name="hdr_surface" type="new_id" interface="zwp_hdr_surface_v1"/>
> >>> +      <arg name="surface" type="object" interface="wl_surface"/>
> >>> +    </request>
> >>> +
> >>> +  </interface>
> >>> +
> >>> +  <interface name="zwp_hdr_surface_v1" version="1">
> >>> +    <description summary="an interface to add hdr mastering metadata">
> >>> +      An interface to add the hdr mastering metadata like MAX-CLL and MAX-FALL,
> >>> +      for a given surface.  
> >> What happens if the wl_surface is destroyed first? And the client
> >> issues requests through this interface?  
> >
> > Valid point. The server should store the object for each surface as a list of
> > surfaces, when the get_hdr_surface is called.
> > wl_surface must be checked, and error sent if the wl_surface is not there.

That sounds... complicated. We have plenty of examples, and almost all
just have one pointer to set to NULL from a wl_surface destroy listener.

> 
> The effect would be similar to what happens to viewport objects when the
> base wl_surface is destroyed. This could be worded as:
>      If the wl_surface associated with the hdr_surface is destroyed,
>      all hdr_surface requests except 'destroy' raise the protocol error
>      no_surface.

Yes, I would prefer that.

> 
> >>> +    </description>
> >>> +
> >>> +    <request name="set_hdr_mastering_metadata">
> >>> +      <description summary="set the hdr mastering metadata for the surface.">
> >>> +        This request is double buffered and it will be applied to the surface
> >>> +	on wl_surface::commit.  
> >> This is the request that actually turns HDR "on" for a wl_surface,
> >> right?
> >>  
> 
> Yes, it does.
> 
> >> Maybe there should be some words about SDR vs. HDR somewhere in the
> >> spec, how they are handled differently.  
> >
> > Yes thats correct, The idea is that while tone-mapping the color-spaces need to
> > be convert to common space.
> > HDR->SDR or SDR->HDR.
> > Harish and Shashank from my team are working on it perhaps they can shed more
> > light on this.
> >  
> 
> When HDR and SDR content are presented together, the compositor converts
> all the buffers to HDR or SDR depending on the display's capability and
> then composes the buffers. Would something in these lines be sufficient?

I was thinking of more along the lines:
- a short definition of what SDR and HDR mean
- that a surface is SDR by default
- how to return a surface to SDR after it was set HDR (hdr_surface destructor)

> >>> +      </description>
> >>> +      <arg name="max_cll" type="uint" summary="MAX Content Light Level"/>
> >>> +      <arg name="max_fall" type="uint" summary="MAX Frame Average Light Level"/>
> >>> +      <arg name="max_lum" type="uint" summary="MAX Luminance"/>
> >>> +      <arg name="min_lum" type="uint" summary="MIN Luminance"/>  
> >> What are the units for all these values?
> >>
> >> Do integers give enough precision and range?  
> >
> > The values are in nits, AFAIK the values are of the order of hundreds and
> > thousands(for max values)
> > So perhaps uint is sufficient.
> >  
> 
> The values are whole numbers and currently the maximum value is 10000
> nits (max value that PQ curve encodes).

Sounds fine.

Is it good enough for min luminance as well? What are typical min_lum
values? Is there no need for values between 0 and 1? Or more precision
at the low end?

> >> Should these be split into separate requests? Could it be possible that
> >> only some of these might be superseded by something else in the future?  
> >
> > Not sure about this, perhaps more parameters get added.
> >  
> >> OTOH, keeping all parameters in a single request means that there is no
> >> error case of not defining all the parameters.  
> >  
> 
> Yes, this is a possibility. The compositor would need to pick "sane"
> defaults for the values. It could pick the generally seen values for
> max_lum and min_lum.

I'm a little wary of default values.

An SDR monitor does not tell us these values at all. If default values
were picked, we should use them consistently with both surface defaults
and output defaults. If a client explicitly sets values that are
the defaults, then the result on screen should be the same as if it
never set them. OTOH, if a compositor describes an output with the
default values, a client might get confused if the output is SDR or HDR.

Considering all that, should there be default values, or should there
be an explicit and unique "flag" for SDR? E.g. lack of making a request
or receiving an event would imply SDR.

> >>> +    </request>
> >>> +
> >>> +    <request name="destroy" type="destructor">
> >>> +      <description summary="destroy the hdr mastering surface object">
> >>> +        Destroy the hdr_surface.  
> >> What effects does this have to the wl_surface?  
> >
> > Destroy the corresponding hdr_surface.
> > The client needs to call get_hdr_surface() again to change these values.
> > At wl_surface level, the luminance value should remain set to last value.
> >  
> 
> Once the object is destroyed, the associaed wl_surface would no more
> have any HDR metadata. The buffer attached in the next commit would be
> considered as a SDR buffer.

Yes, Harish made a good definition. I'd even drop the mention of a new
buffer and just rely on the double-buffered state, otherwise you have
to define what happens if the next commit does not have a new buffer
attached.

Thanks,
pq

> 
> Thank you
> Regards
> Harish Krupo
> 
> [1] https://www.ffmpeg.org/doxygen/3.1/group__lavu__frame.html#gae01fa7e427274293aacdf2adc17076bc

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/wayland-devel/attachments/20190301/4c6006e9/attachment-0001.sig>