[RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive color pipeline is needed

Fri Nov 10 11:27:14 UTC 2023

> -----Original Message-----
> From: Pekka Paalanen <ppaalanen at gmail.com>
> Sent: Thursday, November 9, 2023 5:26 PM
> To: Shankar, Uma <uma.shankar at intel.com>
> Cc: Joshua Ashton <joshua at froggi.es>; Harry Wentland
> <harry.wentland at amd.com>; dri-devel at lists.freedesktop.org; Sebastian Wick
> <sebastian.wick at redhat.com>; Sasha McIntosh <sashamcintosh at google.com>;
> Abhinav Kumar <quic_abhinavk at quicinc.com>; Shashank Sharma
> <shashank.sharma at amd.com>; Xaver Hugl <xaver.hugl at gmail.com>; Hector
> Martin <marcan at marcan.st>; Liviu Dudau <Liviu.Dudau at arm.com>; Alexander
> Goins <agoins at nvidia.com>; Michel Dänzer <mdaenzer at redhat.com>; wayland-
> devel at lists.freedesktop.org; Melissa Wen <mwen at igalia.com>; Jonas Ådahl
> <jadahl at redhat.com>; Arthur Grillo <arthurgrillo at riseup.net>; Victoria
> Brekenfeld <victoria at system76.com>; Sima <daniel at ffwll.ch>; Aleix Pol
> <aleixpol at kde.org>; Naseer Ahmed <quic_naseer at quicinc.com>; Christopher
> Braga <quic_cbraga at quicinc.com>; Ville Syrjala <ville.syrjala at linux.intel.com>
> Subject: Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive color
> pipeline is needed
> 
> On Thu, 9 Nov 2023 10:17:11 +0000
> "Shankar, Uma" <uma.shankar at intel.com> wrote:
> 
> > > -----Original Message-----
> > > From: Joshua Ashton <joshua at froggi.es>
> > > Sent: Wednesday, November 8, 2023 7:13 PM
> > > To: Shankar, Uma <uma.shankar at intel.com>; Harry Wentland
> > > <harry.wentland at amd.com>; dri-devel at lists.freedesktop.org
> 
> ...
> 
> > > Subject: Re: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why
> > > prescriptive color pipeline is needed
> > >
> > >
> > >
> > > On 11/8/23 12:18, Shankar, Uma wrote:
> > > >
> > > >
> > > >> -----Original Message-----
> > > >> From: Harry Wentland <harry.wentland at amd.com>
> > > >> Sent: Friday, October 20, 2023 2:51 AM
> > > >> To: dri-devel at lists.freedesktop.org
> 
> ...
> 
> > > >> Subject: [RFC PATCH v2 06/17] drm/doc/rfc: Describe why
> > > >> prescriptive color pipeline is needed
> 
> ...
> 
> > > >> +An example of a drm_colorop object might look like one of these::
> > > >> +
> > > >> +    /* 1D enumerated curve */
> > > >> +    Color operation 42
> > > >> +    ├─ "TYPE": immutable enum {1D enumerated curve, 1D LUT, 3x3 matrix, 3x4 matrix, 3D LUT, etc.} = 1D enumerated curve
> > > >> +    ├─ "BYPASS": bool {true, false}
> > > >> +    ├─ "CURVE_1D_TYPE": enum {sRGB EOTF, sRGB inverse EOTF, PQ EOTF, PQ inverse EOTF, …}
> > > >
> > > > Having the fixed function enum for some targeted input/output may
> > > > not be scalable for all usecases. There are multiple colorspaces
> > > > and transfer functions possible, so it will not be possible to
> > > > > cover all these by any enum definitions. Also, this will depend on
> > > > > the capabilities of respective hardware from various vendors.
> > >
> > > The reason this exists is such that certain HW vendors such as AMD
> > > have transfer functions implemented in HW. It is important to take
> > > advantage of these for both precision and power reasons.
> >
> > Issue we see here is that, it will be too usecase and vendor specific.
> > There will be BT601, BT709, BT2020, SRGB, HDR EOTF and many more. Not
> > to forget we will need linearization and non-linearization enums for each
> > of these.
> 
> I don't see that as a problem at all. It's not a combinatorial explosion like
> input/output combinations in a single enum would be.
> It's always a curve and its inverse at most.
> 
> It's KMS properties, not every driver needs to implement every defined enum
> value but only those values it can and wants to support.
> Userspace also sees the supported list, it does not need trial and error.
> 
> This is the only way to actually use hard-wired curves. The alternative would be
> for userspace to submit a LUT of some type, and the driver needs to start
> guessing if it matches one of the hard-wired curves the hardware supports, which
> is just not feasible.
> 
> Hard-wired curves are an addition, not a replacement, to custom curves defined
> by parameters or various different LUT representations.
> Many of these hard-wired curves will emerge as is from common use cases.

Point taken. We can go with these fixed-function curve types as long as each one represents a
single mathematical operation, thereby avoiding the combination nightmare.

However, I just want to make sure that the same thing can be done with programmable
hardware. In the case above, the LUT tables for these curves need to be hardcoded in the driver for
various platforms (depending on their capabilities, precision, number, and distribution of LUTs, etc.).
This is manageable, but the driver will get bloated with all kinds of hardcoded LUT tables,
which could easily have been computed by the compositor at runtime. The driver cannot compute
the tables at runtime due to the complexity of the floating-point math involved, so hardcoded
LUT tables will be the only option.

So we should just ensure that if these enums are not exposed by a driver, but a programmable
LUT block is exposed instead, userspace falls back to the programmable LUT. Implementing and
exposing the fixed-function enum should not become mandatory, even for programmable hardware.

With this we will be able to cater to both kinds of hardware with a generic userspace.
I hope this expectation is OK.
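
To illustrate the fallback expected from a generic userspace, here is a minimal
Python sketch. The pipeline description (a list of dicts), the key names
("curve_1d_types", "lut_size") and the helper names are hypothetical and are not
part of the proposed uAPI; only the sRGB EOTF formula itself is standard.

```python
def srgb_eotf(v):
    """sRGB EOTF: encoded value in [0, 1] -> linear value in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def build_lut(curve, size, bits=16):
    """Sample `curve` at `size` evenly spaced points, quantized to `bits` bits."""
    max_code = (1 << bits) - 1
    return [round(curve(i / (size - 1)) * max_code) for i in range(size)]


def plan_curve(pipeline, wanted, curve_fn):
    """Decide how to realize the curve `wanted` on a hypothetical pipeline.

    Prefer a hard-wired curve if some colorop lists it; otherwise fall back
    to a LUT computed in userspace. Returns (colorop id, properties) or None.
    """
    for op in pipeline:
        if op["type"] == "1D enumerated curve" and wanted in op["curve_1d_types"]:
            return op["id"], {"CURVE_1D_TYPE": wanted}
    for op in pipeline:
        if op["type"] == "1D LUT":
            return op["id"], {"LUT": build_lut(curve_fn, op["lut_size"])}
    return None


# Hypothetical fixed-function hardware: the enum value is selected directly.
fixed_hw = [{"id": 42, "type": "1D enumerated curve",
             "curve_1d_types": ["sRGB EOTF", "PQ EOTF"]}]
# Hypothetical programmable hardware: the same curve is sent as a LUT.
prog_hw = [{"id": 7, "type": "1D LUT", "lut_size": 1024}]

fixed_plan = plan_curve(fixed_hw, "sRGB EOTF", srgb_eotf)
prog_plan = plan_curve(prog_hw, "sRGB EOTF", srgb_eotf)
```

The same userspace request lands on the enum for the first pipeline and on a
computed LUT for the second, without any hardcoded tables in the driver.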

> > Also
> > a CTM indication to convert colorspace.
> 
> Did someone propose to enumerate matrices? I would not do that, unless you
> literally have hard-wired matrices in hardware and cannot do custom matrices.

Not currently, but there can be a fixed-function matrix for certain color space or
format conversions, like BT709->BT2020.
However, this is not proposed currently, and if it is not needed, that's fine; we
don't want to bring another non-problem up for discussion.

> > Also, if the underlying hardware block is programmable, it's not
> > limited to be used only for the colorspace management but can be used
> > for other color enhancements as well by a capable client.
> 
> Yes, that's why we have other types for curves, the programmable ones.

Got that and agree, it's fine as mentioned above.

> > Hence, we feel that it is bordering on being descriptive with too many
> > possible combinations (not easy to generalize). So, if hardware is
> > programmable, let's expose its capability through a blob and be generic.
> 
> It's not descriptive though. It's a prescription of a mathematical function the
> hardware implements as fixed-function hardware. The function is a curve. There
> is no implication that the curve must be used with specific input or output color
> spaces.

As long as we don’t mix combinations it should be fine. But not all hardware may
represent these fixed functions at the granularity of a single mathematical operation.
It would be tough to represent such color blocks with a single enum.

> > For any fixed function hardware where Lut etc is stored in ROM and
> > just a control/enable bit is provided to driver, we can define a
> > pipeline with a vendor specific color block. This can be identified with a
> > flag (better ways can be discussed).
> 
> No, there is no need for that. A curve type will do well.

Agree and aligned here.

> A vendor specific colorop needs vendor specific userspace code to program *at
> all*. A generic curve colorop might list some curve types the userspace does not
> understand, but also curve types userspace does understand. The understood
> curve types can still be used by userspace.

The issue is with combined operations in hardware. If it’s a single mathematical operation,
it would be easy.

> > For example, on some of the Intel platforms, we had a fixed function to
> > convert colorspaces directly with a bit setting. These kinds of things
> > should be vendor specific and not be part of generic userspace implementation.
> 
> Why would you forbid generic userspace from making use of them?

The issue is that it was not one single mathematical operation but a combination,
as described below.

> > For reference:
> > 001b	YUV601 to RGB601	YUV BT.601 to RGB BT.601 conversion.
> > 010b	YUV709 to RGB709	YUV BT.709 to RGB BT.709 conversion.
> > 011b	YUV2020 to RGB2020	YUV BT.2020 to RGB BT.2020 conversion.
> > 100b	RGB709 to RGB2020	RGB BT.709 to RGB BT.2020 conversion.
> 
> This is nothing like the curves we talked about above.
> Anyway, you can expose these fixed-function operations with a colorop that has
> an enum choosing the conversion. There is no need to make it vendor-specific at
> all. It's possible that only specific chips from Intel support it, but nothing stops
> anyone else from implementing or emulating the colorop if they can construct a
> hardware configuration achieving the same result.
> 
> It seems there are already problems in exploding the number of pipelines to
> expose, so it's best to try to avoid single-use colorops and use enums in more
> generic colorops instead.

Yeah, this is how the hardware implements it, and it involves multiple mathematical operations,
enabled with one programmable bit. These will be tough to generalize.
What the colorop type for these should be is still open.

It would be great if we can address this generically.

> >
> > > Additionally, not every vendor implements bucketed/segmented LUTs
> > > the same way, so it's not feasible to expose that in a way that's
> > > particularly useful or not vendor-specific.
> 
> Joshua, I see no problem here really. They are just another type of LUT for a curve
> colorop, with a different configuration blob that can be defined in the UAPI.

Yeah, agree.
And the programmable hardware can be easily exposed and generalized for all vendors,
so it should not be a concern.

> > If the underlying hardware is programmable, the structure which we
> > propose to advertise the capability of the block to userspace will be
> > sufficient to compute the LUT coefficients.
> > The caps can be:
> > 1. Number of segments in the LUT
> > 2. Precision of the LUT
> > 3. Starting and ending point of the segment
> > 4. Number of samples in the segment
> > 5. Any other flag which could be useful in this computation.
> >
> > This way we can compute LUT's generically and send to driver. This
> > will be scalable for all colorspaces, configurations and vendors.
> 
> Drop the mention of colorspaces, and I hope so. :-)
> 
> Color spaces don't quite exist in a prescriptive pipeline definition.

Yeah. For the driver it's just a LUT for programmable hardware, or a mathematical
operation defined via an enum for fixed-function hardware 😊
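
As a rough illustration of how a compositor could turn such caps into LUT
entries, here is a minimal Python sketch. The caps layout (a list of
(start, end, samples) tuples plus a precision in bits) and the function names
are assumptions made for this example, not a proposed structure; segments are
assumed to be contiguous and to cover the input range in order.

```python
def sample_positions(segments):
    """Return the input positions described by a list of segments.

    Each segment is (start, end, samples): `samples` evenly spaced points
    beginning at `start`, with `end` being where the next segment begins.
    """
    xs = []
    for start, end, samples in segments:
        step = (end - start) / samples
        xs.extend(start + i * step for i in range(samples))
    xs.append(segments[-1][1])  # the final end point closes the last segment
    return xs


def compute_segmented_lut(curve, segments, precision_bits):
    """Evaluate `curve` at every sample position and quantize the result."""
    max_code = (1 << precision_bits) - 1
    return [min(max_code, max(0, round(curve(x) * max_code)))
            for x in sample_positions(segments)]


# Hypothetical caps: finer sampling near zero, coarser above 0.25, 10-bit entries.
caps_segments = [(0.0, 0.25, 4), (0.25, 1.0, 3)]
gamma_lut = compute_segmented_lut(lambda x: x ** 2.2, caps_segments, 10)
```

The curve is just a callable, so the same code serves any transfer function the
compositor wants; nothing here depends on a color space.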

> > > Thus we decided to have a regular 1D LUT modulated onto a known curve.
> > > This is the only real cross-vendor solution here that allows HW
> > > curve implementations to be taken advantage of and also works with
> > > bucketing/segmented LUTs.
> > > (Including vendors we are not aware of yet).
> > >
> > > This also means that vendors that only support HW curves at some
> > > stages without an actual LUT are also serviced.
> >
> > Any fixed function vendor implementation should be supported but with
> > a vendor specific color block. Trying to come up with enums which
> > aligns with some underlying hardware may not be scalable.
> 
> I disagree with both of you.
> 
> Who said there could be only one "degamma" block on a plane's pipeline?
> 
> If hardware is best modelled as a fixed-function selectable curve followed by a
> custom curve, then expose exactly those two generic colorops. Nothing stops a
> pipeline from having two curve colorops in sequence with a disjoint set of
> supported types or features. If some hardware does not have one of the curve
> colorops, then just don't add the missing one in a pipeline.

Agreed, I think we are aligned here now.
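
For reference, a fixed-function selectable curve followed by a custom curve can
be modelled in software as below. This is only a Python sketch of the math: the
dict keys and function names are hypothetical, and real colorops are KMS objects
programmed through properties, not Python dicts.

```python
def srgb_eotf(v):
    """sRGB EOTF: encoded value in [0, 1] -> linear value in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


# Curves this (hypothetical) hardware has hard-wired.
ENUMERATED_CURVES = {"sRGB EOTF": srgb_eotf}


def apply_lut(lut, x):
    """Linearly interpolate a uniformly sampled LUT of floats over [0, 1]."""
    pos = min(max(x, 0.0), 1.0) * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)
    return lut[i] + (lut[i + 1] - lut[i]) * (pos - i)


def run_pipeline(pipeline, x):
    """Apply each non-bypassed colorop in order to a single channel value."""
    for op in pipeline:
        if op.get("bypass"):
            continue
        if op["type"] == "1D enumerated curve":
            x = ENUMERATED_CURVES[op["curve"]](x)
        elif op["type"] == "1D LUT":
            x = apply_lut(op["lut"], x)
        else:
            raise ValueError("unsupported colorop type: " + op["type"])
    return x


# Two curve colorops in sequence with disjoint sets of supported types.
pipeline = [
    {"type": "1D enumerated curve", "curve": "sRGB EOTF"},
    {"type": "1D LUT", "lut": [0.0, 0.5, 1.0]},  # identity custom curve
]
result = run_pipeline(pipeline, 0.5)
```

Hardware that lacks one of the two blocks would simply expose a pipeline without
that entry.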

Regards,
Uma Shankar

> 
> 
> Thanks,
> pq
> 
> > > You are right that there *might* be some usecase not covered by this
> > > right now, and that it would need kernel churn to implement new
> > > curves, but unfortunately that's the compromise that we (so-far)
> > > > have decided on in order to ensure everyone can have good, precise,
> > > > power-efficient support.
> >
> > Yes, we are aligned on this. But believe programmable hardware should
> > > be able to expose its caps. Fixed function hardware should be non-generic
> > > and vendor specific.
> >
> > > It is always possible for us to extend the uAPI at a later date for
> > > other curves, or other properties that might expose a generic
> > > > segmented LUT interface (such as what you have proposed for a while)
> > > > for vendors that can support it.
> > > (With the whole color pipeline thing, we can essentially do 'versioning'
> > > with that, if we wanted a new 1D LUT type.)
> >
> > Most of the hardware vendors have programmable luts (including AMD),
> > so it would be good to have this as a default generic compositor
> > implementation. And yes, any new color block with a type can be added
> > to the existing API's as the need arises without breaking compatibility.
> >
> > Regards,
> > Uma Shankar
> >
> > >
> > > Thanks!
> > > - Joshie 🐸✨