[PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type

Wed Jun 4 18:59:22 UTC 2025

> -----Original Message-----
> From: Harry Wentland <harry.wentland at amd.com>
> Sent: Wednesday, June 4, 2025 1:57 AM
> To: Pekka Paalanen <pekka.paalanen at collabora.com>; Shankar, Uma
> <uma.shankar at intel.com>
> Cc: Simon Ser <contact at emersion.fr>; Alex Hung <alex.hung at amd.com>; dri-
> devel at lists.freedesktop.org; amd-gfx at lists.freedesktop.org; intel-
> gfx at lists.freedesktop.org; wayland-devel at lists.freedesktop.org;
> leo.liu at amd.com; ville.syrjala at linux.intel.com; mwen at igalia.com;
> jadahl at redhat.com; sebastian.wick at redhat.com; shashank.sharma at amd.com;
> agoins at nvidia.com; joshua at froggi.es; mdaenzer at redhat.com;
> aleixpol at kde.org; xaver.hugl at gmail.com; victoria at system76.com;
> daniel at ffwll.ch; quic_naseer at quicinc.com; quic_cbraga at quicinc.com;
> quic_abhinavk at quicinc.com; marcan at marcan.st; Liviu.Dudau at arm.com;
> sashamcintosh at google.com; Borah, Chaitanya Kumar
> <chaitanya.kumar.borah at intel.com>; louis.chauvet at bootlin.com
> Subject: Re: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type
> 
> 
> 
> On 2025-06-03 06:51, Pekka Paalanen wrote:
> > On Tue, 3 Jun 2025 08:30:23 +0000
> > "Shankar, Uma" <uma.shankar at intel.com> wrote:
> >
> >>> -----Original Message-----
> >>> From: Pekka Paalanen <pekka.paalanen at collabora.com>
> >>> Sent: Friday, May 30, 2025 7:28 PM
> >>> To: Shankar, Uma <uma.shankar at intel.com>
> >>> Cc: Simon Ser <contact at emersion.fr>; Harry Wentland
> >>> <harry.wentland at amd.com>; Alex Hung <alex.hung at amd.com>; dri-
> >>> devel at lists.freedesktop.org; amd-gfx at lists.freedesktop.org; intel-
> >>> gfx at lists.freedesktop.org; wayland-devel at lists.freedesktop.org;
> >>> leo.liu at amd.com; ville.syrjala at linux.intel.com;
> >>> pekka.paalanen at collabora.com; mwen at igalia.com; jadahl at redhat.com;
> >>> sebastian.wick at redhat.com; shashank.sharma at amd.com;
> >>> agoins at nvidia.com; joshua at froggi.es; mdaenzer at redhat.com;
> >>> aleixpol at kde.org; xaver.hugl at gmail.com; victoria at system76.com;
> >>> daniel at ffwll.ch; quic_naseer at quicinc.com; quic_cbraga at quicinc.com;
> >>> quic_abhinavk at quicinc.com; marcan at marcan.st; Liviu.Dudau at arm.com;
> >>> sashamcintosh at google.com; Borah, Chaitanya Kumar
> >>> <chaitanya.kumar.borah at intel.com>; louis.chauvet at bootlin.com
> >>> Subject: Re: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT
> >>> type
> >>>
> >>> On Thu, 22 May 2025 11:33:00 +0000
> >>> "Shankar, Uma" <uma.shankar at intel.com> wrote:
> >>>
> >>>> One request though: Can we enhance the LUT samples from the existing
> >>>> 16 bits to 32 bits, as LUT precision is going to be more than 16 bits in
> >>>> certain hardware? While adding the new UAPI, let's extend this to 32 to
> >>>> make it future-proof.
> >>>> Reference:
> >>>> https://patchwork.freedesktop.org/patch/642592/?series=129811&rev=4
> >>>>
> >>>> +/**
> >>>> + * struct drm_color_lut_32 - Represents high precision lut values
> >>>> + *
> >>>> + * Creating 32 bit palette entries for better data
> >>>> + * precision. This will be required for HDR and
> >>>> + * similar color processing usecases.
> >>>> + */
> >>>> +struct drm_color_lut_32 {
> >>>> +	/*
> >>>> +	 * Data for high precision LUTs
> >>>> +	 */
> >>>> +	__u32 red;
> >>>> +	__u32 green;
> >>>> +	__u32 blue;
> >>>> +	__u32 reserved;
> >>>> +};
> >>>
> >>> Hi,
> >>>
> >>> I suppose you need this much precision for optical data? If so,
> >>> floating-point would be much more appropriate and we could probably
> >>> keep 16-bit storage.
> >>>
> >>> What does the "more than 16-bit" hardware actually use? ISTR at
> >>> least AMD having some sort of float'ish point internal pipeline?
> >>>
> >>> This sounds like the same thing as non-uniformly distributed taps in a
> >>> LUT. That mimics floating-point input while this feels like
> >>> floating-point output of a LUT.
> >>>
> >>> I've recently decided for myself (and Weston) that I will never
> >>> store optical data in an integer format, because it is far too
> >>> wasteful. That's why the electrical encodings like power-2.2 are so
> >>> useful, not just for emulating a CRT.
> >>
> >> Hi Pekka,
> >> The internal pipeline in hardware can operate at higher precision than
> >> the input framebuffer to the plane engines. So, if we have optical
> >> data of 16-bit or 10-bit precision, the hardware can scale it up to a
> >> higher precision in its internal pipeline to take care of rounding
> >> and overflow issues. Even FP16 optical data will be normalized and
> >> converted internally for further processing.
> >
> > Is it integer or floating-point?
> >
> 
> For AMD the internal format is floating point with slightly higher precision than
> FP16.
> 
> > If we take the full range of PQ as optical and put it into 16-bit
> > integer format, the luminance step from code 1 to code 2 is 0.15 cd/m².
> > That seems like a huge step in the dark end. Such a step would
> > probably need to be divided over several taps in a LUT, which wouldn't
> > be possible.
> >
> 
> Right, and with 32-bpc we'll get a luminance step size of
> ~0.0000023 cd/m^2, which seems plenty fine-grained.
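
The two step sizes quoted above can be checked with a few lines of C. This is only a sketch: it assumes the full PQ range of 0 to 10000 cd/m^2 is stored as linear (optical) values in an unsigned integer of the given width, and the helper name linear_step_nits is hypothetical, not part of any DRM API.

```c
#include <stdint.h>

/* Peak luminance of the PQ range in cd/m^2. */
#define PQ_PEAK_NITS 10000.0

/*
 * Luminance difference between two adjacent integer codes when linear
 * light spanning 0..PQ_PEAK_NITS is stored in an unsigned integer with
 * `bits` bits (1..63). Hypothetical helper, for illustration only.
 */
static double linear_step_nits(unsigned int bits)
{
	/* Largest code value: 2^bits - 1. */
	double max_code = (double)((UINT64_C(1) << bits) - 1);

	return PQ_PEAK_NITS / max_code;
}
```

Under these assumptions, linear_step_nits(16) comes out at roughly 0.15 cd/m^2 and linear_step_nits(32) at roughly 0.0000023 cd/m^2, matching the figures in the thread.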
> 
> > In that sense, if a LUT is used for the PQ EOTF, I totally agree that
> > 16-bit integer won't be even nearly enough precision.
> >
> > This actually points out the caveat that increasing the number of taps
> > in a LUT can cause the LUT to become non-monotonic when the sample
> > precision runs out. That is, consecutive taps don't always increase in
> > value.
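
The effect described above, taps that stop increasing once sample precision runs out, can be illustrated with a small counting routine. This is a sketch under stated assumptions: a simple power-2.0 curve stands in for the PQ EOTF (so no libm is needed), samples are rounded to the nearest integer code, and count_flat_taps is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sample y = x^2 (a stand-in for a steep EOTF near zero) at `taps`
 * uniformly spaced points in [0, 1], quantize each sample to an
 * unsigned integer with `bits` bits (1..63), and count the taps whose
 * quantized value does not exceed that of the previous tap.
 */
static size_t count_flat_taps(size_t taps, unsigned int bits)
{
	const double max_code = (double)((UINT64_C(1) << bits) - 1);
	uint64_t prev = 0; /* tap 0 samples x = 0, which quantizes to 0 */
	size_t flat = 0;

	for (size_t i = 1; i < taps; i++) {
		double x = (double)i / (double)(taps - 1);
		/* Round to nearest integer code. */
		uint64_t code = (uint64_t)(x * x * max_code + 0.5);

		if (code <= prev)
			flat++;
		prev = code;
	}

	return flat;
}
```

With an illustrative 4096 taps, count_flat_taps(4096, 16) is non-zero because the first several taps all quantize to the same code, while count_flat_taps(4096, 32) returns 0.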
> >
> >> Input to the LUT hardware can be 16 bits or even higher, so the lookup
> >> table we program can be of higher precision than 16 bits (24 in certain
> >> cases in the Intel pipeline). This is later truncated to the bpc supported
> >> in output formats from sync (10, 12 or 16), mostly for the electrical
> >> value to be sent to the sink.
> >>
> >> Hence the request to increase the container from the current u16 to
> >> u32, to take advantage of higher-precision LUTs.
> >
> > My argument though is to use a floating-point format for the LUT
> > samples instead of adding more and more integer bits. That naturally
> > puts more precision where it is needed: near zero.
> >
> > A driver can easily convert that to any format the hardware needs.
> >
> > However, it might make best sense for a driver to expose a LUT with a
> > format that best matches the hardware precision, especially
> > floating-point vs. integer.
> >
> > I guess we may eventually need both 32 bpc integer and 16 (or 32) bpc
> > floating-point.
> >
> 
> While I like floating point better for representing these things, I don't
> think it's a great idea to pass floating-point values via IOCTLs, but 32 bpc
> integer values make sense here.
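
A minimal sketch of what this could look like on the userspace side: the curve is computed in floating point and only converted to 32-bit integers when filling the blob. The struct below is a local mirror of the drm_color_lut_32 layout quoted earlier in the thread, not the kernel header; the convention that 0xFFFFFFFF represents 1.0 and the helper names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the proposed struct drm_color_lut_32 layout. */
struct lut32_entry {
	uint32_t red;
	uint32_t green;
	uint32_t blue;
	uint32_t reserved;
};

/*
 * Convert a normalized value to a 32-bit unsigned code, assuming
 * 0xFFFFFFFF represents 1.0. Out-of-range and NaN inputs are clamped.
 */
static uint32_t unorm32_from_double(double v)
{
	if (!(v > 0.0))		/* also catches NaN */
		return 0;
	if (v >= 1.0)
		return UINT32_MAX;

	return (uint32_t)(v * 4294967295.0 + 0.5);
}

/* Fill `n` entries with the same normalized value on all channels. */
static void lut32_fill_gray(struct lut32_entry *lut, size_t n,
			    const double *values)
{
	for (size_t i = 0; i < n; i++) {
		uint32_t code = unorm32_from_double(values[i]);

		lut[i].red = code;
		lut[i].green = code;
		lut[i].blue = code;
		lut[i].reserved = 0;
	}
}
```

This keeps floating point out of the IOCTL payload while still letting the compositor do its math in floating point; a driver would then convert the integer codes to whatever format its hardware needs.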
> 

Nice, we are all on the same page here.

> Thanks, Uma, for pushing on this.

Thanks, Harry and Pekka, for the valuable input.

Regards,
Uma Shankar

> Harry
> 
> >
> > Thanks,
> > pq