[RFC 00/33] Add Support for Plane Color Pipeline

Pekka Paalanen ppaalanen at gmail.com
Tue Sep 5 11:33:26 UTC 2023


On Mon, 4 Sep 2023 14:29:56 +0000
"Shankar, Uma" <uma.shankar at intel.com> wrote:

> > -----Original Message-----
> > From: Sebastian Wick <sebastian.wick at redhat.com>
> > Sent: Thursday, August 31, 2023 2:46 AM
> > To: Shankar, Uma <uma.shankar at intel.com>
> > Cc: Harry Wentland <harry.wentland at amd.com>; intel-
> > gfx at lists.freedesktop.org; dri-devel at lists.freedesktop.org; wayland-
> > devel at lists.freedesktop.org; Ville Syrjala <ville.syrjala at linux.intel.com>; Pekka
> > Paalanen <pekka.paalanen at collabora.com>; Simon Ser <contact at emersion.fr>;
> > Melissa Wen <mwen at igalia.com>; Jonas Ådahl <jadahl at redhat.com>; Shashank
> > Sharma <shashank.sharma at amd.com>; Alexander Goins <agoins at nvidia.com>;
> > Naseer Ahmed <quic_naseer at quicinc.com>; Christopher Braga
> > <quic_cbraga at quicinc.com>
> > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > 
> > On Wed, Aug 30, 2023 at 08:47:37AM +0000, Shankar, Uma wrote:  
> > >
> > >  
> > > > -----Original Message-----
> > > > From: Harry Wentland <harry.wentland at amd.com>
> > > > Sent: Wednesday, August 30, 2023 12:56 AM
> > > > To: Shankar, Uma <uma.shankar at intel.com>;
> > > > intel-gfx at lists.freedesktop.org; dri- devel at lists.freedesktop.org
> > > > Cc: wayland-devel at lists.freedesktop.org; Ville Syrjala
> > > > <ville.syrjala at linux.intel.com>; Pekka Paalanen
> > > > <pekka.paalanen at collabora.com>; Simon Ser <contact at emersion.fr>;
> > > > Melissa Wen <mwen at igalia.com>; Jonas Ådahl <jadahl at redhat.com>;
> > > > Sebastian Wick <sebastian.wick at redhat.com>; Shashank Sharma
> > > > <shashank.sharma at amd.com>; Alexander Goins <agoins at nvidia.com>;
> > > > Naseer Ahmed <quic_naseer at quicinc.com>; Christopher Braga
> > > > <quic_cbraga at quicinc.com>
> > > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > > >
> > > > +CC Naseer and Chris, FYI
> > > >
> > > > See https://patchwork.freedesktop.org/series/123024/ for whole series.
> > > >
> > > > On 2023-08-29 12:03, Uma Shankar wrote:  
> > > > > Introduction
> > > > > ============
> > > > >
> > > > > Modern hardwares have various color processing capabilities both
> > > > > at pre-blending and post-blending phases in the color pipeline.
> > > > > The current drm implementation exposes only the post-blending
> > > > > color hardware blocks. Support for pre-blending hardware is missing.
> > > > > There are multiple use cases where pre-blending color hardware
> > > > > will be
> > > > > useful:
> > > > > 	a) Linearization of input buffers encoded in various transfer
> > > > > 	   functions.
> > > > > 	b) Color Space conversion
> > > > > 	c) Tone mapping
> > > > > 	d) Frame buffer format conversion
> > > > > 	e) Non-linearization of buffer(apply transfer function)
> > > > > 	f) 3D Luts
> > > > >
> > > > > and other miscellaneous color operations.
> > > > >
> > > > > Hence, there is a need to expose the color capabilities of the
> > > > > hardware to user-space. This will help userspace/middleware to use
> > > > > display hardware for color processing and blending instead of
> > > > > doing it through GPU shaders.
> > > > >  
> > > >
> > > > Thanks, Uma, for sending this. I've been working on something
> > > > similar but you beat me to it. :)  
> > >
> > > Thanks Harry for the useful feedback and overall collaboration on this so far.
> > >  
> > > > >
> > > > > Work done so far and relevant references
> > > > > ========================================
> > > > >
> > > > > Some implementation is done by Intel and AMD/Igalia to address the same.
> > > > > Broad consensus is there that we need a generic API at drm core to
> > > > > suffice the use case of various HW vendors. Below are the links
> > > > > capturing the discussion so far.
> > > > >
> > > > > Intel's Plane Color Implementation:
> > > > > https://patchwork.freedesktop.org/series/90825/
> > > > > AMD's Plane Color Implementation:
> > > > > https://patchwork.freedesktop.org/series/116862/
> > > > >
> > > > >
> > > > > Hackfest conclusions
> > > > > ====================
> > > > >
> > > > > HDR/Color Hackfest was organised by Redhat to bring all the
> > > > > industry stakeholders together and converge on a common uapi  
> > expectations.  
> > > > > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia
> > > > > and other prominent user-space developers and maintainers.
> > > > >
> > > > > Discussions happened on the uapi expectations, opens, nature of
> > > > > hardware of multiple hardware vendors, challenges in generalizing
> > > > > the same and the path forward. Consensus was made that drm core
> > > > > should implement descriptive APIs and not go with prescriptive
> > > > > APIs. DRM core should just expose the hardware capabilities;
> > > > > enabling, customizing and programming the same should be done by
> > > > > the user-space. Driver should just  
> > > > honor the user space request without doing any operations internally.  
> > > > >
> > > > > Thanks to Simon Ser, for nicely documenting the design consensus
> > > > > and an UAPI RFC which can be referred to here:
> > > > >
> > > > > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze
> > > > > _hD5
> > > > >  
> > > >  
> > nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1
> > Q  
> > > > Wn48  
> > > > > 8=@emersion.fr/
> > > > >
> > > > >
> > > > > Design considerations
> > > > > =====================
> > > > >
> > > > > Following are the important aspects taken into account while
> > > > > designing the current RFC
> > > > > proposal:
> > > > >
> > > > > 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks
> > > > > only one  
> > > > can be used)  
> > > > > 	2. Position of the HW block in the pipeline can be programmable
> > > > > 	3. LUTs can be one dimentional or three dimentional
> > > > > 	4. Number of LUT entries can vary across platforms
> > > > > 	5. Precision of LUT entries can vary across platforms
> > > > > 	6. Distribution of LUT entries may vary. e.g Mutli-segmented,  
> > Logarithmic,  
> > > > > 	   Piece-Wise Linear(PWL) etc
> > > > > 	7. There can be parameterized/non-parameterized fixed function HW  
> > > > blocks.  
> > > > > 	   e.g. Just a hardware bit, to convert from one color space to another.
> > > > > 	8. Custom non-standard HW implementation.
> > > > > 	9. Leaving scope for some vendor defined pescriptive
> > > > > implementation if  
> > > > required.  
> > > > > 	10.Scope to handle any modification in hardware as technology
> > > > > evolves
> > > > >
> > > > > The current proposal takes into account the above considerations
> > > > > while keeping the implementation as generic as possible leaving
> > > > > scope for future  
> > > > additions or modifications.  
> > > > >
> > > > > This proposal is also in line to the details mentioned by Simon's
> > > > > RFC covering all the aspects discussed in hackfest.
> > > > >
> > > > >
> > > > > Outline of the implementation
> > > > > ============================
> > > > >
> > > > > Each Color Hardware block will be represented by a data structure  
> > drm_color_op.  
> > > > > These color operations will form the building blocks of a color
> > > > > pipeline which best represents the underlying Hardware. Color
> > > > > operations can be re-arranged, substracted or added to create
> > > > > distinct color pipelines to accurately describe the Hardware
> > > > > blocks present in the display  
> > > > engine.
> > > >
> > > > Who is doing the arranging of color operations? IMO a driver should
> > > > define one or more respective pipelines that can be selected by
> > > > userspace. This seems to be what you're talking about after (I
> > > > haven't reviewed the whole thing yet). Might be best to drop this sentence or  
> > to add clarifications in order to avoid confusion.  
> > >
> > > Yes it's the driver who will set the pipeline based on the underlying
> > > hardware arrangement and possible combinations. There can be multiple
> > > pipelines possible if hardware can be muxed or order can be re-arranged (all  
> > viable combinations should be defined as a pipeline in driver).  
> > > Yeah, I will re-phrase this to help clarify it and avoid any ambiguity.
> > >  
> > > > >
> > > > > In this proposal, a color pipeline is represented as an array of
> > > > > struct drm_color_op. For individual color operation, we add blobs
> > > > > to advertise the capability of the particular Hardware block.
> > > > >
> > > > > This color pipeline is then packaged within a blob for the user
> > > > > space to retrieve it.
> > > > >  
> > > >
> > > > Would it be better to expose the drm_color_ops directly, instead of
> > > > packing a array of drm_color_ops into a blob and then giving that to  
> > userspace.  
> > >
> > > Advantage we see in packing in blobs is that interface will be
> > > cleaner. There will be just one GET_COLOR_PIPELINE property to invoke by user  
> > and then its just parsing the data.  
> > > This way the entire underlying hardware blocks with pipeline will be available to  
> > user.
> > 
> > I don't see how parsing a blob is easier than requesting the color ops from the
> > kernel. User space is already equiped with getting KMS objects and the igt test
> > code from Harry shows that this is all pretty trivial plumbing.  
> 
> There are multiple color operations possible with unique lut distribution and
> capabilities. Also the order of hardware blocks and possibility of multiple pipelines.
> Having all the information with one query and property and also be able to set the
> same with just one property call using blobs is better than multiple calls to driver.
> This can be useful in high refresh rate cases where not much time is there to program
> the hardware state. Latency of multiple calls to driver can be avoided.

Hi,

querying all that information only happens once, at KMS client start-up.

Setting up a color pipeline is already a single call: the atomic commit ioctl.

> 
> > > For a particular hardware block in pipeline, user can get the relevant
> > > details from blob representing that particular block. We have created
> > > IGT tests (links mentioned in cover-letter) to demonstrate how it can be done.  
> > This is just to clarify the idea.
> > 
> > The blob is also not introspectable with the usual tools whereas exposing them as
> > properties would be.
> > 
> > It also would, like Pekka correctly noted, bring a whole bunch of issues about
> > compatibility and versioning that are well understood with objects
> > + properties.  
> 
> The blob should be standardized in the UAPI and structures to parse them should be fixed.
> With this compatibility issues can be prevented.

I think that is short-sighted.

> > > Also since info is passed via blobs we have the flexibility to even
> > > define segmented LUTs and PWL hardware blocks. Also we have left scope
> > > for custom private hardware blocks as well which driver can work with  
> > respective HAL and get that implemented.
> > 
> > When color ops are real KMS objects they still can have properties which are
> > blobs that can store LUTs and other data.
> > 
> > And we should avoid private blocks at all cost. In fact, I don't think the KMS rules
> > have changed in that regard and it simply is not allowed.  
> 
> Private blocks are not standardized but are vendor specific. So generic userspace will
> ignore these. However vendor and its respective HAL can make use of this field and leaves
> a scope to cater to such hardware vendors need. This doesn't affect the expectation of the
> standardized color operations which will be defined as enum in UAPI.

Vendors can have and expose their own unique snowflake operations
without any "private" as well: pick an unused operation type code, and
document what it does. Advertise it in some pipelines.

Vendor HALs can make use of it, but it also allows generic userspace to
make use of it at will, and it allows other vendors to implement the
same and benefit from it without needing to patch every userspace.

Or does Intel not want other vendors to potentially make use of their
UAPIs?

> > > We can even define prescriptive operations as a private entry and
> > > enable it if a certain driver and HAL agree.
> > >  
> > > > > To advertise the available color pipelines, an immutable ENUM
> > > > > property "GET_COLOR_PIPELINE" is introduced. This enum property has  
> > blob id's as values.  
> > > > > With each blob id representing a distinct color pipeline based on
> > > > > underlying HW capabilities and their respective combinations.
> > > > >
> > > > > Once the user space decides on a color pipeline, it can set the
> > > > > pipeline and the corresponding data for the hardware blocks within
> > > > > the pipeline with the BLOB property "SET_COLOR_PIPELINE".
> > > > >  
> > > >
> > > > When I discussed this at the hackfest with Simon he proposed a new
> > > > DRM object, (I've called it a drm_colorop) to represent a color operation.
> > > > Each drm_colorop has a "NEXT" pointer to another drm_colorop, or
> > > > NULL if its the last in the pipeline. You can then have a mutable
> > > > enum property on the plane to discover and select a color pipeline.  
> > >
> > > Yes, the proposal is inspired by this idea. Sure, we can work together to enhance  
> > the design.  
> > > Personally I feel the one proposed in the current RFC will do the same
> > > thing envisioned by Simon and you Harry. Management of the pipeline,
> > > addition, deletion and flexibility to represent hardware is more with blobs with  
> > the relevant structures agreed in UAPI.  
> > >  
> > > > This seems a bit more transparent than a blob. You can see my
> > > > changes (unfortunately very WIP, don't look too closely at
> > > > individual patches) at
> > > > https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/di
> > > > ffs
> > > >
> > > > libdrm changes:
> > > > https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diff
> > > > s  
> > >
> > > Sure, will check the same.
> > >  
> > > > IGT changes:
> > > > https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_reque
> > > > sts/1/diffs
> > > >
> > > > I'll take time to review your whole series and will see whether we
> > > > can somehow keep the best parts of each.  
> > >
> > > Thanks and agree. Let's work together and get this implemented in DRM.
> > >  
> > > > Curious to hear other opinions on the blob vs new DRM object question as  
> > well.
> > 
> > I can see a few advantages with the blob approach.
> > 
> > User space can store the entire state in one blob and just assign a new blob to
> > change to another pipeline configuration.  
> 
> Agree
> 
> > However, I would argue that changing a lot of properties is already common
> > practice and works well. User space can deal with it and has abstractions to
> > make this managable.  
> 
> Blob gives a lot of flexibility and ability to define the hardware capabilities generically.

The structure of the atomic commit ioctl and the KMS property system is
even more flexible.

> Lut distribution, number of segments, samples in each segment, precision of luts etc.
> can be precisely defined and userspace will get a complete picture of the underlying
> hardware and its capabilities. Also this is being done with just 2 properties. Leaving
> scope for future addition of any standard color operation as well.

The number of properties does not seem too useful to strictly minimise
over other aspects.


Thanks,
pq

> 
> I feel blob approach once properly documented is a bit more flexible and scalable.
> Maybe I am bit biased here 😊 but all ideas are welcome. 
> 
> We have implemented some IGT's as well to explain the design better. Link below:
> https://patchwork.freedesktop.org/series/123018/
> 
> Thanks Sebastian for the feedback.
> 
> Regards,
> Uma Shankar
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20230905/a1843fd0/attachment.sig>


More information about the dri-devel mailing list