[PATCH v3] Documentation: gpu: Mention the requirements for new properties

Pekka Paalanen ppaalanen at gmail.com
Mon Jun 21 08:21:27 UTC 2021


On Fri, 18 Jun 2021 15:19:15 +0300
Laurent Pinchart <laurent.pinchart at ideasonboard.com> wrote:

> Hi Pekka,
> 
> On Fri, Jun 18, 2021 at 02:32:00PM +0300, Pekka Paalanen wrote:
> > On Fri, 18 Jun 2021 12:58:49 +0300 Laurent Pinchart wrote:  
> > > On Fri, Jun 18, 2021 at 11:55:38AM +0300, Pekka Paalanen wrote:  
> > > > On Thu, 17 Jun 2021 16:37:14 +0300 Laurent Pinchart wrote:    
> > > > > On Thu, Jun 17, 2021 at 02:33:11PM +0300, Pekka Paalanen wrote:    
> > > > > > On Thu, 17 Jun 2021 13:29:48 +0300 Laurent Pinchart wrote:      
> > > > > > > On Thu, Jun 17, 2021 at 10:27:01AM +0300, Pekka Paalanen wrote:      
> > > > > > > > On Thu, 17 Jun 2021 00:05:24 +0300 Laurent Pinchart wrote:        
> > > > > > > > > On Tue, Jun 15, 2021 at 01:16:56PM +0300, Pekka Paalanen wrote:        

...

> > > > > One very typical difference between devices is the order of the
> > > > > processing blocks. By modelling the KMS pipeline as degamma -> ccm ->
> > > > > gamma, we can accommodate hardware that has any combination of
> > > > > [1-2] * 1D LUTs + 1 * CCM. Now, throw one 3D LUT into the mix, at    
> > > > 
> > > > But you cannot represent pipelines like
> > > > 1D LUT -> 1D LUT -> CCM
> > > > because the abstract pipeline just doesn't have the elements for that.
> > > > OTOH, maybe that ordering does not even make sense to have in hardware?
> > > > So maybe not all combinations are actually needed.    
> > > 
> > > If userspace wanted such a pipeline (I'm not sure why it would), then it
> > > could just combine the two LUTs in one.  
> > 
> > Maybe? You can also combine the 1D LUTs into the 3D LUT in the
> > middle, too, but the result is not generally the same as using them
> > separately when the 3D LUT size is limited.  
> 
> Yes, my 3D LUT has 17 points, and I've heard about LUTs with up to 33
> points (with bilinear interpolation); that's quite few if used as a 1D LUT.
> There's also the fact that interpolation will result in cross-talk
> between the three colour channels.

Hi Laurent,

that is not what I meant, but that's true too.

What I meant was this:
https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/582#note_951271

pre_curve and post_curve are the 1D LUTs on each side of a size-limited
3D LUT. pre_curve controls the 3D LUT tap positioning to concentrate on
the areas where the approximated function needs more taps.
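
To illustrate with a minimal sketch (this is only the math, not any
KMS UAPI; pre_curve, lut3d and LUT3D_SIZE are names made up for the
example): with evenly spaced taps, a 1D shaper in front decides where
those taps land in the input range.

  /* Illustrative only, untested.  This hypothetical shaper spends
   * more taps near black, where a perceptual curve needs the most
   * resolution. */
  #include <math.h>

  #define LUT3D_SIZE 17                  /* e.g. a 17x17x17 LUT */

  static float pre_curve(float x)
  {
          return sqrtf(x);       /* hypothetical shaper curve */
  }

  /* Nearest-tap lookup of one channel, with the coordinates shaped
   * by the pre_curve before indexing the size-limited 3D LUT. */
  static float
  sample(const float lut3d[LUT3D_SIZE][LUT3D_SIZE][LUT3D_SIZE],
         float r, float g, float b)
  {
          int ri = (int)(pre_curve(r) * (LUT3D_SIZE - 1) + 0.5f);
          int gi = (int)(pre_curve(g) * (LUT3D_SIZE - 1) + 0.5f);
          int bi = (int)(pre_curve(b) * (LUT3D_SIZE - 1) + 0.5f);

          return lut3d[ri][gi][bi];
  }

Because sqrtf() changes fastest near zero, inputs near black spread
over more taps than inputs near white; the post_curve then applies the
remaining 1D mapping on the output side of the 3D LUT.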


> > Why would a generic userspace library API be a more feasible effort?  
> 
> I'll make a parallel with the camera world again, because I think
> display hardware is taking a similar path, just later.
> 
> We used to have capture devices that offered a fairly high-level
> interface at the hardware or firmware level, with a set of features that
> were more or less standard. Devices could be exposed with an abstract
> model, as done in KMS today, with a set of standardized controls (V4L2
> uses the word control where KMS uses property). We had way less automated
> testing, and didn't pay enough attention to consistency of the userspace API
> (we're talking about code from the beginning of the century, if not
> earlier, the world was different), so different drivers were not
> consistent in how the API was implemented (for instance the API didn't
> specify gain units, so drivers would use different units), but more or
> less, you could write a standard V4L2 application that would work with
> most devices, with a few quirks here and there.
> 
> Then the world changed. Cameras became more complex, with the raw sensor
> and the ISP exposed to Linux in the OMAP3 that Nokia used in the N900
> and N9 phones. For a while we tried to stick with V4L2, and implement
> ISP control within the kernel, with the help of a userspace daemon
> behind the scenes to which the kernel driver called back. The daemon was
> required, as control of the camera required complicated computation that
> really benefited from floating point math. It was horrible, and the
> daemon was closed-source, but from an application point of view, it was
> still high-level V4L2 as before. This hit a big wall though: to expose a
> high-level API on top of a low-level hardware interface, we had to
> bridge the gap in the kernel. This meant hardcoding use cases on the
> kernel side, which clearly wouldn't scale, as not everybody had the same
> use cases (an industrial vision application and a smartphone camera
> application really don't see the world the same way).  We were facing a
> dead end.
> 
> We then decided to expose a lower-level API from kernel to userspace,
> giving userspace the ability to control everything. That's how the Media
> Controller API was born (its purpose is to expose the internal topology
> of the device as a graph of connected entities, and that's about it), as
> well as the V4L2 subdev userspace API (in KMS terms, that would mean more
> or less exposing DRM bridges to userspace, with the ability to control
> them individually). This simplified the kernel implementation (no more
> hardcoded use cases, no more daemon behind the scenes), but at the same
> time resulted in an API that wasn't a good fit for applications to
> consume directly. One step closer to the hardware, one step further from
> the user. You can think of this as the API exposed by GPU kernel
> drivers: it's not enough on its own, and a userspace component to
> translate to a higher-level API is needed.
> 
> This was followed by 10 years of limbo as Nokia pulled the plug on
> Linux (and then smartphone) development. Fast forward, at the end of
> 2018, we created libcamera as the Mesa of the camera world to solve this
> issue. It offers a standard camera stack with a high-level API, where
> device-specific code can be added (and kept to a minimum, as a
> centralized framework means we can share code that is common between
> devices).
> 
> I see lots of parallels with display pipelines getting more complex. It
> wouldn't be fair to expose a very low-level API and ask every single
> compositor to know about all the devices out there, but at the same
> time, we're reaching a level of hardware complexity that I believe makes
> it a dead end to try to abstract and standardize everything at the
> kernel level. This doesn't mean we should allow all vendors to throw in
> any kind of crappy properties or other API extensions, we can still
> enforce some ground rules, such as proper documentation, reasonable
> effort for standardization, and reference open-source userspace
> implementation in a project we would consider to be the equivalent of
> Mesa for KMS.

Thank you very much for explaining this! It makes perfect sense to me.
It also sounds like a ¤&!%¤# amount of work, but it really does seem
like the best plan in the long term.

GPUs have OpenGL and Vulkan, but I suppose with cameras and displays,
there is nothing standard to aim for, for either hardware designers or
software designers? Chicken-and-egg.

> > > BTW, if we want to expose a finer-grained topology of processing blocks,
> > > I'd recommend looking at the Media Controller API; it was designed just
> > > for that purpose.  
> > 
> > Sure. However, right now we have KMS properties instead. Or maybe they
> > should all just go unused?
> > 
> > Maybe the rule should be to not add any more KMS properties and tell
> > people to design something completely new based on the Media Controller
> > API instead?  
> 
> No, I think that would be too harsh, my point was that if we want to
> expose how the building blocks are laid out, the MC API is meant to
> expose a graph of connected entities, so it's a good match. Controlling
> the device should still go through KMS properties, I believe.

Ok.
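
(For readers who have not used it: the MC API really is little more
than that graph dump. A rough, untested sketch of listing the entities
with MEDIA_IOC_G_TOPOLOGY; error handling is omitted and /dev/media0
is assumed:

  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <linux/media.h>

  int main(void)
  {
          struct media_v2_topology topo = { 0 };
          struct media_v2_entity *ents;
          int fd = open("/dev/media0", O_RDONLY);

          /* First call with NULL pointers fills in the counts only. */
          ioctl(fd, MEDIA_IOC_G_TOPOLOGY, &topo);

          ents = calloc(topo.num_entities, sizeof(*ents));
          topo.ptr_entities = (uintptr_t)ents;

          /* Second call fetches the entity array itself. */
          ioctl(fd, MEDIA_IOC_G_TOPOLOGY, &topo);

          for (unsigned int i = 0; i < topo.num_entities; i++)
                  printf("entity %u: %s\n", ents[i].id, ents[i].name);

          free(ents);
          return 0;
  }

Pads and links come out of the same ioctl in the same two-pass way.)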

...

> > To be able to plug device-specific components in compositors, we need a
> > driver-agnostic API through which compositors can use those device
> > specific components. Still the same problem, now just in userspace
> > rather than at the UAPI level.  
> 
> Agreed, it pushes the problem to a different place :-) I however think
> it may be easier to solve it in userspace, as it's a better place to
> hardcode use cases. A general-purpose compositor may not want to make
> use of all hardware features, while a specific-purpose compositor may
> want to exercise the hardware features differently. This however opens
> the door to introducing badly-designed and un(der-)documented
> properties, which is not something I want, so we would still need to set
> ground rules, such as documentation and reference implementation in a
> given userspace compositor/stack.

Right, sounds fine, but I think it also shifts the KMS UAPI paradigm a
little, away from generic and towards driver-specific. If the DRM
maintainers agree to that, I'm fine with it. It's just different from
what I have understood so far, which is why I'm so keen on getting UAPI
documented precisely.

> On that note, I think we need to do a bit more hand-holding. I'm sitting
> on a several-years-old patch series to add 3D LUT support to my
> driver, and while I believe I'm quite experienced with KMS development,
> adding support for this in a reference compositor in userspace is a
> really big task that I don't know how to tackle. I also know very little
> about colour management, and have nobody in the team who has that
> knowledge and could help me. I'm thus facing a bit of a dead-end, not
> because I don't want to spend the time required for this, but because
> the bar is so high that I just can't do it. I would expect many
> developers to be facing the same challenges. If the community doesn't
> make an effort to foster collaboration between developers (most
> kernel developers won't even know who to ask for help if they have to
> step out of the kernel to userspace) and to provide good documentation,
> we'll just scare developers away.

That's why I'm screaming here, to make contact. :-D

And Simon Ser is another; I have even seen him offer to implement
userspace bits for new KMS properties in development.

But a 3D LUT in a display is a really tiny piece in the landscape of color
management. Color management on Wayland has been talked about for, oh, I
don't remember, a decade? Weston code has been experimented on for
the past few years at least. But only last week was the very first piece
of code exclusively for color management merged into Weston upstream.

Getting a display server project into a phase where it would actually
start experimenting with KMS features can take a decade - not just the
work, but also finding the funding. Weston is not quite there yet either,
not even with all the currently open merge requests.

Then, if your 3D LUT is after blending, I'm not sure I would have a use
for it in Weston! If it were on a KMS plane, before blending, then yes,
I would have a use for it. All this depends on how the blending color
space is chosen, and the idea we have right now has no use for a 3D
LUT after blending - except maybe for a fullscreen single-buffer
scenegraph, e.g. video and games.
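
For reference, the post-blending 1D pipeline we do have today is the
DEGAMMA_LUT -> CTM -> GAMMA_LUT property chain on the CRTC. A rough,
untested libdrm sketch of programming one stage of it; the property ID
is assumed to have been looked up beforehand (e.g. with
drmModeObjectGetProperties()), and a per-plane, pre-blending variant
of this is exactly the part that is not standardized today:

  #include <stdint.h>
  #include <xf86drm.h>
  #include <xf86drmMode.h>

  #define LUT_SIZE 256

  void set_crtc_gamma(int fd, uint32_t crtc_id, uint32_t gamma_prop_id)
  {
          struct drm_color_lut lut[LUT_SIZE] = { 0 };
          drmModeAtomicReq *req = drmModeAtomicAlloc();
          uint32_t blob_id;

          /* Identity ramp; a real compositor would fill in its curve. */
          for (int i = 0; i < LUT_SIZE; i++) {
                  uint16_t v = (uint16_t)(i * 0xffff / (LUT_SIZE - 1));
                  lut[i].red = lut[i].green = lut[i].blue = v;
          }

          drmModeCreatePropertyBlob(fd, lut, sizeof(lut), &blob_id);
          drmModeAtomicAddProperty(req, crtc_id, gamma_prop_id, blob_id);
          drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_ALLOW_MODESET, NULL);

          drmModeAtomicFree(req);
          drmModeDestroyPropertyBlob(fd, blob_id);
  }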

> It took me weeks to figure out how to
> run Weston on this device, as it has no GPU that I can use, and falling
> back to a software compositor required finding out that lots of
> documentation was outdated and referenced command line options that were
> no longer accurate (or I could mention painfully finding out that an
> issue I was blocked with for days was solved by plugging in a USB mouse
> to the board, as Weston didn't want to start without an input device).
> Another developer from my team tried before me and gave up, thinking it
> was impossible. That's the reality that many kernel developers are
> facing, the same way, I suppose, that me saying "you just need to
> recompile the kernel with this option enabled" is not very helpful for
> lots of userspace developers.

Why have I not heard of your problems with Weston?!

We have GitLab issues, a mailing list and IRC. You're warmly welcome to
ask when you hit a wall. :-)

I even read #dri-devel very often, so if you mentioned Weston there, it
would have had a good chance to reach me.


Thanks,
pq