[PATCH] Documentation/gpu: Requirements for driver-specific KMS uAPIs

Harry Wentland harry.wentland at amd.com
Fri May 5 13:48:15 UTC 2023



On 5/5/23 09:35, Daniel Vetter wrote:
> On Fri, May 05, 2023 at 09:20:26AM -0400, Harry Wentland wrote:
>>
>>
>> On 5/5/23 06:16, Daniel Vetter wrote:
>>> On Fri, May 05, 2023 at 11:43:20AM +0300, Pekka Paalanen wrote:
>>>> On Thu, 4 May 2023 17:25:57 -0400
>>>> Harry Wentland <harry.wentland at amd.com> wrote:
>>>>
>>>>> We have steered away for a long time now from driver-specific KMS APIs
>>>>> for good reasons but never codified our stance. With the proposal of
>>>>> new, driver-specific color management uAPIs [1] it is important to
>>>>> outline the requirements for the rare times when driver-specific KMS
>>>>> uAPIs are desired in order to move complex topics along.
>>>>>
>>>>> [1] https://patchwork.freedesktop.org/series/116862/
>>>>>
>>>>> Signed-off-by: Harry Wentland <harry.wentland at amd.com>
>>>>> Cc: Simon Ser <contact at emersion.fr>
>>>>> Cc: Joshua Ashton <joshua at froggi.es>
>>>>> Cc: Michel Dänzer <mdaenzer at redhat.com>
>>>>> Cc: Sebastian Wick <sebastian.wick at redhat.com>
>>>>> Cc: Jonas Ådahl <jadahl at redhat.com>
>>>>> Cc: Alex Goins <agoins at nvidia.com>
>>>>> Cc: Pekka Paalanen <pekka.paalanen at collabora.com>
>>>>> Cc: Melissa Wen <mwen at igalia.com>
>>>>> Cc: Aleix Pol <aleixpol at kde.org>
>>>>> Cc: Xaver Hugl <xaver.hugl at gmail.com>
>>>>> Cc: Victoria Brekenfeld <victoria at system76.com>
>>>>> Cc: Daniel Vetter <daniel at ffwll.ch>
>>>>> Cc: Dave Airlie <airlied at gmail.com>
>>>>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>>>> Cc: Uma Shankar <uma.shankar at intel.com>
>>>>> To: dri-devel at lists.freedesktop.org
>>>>> ---
>>>>>  Documentation/gpu/drm-uapi.rst | 32 ++++++++++++++++++++++++++++++++
>>>>>  1 file changed, 32 insertions(+)
>>>>>
>>>>> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
>>>>> index ce47b4292481..eaefc3ed980c 100644
>>>>> --- a/Documentation/gpu/drm-uapi.rst
>>>>> +++ b/Documentation/gpu/drm-uapi.rst
>>>>> @@ -118,6 +118,38 @@ is already rather painful for the DRM subsystem, with multiple different uAPIs
>>>>>  for the same thing co-existing. If we add a few more complete mistakes into the
>>>>>  mix every year it would be entirely unmanageable.
>>>>>  
>>>>> +.. _driver_specific:
>>>>> +
>>>>> +Driver-Specific DRM/KMS uAPI
>>>>> +============================
>>>>> +
>>>>> +Driver-specific interfaces are strongly discouraged for DRM/KMS interfaces.
>>>>> +Kernel-modesetting (KMS) functionality does in principle apply to all drivers.
>>>>> +Driver-specific uAPIs tends to lead to unique implementations in userspace and
>>>>> +are often hard to maintain, especially when different drivers implement similar
>>>>> +but subtly different uAPIs.
>>>>> +
>>>>> +At times arriving at a consensus uAPI can be a difficult and lengthy process and
>>>>> +might benefit from experimentation. This experimentation might warrant
>>>>> +introducing driver specific APIs in order to move the eosystem forward. If a
>>>>> +driver decides to go down this path we ask for the following:
>>>
>>> I don't like this for two fairly fundamental reasons, neither of which are
>>> that sometimes merging stuff that's not great is the right thing to do to
>>> move the community and ecosystem forward.
>>>
>>>> Hi,
>>>>
>>>> should it be "require" instead of "ask"?
>>>>
>>>>> +
>>>>> +- An agreement within the community that introducing driver-specific uAPIs is
>>>>> +  warranted in this case;
>>>>> +
>>>>> +- The new uAPI is behind a CONFIG option that is clearly marked EXPERIMENTAL and
>>>>> +  is disabled by default;
>>>>> +
>>>>> +- The new uAPI is enabled when a module parameter for the driver is set, and
>>>>> +  defaults to 'off' otherwise;
>>>>> +
>>>>> +- The new uAPI follows all open-source userspace requirements outlined above;
>>>>> +
>>>>> +- The focus is maintained on development of a vendor-neutral uAPI and progress
>>>>> +  toward such an uAPI needs to be apparent on public forums. If no such progress
>>>>> +  is visible within a reasonable timeframe (1-2 years) anybody is within their
>>>>> +  right to send, review, and merge patches that remove the driver-specific uAPI.
>>>>> +
>>>>>  .. _drm_render_node:
>>>>>  
>>>>>  Render nodes
>>>>
>>>> Seems fine to me. I have another addition to suggest:
>>>>
>>>> When such UAPI is introduced, require that it comes with an expiration
>>>> date. This date should be unmissable, not just documented. The kernel
>>>> module parameter name to enable the UAPI could contain the expiry date,
>>>> for example.
>>>>
>>>> After all, the most important thing to get through to users is that
>>>> this *will* go away and not just theoretically.
>>>
>>> There's no taking-backsies with uapi. If there is a regression report,
>>> we'll have to keep it around, for the usual approximation of "forever"
>>>
>>> And this is the first reason I don't like this, from other write-ups and
>>> talking with people it seems like there's the assumption that if we just
>>> hide this behind enough knobs, we can remove the uapi again.
>>>
>>> We can't.
>>>
>>
>> Yeah, that last bullet was least sure about. FWIW, I'm prepared to maintain
>> AMD driver-specific properties for this "forever."
> 
> I guess a middle ground would be to up-front limit this to one generation
> only.
> 
> And then accept reality that the generic solution takes a bit longer and
> end up supporting this on two.
> 
> The other approach would be to make sure you can remap the amd properties
> onto the generic pipeline, and essentially use that to proof the generic
> approach. This would exactly match how all the legacy kms ioctl remapping
> to atomic helped proof the atomic infrastructure and driver
> implementations.
> 

It sounds like we're thinking the same.

>>> The times we've hidden uapi behind knobs was _not_ for uapi we
>>> fundamentally didn't want, at least for the long term. But for the cases
>>> where the overall scope was simply too big, and we needed some time
>>> in-tree to shake out all the bugs (across both kernel and userspace), and
>>> fill out any of the details. Some examples:
>>>
>>> - intel hiding new hw enabling behind the alpha support is not about
>>>   hiding that uapi so we can change it. It's about the fact that not yet
>>>   all enabling has landed in upstream, and not yet all full stack
>>>   validation on production silicon has completed. It's about not shipping
>>>   buggy code to users that we can't support.
>>>
>>> - atomic kms was simply too big, there was a lot of work in compositors
>>>   needed, testing corner cases, and details like adding the blob support
>>>   for the display mode so that modesets would work too with atomic. We
>>>   never landed a preliminary uabi version of atomic (there were plenty
>>>   floating around) that wasn't deemed ready as the long term solution, we
>>>   were simply not sure we got it right until all the pieces where in
>>>   place.
>>>
>>>   And viz Xorg-modesetting, in at least one case we still got it wrong and
>>>   had to disable atomic for that userspace.
>>>
>>> - nouveau pony years back tried this entire "oh the uapi is just
>>>   experimental" thing, and it resulted in the by far worst flameware
>>>   between Dave and Linus on dri-devel
>>>
>>> So _if_ we do this we need to be clear that uapi is forever, and not have
>>> docs that suggest otherwise.
>>>
>>>> If that date needs to be moved forward, it should be possible to do so
>>>> with a simple patch gathering enough acks. The main thing is to set the
>>>> date from the start, so there can be no confusion about when its going
>>>> to the chopping block.
>>>>
>>>> I do not suggest that the kernel would automatically runtime disable
>>>> the UAPI after that date.
>>>>
>>>> Does any of the big idea fly with upper maintainers and Linus?
>>>
>>> The other reason, and maybe even more fundamental one, is that I think the
>>> uncertainty of not documenting how pragmatic we are is beneficial.
>>>
>>> We should definitely document the gold standard aspirations, to make sure
>>> everyone knows where to aim for. And I'm definitely all for pragmatic
>>
>> So if I read you correctly you'd prefer just a short paragraph along the
>> lines of: avoid driver-specific uAPIs and instead define vendor neutral
>> uAPIs in core DRM?
> 
> Yeah, but I also thought we've had this as at least the aspirational goal
> already? After the free-for-all driver-specific kms property mess at
> least:
> 
> https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#requirements
> 
> So we have all this already I think. Might just need to link it more so
> it's not lost :-)
> 

Ah, yes that'll do. Might just need a reference from the Userland page so
people like me don't miss it.

>>> merging where it makes sense, we've had tons of that, and happily carry
>>> the costs to this day for some of them:
>>>
>>> - a lot of the early soc drivers are kinda meh, and will stay that way
>>>   forever since they're not maintained anymore
>>>
>>> - we've had very much free-for-all vendor kms properties, and I expect the
>>>   hall of shame witht he big table of vendor props with barely any docs
>>>   will never go away
>>>
>>> - we're taking all the compute runtimes despite that mesa on the 3d/gl/vk
>>>   side shows how much better collaboration would be (and I think soon will
>>>   show the same for media) because having a compute ecosystem that's
>>>   substantially weaker than the sum of all its parts is still better than
>>>   nothing. And the situation is still that collaboration even with a
>>>   company is often impossible, aiming for better is not very realistic :-/
>>>
>>> But the goal is still to have solid code, cross-vendor infrastructure and
>>> collaboration and all that stuff, because that's why upstream is strong.
>>> And the uncertainty is helping us for a lot of reasons:
>>>
>>> - it makes vendors vary of going with vendor solutions. Minimally they ask
>>>   in private, which gives Dave, me and all the others doing vendor
>>>   outreach or working as some ambassador rule at a vendor an opportunity
>>>   to steer things in a better direction. And often do the steering
>>>   _before_ code gets written.
>>>
>>> - it allows Dave&me to more freely decide when to be pragmatic, without
>>>   being bound by rules. The point of pragmatic merging is to bend the
>>>   short term principles for a better long term outcome, splattering that
>>>   entire space full with rules makes rule-bending a lot harder when
>>>   needed.
>>>
>>> - most of all I really don't want to be in a discussion with vendors where
>>>   they try to laywer-argue that we must merge their patches because they
>>>   strictly followed the wording of some pragmatic merge rules while
>>>   entirely tossing the spirit of what we aim for. I already have more than
>>>   enough of that, this will result in more.
>>>
>>> In all the past examples of pragmatic merging we never documented the
>>> pragmatic approach, but instead if we documented something, we wrote down
>>> the ideal standards to aim for. That makes it easier for everyone to do
>>> the right thing, and harder (and more expensive due to the inherit
>>> uncertainty) to try to bend them towards the least amount of collaboration
>>> a vendor can get away with.
>>>
>>
>> That makes a lot of sense.
>>
>>> That's why I really want to keep the undocumented and hence uncertain
>>> rules in this space.
>>>
>>> For the actual case at hand of plane color handling, I think the pragmatic
>>> aproach is roughly:
>>>
>>> 1. land the amdgpu code, but without uapi
>>>
>>
>> Would it be acceptable to guard uAPI bits behind #ifdef AMD_PLANE_COLOR
>> so anyone who wants it needs to do -DAMD_PLANE_COLOR or should we leave
>> those bits out completely so someone who wants a kernel with them needs
>> to apply a patch to add them?
> 
> Yeah I guess #ifdef is fine. But if you merge it all including structures
> then really the only code left out is the one that registers the
> properties against the planes (since the idea is to keep everything else,
> especially all the state structure scaffolding), and that shouldn't be
> much of a patch really.
> 

Ack.

>>> 2. use that (and any other driver code that's been floating around in this
>>> space) to build up the kernel-internal infrastructure - the proposed graph
>>> of color transformation blocks will need quite a few things
>>>
>>> 3. land the uapi on top in it's hopeful final form, maybe hidden if it's
>>> not yet complete or ready for prime time as we sometimes do with bigger
>>> projects
>>>
>>
>> This would be the vendor neutral uAPI, so something like the RFC Simon
>> sent out, right?
> 
> Yeah.
> 
> And see my comments about about the vendor uapi, depending upon how this
> all pans out you might consider the option of doing the amdgpu props on
> top of the new graph node color pipeline proposal, to proof that out. Like
> we've done with atomic.
> 

Interfacing with the vendor neutral uapi is my preference but I think
it'll take us a while to get it merged, because it affects everyone and needs
really good review from lots of people (Arm SoC folks), IGT tests, etc.

My plan is to
1. Merge the AMD-specific stuff (with ifdef of patch for uapi)
2. Enable it with gamescope (and possibly other interested compositors)
3. Hook AMD color stuff into new vendor neutral uapi
4. Modify gamescope to call into vendor neutral uapi
5. Confirm that this work equally well with AMD-specific uapi and vendor
   neutral uapi (IGT tests, visually, etc.)

Harry

> Cheers, Daniel
> 
>>
>> Harry
>>
>>> Obviously compositor work, igts, docs and all that too, and most of all
>>> this can happen in parallel too once we have a rough consensus on where to
>>> aim for.
>>>
>>> Cheers, Daniel
>>
> 



More information about the dri-devel mailing list