[PATCH] Documentation/gpu: Requirements for driver-specific KMS uAPIs

Daniel Vetter daniel at ffwll.ch
Fri May 5 10:16:55 UTC 2023


On Fri, May 05, 2023 at 11:43:20AM +0300, Pekka Paalanen wrote:
> On Thu, 4 May 2023 17:25:57 -0400
> Harry Wentland <harry.wentland at amd.com> wrote:
> 
> > We have steered away for a long time now from driver-specific KMS APIs
> > for good reasons but never codified our stance. With the proposal of
> > new, driver-specific color management uAPIs [1] it is important to
> > outline the requirements for the rare times when driver-specific KMS
> > uAPIs are desired in order to move complex topics along.
> > 
> > [1] https://patchwork.freedesktop.org/series/116862/
> > 
> > Signed-off-by: Harry Wentland <harry.wentland at amd.com>
> > Cc: Simon Ser <contact at emersion.fr>
> > Cc: Joshua Ashton <joshua at froggi.es>
> > Cc: Michel Dänzer <mdaenzer at redhat.com>
> > Cc: Sebastian Wick <sebastian.wick at redhat.com>
> > Cc: Jonas Ådahl <jadahl at redhat.com>
> > Cc: Alex Goins <agoins at nvidia.com>
> > Cc: Pekka Paalanen <pekka.paalanen at collabora.com>
> > Cc: Melissa Wen <mwen at igalia.com>
> > Cc: Aleix Pol <aleixpol at kde.org>
> > Cc: Xaver Hugl <xaver.hugl at gmail.com>
> > Cc: Victoria Brekenfeld <victoria at system76.com>
> > Cc: Daniel Vetter <daniel at ffwll.ch>
> > Cc: Dave Airlie <airlied at gmail.com>
> > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > Cc: Uma Shankar <uma.shankar at intel.com>
> > To: dri-devel at lists.freedesktop.org
> > ---
> >  Documentation/gpu/drm-uapi.rst | 32 ++++++++++++++++++++++++++++++++
> >  1 file changed, 32 insertions(+)
> > 
> > diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> > index ce47b4292481..eaefc3ed980c 100644
> > --- a/Documentation/gpu/drm-uapi.rst
> > +++ b/Documentation/gpu/drm-uapi.rst
> > @@ -118,6 +118,38 @@ is already rather painful for the DRM subsystem, with multiple different uAPIs
> >  for the same thing co-existing. If we add a few more complete mistakes into the
> >  mix every year it would be entirely unmanageable.
> >  
> > +.. _driver_specific:
> > +
> > +Driver-Specific DRM/KMS uAPI
> > +============================
> > +
> > +Driver-specific interfaces are strongly discouraged for DRM/KMS interfaces.
> > +Kernel-modesetting (KMS) functionality does in principle apply to all drivers.
> > +Driver-specific uAPIs tends to lead to unique implementations in userspace and
> > +are often hard to maintain, especially when different drivers implement similar
> > +but subtly different uAPIs.
> > +
> > +At times arriving at a consensus uAPI can be a difficult and lengthy process and
> > +might benefit from experimentation. This experimentation might warrant
> > +introducing driver specific APIs in order to move the eosystem forward. If a
> > +driver decides to go down this path we ask for the following:

I don't like this for two fairly fundamental reasons, neither of which are
that sometimes merging stuff that's not great is the right thing to do to
move the community and ecosystem forward.

> Hi,
> 
> should it be "require" instead of "ask"?
> 
> > +
> > +- An agreement within the community that introducing driver-specific uAPIs is
> > +  warranted in this case;
> > +
> > +- The new uAPI is behind a CONFIG option that is clearly marked EXPERIMENTAL and
> > +  is disabled by default;
> > +
> > +- The new uAPI is enabled when a module parameter for the driver is set, and
> > +  defaults to 'off' otherwise;
> > +
> > +- The new uAPI follows all open-source userspace requirements outlined above;
> > +
> > +- The focus is maintained on development of a vendor-neutral uAPI and progress
> > +  toward such an uAPI needs to be apparent on public forums. If no such progress
> > +  is visible within a reasonable timeframe (1-2 years) anybody is within their
> > +  right to send, review, and merge patches that remove the driver-specific uAPI.
> > +
> >  .. _drm_render_node:
> >  
> >  Render nodes
> 
> Seems fine to me. I have another addition to suggest:
> 
> When such UAPI is introduced, require that it comes with an expiration
> date. This date should be unmissable, not just documented. The kernel
> module parameter name to enable the UAPI could contain the expiry date,
> for example.
> 
> After all, the most important thing to get through to users is that
> this *will* go away and not just theoretically.

There's no taking-backsies with uapi. If there is a regression report,
we'll have to keep it around, for the usual approximation of "forever"

And this is the first reason I don't like this, from other write-ups and
talking with people it seems like there's the assumption that if we just
hide this behind enough knobs, we can remove the uapi again.

We can't.

The times we've hidden uapi behind knobs was _not_ for uapi we
fundamentally didn't want, at least for the long term. But for the cases
where the overall scope was simply too big, and we needed some time
in-tree to shake out all the bugs (across both kernel and userspace), and
fill out any of the details. Some examples:

- intel hiding new hw enabling behind the alpha support is not about
  hiding that uapi so we can change it. It's about the fact that not yet
  all enabling has landed in upstream, and not yet all full stack
  validation on production silicon has completed. It's about not shipping
  buggy code to users that we can't support.

- atomic kms was simply too big, there was a lot of work in compositors
  needed, testing corner cases, and details like adding the blob support
  for the display mode so that modesets would work too with atomic. We
  never landed a preliminary uabi version of atomic (there were plenty
  floating around) that wasn't deemed ready as the long term solution, we
  were simply not sure we got it right until all the pieces where in
  place.

  And viz Xorg-modesetting, in at least one case we still got it wrong and
  had to disable atomic for that userspace.

- nouveau pony years back tried this entire "oh the uapi is just
  experimental" thing, and it resulted in the by far worst flameware
  between Dave and Linus on dri-devel

So _if_ we do this we need to be clear that uapi is forever, and not have
docs that suggest otherwise.

> If that date needs to be moved forward, it should be possible to do so
> with a simple patch gathering enough acks. The main thing is to set the
> date from the start, so there can be no confusion about when its going
> to the chopping block.
> 
> I do not suggest that the kernel would automatically runtime disable
> the UAPI after that date.
> 
> Does any of the big idea fly with upper maintainers and Linus?

The other reason, and maybe even more fundamental one, is that I think the
uncertainty of not documenting how pragmatic we are is beneficial.

We should definitely document the gold standard aspirations, to make sure
everyone knows where to aim for. And I'm definitely all for pragmatic
merging where it makes sense, we've had tons of that, and happily carry
the costs to this day for some of them:

- a lot of the early soc drivers are kinda meh, and will stay that way
  forever since they're not maintained anymore

- we've had very much free-for-all vendor kms properties, and I expect the
  hall of shame witht he big table of vendor props with barely any docs
  will never go away

- we're taking all the compute runtimes despite that mesa on the 3d/gl/vk
  side shows how much better collaboration would be (and I think soon will
  show the same for media) because having a compute ecosystem that's
  substantially weaker than the sum of all its parts is still better than
  nothing. And the situation is still that collaboration even with a
  company is often impossible, aiming for better is not very realistic :-/

But the goal is still to have solid code, cross-vendor infrastructure and
collaboration and all that stuff, because that's why upstream is strong.
And the uncertainty is helping us for a lot of reasons:

- it makes vendors vary of going with vendor solutions. Minimally they ask
  in private, which gives Dave, me and all the others doing vendor
  outreach or working as some ambassador rule at a vendor an opportunity
  to steer things in a better direction. And often do the steering
  _before_ code gets written.

- it allows Dave&me to more freely decide when to be pragmatic, without
  being bound by rules. The point of pragmatic merging is to bend the
  short term principles for a better long term outcome, splattering that
  entire space full with rules makes rule-bending a lot harder when
  needed.

- most of all I really don't want to be in a discussion with vendors where
  they try to laywer-argue that we must merge their patches because they
  strictly followed the wording of some pragmatic merge rules while
  entirely tossing the spirit of what we aim for. I already have more than
  enough of that, this will result in more.

In all the past examples of pragmatic merging we never documented the
pragmatic approach, but instead if we documented something, we wrote down
the ideal standards to aim for. That makes it easier for everyone to do
the right thing, and harder (and more expensive due to the inherit
uncertainty) to try to bend them towards the least amount of collaboration
a vendor can get away with.

That's why I really want to keep the undocumented and hence uncertain
rules in this space.

For the actual case at hand of plane color handling, I think the pragmatic
aproach is roughly:

1. land the amdgpu code, but without uapi

2. use that (and any other driver code that's been floating around in this
space) to build up the kernel-internal infrastructure - the proposed graph
of color transformation blocks will need quite a few things

3. land the uapi on top in it's hopeful final form, maybe hidden if it's
not yet complete or ready for prime time as we sometimes do with bigger
projects

Obviously compositor work, igts, docs and all that too, and most of all
this can happen in parallel too once we have a rough consensus on where to
aim for.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list