[RFC PATCH v2 06/17] drm/doc/rfc: Describe why prescriptive color pipeline is needed

Xaver Hugl xaver.hugl at kde.org
Fri Oct 27 11:40:20 UTC 2023


On Fri, Oct 27, 2023 at 12:01, Sebastian Wick <sebastian.wick at redhat.com>
wrote:

> On Fri, Oct 27, 2023 at 10:59:25AM +0200, Michel Dänzer wrote:
> > On 10/26/23 21:25, Alex Goins wrote:
> > > On Thu, 26 Oct 2023, Sebastian Wick wrote:
> > >> On Thu, Oct 26, 2023 at 11:57:47AM +0300, Pekka Paalanen wrote:
> > >>> On Wed, 25 Oct 2023 15:16:08 -0500 (CDT)
> > >>> Alex Goins <agoins at nvidia.com> wrote:
> > >>>
> > >>>> Despite being programmable, the LUTs are updated in a manner that is
> > >>>> less efficient as compared to e.g. the non-static "degamma" LUT. Would
> > >>>> it be helpful if there was some way to tag operations according to
> > >>>> their performance, for example so that clients can prefer a high
> > >>>> performance one when they intend to do an animated transition? I
> > >>>> recall from the XDC HDR workshop that this is also an issue with AMD's
> > >>>> 3DLUT, where updates can be too slow to animate.
> > >>>
> > >>> I can certainly see such information being useful, but then we need to
> > >>> somehow quantize the performance.
> > >
> > > Right, which wouldn't even necessarily be universal, could depend on the
> > > given host, GPU, etc. It could just be a relative performance indication,
> > > to give an order of preference. That wouldn't tell you if it can or can't
> > > be animated, but when choosing between two LUTs to animate you could
> > > prefer the higher performance one.
> > >
> > >>>
> > >>> What I was left puzzled about after the XDC workshop is that is it
> > >>> possible to pre-load configurations in the background (slow), and then
> > >>> quickly switch between them? Hardware-wise I mean.
> > >
> > > This works fine for our "fast" LUTs, you just point them to a surface in
> > > video memory and they flip to it. You could keep multiple surfaces around
> > > and flip between them without having to reprogram them in software. We
> > > can easily do that with enumerated curves, populating them when the
> > > driver initializes instead of waiting for the client to request them. You
> > > can even point multiple hardware LUTs to the same video memory surface,
> > > if they need the same curve.
> > >
> > >>
> > >> We could define that pipelines with a lower ID are to be preferred over
> > >> higher IDs.
> > >
> > > Sure, but this isn't just an issue with a pipeline as a whole, but the
> > > individual elements within it and how to use them in a given context.
> > >
> > >>
> > >> The issue is that if programming a pipeline becomes too slow to be
> > >> useful it probably should just not be made available to user space.
> > >
> > > It's not that programming the pipeline is overall too slow. The LUTs we
> > > have that are relatively slow to program are meant to be set
> > > infrequently, or even just once, to allow the scaler and tone mapping
> > > operator to operate in fixed point PQ space. You might still want the
> > > tone mapper, so you would choose a pipeline that includes them, but when
> > > it comes to e.g. animating a night light, you would want to choose a
> > > different LUT for that purpose.
> > >
> > >>
> > >> The prepare-commit idea for blob properties would help to make the
> > >> pipelines usable again, but until then it's probably a good idea to just
> > >> not expose those pipelines.
> > >
> > > The prepare-commit idea actually wouldn't work for these LUTs, because
> > > they are programmed using methods instead of pointing them to a surface.
> > > I'm actually not sure how slow it actually is, would need to benchmark
> > > it. I think not exposing them at all would be overkill, since it would
> > > mean you can't use the preblending scaler or tonemapper, and animation
> > > isn't necessary for that.
> > >
> > > The AMD 3DLUT is another example of a LUT that is slow to update, and it
> > > would obviously be a major loss if that wasn't exposed. There just needs
> > > to be some way for clients to know if they are going to kill performance
> > > by trying to change it every frame.
> >
> > Might a first step be to require the ALLOW_MODESET flag to be set when
> > changing the values for a colorop which is too slow to be updated per
> > refresh cycle?
> >
> > This would tell the compositor: You can use this colorop, but you can't
> > change its values on the fly.
>
> I argued before that changing any color op to passthrough should never
> require ALLOW_MODESET and while this is really hard to guarantee from a
> driver perspective I still believe that it's better to not expose any
> feature requiring ALLOW_MODESET or taking too long to program to be
> useful for per-frame changes.
>
> When user space has ways to figure out if going back to a specific state
> (in this case setting everything to bypass) is possible without
> ALLOW_MODESET, we can revisit this decision, but until then, let's keep
> things simple and only expose things that work reliably without
> ALLOW_MODESET and are fast enough to work for per-frame changes.
>

Knowing that an operation is fast enough for "per-frame" changes is nowhere
near enough. If programming a 3D LUT takes 4 milliseconds, for example, that
requires very different scheduling for atomic commits to hit the vblank
deadline than when programming a 1D LUT takes 100 microseconds. It also
depends on the refresh rate: that 4 ms example would be per-frame on a
60 Hz display (about 16.7 ms per frame), but not on a 300 Hz display (about
3.3 ms per frame).
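
To make the refresh-rate dependence concrete, here is a minimal sketch of the
arithmetic. The function names are made up for illustration, and the 4 ms and
100 microsecond figures are only the example numbers from above, not
measurements of any hardware.

```python
def frame_period_us(refresh_hz: float) -> float:
    """Duration of one refresh cycle in microseconds."""
    return 1_000_000.0 / refresh_hz


def fits_in_frame(program_time_us: float, refresh_hz: float) -> bool:
    """True if programming completes within a single refresh cycle.

    This is only the coarse "per-frame" check; it says nothing about
    when the commit has to start in order to hit vblank.
    """
    return program_time_us <= frame_period_us(refresh_hz)


# Example figures from the text (hypothetical, not benchmarks).
LUT_3D_US = 4000.0  # 4 ms
LUT_1D_US = 100.0   # 100 microseconds

for hz in (60.0, 300.0):
    print(hz, fits_in_frame(LUT_3D_US, hz), fits_in_frame(LUT_1D_US, hz))
```

The same 4 ms operation passes the check at 60 Hz and fails it at 300 Hz,
which is why a plain "fast enough for per-frame changes" flag cannot be
interpreted without knowing the mode.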

The only thing that would be useful for me is an upper bound on how long
programming a color pipeline and/or its individual elements takes (exposed
in the API, or at the very least documented). Without something like that,
I would only ever program the pipelines on modesets, because there are no
strict timing requirements there.
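
As a rough sketch of how a compositor could use such an upper bound when
scheduling a commit: everything below is hypothetical. No such property
exists in the KMS API today, and the names and the 500 microsecond safety
margin are assumptions chosen purely for illustration.

```python
from typing import Optional

# Assumed slack for everything besides color programming (hypothetical value).
SAFETY_MARGIN_US = 500.0


def latest_commit_start_us(next_vblank_us: float, program_bound_us: float) -> float:
    """Latest time at which the commit may start and still hit vblank."""
    return next_vblank_us - program_bound_us - SAFETY_MARGIN_US


def can_update_this_frame(now_us: float, next_vblank_us: float,
                          program_bound_us: Optional[float]) -> bool:
    """Decide whether a color pipeline update is safe for the next vblank.

    program_bound_us is the (hypothetical) upper bound reported or
    documented by the driver; None means no bound is known.
    """
    if program_bound_us is None:
        # Without a bound, only program the pipeline on modesets.
        return False
    return now_us <= latest_commit_start_us(next_vblank_us, program_bound_us)
```

With a 4 ms bound the compositor has to start far earlier in the frame than
with a 100 microsecond bound, and with no bound at all it falls back to
modeset-only updates.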


>
> Harry, Pekka: Should we document this? It obviously restricts what can
> be exposed but exposing things that can't be used by user space isn't
> useful.
>
> >
> > --
> > Earthling Michel Dänzer            |                  https://redhat.com
> > Libre software enthusiast          |         Mesa and Xwayland developer
> >
>
>