<div dir="ltr"><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 27 Oct 2023 at 12:01, Sebastian Wick <<a href="mailto:sebastian.wick@redhat.com">sebastian.wick@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Fri, Oct 27, 2023 at 10:59:25AM +0200, Michel Dänzer wrote:<br>
> On 10/26/23 21:25, Alex Goins wrote:<br>
> > On Thu, 26 Oct 2023, Sebastian Wick wrote:<br>
> >> On Thu, Oct 26, 2023 at 11:57:47AM +0300, Pekka Paalanen wrote:<br>
> >>> On Wed, 25 Oct 2023 15:16:08 -0500 (CDT)<br>
> >>> Alex Goins <<a href="mailto:agoins@nvidia.com" target="_blank">agoins@nvidia.com</a>> wrote:<br>
> >>><br>
> >>>> Despite being programmable, the LUTs are updated in a manner that is less<br>
> >>>> efficient as compared to e.g. the non-static "degamma" LUT. Would it be helpful<br>
> >>>> if there was some way to tag operations according to their performance,<br>
> >>>> for example so that clients can prefer a high performance one when they<br>
> >>>> intend to do an animated transition? I recall from the XDC HDR workshop<br>
> >>>> that this is also an issue with AMD's 3DLUT, where updates can be too<br>
> >>>> slow to animate.<br>
> >>><br>
> >>> I can certainly see such information being useful, but then we need to<br>
> >>> somehow quantize the performance.<br>
> > <br>
> > Right, which wouldn't even necessarily be universal, could depend on the given<br>
> > host, GPU, etc. It could just be a relative performance indication, to give an<br>
> > order of preference. That wouldn't tell you if it can or can't be animated, but<br>
> > when choosing between two LUTs to animate you could prefer the higher<br>
> > performance one.<br>
> > <br>
> >>><br>
> >>> What I was left puzzled about after the XDC workshop is whether it is<br>
> >>> possible to pre-load configurations in the background (slow), and then<br>
> >>> quickly switch between them? Hardware-wise I mean.<br>
> > <br>
> > This works fine for our "fast" LUTs, you just point them to a surface in video<br>
> > memory and they flip to it. You could keep multiple surfaces around and flip<br>
> > between them without having to reprogram them in software. We can easily do that<br>
> > with enumerated curves, populating them when the driver initializes instead of<br>
> > waiting for the client to request them. You can even point multiple hardware<br>
> > LUTs to the same video memory surface, if they need the same curve.<br>
> > <br>
> >><br>
> >> We could define that pipelines with a lower ID are to be preferred over<br>
> >> higher IDs.<br>
> > <br>
> > Sure, but this isn't just an issue with a pipeline as a whole, but the<br>
> > individual elements within it and how to use them in a given context.<br>
> > <br>
> >><br>
> >> The issue is that if programming a pipeline becomes too slow to be<br>
> >> useful it probably should just not be made available to user space.<br>
> > <br>
> > It's not that programming the pipeline is overall too slow. The LUTs we have<br>
> > that are relatively slow to program are meant to be set infrequently, or even<br>
> > just once, to allow the scaler and tone mapping operator to operate in fixed<br>
> > point PQ space. You might still want the tone mapper, so you would choose a<br>
> > pipeline that includes them, but when it comes to e.g. animating a night light,<br>
> > you would want to choose a different LUT for that purpose.<br>
> > <br>
> >><br>
> >> The prepare-commit idea for blob properties would help to make the<br>
> >> pipelines usable again, but until then it's probably a good idea to just<br>
> >> not expose those pipelines.<br>
> > <br>
> > The prepare-commit idea actually wouldn't work for these LUTs, because they are<br>
> > programmed using methods instead of pointing them to a surface. I'm actually not<br>
> > sure how slow it actually is, would need to benchmark it. I think not exposing<br>
> > them at all would be overkill, since it would mean you can't use the preblending<br>
> > scaler or tonemapper, and animation isn't necessary for that.<br>
> > <br>
> > The AMD 3DLUT is another example of a LUT that is slow to update, and it would<br>
> > obviously be a major loss if that wasn't exposed. There just needs to be some<br>
> > way for clients to know if they are going to kill performance by trying to<br>
> > change it every frame.<br>
> <br>
> Might a first step be to require the ALLOW_MODESET flag to be set when changing the values for a colorop which is too slow to be updated per refresh cycle?<br>
> <br>
> This would tell the compositor: You can use this colorop, but you can't change its values on the fly.<br>
<br>
I argued before that changing any color op to passthrough should never<br>
require ALLOW_MODESET and while this is really hard to guarantee from a<br>
driver perspective I still believe that it's better to not expose any<br>
feature requiring ALLOW_MODESET or taking too long to program to be<br>
useful for per-frame changes.<br>
<br>
When user space has ways to figure out if going back to a specific state<br>
(in this case setting everything to bypass) is possible without ALLOW_MODESET we can<br>
revisit this decision, but until then, let's keep things simple and only<br>
expose things that work reliably without ALLOW_MODESET and fast enough<br>
to work for per-frame changes.<br></blockquote><div><br></div><div>Knowing that an operation is fast enough for "per-frame" changes is far from enough. If programming a 3D LUT takes 4 milliseconds, for example, atomic commits need very different scheduling to hit the vblank deadline than when programming a 1D LUT takes 100 microseconds. It also depends on the refresh rate: 4 ms fits within a frame on a 60 Hz display (about 16.7 ms per frame), but not on a 300 Hz display (about 3.3 ms per frame).</div><div><br></div><div>The only thing that would be useful for me is an upper bound on how long programming a color pipeline and/or its individual elements takes (exposed in the API, or at the very least documented). Without something like that I would only ever program the pipelines on modesets, because there are no strict timing requirements there.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Harry, Pekka: Should we document this? It obviously restricts what can<br>
be exposed but exposing things that can't be used by user space isn't<br>
useful.<br>
<br>
> <br>
> -- <br>
> Earthling Michel Dänzer | <a href="https://redhat.com" rel="noreferrer" target="_blank">https://redhat.com</a><br>
> Libre software enthusiast | Mesa and Xwayland developer<br>
> <br>
<br>
</blockquote></div></div>
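<div dir="ltr"><div>To make the scheduling point from my reply concrete, here is a rough sketch of the arithmetic a compositor would do if it had such an upper bound. Everything in it is an assumption for illustration: the helper name, the fields, and the safety margin are made up, and no KMS property reports a programming time today. The 4 ms and 100 microsecond figures are just the examples from above.</div>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitBudget:
    frame_period_us: float  # time between two vblanks
    latest_start_us: float  # latest commit start, counted from the previous vblank
    fits_in_frame: bool     # True if programming can finish within one refresh cycle


def commit_budget(refresh_hz: float, program_time_us: float,
                  safety_margin_us: float = 0.0) -> CommitBudget:
    """Hypothetical helper, not an existing API.

    program_time_us stands in for the upper bound on colorop programming
    time that the driver would have to expose or document.
    """
    if not refresh_hz > 0:
        raise ValueError("refresh_hz must be positive")
    frame_period_us = 1_000_000 / refresh_hz
    # Time that must remain before the next vblank when the commit starts.
    needed_us = program_time_us + safety_margin_us
    latest_start_us = frame_period_us - needed_us
    return CommitBudget(frame_period_us, latest_start_us, latest_start_us >= 0)


if __name__ == "__main__":
    # Example figures from the mail: a 4 ms 3D LUT and a 100 us 1D LUT.
    for hz, cost_us in ((60, 4000), (60, 100), (300, 4000)):
        print(hz, cost_us, commit_budget(hz, cost_us))
```
</pre>
<div>With the 4 ms figure the commit still fits at 60 Hz but has to start much earlier in the frame than the 100 microsecond case, and at 300 Hz it does not fit at all. That is the distinction a plain "fast enough for per-frame changes" flag cannot express.</div></div>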