[Intel-gfx] [RFC v2 0/5] Waitboost drm syncobj waits

Mon Feb 20 15:45:21 UTC 2023

On Mon, Feb 20, 2023 at 4:22 AM Tvrtko Ursulin
<tvrtko.ursulin at linux.intel.com> wrote:
>
>
> On 17/02/2023 17:00, Rob Clark wrote:
> > On Fri, Feb 17, 2023 at 8:03 AM Tvrtko Ursulin
> > <tvrtko.ursulin at linux.intel.com> wrote:
>
> [snip]
>
> >>> adapted from your patches..  I think the basic idea of deadlines
> >>> (which includes "I want it NOW" ;-)) isn't controversial, but the
> >>> original idea got caught up in some bikeshed (what about compositors
> >>> that wait on fences in userspace to decide which surfaces to update in
> >>> the next frame), plus me getting busy and generally not having a good
> >>> plan for how to leverage this from VM guests (which is becoming
> >>> increasingly important for CrOS).  I think I can build on some ongoing
> >>> virtgpu fencing improvement work to solve the latter.  But now that we
> >>> have a 2nd use-case for this, it makes sense to respin.
> >>
> >> Sure, I was looking at the old version already. It is interesting. But
> >> also IMO it needs quite a bit more work to approach achieving what is
> >> implied by the name of the feature. It would need proper deadline
> >> based sched job picking, and even then drm sched is mostly just a
> >> frontend. So once jobs are past runnable status and handed over to the
> >> backend, without further driver work it probably wouldn't be very
> >> effective beyond very lightly loaded systems.
> >
> > Yes, but all of that is not part of dma_fence ;-)
>
> :) Okay.
>
> Having said that, do we need to step back and think about whether adding
> deadlines to dma-fences makes them too different from what they were,
> going from a pure synchronisation primitive more towards scheduling
> paradigms? Just to brainstorm whether there will be any unintended
> consequences. I should mention this in your RFC thread, actually.

Perhaps "deadline" isn't quite the right name, but I haven't thought
of anything better.  It is really a hint to the fence signaller about
how soon the waiter is interested in a result, so the driver can factor
that into freq scaling decisions.  Maybe "goal" or some other term would be
better?

I guess that can factor into scheduling decisions as well.. but we
already have priority for that.  My main interest is freq mgmt.
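
To make the "hint for freq scaling" idea concrete, here is a minimal
userspace sketch.  Everything in it is an assumption for illustration:
struct model_fence, model_fence_set_deadline() and model_pick_freq_mhz()
are hypothetical names, not the kernel's dma_fence API, and the governor
policy (lowest frequency that still meets the earliest deadline) is just
one plausible choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a fence carrying a deadline hint.
 * This is NOT the kernel's struct dma_fence. */
struct model_fence {
	int      has_deadline;	/* non-zero once any waiter has set a hint */
	uint64_t deadline_ns;	/* earliest deadline requested so far */
};

/* Record a waiter's hint; only the earliest deadline is kept. */
static void model_fence_set_deadline(struct model_fence *f,
				     uint64_t deadline_ns)
{
	if (!f->has_deadline || deadline_ns < f->deadline_ns) {
		f->has_deadline = 1;
		f->deadline_ns = deadline_ns;
	}
}

/* Assumed policy: pick the lowest frequency that finishes the remaining
 * work before the deadline.  freqs_mhz must be sorted ascending and
 * contain at least one entry. */
static unsigned model_pick_freq_mhz(const struct model_fence *f,
				    uint64_t now_ns,
				    uint64_t remaining_cycles,
				    const unsigned *freqs_mhz, size_t n)
{
	size_t i;

	/* Nobody expressed urgency: stay at the most efficient point. */
	if (!f->has_deadline)
		return freqs_mhz[0];

	for (i = 0; i < n; i++) {
		/* cycles / MHz gives microseconds; * 1000 gives ns */
		uint64_t finish_ns =
			now_ns + remaining_cycles * 1000 / freqs_mhz[i];

		if (finish_ns <= f->deadline_ns)
			return freqs_mhz[i];
	}

	/* Deadline cannot be met (e.g. "NOW"): run as fast as possible. */
	return freqs_mhz[n - 1];
}
```

In this model a deadline equal to the current time can never be met, so
it always selects the highest frequency, which is how an "I want it NOW"
wait would behave.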

(Thankfully we don't have performance and efficiency cores to worry
about, like CPUs ;-))

> > A pretty common challenging usecase is still the single fullscreen
> > game, where scheduling isn't the problem, but landing at an
> > appropriate GPU freq absolutely is.  (UI workloads are perhaps more
> > interesting from a scheduler standpoint, but they generally aren't
> > challenging from a load/freq standpoint.)
>
> Challenging as in picking the right operating point? Latency (and so
> user perceived UI smoothness) might be impacted due to missing waitboost
> for anything syncobj related. I don't know if anything exists to measure
> that currently, though. Assuming it is measurable, the question would
> then be whether it is perceivable.
>
> > Fwiw, the original motivation of the series was to implement something
> > akin to i915 pageflip boosting without having to abandon the atomic
> > helpers.  (And, I guess it would also let i915 preserve that feature
> > if it switched to atomic helpers.. I'm unsure if there are still other
> > things blocking i915's migration.)
>
> Question for display folks I guess.
>
> >> Then if we fast forward to a world where schedulers perhaps become fully
> >> deadline aware (we even had this for i915 a few years back) then the
> >> question will be whether equating waits with immediate deadlines still
> >> works. Maybe not too well, because we wouldn't have the ability to
> >> distinguish the "someone is waiting" signal from the otherwise
> >> propagated deadlines.
> >
> > Is there any other way to handle a wait boost than expressing it as an
> > ASAP deadline?
>
> A leading question or just a question? Nothing springs to my mind at the
> moment.

Just a question.  The immediate deadline is the only thing that makes
sense to me, but that could be because I'm looking at it from the
perspective of also trying to handle the case where missing vblank
reduces utilization and provides the wrong signal to gpufreq.. i915
already has a way to handle this internally, but it involves bypassing
the atomic helpers, which isn't a thing I want to encourage other
drivers to do.  And it doesn't work at all for situations where the
gpu and display are separate devices.
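
The "wrong signal" effect can be illustrated with a rough arithmetic
sketch.  It assumes a deliberately simple model that is not taken from
any driver: one frame is rendered per flip, and the GPU sits idle from
the end of rendering until the vblank at which the frame is shown.  The
function names are hypothetical.

```c
#include <stdint.h>

/* Time one frame occupies on screen: the render time rounded up to a
 * whole number of vblank periods, since a late frame has to wait for
 * the following vblank. */
static uint64_t model_frame_period_us(uint64_t render_us, uint64_t vblank_us)
{
	uint64_t periods = (render_us + vblank_us - 1) / vblank_us;

	if (periods == 0)
		periods = 1;	/* zero render time still takes one period */

	return periods * vblank_us;
}

/* GPU utilization (percent, rounded down) as a busy-time based
 * governor would observe it in this model. */
static unsigned model_utilization_pct(uint64_t render_us, uint64_t vblank_us)
{
	return (unsigned)(render_us * 100 /
			  model_frame_period_us(render_us, vblank_us));
}
```

With a 16667 us vblank period, a 15000 us frame gives 89% utilization,
while an 18000 us frame that just misses vblank gives only 53%.  Under
these assumptions the slower case looks less loaded, so a purely
utilization-driven governor would see no reason to ramp up.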

BR,
-R

> Regards,
>
> Tvrtko