[Intel-gfx] [RFC v2 0/5] Waitboost drm syncobj waits

Mon Feb 20 15:56:45 UTC 2023

On 20/02/2023 15:45, Rob Clark wrote:
> On Mon, Feb 20, 2023 at 4:22 AM Tvrtko Ursulin
> <tvrtko.ursulin at linux.intel.com> wrote:
>>
>>
>> On 17/02/2023 17:00, Rob Clark wrote:
>>> On Fri, Feb 17, 2023 at 8:03 AM Tvrtko Ursulin
>>> <tvrtko.ursulin at linux.intel.com> wrote:
>>
>> [snip]
>>
>>>>> adapted from your patches..  I think the basic idea of deadlines
>>>>> (which includes "I want it NOW" ;-)) isn't controversial, but the
>>>>> original idea got caught up in some bikeshed (what about compositors
>>>>> that wait on fences in userspace to decide which surfaces to update in
>>>>> the next frame), plus me getting busy and generally not having a good
>>>>> plan for how to leverage this from VM guests (which is becoming
>>>>> increasingly important for CrOS).  I think I can build on some ongoing
>>>>> virtgpu fencing improvement work to solve the latter.  But now that we
>>>>> have a 2nd use-case for this, it makes sense to respin.
>>>>
>>>> Sure, I was looking at the old version already. It is interesting. But
>>>> also IMO needs quite a bit more work to approach achieving what is
>>>> implied from the name of the feature. It would need proper deadline
>>>> based sched job picking, and even then drm sched is mostly just a
>>>> frontend. So once past runnable status and jobs handed over to backend,
>>>> without further driver work it probably wouldn't be very effective past
>>>> very lightly loaded systems.
>>>
>>> Yes, but all of that is not part of dma_fence ;-)
>>
>> :) Okay.
>>
>> Having said that, do we need a step back to think about whether adding
>> deadline to dma-fences is not making them something too much different
>> to what they were? Going from purely synchronisation primitive more
>> towards scheduling paradigms. Just to brainstorm if there will not be
>> any unintended consequences. I should mention this in your RFC thread
>> actually.
> 
> Perhaps "deadline" isn't quite the right name, but I haven't thought
> of anything better.  It is really a hint to the fence signaller about
> how soon it is interested in a result so the driver can factor that
> into freq scaling decisions.  Maybe "goal" or some other term would be
> better?

Don't know, no strong opinion on the name at the moment. For me it was 
more about the change of what type of side channel data is getting 
attached to dma-fence and whether it changes what the primitive is for.

> I guess that can factor into scheduling decisions as well.. but we
> already have priority for that.  My main interest is freq mgmt.
> 
> (Thankfully we don't have performance and efficiency cores to worry
> about, like CPUs ;-))
> 
>>> A pretty common challenging usecase is still the single fullscreen
>>> game, where scheduling isn't the problem, but landing at an
>>> appropriate GPU freq absolutely is.  (UI workloads are perhaps more
>>> interesting from a scheduler standpoint, but they generally aren't
>>> challenging from a load/freq standpoint.)
>>
>> Challenging as in picking the right operating point? Might be latency
>> impacted (and so user perceived UI smoothness) due missing waitboost for
>> anything syncobj related. I don't know if anything to measure that
>> exists currently though. Assuming it is measurable then the question
>> would be is it perceivable.
>>> Fwiw, the original motivation of the series was to implement something
>>> akin to i915 pageflip boosting without having to abandon the atomic
>>> helpers.  (And, I guess it would also let i915 preserve that feature
>>> if it switched to atomic helpers.. I'm unsure if there are still other
>>> things blocking i915's migration.)
>>
>> Question for display folks I guess.
>>
>>>> Then if we fast forward to a world where schedulers perhaps become fully
>>>> deadline aware (we even had this for i915 few years back) then the
>>>> question will be does equating waits with immediate deadlines still
>>>> works. Maybe not too well because we wouldn't have the ability to
>>>> distinguish between the "someone is waiting" signal from the otherwise
>>>> propagated deadlines.
>>>
>>> Is there any other way to handle a wait boost than expressing it as an
>>> ASAP deadline?
>>
>> A leading question or just a question? Nothing springs to my mind at the
>> moment.
> 
> Just a question.  The immediate deadline is the only thing that makes
> sense to me, but that could be because I'm looking at it from the
> perspective of also trying to handle the case where missing vblank
> reduces utilization and provides the wrong signal to gpufreq.. i915
> already has a way to handle this internally, but it involves bypassing
> the atomic helpers, which isn't a thing I want to encourage other
> drivers to do.  And completely doesn't work for situations where the
> gpu and display are separate devices.

Right, there is yet another angle to discuss with Daniel here who AFAIR 
was a bit against i915 priority inheritance going past a single device 
instance. In which case DRI_PRIME=1 would lose the ability to boost 
frame buffer dependency chains. Opens up the question of deadline 
inheritance across different drivers too. Or perhaps Daniel would be 
okay with this working if implemented at the dma-fence layer.

Regards,

Tvrtko