[PATCH 4/7] drm/ttm: move LRU walk defines into new internal header
Thomas Hellström
thomas.hellstrom at linux.intel.com
Thu Aug 22 08:21:47 UTC 2024
On Thu, 2024-08-22 at 09:55 +0200, Christian König wrote:
> Am 22.08.24 um 08:47 schrieb Thomas Hellström:
> > > > > As Sima said, this is complicated but not beyond
> > > > > comprehension:
> > > > > i915
> > > > > https://elixir.bootlin.com/linux/v6.11-rc4/source/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c#L317
> > > > As far as I can tell what i915 does here is extremely
> > > > questionable.
> > > >
> > > > if (sc->nr_scanned < sc->nr_to_scan &&
> > > > current_is_kswapd()) {
> > > > ....
> > > > with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
> > > >
> > > > with_intel_runtime_pm() then calls pm_runtime_get_sync().
> > > >
> > > > So basically the i915 shrinker assumes that when called from
> > > > kswapd()
> > > > that it can synchronously wait for runtime PM to power up the
> > > > device
> > > > again.
> > > >
> > > > As far as I can tell that means that a device driver makes
> > > > strong
> > > > and
> > > > completely undocumented assumptions how kswapd works
> > > > internally.
> > > Admittedly that looks weird
> > >
> > > But I'd really expect a reclaim lockdep splat to happen there if
> > > the
> > > i915 pm did something not-allowed. IIRC, the design direction the
> > > i915
> > > people got from mm people regarding the shrinkers was to avoid
> > > any
> > > sleeps in direct reclaim and punt it to kswapd. Need to ask i915
> > > people
> > > how they can get away with that.
> > >
> > >
> > So it turns out that Xe integrated pm resume is reclaim-safe, and
> > I'd
> > expect i915's to be as well. Xe discrete pm resume isn't.
> >
> > So that means that, at least for integrated, the i915 shrinker
> > should
> > be ok from that POW, and punting certain bos to kswapd is not
> > AFAICT
> > abusing any undocumented features of kswapd but rather a way to
> > avoid
> > resuming the device during direct reclaim, like documented.
>
> The more I think about this the more I disagree to this driver
> design.
> In my opinion device drivers should *never* resume runtime PM in a
> shrinker callback in the first place.
Runtime PM resume is allowed even from irq context if carefully
implemented by the driver and flagged as such to the core.
https://docs.kernel.org/power/runtime_pm.html
Resuming runtime PM from reclaim therefore shouldn't be an issue IMO,
and really up to the driver.
>
> When the device is turned off it means that all of it's operations
> are
> stopped and eventually power to caches etc turned off as well. So I
> don't see any ongoing writeback operations or similar either.
>
> So the question is why do we need to power on the device in a
> shrinker
> in the first place?
>
> What could be is that the device needs to flush GART TLBs or similar
> when it is turned on, e.g. that you grab a PM reference to make sure
> that during your HW operation the device doesn't suspend.
Exactly why the i915 needs to flush the GART I'm not sure of but I
suspect the gart TLB might be forgotten while suspended.
>
> But that doesn't mean that you should resume the device. In other
> words
> when the device is powered down you shouldn't power it up again.
>
> And for GART we already have the necessary move callback implemented
> in
> TTM. This is done by radeon, amdgpu and nouveu in a common way as far
> as
> I can see.
>
> So why should Xe be special and follow the very questionable approach
> of
> i915 here?
For Xe, Lunar Lake (integrated) has the interesting design that each bo
carries compression metadata that needs to be blitted to system pages
during shrinking. The alternative is to resolve all buffer objects at
device runtime suspend...
But runtime PM aside, with a one-bo only approach we still have the
drawbacks that it
* eliminates possibility for driver deadlock avoidance
* Requires TTM knowledge of "purgeable" bos
* Requires an additional LRU to avoid O(n2) traversal of already
shrunken objects
* Drivers with legitimate shrinker designs that don't fit in the TTM-
enforced model will have frustrated maintainers.
Thanks,
Thomas
>
> Regards,
> Christian.
>
>
> >
> > /Thomas
> >
> >
> >
More information about the dri-devel
mailing list