[PATCH] drm/ttm: stop warning on TT shrinker failure

Daniel Vetter daniel at ffwll.ch
Tue Mar 23 12:00:29 UTC 2021


On Tue, Mar 23, 2021 at 12:51:13PM +0100, Christian König wrote:
> 
> 
> On 23.03.21 at 12:46, Michal Hocko wrote:
> > On Tue 23-03-21 12:28:20, Daniel Vetter wrote:
> > > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > [...]
> > > > > > fs_reclaim_acquire is there to make sure lockdep understands that this
> > > > > > > is a shrinker and that it checks all the dependencies for us as if
> > > > > > > we were in real reclaim. There are some drop caches interfaces in proc
> > > > > > iirc, but those drop everything, and they don't have the fs_reclaim
> > > > > > annotations to teach lockdep about what we're doing.
> > > > ... I really do not follow this. You shouldn't really care whether this
> > > > is a reclaim interface or not. Or maybe I just do not understand this...
> > > We're heavily relying on lockdep and fs_reclaim to make sure we get it all
> > > right. So any drop caches interface that isn't wrapped in fs_reclaim
> > > context is kinda useless for testing. Plus ideally we want to only hit our
> > > own paths, and not trash every other cache in the system. Speed matters in
> > > CI.
> > But what is special about this path to hack around and make it pretend
> > it is part of the fs reclaim path?
> 
> That's just to teach lockdep that there is a dependency.
> 
> In other words we pretend in the debugfs file that it is part of the fs
> reclaim path to check for the case when it really becomes part of the fs
> reclaim path.

Yeah, this is only for testing. There are two ways to test your shrinker:

- drive the system against the OOM wall and deal with lots of unrelated hangs
  and issues. On top of that, this takes positively forever, which is not good
  if you want CI turn-around time measured in "coffee breaks" as the time unit.

- have a debugfs file which reconstructs the calling context of direct
  reclaim sufficiently for lockdep to do its thing, and then test just
  your shrinker in isolation, without crashing your CI machines or even
  hurting them much.

Only one of these options is actually practical.
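
To make the second option concrete, here is a rough userspace toy model of
the idea. It is not lockdep and not the TTM code; every name in it
(toy_acquire, debug_shrink, DRIVER_LOCK and so on) is made up for
illustration. The assumption is a driver that allocates memory while holding
a lock which its shrinker also takes. In the kernel the pretend step would be
the fs_reclaim_acquire()/fs_reclaim_release() pair around the shrinker call;
here FS_RECLAIM is just one more class in a tiny lock-order graph.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy lock classes; FS_RECLAIM stands in for the fs_reclaim pseudo-lock. */
enum lock_class { FS_RECLAIM, DRIVER_LOCK, NR_CLASSES };

struct tracker {
	bool edge[NR_CLASSES][NR_CLASSES]; /* edge[a][b]: b taken while a held */
	enum lock_class held[NR_CLASSES];  /* stack of currently held classes */
	int depth;
	bool inversion;                    /* set once a cycle would be closed */
};

/* Depth-first search: is 'to' reachable from 'from' over recorded edges? */
static bool path_exists(const struct tracker *t, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (t->edge[from][next] && !seen[next] &&
		    path_exists(t, next, to, seen))
			return true;
	return false;
}

static void toy_acquire(struct tracker *t, enum lock_class c)
{
	for (int i = 0; i < t->depth; i++) {
		bool seen[NR_CLASSES] = { false };

		/* Recording held -> c closes a cycle if c already leads to held. */
		if (path_exists(t, c, t->held[i], seen))
			t->inversion = true;
		t->edge[t->held[i]][c] = true;
	}
	assert(t->depth < NR_CLASSES);
	t->held[t->depth++] = c;
}

static void toy_release(struct tracker *t)
{
	assert(t->depth > 0);
	t->depth--;
}

/* Normal driver path: an allocation that may enter reclaim, made under the
 * driver lock, is modelled as briefly taking FS_RECLAIM. */
static void driver_alloc_path(struct tracker *t)
{
	toy_acquire(t, DRIVER_LOCK);
	toy_acquire(t, FS_RECLAIM);
	toy_release(t);
	toy_release(t);
}

/* Hypothetical shrinker callback that needs the same driver lock. */
static void toy_shrinker_scan(struct tracker *t)
{
	toy_acquire(t, DRIVER_LOCK);
	toy_release(t);
}

/* Debug hook: optionally pretend to be in reclaim around the shrinker. */
static void debug_shrink(struct tracker *t, bool pretend_reclaim)
{
	if (pretend_reclaim)
		toy_acquire(t, FS_RECLAIM);
	toy_shrinker_scan(t);
	if (pretend_reclaim)
		toy_release(t);
}

/* Run both paths once and report whether the tracker saw an inversion. */
bool inversion_detected(bool pretend_reclaim)
{
	struct tracker t;

	memset(&t, 0, sizeof(t));
	driver_alloc_path(&t);
	debug_shrink(&t, pretend_reclaim);
	return t.inversion;
}
```

In this model inversion_detected(true) reports the problem after a single
cheap call, with no memory pressure at all, while inversion_detected(false)
stays silent because the tracker never learns that the shrinker can run
inside reclaim.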
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

