[Intel-gfx] [PATCH v2] drm/i915: Remove __GFP_NORETRY from our buffer allocator

Chris Wilson chris at chris-wilson.co.uk
Mon Jun 5 12:49:38 UTC 2017


Quoting Michal Hocko (2017-06-05 13:26:30)
> On Mon 05-06-17 11:35:12, Chris Wilson wrote:
> > I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It
> > struggles with handling reclaim via kswapd (through an inconsistency within
> > throttle_direct_reclaim(), and even then the race between multiple
> > allocators makes the two-step of reclaim-then-allocate fragile), and as
> > our buffers are always dirty (with very few exceptions), we required
> > kswapd to perform pageout on them. The only effective means of waiting
> > on kswapd is to retry the allocations (i.e. not set __GFP_NORETRY). That
> > leaves us with the dilemma of invoking the oomkiller instead of
> > propagating the allocation failure back to userspace where it can be
> > handled more gracefully (one hopes).  In the future we may have
> > __GFP_MAYFAIL to allow repeats up until we genuinely run out of memory
> > and the oomkiller would have been invoked. Until then, let the oomkiller
> > wreak havoc.
> > 
> > v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL
> > 
> > Fixes: 24f8e00a8a2e ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
> > Testcase: igt/gem_tiled_swapping
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > Cc: Michal Hocko <mhocko at suse.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 7286f5dd3e64..845df6067e90 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2406,7 +2406,20 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> >                       if (!*s) {
> >                               /* reclaim and warn, but no oom */
> >                               gfp = mapping_gfp_mask(mapping);
> > -                             gfp |= __GFP_NORETRY;
> > +
> > +                             /* Our bo are always dirty and so we require
> > +                              * kswapd to reclaim our pages (direct reclaim
> > +                              * performs no swapping on its own). However,
> 
> Not sure whether this is exactly what you mean. The only pageout direct
> reclaim is allowed to do is to the swap partition (so anonymous and
> shmem pages). So the above is not 100% correct.

Hmm, I didn't see anything that allows direct reclaim to perform
writeback into swap. The issue for us (i915) is that our buffers are
almost exclusively dirty, so even after we unpin them, in order to make
room they need to be paged out. Afaict, throttle_direct_reclaim() is
supposed to be the point at which direct reclaim waits for writeback via
kswapd and doesn't invoke writeback directly. throttle_direct_reclaim()
never waited as allow_direct_reclaim() kept reporting true even after a
direct reclaim failure. Without __GFP_NORETRY we were effectively
busy-spinning while kswapd made progress (and so avoided the failure).
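
For reference, as far as I can tell that check boils down to roughly the
sketch below -- paraphrased from memory rather than quoting mm/vmscan.c,
and allow_direct_reclaim_sketch is just an illustrative name. The point
is that a direct reclaimer is only put to sleep waiting on kswapd once
free pages across the node drop below half of the min-watermark reserve;
above that threshold it never waits at all.

/* Paraphrased sketch (approximate, not the literal mm/vmscan.c) of the
 * decision behind throttle_direct_reclaim(): direct reclaim only sleeps
 * on pgdat->pfmemalloc_wait, i.e. actually waits for kswapd, when this
 * returns false.
 */
static bool allow_direct_reclaim_sketch(pg_data_t *pgdat)
{
	unsigned long reserve = 0, free = 0;
	int i;

	for (i = 0; i <= ZONE_NORMAL; i++) {
		struct zone *zone = &pgdat->node_zones[i];

		if (!managed_zone(zone) || !zone_reclaimable_pages(zone))
			continue;

		reserve += min_wmark_pages(zone);
		free += zone_page_state(zone, NR_FREE_PAGES);
	}

	if (!reserve)
		return true;

	/* only throttle once free pages fall below half the min reserve */
	return free > reserve / 2;
}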

> 
> > +                              * direct reclaim is meant to wait for kswapd
> > +                              * when under pressure, this is broken. As a
> > +                              * result __GFP_RECLAIM is unreliable and fails
> > +                              * to actually reclaim dirty pages -- unless
> > +                              * you try over and over again with
> > +                              * !__GFP_NORETRY.
> 
> Yes, I would just mention that heavy parallel allocations might result
> in __GFP_NORETRY failures quite easily. I believe this is a bigger
> problem than your remark about dirty buffers (well, assuming your
> buffers are not filling up the whole of memory).

Yup, parallel allocations hit a problem where one thread was consuming
the successful reclaim of another and so we were failing faster. Our
tests tend to exercise i915 exclusively (mlocking the rest of memory),
so what happens when i915 fills all of memory is exactly the case we
concentrate on.
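
To spell the race out: the two-step of reclaim-then-allocate is not
atomic, so something like the sketch below (illustrative only --
shrink_our_buffers() is a made-up stand-in, this is not the i915 code)
can still fail under __GFP_NORETRY even though our own reclaim freed
plenty of pages:

static struct page *
alloc_after_manual_reclaim(struct address_space *mapping, pgoff_t index,
			   gfp_t gfp)
{
	struct page *page;

	page = shmem_read_mapping_page_gfp(mapping, index,
					   gfp | __GFP_NORETRY);
	if (!IS_ERR(page))
		return page;

	/* step 1: make room by reaping our own buffers */
	shrink_our_buffers();

	/* step 2: retry.  A concurrent allocator may already have taken
	 * the pages freed in step 1, and __GFP_NORETRY gives up instead
	 * of waiting for kswapd to make further progress.
	 */
	return shmem_read_mapping_page_gfp(mapping, index,
					   gfp | __GFP_NORETRY);
}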

I keep meaning to get around to adding extra pressure (find /, or
find / -xdev -exec cat '{}' \;) to those tests.
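
And to make the net effect of v2 concrete for anyone skimming the
trimmed hunk above: on the final attempt we now take the mapping's gfp
mask as-is, without or'ing in __GFP_NORETRY. A rough sketch of the end
state (not the full diff):

	/* Last resort: without __GFP_NORETRY the allocator keeps
	 * retrying, which is our only means of waiting on kswapd, at the
	 * cost of the oomkiller eventually being invoked instead of
	 * ENOMEM being reported back to userspace.
	 */
	gfp = mapping_gfp_mask(mapping);
	page = shmem_read_mapping_page_gfp(mapping, i, gfp);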
-Chris

