[Intel-gfx] [PATCH 04/22] drm/i915: Remove request retirement before each batch
Chris Wilson
chris at chris-wilson.co.uk
Thu Jul 28 10:26:48 UTC 2016
On Thu, Jul 28, 2016 at 11:54:07AM +0200, Daniel Vetter wrote:
> On Wed, Jul 27, 2016 at 12:14:42PM +0100, Chris Wilson wrote:
> > This reimplements the denial-of-service protection against igt from
> > commit 227f782e4667 ("drm/i915: Retire requests before creating a new
> > one") and transfers the stall from before each batch into get_pages().
> > The issue is that the stall is increasing latency between batches which
> > is detrimental in some cases (especially coupled with execlists) to
> > keeping the GPU well fed. Also we have made the observation that retiring
> > requests can of itself free objects (and requests) and therefore makes
> > a good first step when shrinking.
> >
> > v2: Recycle objects prior to i915_gem_object_get_pages()
> > v3: Remove the reference to the ring from i915_gem_requests_ring() as it
> > operates on an intel_engine_cs.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 1 -
> > drivers/gpu/drm/i915/i915_gem.c | 7 +++++--
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 --
> > drivers/gpu/drm/i915/i915_gem_request.c | 4 ++--
> > 4 files changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index fbda38f25c6b..2de3d16f7b80 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -3169,7 +3169,6 @@ struct drm_i915_gem_request *
> > i915_gem_find_active_request(struct intel_engine_cs *engine);
> >
> > void i915_gem_retire_requests(struct drm_i915_private *dev_priv);
> > -void i915_gem_retire_requests_ring(struct intel_engine_cs *engine);
> >
> > static inline u32 i915_reset_counter(struct i915_gpu_error *error)
> > {
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index bf652dc88024..68dbe4f7940c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2244,7 +2244,6 @@ int
> > i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> > {
> > struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
> > - const struct drm_i915_gem_object_ops *ops = obj->ops;
> > int ret;
> >
> > if (obj->pages)
> > @@ -2257,7 +2256,10 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> >
> > BUG_ON(obj->pages_pin_count);
> >
> > - ret = ops->get_pages(obj);
> > + /* Recycle as many active objects as possible first */
> > + i915_gem_retire_requests(dev_priv);
> > +
> > + ret = obj->ops->get_pages(obj);
>
> Why exactly do we need this?
> - shmem objs already call shrink_all if they can't get at the memory
> - everyone else doesn't care.
Because that fallback is very expensive and gives very poor utilisation of
the caches. On average, the affected benchmarks are about 100x slower
without this and show large run-to-run variation.
Everyone else either isn't allocating here or has their own defense.
Otoh, the more aggressive shrinking is quite recent, more recent than
this patch. But I stand by the measurements as they were made: this is
the point at which utilisation mattered, even if that only means worrying
about it again later when I need to remove the call.
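For reference, the post-patch flow of i915_gem_object_get_pages() looks
roughly like this (a simplified sketch of the hunk quoted above; the
intermediate checks and error handling elided by the hunk context are
omitted here as well):

	int
	i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
	{
		struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
		int ret;

		/* Backing pages already allocated, nothing to do. */
		if (obj->pages)
			return 0;

		/* Pages must not be pinned while they are absent. */
		BUG_ON(obj->pages_pin_count);

		/* Recycle as many active objects (and their requests) as
		 * possible before asking the backing store for new pages,
		 * instead of stalling in execbuffer before every batch.
		 */
		i915_gem_retire_requests(dev_priv);

		ret = obj->ops->get_pages(obj);
		if (ret)
			return ret;

		return 0;
	}

The point being that execbuffer no longer retires before every batch; the
retirement happens lazily, only when an object actually needs new backing
pages.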
> Even if we need this in some cases it looks funny, since it splits the
> memory cleanup between caller and callee of get_pages.
?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre