[Intel-gfx] [PATCH 04/22] drm/i915: Remove request retirement before each batch

Daniel Vetter daniel at ffwll.ch
Thu Jul 28 11:52:49 UTC 2016


On Thu, Jul 28, 2016 at 11:26:48AM +0100, Chris Wilson wrote:
> On Thu, Jul 28, 2016 at 11:54:07AM +0200, Daniel Vetter wrote:
> > On Wed, Jul 27, 2016 at 12:14:42PM +0100, Chris Wilson wrote:
> > > This reimplements the denial-of-service protection against igt from
> > > commit 227f782e4667 ("drm/i915: Retire requests before creating a new
> > > one") and transfers the stall from before each batch into get_pages().
> > > The issue is that the stall increases latency between batches, which
> > > in some cases (especially coupled with execlists) is detrimental to
> > > keeping the GPU well fed. We have also observed that retiring requests
> > > can itself free objects (and requests), and therefore makes a good
> > > first step when shrinking.
> > > 
> > > v2: Recycle objects prior to i915_gem_object_get_pages()
> > > v3: Remove the reference to the ring from i915_gem_retire_requests_ring()
> > > as it operates on an intel_engine_cs.
> > > 
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h            | 1 -
> > >  drivers/gpu/drm/i915/i915_gem.c            | 7 +++++--
> > >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 --
> > >  drivers/gpu/drm/i915/i915_gem_request.c    | 4 ++--
> > >  4 files changed, 7 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index fbda38f25c6b..2de3d16f7b80 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -3169,7 +3169,6 @@ struct drm_i915_gem_request *
> > >  i915_gem_find_active_request(struct intel_engine_cs *engine);
> > >  
> > >  void i915_gem_retire_requests(struct drm_i915_private *dev_priv);
> > > -void i915_gem_retire_requests_ring(struct intel_engine_cs *engine);
> > >  
> > >  static inline u32 i915_reset_counter(struct i915_gpu_error *error)
> > >  {
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index bf652dc88024..68dbe4f7940c 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2244,7 +2244,6 @@ int
> > >  i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> > >  {
> > >  	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
> > > -	const struct drm_i915_gem_object_ops *ops = obj->ops;
> > >  	int ret;
> > >  
> > >  	if (obj->pages)
> > > @@ -2257,7 +2256,10 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> > >  
> > >  	BUG_ON(obj->pages_pin_count);
> > >  
> > > -	ret = ops->get_pages(obj);
> > > +	/* Recycle as many active objects as possible first */
> > > +	i915_gem_retire_requests(dev_priv);
> > > +
> > > +	ret = obj->ops->get_pages(obj);
> > 
> > Why exactly do we need this?
> > - shmem objs already call shrink_all if they can't get at the memory
> > - everyone else doesn't care.
> 
> Because that fallback is very expensive and gives very poor cache
> utilisation. On average, the affected benchmarks are about 100x slower
> without it and show large variation between runs.
> 
> Everyone else either isn't allocating or has their own defense.
> 
> Otoh, the more aggressive shrinking is quite recent, more recent than
> this patch. But I stand by the measurements as they were made: this is
> the point at which utilisation mattered, even if we only worry about it
> again later when I need to remove the call.

Please add those numbers to the commit message; I think that without them
this particular change isn't well-justified enough.

> > Even if we need this in some cases it looks funny, since it splits the
> > memory cleanup between the caller and callee of get_pages.

At least for shmem we now have the retire_requests() call outside of the
get_pages vfunc, but the full shrink fallback inside it. For OCD reasons I
think it'd look better to have both inside the callback - assuming that we
need the retire_requests for shmem objects, and not really for any of the
others.

It just feels like the placement is somewhat ad hoc.
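
Roughly what I have in mind - an untested sketch from memory, assuming the
shmem backend is still i915_gem_object_get_pages_gtt() and eliding the
unchanged parts of the allocation loop:

static int
i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
{
	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
	struct address_space *mapping;
	struct page *page;
	int page_count, i;

	/* Recycle as many active objects as possible before we start
	 * allocating backing pages; retiring requests can itself free
	 * objects (and requests).
	 */
	i915_gem_retire_requests(dev_priv);

	page_count = obj->base.size / PAGE_SIZE;
	mapping = file_inode(obj->base.filp)->i_mapping;
	for (i = 0; i < page_count; i++) {
		page = shmem_read_mapping_page(mapping, i);
		if (IS_ERR(page)) {
			/* ... the existing i915_gem_shrink() and
			 * i915_gem_shrink_all() fallbacks stay here,
			 * next to the retire, so all of the memory
			 * recycling lives in one place ...
			 */
		}
		/* ... sg table assembly as before ... */
	}

	return 0;
}

That way i915_gem_object_get_pages() itself stays oblivious and each
backend decides for itself how hard to fight for memory.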
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

