[Intel-gfx] [PATCH] drm/i915: Do not add an interrupt for a context switch

Thu Aug 8 01:10:22 CEST 2013

On Wed, Aug 07, 2013 at 11:58:21PM +0200, Daniel Vetter wrote:
> On Wed, Aug 07, 2013 at 10:29:05PM +0100, Chris Wilson wrote:
> > We use the request to ensure we hold a reference to the context for the
> > duration that it remains in use by the ring. Each request only holds a
> > reference to the current context, hence we emit a request after
> > switching contexts with the final reference to the old context. However,
> > the extra interrupt caused by that request is not useful (no timing
> > critical function will wait for the context object), instead the overhead
> > of servicing the IRQ shows up in some (lightweight) benchmarks. In order
> > to keep the useful property of using the request to manage the context
> > lifetime, we want to add a dummy request that is associated with the
> > interrupt from the subsequent real request following the batch.
> > 
> > The extra interrupt was added as a side-effect of using
> > i915_add_request() in
> > 
> > commit 112522f6789581824903f6f72082b5b841a7f0f9
> > Author: Chris Wilson <chris at chris-wilson.co.uk>
> > Date:   Thu May 2 16:48:07 2013 +0300
> > 
> >     drm/i915: put context upon switching
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Ben Widawsky <ben at bwidawsk.net>
> > Cc: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h         |  1 +
> >  drivers/gpu/drm/i915/i915_gem.c         | 32 +++++++++++++++++++++++++++++++-
> >  drivers/gpu/drm/i915/i915_gem_context.c |  2 +-
> >  3 files changed, 33 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 9a8fac3..e822390 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1873,6 +1873,7 @@ int __i915_add_request(struct intel_ring_buffer *ring,
> >  		       struct drm_file *file,
> >  		       struct drm_i915_gem_object *batch_obj,
> >  		       u32 *seqno);
> > +int __i915_add_request_marker(struct intel_ring_buffer *ring);
> >  #define i915_add_request(ring, seqno) \
> >  	__i915_add_request(ring, NULL, NULL, seqno)
> >  int __must_check i915_wait_seqno(struct intel_ring_buffer *ring,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 0fa45e0..7568bf0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2110,6 +2110,37 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
> >  	return 0;
> >  }
> >  
> > +int
> > +__i915_add_request_marker(struct intel_ring_buffer *ring)
> > +{
> > +	struct drm_i915_gem_request *request;
> > +
> > +	request = kmalloc(sizeof(*request), GFP_KERNEL);
> > +	if (request == NULL)
> > +		return -ENOMEM;
> > +
> > +	request->ring = ring;
> > +	request->seqno = intel_ring_get_seqno(ring);
> > +	request->tail = request->head = intel_ring_get_tail(ring);
> > +	request->batch_obj = NULL;
> > +
> > +	/* Whilst this request exists, batch_obj will be on the
> > +	 * active_list, and so will hold the active reference. Only when this
> > +	 * request is retired will the the batch_obj be moved onto the
> > +	 * inactive_list and lose its active reference. Hence we do not need
> > +	 * to explicitly hold another reference here.
> > +	 */
> > +	request->ctx = ring->last_context;
> > +	if (request->ctx)
> > +		i915_gem_context_reference(request->ctx);
> > +
> > +	request->emitted_jiffies = jiffies;
> > +	list_add_tail(&request->list, &ring->request_list);
> > +	request->file_priv = NULL;
> > +
> > +	return 0;
> > +}
> > +
> >  int __i915_add_request(struct intel_ring_buffer *ring,
> >  		       struct drm_file *file,
> >  		       struct drm_i915_gem_object *obj,
> > @@ -2137,7 +2168,6 @@ int __i915_add_request(struct intel_ring_buffer *ring,
> >  	if (request == NULL)
> >  		return -ENOMEM;
> >  
> > -
> >  	/* Record the position of the start of the request so that
> >  	 * should we detect the updated seqno part-way through the
> >  	 * GPU processing the request, we never over-estimate the
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index db94aca..88a9425 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -466,7 +466,7 @@ static int do_switch(struct i915_hw_context *to)
> >  		from->obj->dirty = 1;
> >  		BUG_ON(from->obj->ring != ring);
> >  
> > -		ret = i915_add_request(ring, NULL);
> > +		ret = __i915_add_request_marker(ring);
> 
> Can't we just ditch the add_request here? The actual backing storage of
> the old context won't get reaped too early (due to the oustanding lazy
> request logic) and as long as we don't schedule a batch we don't care one
> bit that there's no request with a context linked to it. Only once we emit
> batches we care, since otherwise the hangcheck code can't assign the
> blame.

The request holds the last reference to the context->obj, and each
request only has a single context slot.  The olr and next batch only
preserves ring->last_context which is now the new context. So we insert
this dummy context into the request queue for the sole purpose of
holding the reference until the real request (and new context) adds the
breadcrumb.

> Ripping out the add_request would also allow us to ditch the scary WARN +
> mi_set_context stuff below.

I had exactly the same though and gleefully ripped it before reading the
original commit and the penny dropped.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre