[Intel-gfx] [PATCH v5 3/4] drm/i915/bdw: Pin the context backing objects to GGTT on-demand

Daniel Vetter daniel at ffwll.ch
Tue Nov 18 16:11:19 CET 2014


On Tue, Nov 18, 2014 at 02:51:52PM +0000, Daniel, Thomas wrote:
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel
> > Vetter
> > Sent: Tuesday, November 18, 2014 2:33 PM
> > To: Daniel, Thomas
> > Cc: Daniel Vetter; intel-gfx at lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH v5 3/4] drm/i915/bdw: Pin the context
> > backing objects to GGTT on-demand
> > 
> > On Tue, Nov 18, 2014 at 10:48:09AM +0000, Daniel, Thomas wrote:
> > > > -----Original Message-----
> > > > From: Intel-gfx [mailto:intel-gfx-bounces at lists.freedesktop.org] On
> > > > Behalf Of Daniel, Thomas
> > > > Sent: Tuesday, November 18, 2014 9:28 AM
> > > > To: Daniel Vetter
> > > > Cc: intel-gfx at lists.freedesktop.org
> > > > Subject: Re: [Intel-gfx] [PATCH v5 3/4] drm/i915/bdw: Pin the
> > > > context backing objects to GGTT on-demand
> > > >
> > > > > -----Original Message-----
> > > > > From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of
> > > > > Daniel Vetter
> > > > > Sent: Monday, November 17, 2014 6:09 PM
> > > > > To: Daniel, Thomas
> > > > > Cc: intel-gfx at lists.freedesktop.org
> > > > > Subject: Re: [Intel-gfx] [PATCH v5 3/4] drm/i915/bdw: Pin the
> > > > > context backing objects to GGTT on-demand
> > > > >
> > > > > On Thu, Nov 13, 2014 at 10:28:10AM +0000, Thomas Daniel wrote:
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > b/drivers/gpu/drm/i915/i915_drv.h index 059330c..3c7299d 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > @@ -655,6 +655,7 @@ struct intel_context {
> > > > > >  	struct {
> > > > > >  		struct drm_i915_gem_object *state;
> > > > > >  		struct intel_ringbuffer *ringbuf;
> > > > > > +		int unpin_count;
> > > > >
> > > > > Pinning is already refcounted. Why this additional refcount?
> > > >
> > > > The vma.pin_count is only allocated 4 bits of storage.  If this
> > > > restriction can be lifted then I can use that.
> > 
> > Those 4 bits are good enough for legacy contexts, so I wonder a bit what's so
> > massively different for execlist contexts.
> With execlists, in order to dynamically unpin the LRC backing object and
> ring buffer object when not required we take a reference for each
> execlist request that uses them (remember that the execlist request
> lifecycle is currently different from the execbuffer request).  This can
> be a lot, especially in some of the less sane i-g-t tests.

Why?

Presuming the buffer object is properly pushed onto the active list, you
only need to pin while doing the command submission, up to the point where
you've committed the buffer object to the active list.
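To illustrate the pattern, here is a standalone toy model in plain C (not the
actual i915 code; `toy_obj`, `move_to_active`, `retire` and friends are
invented stand-ins): pin only across submission, let the active list hold its
own long-lived reference, and drop the pin as soon as the object is committed.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_obj {
	int pin_count;   /* transient: held only across submission */
	bool on_active;  /* the active list holds the long-lived reference */
};

static void pin(struct toy_obj *o)   { o->pin_count++; }
static void unpin(struct toy_obj *o) { assert(o->pin_count > 0); o->pin_count--; }

/* Committing to the active list takes its own reference ... */
static void move_to_active(struct toy_obj *o) { o->on_active = true; }
/* ... which request retirement drops, however many requests were queued. */
static void retire(struct toy_obj *o) { o->on_active = false; }

static void submit_request(struct toy_obj *ctx_obj)
{
	pin(ctx_obj);            /* keep it resident while building the request */
	/* ... emit commands referencing ctx_obj ... */
	move_to_active(ctx_obj); /* active list now keeps it alive */
	unpin(ctx_obj);          /* pin dropped: count stays bounded at 1 */
}
```

Driving submit_request() in a loop, even tens of thousands of times, never
pushes pin_count above 1, which is the reason a 4-bit vma.pin_count can be
enough.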

I know documentation sucks for this stuff since I have this discussion
with roughly everyone ever touching anything related to active buffers :(
If you want some recent examples the cmd parser's shadow batch should
serve well (including the entire evolution from reinvented wheel to just
using the active list, although the latest patches are only 90% there and
still have 1-2 misplaced pieces).

> > > Actually I just tried to implement this, it causes a problem for patch
> > > 4 of this set as the unpin_count is also used for the ringbuffer
> > > object which has an ioremap as well as a ggtt pin.
> > 
> > Yeah, ioremap needs to be redone every time we pin/unpin. But on sane
> > archs it's almost no overhead really. And if this does start to matter (shudder
> > for 32bit kernels on gen8) then we can fix it ...
> Hm, so the CPU vaddr of the ring buffer will move around as more
> requests reference it, which I suppose is not a problem.  We will use a
> lot of address space (again, especially with the i-g-t stress tests
> which can submit tens of thousands of requests in a very short space of
> time).  What would the fix be?  An extra reference count for the
> ioremap?  Looks familiar :)

ioremap always gives you the same linear address on 64bit kernels. On
32bit it makes a new mapping each time, but if you ioremap for each request
it'll fall over anyway. The solution would be to map just the required
pages using the atomic kmap helpers wrapped up in the io_mapping
infrastructure.
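A standalone sketch of that idea in plain C (a toy model, not the real
io_mapping API; `toy_map_page`/`toy_unmap_page` are invented stand-ins for
io_mapping_map_atomic_wc()/io_mapping_unmap_atomic()): map only the page
currently being written, so the mapped footprint stays one page no matter how
large the ring is or how many requests are queued.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096

struct toy_mapping {
	unsigned char *base;  /* stands in for the aperture backing store */
	int atomic_maps;      /* sanity check: at most one page mapped at once */
};

static void *toy_map_page(struct toy_mapping *m, size_t pageno)
{
	assert(m->atomic_maps == 0); /* atomic kmaps must not nest here */
	m->atomic_maps++;
	return m->base + pageno * PAGE_SIZE;
}

static void toy_unmap_page(struct toy_mapping *m)
{
	assert(m->atomic_maps == 1);
	m->atomic_maps--;
}

/* Write into the ring one page at a time instead of ioremapping it whole. */
static void ring_write(struct toy_mapping *m, size_t off,
		       const void *src, size_t len)
{
	const unsigned char *p = src;

	while (len) {
		size_t pageno = off / PAGE_SIZE;
		size_t space  = PAGE_SIZE - off % PAGE_SIZE;
		size_t n      = len < space ? len : space;
		unsigned char *va = toy_map_page(m, pageno);

		memcpy(va + off % PAGE_SIZE, p, n);
		toy_unmap_page(m);
		off += n; p += n; len -= n;
	}
}
```

Writes that straddle a page boundary are split into per-page chunks, each
mapped and unmapped in turn, which is what keeps the address-space cost
constant on 32bit kernels.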

> I still think it's best to keep the context unpin_count for execlists mode.

Well, that just means the todo-list to fix up execlists grows longer.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


