[PATCH 1/3] drm: create unified vma offset manager
Daniel Vetter
daniel at ffwll.ch
Wed Dec 19 02:01:14 PST 2012
On Wed, Dec 19, 2012 at 09:22:26AM +0000, Chris Wilson wrote:
> On Wed, 19 Dec 2012 11:56:18 +1000, Dave Airlie <airlied at gmail.com> wrote:
> > From: Dave Airlie <airlied at redhat.com>
> >
> > So we have two offset manager implementations for dealing with VMA
> > offsets: GEM had one using a hash table, TTM had one using an rbtree.
> >
> > I'd rather we just had one to rule them all; since TTM is using the
> > rbtree variant to allow sub-mappings, we need to use that one to keep
> > the ABI.
> >
> > This also adds a bunch of inline helpers to avoid gem/ttm poking around
> > inside the vma_offset objects. TTM needs a reset helper to remove a vma_offset
> > when it copies an object to another one for buffer moves. The helpers
> > also let drivers avoid the map_list.hash_key << PAGE_SHIFT nonsense.
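(For reference, such a helper boils down to something like the sketch
below - this is going by the description only, the exact names in the
patch may differ:)

static inline u64 drm_vma_node_offset_addr(struct drm_vma_offset_node *node)
{
	/* byte offset to hand to mmap(), instead of open-coding
	 * map_list.hash_key << PAGE_SHIFT in every driver */
	return ((u64)node->vm_node->start) << PAGE_SHIFT;
}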
>
> Any clue as to the performance difference between the two
> implementations? What does it add to the cost of a pagefault?
>
Hmm, don't have an i-g-t handy for scalability testing of the fault
handlers.
>
> > +int drm_vma_offset_setup(struct drm_vma_offset_manager *man,
> > + struct drm_vma_offset_node *node,
> > + unsigned long num_pages)
> > +{
> > + int ret;
> > +
> > +retry_pre_get:
> > + ret = drm_mm_pre_get(&man->addr_space_mm);
> > + if (unlikely(ret != 0))
> > + return ret;
> > +
> > + write_lock(&man->vm_lock);
> > + node->vm_node = drm_mm_search_free(&man->addr_space_mm,
> > + num_pages, 0, 0);
> > +
> > + if (unlikely(node->vm_node == NULL)) {
> > + ret = -ENOMEM;
> ret = -ENOSPC;
>
> Depended upon by the higher layers for determining when to purge their
> caches; i-g-t/gem_mmap_offset_exhaustion
The larger topic is that the drm_mm offset space is only 32 bits wide on
32-bit platforms, and we routinely exhausted it after a few weeks of
uptime. Or rather: we _did_ exhaust it, until we added tons of checks in
both the kernel and libdrm to reap cached objects when the allocation
fails. Hence it's paramount for our code to get a -ENOSPC so it can
engage in mmap offset reaping.
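For context, the reaping dance in i915 looks roughly like this
(simplified and from memory, not the literal upstream code) - note that
only -ENOSPC triggers the shrink-and-retry, anything else is just passed
up:

static int create_mmap_offset(struct drm_i915_gem_object *obj)
{
	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
	int ret;

	ret = drm_gem_create_mmap_offset(&obj->base);
	if (ret != -ENOSPC)
		return ret;

	/* Offset space exhausted: reap purgeable objects and retry. */
	i915_gem_purge(dev_priv, obj->base.size >> PAGE_SHIFT);
	ret = drm_gem_create_mmap_offset(&obj->base);
	if (ret != -ENOSPC)
		return ret;

	/* Still full: drop everything we can and try one last time. */
	i915_gem_shrink_all(dev_priv);
	return drm_gem_create_mmap_offset(&obj->base);
}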
> > + goto out_unlock;
> > + }
> > +
> > + node->vm_node = drm_mm_get_block_atomic(node->vm_node,
> > + num_pages, 0);
> > +
>
> I'd feel happier if this tried to only take from the preallocated pool
> rather than actually try a GFP_ATOMIC allocation.
Seconded. Imo drm_mm_pre_get should just die - it artificially limits the
prealloc pool to 4 nodes, which means you'll fall over if 5 threads enter
this path concurrently. And with the drm_mm rework of about a year ago
it's no longer required: you can simply embed the struct drm_mm_node
wherever you want, and drm_mm won't allocate anything on top of that. Or
preallocate the drm_mm_node in your code, outside of the locks. Inserting
the node then happens with the drm_mm_insert* functions, which combine
search_free + get_block (since really, there's nothing interesting you
can do with the hole space you get back from search_free). See e.g.
http://cgit.freedesktop.org/~danvet/drm-intel/commit/?id=dc9dd7a20fde95aa81a8307cde79c2dff9f83f3d
for the conversion. My efforts from a while back to convert everyone have
stalled a bit in the ttm code, but now that i915 is converted and
(hopefully) the mmap offset stuff too, I'll pick this up again. That
would allow us to kill quite a bit of cruft from the drm_mm interfaces.
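Concretely, the setup function above could then lose the whole
pre_get/search_free/get_block dance; rough sketch (assuming vm_node
becomes an embedded struct drm_mm_node, with the rbtree bookkeeping
hand-waved into a hypothetical helper):

int drm_vma_offset_setup(struct drm_vma_offset_manager *man,
			 struct drm_vma_offset_node *node,
			 unsigned long num_pages)
{
	int ret;

	write_lock(&man->vm_lock);
	/* node->vm_node is embedded now, so nothing is allocated under
	 * the lock and there's no -ENOMEM to worry about; search_free +
	 * get_block collapse into one call, which returns -ENOSPC when
	 * the offset space is full. */
	ret = drm_mm_insert_node(&man->addr_space_mm, &node->vm_node,
				 num_pages, 0);
	if (ret == 0)
		drm_vma_offset_insert_rb(man, node); /* hypothetical rb hook */
	write_unlock(&man->vm_lock);

	return ret;
}

That would also give the -ENOSPC Chris asked for above, for free.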
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch