[Intel-gfx] [RFC 1/2] drm/i915: Pre-allocation of shmem pages of a GEM object
Chris Wilson
chris at chris-wilson.co.uk
Thu Aug 6 04:30:12 PDT 2015
On Sun, May 04, 2014 at 04:48:24PM +0530, akash.goel at intel.com wrote:
> From: Akash Goel <akash.goel at intel.com>
>
> This patch could help reduce the time 'struct_mutex' is kept locked
> during either the exec-buffer path or the page-fault handling path,
> as the backing pages are now requested from the shmem layer without
> holding 'struct_mutex'.
I'm not keen on having an extra pass inside execbuffer for what should
be relatively rare, but in discussion the idea came up of adding a flag
to create2 (I915_CREATE_POPULATE), which should cover the use case
nicely (paying the shmemfs allocation+zero price without holding
struct_mutex).
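Roughly what I have in mind, as a sketch only - the create2 uapi struct
and handler plumbing below are illustrative, not the actual interface;
only the I915_CREATE_POPULATE bit itself is what is being proposed:

	#define I915_CREATE_POPULATE (1 << 0) /* hypothetical flag value */

	int i915_gem_create2_ioctl(struct drm_device *dev, void *data,
				   struct drm_file *file)
	{
		struct drm_i915_gem_create2 *args = data; /* hypothetical uapi */
		struct drm_i915_gem_object *obj;
		int ret;

		obj = i915_gem_alloc_object(dev, args->size);
		if (obj == NULL)
			return -ENOMEM;

		/* Pay the shmemfs allocation+zero price up front, before
		 * the object is ever touched under struct_mutex. */
		if (args->flags & I915_CREATE_POPULATE)
			__i915_gem_object_shmemfs_populate(obj);

		ret = drm_gem_handle_create(file, &obj->base, &args->handle);
		drm_gem_object_unreference_unlocked(&obj->base);
		return ret;
	}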
> +void
> +i915_gem_object_shmem_preallocate(struct drm_i915_gem_object *obj)
This is unlocked. We have to be loud and clear about that:
static void __i915_gem_object_shmemfs_populate(obj)
> +{
> + int page_count, i;
> + struct address_space *mapping;
> + struct page *page;
> + gfp_t gfp;
> +
> + if (obj->pages)
> + return;
if (READ_ONCE(obj->pages)) return;
> + if (obj->madv != I915_MADV_WILLNEED) {
> + DRM_ERROR("Attempting to preallocate a purgeable object\n");
Ideally if (READ_ONCE(obj->madv)...
However, that requires a patch to move madv to unsigned obj->flags.
A user controllable *ERROR*? No, just make it debug.
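For the madv move, something along these lines (sketch only; the bit
layout is made up):

	/* hypothetical: fold madv into an unsigned obj->flags word so it
	 * can be sampled without struct_mutex */
	#define I915_BO_MADV_DONTNEED	(1 << 0)
	#define I915_BO_MADV_PURGED	(1 << 1)

	static inline bool
	i915_gem_object_is_willneed(struct drm_i915_gem_object *obj)
	{
		return (READ_ONCE(obj->flags) &
			(I915_BO_MADV_DONTNEED | I915_BO_MADV_PURGED)) == 0;
	}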
> + return;
> + }
> +
> + if (obj->base.filp) {
> + int ret;
> + struct inode *inode = file_inode(obj->base.filp);
> + struct shmem_inode_info *info = SHMEM_I(inode);
> + if (!inode)
> + return;
Assume it exists. If it doesn't, it is an internal bug, so we should
just GPF out of here.
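i.e. (sketch) drop the check and use it directly:

	struct shmem_inode_info *info = SHMEM_I(file_inode(obj->base.filp));

If filp/inode were NULL here that would be an internal bug, and the
resulting oops is the report we want.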
> + /* The alloced field stores how many data pages are allocated
> + * to the file. If already shmem space has been allocated for
> + * the object, then we can simply return */
> + spin_lock(&info->lock);
> + ret = info->alloced;
> + spin_unlock(&info->lock);
> + if (ret > 0) {
> + DRM_DEBUG_DRIVER("Already shmem space alloced for obj %p, %d pages\n",
> + obj, ret);
> + return;
> + }
I'm not convinced this is of much use though. If the page is already
allocated, shmem_read_mapping_page_gfp() should be quick. Given the
suggestion that we move this to create, it is clear that this won't be
an issue.
> + } else
> + return;
> +
> + BUG_ON(obj->pages_pin_count);
Requires struct_mutex; ignore.
> +
> + /* Assert that the object is not currently in any GPU domain. As it
> + * wasn't in the GTT, there shouldn't be any way it could have been in
> + * a GPU cache
> + */
> + BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> + BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
Requires struct_mutex; ignore.
> + trace_i915_gem_obj_prealloc_start(obj, obj->base.size);
You only need to pass obj to the tracer; it can work out the size
itself.
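i.e. something like (sketch; the event name follows the patch, the
field layout follows the style of the existing i915 object
tracepoints):

	TRACE_EVENT(i915_gem_obj_prealloc_start,
		    TP_PROTO(struct drm_i915_gem_object *obj),
		    TP_ARGS(obj),

		    TP_STRUCT__entry(
			    __field(struct drm_i915_gem_object *, obj)
			    __field(u32, size)
			    ),

		    TP_fast_assign(
			    __entry->obj = obj;
			    __entry->size = obj->base.size;
			    ),

		    TP_printk("obj=%p, size=%u", __entry->obj, __entry->size)
	);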
> + page_count = obj->base.size / PAGE_SIZE;
> +
> + /* Get the list of pages out of our struct file
> + * Fail silently without starting the shrinker
> + */
> + mapping = file_inode(obj->base.filp)->i_mapping;
> + gfp = mapping_gfp_mask(mapping);
> + gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
> + gfp &= ~(__GFP_IO | __GFP_WAIT);
Interesting. Disabling shrinker/oom for these pages - we will hit it
later of course. But for the purposes of a quick preallocation, seems
fine.
> + for (i = 0; i < page_count; i++) {
> + page = shmem_read_mapping_page_gfp(mapping, i, gfp);
> + if (IS_ERR(page)) {
> + DRM_DEBUG_DRIVER("Failure for obj(%p), size(%x) at page(%d)\n",
> + obj, obj->base.size, i);
> + break;
> + }
> + /* Decrement the extra ref count on the returned page,
> + otherwise when 'get_pages_gtt' will be called later on
> + in the regular path, it will also increment the ref count,
> + which will disturb the ref count management */
> + page_cache_release(page);
> + }
> +
> + trace_i915_gem_obj_prealloc_end(obj);
> +}
-Chris
--
Chris Wilson, Intel Open Source Technology Centre