[Intel-gfx] [PATCH 1/3] drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl

Chris Wilson chris at chris-wilson.co.uk
Mon Mar 6 12:48:35 UTC 2017


On Mon, Mar 06, 2017 at 12:24:46PM +0000, Tvrtko Ursulin wrote:
> 
> On 06/03/2017 09:29, Chris Wilson wrote:
> >Before we instantiate/pin the backing store for our use, we
> >can prepopulate the shmemfs filp efficiently using the a
> >write into the pagecache. We avoid the penalty of instantiating
> >all the pages, important if the user is just writing to a few
> >and never uses the object on the GPU, and using a direct write
> >into shmemfs allows it to avoid the cost of retrieving a page
> >(either swapin or clearing-before-use) before it is overwritten.
> >
> >This can be extended later to provide additional specialisation for
> >other backends (other than shmemfs).
> >
> >References: https://bugs.freedesktop.org/show_bug.cgi?id=99107
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Matthew Auld <matthew.william.auld at gmail.com>
> >Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> >---
> > drivers/gpu/drm/i915/i915_gem.c        | 78 ++++++++++++++++++++++++++++++++++
> > drivers/gpu/drm/i915/i915_gem_object.h |  3 ++
> > 2 files changed, 81 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 7c20601fe1de..73d5cee23dd7 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -1452,6 +1452,12 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> >
> > 	trace_i915_gem_object_pwrite(obj, args->offset, args->size);
> >
> >+	ret = -ENODEV;
> >+	if (obj->ops->pwrite)
> >+		ret = obj->ops->pwrite(obj, args);
> 
> ops->pwrite_nowait or some good name to signify the conditions?
> 
> I am not sure that the obj->mm.pages check can be in the vfunc since
> this call happens before the i915_gem_object_wait. Which to me
> sounds like we should be explicit at this level that the ops->pwrite
> is not allowed to run if obj->mm.pages has already been
> instantiated.

That doesn't make sense to me. There's no reason that the backend can't
handle the wait, or understand special cases for when there are pages.
 
> Or if this is only true for the normal objects, and the future
> specialisation may become full, then the standard pattern of
> 
> 	if (obj->ops->pwrite)
> 		ret = obj->ops->pwrite();
> 	else
> 		ret = default_obj_pwrite();
> 
> ?

Yeah, is kind of where I think we will end up. However, pwrite is of
limited usefulness that I think we should do this case by case. Getting
rid the the shmem/phys special cases would be nice.

The sketch I have is that each backend has something like:

	err = fast_pwrite();
	if (err != -ENODEV)
		return err;

	err = default_gtt_pwrite();
	if (err != -ENODEV)
		return err;

	return slow_pwrite();

The challenge is trying to limit the duplication of the common steps to
avoid struct_mutex. (Ideally by eliminating struct_mutex entirely from
this path.)

> In which case ops->pwrite would be the correct name to keep.
> 
> The ops->pwrite for the shmemfs objects would then become
> i915_gem_object_pwrite_gtt from below followed by
> default_obj_pwrite.
> 
> I don't know, maybe I am complicating it too much?

For the moment, yes :) This was a quick hack to hide a regression - the
real bug fix will be breaking struct_mutex out of the shrinker, I think,
and there's some nasty behaviour to resolve when we do hit the shrinker
the whole object page-in/page-out behaviour is much, much slower than
what should be the equivalent with individual pages via shmemfs. The
bonus here is that shmemfs can avoid clearing/swapping-in the page if
knows it will be completely overwritten, which makes the patch useful on
its own merit.

> >+	if (ret != -ENODEV)
> >+		goto err;
> >+
> > 	ret = i915_gem_object_wait(obj,
> > 				   I915_WAIT_INTERRUPTIBLE |
> > 				   I915_WAIT_ALL,
> >@@ -2575,6 +2581,75 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
> > 	goto out_unlock;
> > }
> >
> >+static int
> >+i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
> 
> _shmem ? Or _nowait_shmem depending on the outcome of the above.
> Just think _gtt is wrong. Or keep it for consistency since get/put
> pages for shmem objects are also called _gtt at the moment, so until
> that is changed?

Right, until we actually move this to i915_gem_object_shmemfs.c (or just
i915_gem_shmemfs) I choose _gtt to match the existing nomenclature.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list