[Intel-gfx] [PATCH 1/3] drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl
Chris Wilson
chris at chris-wilson.co.uk
Mon Mar 6 12:48:35 UTC 2017
On Mon, Mar 06, 2017 at 12:24:46PM +0000, Tvrtko Ursulin wrote:
>
> On 06/03/2017 09:29, Chris Wilson wrote:
> >Before we instantiate/pin the backing store for our use, we
> >can prepopulate the shmemfs filp efficiently using the a
> >write into the pagecache. We avoid the penalty of instantiating
> >all the pages, important if the user is just writing to a few
> >and never uses the object on the GPU, and using a direct write
> >into shmemfs allows it to avoid the cost of retrieving a page
> >(either swapin or clearing-before-use) before it is overwritten.
> >
> >This can be extended later to provide additional specialisation for
> >other backends (other than shmemfs).
> >
> >References: https://bugs.freedesktop.org/show_bug.cgi?id=99107
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Matthew Auld <matthew.william.auld at gmail.com>
> >Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> >---
> > drivers/gpu/drm/i915/i915_gem.c | 78 ++++++++++++++++++++++++++++++++++
> > drivers/gpu/drm/i915/i915_gem_object.h | 3 ++
> > 2 files changed, 81 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 7c20601fe1de..73d5cee23dd7 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -1452,6 +1452,12 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> >
> > trace_i915_gem_object_pwrite(obj, args->offset, args->size);
> >
> >+ ret = -ENODEV;
> >+ if (obj->ops->pwrite)
> >+ ret = obj->ops->pwrite(obj, args);
>
> ops->pwrite_nowait or some good name to signify the conditions?
>
> I am not sure that the obj->mm.pages check can be in the vfunc since
> this call happens before the i915_gem_object_wait. Which to me
> sounds like we should be explicit at this level that the ops->pwrite
> is not allowed to run if obj->mm.pages has already been
> instantiated.
That doesn't make sense to me. There's no reason that the backend can't
handle the wait, or understand special cases for when there are pages.
> Or if this is only true for the normal objects, and the future
> specialisation may become full, then the standard pattern of
>
> if (obj->ops->pwrite)
> ret = obj->ops->pwrite();
> else
> ret = default_obj_pwrite();
>
> ?
Yeah, is kind of where I think we will end up. However, pwrite is of
limited usefulness that I think we should do this case by case. Getting
rid the the shmem/phys special cases would be nice.
The sketch I have is that each backend has something like:
err = fast_pwrite();
if (err != -ENODEV)
return err;
err = default_gtt_pwrite();
if (err != -ENODEV)
return err;
return slow_pwrite();
The challenge is trying to limit the duplication of the common steps to
avoid struct_mutex. (Ideally by eliminating struct_mutex entirely from
this path.)
> In which case ops->pwrite would be the correct name to keep.
>
> The ops->pwrite for the shmemfs objects would then become
> i915_gem_object_pwrite_gtt from below followed by
> default_obj_pwrite.
>
> I don't know, maybe I am complicating it too much?
For the moment, yes :) This was a quick hack to hide a regression - the
real bug fix will be breaking struct_mutex out of the shrinker, I think,
and there's some nasty behaviour to resolve when we do hit the shrinker
the whole object page-in/page-out behaviour is much, much slower than
what should be the equivalent with individual pages via shmemfs. The
bonus here is that shmemfs can avoid clearing/swapping-in the page if
knows it will be completely overwritten, which makes the patch useful on
its own merit.
> >+ if (ret != -ENODEV)
> >+ goto err;
> >+
> > ret = i915_gem_object_wait(obj,
> > I915_WAIT_INTERRUPTIBLE |
> > I915_WAIT_ALL,
> >@@ -2575,6 +2581,75 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
> > goto out_unlock;
> > }
> >
> >+static int
> >+i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
>
> _shmem ? Or _nowait_shmem depending on the outcome of the above.
> Just think _gtt is wrong. Or keep it for consistency since get/put
> pages for shmem objects are also called _gtt at the moment, so until
> that is changed?
Right, until we actually move this to i915_gem_object_shmemfs.c (or just
i915_gem_shmemfs) I choose _gtt to match the existing nomenclature.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list