[Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable
Daniel Vetter
daniel at ffwll.ch
Wed Nov 9 11:28:35 UTC 2016
On Fri, Nov 04, 2016 at 08:32:56PM +0530, akash.goel at intel.com wrote:
> From: Chris Wilson <chris at chris-wilson.co.uk>
>
> On a long run of more than 2-3 days, physical memory tends to become
> severely fragmented, which considerably slows down the system. In such
> a scenario the shrinker cannot help either, as lack of memory is not
> the actual problem: it has been observed that there are still plenty
> of free 0-order pages. The problem also manifests when an individual
> zone in the mm runs out of pages; if we cannot migrate pages between
> zones, the kernel hits an out-of-memory condition even though free
> pages (and often all of swap) are available.
>
> To address external fragmentation, the kernel performs compaction
> (which involves migrating pages), but its efficacy depends on how many
> pages are marked MOVABLE, as only those pages can be migrated.
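For context, a minimal sketch of what opting an shmem mapping into the
movable zone looks like, assuming the usual gfp semantics (the helper
name mark_mapping_movable is illustrative only, not part of the patch):

#include <linux/pagemap.h>

/*
 * Pages allocated with __GFP_MOVABLE land on the MIGRATE_MOVABLE
 * free lists, which is what allows compaction to relocate them when
 * defragmenting a zone; without the flag, compaction must work
 * around them.
 */
static void mark_mapping_movable(struct address_space *mapping)
{
	gfp_t mask = mapping_gfp_mask(mapping);

	if (IS_ENABLED(CONFIG_MIGRATION))
		mask |= __GFP_MOVABLE;

	mapping_set_gfp_mask(mapping, mask);
}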
>
> Currently the backing pages for GPU buffers are allocated from shmemfs
> with the __GFP_RECLAIMABLE flag, in units of 4KB pages. With limited
> swap space, it may not always be possible to reclaim or swap out the
> pages of all inactive objects, to make way for free space allowing
> compaction to form higher-order groups of physically contiguous pages.
>
> Just marking the GPU pages MOVABLE will not suffice: i915.ko has to
> pin the pages while they are in use by the GPU, which prevents their
> migration. So the migratepage callback in shmem is also hooked up, to
> get a notification when the kernel initiates page migration. On that
> notification, i915.ko unpins the pages where appropriate. With this we
> can effectively mark the GPU pages MOVABLE and hence mitigate the
> fragmentation problem.
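For readers without patch 1/2 at hand: inferred purely from how it is
used in the diff below, the shmem hook presumably has roughly the
following shape. This is a hypothetical reconstruction; the struct
layout and setter prototype are read off this patch's usage, not
copied from patch 1/2.

struct shmem_dev_info {
	void *private_data;
	int (*migratepage)(struct address_space *mapping,
			   struct page *newpage, struct page *page,
			   enum migrate_mode mode, void *private_data);
};

/* Attach the hook to an shmem-backed mapping; shmem's migratepage
 * address_space op would then call info->migratepage() when set,
 * giving the driver a chance to unpin (or veto) before a page is
 * moved. */
void shmem_set_dev_info(struct address_space *mapping,
			struct shmem_dev_info *info);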
>
> v2:
> - Rename the migration routine to gem_shrink_migratepage, move it to the
> shrinker file, and use the existing constructs (Chris)
> - To cleanup, add a new helper function to encapsulate all page migration
> skip conditions (Chris)
> - Add a new local helper function in shrinker file, for dropping the
> backing pages, and call the same from gem_shrink() also (Chris)
>
> v3:
> - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
>
> v4:
> - Minor tidy
>
> v5:
> - Fix unsafe usage of unsafe_drop_pages()
> - Rebase onto vmap-notifier
>
> v6:
> - Remove i915_gem_object_get/put across unsafe_drop_pages() as, with
> struct_mutex protection, the object can't disappear. (Chris)
>
> Testcase: igt/gem_shrink
> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
> Cc: Hugh Dickins <hughd at google.com>
> Cc: linux-mm at kvack.org
> Signed-off-by: Sourab Gupta <sourab.gupta at intel.com>
> Signed-off-by: Akash Goel <akash.goel at intel.com>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
Hi all -mm folks!
Any feedback on these two? It's kind of an intermediate step towards a
full-blown gemfs, and I think useful for that. Or do we need to go
directly to our own backing storage thing? Aside from an ack/nack from
-mm, I think this is ready for merging.
Thanks, Daniel
> ---
> drivers/gpu/drm/i915/i915_drv.h | 2 +
> drivers/gpu/drm/i915/i915_gem.c | 9 ++-
> drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 +++++++++++++++++++++++++++++++
> 3 files changed, 142 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4735b417..7f2717b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1357,6 +1357,8 @@ struct intel_l3_parity {
> };
>
> struct i915_gem_mm {
> + struct shmem_dev_info shmem_info;
> +
> /** Memory allocator for GTT stolen memory */
> struct drm_mm stolen;
> /** Protects the usage of the GTT stolen memory allocator. This is
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1f995ce..f0d4ce7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj)
> if (obj->mm.madv == I915_MADV_WILLNEED)
> mark_page_accessed(page);
>
> + set_page_private(page, 0);
> put_page(page);
> }
> obj->mm.dirty = false;
> @@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
> sg->length += PAGE_SIZE;
> }
> last_pfn = page_to_pfn(page);
> + set_page_private(page, (unsigned long)obj);
>
> /* Check that the i965g/gm workaround works. */
> WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
> @@ -2334,8 +2336,10 @@ static unsigned int swiotlb_max_size(void)
>
> err_pages:
> sg_mark_end(sg);
> - for_each_sgt_page(page, sgt_iter, st)
> + for_each_sgt_page(page, sgt_iter, st) {
> + set_page_private(page, 0);
> put_page(page);
> + }
> sg_free_table(st);
> kfree(st);
>
> @@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
> goto fail;
>
> mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> + if (IS_ENABLED(CONFIG_MIGRATION))
> + mask |= __GFP_MOVABLE;
> if (IS_CRESTLINE(dev_priv) || IS_BROADWATER(dev_priv)) {
> /* 965gm cannot relocate objects above 4GiB. */
> mask &= ~__GFP_HIGHMEM;
> @@ -4193,6 +4199,7 @@ struct drm_i915_gem_object *
>
> mapping = obj->base.filp->f_mapping;
> mapping_set_gfp_mask(mapping, mask);
> + shmem_set_dev_info(mapping, &dev_priv->mm.shmem_info);
>
> i915_gem_object_init(obj, &i915_gem_object_ops);
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> index a6fc1bd..051135ac 100644
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -24,6 +24,7 @@
>
> #include <linux/oom.h>
> #include <linux/shmem_fs.h>
> +#include <linux/migrate.h>
> #include <linux/slab.h>
> #include <linux/swap.h>
> #include <linux/pci.h>
> @@ -473,6 +474,132 @@ struct shrinker_lock_uninterruptible {
> return NOTIFY_DONE;
> }
>
> +#ifdef CONFIG_MIGRATION
> +static bool can_migrate_page(struct drm_i915_gem_object *obj)
> +{
> + /* Avoid migrating a page that is actively in use by the GPU */
> + if (i915_gem_object_is_active(obj))
> + return false;
> +
> + /* Skip the migration of purgeable objects, otherwise there
> + * will be a deadlock: shmem will try to lock the page for
> + * truncation, but it is already locked by the caller before
> + * migration.
> + */
> + if (obj->mm.madv == I915_MADV_DONTNEED)
> + return false;
> +
> + /* Skip the migration for a pinned object */
> + if (atomic_read(&obj->mm.pages_pin_count) > obj->bind_count)
> + return false;
> +
> + if (any_vma_pinned(obj))
> + return false;
> +
> + return true;
> +}
> +
> +static int do_migrate_page(struct drm_i915_gem_object *obj)
> +{
> + struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
> + int ret = 0;
> +
> + if (!can_migrate_page(obj))
> + return -EBUSY;
> +
> + /* HW access would be required for a GGTT-bound object, for which
> + * the device has to be kept awake. But a deadlock can arise if we
> + * attempt to resume the device while a suspend or resume operation
> + * is already happening concurrently on some other path, and that
> + * path itself triggers compaction. So only unbind if the device is
> + * currently awake.
> + */
> + if (!intel_runtime_pm_get_if_in_use(dev_priv))
> + return -EBUSY;
> +
> + if (!unsafe_drop_pages(obj))
> + ret = -EBUSY;
> +
> + intel_runtime_pm_put(dev_priv);
> + return ret;
> +}
> +
> +static int i915_gem_shrinker_migratepage(struct address_space *mapping,
> + struct page *newpage,
> + struct page *page,
> + enum migrate_mode mode,
> + void *dev_priv_data)
> +{
> + struct drm_i915_private *dev_priv = dev_priv_data;
> + struct shrinker_lock_uninterruptible slu;
> + int ret;
> +
> + /*
> + * Clear the private field of the new target page, as it could
> + * hold a stale value. Otherwise, if this page itself later gets
> + * migrated without the driver dereferencing it in between, the
> + * stale value would send this migratepage callback astray, since
> + * the object pointer is derived from the private field. This is
> + * safe because at the time of migration the private field of the
> + * new page (which is an independent free 4KB page at this point)
> + * is a don't-care as far as the kernel is concerned.
> + */
> + set_page_private(newpage, 0);
> +
> + if (!page_private(page))
> + goto migrate;
> +
> + /*
> + * Check the page count: if the driver also holds a reference, it
> + * will be more than 2, as shmem holds one reference and the
> + * migration path itself has taken another. So if the count is
> + * <= 2, we can invoke the migration function directly.
> + */
> + if (page_count(page) <= 2)
> + goto migrate;
> +
> + /*
> + * Use a trylock with a timeout for struct_mutex, as otherwise
> + * there is a possibility of deadlock due to lock inversion:
> + * this path locks the page and then tries to take struct_mutex,
> + * while a racing path may truncate/purge pages of the same
> + * object after acquiring struct_mutex. Since page truncation
> + * will also try to lock the page, a deadlock can arise.
> + */
> + if (!i915_gem_shrinker_lock_uninterruptible(dev_priv, &slu, 10))
> + return -EBUSY;
> +
> + ret = 0;
> + if (!PageSwapCache(page) && page_private(page)) {
> + struct drm_i915_gem_object *obj =
> + (struct drm_i915_gem_object *)page_private(page);
> +
> + ret = do_migrate_page(obj);
> + }
> +
> + i915_gem_shrinker_unlock_uninterruptible(dev_priv, &slu);
> + if (ret)
> + return ret;
> +
> + /*
> + * Ideally the page count should not be > 2 here, as the driver
> + * has dropped its reference, but occasionally it has been seen
> + * to be 3 or 4. The unexpected page count makes migration fail
> + * with -EAGAIN, which leads the kernel to retry migrating the
> + * same set of pages repeatedly, and the repeated attempts have
> + * sometimes proved detrimental to stability. Also, since we don't
> + * know who the other owner is, or for how long it will keep the
> + * reference, it is better to return -EBUSY.
> + */
> + if (page_count(page) > 2)
> + return -EBUSY;
> +
> +migrate:
> + return migrate_page(mapping, newpage, page, mode);
> +}
> +#endif
> +
> /**
> * i915_gem_shrinker_init - Initialize i915 shrinker
> * @dev_priv: i915 device
> @@ -491,6 +618,11 @@ void i915_gem_shrinker_init(struct drm_i915_private *dev_priv)
>
> dev_priv->mm.vmap_notifier.notifier_call = i915_gem_shrinker_vmap;
> WARN_ON(register_vmap_purge_notifier(&dev_priv->mm.vmap_notifier));
> +
> + dev_priv->mm.shmem_info.private_data = dev_priv;
> +#ifdef CONFIG_MIGRATION
> + dev_priv->mm.shmem_info.migratepage = i915_gem_shrinker_migratepage;
> +#endif
> }
>
> /**
> --
> 1.9.2
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch