[Intel-gfx] [PATCH] drm/i915: Stop doing writeback from the shrinker
Thomas Hellström
thomas.hellstrom at linux.intel.com
Fri Dec 10 14:46:56 UTC 2021
On Fri, 2021-12-10 at 11:05 +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> This effectively removes writeback which was added in 2d6692e642e7
> ("drm/i915: Start writeback from the shrinker").
>
> Digging through the history it seems we went back and forth on the
> topic
> of whether it would be safe a couple of times. See for instance
> 5537252b6b6d ("drm/i915: Invalidate our pages under memory pressure")
> where Hugh Dickins has advised against it. I do not have enough
> expertise
> in the memory management area so am hoping for expert input here.
>
> Reason for proposing removal is that there are reports from the field
> which indicate a sysetm wide deadlock (of a sort) implicating i915
> doing
> writeback at shrinking time.
>
> Signature is a hung task notifier kicking in and task traces such as:
It would be interesting to see what exactly the find_get_entry is
blocked on. The other two tasks are blocked on the shrinker_rwsem which
is held by i915. If it's indeed a deadlock with either of those two,
then the fix Chris is working on for an unrelated issue we discovered
with shrinking would move out the writeback call from the
shrinker_rwsem and resolve this, but if i915 is in turn deadlocking
with another process and these two are just hanging waiting for the
shrinker_rwsem, we would still have other issues.
Do you by any chance have the list of the locks held by the system at
this point?
/Thomas
>
> [ 247.030274] minijail-init D 0 1773 1770 0x80004082
> [ 247.036419] Call Trace:
> [ 247.039167] __schedule+0x57e/0x10d2
> [ 247.043175] ? __schedule+0x586/0x10d2
> [ 247.047381] ? _raw_spin_unlock+0xe/0x20
> [ 247.051779] ? __queue_work+0x316/0x371
> [ 247.056079] schedule+0x7c/0x9f
> [ 247.059602] rwsem_down_write_slowpath+0x2ae/0x494
> [ 247.064971] unregister_shrinker+0x20/0x65
> [ 247.069562] deactivate_locked_super+0x38/0x88
> [ 247.074538] cleanup_mnt+0xcc/0x10e
> [ 247.078447] task_work_run+0x7d/0xa6
> [ 247.082459] do_exit+0x23d/0x831
> [ 247.086079] ? syscall_trace_enter+0x207/0x20e
> [ 247.091055] do_group_exit+0x8d/0x9d
> [ 247.095062] __x64_sys_exit_group+0x17/0x17
> [ 247.099750] do_syscall_64+0x54/0x7e
> [ 247.103758] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> [ 246.876816] chrome D 0 1791 1785 0x00004080
> [ 246.882965] Call Trace:
> [ 246.885713] __schedule+0x57e/0x10d2
> [ 246.889724] ? pcpu_alloc_area+0x25d/0x273
> [ 246.894314] schedule+0x7c/0x9f
> [ 246.897836] rwsem_down_write_slowpath+0x2ae/0x494
> [ 246.903207] register_shrinker_prepared+0x19/0x48
> [ 246.908479] ? test_single_super+0x10/0x10
> [ 246.913071] sget_fc+0x1fc/0x20e
> [ 246.916691] ? kill_litter_super+0x40/0x40
> [ 246.921334] ? proc_apply_options+0x42/0x42
> [ 246.926044] vfs_get_super+0x3a/0xdf
> [ 246.930053] vfs_get_tree+0x2b/0xc3
> [ 246.933965] fc_mount+0x12/0x39
> [ 246.937492] pid_ns_prepare_proc+0x9d/0xc5
> [ 246.942085] alloc_pid+0x275/0x289
> [ 246.945900] copy_process+0x5e5/0xeea
> [ 246.950006] _do_fork+0x95/0x303
> [ 246.953628] __se_sys_clone+0x65/0x7f
> [ 246.957735] do_syscall_64+0x54/0x7e
> [ 246.961743] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> And finally the smoking gun in:
>
> [ 247.383338] CPU: 3 PID: 88 Comm: kswapd0 Tainted: G
> U 5.4.154 #36
> [ 247.383338] Hardware name: Google Delbin/Delbin, BIOS
> Google_Delbin.13672.57.0 02/09/2021
> [ 247.383339] RIP: 0010:__rcu_read_lock+0x0/0x1a
> [ 247.383339] Code: ff ff 0f 0b e9 61 fe ff ff 0f 0b e9 76 fe ff ff
> 0f 0b 49 8b 44 24 20 e9 59 ff ff ff 0f 0b e9 67 ff ff ff 0f 0b e9 1b
> ff ff ff <0f> 1f 44 00 00 55 48 89 e5 65 48 8b 04 25 80 5d 01 00 ff
> 80 f8 03
> [ 247.383340] RSP: 0018:ffffb0aa0031b978 EFLAGS: 00000286
> [ 247.383340] RAX: 0000000000000000 RBX: fffff6b944ca8040 RCX:
> fffff6b944ca8001
> [ 247.383341] RDX: 0000000000000028 RSI: 0000000000000001 RDI:
> ffff8b52bc618c18
> [ 247.383341] RBP: ffffb0aa0031b9d0 R08: 0000000000000000 R09:
> ffff8b52fb5f00d8
> [ 247.383341] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> [ 247.383342] R13: 61c8864680b583eb R14: 0000000000000001 R15:
> ffffb0aa0031b980
> [ 247.383342] FS: 0000000000000000(0000) GS:ffff8b52fbf80000(0000)
> knlGS:0000000000000000
> [ 247.383343] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 247.383343] CR2: 00007c78a400d680 CR3: 0000000120f46006 CR4:
> 0000000000762ee0
> [ 247.383344] PKRU: 55555554
> [ 247.383344] Call Trace:
> [ 247.383345] find_get_entry+0x4c/0x116
> [ 247.383345] find_lock_entry+0xc8/0xec
> [ 247.383346] shmem_writeback+0x7b/0x163
> [ 247.383346] i915_gem_shrink+0x302/0x40b
> [ 247.383347] ? __intel_runtime_pm_get+0x22/0x82
> [ 247.383347] i915_gem_shrinker_scan+0x86/0xa8
> [ 247.383348] shrink_slab+0x272/0x48b
> [ 247.383348] shrink_node+0x784/0xbea
> [ 247.383348] ? rcu_read_unlock_special+0x66/0x15f
> [ 247.383349] ? update_batch_size+0x78/0x78
> [ 247.383349] kswapd+0x75c/0xa56
> [ 247.383350] kthread+0x147/0x156
> [ 247.383350] ? kswapd_run+0xb6/0xb6
> [ 247.383351] ? kthread_blkcg+0x2e/0x2e
> [ 247.383351] ret_from_fork+0x1f/0x40
>
> You will notice the trace is from an older kernel, the problem being
> reproducing the issue on latest upstream base is proving to be tricky
> due
> other (unrelated) issues.
>
> It is even tricky to repro on an older kernel, with it seemingly
> needing a
> very specific game, transparent huge pages enabled and a specific
> memory
> configuration.
>
> However given the history on the topic I could find, assuming what I
> found
> is not incomplete, suspicion on writeback being not the right thing
> to do
> in general is still there. I would therefore like to have input from
> the
> experts here.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Fixes: 2d6692e642e7 ("drm/i915: Start writeback from the shrinker")
> References: 5537252b6b6d ("drm/i915: Invalidate our pages under
> memory pressure")
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Cc: Michal Hocko <mhocko at suse.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com> #v1
> Cc: Jani Nikula <jani.nikula at linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Cc: Hugh Dickins <hughd at google.com>
> Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy at intel.com>
> Cc: Renato Pereyra <renatopereyra at google.com>
> Cc: intel-gfx at lists.freedesktop.org
> Cc: <stable at vger.kernel.org> # v5.3+
> ---
> drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 -
> .../gpu/drm/i915/gem/i915_gem_object_types.h | 4 +-
> drivers/gpu/drm/i915/gem/i915_gem_pages.c | 10 ----
> drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 49 -----------------
> --
> drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 18 +++----
> drivers/gpu/drm/i915/gem/i915_gem_shrinker.h | 1 -
> drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 6 +--
> .../gpu/drm/i915/gem/selftests/huge_pages.c | 3 +-
> 8 files changed, 11 insertions(+), 82 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 66f20b803b01..352c7158a487 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -455,7 +455,6 @@ i915_gem_object_unpin_pages(struct
> drm_i915_gem_object *obj)
>
> int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
> int i915_gem_object_truncate(struct drm_i915_gem_object *obj);
> -void i915_gem_object_writeback(struct drm_i915_gem_object *obj);
>
> /**
> * i915_gem_object_pin_map - return a contiguous mapping of the
> entire object
> @@ -621,7 +620,6 @@ int shmem_sg_alloc_table(struct drm_i915_private
> *i915, struct sg_table *st,
> unsigned int max_segment);
> void shmem_sg_free_table(struct sg_table *st, struct address_space
> *mapping,
> bool dirty, bool backup);
> -void __shmem_writeback(size_t size, struct address_space *mapping);
>
> #ifdef CONFIG_MMU_NOTIFIER
> static inline bool
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index f9f7e44099fe..e188d6137cc0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -57,10 +57,8 @@ struct drm_i915_gem_object_ops {
> void (*put_pages)(struct drm_i915_gem_object *obj,
> struct sg_table *pages);
> int (*truncate)(struct drm_i915_gem_object *obj);
> - void (*writeback)(struct drm_i915_gem_object *obj);
> int (*shrinker_release_pages)(struct drm_i915_gem_object
> *obj,
> - bool no_gpu_wait,
> - bool should_writeback);
> + bool no_gpu_wait);
>
> int (*pread)(struct drm_i915_gem_object *obj,
> const struct drm_i915_gem_pread *arg);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> index 49c6e55c68ce..52e975f57956 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> @@ -168,16 +168,6 @@ int i915_gem_object_truncate(struct
> drm_i915_gem_object *obj)
> return 0;
> }
>
> -/* Try to discard unwanted pages */
> -void i915_gem_object_writeback(struct drm_i915_gem_object *obj)
> -{
> - assert_object_held_shared(obj);
> - GEM_BUG_ON(i915_gem_object_has_pages(obj));
> -
> - if (obj->ops->writeback)
> - obj->ops->writeback(obj);
> -}
> -
> static void __i915_gem_object_reset_page_iter(struct
> drm_i915_gem_object *obj)
> {
> struct radix_tree_iter iter;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index cc9fe258fba7..b4b8c921063e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -283,54 +283,6 @@ shmem_truncate(struct drm_i915_gem_object *obj)
> return 0;
> }
>
> -void __shmem_writeback(size_t size, struct address_space *mapping)
> -{
> - struct writeback_control wbc = {
> - .sync_mode = WB_SYNC_NONE,
> - .nr_to_write = SWAP_CLUSTER_MAX,
> - .range_start = 0,
> - .range_end = LLONG_MAX,
> - .for_reclaim = 1,
> - };
> - unsigned long i;
> -
> - /*
> - * Leave mmapings intact (GTT will have been revoked on
> unbinding,
> - * leaving only CPU mmapings around) and add those pages to
> the LRU
> - * instead of invoking writeback so they are aged and paged
> out
> - * as normal.
> - */
> -
> - /* Begin writeback on each dirty page */
> - for (i = 0; i < size >> PAGE_SHIFT; i++) {
> - struct page *page;
> -
> - page = find_lock_page(mapping, i);
> - if (!page)
> - continue;
> -
> - if (!page_mapped(page) &&
> clear_page_dirty_for_io(page)) {
> - int ret;
> -
> - SetPageReclaim(page);
> - ret = mapping->a_ops->writepage(page, &wbc);
> - if (!PageWriteback(page))
> - ClearPageReclaim(page);
> - if (!ret)
> - goto put;
> - }
> - unlock_page(page);
> -put:
> - put_page(page);
> - }
> -}
> -
> -static void
> -shmem_writeback(struct drm_i915_gem_object *obj)
> -{
> - __shmem_writeback(obj->base.size, obj->base.filp->f_mapping);
> -}
> -
> void
> __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
> struct sg_table *pages,
> @@ -503,7 +455,6 @@ const struct drm_i915_gem_object_ops
> i915_gem_shmem_ops = {
> .get_pages = shmem_get_pages,
> .put_pages = shmem_put_pages,
> .truncate = shmem_truncate,
> - .writeback = shmem_writeback,
>
> .pwrite = shmem_pwrite,
> .pread = shmem_pread,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> index 157a9765f483..99a38e016780 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> @@ -55,12 +55,11 @@ static bool unsafe_drop_pages(struct
> drm_i915_gem_object *obj,
> return false;
> }
>
> -static int try_to_writeback(struct drm_i915_gem_object *obj,
> unsigned int flags)
> +static int obj_invalidate(struct drm_i915_gem_object *obj, unsigned
> int flags)
> {
> if (obj->ops->shrinker_release_pages)
> return obj->ops->shrinker_release_pages(obj,
> - !(flags &
> I915_SHRINK_ACTIVE),
> - flags &
> I915_SHRINK_WRITEBACK);
> + !(flags &
> I915_SHRINK_ACTIVE));
>
> switch (obj->mm.madv) {
> case I915_MADV_DONTNEED:
> @@ -70,8 +69,9 @@ static int try_to_writeback(struct
> drm_i915_gem_object *obj, unsigned int flags)
> return 0;
> }
>
> - if (flags & I915_SHRINK_WRITEBACK)
> - i915_gem_object_writeback(obj);
> + if (obj->base.filp)
> + invalidate_mapping_pages(file_inode(obj->base.filp)-
> >i_mapping,
> + 0, (loff_t)-1);
>
> return 0;
> }
> @@ -227,7 +227,7 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
> }
>
> if
> (!__i915_gem_object_put_pages(obj)) {
> - if (!try_to_writeback(obj,
> shrink))
> + if (!obj_invalidate(obj,
> shrink))
> count += obj-
> >base.size >> PAGE_SHIFT;
> }
> if (!ww)
> @@ -339,8 +339,7 @@ i915_gem_shrinker_scan(struct shrinker *shrinker,
> struct shrink_control *sc)
> &sc->nr_scanned,
> I915_SHRINK_ACTIVE |
> I915_SHRINK_BOUND |
> - I915_SHRINK_UNBOUND
> |
> -
> I915_SHRINK_WRITEBACK);
> +
> I915_SHRINK_UNBOUND);
> }
> }
>
> @@ -361,8 +360,7 @@ i915_gem_shrinker_oom(struct notifier_block *nb,
> unsigned long event, void *ptr)
> with_intel_runtime_pm(&i915->runtime_pm, wakeref)
> freed_pages += i915_gem_shrink(NULL, i915, -1UL,
> NULL,
> I915_SHRINK_BOUND |
> - I915_SHRINK_UNBOUND |
> -
> I915_SHRINK_WRITEBACK);
> + I915_SHRINK_UNBOUND);
>
> /* Because we may be allocating inside our own driver, we
> cannot
> * assert that there are no objects with pinned pages that
> are not
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> index 8512470f6fd6..789d6947f9b9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> @@ -22,7 +22,6 @@ unsigned long i915_gem_shrink(struct
> i915_gem_ww_ctx *ww,
> #define I915_SHRINK_BOUND BIT(1)
> #define I915_SHRINK_ACTIVE BIT(2)
> #define I915_SHRINK_VMAPS BIT(3)
> -#define I915_SHRINK_WRITEBACK BIT(4)
>
> unsigned long i915_gem_shrink_all(struct drm_i915_private *i915);
> void i915_gem_driver_register__shrinker(struct drm_i915_private
> *i915);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 218a9b3037c7..b7ca7b66afe7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -425,8 +425,7 @@ int i915_ttm_purge(struct drm_i915_gem_object
> *obj)
> }
>
> static int i915_ttm_shrinker_release_pages(struct
> drm_i915_gem_object *obj,
> - bool no_wait_gpu,
> - bool should_writeback)
> + bool no_wait_gpu)
> {
> struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> struct i915_ttm_tt *i915_tt =
> @@ -467,9 +466,6 @@ static int i915_ttm_shrinker_release_pages(struct
> drm_i915_gem_object *obj,
> return ret;
> }
>
> - if (should_writeback)
> - __shmem_writeback(obj->base.size, i915_tt->filp-
> >f_mapping);
> -
> return 0;
> }
>
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index c69c7d45aabc..24bbf4d6a63d 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -1647,8 +1647,7 @@ static int igt_shrink_thp(void *arg)
> i915_gem_shrink(NULL, i915, -1UL, NULL,
> I915_SHRINK_BOUND |
> I915_SHRINK_UNBOUND |
> - I915_SHRINK_ACTIVE |
> - I915_SHRINK_WRITEBACK);
> + I915_SHRINK_ACTIVE);
> if (should_swap == i915_gem_object_has_pages(obj)) {
> pr_err("unexpected pages mismatch, should_swap=%s\n",
> yesno(should_swap));
More information about the Intel-gfx
mailing list