[Intel-gfx] [PATCH] drm/i915: Stop doing writeback from the shrinker

Thomas Hellström thomas.hellstrom at linux.intel.com
Fri Dec 10 14:46:56 UTC 2021


On Fri, 2021-12-10 at 11:05 +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> This effectively removes writeback which was added in 2d6692e642e7
> ("drm/i915: Start writeback from the shrinker").
> 
> Digging through the history it seems we went back and forth on the
> topic
> of whether it would be safe a couple of times. See for instance
> 5537252b6b6d ("drm/i915: Invalidate our pages under memory pressure")
> where Hugh Dickins has advised against it. I do not have enough
> expertise
> in the memory management area so am hoping for expert input here.
> 
> Reason for proposing removal is that there are reports from the field
> which indicate a sysetm wide deadlock (of a sort) implicating i915
> doing
> writeback at shrinking time.
> 
> Signature is a hung task notifier kicking in and task traces such as:

It would be interesting to see what exactly the find_get_entry is
blocked on. The other two tasks are blocked on the shrinker_rwsem which
is held by i915. If it's indeed a deadlock with either of those two,
then the fix Chris is working on for an unrelated issue we discovered
with shrinking would move out the writeback call from the
shrinker_rwsem and resolve this, but if i915 is in turn deadlocking
with another process and these two are just hanging waiting for the
shrinker_rwsem, we would still have other issues.

Do you by any chance have the list of the locks held by the system at
this point?

/Thomas


> 
>  [  247.030274] minijail-init   D    0  1773   1770 0x80004082
>  [  247.036419] Call Trace:
>  [  247.039167]  __schedule+0x57e/0x10d2
>  [  247.043175]  ? __schedule+0x586/0x10d2
>  [  247.047381]  ? _raw_spin_unlock+0xe/0x20
>  [  247.051779]  ? __queue_work+0x316/0x371
>  [   247.056079]  schedule+0x7c/0x9f
>  [  247.059602]  rwsem_down_write_slowpath+0x2ae/0x494
>  [  247.064971]  unregister_shrinker+0x20/0x65
>  [  247.069562]  deactivate_locked_super+0x38/0x88
>  [  247.074538]  cleanup_mnt+0xcc/0x10e
>  [  247.078447]  task_work_run+0x7d/0xa6
>  [  247.082459]  do_exit+0x23d/0x831
>  [  247.086079]  ? syscall_trace_enter+0x207/0x20e
>  [  247.091055]  do_group_exit+0x8d/0x9d
>  [  247.095062]  __x64_sys_exit_group+0x17/0x17
>  [  247.099750]  do_syscall_64+0x54/0x7e
>  [  247.103758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
>  [  246.876816] chrome          D    0  1791   1785 0x00004080
>  [  246.882965] Call Trace:
>  [  246.885713]  __schedule+0x57e/0x10d2
>  [  246.889724]  ? pcpu_alloc_area+0x25d/0x273
>  [  246.894314]  schedule+0x7c/0x9f
>  [  246.897836]  rwsem_down_write_slowpath+0x2ae/0x494
>  [  246.903207]  register_shrinker_prepared+0x19/0x48
>  [  246.908479]  ? test_single_super+0x10/0x10
>  [  246.913071]  sget_fc+0x1fc/0x20e
>  [  246.916691]  ? kill_litter_super+0x40/0x40
>  [  246.921334]  ? proc_apply_options+0x42/0x42
>  [  246.926044]  vfs_get_super+0x3a/0xdf
>  [  246.930053]  vfs_get_tree+0x2b/0xc3
>  [  246.933965]  fc_mount+0x12/0x39
>  [  246.937492]  pid_ns_prepare_proc+0x9d/0xc5
>  [  246.942085]  alloc_pid+0x275/0x289
>  [  246.945900]  copy_process+0x5e5/0xeea
>  [  246.950006]  _do_fork+0x95/0x303
>  [  246.953628]  __se_sys_clone+0x65/0x7f
>  [  246.957735]  do_syscall_64+0x54/0x7e
>  [  246.961743]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> And finally the smoking gun in:
> 
>  [  247.383338] CPU: 3 PID: 88 Comm: kswapd0 Tainted: G    
> U            5.4.154 #36
>  [  247.383338] Hardware name: Google Delbin/Delbin, BIOS
> Google_Delbin.13672.57.0 02/09/2021
>  [  247.383339] RIP: 0010:__rcu_read_lock+0x0/0x1a
>  [  247.383339] Code: ff ff 0f 0b e9 61 fe ff ff 0f 0b e9 76 fe ff ff
> 0f 0b 49 8b 44 24 20 e9 59 ff ff ff 0f 0b e9 67 ff ff ff 0f 0b e9 1b
> ff ff ff <0f> 1f 44 00 00 55 48 89 e5 65 48 8b 04 25 80 5d 01 00 ff
> 80 f8 03
>  [  247.383340] RSP: 0018:ffffb0aa0031b978 EFLAGS: 00000286
>  [  247.383340] RAX: 0000000000000000 RBX: fffff6b944ca8040 RCX:
> fffff6b944ca8001
>  [  247.383341] RDX: 0000000000000028 RSI: 0000000000000001 RDI:
> ffff8b52bc618c18
>  [  247.383341] RBP: ffffb0aa0031b9d0 R08: 0000000000000000 R09:
> ffff8b52fb5f00d8
>  [  247.383341] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
>  [  247.383342] R13: 61c8864680b583eb R14: 0000000000000001 R15:
> ffffb0aa0031b980
>  [  247.383342] FS:  0000000000000000(0000) GS:ffff8b52fbf80000(0000)
> knlGS:0000000000000000
>  [  247.383343] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [  247.383343] CR2: 00007c78a400d680 CR3: 0000000120f46006 CR4:
> 0000000000762ee0
>  [  247.383344] PKRU: 55555554
>  [  247.383344] Call Trace:
>  [  247.383345]  find_get_entry+0x4c/0x116
>  [  247.383345]  find_lock_entry+0xc8/0xec
>  [  247.383346]  shmem_writeback+0x7b/0x163
>  [  247.383346]  i915_gem_shrink+0x302/0x40b
>  [  247.383347]  ? __intel_runtime_pm_get+0x22/0x82
>  [  247.383347]  i915_gem_shrinker_scan+0x86/0xa8
>  [  247.383348]  shrink_slab+0x272/0x48b
>  [  247.383348]  shrink_node+0x784/0xbea
>  [  247.383348]  ? rcu_read_unlock_special+0x66/0x15f
>  [  247.383349]  ? update_batch_size+0x78/0x78
>  [  247.383349]  kswapd+0x75c/0xa56
>  [  247.383350]  kthread+0x147/0x156
>  [  247.383350]  ? kswapd_run+0xb6/0xb6
>  [  247.383351]  ? kthread_blkcg+0x2e/0x2e
>  [  247.383351]  ret_from_fork+0x1f/0x40
> 
> You will notice the trace is from an older kernel, the problem being
> reproducing the issue on latest upstream base is proving to be tricky
> due
> other (unrelated) issues.
> 
> It is even tricky to repro on an older kernel, with it seemingly
> needing a
> very specific game, transparent huge pages enabled and a specific
> memory
> configuration.
> 
> However given the history on the topic I could find, assuming what I
> found
> is not incomplete, suspicion on writeback being not the right thing
> to do
> in general is still there. I would therefore like to have input from
> the
> experts here.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Fixes: 2d6692e642e7 ("drm/i915: Start writeback from the shrinker")
> References: 5537252b6b6d ("drm/i915: Invalidate our pages under
> memory pressure")
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Cc: Michal Hocko <mhocko at suse.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com> #v1
> Cc: Jani Nikula <jani.nikula at linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Cc: Hugh Dickins <hughd at google.com>
> Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy at intel.com>
> Cc: Renato Pereyra <renatopereyra at google.com>
> Cc: intel-gfx at lists.freedesktop.org
> Cc: <stable at vger.kernel.org> # v5.3+
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  2 -
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  4 +-
>  drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 10 ----
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c     | 49 -----------------
> --
>  drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  | 18 +++----
>  drivers/gpu/drm/i915/gem/i915_gem_shrinker.h  |  1 -
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |  6 +--
>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |  3 +-
>  8 files changed, 11 insertions(+), 82 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 66f20b803b01..352c7158a487 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -455,7 +455,6 @@ i915_gem_object_unpin_pages(struct
> drm_i915_gem_object *obj)
>  
>  int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
>  int i915_gem_object_truncate(struct drm_i915_gem_object *obj);
> -void i915_gem_object_writeback(struct drm_i915_gem_object *obj);
>  
>  /**
>   * i915_gem_object_pin_map - return a contiguous mapping of the
> entire object
> @@ -621,7 +620,6 @@ int shmem_sg_alloc_table(struct drm_i915_private
> *i915, struct sg_table *st,
>                          unsigned int max_segment);
>  void shmem_sg_free_table(struct sg_table *st, struct address_space
> *mapping,
>                          bool dirty, bool backup);
> -void __shmem_writeback(size_t size, struct address_space *mapping);
>  
>  #ifdef CONFIG_MMU_NOTIFIER
>  static inline bool
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index f9f7e44099fe..e188d6137cc0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -57,10 +57,8 @@ struct drm_i915_gem_object_ops {
>         void (*put_pages)(struct drm_i915_gem_object *obj,
>                           struct sg_table *pages);
>         int (*truncate)(struct drm_i915_gem_object *obj);
> -       void (*writeback)(struct drm_i915_gem_object *obj);
>         int (*shrinker_release_pages)(struct drm_i915_gem_object
> *obj,
> -                                     bool no_gpu_wait,
> -                                     bool should_writeback);
> +                                     bool no_gpu_wait);
>  
>         int (*pread)(struct drm_i915_gem_object *obj,
>                      const struct drm_i915_gem_pread *arg);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> index 49c6e55c68ce..52e975f57956 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> @@ -168,16 +168,6 @@ int i915_gem_object_truncate(struct
> drm_i915_gem_object *obj)
>         return 0;
>  }
>  
> -/* Try to discard unwanted pages */
> -void i915_gem_object_writeback(struct drm_i915_gem_object *obj)
> -{
> -       assert_object_held_shared(obj);
> -       GEM_BUG_ON(i915_gem_object_has_pages(obj));
> -
> -       if (obj->ops->writeback)
> -               obj->ops->writeback(obj);
> -}
> -
>  static void __i915_gem_object_reset_page_iter(struct
> drm_i915_gem_object *obj)
>  {
>         struct radix_tree_iter iter;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index cc9fe258fba7..b4b8c921063e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -283,54 +283,6 @@ shmem_truncate(struct drm_i915_gem_object *obj)
>         return 0;
>  }
>  
> -void __shmem_writeback(size_t size, struct address_space *mapping)
> -{
> -       struct writeback_control wbc = {
> -               .sync_mode = WB_SYNC_NONE,
> -               .nr_to_write = SWAP_CLUSTER_MAX,
> -               .range_start = 0,
> -               .range_end = LLONG_MAX,
> -               .for_reclaim = 1,
> -       };
> -       unsigned long i;
> -
> -       /*
> -        * Leave mmapings intact (GTT will have been revoked on
> unbinding,
> -        * leaving only CPU mmapings around) and add those pages to
> the LRU
> -        * instead of invoking writeback so they are aged and paged
> out
> -        * as normal.
> -        */
> -
> -       /* Begin writeback on each dirty page */
> -       for (i = 0; i < size >> PAGE_SHIFT; i++) {
> -               struct page *page;
> -
> -               page = find_lock_page(mapping, i);
> -               if (!page)
> -                       continue;
> -
> -               if (!page_mapped(page) &&
> clear_page_dirty_for_io(page)) {
> -                       int ret;
> -
> -                       SetPageReclaim(page);
> -                       ret = mapping->a_ops->writepage(page, &wbc);
> -                       if (!PageWriteback(page))
> -                               ClearPageReclaim(page);
> -                       if (!ret)
> -                               goto put;
> -               }
> -               unlock_page(page);
> -put:
> -               put_page(page);
> -       }
> -}
> -
> -static void
> -shmem_writeback(struct drm_i915_gem_object *obj)
> -{
> -       __shmem_writeback(obj->base.size, obj->base.filp->f_mapping);
> -}
> -
>  void
>  __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
>                                 struct sg_table *pages,
> @@ -503,7 +455,6 @@ const struct drm_i915_gem_object_ops
> i915_gem_shmem_ops = {
>         .get_pages = shmem_get_pages,
>         .put_pages = shmem_put_pages,
>         .truncate = shmem_truncate,
> -       .writeback = shmem_writeback,
>  
>         .pwrite = shmem_pwrite,
>         .pread = shmem_pread,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> index 157a9765f483..99a38e016780 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> @@ -55,12 +55,11 @@ static bool unsafe_drop_pages(struct
> drm_i915_gem_object *obj,
>         return false;
>  }
>  
> -static int try_to_writeback(struct drm_i915_gem_object *obj,
> unsigned int flags)
> +static int obj_invalidate(struct drm_i915_gem_object *obj, unsigned
> int flags)
>  {
>         if (obj->ops->shrinker_release_pages)
>                 return obj->ops->shrinker_release_pages(obj,
> -                                                       !(flags &
> I915_SHRINK_ACTIVE),
> -                                                       flags &
> I915_SHRINK_WRITEBACK);
> +                                                       !(flags &
> I915_SHRINK_ACTIVE));
>  
>         switch (obj->mm.madv) {
>         case I915_MADV_DONTNEED:
> @@ -70,8 +69,9 @@ static int try_to_writeback(struct
> drm_i915_gem_object *obj, unsigned int flags)
>                 return 0;
>         }
>  
> -       if (flags & I915_SHRINK_WRITEBACK)
> -               i915_gem_object_writeback(obj);
> +       if (obj->base.filp)
> +               invalidate_mapping_pages(file_inode(obj->base.filp)-
> >i_mapping,
> +                                        0, (loff_t)-1);
>  
>         return 0;
>  }
> @@ -227,7 +227,7 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
>                                 }
>  
>                                 if
> (!__i915_gem_object_put_pages(obj)) {
> -                                       if (!try_to_writeback(obj,
> shrink))
> +                                       if (!obj_invalidate(obj,
> shrink))
>                                                 count += obj-
> >base.size >> PAGE_SHIFT;
>                                 }
>                                 if (!ww)
> @@ -339,8 +339,7 @@ i915_gem_shrinker_scan(struct shrinker *shrinker,
> struct shrink_control *sc)
>                                                  &sc->nr_scanned,
>                                                  I915_SHRINK_ACTIVE |
>                                                  I915_SHRINK_BOUND |
> -                                                I915_SHRINK_UNBOUND
> |
> -                                               
> I915_SHRINK_WRITEBACK);
> +                                               
> I915_SHRINK_UNBOUND);
>                 }
>         }
>  
> @@ -361,8 +360,7 @@ i915_gem_shrinker_oom(struct notifier_block *nb,
> unsigned long event, void *ptr)
>         with_intel_runtime_pm(&i915->runtime_pm, wakeref)
>                 freed_pages += i915_gem_shrink(NULL, i915, -1UL,
> NULL,
>                                                I915_SHRINK_BOUND |
> -                                              I915_SHRINK_UNBOUND |
> -                                             
> I915_SHRINK_WRITEBACK);
> +                                              I915_SHRINK_UNBOUND);
>  
>         /* Because we may be allocating inside our own driver, we
> cannot
>          * assert that there are no objects with pinned pages that
> are not
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> index 8512470f6fd6..789d6947f9b9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
> @@ -22,7 +22,6 @@ unsigned long i915_gem_shrink(struct
> i915_gem_ww_ctx *ww,
>  #define I915_SHRINK_BOUND      BIT(1)
>  #define I915_SHRINK_ACTIVE     BIT(2)
>  #define I915_SHRINK_VMAPS      BIT(3)
> -#define I915_SHRINK_WRITEBACK  BIT(4)
>  
>  unsigned long i915_gem_shrink_all(struct drm_i915_private *i915);
>  void i915_gem_driver_register__shrinker(struct drm_i915_private
> *i915);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 218a9b3037c7..b7ca7b66afe7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -425,8 +425,7 @@ int i915_ttm_purge(struct drm_i915_gem_object
> *obj)
>  }
>  
>  static int i915_ttm_shrinker_release_pages(struct
> drm_i915_gem_object *obj,
> -                                          bool no_wait_gpu,
> -                                          bool should_writeback)
> +                                          bool no_wait_gpu)
>  {
>         struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>         struct i915_ttm_tt *i915_tt =
> @@ -467,9 +466,6 @@ static int i915_ttm_shrinker_release_pages(struct
> drm_i915_gem_object *obj,
>                 return ret;
>         }
>  
> -       if (should_writeback)
> -               __shmem_writeback(obj->base.size, i915_tt->filp-
> >f_mapping);
> -
>         return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index c69c7d45aabc..24bbf4d6a63d 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -1647,8 +1647,7 @@ static int igt_shrink_thp(void *arg)
>         i915_gem_shrink(NULL, i915, -1UL, NULL,
>                         I915_SHRINK_BOUND |
>                         I915_SHRINK_UNBOUND |
> -                       I915_SHRINK_ACTIVE |
> -                       I915_SHRINK_WRITEBACK);
> +                       I915_SHRINK_ACTIVE);
>         if (should_swap == i915_gem_object_has_pages(obj)) {
>                 pr_err("unexpected pages mismatch, should_swap=%s\n",
>                        yesno(should_swap));




More information about the Intel-gfx mailing list