[Intel-gfx] [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock

Guenter Roeck linux at roeck-us.net
Tue Jun 30 15:48:53 UTC 2020


On Tue, Jun 30, 2020 at 04:08:00PM +0100, Chris Wilson wrote:
> Quoting Guenter Roeck (2020-06-30 16:01:05)
> > On Tue, Jun 30, 2020 at 10:14:25AM +0100, Chris Wilson wrote:
> > [ ... ]
> > > > > @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
> > > > >                       struct task_struct *task;
> > > > >                       char name[80];
> > > > >  
> > > > > -                     spin_lock(&file->table_lock);
> > > > > +                     rcu_read_lock();
> > > > >                       idr_for_each(&file->object_idr, per_file_stats, &stats);
> > > > > -                     spin_unlock(&file->table_lock);
> > > > > +                     rcu_read_unlock();
> > > > >  
> > > > For my education - is it indeed possible and valid to replace spin_lock()
> > > > with rcu_read_lock() to prevent list manipulation for a list used by
> > > > idr_for_each(), even if that list is otherwise manipulated under the
> > > > > spinlock?
> > > 
> > > It's a pure read of a radix tree here, and is supposed to be RCU safe:
> > > 
> > >  * idr_for_each() can be called concurrently with idr_alloc() and
> > >  * idr_remove() if protected by RCU.  Newly added entries may not be
> > >  * seen and deleted entries may be seen, but adding and removing entries
> > >  * will not cause other entries to be skipped, nor spurious ones to be seen.
> > > 
> > > That is, the tree structure is stable.
> > > 
> > Ah, that makes sense. Thanks for the clarification.
> > 
> > > > Background: we are seeing a crash with the following call trace.
> > > > 
> > > > [ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > ...
> > > > [ 1016.651693] Call Trace:
> > > > [ 1016.651703]  idr_for_each+0x8a/0xe8
> > > > [ 1016.651711]  i915_gem_object_info+0x2a3/0x3eb
> > > > [ 1016.651720]  seq_read+0x162/0x3ca
> > > > [ 1016.651727]  full_proxy_read+0x5b/0x8d
> > > > [ 1016.651733]  __vfs_read+0x45/0x1bb
> > > > [ 1016.651741]  vfs_read+0xc9/0x15e
> > > > [ 1016.651746]  ksys_read+0x7e/0xde
> > > > [ 1016.651752]  do_syscall_64+0x54/0x68
> > > > [ 1016.651758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > 
> > Actually, the crash is not in idr_for_each, but in per_file_stats:
> 
> Ok, let's assume that the object is being closed as we read the idr. The
> idr will temporarily hold an error pointer for the handle to indicate the
> in-progress closure, so something like:
> 
> @@ -230,7 +230,7 @@ static int per_file_stats(int id, void *ptr, void *data)
>         struct file_stats *stats = data;
>         struct i915_vma *vma;
> 
> -       if (!kref_get_unless_zero(&obj->base.refcount))
> +       if (IS_ERR_OR_NULL(obj) || !kref_get_unless_zero(&obj->base.refcount))
>                 return 0;
> 
Makes sense. Thanks a lot for the patch!

Guenter
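
For reference, the guard proposed above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the i915 code:
mock_obj, mock_stats, mock_get_unless_zero(), mock_put() and
mock_per_file_stats() are hypothetical names, the refcount is a plain
non-atomic counter standing in for kref, and ERR_PTR()/IS_ERR_OR_NULL() are
simplified stand-ins for the helpers in <linux/err.h>. It shows the callback
skipping error-pointer and NULL slots before touching the refcount.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel helpers in <linux/err.h>. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object; refcount is a non-atomic stand-in for kref. */
struct mock_obj {
	unsigned int refcount;
	size_t size;
};

/* Hypothetical accumulator passed through the callback's data pointer. */
struct mock_stats {
	unsigned long count;
	size_t total;
};

/* Take a reference only if the object is still alive. */
static bool mock_get_unless_zero(struct mock_obj *obj)
{
	if (obj->refcount == 0)
		return false;
	obj->refcount++;
	return true;
}

static void mock_put(struct mock_obj *obj)
{
	obj->refcount--;
}

/* Same shape as an idr_for_each() callback: (id, entry, private data). */
static int mock_per_file_stats(int id, void *ptr, void *data)
{
	struct mock_obj *obj = ptr;
	struct mock_stats *stats = data;

	(void)id;

	/*
	 * Check the slot value first: an error pointer or NULL must never
	 * be dereferenced, so it is rejected before the refcount is read.
	 */
	if (IS_ERR_OR_NULL(obj) || !mock_get_unless_zero(obj))
		return 0;

	stats->count++;
	stats->total += obj->size;
	mock_put(obj);

	return 0; /* returning 0 lets the iteration continue */
}
```

The order of the two tests matters: the || short-circuits, so the refcount is
only read once the slot is known to hold a real pointer.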

