[Intel-gfx] [PATCH] drm/i915: Protect debugfs per_file_stats with RCU lock

Chris Wilson chris at chris-wilson.co.uk
Tue Jun 30 15:08:00 UTC 2020


Quoting Guenter Roeck (2020-06-30 16:01:05)
> On Tue, Jun 30, 2020 at 10:14:25AM +0100, Chris Wilson wrote:
> [ ... ]
> > > > @@ -328,9 +334,9 @@ static void print_context_stats(struct seq_file *m,
> > > >                       struct task_struct *task;
> > > >                       char name[80];
> > > >  
> > > > -                     spin_lock(&file->table_lock);
> > > > +                     rcu_read_lock();
> > > >                       idr_for_each(&file->object_idr, per_file_stats, &stats);
> > > > -                     spin_unlock(&file->table_lock);
> > > > +                     rcu_read_unlock();
> > > >  
> > > For my education - is it indeed possible and valid to replace spin_lock()
> > > with rcu_read_lock() to prevent list manipulation for a list used by
> > > idr_for_each(), even if that list is otherwise manipulated under the
> > > spinlock ?
> > 
> > It's a pure read of a radixtree here, and is supposed to be RCU safe:
> > 
> >  * idr_for_each() can be called concurrently with idr_alloc() and
> >  * idr_remove() if protected by RCU.  Newly added entries may not be
> >  * seen and deleted entries may be seen, but adding and removing entries
> >  * will not cause other entries to be skipped, nor spurious ones to be seen.
> > 
> > That is, the tree structure is stable.
> > 
> Ah, that makes sense. Thanks for the clarification.
> 
> > > Background: we are seeing a crash with the following call trace.
> > > 
> > > [ 1016.651593] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > ...
> > > [ 1016.651693] Call Trace:
> > > [ 1016.651703]  idr_for_each+0x8a/0xe8
> > > [ 1016.651711]  i915_gem_object_info+0x2a3/0x3eb
> > > [ 1016.651720]  seq_read+0x162/0x3ca
> > > [ 1016.651727]  full_proxy_read+0x5b/0x8d
> > > [ 1016.651733]  __vfs_read+0x45/0x1bb
> > > [ 1016.651741]  vfs_read+0xc9/0x15e
> > > [ 1016.651746]  ksys_read+0x7e/0xde
> > > [ 1016.651752]  do_syscall_64+0x54/0x68
> > > [ 1016.651758]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> Actually, the crash is not in idr_for_each, but in per_file_stats:

Ok, let's assume that the object is being closed as we read the idr. The
idr will temporarily hold an error pointer for the handle to indicate the
in-progress closure, so something like:

@@ -230,7 +230,7 @@ static int per_file_stats(int id, void *ptr, void *data)
        struct file_stats *stats = data;
        struct i915_vma *vma;

-       if (!kref_get_unless_zero(&obj->base.refcount))
+       if (IS_ERR_OR_NULL(obj) || !kref_get_unless_zero(&obj->base.refcount))
                return 0;

-Chris
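
For illustration only, the guard in the hunk above can be modelled in plain userspace C. This is a rough sketch, not i915 code: ERR_PTR() and IS_ERR_OR_NULL() are simplified stand-ins for the kernel's err.h helpers, and demo_object, demo_stats, demo_get_unless_zero(), demo_per_file_stats() and demo_walk() are hypothetical names made up for the example. The refcount is non-atomic and the "idr" is just an array, and -ENOENT is an arbitrary choice of error value. The point is only that a slot holding NULL or an error pointer has to be filtered out before the callback touches the refcount.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO bytes of the address space. */
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object; the real code uses a struct kref instead. */
struct demo_object {
	unsigned int refcount;
	size_t size;
};

struct demo_stats {
	unsigned long count;
	size_t total;
};

/* Non-atomic analogue of kref_get_unless_zero(), for the sketch only. */
static bool demo_get_unless_zero(struct demo_object *obj)
{
	if (obj->refcount == 0)
		return false;
	obj->refcount++;
	return true;
}

/* Callback with the same shape as an idr_for_each() callback. */
static int demo_per_file_stats(int id, void *ptr, void *data)
{
	struct demo_object *obj = ptr;
	struct demo_stats *stats = data;

	(void)id;
	assert(stats != NULL);

	/* Check the slot first: dereferencing NULL or an ERR_PTR would crash. */
	if (IS_ERR_OR_NULL(obj) || !demo_get_unless_zero(obj))
		return 0;

	stats->count++;
	stats->total += obj->size;
	obj->refcount--;	/* drop the temporary reference */
	return 0;
}

/* Walk a fake handle table: live object, empty slot, closing slot, dying object. */
static struct demo_stats demo_walk(void)
{
	struct demo_object live = { .refcount = 1, .size = 4096 };
	struct demo_object dying = { .refcount = 0, .size = 8192 };
	void *table[] = { &live, NULL, ERR_PTR(-ENOENT), &dying };
	struct demo_stats stats = { 0, 0 };
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		demo_per_file_stats((int)i, table[i], &stats);

	return stats;
}
```

Only the live object ends up counted here. Without the IS_ERR_OR_NULL() test, the second and third slots would be dereferenced inside demo_get_unless_zero(), which is the kind of fault the NULL pointer dereference in the trace suggests.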

