[Intel-gfx] [PATCH] file, i915: fix file reference for mmap_singleton()

Christian Brauner brauner at kernel.org
Wed Oct 25 20:36:39 UTC 2023


On Wed, Oct 25, 2023 at 08:52:57AM -1000, Linus Torvalds wrote:
> On Wed, 25 Oct 2023 at 02:01, Christian Brauner <brauner at kernel.org> wrote:
> >
> >         rcu_read_lock();
> > -       file = get_file_rcu(&i915->gem.mmap_singleton);
> > +       file = get_file_rcu_once(&i915->gem.mmap_singleton);
> >         rcu_read_unlock();
> >         if (file)
> >                 return file;
> 
> Honestly, the get_file_rcu_once() function seems a bit pointless.
> 
> The above doesn't want a loop at all. Yet your "once" still does loop,
> because "even get_file_rcu_once() is trying to deal with the whole
> "the pointer itself changed".
> 
> And the i915 code is actually designed to just depend on the atomicity
> of the mmap_singleton access itself, in how it uses
> 
>         cmpxchg(&i915->gem.mmap_singleton, file, NULL);
> ...
>         file = READ_ONCE(i915->gem.mmap_singleton);
> 
> and literally says "I'll remove my singleton as it is released". The
> important part there is that the 'map_singleton' pointer itself isn't
> actually reference-counted - it's the reverse, where
> reference-counting *other* instances will then auto-clear it.
> 
> And that's what then makes that get_file_rcu() model not work with it,
> because get_file_rcu() kind of assumes that the argument it gets is
> *part* of the reference counting, not a cached *result* of the
> reference counting that gets cleared as a result of the ref going down
> to zero.
> 
> I may explain my objections badly, but hopefully you get what I mean.

No, I get it. I was on the fence how to deal with this because I
honestly find this model here extremely weird and very unintuitive to
begin with with the pointer being NULLed during release and replaced
that way.

> 
> And I think that also means that using that new get_file_rcu_once()
> may be hiding the actual problem, but it's still conceptually wrong,
> because it still has that conceptual model of "the pointer I'm getting
> is part of the reference counting", when it really isn't.
> 
> So I think we'd actually be better off with something that is more
> clearly different from get_file_rcu() in naming, so make that
> conceptual difference clearer. Make it be something like
> "get_active_file(struct file **)", and make the implementation be just
> exactly what your current __get_file_rcu() is with no loops at all.
> 
> Then thew i915 code ends up being
> 
>         rcu_read_lock();
>         file = get_active_file(&i915->gem.mmap_singleton);
>         rcu_read_unlock();
> 
>         if (!IS_ERR_OR_NULL(file))
>                 return file;
> 
>        .. create new mmap_singleton ..
> 
> and that's it.
> 
> If you don't want to expose __get_file_rcu() as-is, you could maybe just do
> 
>   struct file *get_active_file(struct file **fp)
>   {
>         struct file *file;
>         rcu_read_lock();
>         file = __get_file_rcu(fp);
>         rcu_read_unlock();
>         return file;
>   }
> 
> and then the i916 code would just drop the RCU locking that it has no
> business even knowing about.

Yeah, fair idea. I just want this fixed so we don't have to drop. Pushed
to vfs.misc with your suggested changes for get_file_active() with rcu
hidden in that function. Double-check and yell if something looks wrong:

> The old i915 code is already racy, in that it's called a "singleton",
> but if you have multiple concurrent callers to mmap_singleton(), they
> all might see a NULL file at first, and then they all create
> *separate* new "singleton" files, and they *all* do that
> 
>         smp_store_mb(i915->gem.mmap_singleton, file);
> 
> and one random case of them happens to win the race and set *its* file
> as "THE singleton" file.

Yeah, I think that's all really weird but whatever.

> Am I missing something?

No, I don't think so. I thought you might have a good opinion here.

I did think that the loop didn't really matter for this case but since
it seemingly does paper over the weird semantics here let's drop it.
Anyway, pushed to:

https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.misc

and appended here. Please yell, if something's still off.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-file-i915-fix-file-reference-for-mmap_singleton.patch
Type: text/x-diff
Size: 7877 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20231025/72b77e09/attachment-0001.patch>


More information about the Intel-gfx mailing list