Possible use_mm() mis-uses

Wed Aug 22 19:37:29 UTC 2018

On Wed, Aug 22, 2018 at 7:44 PM Linus Torvalds
<torvalds at linux-foundation.org> wrote:
> One of the complex ones is the amdgpu driver. It does a
> "use_mm(mmptr)" deep deep in the guts of a macro that ends up being
> used in fa few places, and it's very hard to tell if it's right.
>
> It looks almost certainly buggy (there is no mmget/mmput to get the
> refcount), but there _is_ a "release" mmu_notifier function and that
> one - unlike the kvm case - looks like it might actually be trying to
> flush the existing pending users of that mm.
>
> But on the whole, I'm suspicious of the amdgpu use. It smells. Jann
> Horn pointed out that even if it migth be ok due to the mmu_notifier,
> the comments are garbage:
>
> >  Where "process" in the uniquely-named "struct queue" is a "struct
> >  kfd_process"; that struct's definition has this comment in it:
> >
> >        /*
> >         * Opaque pointer to mm_struct. We don't hold a reference to
> >         * it so it should never be dereferenced from here. This is
> >         * only used for looking up processes by their mm.
> >         */
> >        void *mm;
> >
> >  So I think either that comment is wrong, or their code is wrong?
>
> so I'm chalking the amdgpu use up in the "broken" column.
>
Hello Linus,

I looked at the amdkfd code and indeed the comment does not match the
actual code because the mm pointer is clearly dereferenced directly in
the macro you mentioned (read_user_wptr). That macro is used in the
code path of loading a descriptor to the H/W (load_hqd). That function
is called in several cases, where in some of them we are in the
context of the calling process, but in others we are in a kernel
thread context (hence the use_mm). That's why we check these two
situations inside that macro and only do use_mm if we are in kernel
thread.

We need to fix that behavior and obviously make sure that in future
code we don't cast this pointer to mm_struct* and derereference it
directly.
Actually, the original code had "mm_struct *mm" instead of "void *mm"
in the structure, and I think the reason we changed it to void* is to
"make sure" that we won't dereference it directly, but clearly that
failed :(

Having said that, I think we *are* protected by the mmu_notifier
release because if the process suddenly dies, we will gracefully clean
the process's data in our driver and on the H/W before returning to
the mm core code. And before we return to the mm core code, we set the
mm pointer to NULL. And the graceful cleaning should be serialized
with the load_hqd uses.

Felix, do you have anything to add here that I might have missed ?

Thanks,
Oded