[Intel-gfx] Possible use_mm() mis-uses
Linus Torvalds
torvalds at linux-foundation.org
Wed Aug 22 19:58:04 UTC 2018
On Wed, Aug 22, 2018 at 12:37 PM Oded Gabbay <oded.gabbay at gmail.com> wrote:
>
> Having said that, I think we *are* protected by the mmu_notifier
> release because if the process suddenly dies, we will gracefully clean
> the process's data in our driver and on the H/W before returning to
> the mm core code. And before we return to the mm core code, we set the
> mm pointer to NULL. And the graceful cleaning should be serialized
> with the load_hqd uses.
So I'm a bit nervous about the mmu_notifier model (and the largely
equivalent exit_aio() model for the USB gardget AIO uses).
The reason I'm nervous about it is that the mmu_notifier() gets called
only after the mm_users count has already been decremented to zero
(and the exact same thing goes for exit_aio()).
Now that's fine if you actually get rid of all accesses in
mmu_notifier_release() or in exit_aio(), because the page tables still
exist at that point - they are in the process of being torn down, but
they haven't been torn down yet.
But for something like a kernel thread doing use_mm(), the thing that
worries me is a pattern something like this:
kwork thread exit thread
-------- --------
mmput() ->
mm_users goes to zero
use_mm(mmptr);
..
mmu_notifier_release();
exit_mm() ->
exit_aio()
and the pattern is basically the same regatdless of whether you use
mmu_notifier_release() or depend on some exit_aio() flushing your aio
work: the use_mm() can be called with a mm that has already had its
mm_users count decremented to zero, and that is now scheduled to be
free'd.
Does it "work"? Yes. Kind of. At least if the mmu notifier and/or
exit_aio() actually makes sure to wait for any kwork thread thing. But
it's a bit of a worrisome pattern.
Linus
More information about the Intel-gfx
mailing list