[PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
John Hubbard
jhubbard at nvidia.com
Fri Jun 7 03:47:28 UTC 2019
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg at mellanox.com>
>
> hmm_release() is called exactly once per hmm. ops->release() cannot
> accidentally trigger any action that would recurse back onto
> hmm->mirrors_sem.
>
> This fixes a use after-free race of the form:
>
> CPU0 CPU1
> hmm_release()
> up_write(&hmm->mirrors_sem);
> hmm_mirror_unregister(mirror)
> down_write(&hmm->mirrors_sem);
> up_write(&hmm->mirrors_sem);
> kfree(mirror)
> mirror->ops->release(mirror)
>
> The only user we have today for ops->release is an empty function, so this
> is unambiguously safe.
>
> As a consequence of plugging this race drivers are not allowed to
> register/unregister mirrors from within a release op.
>
> Signed-off-by: Jason Gunthorpe <jgg at mellanox.com>
> ---
> mm/hmm.c | 28 +++++++++-------------------
> 1 file changed, 9 insertions(+), 19 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 709d138dd49027..3a45dd3d778248 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -136,26 +136,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> WARN_ON(!list_empty(&hmm->ranges));
> mutex_unlock(&hmm->lock);
>
> - down_write(&hmm->mirrors_sem);
> - mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
> - list);
> - while (mirror) {
> - list_del_init(&mirror->list);
> - if (mirror->ops->release) {
> - /*
> - * Drop mirrors_sem so the release callback can wait
> - * on any pending work that might itself trigger a
> - * mmu_notifier callback and thus would deadlock with
> - * us.
> - */
> - up_write(&hmm->mirrors_sem);
> + down_read(&hmm->mirrors_sem);
This is cleaner and simpler, but I suspect it is leading to the deadlock
that Ralph Campbell is seeing in his driver testing. (And in general, holding
a lock during a driver callback usually leads to deadlocks.)
Ralph, is this the one? It's the only place in this patchset where I can
see a lock around a callback to driver code, that wasn't there before. So
I'm pretty sure it is the one...
thanks,
--
John Hubbard
NVIDIA
More information about the amd-gfx
mailing list