[PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
Ralph Campbell
rcampbell at nvidia.com
Fri Jun 7 21:37:07 UTC 2019
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg at mellanox.com>
>
> hmm_release() is called exactly once per hmm. ops->release() cannot
> accidentally trigger any action that would recurse back onto
> hmm->mirrors_sem.
>
> This fixes a use after-free race of the form:
>
> CPU0 CPU1
> hmm_release()
> up_write(&hmm->mirrors_sem);
> hmm_mirror_unregister(mirror)
> down_write(&hmm->mirrors_sem);
> up_write(&hmm->mirrors_sem);
> kfree(mirror)
> mirror->ops->release(mirror)
>
> The only user we have today for ops->release is an empty function, so this
> is unambiguously safe.
>
> As a consequence of plugging this race drivers are not allowed to
> register/unregister mirrors from within a release op.
>
> Signed-off-by: Jason Gunthorpe <jgg at mellanox.com>
I agree with the analysis above but I'm not sure that release() will
always be an empty function. It might be more efficient to write back
all data migrated to a device "in one pass" instead of relying
on unmap_vmas() calling hmm_start_range_invalidate() per VMA.
I think the bigger issue is potential deadlocks while calling
sync_cpu_device_pagetables() and tasks calling hmm_mirror_unregister():
Say you have three threads:
- Thread A is in try_to_unmap(), either without holding mmap_sem or with
mmap_sem held for read.
- Thread B has some unrelated driver calling hmm_mirror_unregister().
This doesn't require mmap_sem.
- Thread C is about to call migrate_vma().
Thread A Thread B Thread C
try_to_unmap hmm_mirror_unregister migrate_vma
---------------------- ----------------------- ----------------------
hmm_invalidate_range_start
down_read(mirrors_sem)
down_write(mirrors_sem)
// Blocked on A
device_lock
device_lock
// Blocked on C
migrate_vma()
hmm_invalidate_range_s
down_read(mirrors_sem)
// Blocked on B
// Deadlock
Perhaps we should consider using SRCU for walking the mirror->list?
> ---
> mm/hmm.c | 28 +++++++++-------------------
> 1 file changed, 9 insertions(+), 19 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 709d138dd49027..3a45dd3d778248 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -136,26 +136,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> WARN_ON(!list_empty(&hmm->ranges));
> mutex_unlock(&hmm->lock);
>
> - down_write(&hmm->mirrors_sem);
> - mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
> - list);
> - while (mirror) {
> - list_del_init(&mirror->list);
> - if (mirror->ops->release) {
> - /*
> - * Drop mirrors_sem so the release callback can wait
> - * on any pending work that might itself trigger a
> - * mmu_notifier callback and thus would deadlock with
> - * us.
> - */
> - up_write(&hmm->mirrors_sem);
> + down_read(&hmm->mirrors_sem);
> + list_for_each_entry(mirror, &hmm->mirrors, list) {
> + /*
> + * Note: The driver is not allowed to trigger
> + * hmm_mirror_unregister() from this thread.
> + */
> + if (mirror->ops->release)
> mirror->ops->release(mirror);
> - down_write(&hmm->mirrors_sem);
> - }
> - mirror = list_first_entry_or_null(&hmm->mirrors,
> - struct hmm_mirror, list);
> }
> - up_write(&hmm->mirrors_sem);
> + up_read(&hmm->mirrors_sem);
>
> hmm_put(hmm);
> }
> @@ -287,7 +277,7 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror)
> struct hmm *hmm = mirror->hmm;
>
> down_write(&hmm->mirrors_sem);
> - list_del_init(&mirror->list);
> + list_del(&mirror->list);
> up_write(&hmm->mirrors_sem);
> hmm_put(hmm);
> memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
>
More information about the dri-devel
mailing list