[PATCH v2 0/8] Fix several device private page reference counting issues

Alistair Popple apopple at nvidia.com
Wed Sep 28 12:01:14 UTC 2022


This series aims to fix a number of page reference counting issues in
drivers dealing with device private ZONE_DEVICE pages. These result in
use-after-free type bugs, either from accessing a struct page which no
longer exists because it has been removed or accessing fields within the
struct page which are no longer valid because the page has been freed.

During normal usage it is unlikely these will cause any problems. However
without these fixes it is possible to crash the kernel from userspace.
These crashes can be triggered either by unloading the kernel module or
unbinding the device from the driver prior to a userspace task exiting. In
modules such as Nouveau it is also possible to trigger some of these issues
by explicitly closing the device file-descriptor prior to the task exiting
and then accessing device private memory.

This involves some minor changes to both PowerPC and AMD GPU code.
Unfortunately I lack hardware to test either of those so any help there
would be appreciated. The changes mimic what is done in for both Nouveau
and hmm-tests though so I doubt they will cause problems.

To: Andrew Morton <akpm at linux-foundation.org>
To: linux-mm at kvack.org
Cc: linux-kernel at vger.kernel.org
Cc: amd-gfx at lists.freedesktop.org
Cc: nouveau at lists.freedesktop.org
Cc: dri-devel at lists.freedesktop.org

Alistair Popple (8):
  mm/memory.c: Fix race when faulting a device private page
  mm: Free device private pages have zero refcount
  mm/memremap.c: Take a pgmap reference on page allocation
  mm/migrate_device.c: Refactor migrate_vma and migrate_deivce_coherent_page()
  mm/migrate_device.c: Add migrate_device_range()
  nouveau/dmem: Refactor nouveau_dmem_fault_copy_one()
  nouveau/dmem: Evict device private memory during release
  hmm-tests: Add test for migrate_device_range()

 arch/powerpc/kvm/book3s_hv_uvmem.c       |  17 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  19 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c     |  11 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   | 108 +++++++----
 include/linux/memremap.h                 |   1 +-
 include/linux/migrate.h                  |  15 ++-
 lib/test_hmm.c                           | 129 ++++++++++---
 lib/test_hmm_uapi.h                      |   1 +-
 mm/memory.c                              |  16 +-
 mm/memremap.c                            |  30 ++-
 mm/migrate.c                             |  34 +--
 mm/migrate_device.c                      | 239 +++++++++++++++++-------
 mm/page_alloc.c                          |   8 +-
 tools/testing/selftests/vm/hmm-tests.c   |  49 +++++-
 15 files changed, 516 insertions(+), 163 deletions(-)

base-commit: 088b8aa537c2c767765f1c19b555f21ffe555786
-- 
git-series 0.9.1


More information about the amd-gfx mailing list