[PATCH 00/25] KFD fixes, robustness enhancements and cleanups

Oded Gabbay oded.gabbay at gmail.com
Sat Jul 28 09:14:30 UTC 2018


Hi Felix,
Thanks for the patch-set. Applied to -next.

Oded

On Thu, Jul 12, 2018 at 10:52 AM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> Patches which don't already have my rb are Acked-by: Christian König
> <christian.koenig at amd.com>.
>
> Regards,
> Christian.
>
> On 12.07.2018 at 04:32, Felix Kuehling wrote:
> > This series fixes some KFD issues, adds robustness enhancements and
> > finally a few cleanups.
> >
> > Patches 1-4 are important fixes.
> > Patches 5-8 add handling of GPU VM faults.
> > Patches 9-22 add handling of GPU resets and detection of HWS hangs.
> > Patches 23-25 are various cleanups.
> >
> > Felix Kuehling (2):
> >    drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lock
> >    drm/amdkfd: Stop using GFP_NOIO explicitly
> >
> > Jay Cornwall (1):
> >    drm/amdkfd: Fix race between scheduler and context restore
> >
> > Lan Xiao (1):
> >    drm/amdkfd: fix zero reading of VMID and PASID for Hawaii
> >
> > Moses Reuben (1):
> >    drm/amdkfd: When we get KFD_EVENT_TYPE_MEMORY we send the process
> >      SIGSEGV
> >
> > Shaoyun Liu (13):
> >    drm/amd: Add gpu reset interfaces between amdgpu and amdkfd
> >    drm/amd: Add kfd ioctl defines for hw_exception event
> >    drm/amdkfd: Add gpu reset interface and place holder
> >    drm/amdgpu: Call KFD reset handlers during GPU reset
> >    drm/amdkfd: Implement GPU reset handlers in KFD
> >    drm/amdgpu: Enable the gpu reset from KFD
> >    drm/amdkfd: Implement hang detection in KFD and call amdgpu
> >    drm/amdgpu: Don't use shadow BO for compute context
> >    drm/amdgpu: Check NULL pointer for job before reset job's ring
> >    drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation
> >    drm/amdgpu: Avoid invalidate tlbs when gpu is on reset
> >    drm/amdgpu: Avoid destroy hqd when GPU is on reset
> >    drm/amdkfd: Add debugfs interface to trigger HWS hang
> >
> > Wei Lu (1):
> >    drm/amdkfd: Fix error codes in kfd_get_process
> >
> > Yong Zhao (4):
> >    drm/amdkfd: Introduce KFD module parameter halt_if_hws_hang
> >    drm/amdkfd: Use module parameters noretry as the internal variable
> >      name
> >    drm/amdkfd: Replace mqd with mqd_mgr as the variable name for
> >      mqd_manager
> >    drm/amdkfd: Clean up reference of radeon
> >
> > shaoyunl (2):
> >    drm/amdgpu: get_vm_fault implementation on amdgpu side
> >    drm/amdkfd: Handle VM faults in KFD
> >
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |  27 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |   9 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  |  26 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  |   8 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c  |   7 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   |  14 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c         |   7 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h            |   2 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c             |  13 +-
> >   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c              |  33 +-
> >   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c              |  33 +-
> >   drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c   |  54 ++-
> >   drivers/gpu/drm/amd/amdkfd/cik_int.h               |   7 +-
> >   drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 458 +++++++++++----------
> >   .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  |  18 +-
> >   .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  |  16 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   3 +
> >   drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |   1 -
> >   drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.h            |  37 ++
> >   drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c           |  48 +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  94 ++++-
> >   .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 247 ++++++-----
> >   .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  26 +-
> >   .../drm/amd/amdkfd/kfd_device_queue_manager_v9.c   |   2 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |   9 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  71 ++++
> >   drivers/gpu/drm/amd/amdkfd/kfd_events.h            |   1 +
> >   drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c    |  22 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   6 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  17 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.h      |   2 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  16 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |   2 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c    |   4 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |   2 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  26 ++
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  34 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c           |   2 +
> >   .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  10 +-
> >   drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  35 ++
> >   include/uapi/linux/kfd_ioctl.h                     |  22 +-
> >   41 files changed, 1081 insertions(+), 390 deletions(-)
> >
>

