[PATCH 00/25] KFD fixes, robustness enhancements and cleanups

Christian König ckoenig.leichtzumerken at gmail.com
Thu Jul 12 07:52:00 UTC 2018


Patches which don't already have my rb are Acked-by: Christian König 
<christian.koenig at amd.com>.

Regards,
Christian.

On 12.07.2018 at 04:32, Felix Kuehling wrote:
> This series fixes some KFD issues, adds robustness enhancements and
> finally a few cleanups.
>
> Patches 1-4 are important fixes.
> Patches 5-8 add handling of GPU VM faults.
> Patches 9-22 add handling of GPU resets and detection of HWS hangs.
> Patches 23-25 are various cleanups.
>
> Felix Kuehling (2):
>    drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lock
>    drm/amdkfd: Stop using GFP_NOIO explicitly
>
> Jay Cornwall (1):
>    drm/amdkfd: Fix race between scheduler and context restore
>
> Lan Xiao (1):
>    drm/amdkfd: fix zero reading of VMID and PASID for Hawaii
>
> Moses Reuben (1):
>    drm/amdkfd: When we get KFD_EVENT_TYPE_MEMORY we send the process
>      SIGSEGV
>
> Shaoyun Liu (13):
>    drm/amd: Add gpu reset interfaces between amdgpu and amdkfd
>    drm/amd: Add kfd ioctl defines for hw_exception event
>    drm/amdkfd: Add gpu reset interface and place holder
>    drm/amdgpu: Call KFD reset handlers during GPU reset
>    drm/amdkfd: Implement GPU reset handlers in KFD
>    drm/amdgpu: Enable the gpu reset from KFD
>    drm/amdkfd: Implement hang detection in KFD and call amdgpu
>    drm/amdgpu: Don't use shadow BO for compute context
>    drm/amdgpu: Check NULL pointer for job before reset job's ring
>    drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation
>    drm/amdgpu: Avoid invalidate tlbs when gpu is on reset
>    drm/amdgpu: Avoid destroy hqd when GPU is on reset
>    drm/amdkfd: Add debugfs interface to trigger HWS hang
>
> Wei Lu (1):
>    drm/amdkfd: Fix error codes in kfd_get_process
>
> Yong Zhao (4):
>    drm/amdkfd: Introduce KFD module parameter halt_if_hws_hang
>    drm/amdkfd: Use module parameters noretry as the internal variable
>      name
>    drm/amdkfd: Replace mqd with mqd_mgr as the variable name for
>      mqd_manager
>    drm/amdkfd: Clean up reference of radeon
>
> shaoyunl (2):
>    drm/amdgpu: get_vm_fault implementation on amdgpu side
>    drm/amdkfd: Handle VM faults in KFD
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |  27 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |   9 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  |  26 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  |   8 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c  |   7 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   |  14 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c         |   7 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h            |   2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c             |  13 +-
>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c              |  33 +-
>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c              |  33 +-
>   drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c   |  54 ++-
>   drivers/gpu/drm/amd/amdkfd/cik_int.h               |   7 +-
>   drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 458 +++++++++++----------
>   .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  |  18 +-
>   .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  |  16 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   3 +
>   drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |   1 -
>   drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.h            |  37 ++
>   drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c           |  48 +++
>   drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  94 ++++-
>   .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 247 ++++++-----
>   .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  26 +-
>   .../drm/amd/amdkfd/kfd_device_queue_manager_v9.c   |   2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |   9 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  71 ++++
>   drivers/gpu/drm/amd/amdkfd/kfd_events.h            |   1 +
>   drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c    |  22 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   6 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  17 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.h      |   2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  16 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |   2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c    |   4 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |   2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  26 ++
>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  34 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c           |   2 +
>   .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  10 +-
>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  35 ++
>   include/uapi/linux/kfd_ioctl.h                     |  22 +-
>   41 files changed, 1081 insertions(+), 390 deletions(-)
>


