[PATCH v2 00/10] drm/amdgpu: prevent concurrent GPU access during reset
Yunxiang Li
Yunxiang.Li at amd.com
Tue May 28 17:23:30 UTC 2024
If another thread accesses the GPU while it is being reset, the
reset could fail. This is especially problematic on SRIOV, since the
host may reset the GPU even if the guest is not yet ready.

There is code in place that tries to prevent stray accesses, but over
time bugs have crept in, making it unreliable. This series aims to
address these bugs.

Likun Gao (1):
  drm/amd/amdgpu: remove unnecessary flush when enable gart

Yunxiang Li (9):
  drm/amdgpu: add skip_hw_access checks for sriov
  drm/amdgpu: fix sriov host flr handler
  drm/amdgpu: abort fence poll if reset is started
  drm/amdgpu/kfd: remove is_hws_hang and is_resetting
  drm/amdgpu: remove tlb flush in amdgpu_gtt_mgr_recover
  drm/amdgpu: use helper in amdgpu_gart_unbind
  drm/amdgpu: fix locking scope when flushing tlb
  drm/amdgpu: fix missing reset domain locks
  Revert "drm/amdgpu: Queue KFD reset workitem in VF FED"

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c     |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c      |  9 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       | 66 ++++++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c   |  2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |  8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c       |  7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c      | 25 +++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h      |  2 +
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c        |  3 -
 drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c        |  3 -
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c        |  3 -
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c        |  3 -
 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c        |  4 -
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/mes_v12_0.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c         | 37 ++++-----
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c         | 37 ++++-----
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c         |  6 --
 drivers/gpu/drm/amd/amdkfd/kfd_device.c       |  1 -
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 79 ++++++++-----------
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  1 -
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 11 ++-
 .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  4 +-
 .../amd/amdkfd/kfd_process_queue_manager.c    | 13 ++-
 27 files changed, 164 insertions(+), 177 deletions(-)

--
2.34.1