[PATCH v3 0/4] drm/xe: Rework rebinding in preparation for same-vm eviction
Thomas Hellström
thomas.hellstrom at linux.intel.com
Wed Mar 27 09:11:32 UTC 2024
We are currently not allowing eviction / shrinking of completely unbound
local objects during exec and in the rebind worker.
Since unbinding is the UMD's primary means of
freeing up memory in local VM overcommit situations,
this needs to be addressed.
Such functionality will also open up the possibility to evict
purgeable local objects with upcoming changes.
To make this work properly, rebinding needs to be moved to the
while-not-all-locked drm-exec loop, since rebinding may allocate
gpu page table bos and thus cause evictions, which force us
to re-run validation.
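
As a rough illustration of why rebinding belongs inside the loop, here is a
minimal toy model in plain C. It is only a sketch under stated assumptions:
struct toy_vm, its fields and toy_validate_rebind() are hypothetical names
invented for this example, and nothing here uses the real drm_exec or xe
interfaces. An eviction caused by a rebind simply forces another pass
through validation.

```c
/* Toy model only: hypothetical names, not the xe driver's structures. */
struct toy_vm {
	int evicted;        /* objects currently evicted, awaiting validation */
	int rebinds;        /* vmas that still need a rebind */
	int evictions_left; /* simulated: how many rebinds will evict an object */
};

/*
 * Run validation and rebinding in one retry loop.
 * Returns the number of passes that were needed.
 */
int toy_validate_rebind(struct toy_vm *vm)
{
	int passes = 0;

	do {
		passes++;

		/* Validation: bring evicted objects back; each needs a rebind. */
		vm->rebinds += vm->evicted;
		vm->evicted = 0;

		/* Rebind: a page-table allocation may evict another object. */
		while (vm->rebinds > 0) {
			vm->rebinds--;
			if (vm->evictions_left > 0) {
				vm->evictions_left--;
				vm->evicted++;
			}
		}
		/* Anything evicted during rebind forces validation to re-run. */
	} while (vm->evicted > 0);

	return passes;
}
```

With two pending rebinds of which one triggers an eviction, the model needs
two passes. If rebinding ran after the loop instead, the evicted object would
be left unvalidated.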
The move is done in patch 4, but when crafting that patch, a number
of rebinding flaws were discovered.
1) When saving the rebinding fence we always presumed the
rebinding fences were ordered. That is not true, and is
fixed in patch 2, where we attach the rebind fences as
kernel fences to the vm's resv.
2) In fact, TLB invalidation fences may currently not be assumed to be
ordered at all. This is fixed in patch 3.
3) The combination of fixes for 1) and 2) makes the rebind
of each vma wait for the TLB invalidation of the previous
rebind, which is unnecessary and would incur unneeded latency.
This is fixed in patch 1 where we move rebind TLB invalidation
to the ring ops.
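
The ordering problem in 1) can be sketched with a small toy model in plain C.
This is an assumption-laden illustration, not driver code: struct toy_fence
and both helpers are hypothetical, and the "wait on all" helper merely stands
in for attaching every rebind fence to the vm's resv. It does not use the
dma_fence or dma_resv interfaces.

```c
#include <stddef.h>

/* Toy fence: only records the time at which it signals. */
struct toy_fence {
	unsigned long signal_time;
};

/*
 * Flawed approach: keep only the most recently submitted fence and
 * presume it signals last. Only correct if fences are ordered.
 */
unsigned long toy_wait_last_only(const struct toy_fence *f, size_t n)
{
	return n ? f[n - 1].signal_time : 0;
}

/*
 * Order-independent approach: wait on every fence, so completion is
 * the latest signal time regardless of submission order.
 */
unsigned long toy_wait_all(const struct toy_fence *f, size_t n)
{
	unsigned long latest = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (f[i].signal_time > latest)
			latest = f[i].signal_time;

	return latest;
}
```

For fences signalling at times 30, 10 and 20 in submission order, the
last-only helper reports completion at 20 while the first rebind is still
outstanding until 30. Waiting on all of them gives 30.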
v2:
- Simplify if-statements around the tlb_flush_seqno.
(Matthew Brost)
- Add some comments and asserts.
- Remove a leftover call to xe_vm_rebind() (Matt Brost)
- Add a helper function xe_vm_validate_rebind() (Matt Brost)
- Rebasing
v3:
- Only include the patches that rework rebinding and
include it in the drm_exec locking loop, since
the patches that dealt with same-vm eviction allowed
page-table allocation to evict the object being bound,
and while this was addressed properly during exec and
in the rebind worker, it was not during the VM_BIND ioctl.
- Squash the patches moving fence reservation and
moving rebinding into the locking loop since the
code was not properly working in between those
patches. (Matt Brost)
- Add code comments (Matt Brost)
Thomas Hellström (4):
  drm/xe: Use ring ops TLB invalidation for rebinds
  drm/xe: Rework rebinding
  drm/xe: Make TLB invalidation fences unordered
  drm/xe: Move vma rebinding to the drm_exec locking loop

 drivers/gpu/drm/xe/xe_exec.c                |  79 ++------------
 drivers/gpu/drm/xe/xe_exec_queue_types.h    |   5 +
 drivers/gpu/drm/xe/xe_gt_pagefault.c        |   3 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |   1 -
 drivers/gpu/drm/xe/xe_gt_types.h            |   7 --
 drivers/gpu/drm/xe/xe_pt.c                  |  25 ++++-
 drivers/gpu/drm/xe/xe_ring_ops.c            |  11 +-
 drivers/gpu/drm/xe/xe_sched_job.c           |  10 ++
 drivers/gpu/drm/xe/xe_sched_job_types.h     |   2 +
 drivers/gpu/drm/xe/xe_vm.c                  | 110 ++++++++++++--------
 drivers/gpu/drm/xe/xe_vm.h                  |   8 +-
 drivers/gpu/drm/xe/xe_vm_types.h            |   8 +-
 12 files changed, 126 insertions(+), 143 deletions(-)
--
2.44.0